R/generate_ortholog_tables.R
generate_ortholog_tables.Rd
Given an input dNdS table generated by dNdS and annotation files for the query and subject species in gtf or gff file format, this function selects the best BLAST hit to represent either a gene locus (e.g. the splice variant of the gene locus with the lowest e-value) or the best BLAST hit for a splice variant.
generate_ortholog_tables(
  dNdS_file,
  annotation_file_query,
  annotation_file_subject,
  output_folder = getwd(),
  output_type = "gene_locus",
  format
)
dNdS_file: file path to a dNdS_tbl generated with dNdS and stored in a format conforming to read.dnds.tbl.
annotation_file_query: file path to the annotation file of the query species in gtf or gff file format.
annotation_file_subject: file path to the annotation file of the subject species in gtf or gff file format.
output_folder: file path to a folder in which the ortholog tables shall be stored.
output_type: type of ortholog table that shall be printed out (or stored in a variable). Available options are:
output_type = "gene_locus" (default): for each gene locus, find a representative splice variant that maximizes the sequence homology to the subject gene locus and its representative splice variant (smallest e-value; in case of identical e-values, the longest splice variant). The output table contains only one representative splice variant per gene locus.
output_type = "splice_variant": for each homologous gene locus, determine for each splice variant its respective splice variant homolog. The output table contains several splice variants and their homologous splice variants per gene locus.
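The two output types can be illustrated on a toy hit table. This is only a minimal base R sketch of the selection rule described above (smallest e-value, then longest splice variant), not the function's actual implementation; the data frame and all of its column names (query_gene_locus_id, query_id, subject_id, evalue, q_len) are hypothetical.

```r
# Toy BLAST hit table; every column name here is a hypothetical stand-in.
hits <- data.frame(
  query_gene_locus_id = c("geneA", "geneA", "geneA", "geneA", "geneB", "geneB"),
  query_id            = c("geneA.1", "geneA.1", "geneA.2", "geneA.3", "geneB.1", "geneB.2"),
  subject_id          = c("sbjX.1", "sbjX.2", "sbjX.1", "sbjX.2", "sbjY.1", "sbjY.2"),
  evalue              = c(1e-50, 1e-30, 1e-80, 1e-80, 1e-20, 1e-10),
  q_len               = c(300, 300, 250, 410, 120, 180),
  stringsAsFactors    = FALSE
)

# Within each locus: smallest e-value first, ties broken by the longest
# splice variant (hence the negated length).
by_locus <- hits[order(hits$query_gene_locus_id, hits$evalue, -hits$q_len), ]

# "gene_locus"-style table: one representative splice variant per locus.
gene_locus_tbl <- by_locus[!duplicated(by_locus$query_gene_locus_id), ]

# "splice_variant"-style table: the best hit for every query splice variant.
by_variant <- hits[order(hits$query_id, hits$evalue, -hits$q_len), ]
splice_variant_tbl <- by_variant[!duplicated(by_variant$query_id), ]

print(gene_locus_tbl[, c("query_gene_locus_id", "query_id", "subject_id")])
print(splice_variant_tbl[, c("query_id", "subject_id", "evalue")])
```

In this toy data geneA.2 and geneA.3 share the same e-value, so the longer variant geneA.3 represents geneA in the gene locus table, while the splice variant table keeps one row for each of the five query variants.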
format: a vector of length 2 storing the annotation file formats of the query annotation file and the subject annotation file, each either gtf or gff. E.g. format = c("gtf","gtf").
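A call using the arguments above might look like the following sketch. The file paths are hypothetical placeholders, and the call is guarded so that it only runs when generate_ortholog_tables is available in the session (assuming the package providing it is already attached) and the input files exist.

```r
# Hypothetical input paths; replace them with real files before running.
args <- list(
  dNdS_file               = "path/to/dnds_tbl.csv",
  annotation_file_query   = "path/to/query_species.gtf",
  annotation_file_subject = "path/to/subject_species.gtf",
  output_folder           = getwd(),
  output_type             = "gene_locus",
  format                  = c("gtf", "gtf")  # query format first, then subject
)

# Sanity checks on `format`: length 2, each entry either "gtf" or "gff".
stopifnot(length(args$format) == 2, all(args$format %in% c("gtf", "gff")))

# Only attempt the call when the function and all input files are available.
inputs <- unlist(args[c("dNdS_file", "annotation_file_query", "annotation_file_subject")])
if (exists("generate_ortholog_tables", mode = "function") && all(file.exists(inputs))) {
  do.call("generate_ortholog_tables", args)
}
```

Setting output_type = "splice_variant" in the list would request the per-splice-variant table instead.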