Given a dNdS table generated by dNdS and annotation files for the query and subject species in gtf or gff format, this function selects the best BLAST hit to represent either a gene locus (e.g. the splice variant of the gene locus with the lowest e-value) or a splice variant.

generate_ortholog_tables(
  dNdS_file,
  annotation_file_query,
  annotation_file_subject,
  output_folder = getwd(),
  output_type = "gene_locus",
  format
)

Arguments

dNdS_file

file path to a dNdS_tbl generated with dNdS and stored in a format compatible with read.dnds.tbl.

annotation_file_query

file path to the annotation file of the query species in gtf or gff file format.

annotation_file_subject

file path to the annotation file of the subject species in gtf or gff file format.

output_folder

file path to a folder in which the ortholog tables shall be stored.

output_type

type of ortholog table that shall be printed out (or stored in a variable). Available options are:

  • output_type = "gene_locus" (Default): for each gene locus, find a representative splice variant that maximizes the sequence homology to the subject gene locus and its representative splice variant. Homology is ranked by smallest e-value; if e-values are equal, the longest splice variant is chosen. The output table contains only one representative splice variant per gene locus.

  • output_type = "splice_variant": for each homologous gene locus, determine for each splice variant its homologous splice variant. The output table contains several splice variants and their homologous splice variants per gene locus.
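
The selection rule described for output_type = "gene_locus" can be sketched in base R as follows. This is a minimal illustration on a toy table, not the function's actual implementation: the data frame `hits` and all of its column names (`query_gene_locus_id`, `query_id`, `subject_id`, `evalue`, `q_len`) and values are hypothetical.

```r
# Toy BLAST-hit table; all column names and values are hypothetical.
hits <- data.frame(
  query_gene_locus_id = c("geneA", "geneA", "geneA", "geneB", "geneB"),
  query_id            = c("A.1", "A.2", "A.3", "B.1", "B.2"),
  subject_id          = c("X.1", "X.2", "X.1", "Y.1", "Y.1"),
  evalue              = c(1e-50, 1e-80, 1e-80, 1e-20, 1e-10),
  q_len               = c(300, 250, 410, 120, 180),
  stringsAsFactors    = FALSE
)

# Sort so that, within each locus, the smallest e-value comes first
# and ties are broken by the longest splice variant.
ordered_hits <- hits[order(hits$query_gene_locus_id, hits$evalue, -hits$q_len), ]

# Keep the first row per locus: one representative splice variant each.
representatives <- ordered_hits[!duplicated(ordered_hits$query_gene_locus_id), ]
print(representatives)
```

In this toy table, A.2 and A.3 share the smallest e-value for geneA, so the longer variant A.3 is kept. Dropping the `!duplicated()` step and keeping all rows would correspond loosely to the "splice_variant" view, where several splice variants per locus remain in the table.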

format

a vector of length 2 storing the annotation file formats of the query annotation file and subject annotation file: either gtf or gff format. E.g. format = c("gtf","gtf").
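
A call combining the arguments above might look like the following sketch. The file paths are hypothetical placeholders, and the example assumes that the function is exported by the orthologr package. The call is guarded so that it only runs when the package and the input files are actually present.

```r
# Hypothetical file paths; replace them with your own files.
args <- list(
  dNdS_file               = "dnds_query_vs_subject.csv",
  annotation_file_query   = "query_species.gtf",
  annotation_file_subject = "subject_species.gff",
  output_folder           = tempdir(),
  output_type             = "gene_locus",
  format                  = c("gtf", "gff")  # order: query, then subject
)

# Only run the call when the package (assumed: orthologr) and inputs exist.
inputs <- unlist(args[c("dNdS_file",
                        "annotation_file_query",
                        "annotation_file_subject")])
if (requireNamespace("orthologr", quietly = TRUE) && all(file.exists(inputs))) {
  do.call(orthologr::generate_ortholog_tables, args)
}
```

Here the query annotation is in gtf format and the subject annotation in gff format, so format lists the two formats in that order.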

Author

Hajk-Georg Drost