This function retrieves a core set of best BLAST hits for each query sequence that is shared across all species. In other words, only query sequences that generated BLAST hits in all species in the input blast_tbl are retained. The blast_tbl is filtered using the following criteria. A best hit is defined as a hit fulfilling all three criteria:

  • max(alig_length): only the hit having the longest alignment length is retained.

  • qcovhsp >= min_qcovhsp: only hits that have a query coverage of at least min_qcovhsp are retained.

  • max(bit_score): only the hit having the highest bit-score is retained.
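The criteria above can be illustrated with a minimal base R sketch. This is not the package's implementation: the function name core_set_sketch, the toy data, and the column names species and query_id are assumptions made for illustration, and the sketch assumes the two max() criteria are combined by ranking hits on alignment length first and bit-score second.

```r
# Illustrative sketch only; not the package's actual implementation.
# Assumed columns: species and query_id (hypothetical names), plus
# qcovhsp, alig_length and bit_score as named in the criteria above.
core_set_sketch <- function(blast_tbl, min_qcovhsp = 50) {
  # Number of species present in the unfiltered input table.
  n_species <- length(unique(blast_tbl$species))

  # Keep only hits with sufficient query coverage.
  hits <- blast_tbl[blast_tbl$qcovhsp >= min_qcovhsp, , drop = FALSE]

  # Within each (species, query) group, rank hits by longest alignment,
  # then by highest bit-score (assumed tie-break), and keep the top hit.
  hits <- hits[order(hits$species, hits$query_id,
                     -hits$alig_length, -hits$bit_score), , drop = FALSE]
  best <- hits[!duplicated(hits[, c("species", "query_id")]), , drop = FALSE]

  # Core set: keep only queries that still have a best hit in every species.
  species_per_query <- table(best$query_id)
  core_queries <- names(species_per_query)[species_per_query == n_species]
  best <- best[best$query_id %in% core_queries, , drop = FALSE]
  rownames(best) <- NULL
  best
}

# Toy input: two species (A, B) and two queries (q1, q2).
toy <- data.frame(
  species     = c("A", "A", "B", "A", "B"),
  query_id    = c("q1", "q1", "q1", "q2", "q2"),
  qcovhsp     = c(90, 80, 70, 95, 30),
  alig_length = c(100, 120, 110, 200, 50),
  bit_score   = c(150, 180, 160, 300, 40),
  stringsAsFactors = FALSE
)

core_set_sketch(toy)
```

In this toy table, q2 is dropped from the core set because its only hit in species B has a query coverage below the default threshold of 50, so q2 no longer has a hit in every species.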

Usage

filter_homologs_core_set(blast_tbl, min_qcovhsp = 50)

Arguments

blast_tbl

a BLAST table generated with detect_homologs_proteome_to_proteome or detect_homologs_cds_to_cds.

min_qcovhsp

minimum query coverage (in percent, 0 to 100) that a hit must have to be retained. The default is min_qcovhsp = 50, meaning that a best hit alignment must cover at least 50 percent of the query.

Author

Hajk-Georg Drost