dnds_across_multiple_species.Rd
This function allows you to compute dnds maps between a query organism and a set subject organisms stored in the same folder. The corresponding dnds maps are then stored in an output folder.
dnds_across_multiple_species( query, subjects_folder, output_folder, format = "fasta", ortho_detection = "DIAMOND-RBH", delete_corrupt_cds = FALSE, store_locally = FALSE, sensitivity_mode = "ultra-sensitive", out_format = "csv", evalue = "1E-5", max_target_seqs = 5000, cores = 1, hard_mask = TRUE, diamond_exec_path = NULL, add_makedb_options = NULL, add_diamond_options = NULL, min_qry_coverage_hsp = 50, min_qry_perc_identity = 10, aa_aln_type = "pairwise", aa_aln_tool = "NW", codon_aln_tool = "pal2nal", dnds_estimation = "Li", progress_bar = TRUE, diamond_hit_tables = file.path(output_folder, "diamond_hit_tables"), sep = ";", ... )
query | a character string specifying the path to the CDS file of the query organism. |
---|---|
subjects_folder | a character string specifying the path to the folder where CDS files of the subject organisms are stored. |
output_folder | a character string specifying the path to the folder where output dnds maps should be stored. |
format | a character string specifying the file format of the sequence file, e.g. |
ortho_detection | a character string specifying the orthology inference method that shall be performed to detect orthologous genes. Options are:
|
delete_corrupt_cds | a logical value indicating whether sequences with corrupt base triplets should be removed from the input |
store_locally | a logical value indicating whether or not alignment files shall be stored locally rather than in |
sensitivity_mode | specify the level of alignment sensitivity. The higher the sensitivity level, the more deep homologs can be found, but at the cost of reduced computational speed.
|
out_format | a character string specifying the format of the file in which the DIAMOND results shall be stored. Available options are:
|
evalue | a character string specifying the e-value for DIAMOND based orthology inference that is performed in the process of dnds computations. Please use the scientific notation. |
max_target_seqs | maximum number of aligned sequences that shall be retained. Please be aware that |
cores | number of computing cores that shall be used to perform parallelized computations. |
hard_mask | shall low complexity regions be hard masked with TANTAN? Default is |
diamond_exec_path | a path to the DIAMOND executable or |
add_makedb_options | a character string specifying additional makedb options that shall be passed on to the diamond makedb command line call, e.g. |
add_diamond_options | a character string specifying additional diamond options that shall be passed on to the diamond command line call, e.g. |
min_qry_coverage_hsp | minimum |
min_qry_perc_identity | minimum |
aa_aln_type | a character string specifying the amino acid alignement type: |
aa_aln_tool | a character string specifying the program that should be used e.g. "clustalw". |
codon_aln_tool | a character string specifying the codon alignment tool that shall be used.
Default is |
dnds_estimation | a character string specifying the dnds estimation method, e.g. |
progress_bar | should a progress bar be shown. Default is |
diamond_hit_tables | a file path to a folder where DIAMOND hit tables internally generated with |
sep | a file separator that is used to store maps as csv file. |
... | additional parameters that shall be passed to |
Given a query organism and a set of subject organisms that are stored in the same folder, this function crawls through all subject organism and as a first step computes the pairwise dnds Maps between query and subject organism and as a second step stores the corresponding Map in an output folder.
Hajk-Georg Drost
if (FALSE) { # running dnds across several species dnds_across_multiple_species( query = system.file('seqs/ortho_thal_cds.fasta', package = 'homologr'), subjects_folder = system.file('seqs/map_gen_example', package = 'homologr'), aa_aln_type = "pairwise", aa_aln_tool = "NW", codon_aln_tool = "pal2nal", dnds_estimation = "Li", output_folder = "homologr_dnds_maps", quiet = TRUE, cores = 1 ) # running dnds across several species using DIAMOND executable # path '/opt/miniconda3/bin/' dnds_across_multiple_species( query = system.file('seqs/ortho_thal_cds.fasta', package = 'homologr'), subjects_folder = system.file('seqs/map_gen_example', package = 'homologr'), diamond_exec_path = "/opt/miniconda3/bin/", aa_aln_type = "pairwise", aa_aln_tool = "NW", codon_aln_tool = "pal2nal", dnds_estimation = "Li", output_folder = "homologr_dnds_maps", quiet = TRUE, cores = 1 ) }