R/detect_homologs_proteome_to_proteome.R
detect_homologs_proteome_to_proteome.Rd
Run proteome-to-proteome BLAST searches to detect homologous protein sequences in a set of subject proteomes.
detect_homologs_proteome_to_proteome(
  query,
  subject_proteomes,
  task = "blastp",
  blast_output_path = "blast_output",
  min_alig_length = 20,
  evalue = 1e-05,
  max.target.seqs = 5000,
  cores = 1,
  update = FALSE,
  ...
)
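A minimal sketch of a call, assuming the function is already available in the session (for example after attaching the package that provides it) and that a working BLAST+ installation is present. The file names are hypothetical placeholders, and the guard makes the snippet a no-op when the function or the input files are missing.

```r
# Hypothetical input files; replace with real FASTA paths.
query_file <- "query_proteome.faa"
subject_files <- c("subject_A.faa", "subject_B.faa")

# Only run the search when the function and all input files are present.
if (exists("detect_homologs_proteome_to_proteome") &&
    all(file.exists(c(query_file, subject_files)))) {
  hits <- detect_homologs_proteome_to_proteome(
    query = query_file,
    subject_proteomes = subject_files,
    task = "blastp",
    blast_output_path = "blast_output",
    min_alig_length = 20,
    evalue = 1e-05,
    max.target.seqs = 5000,
    cores = 1,
    update = FALSE
  )
}
```

All argument values shown are the defaults from the usage signature, so they can be dropped from the call unless a different setting is needed.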
query: path to the input file in FASTA format.
subject_proteomes: a character vector containing paths to subject files in FASTA format.
task: protein search task option. Options are:
task = "blastp": Standard protein-protein comparisons (default).
task = "blastp-fast": Improved BLAST searches using longer words for protein seeding.
task = "blastp-short": Optimized protein-protein comparisons for query sequences shorter than 30 residues.
blast_output_path: a path to a folder that will be created to store the BLAST output tables of each individual query-proteome search.
min_alig_length: minimum alignment length a hit must have to be retained in the result dataset. Hits with shorter alignments are removed automatically.
evalue: Expectation value (E) threshold for saving hits (default: evalue = 1E-5).
max.target.seqs: maximum number of aligned sequences that shall be kept (default: max.target.seqs = 5000).
cores: number of cores for parallel BLAST searches.
update: a logical value indicating whether pre-computed BLAST tables should be removed and re-computed (update = TRUE) or imported from the existing files (update = FALSE, the default).
...: additional arguments passed to blast_protein_to_protein.
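The effect of the alignment length and E-value thresholds can be pictured as a row filter on a table of BLAST hits. The sketch below uses base R on a toy data frame and is not the package's implementation: the column names (`subject_id`, `alig_length`, `evalue`) and the helper `filter_hits()` are hypothetical, and keeping hits that sit exactly on a threshold is an assumption.

```r
# Toy hit table; the column names are illustrative assumptions.
hits <- data.frame(
  subject_id  = c("p1", "p2", "p3", "p4"),
  alig_length = c(150L, 18L, 42L, 20L),
  evalue      = c(1e-30, 1e-08, 1e-03, 1e-06),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep hits whose alignment is at least
# `min_alig_length` long and whose E-value does not exceed `evalue`.
filter_hits <- function(x, min_alig_length = 20, evalue = 1e-05) {
  keep <- x$alig_length >= min_alig_length & x$evalue <= evalue
  x[keep, , drop = FALSE]
}

filtered <- filter_hits(hits)
print(filtered$subject_id)
```

With the default thresholds, `p2` is dropped for its short alignment and `p3` for its high E-value, leaving `p1` and `p4`.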