This function takes the query amino acid sequence, subject amino acid sequence, query CDS sequence, and subject CDS sequence (supplied via complete_tbl) and then runs the following pipeline:

  • 1) Multiple-Alignment of query amino acid sequence and subject amino acid sequence

  • 2) Codon-Alignment of the amino acid alignment returned by 1) and query CDS sequence + subject CDS sequence

  • 3) dNdS estimation of the codon alignment returned by 2)

internal_dnds(
  complete_tbl,
  aa_aln_type,
  aa_aln_tool,
  codon_aln_tool,
  dnds_estimation,
  store_locally,
  quiet,
  cores,
  clean_folders
)

Arguments

complete_tbl

a data.table object storing the query_id, subject_id, query_cds (sequence), subject_cds (sequence), query_aa (sequence), and subject_aa (sequence) of the organisms that shall be compared.
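As an illustration of the expected layout, the following is a minimal sketch of such a table with a single ortholog pair. The identifiers and sequences are invented toy values, and the column order simply follows the order in which the columns are listed above; neither is taken from the package itself.

```r
library(data.table)

# Toy input with one query/subject pair (all values are made up for illustration).
complete_tbl <- data.table(
  query_id    = "query_gene_1",
  subject_id  = "subject_gene_1",
  query_cds   = "ATGAAACTG",      # codes for M K L
  subject_cds = "ATGAAGGCTCTG",   # codes for M K A L
  query_aa    = "MKL",
  subject_aa  = "MKAL"
)

# Sanity check: each CDS should be three times as long as its protein.
cds_matches_aa <- complete_tbl[, nchar(query_cds) == 3 * nchar(query_aa) &
                                 nchar(subject_cds) == 3 * nchar(subject_aa)]
```

A check like cds_matches_aa can catch CDS and protein sequences that do not correspond before any alignment tool is called.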

aa_aln_type

a character string specifying the amino acid alignment type, either "multiple" or "pairwise". Default is aa_aln_type = "multiple".

aa_aln_tool

a character string specifying the alignment tool that shall be used for the amino acid alignments (multiple or pairwise, depending on aa_aln_type).

codon_aln_tool

a character string specifying the codon alignment tool that shall be used for codon alignments. Default is codon_aln_tool = "pal2nal".

dnds_estimation

a character string specifying the dNdS estimation method, e.g. "Comeron". See Details for all options.

store_locally

a logical value indicating whether or not alignment files shall be stored locally rather than in tempdir().

quiet

a logical value specifying whether a successful interface call shall be printed out.

cores

a numeric value specifying the number of cores that shall be used to perform parallel computations on a multicore machine.
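One possible way to choose a value for cores is sketched below using the base R parallel package. Leaving one core free is only a common convention assumed here, not a recommendation made by this function.

```r
# Number of cores reported by the system; detectCores() can return NA.
available <- parallel::detectCores()

# Fall back to a single core if detection fails; otherwise leave one core free.
cores <- if (is.na(available)) 1 else max(1, available - 1)
```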

clean_folders

a logical value specifying whether all internal folders storing the output of the programs used shall be removed. Default is clean_folders = FALSE.

Details

This function takes the amino acid and CDS sequences of two orthologous genes and writes them as fasta files into the internal folder environment. The two resulting fasta files store the amino acid sequences of the query_id and subject_id (file one) and the CDS sequences of the query_id and subject_id (file two). These fasta files are then passed through the following pipeline:

1) Multiple-Alignment or Pairwise-Alignment of query amino acid sequence and subject amino acid sequence

2) Codon-Alignment of the amino acid alignment returned by 1) and query CDS sequence + subject CDS sequence

3) dNdS estimation of the codon alignment returned by 2)
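To make step 2) more concrete, the sketch below shows the general idea behind a codon alignment: each residue of the aligned protein is replaced by its codon, and each gap by three gap characters. The helper thread_codons() is hypothetical and written only for illustration in base R; it is not part of this package and does not reproduce what pal2nal does in detail (for example, it ignores stop codons and frameshifts).

```r
# Hypothetical helper: thread a CDS onto one row of a gapped amino acid alignment.
thread_codons <- function(aa_aligned, cds) {
  # Split the CDS into consecutive codons (triplets).
  starts <- seq(1, nchar(cds), by = 3)
  codons <- substring(cds, starts, starts + 2)

  # Split the aligned protein into single characters (residues and gaps).
  residues <- strsplit(aa_aligned, "")[[1]]

  # Every non-gap residue must have exactly one codon.
  stopifnot(sum(residues != "-") == length(codons))

  # Gaps become "---"; residues are replaced by their codons in order.
  out <- character(length(residues))
  out[residues == "-"] <- "---"
  out[residues != "-"] <- codons
  paste(out, collapse = "")
}

# Toy protein alignment of "MKL" against "MKAL" (invented sequences).
query_codon_aln   <- thread_codons("MK-L", "ATGAAACTG")
subject_codon_aln <- thread_codons("MKAL", "ATGAAGGCTCTG")
```

Both rows of the resulting codon alignment have the same length, which is what a dNdS estimation method in step 3) needs in order to compare the sequences codon by codon.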

References

http://www.r-bloggers.com/the-wonders-of-foreach/

Author

Hajk-Georg Drost