R/quality.filter.meta.R
quality.filter.meta.Rd
This function takes the file paths to the genomes folder and the LTRpred.meta output folder as input and eliminates false positive retrotransposon predictions on a metagenomic scale.
quality.filter.meta( kingdom, genome.folder, ltrpred.meta.folder, sim, cut.range = 2, n.orfs, strategy, update = FALSE )
| Argument | Description |
|---|---|
| kingdom | a character string specifying the kingdom of life to which genomes annotated with |
| genome.folder | path to folder storing the genome assembly files that were used for |
| ltrpred.meta.folder | path to folder storing the |
| sim | LTR similarity threshold. Only putative LTR transposons that fulfill this LTR similarity threshold will be retained. |
| cut.range | a numeric value giving the interval size used to bin LTR similarities. |
| n.orfs | minimum number of ORFs detected in the putative LTR transposon. |
| strategy | quality filter strategy. Options are |
| update | shall already existing |
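
The binning controlled by `cut.range` can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the package's internal code: the similarity values, the variable names, and the choice of starting the bins at the `sim` threshold and ending at 100 are all hypothetical.

```r
# Hypothetical LTR similarity values (in percent) for five predicted elements
ltr_similarity <- c(80.5, 83.2, 91.0, 97.4, 99.9)

sim <- 80        # assumed similarity threshold, used here as the lowest bin edge
cut_range <- 2   # bin width, mirroring the default cut.range = 2

# Bin edges from the threshold up to 100 in steps of cut_range
breaks <- seq(sim, 100, by = cut_range)

# Assign each element to a half-open interval [lower, upper);
# include.lowest = TRUE closes the last interval so that 100 is included
sim_bins <- cut(ltr_similarity, breaks = breaks,
                right = FALSE, include.lowest = TRUE)

# Count the elements falling into each similarity bin
bin_counts <- table(sim_bins)
print(bin_counts)
```

With a bin width of 2 and a threshold of 80, this yields ten bins; a larger `cut.range` gives fewer, coarser bins.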
A list with two list elements, sim_file and gm_file. Each list element stores a data.frame:
sim_file (similarity file)
gm_file (genome metrics file)
Quality Control
ltr.similarity: minimum similarity between LTRs. All TEs not matching this criterion are discarded.
n.orfs: minimum number of Open Reading Frames that must be found between the LTRs. All TEs not matching this criterion are discarded.
PBS or Protein Match: elements must have either a predicted Primer Binding Site or a match to at least one protein (Gag, Pol, Rve, ...) between their LTRs. All TEs not matching this criterion are discarded.
The relative number of N's (= nucleotide not known) in the TE must be <= 0.1. The relative number of N's is computed as follows: absolute number of N's in TE / width of TE.
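
The four criteria above can be sketched on a toy table in base R. This is a minimal illustration, not the package's implementation: the data.frame, its column names (`ltr_similarity`, `orfs`, `has_pbs`, `protein_match`, `sequence`), and the threshold values are hypothetical, and treating the similarity and ORF thresholds as inclusive (`>=`) is an assumption.

```r
# Toy prediction table; all column names and values are hypothetical stand-ins
pred <- data.frame(
  id             = c("TE1", "TE2", "TE3", "TE4", "TE5", "TE6"),
  ltr_similarity = c(98, 85, 99, 97, 96, 99),
  orfs           = c(2, 1, 0, 3, 1, 1),
  has_pbs        = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  protein_match  = c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE),
  sequence       = c("ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC",
                     "ACGTACGTAC", "ACNNNGTACG", "ACGTNACGTA"),
  stringsAsFactors = FALSE
)

# Relative number of N's = absolute number of N's in TE / width of TE
count_n <- function(s) sum(strsplit(s, "", fixed = TRUE)[[1]] %in% c("N", "n"))
pred$n_fraction <- vapply(pred$sequence, count_n, numeric(1),
                          USE.NAMES = FALSE) / nchar(pred$sequence)

sim    <- 95  # assumed LTR similarity threshold
n_orfs <- 1   # assumed minimum number of ORFs

# Keep only elements that satisfy all four criteria
keep <- pred$ltr_similarity >= sim &
  pred$orfs >= n_orfs &
  (pred$has_pbs | pred$protein_match) &
  pred$n_fraction <= 0.1

filtered <- pred[keep, ]
print(filtered$id)
```

In this toy table, TE2 fails the similarity threshold, TE3 has no ORF, TE4 has neither a PBS nor a protein match, and TE5 contains too many N's (3 of 10 positions), so only TE1 and TE6 are retained.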
Hajk-Georg Drost