R/extract_upstream_promotor_seqs.R
    extract_upstream_promotor_seqs.Rd
Given a genome assembly file and a corresponding annotation file, users can retrieve the upstream promotor sequences of all genes in a genome.
extract_upstream_promotor_seqs(
  organism,
  genome_file,
  annotation_file,
  annotation_format,
  file_name = NULL,
  promotor_width,
  replaceUnstranded = "+"
)
organism
a character string specifying the scientific name of the organism.
genome_file
file path to the genome assembly file.
annotation_file
file path to the annotation file of the genome assembly, in the format given by annotation_format.
annotation_format
format of the annotation file. Options are:
annotation_format = "gtf"
annotation_format = "gff"
annotation_format = "gff3"
file_name
file path to the output file storing the promotor sequences.
promotor_width
width of the upstream promotor region in bp. The extracted region covers the promotor_width bp immediately upstream of the transcription start site (TSS) of each gene.
replaceUnstranded
controls whether unstranded sequences receive a default strand. The default is replaceUnstranded = "+" (see the usage), which presumably assigns them to the plus strand.
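To make the promotor_width and strand arguments concrete, the following base R sketch computes the coordinates of an upstream window around a TSS. It is an illustration only, not the package's implementation: the helper upstream_window and its replace_unstranded argument are hypothetical, coordinates are assumed to be 1-based and inclusive, and unstranded features (marked "*" or ".") are assumed to be treated as lying on the replacement strand.

```r
# Hypothetical helper (not part of the package): compute the genomic window
# covering `promotor_width` bp immediately upstream of each TSS.
upstream_window <- function(tss, strand, promotor_width,
                            replace_unstranded = "+") {
  # Assumption: unstranded features are assigned the replacement strand
  strand <- ifelse(strand %in% c("*", "."), replace_unstranded, strand)
  on_plus <- strand == "+"

  # On the plus strand, upstream means smaller coordinates;
  # on the minus strand, upstream means larger coordinates.
  start <- ifelse(on_plus, tss - promotor_width, tss + 1)
  end   <- ifelse(on_plus, tss - 1, tss + promotor_width)

  # Clip at the start of the chromosome; the chromosome end is not
  # checked here because this sketch has no sequence lengths available.
  data.frame(strand = strand, start = pmax(start, 1), end = end)
}

genes <- upstream_window(
  tss            = c(5000, 5000, 5000),
  strand         = c("+", "-", "."),
  promotor_width = 1000
)
print(genes)
```

With a TSS at position 5000 and promotor_width = 1000, the plus-strand window spans positions 4000 to 4999 and the minus-strand window spans 5001 to 6000. A sequence taken from a minus-strand window would additionally need to be reverse-complemented to read in the gene's orientation.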
This function extracts genomic sequences of a specified promotor_width upstream of the transcription start sites of all genes annotated in the corresponding annotation_file. The promotor sequences are then
if (FALSE) {
# download genome assembly of Arabidopsis lyrata
Aly_genome <- biomartr::getGenome(db = "refseq", 
                                 organism = "Arabidopsis lyrata",
                                 path = file.path("refseq", "genome"),
                                 gunzip = TRUE)
# download annotation file of genome assembly of Arabidopsis lyrata
Aly_gff <- biomartr::getGFF(db = "refseq", 
                           organism = "Arabidopsis lyrata",
                           path = file.path("refseq", "annotation"),
                           gunzip = TRUE)
                           
# retrieve upstream promotor sequences of length 1000 bp
promotor_seqs <- extract_upstream_promotor_seqs(
                               organism = "Arabidopsis lyrata",
                               genome_file = Aly_genome,
                               annotation_file = Aly_gff,
                               annotation_format = "gff",
                               promotor_width = 1000)
}