Description Usage Arguments Details Value Author(s) See Also Examples
View source: R/creationFunctions.R
Wrapper around matchPDict that will accept a list of sequences and check if they are present in a given genome. Takes a long time to run.
1 2 3 4 | findSequenceInGenome(sequences,
genome="BSgenome.Hsapiens.UCSC.hg19", verbose=TRUE,
directions=c("matchForwardSense", "matchForwardAntisense",
"matchReverseSense", "matchReverseAntisense"))
|
sequences |
vector of character strings to scan. Should only contain A, C, G and T. Will be converted to DNAString. |
genome |
character string with the name of the BSGenome in which sequences should be found. Defaults to the human genome. |
verbose |
TRUE or FALSE. |
directions |
character string with elements from c("matchForwardSense", "matchForwardAntisense", "matchReverseSense", "matchReverseAntisense"). Defines which directions (complementary and reverse mirrorings) that should be scanned. Defaults to all directions. |
This function will take quite a while to run, so if you have a many sequences, overnight runs are recommended. BSgenome contains some alternative versions of chromosomes. They are marked with an underscore. This function automatically disregards chromosome names with an underscore, and this is known to work for the human genome. Nevertheless, check the output printed to terminal if all chromosomes are included.
A data frame with a row for each identified match. Columns are "entrynumber","hitposInChr","chr", and "sequence" describing, respectively: the index of the sequence match, the position in the chromosome at which it was found, which chromosome it was found on, the sequence itself
Lasse Folkersen
BSgenome
,
matchPDict
,
excludeDoubleMatchingProbes
1 2 3 4 5 | ## Not run:
#you can run this, but it takes quite a lot of time
example<-findSequenceInGenome("CTGGCGAGCAGCGAATAATGGTTT")
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.