Basic4Cseq can create virtual fragment libraries from any BSgenome package or DNAString object. Two restriction enzymes have to be specified to cut the DNA, and the read length is needed to check fragment ends of the corresponding length for uniqueness. Filter options (minimum and maximum size) are provided at the fragment level and at the fragment end level.
createVirtualFragmentLibrary(chosenGenome, firstCutter, secondCutter, readLength, onlyNonBlind = TRUE, useOnlyIndex = FALSE, minSize = 0, maxSize = -1, minFragEndSize = 0, maxFragEndSize = 10000000, useAllData = TRUE, chromosomeName = "chr1", libraryName = "default")
chosenGenome
The genome that is to be digested in silico with the provided enzymes; can be an instance of BSgenome or DNAString
firstCutter
First of two restriction enzymes
secondCutter
Second of two restriction enzymes
readLength
Read length for the experiment
onlyNonBlind
Variable that is TRUE (default) if only non-blind fragments are considered (i.e. all blind fragments are removed)
useOnlyIndex
Convenience option that adapts the annotation style of the chromosomes ("chr1", ..., "chrY" or "1", ..., "Y"); the parameter has to be set to match the BAM file in question
minSize
Filter option that removes fragments below a certain size (in bp)
maxSize
Filter option that removes fragments above a certain size (in bp)
minFragEndSize
Filter option that removes fragment ends below a certain size (in bp)
maxFragEndSize
Filter option that removes fragment ends above a certain size (in bp)
useAllData
Variable that indicates whether all data of a BSgenome package is to be used. If FALSE, chromosome names including a "_" are removed, reducing the set of chromosomes to (1 ... 19, X, Y, MT) for the mouse genome or (1 ... 22, X, Y, MT) for the human genome
chromosomeName
Chromosome name for the virtual fragment library if a DNAString object is provided as chosenGenome
libraryName
Name of the file the created virtual fragment library is written to. By default the file is called "fragments_firstCutter_secondCutter.csv". The fragment data is returned as a data frame if and only if an empty character string is chosen as libraryName
readLength is relevant for the creation of the virtual fragment library to differentiate between unique and non-unique fragment ends. While two fragments can be unique, their respective ends may be repetitive if only the first few bases are considered. For 4C-seq data, reads can only map to the start (or end, respectively) of a 4C-seq fragment; the remaining part of the fragment is not covered. The length of a fragment end that has to be checked for uniqueness therefore depends on the read length of the experiment.
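The dependence of fragment end uniqueness on the read length can be illustrated with a minimal base R sketch. The sequences and the helper `uniqueFragmentStarts` are hypothetical and not part of Basic4Cseq; the sketch only compares fragment starts within a toy set, whereas the package checks the fragment ends of the whole library.

```r
# Toy fragment sequences (hypothetical data, not from a real digest)
fragments <- c(
  fragA = "CATGAAACCCGGGTTTACGTGTAC",
  fragB = "CATGAAACTTTTGGGCCCAAGTAC",
  fragC = "CATGTTTTCCCCGGGGAAAAGTAC"
)

# Hypothetical helper: TRUE for every fragment whose first `readLength`
# bases occur only once among all fragment starts in `seqs`
uniqueFragmentStarts <- function(seqs, readLength) {
  ends <- substr(seqs, 1, readLength)
  isUnique <- !(ends %in% ends[duplicated(ends)])
  names(isUnique) <- names(seqs)
  isUnique
}

# fragA and fragB share their first 8 bases but differ within the first 12
shortReads <- uniqueFragmentStarts(fragments, readLength = 8)
longReads <- uniqueFragmentStarts(fragments, readLength = 12)
```

In this toy set, a read length of 8 leaves fragA and fragB indistinguishable, while a read length of 12 makes all three fragment starts unique.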
useAllData uses the lengths of the chromosomes to identify relevant ones, based on the current BSgenome packages for mm10 or hg19, and may therefore give undesirable results for smaller genomes with different chromosome lengths (e.g. by discarding all chromosomes).
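The name-based part of this selection (dropping chromosome names that include a "_", as described for the useAllData argument) can be sketched in base R. The chromosome names below are illustrative assumptions, and the sketch does not reproduce the package's internal selection logic.

```r
# Illustrative chromosome names in the style of a UCSC-based BSgenome package
chromosomeNames <- c("chr1", "chr2", "chrX", "chrY", "chrM",
                     "chr1_GL456210_random", "chrUn_GL456239")

# Keep only names without an underscore, i.e. drop unplaced and random contigs
mainChromosomes <- chromosomeNames[!grepl("_", chromosomeNames, fixed = TRUE)]
```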
The length of a fragment influences the expected read count of a 4C-seq fragment. By default, Basic4Cseq uses the experiment's read length as the minimum fragment end size and places virtually no limit on the maximum fragment end size.
A tab-separated file with the specified virtual fragment library (containing fragment position, length, presence of second restriction enzyme and uniqueness of the fragment ends)
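To make the listed library contents concrete, the following base R sketch digests a toy sequence with the two cutters and records position, length and presence of the second restriction site per fragment. The sequence, the column names and the coordinate convention (a fragment spans from one first-cutter site through the next one) are assumptions for illustration only; they are not the package's actual implementation or file layout.

```r
# Toy sequence (hypothetical), assembled piecewise so the sites are easy to see
dnaSequence <- paste0("tt", "catg", "aaa", "gtac", "ccc", "catg",
                      "tttttt", "catg", "aa", "gtac", "aa")
firstCutter <- "catg"
secondCutter <- "gtac"

# Start positions of all first-cutter sites
# (assumes at least two sites are present in the sequence)
firstSites <- as.integer(gregexpr(firstCutter, dnaSequence, fixed = TRUE)[[1]])

# Each fragment runs from one first-cutter site to the end of the next one
fragmentStart <- head(firstSites, -1)
fragmentEnd <- tail(firstSites, -1) + nchar(firstCutter) - 1
fragmentSeq <- substring(dnaSequence, fragmentStart, fragmentEnd)

toyLibrary <- data.frame(
  start = fragmentStart,
  end = fragmentEnd,
  length = fragmentEnd - fragmentStart + 1,
  # A fragment without a second-cutter site is a blind fragment
  isNonBlind = grepl(secondCutter, fragmentSeq, fixed = TRUE)
)
```

Here the first fragment contains a second-cutter site and is non-blind, while the second fragment is blind and would be removed with onlyNonBlind = TRUE.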
It is strongly recommended to preprocess and store the virtual fragment library if a number of experiments with the same restriction enzyme combination, read length and underlying genome have to be analyzed.
Processing one of the larger BSgenome packages takes some time and storage space.
If no library name for the virtual fragment library is specified, the fragment data is returned as a data frame. If the library name "default" is chosen, the tab-separated file is named "fragments_firstCutter_secondCutter" (with variable cutter sequences).
Carolin Walter
if(interactive()) {
    library(BSgenome.Ecoli.NCBI.20080805)
    fragmentData = createVirtualFragmentLibrary(chosenGenome = Ecoli$NC_002655, firstCutter = "catg", secondCutter = "gtac", readLength = 30, onlyNonBlind = TRUE, chromosomeName = "NC_002655", libraryName = "fragments_Ecoli.csv")
}