read_bedgraphs | R Documentation |
Versatile BedGraph reader.
read_bedgraphs(
files = NULL,
pipeline = NULL,
zero_based = TRUE,
stranded = FALSE,
collapse_strands = FALSE,
ref_cpgs = NULL,
ref_build = NULL,
contigs = NULL,
vect = FALSE,
vect_batch_size = NULL,
coldata = NULL,
chr_idx = NULL,
start_idx = NULL,
end_idx = NULL,
beta_idx = NULL,
M_idx = NULL,
U_idx = NULL,
strand_idx = NULL,
cov_idx = NULL,
synced_coordinates = FALSE,
n_threads = 1,
h5 = FALSE,
h5_dir = NULL,
h5temp = NULL,
verbose = TRUE
)
files |
bedgraph files. |
pipeline |
Default NULL. Currently supports "Bismark_cov", "MethylDackel", "MethylcTools", "BisSNP", and "BSseeker2_CGmap". If the pipeline is not known, use the idx arguments for manual column assignment. |
zero_based |
Are bedgraph regions zero-based? Default TRUE. |
stranded |
Default FALSE |
collapse_strands |
If TRUE, collapses CpGs on the Crick strand into the Watson strand. Default FALSE. |
ref_cpgs |
BSgenome object, or name of the installed BSgenome package, or an output from |
ref_build |
Reference genome build for the bedgraphs. Default NULL. Used only for additional details; it has no other effect. |
contigs |
Contigs to restrict genomic CpGs to. Defaults to all autosomes and allosomes, ignoring extra contigs. |
vect |
Whether to use vectorized code. Default FALSE. Set to TRUE if you do not have a large number of BedGraph files. |
vect_batch_size |
Default NULL. Process samples in batches. Applicable only when vect = TRUE |
coldata |
An optional DataFrame describing the samples. Row names, if present, become the column names of the matrix. If NULL, then a DataFrame will be created with basename of files used as the row names. |
chr_idx |
column index for chromosome in bedgraph files |
start_idx |
column index for start position in bedgraph files |
end_idx |
column index for end position in bedgraph files |
beta_idx |
column index for beta values in bedgraph files |
M_idx |
column index for read counts supporting Methylation in bedgraph files |
U_idx |
column index for read counts supporting Un-methylation in bedgraph files |
strand_idx |
column index for strand information in bedgraph files |
cov_idx |
column index for total-coverage in bedgraph files |
synced_coordinates |
Are the start and end coordinates of a stranded bedgraph synchronized between the + and - strands? Possible values: FALSE (default), or TRUE if the start coordinates are those of the C on the plus strand. |
n_threads |
Number of threads to use. Default 1. Be careful: memory usage increases linearly with the number of threads. This option does not work on Windows. |
h5 |
Should the coverage and methylation matrices be stored as 'HDF5Array'? Default FALSE. |
h5_dir |
directory to store H5 based object |
h5temp |
temporary directory to store hdf5 |
verbose |
Be a little chatty? Default TRUE. |
Reads BedGraph files and generates methylation and coverage matrices. Optionally, the arrays can be serialized as on-disk HDF5 arrays.
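When no pipeline preset applies, the idx arguments tell the reader which column holds which value. The following base R sketch illustrates the idea on a made-up four-column table. It is not methrix's internal code: the toy data, the `parsed` data frame, and the rule of adding one to zero-based starts are assumptions made for illustration. Coverage is taken as methylated plus unmethylated reads, and beta as the methylated fraction.

```r
# Hypothetical bedgraph content: chromosome, start, methylated reads, unmethylated reads.
bdg <- data.frame(
  V1 = c("chr1", "chr1", "chr2"),
  V2 = c(100L, 250L, 75L),
  V3 = c(8L, 0L, 3L),
  V4 = c(2L, 5L, 1L),
  stringsAsFactors = FALSE
)

# Column positions, playing the role of chr_idx, start_idx, M_idx and U_idx.
chr_idx <- 1
start_idx <- 2
M_idx <- 3
U_idx <- 4

# Assumption for this toy input: starts are zero-based and are shifted to one-based.
zero_based <- TRUE
offset <- if (zero_based) 1L else 0L

parsed <- data.frame(
  chr = bdg[[chr_idx]],
  start = bdg[[start_idx]] + offset,
  M = bdg[[M_idx]],
  U = bdg[[U_idx]],
  stringsAsFactors = FALSE
)

# Total coverage and methylation level derived from the two count columns.
parsed$cov <- parsed$M + parsed$U
parsed$beta <- parsed$M / parsed$cov

print(parsed)
```

Supplying M_idx and U_idx, as here, is enough to derive both coverage and beta; files that carry only beta and coverage columns would instead need beta_idx and cov_idx.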
An object of class methrix
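As a rough picture of the methylation and coverage matrices mentioned above, the base R sketch below aligns two made-up samples against a shared set of reference CpG positions. The names `ref_pos`, `samples`, `cov_mat`, and `beta_mat` are hypothetical, and the real methrix object stores more than these two matrices. The sketch only shows the layout: one row per reference CpG, one column per sample, and NA where a sample has no reads.

```r
# Hypothetical reference CpG positions on a single contig.
ref_pos <- c(101L, 251L, 400L)

# Hypothetical per-sample calls: position, methylated and unmethylated read counts.
samples <- list(
  sampleA = data.frame(pos = c(101L, 251L), M = c(8L, 0L), U = c(2L, 5L)),
  sampleB = data.frame(pos = c(251L, 400L), M = c(3L, 6L), U = c(1L, 0L))
)

# Look up each reference CpG in a sample; unmatched positions yield NA.
cov_mat <- sapply(samples, function(s) {
  (s$M + s$U)[match(ref_pos, s$pos)]
})
beta_mat <- sapply(samples, function(s) {
  (s$M / (s$M + s$U))[match(ref_pos, s$pos)]
})

# Label rows with the reference positions.
rownames(cov_mat) <- ref_pos
rownames(beta_mat) <- ref_pos

print(cov_mat)
print(beta_mat)
```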
## Not run:
# List the example bedGraph files shipped with methrix
bdg_files <- list.files(path = system.file('extdata', package = 'methrix'),
  pattern = '\\.bedGraph\\.gz$', full.names = TRUE)
# Extract reference CpGs from the hg19 BSgenome package
hg19_cpgs <- methrix::extract_CPGs(ref_genome = 'BSgenome.Hsapiens.UCSC.hg19')
# Read the files, assigning columns manually via the idx arguments
meth <- methrix::read_bedgraphs(files = bdg_files, ref_cpgs = hg19_cpgs,
  chr_idx = 1, start_idx = 2, M_idx = 3, U_idx = 4,
  stranded = FALSE, zero_based = FALSE, collapse_strands = FALSE)
## End(Not run)
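The example above leaves collapse_strands = FALSE. To illustrate what collapsing the Crick strand into the Watson strand can mean for stranded input, here is a minimal base R sketch. It assumes unsynchronized coordinates, where the minus-strand cytosine of a CpG sits one base after the plus-strand cytosine. The data and the `collapsed` table are made up, and methrix's own implementation may differ in detail.

```r
# Hypothetical stranded calls: each "-" row lies one base after its "+" partner.
calls <- data.frame(
  chr    = c("chr1", "chr1", "chr1", "chr1"),
  start  = c(101L, 102L, 251L, 252L),
  strand = c("+", "-", "+", "-"),
  M      = c(8L, 4L, 0L, 1L),
  U      = c(2L, 1L, 5L, 3L),
  stringsAsFactors = FALSE
)

# Move Crick ("-") positions onto the Watson ("+") cytosine of the same CpG.
minus <- calls$strand == "-"
calls$start[minus] <- calls$start[minus] - 1L

# Sum the read counts from both strands for each CpG.
collapsed <- aggregate(cbind(M, U) ~ chr + start, data = calls, FUN = sum)
collapsed$beta <- collapsed$M / (collapsed$M + collapsed$U)

print(collapsed)
```

With synchronized coordinates, both strands would already report the plus-strand position, so the shift step would be skipped.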