Versatile BedGraph reader.
read_beds(
  files,
  ref_cpgs = NULL,
  colData = NULL,
  genome_name = "hg19",
  batch_size = min(20, length(files)),
  n_threads = 1,
  h5 = FALSE,
  h5_dir = NULL,
  h5_temp = NULL,
  desc = NULL,
  verbose = TRUE,
  zero_based = FALSE,
  replace = FALSE,
  fill = TRUE,
  pipeline = c("Custom", "Bismark_cov", "MethylDackel", "MethylcTools", "BisSNP",
    "BSseeker2_CGmap"),
  stranded = FALSE,
  strand_collapse = FALSE,
  chr_idx = NULL,
  start_idx = NULL,
  end_idx = NULL,
  beta_idx = NULL,
  M_idx = NULL,
  U_idx = NULL,
  strand_idx = NULL,
  cov_idx = NULL
)
files | list of strings; file paths of the BED files
ref_cpgs | data.table; list of CpG sites in the tab-delimited format of chr-start-end. Must use zero-based genomic coordinates.
colData | list of strings; sample names. Derived from the filenames if not provided.
genome_name | string; name of the genome. Default "hg19"
batch_size | integer; maximum number of files to hold in memory at once. Default is 20, or the number of files if there are fewer than 20.
n_threads | integer; number of threads to use. Default 1. Be careful: memory usage increases linearly with the number of threads. This option does not work on Windows.
h5 | boolean; whether the coverage and methylation matrices should be stored as on-disk HDF5 arrays. Default FALSE
h5_dir | string; directory in which to store the H5-based object. This can be NULL, and the experiment can be saved manually later.
h5_temp | string; temporary directory in which to store HDF5 files
desc | string; description of the experiment
verbose | boolean; whether to output messages. Default TRUE
zero_based | boolean; whether the input data is zero-based. Default FALSE
replace | boolean; whether to delete the contents of h5_dir before saving. Default FALSE
fill | boolean; whether to fill the output matrices with all CpGs in ref_cpgs. This must be TRUE for HDF5-based experiments.
pipeline | string; pipeline that produced the input files. Currently supports "Bismark_cov", "MethylDackel", "MethylcTools", "BisSNP" and "BSseeker2_CGmap". If the pipeline is not known, leave this as "Custom" and use the idx arguments for manual column assignment.
stranded | boolean; whether the input data is stranded. Default FALSE
strand_collapse | boolean; whether to collapse the Crick strand into the Watson strand. Default FALSE
chr_idx | integer; column index for chromosome in bedgraph files
start_idx | integer; column index for start position in bedgraph files
end_idx | integer; column index for end position in bedgraph files
beta_idx | integer; column index for beta values in bedgraph files
M_idx | integer; column index for read counts supporting methylation in bedgraph files
U_idx | integer; column index for read counts supporting un-methylation in bedgraph files
strand_idx | integer; column index for strand information in bedgraph files
cov_idx | integer; column index for total coverage in bedgraph files
Reads BED files and generates methylation matrices. Optionally, the arrays can be serialized as on-disk HDF5 arrays.
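The batch_size argument caps how many files are held in memory at once. As a rough illustration only (this is not the package's internal code, and the file names are hypothetical), a file list can be split into batches of at most batch_size like this:

```r
# Hypothetical file names, used only to illustrate batching.
files <- sprintf("cell_%02d.bedgraph", 1:45)

# Same expression as the documented default for batch_size.
batch_size <- min(20, length(files))

# Assign each file to a batch number, then split the vector by that number.
batches <- split(files, ceiling(seq_along(files) / batch_size))

# Number of files in each batch.
lengths(batches)
```

With 45 files and the default setting, this gives two full batches and one smaller final batch.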
colData should be supplied as a data.table with a header and a column called "Sample" whose values match the input filenames. Other columns may be added to hold relevant metadata (e.g. cell type, collection date). On input, colData is left-joined onto the input files, so it may contain rows for samples that are not included in the analysis. The data is updated on any relevant subsets, merges, etc.
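The left-join behaviour can be pictured with a small base R sketch. It is not the package's implementation: it uses data.frame and merge() rather than data.table, the file names and the CellType column are hypothetical, and deriving sample names by stripping the file extension is an assumption.

```r
# Hypothetical input files; sample names are assumed to be the
# file names without their extension.
files <- c("cellA.bedgraph", "cellB.bedgraph")
samples <- data.frame(Sample = tools::file_path_sans_ext(basename(files)))

# Metadata table with one extra sample ("cellC") that has no input file.
col_data <- data.frame(
  Sample   = c("cellA", "cellB", "cellC"),
  CellType = c("neuron", "glia", "neuron")
)

# Left join: keep every input sample, attach metadata where available.
joined <- merge(samples, col_data, by = "Sample", all.x = TRUE, sort = FALSE)
print(joined)
```

The row for "cellC" is dropped because no input file corresponds to it, which is why extra rows in colData are harmless.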
The function assumes that the first input file contains the maximum methylation score. It is extremely unlikely that this assumption will be violated.
An object of class scMethrix
## Not run:
# Do Nothing
## End(Not run)
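Because the packaged example does nothing, the following sketch shows what a call with manual column assignment might look like. It relies only on the argument names in the usage above. The toy file, its column layout, and the choice of M_idx and U_idx are assumptions, and the call has not been checked against the package, so treat it as a starting point rather than a verified recipe.

```r
# Write a tiny tab-delimited file with hypothetical columns:
# chr, start, end, methylated reads, unmethylated reads.
bed_file <- file.path(tempdir(), "cell1.bedgraph")
toy <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 250, 75),
  end   = c(101, 251, 76),
  M     = c(3, 0, 5),
  U     = c(1, 4, 0)
)
write.table(toy, bed_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Only attempt the call when the package is installed.
if (requireNamespace("scMethrix", quietly = TRUE)) {
  scm <- scMethrix::read_beds(
    files     = bed_file,
    pipeline  = "Custom",   # unknown pipeline: assign columns manually
    chr_idx   = 1,
    start_idx = 2,
    end_idx   = 3,
    M_idx     = 4,
    U_idx     = 5,
    h5        = FALSE       # keep the matrices in memory
  )
}
```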