gpatterns.import_from_bam: Create a track from BAM files.

View source: R/import.R


Create a track from BAM files.

Description

Creates a track from BAM files.

Usage

gpatterns.import_from_bam(
  bams,
  workdir = NULL,
  track = NULL,
  steps = "all",
  paired_end = TRUE,
  cgs_mask_file = NULL,
  trim = NULL,
  umi1_idx = NULL,
  umi2_idx = NULL,
  use_seq = FALSE,
  only_seq = FALSE,
  frag_intervs = NULL,
  maxdist = 0,
  rm_off_target = TRUE,
  add_chr_prefix = FALSE,
  bismark = FALSE,
  nbins = nrow(gintervals.all()),
  groot = GROOT,
  import_raw_tcpgs = FALSE,
  use_sge = FALSE,
  max_jobs = 400,
  parallel = getOption("gpatterns.parallel"),
  cmd_prefix = "",
  run_per_interv = TRUE,
  min_qual = 20,
  ...
)

Arguments

bams

character vector with the paths of the BAM files

workdir

directory in which the files will be saved (provide a full path)

track

name of the track to generate

steps

steps of the pipeline to run ('all', the default, runs the full pipeline). Possible options are: 'bam2tidy_cpgs', 'filter_dups', 'bind_tidy_cpgs', 'pileup', 'pat_freq'

paired_end

BAM files are paired-end, with R1 and R2 interleaved

cgs_mask_file

comma-separated file with positions of CpGs to mask (e.g. MspI sticky ends). Must have chrom and start fields giving the position of the 'C' in each CpG to mask (see the sketch after the argument list)

trim

trim CpGs that are within trim bp of the beginning/end of the read

umi1_idx

position of umi1 in the index (0-based)

umi2_idx

position of umi2 in the index (0-based)

use_seq

use UMI sequence (not only position) to filter duplicates

only_seq

use only UMI sequence (without positions) to filter duplicates

frag_intervs

intervals set of the fragments to which read positions are changed

maxdist

maximal distance from fragments

rm_off_target

if TRUE, remove reads whose distance from frag_intervs exceeds maxdist; if FALSE, leave those reads unchanged

add_chr_prefix

add a "chr" prefix to chromosome names (in order to import into misha)

bismark

the BAM was aligned using Bismark

nbins

number of genomic bins into which the analysis is split

groot

root of the misha genomic database in which to save the tracks

import_raw_tcpgs

import raw tidy CpGs to misha (without filtering duplicates)

use_sge

use Sun Grid Engine (SGE) for parallelization

max_jobs

maximal number of jobs for SGE parallelization

parallel

parallelize using threads (number of threads is determined by gpatterns.set_parallel)

cmd_prefix

prefix to prepend to 'system' commands (e.g. 'source ~/.bashrc')

run_per_interv

run the bam2tidy_cpgs scripts separately for each interval

min_qual

minimal base quality

...

additional parameters passed to gpatterns.import_from_tidy_cpgs
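
The following is a minimal sketch of how a cgs_mask_file could be prepared; the file name, chromosome names, and positions are hypothetical and only illustrate the expected chrom/start layout:

# hypothetical mask table: one row per 'C' position to mask (e.g. MspI sticky ends)
mask <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(10468, 135087, 96543)
)
write.csv(mask, "cgs_mask.csv", row.names = FALSE, quote = FALSE)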

Value

if 'stats' is one of the steps, a data frame with statistics; otherwise, none.
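
Examples

A minimal sketch of a typical call, assuming interleaved paired-end BAM input and an existing misha database; the file path, working directory, track name, and database root below are hypothetical:

## Not run: 
gpatterns.import_from_bam(
  bams = "sample1.bam",
  workdir = "/full/path/to/workdir",
  track = "my_experiment.sample1",
  groot = "/path/to/misha_db",
  steps = "all",
  paired_end = TRUE
)
## End(Not run)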

