View source: R/contactsUMI4C.R
Takes demultiplexed FastQ files as input and performs all the steps needed to produce a tsv file listing the restriction enzyme fragments and the number of UMIs supporting each fragment's contact with the viewpoint (bait) of interest.
contactsUMI4C(
  fastq_dir,
  wk_dir,
  file_pattern = NULL,
  bait_seq,
  bait_pad,
  res_enz,
  cut_pos,
  digested_genome,
  bowtie_index,
  threads = 1,
  numb_reads = 1e+11,
  rm_tmp = TRUE,
  min_flen = 20,
  filter_bp = 1e+07,
  ref_gen,
  sel_seqname = NULL
)
fastq_dir | 
 Path of the directory containing the FastQ files (compressed or uncompressed).  | 
wk_dir | 
 Working directory in which to save the outputs generated by the UMI-4C analysis.  | 
file_pattern | 
 Character that can be used to filter the files you want to analyze in the fastq_dir.  | 
bait_seq | 
 Character containing the bait primer sequence.  | 
bait_pad | 
 Character containing the pad sequence (sequence between the bait primer and the restriction enzyme sequence).  | 
res_enz | 
 Character containing the restriction enzyme sequence.  | 
cut_pos | 
 Numeric indicating the zero-based nucleotide position where the restriction enzyme cuts (for example, 0 for DpnII).  | 
digested_genome | 
 Path for the digested genome file generated using the digestGenome function.  | 
bowtie_index | 
 Path and prefix of the bowtie index to use for the alignment.  | 
threads | 
 Number of threads to use in the analysis. Default=1.  | 
numb_reads | 
 Number of lines from the FastQ file to load in each loop. If you run into memory problems, set it to a smaller number. Default=1e+11.  | 
rm_tmp | 
 Logical indicating whether to remove temporary files (sam and intermediate bams). TRUE or FALSE. Default=TRUE.  | 
min_flen | 
 Minimum fragment length used to select fragments. Default=20.  | 
filter_bp | 
 Integer indicating the number of bp upstream and downstream of the viewpoint to keep for further analysis. Default=1e+07.  | 
ref_gen | 
 A BSgenome object of the reference genome.  | 
sel_seqname | 
 Character with the name of the chromosome on which to search for the viewpoint sequence.  | 
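To make the zero-based convention of cut_pos concrete, here is a minimal base R sketch. The helper cut_sites() is hypothetical and not part of the package. It assumes cut_pos counts how many bases of the recognition sequence lie to the left of the cut, which is consistent with the DpnII example (cut_pos = 0, cut before GATC) but is not spelled out on this page.

```r
# Hypothetical helper (not a package function): for each occurrence of the
# restriction site, return how many bases of `seq` lie to the left of the cut.
# Assumption: `cut_pos` is the zero-based offset of the cut within the site.
cut_sites <- function(seq, res_enz, cut_pos) {
    # gregexpr() returns the 1-based start of every match, or -1 if none
    starts <- gregexpr(res_enz, seq, fixed = TRUE)[[1]]
    if (starts[1] == -1) {
        return(integer(0))
    }
    # Convert to a zero-based offset and shift by the cut position
    as.integer(starts - 1L + cut_pos)
}

seq <- "AAGATCTTTGATCCC"

# DpnII-like case: the cut falls immediately before each GATC
cut_sites(seq, "GATC", cut_pos = 0)
# [1] 2 9
```

With cut_pos = 1 the same call would return 3 and 10, that is, a cut one base into the site.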
This function is a combination of calls to other functions that perform the necessary steps for processing UMI-4C data.
if (interactive()) {
    # Download the example data set
    path <- downloadUMI4CexampleData()

    # Digest the hg19 genome with DpnII (GATC, cut_pos = 0)
    hg19_dpnii <- digestGenome(
        cut_pos = 0,
        res_enz = "GATC",
        name_RE = "DpnII",
        ref_gen = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19,
        out_path = file.path(path, "digested_genome")
    )

    # Directory with the demultiplexed FastQ files for the CIITA viewpoint
    raw_dir <- file.path(path, "CIITA", "fastq")

    # Process the FastQ files into a table of contacts with the viewpoint
    contactsUMI4C(
        fastq_dir = raw_dir,
        wk_dir = file.path(path, "CIITA"),
        bait_seq = "GGACAAGCTCCCTGCAACTCA",
        bait_pad = "GGACTTGCA",
        res_enz = "GATC",
        cut_pos = 0,
        digested_genome = hg19_dpnii,
        bowtie_index = file.path(path, "ref_genome", "ucsc.hg19.chr16"),
        threads = 1,
        numb_reads = 10e10,
        ref_gen = BSgenome.Hsapiens.UCSC.hg19::BSgenome.Hsapiens.UCSC.hg19,
        sel_seqname = "chr16"
    )

    # Remove the downloaded example data
    unlink(path, recursive = TRUE)
}
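The example above does not use file_pattern. As a rough sketch of how such a filter might behave, the snippet below selects files with base R's list.files(). It assumes the pattern is matched against file names as a regular expression, which this page does not confirm; the directory and file names are made up for illustration.

```r
# Create a throwaway directory with made-up FastQ file names
fastq_dir <- file.path(tempdir(), "fastq_demo")
dir.create(fastq_dir, showWarnings = FALSE)
demo_files <- c(
    "ctrl_R1.fastq.gz", "ctrl_R2.fastq.gz",
    "treat_R1.fastq.gz", "treat_R2.fastq.gz"
)
invisible(file.create(file.path(fastq_dir, demo_files)))

# Assumption: file_pattern acts like the `pattern` argument of list.files()
file_pattern <- "^ctrl"
selected <- list.files(fastq_dir, pattern = file_pattern)
print(selected)

# Clean up the demo directory
unlink(fastq_dir, recursive = TRUE)
```

Previewing a pattern this way before a long run is a cheap check that only the intended samples would be picked up.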