View source: R/contactsUMI4C.R
prepUMI4C | R Documentation |
Prepares the FastQ files for further analysis by selecting the reads that contain the bait and adding the corresponding UMI identifier to the header of each read.
prepUMI4C(
  fastq_dir,
  wk_dir,
  file_pattern = NULL,
  bait_seq,
  bait_pad,
  res_enz,
  numb_reads = 1e+09
)
fastq_dir |
Path of the directory containing the FastQ files (compressed or uncompressed). |
wk_dir |
Working directory in which to save the outputs generated by the UMI-4C analysis. |
file_pattern |
Character that can be used to filter the files you want
to analyze in the |
bait_seq |
Character containing the bait primer sequence. |
bait_pad |
Character containing the pad sequence (sequence between the bait primer and the restriction enzyme sequence). |
res_enz |
Character containing the restriction enzyme sequence. |
numb_reads |
Number of lines from the FastQ file to load in each iteration. If you run into memory problems, reduce this value. Default: 1e9. |
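The arguments above can be illustrated with a minimal base R sketch. It is not the package's internal code. It assumes that a read "contains the bait" when its sequence starts with bait_seq, bait_pad and res_enz joined in that order, and it reads a toy FastQ file numb_reads lines at a time to show why a smaller value lowers memory use. The variable names bait_prefix and kept and the toy reads are hypothetical.

```r
# Hypothetical sketch; prepUMI4C() may filter reads differently.
bait_seq <- "GGACAAGCTCCCTGCAACTCA"
bait_pad <- "GGACTTGCA"
res_enz  <- "GATC"

# Assumption: reads of interest start with bait primer + pad + restriction site.
bait_prefix <- paste0(bait_seq, bait_pad, res_enz)

# Toy FastQ file with two records (4 lines each); only read1 carries the bait.
fq <- tempfile(fileext = ".fq")
writeLines(c(
  "@read1", paste0(bait_prefix, "ACGTACGT"), "+",
  strrep("I", nchar(bait_prefix) + 8),
  "@read2", "TTTTTTTTTTTTTTTT", "+", strrep("I", 16)
), fq)

# Load `numb_reads` lines per iteration; a multiple of 4 keeps records whole.
numb_reads <- 4
con <- file(fq, "r")
kept <- character(0)
repeat {
  chunk <- readLines(con, n = numb_reads)
  if (length(chunk) == 0) break
  ids  <- chunk[seq(1, length(chunk), by = 4)]  # header lines
  seqs <- chunk[seq(2, length(chunk), by = 4)]  # sequence lines
  kept <- c(kept, ids[startsWith(seqs, bait_prefix)])
}
close(con)

kept  # headers of the reads that passed the bait filter
```

Only one chunk is held in memory at a time, so peak memory in this sketch scales with numb_reads and not with the size of the file.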
Creates a compressed FASTQ file in wk_dir/prep named basename(fastq)).fq.gz, containing the filtered reads with the UMI sequence in the header. A log file with the statistics is also generated in wk_dir/logs named umi4c_stats.txt.
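As a rough sketch of where to look after a run, the following base R snippet builds the two output locations described above and reads them only if they exist. The helper umi4c_output_paths is hypothetical and not part of the package, and the layout of the statistics file is not assumed: it is printed as plain lines.

```r
# Hypothetical helper: returns the output locations named in the documentation.
umi4c_output_paths <- function(wk_dir) {
  list(
    prep_dir   = file.path(wk_dir, "prep"),
    stats_file = file.path(wk_dir, "logs", "umi4c_stats.txt")
  )
}

# Example working directory (assumption: replace with your own wk_dir).
out <- umi4c_output_paths(file.path(tempdir(), "CIITA"))

# List the prepared FastQ files, if the directory has been created.
prep_files <- if (dir.exists(out$prep_dir)) {
  list.files(out$prep_dir, pattern = "\\.fq\\.gz$", full.names = TRUE)
} else {
  character(0)
}

# Print the statistics log as plain text, if it has been written.
if (file.exists(out$stats_file)) {
  print(readLines(out$stats_file))
}
```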
See also: contactsUMI4C.
if (interactive()) {
    path <- downloadUMI4CexampleData(reduced = TRUE)
    raw_dir <- file.path(path, "CIITA", "fastq")

    prepUMI4C(
        fastq_dir = raw_dir,
        wk_dir = file.path(path, "CIITA"),
        bait_seq = "GGACAAGCTCCCTGCAACTCA",
        bait_pad = "GGACTTGCA",
        res_enz = "GATC"
    )
}