intTrim: Read filtration and trimming


View source: R/intTrim.R

Description

Keeps reads that match the primer and linker and discards the rest. Reads are reoriented so that the linker is on the left side, then the linker and primer sequences are trimmed off. The UMI and barcode are added to the read ID, which is simplified but remains informative enough to distinguish each read. The results are saved as FASTA files for further analysis.
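The orient-and-trim step can be illustrated on plain character strings. This is a minimal base-R sketch, not the package's implementation: the helpers `revcomp()` and `trim_one()` are hypothetical, and it assumes that in the linker-left orientation the LTR primer appears as its reverse complement to the right of the genomic insert. It does not cover UMI or barcode handling.

```r
# Reverse complement of a DNA string (A/C/G/T are swapped; other letters pass through).
revcomp <- function(x) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", x), "")[[1]]), collapse = "")
}

# Hypothetical helper: return the sequence between linker and LTR,
# or NA if the read matches in neither orientation (i.e. it is dropped).
trim_one <- function(read, LTR, linker) {
  ltr_rc <- revcomp(LTR)  # assumption: the LTR is seen as its reverse complement
  for (s in c(read, revcomp(read))) {
    i <- as.integer(regexpr(linker, s, fixed = TRUE))
    if (i == -1L) next                       # no linker in this orientation
    rest <- substring(s, i + nchar(linker))  # everything right of the linker
    j <- as.integer(regexpr(ltr_rc, rest, fixed = TRUE))
    if (j == -1L) next                       # no LTR primer downstream
    return(substr(rest, 1, j - 1L))          # genomic insert only
  }
  NA_character_
}
```

Applying `trim_one()` over a character vector with `vapply()` and dropping the `NA` entries gives the filtered, trimmed set under these assumptions.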

Usage

intTrim(
  path2fq1,
  path2fq2,
  path2mg,
  outdir = NULL,
  LTR = "AGTCAGTGTGGAAAATCTCTAGCA",
  linker = "CTCCGCTTAAGGGACT",
  avoidseq1 = "GTGGCGCCCGAACAGGGACTTGAAAGCGAAAGGGAAACCAGAGGAGCTCT",
  avoidseq2 = "GTAGTAGTTCATGTCATCTTATTATTCAGTATTTATAACT"
)
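A call might look like the sketch below. The file names and output folder are hypothetical placeholders, and the example assumes that `intTrim` is exported from the intSiteR package. The call only runs when the package is installed and the input files exist. Here both avoid sequences are set to NULL to switch those filters off; omit them from the list to keep the defaults.

```r
# Hypothetical input files and output folder; replace with your own paths.
args <- list(
  path2fq1  = "sample_R1.fastq",
  path2fq2  = "sample_R2.fastq",
  path2mg   = "sample_merged.fastq",
  outdir    = "trimmed",
  avoidseq1 = NULL,  # disable the first avoid-sequence filter
  avoidseq2 = NULL   # disable the second avoid-sequence filter
)

inputs <- unlist(args[c("path2fq1", "path2fq2", "path2mg")])

# Guarded call: skipped when the package or the input files are missing.
if (requireNamespace("intSiteR", quietly = TRUE) && all(file.exists(inputs))) {
  do.call(intSiteR::intTrim, args)
}
```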

Arguments

path2fq1

the file path to the fq1 file.

path2fq2

the file path to the fq2 file.

path2mg

the file path to the merged fastq file.

outdir

the folder where the output files are written.

LTR

the vector sequence closest to the integrated host genome. The recommended length is 18 to 28 bp. The default value is for a lentiviral vector. If you use another vector, change it to the corresponding sequence.

linker

the sequence of the adaptor linker at the end nearest the genomic part. The default linker is from the INSPIIRED pipeline. If you use another linker, change it to the corresponding sequence.

avoidseq1

Reads containing this sequence (sequence 1) produce false-positive integration site results. The default value is the vector sequence close to the 5' LTR tail. Set to NULL if you don't need it.

avoidseq2

Reads containing this sequence (sequence 2) produce false-positive integration site results. The default value is the vector sequence close to the 3' LTR tail. Set to NULL if you don't need it.
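The effect of the two avoid sequences, including the NULL case, can be sketched in base R. The helper `drop_avoided()` is hypothetical and uses exact substring matching on plain strings; how the package itself matches these sequences (for example, whether mismatches are tolerated) is not stated here.

```r
# Hypothetical helper: drop reads that contain any non-NULL avoid sequence.
drop_avoided <- function(reads, avoidseq1 = NULL, avoidseq2 = NULL) {
  # Keep only the avoid sequences that were actually supplied.
  patterns <- Filter(Negate(is.null), list(avoidseq1, avoidseq2))
  keep <- rep(TRUE, length(reads))
  for (p in patterns) {
    keep <- keep & !grepl(p, reads, fixed = TRUE)  # exact substring match
  }
  reads[keep]
}

reads <- c("AAACCCGGG", "TTTGTGGCGCCTTT", "ACGTACGT")
kept <- drop_avoided(reads, avoidseq1 = "GTGGCGCC", avoidseq2 = NULL)
```

With both arguments set to NULL, the function returns its input unchanged, which mirrors switching the filter off.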


Heath1210/intSiteR documentation built on Dec. 17, 2021, 10:32 p.m.