View source: R/intersect.bedtools.R
intersect.bedtools | R Documentation |
bedtools intersect
This function runs a command line that uses bedtools intersect to intersect one or more .bed files.
intersect.bedtools(
a,
b,
outputFileName = paste(getwd(), "intersected.bed", sep = "/"),
abam = FALSE,
ubam = FALSE,
bed = FALSE,
wa = FALSE,
wb = FALSE,
loj = FALSE,
wo = FALSE,
wao = FALSE,
u = FALSE,
c = FALSE,
C = FALSE,
v = FALSE,
f = NULL,
F. = NULL,
r = FALSE,
e = FALSE,
s = FALSE,
S = FALSE,
split = FALSE,
sorted = FALSE,
g = NULL,
srun = FALSE,
intersect.bedtools.command = paste0("/home/", Sys.getenv("USERNAME"),
"/anaconda3/bin/intersectBed"),
return.command = FALSE,
return.bed = FALSE,
delete.output = FALSE,
run.command = TRUE
)
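The default for intersect.bedtools.command is built from the USERNAME environment variable, which may not be set on every system. The following is a minimal sketch, assuming intersectBed is on the PATH, of locating the executable with base R's Sys.which() so that it can be passed explicitly. The variable name intersectbed_path is only illustrative.

```r
# Look up the intersectBed executable on the PATH (base R).
# Sys.which() returns "" when the program is not found.
intersectbed_path <- unname(Sys.which("intersectBed"))

if (nzchar(intersectbed_path)) {
  message("Found intersectBed at: ", intersectbed_path)
  # The path could then be passed explicitly, e.g.
  # intersect.bedtools(a = "a.bed", b = "b.bed",
  #                    intersect.bedtools.command = intersectbed_path)
} else {
  message("intersectBed not found on PATH; set intersect.bedtools.command manually.")
}
```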
a |
A single string defining the BAM/BED/GFF/VCF file "A". Each feature in A is compared to B in search of overlaps. Use "stdin" if passing A with a UNIX pipe. |
b |
A character vector with one or more BAM/BED/GFF/VCF file(s) "B". It can also be a single string containing wildcard (*) characters. |
outputFileName |
Full path to the output file. By default, intersected.bed in the current working directory. |
abam |
Logical value to define whether file A is a BAM. Each BAM alignment in A is compared to B in search of overlaps. By default FALSE. |
ubam |
Logical value to define whether to write the output as uncompressed BAM. The default is to write compressed BAM output (FALSE). |
bed |
Logical value to define whether to write output as BED when using a BAM input. |
wa |
Logical value to define whether to write the original entry in A for each overlap. By default FALSE. |
wb |
Logical value to define whether to write the original entry in B for each overlap. Useful for knowing what A overlaps. Restricted by -f and -r. By default FALSE. |
loj |
Logical value to define whether to perform a "left outer join". That is, for each feature in A report each overlap with B. If no overlaps are found, report a NULL feature for B. By default FALSE. |
wo |
Logical value to define whether to write the original A and B entries plus the number of base pairs of overlap between the two features. Only A features with overlap are reported. Restricted by -f and -r. By default FALSE. |
wao |
Logical value to define whether to write the original A and B entries plus the number of base pairs of overlap between the two features. Unlike wo, A features without overlap are also reported, with a NULL B feature and overlap = 0. Restricted by -f and -r. By default FALSE. |
u |
Logical value to define whether to write the original A entry once if any overlaps are found in B. In other words, just report the fact that at least one overlap was found in B. Restricted by -f and -r. By default FALSE. |
c |
Logical value to define whether to report, for each entry in A, the number of hits in B. Reports 0 for A entries that have no overlap with B. Restricted by -f, -F, -r, and -s. By default FALSE. |
C |
Logical value to define whether to report, for each entry in A, the number of overlaps with each B file separately, on a distinct line. Reports 0 for A entries that have no overlap with B. Overlaps are restricted by -f, -F, -r, and -s. By default FALSE. |
v |
Logical value to define whether to report only those entries in A that have no overlap in B. Restricted by -f and -r. |
f |
Numeric value defining the minimum overlap required as a fraction of A. The bedtools default is 1E-9 (i.e., 1 bp). By default NULL. |
F. |
Numeric value defining the minimum overlap required as a fraction of B. The bedtools default is 1E-9 (i.e., 1 bp). By default NULL. |
r |
Logical value defining whether the fraction of overlap (parameter f) must be reciprocal for A and B. By default FALSE. |
e |
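Logical value defining whether the minimum fraction (parameters f and F.) needs to be satisfied for either A or B, rather than for both. By default FALSE. |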
s |
Logical value to define whether to force "strandedness". That is, only report hits in B that overlap A on the same strand. Without this option, overlaps are reported without respect to strand. By default FALSE. |
S |
Logical value to define whether to require different strandedness. That is, only report hits in B that overlap A on the opposite strand. Without this option, overlaps are reported without respect to strand. By default FALSE. |
split |
Logical value to define whether to treat "split" BAM (i.e., having an "N" CIGAR operation) or BED12 entries as distinct BED intervals. By default FALSE. |
sorted |
Logical value to define whether to invoke, for very large B files, a "sweeping" algorithm that requires position-sorted input. When using -sorted, memory usage remains low even for very large files. By default FALSE. |
g |
String specifying a genome file that defines the expected chromosome order in the input files, for use with the -sorted option. By default NULL. |
srun |
Logical value to define whether the command should be run through srun. By default FALSE. |
intersect.bedtools.command |
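String to define the command used to call bedtools intersect. By default, the intersectBed executable under /home/<USERNAME>/anaconda3/bin, where USERNAME is read from the environment. |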
return.command |
Logical value to define whether to return the string corresponding to the command for bedtools. By default FALSE. |
return.bed |
Logical value to define whether to return the resulting bed as a data.frame. By default FALSE. |
delete.output |
Logical value to define whether to delete the exported intersected bed file. By default FALSE. |
run.command |
Logical value to define whether to run the command line on the system terminal and generate the bed resulting from the intersection. By default TRUE. |
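The fraction arguments (f, F., r) can be easier to follow with a small worked example. The sketch below is plain base R and is not how bedtools or this wrapper is implemented. It only reproduces the arithmetic on one toy pair of intervals, assuming BED-style half-open coordinates. The helper names overlap_bp and passes_fraction are hypothetical.

```r
# Overlap in base pairs between two intervals in BED-style
# half-open coordinates [start, end).
overlap_bp <- function(a_start, a_end, b_start, b_end) {
  max(0, min(a_end, b_end) - max(a_start, b_start))
}

# Does a pair of intervals pass the minimum-fraction filters?
# f:  minimum overlap as a fraction of A
# F.: minimum overlap as a fraction of B
# r:  reciprocal, i.e. the threshold f is applied to B as well
passes_fraction <- function(a_start, a_end, b_start, b_end,
                            f = 1e-9, F. = 1e-9, r = FALSE) {
  ov <- overlap_bp(a_start, a_end, b_start, b_end)
  if (ov == 0) return(FALSE)
  if (r) F. <- f
  frac_a <- ov / (a_end - a_start)
  frac_b <- ov / (b_end - b_start)
  frac_a >= f && frac_b >= F.
}

# A = [100, 200), B = [150, 400): 50 bp overlap,
# i.e. 50% of A and 20% of B.
overlap_bp(100, 200, 150, 400)                          # 50
passes_fraction(100, 200, 150, 400, f = 0.5)            # TRUE
passes_fraction(100, 200, 150, 400, f = 0.5, r = TRUE)  # FALSE
```

With f = 0.5 alone the pair is kept because half of A is covered. Adding r = TRUE rejects it because only 20% of B is covered.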
To learn more about the bedtools intersect function, see the package manual at the following link:
https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html.
The function generates the files indicated by the output parameters. If requested, it also returns the command line used and/or the resulting intersected bed file as a data.frame. If both outputs are requested, the output will be a named list with two values: "command" and "intersected.bed".
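When return.bed is left at FALSE, the exported file can still be loaded by hand. The following is a minimal base R sketch that uses a toy file in place of real bedtools output. The column names are illustrative, and the number of columns in real output depends on the inputs and on options such as wa, wb and wo.

```r
# Write a tiny tab-separated BED-like file to stand in for the
# intersected output (toy data, not real bedtools output).
toy_path <- tempfile(fileext = ".bed")
writeLines(c("chr1\t100\t200\tpeak1",
             "chr2\t500\t650\tpeak2"),
           toy_path)

# BED files have no header line, so read with header = FALSE.
bed_df <- read.delim(toy_path, header = FALSE,
                     stringsAsFactors = FALSE)
colnames(bed_df) <- c("chr", "start", "end", "name")

# Width of each interval (BED coordinates are 0-based, half-open).
bed_df$width <- bed_df$end - bed_df$start
print(bed_df)

# Remove the temporary file.
unlink(toy_path)
```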
# Intersect file A with two B files, also writing the original B entries
intersect.bedtools(a = "bed_file1.bed",
                   b = c("bed_file2.bed", "bed_file3.bed"),
                   wb = TRUE,
                   intersect.bedtools.command = "/home/user/anaconda3/bin/intersectBed")
# Return the intersection as a data.frame and delete the exported file
intersect.bedtools(a = "bed_file1.bed",
                   b = c("bed_file2.bed", "bed_file3.bed"),
                   wa = TRUE,
                   return.bed = TRUE,
                   delete.output = TRUE,
                   intersect.bedtools.command = "/home/user/anaconda3/bin/intersectBed")