Main bedtools wrapper function.

Description

Main bedtools wrapper function.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
bedr(engine = "bedtools", 
	params = NULL,
	input = list(),
	method = NULL,
	tmpDir = NULL,
	deleteTmpDir = TRUE,
	outputDir = NULL,
	outputFile = NULL,
	check.chr = TRUE,
	check.zero.based = TRUE,
	check.valid = TRUE,
	check.sort = TRUE,
	check.merge = TRUE,
	verbose = TRUE
	);

Arguments

engine

What analytical engine to use i.e. bedtools, bedops, tabix, unix

params

A string that includes all the extra parameters and arguments to the bedtools commmand. For example if you wanted to do a left outer join you would specificy method as intersect and use params = c("-loj -header"). If you leave input and method as defaults then this is this string represents the full command.

input

A list of input items to be used by bedtools. Each item should be named by its parameter name in bedtools for example input = list(a=xxx, b=yyy, i=zzz). Items can be R objects or external files. R objects need to be in bed format i.e. have chr, start, stop as the first three columns, or, have an position index as the first column or rowname i.e. chr1:100-1000.

method

What bedtools method. This can be intersect, sort, merge etc. See bedtools documentation for specifcis.

tmpDir

The directory to be used for writing files

deleteTmpDir

Should tmp files be deleted. helpful for diagnostics.

outputDir

The output directory. Only used if outputFile is specified. It defaults to the current working directory.

outputFile

The name of the output file. If this is specified the output will be sent to a file not an R object

check.chr

check for chr prefix

check.zero.based

check for zero based coordinates

check.valid

do all region integrity checks

check.sort

check if region is sorted

check.merge

check if region is merged

verbose

Should messages be printed to screen.

Value

The output of command with some parsing to keep it consistent with the input.

Author(s)

Daryl Waggott

See Also

iranges

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
if (check.binary("bedtools")) {
set.seed(666)

index <- get.example.regions();
a <- index[[1]];
b <- index[[2]];

### check
is.a.valid  <- is.valid.region(a);
is.b.valid  <- is.valid.region(b);
a <- a[is.a.valid];
b <- b[is.b.valid];

### sort
is.sorted <- is.sorted.region(a);
a.sort1 <- bedr(engine = "bedtools", input = list(i = a), method = "sort", params = "");
b.sort1 <- bedr(engine = "bedtools", input = list(i = b), method = "sort", params = "");
a.sort2 <- bedr(engine = "bedops",   input = list(i = a), method = "sort", params = "");

a.sort3 <- bedr.sort.region(a);
a.sort4 <- bedr.sort.region(a, engine = "unix", method = "natural");
a.sort5 <- bedr.sort.region(a, engine = "R", method = "natural");

### merge
is.merged <- is.merged.region(a.sort1);
a.merge1 <- bedr(engine = "bedtools", input = list(i = a.sort1), method = "merge", params = "");
b.merge1 <- bedr(engine = "bedtools", input = list(i = b.sort1), method = "merge", params = "");
a.merge2 <- bedr(engine = "bedops",   input = list(i = a.sort1), method = "merge", params = "");
# a.merge3 <- bedr.merge.region(a); this will throw an error b/c region is not sorted

### subtract
a.sub1 <- bedr(input = list(a = a.merge1, b = b.merge1), method = "subtract", params="");
a.sub2 <- bedr.subtract.region(a.merge1, b.merge1);

### in.region
is.region <- in.region(a.merge1, b.merge1);
#is.region <- a.merge1 %in.region% b.merge1
### intersect 
# note for bedtools its recommended to bedr.sort.regions before intersect for faster processing
# also if regions are not merged this can cause unexpected behaviour
a.int1 <- bedr(input = list(a = a.sort1, b = b.sort1), method = "intersect", params = "-loj");
a.int1 <- bedr(input = list(a = a.sort1, b = b.sort1), method="intersect",params ="-loj -sorted");
a.int2 <- bedr(input = list(a = a.merge1, b = b.merge1), method="intersect",params="-loj -sorted");
a.int3 <- bedr.join.region(a.merge1, b.merge1);

### multiple join
d <- get.random.regions(100, chr="chr1", sort = TRUE);
a.mult <- bedr.join.multiple.region(x = list(a.merge1,b.merge1,bedr.sort.region(d)));

### groupby 
# note the "g" column number is based on bed format i.e. first three columns chr, start, stop
# note the use of first, first, last on the region columns i.e. the union of the regions
# note currently missing values are not dealt with in bedtools.  also the 5th column is 
# assumed to be "score" and gets a default "-1" not a "."
#cnv.gene <- bedr(input = list(i=cnv.gene), method = "groupby", params = paste("-g 16 -c ", 
#paste(1:15, collapse = ","), " -o ", "first,first,last,", 
#paste(rep("sum",12), collapse = ","), sep = ""));

### example 1
###  workflow adding gene names to exome sequencing target file
# download refseq genes from ucsc or query biomart for ensemble gene names.  
#format them in basic bed format.
# sort, merge target
# sort, merge -nms target.  Overlapping genes/features get merged.  
#this may not be ideal if there are some really big features.
# intersect -loj target, genes.
# alternatively, do not merge the target and apply the merge after the intersect.  
#this will provide precision at the level of the exon.
}

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.