Separate a data matrix into list elements based on coordinates from bed format data.
bedify(myBed, myData, fill_missing = 0L, verbose = 0L)
myBed | matrix of bed format data
myData | StringMatrix or IntegerMatrix to be sorted
fill_missing | include records for when there is no data (0, 1). By default these records are omitted.
verbose | should verbose output be generated (0, 1)
Bed format data contain at least three columns. The first column indicates the chromosome (i.e., supercontig, scaffold, contig, etc.). The second contains the starting positions. The third contains the ending positions. The format allows up to nine optional columns after these (columns four through twelve). For example, the fourth column may contain the names of features. All subsequent columns are ignored here. In an attempt to optimize performance, the data are expected to be formatted as a character matrix. The starting and ending positions are converted to numerics internally.
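As a minimal sketch of the layout described above, the following builds a small bed-style character matrix in base R and converts the coordinate columns to numerics. The object names (bed_example, starts, ends, widths) and the feature values are made up for illustration; the conversion shown is ordinary as.numeric(), not necessarily what bedify does internally.

```r
# Hypothetical three-feature bed table stored as a character matrix.
bed_example <- matrix(
  c("chr_1", "10",  "50",  "gene_a",
    "chr_1", "100", "250", "gene_b",
    "chr_1", "300", "420", "gene_c"),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("chrom", "chromStart", "chromEnd", "name"))
)

# Every element is a character string, including the coordinates.
is.character(bed_example)

# Coordinates must be converted before any arithmetic or comparison.
starts <- as.numeric(bed_example[, "chromStart"])
ends   <- as.numeric(bed_example[, "chromEnd"])
widths <- ends - starts
```

Keeping the whole table as one character matrix means the feature names and coordinates share a single type, which is why the coordinates need an explicit conversion step.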
The matrix format used here is based on VCF-type data. Typically these data have a chromosome as the first column. Each chromosome has its own coordinate system, which begins at one, so using multiple chromosomes would require reconciling their coordinate systems. Here I take the perspective that you should simply work on one chromosome at a time, so the chromosome column is ignored. The second column is the position, which is used for sorting. Subsequent columns are not processed but are carried along with the subset.
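Because the chromosome column is ignored, one way to follow the one-chromosome-at-a-time advice is to subset the data before windowing. The sketch below does this in base R and then splits the rows by window. It is an illustration of the idea only, not bedify's implementation: all object names and values are hypothetical, and treating both window ends as inclusive is an assumption.

```r
# Hypothetical VCF-like character matrix: CHROM, POS, one sample column.
vcf_example <- matrix(
  c("chr_1", "12",  "0/1",
    "chr_1", "120", "0/0",
    "chr_1", "410", "1/1",
    "chr_2", "15",  "0/1"),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("CHROM", "POS", "sample_1"))
)

# Hypothetical bed-like windows, all on the same chromosome.
bed_one <- matrix(
  c("chr_1", "10",  "50",  "gene_a",
    "chr_1", "100", "250", "gene_b"),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("chrom", "chromStart", "chromEnd", "name"))
)

# Keep a single chromosome so that positions share one coordinate system.
vcf_one <- vcf_example[vcf_example[, "CHROM"] == "chr_1", , drop = FALSE]
pos <- as.numeric(vcf_one[, "POS"])

# Collect the rows falling in each window (inclusive bounds assumed).
windows <- lapply(seq_len(nrow(bed_one)), function(i) {
  s <- as.numeric(bed_one[i, "chromStart"])
  e <- as.numeric(bed_one[i, "chromEnd"])
  vcf_one[pos >= s & pos <= e, , drop = FALSE]
})
names(windows) <- bed_one[, "name"]
```

Without the subsetting step, the chr_2 record at position 15 would fall inside the first window even though it lies on a different chromosome.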
When the matrix is of numeric form the first column, which contains the chromosome identifier (CHROM), must also be numeric. This is because matrix elements must all be of the same type.
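The single-type rule for matrices can be seen directly in base R. In this small sketch (object names and values are hypothetical), mixing a character chromosome label with numeric positions coerces the whole matrix to character, whereas coding the chromosome as a number keeps the matrix numeric.

```r
# A character chromosome label forces every element to character.
mixed <- cbind(CHROM = "chr_1", POS = c(12, 120))
typeof(mixed)

# Coding the chromosome as a number keeps the matrix numeric.
numeric_ok <- cbind(CHROM = 1, POS = c(12, 120), depth = c(9, 21))
typeof(numeric_ok)

# Optionally store the values as integers rather than doubles.
int_matrix <- numeric_ok
storage.mode(int_matrix) <- "integer"
is.integer(int_matrix)
```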
Bed format at UCSC
bed <- structure(c("chr_290", "chr_4176", "chr_126921", "chr_126921",
"chr_125157", "chr_125157", "chr_125157", "chr_125157", "chr_126888",
"chr_126888", "47", "400", "4344", "1", "3712", "6025", "2269",
"1779", "7930", "4637", "80", "500", "4967", "9066", "6566",
"6450", "2933", "2226", "11939", "7913", "gene_1", "gene_2",
"gene_3", "gene_4", "gene_5", "gene_6", "gene_7", "gene_8", "gene_9",
"gene_10"), .Dim = c(10L, 4L), .Dimnames = list(NULL, c("chrom",
"chromStart", "chromEnd", "name")))
vcf.matrix <- structure(c("chr_290", "chr_290", "chr_4176", "chr_4176", "chr_50514",
"chr_64513", "chr_107521", "chr_121987", "chr_122006", "chr_122006",
"78", "96", "406", "425", "863", "2853", "77", "103", "243",
"636", "0/1:5,4:9:99:117,0,153", "0/0:9,0:9:99:0,27,255", "0/1:10,11:21:99:255,0,255",
"0/1:10,11:21:99:255,0,255", "0/1:14,14:28:99:255,0,255", "0/1:29,13:42:99:255,0,255",
"0/1:26,11:37:99:255,0,255", "0/1:21,14:35:99:255,0,255", "0/0:12,1:13:67:0,4,255",
"0/1:55,8:63:99:99,0,255", "0/1:10,8:18:99:234,0,255", "0/0:17,0:17:99:0,51,255",
"0/1:16,13:29:99:255,0,255", "0/1:16,13:29:99:255,0,255", "0/1:26,19:45:99:255,0,255",
"0/1:50,19:69:99:255,0,255", "0/1:62,17:79:99:255,0,255", "0/1:95,22:117:99:255,0,255",
"0/1:32,5:37:99:68,0,255", "0/1:69,21:90:99:255,0,255"), .Dim = c(10L,
4L), .Dimnames = list(NULL, c("CHROM", "POS", "sample_1", "sample_2"
)))
class(bed)
is.character(bed)
class(vcf.matrix)
is.character(vcf.matrix)
var.list <- bedify(bed, vcf.matrix)
table(unlist(lapply(var.list, nrow)))