vignettes/rawData.md

library("trioClasses")
library("GWASTools")
library("CleftCNVAssoc")

First we create a vector of offspring IDs that we want plotted.

offspring.vec <- as.character(completeTrios(fe.beaty)$id)

Now we incorporate it into a GRanges object, with the ranges in this case being the chr16 region for each of the offspring.

gr <- GRanges(seqnames = rep("chr16", length(offspring.vec)), ranges = IRanges(start = 32404517, 
    end = 32530051), id = offspring.vec)

Now we apply getRaw, a function from CleftCNVAssoc, to retrieve the raw data from GWASTools objects. These data exist only on an encrypted hard drive and on enigma.

raw.df.list <- getRaw(gr + 1e+06, intensfile = intensfile, snpAnnot = beaty_snpAnnot, 
    scan.id = scan.ids, fa.id = fa.id, ma.id = ma.id, genofile = genofile, xyfile = xyfile)
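The `gr + 1e+06` argument appears to widen each range by 1 Mb on both sides before the raw data are retrieved, which would explain why the first markers printed in this vignette sit near 31.4 Mb rather than 32.4 Mb. Below is a minimal base-R sketch of that arithmetic, using only coordinates shown in this vignette. The variable names are illustrative and not part of CleftCNVAssoc, and the symmetric widening is an assumption about how `+` behaves on ranges.

```r
# Region coordinates taken from the GRanges() call above.
region <- c(start = 32404517, end = 32530051)
flank <- 1e6  # assumption: applied to both sides, as `gr + 1e+06` is expected to do

# Widened window: subtract the flank from the start, add it to the end.
win <- c(start = region[["start"]] - flank, end = region[["end"]] + flank)

# Positions of the first five markers in the printed output of getRaw().
pos <- c(31405309, 31405382, 31411252, 31428777, 31435321)

# Check each marker against the widened window and the original region.
in_window <- pos >= win[["start"]] & pos <= win[["end"]]
in_region <- pos >= region[["start"]] & pos <= region[["end"]]

print(win)
print(data.frame(pos, in_window, in_region))
```

All five markers fall inside the widened window and outside the original region, so they belong to the left flank.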

Here are the data for the first offspring's first 5 markers.

head(raw.df.list[[1]], 5)
    logr    baf      pos logr.fa logr.ma baf.fa baf.ma geno geno.fa
1 0.1662 1.0000 31405309  0.2031  0.0326 1.0000 1.0000    0       0
2 0.3557 1.0000 31405382  0.4870  0.2167 1.0000 1.0000    0       0
3 0.0232 0.9853 31411252  0.0954 -0.0743 0.4786 0.9795    0       1
4 0.1978 0.9981 31428777  0.1213  0.1274 1.0000 1.0000    0       0
5 0.0435 0.0057 31435321 -0.1137  0.1395 0.0071 0.5640    2       2
  geno.ma     x     y  x.fa  y.fa  x.ma  y.ma   snpname
1       0 0.000 1.310 0.000 1.344 0.000 1.194 rs3813007
2       0 0.031 0.734 0.036 0.802 0.025 0.668 rs3813008
3       0 0.194 1.769 1.349 1.270 0.194 1.647 rs4536493
4       0 0.020 0.875 0.005 0.837 0.013 0.836 rs4889545
5       1 1.246 0.058 1.116 0.055 0.763 1.003 rs4477723

Plot the logR values for everyone, stratified by father (F), mother (M), and offspring (O). Purple is offspring, red is father, and blue is mother.
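The plotting code itself is not shown here. As a rough sketch only, the colour scheme can be applied with base graphics to the five markers printed above. The long-format layout, the object names, and the choice of a scatter plot against position are assumptions, not the vignette's actual plotting code.

```r
# First five markers of the first offspring, copied from the printed output above.
trio <- data.frame(
  pos     = c(31405309, 31405382, 31411252, 31428777, 31435321),
  logr    = c(0.1662, 0.3557, 0.0232, 0.1978, 0.0435),
  logr.fa = c(0.2031, 0.4870, 0.0954, 0.1213, -0.1137),
  logr.ma = c(0.0326, 0.2167, -0.0743, 0.1274, 0.1395)
)

# Colour scheme described in the text: offspring, father, mother.
member.col <- c(logr = "purple", logr.fa = "red", logr.ma = "blue")

# Long format: one row per marker and family member.
long <- data.frame(
  pos    = rep(trio$pos, times = length(member.col)),
  logr   = unlist(trio[names(member.col)], use.names = FALSE),
  member = rep(names(member.col), each = nrow(trio))
)

# Draw only in an interactive session, so the script also runs in batch mode.
if (interactive()) {
  plot(long$pos, long$logr, col = member.col[as.character(long$member)], pch = 16,
       xlab = "Position on chr16 (bp)", ylab = "logR")
  legend("topright", legend = c("offspring", "father", "mother"),
         col = member.col, pch = 16)
}
```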

The pooled plot is not very informative, so we turn to individual trios with an untransmitted deletion. First, we need to find a vector of offspring IDs with an untransmitted deletion. This is a property of the CNVMatrix within the FamilyExperiment object and can be manipulated with the non-exported method TrioAssay. To begin, we subset the CNVMatrix on the chr16 region.

chr16.gr <- GRanges(seqnames = "chr16", ranges = IRanges(start = 32404517, end = 32530051))
(fe.beaty.chr16 <- fe.beaty[queryHits(findOverlaps(rowData(fe.beaty), chr16.gr))])
class: FamilyExperiment 
dim: 123 1339 
exptData(0):
assays(1): cnv
rownames(123): comp5899 comp5900 ... comp6020 comp6021
rowData metadata column names(0):
colnames(1339): 11005_01@1008472480 11005_02@1008472482 ...
  18117_02@0070298660 18117_03@0070298657
colData names(1): id
pedigree(2082): famid id fid mid sex dx
complete trios(445):

Now with the smaller FE object we can easily construct the trio-states.

trioAssay.chr16 <- trioClasses:::TrioAssay(fe.beaty.chr16, type = "cnv")
trioStates.chr16 <- with(trioAssay.chr16, matrix(paste0(F, M, O), nrow = nrow(O), 
    ncol = ncol(O)))
dimnames(trioStates.chr16) <- dimnames(trioAssay.chr16$O)
head(trioStates.chr16[, 1:5], 10)
                    comp5899 comp5900 comp5901 comp5902 comp5903
11005_01@1008472480 "000"    "000"    "000"    "000"    "000"   
11021_01@1008472417 "000"    "000"    "000"    "000"    "000"   
11035_01@1008471376 "000"    "000"    "000"    "000"    "000"   
12002_01@1008489061 "000"    "000"    "000"    "000"    "000"   
12004_01@1008489060 "000"    "000"    "000"    "000"    "000"   
12005_01@1008490117 "000"    "000"    "000"    "000"    "000"   
12008_01@1008490140 "010"    "010"    "010"    "010"    "010"   
12014_01@1008490162 "000"    "000"    "000"    "000"    "000"   
12015_01@1008490100 "001"    "001"    "001"    "001"    "001"   
12017_01@1008489083 "000"    "000"    "000"    "000"    "000"   
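The trio-state construction relies on `paste0()` working element by element across the three matrices but returning a plain character vector, so the result has to be reshaped and given its dimnames back. Here is a small self-contained sketch of the same idiom. The `toy` list and its values are hypothetical stand-ins for the object returned by `TrioAssay`, which is assumed to hold matrices named `F`, `M`, and `O`.

```r
# Hypothetical toy data: 3 trios (rows) by 2 CNV components (columns).
ids <- c("trio1", "trio2", "trio3")
comps <- c("compA", "compB")
toy <- list(
  F = matrix(c(0, 1, 0, 0, 1, 1), nrow = 3, dimnames = list(ids, comps)),
  M = matrix(c(0, 0, 1, 0, 0, 0), nrow = 3, dimnames = list(ids, comps)),
  O = matrix(c(0, 0, 1, 0, 1, 0), nrow = 3, dimnames = list(ids, comps))
)

# paste0() concatenates father, mother, offspring states element-wise and
# drops the matrix shape; matrix() restores it in the same column-major order.
states <- with(toy, matrix(paste0(F, M, O), nrow = nrow(O), ncol = ncol(O)))
dimnames(states) <- dimnames(toy$O)

print(states)
```

Each string reads father, mother, offspring from left to right, so `"100"` in this toy example is a state carried by the father only.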

Now we identify trio-cnv pairs with an untransmitted deletion, i.e., trio-states 100, 010, or 110. (This is not a complete list of trio-states with a non-transmission.)

untrans.mat <- matrix(trioStates.chr16 %in% c("100", "010", "110"), nrow = nrow(trioStates.chr16), 
    ncol = ncol(trioStates.chr16), byrow = FALSE, dimnames = dimnames(trioStates.chr16))
head(untrans.mat[, 1:10], 10)
                    comp5899 comp5900 comp5901 comp5902 comp5903 comp5904
11005_01@1008472480    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
11021_01@1008472417    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
11035_01@1008471376    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
12002_01@1008489061    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
12004_01@1008489060    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
12005_01@1008490117    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
12008_01@1008490140     TRUE     TRUE     TRUE     TRUE     TRUE     TRUE
12014_01@1008490162    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
12015_01@1008490100    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
12017_01@1008489083    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE
                    comp5905 comp5906 comp5907 comp5908
11005_01@1008472480    FALSE    FALSE    FALSE    FALSE
11021_01@1008472417    FALSE    FALSE    FALSE    FALSE
11035_01@1008471376    FALSE    FALSE    FALSE    FALSE
12002_01@1008489061    FALSE    FALSE    FALSE    FALSE
12004_01@1008489060    FALSE    FALSE    FALSE    FALSE
12005_01@1008490117    FALSE    FALSE    FALSE    FALSE
12008_01@1008490140     TRUE     TRUE     TRUE     TRUE
12014_01@1008490162    FALSE    FALSE    FALSE     TRUE
12015_01@1008490100    FALSE    FALSE    FALSE    FALSE
12017_01@1008489083     TRUE     TRUE     TRUE     TRUE
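The call to `matrix()` around `%in%` is needed because `%in%` returns a plain logical vector and drops the dimensions of its left-hand argument. The sketch below shows this on hypothetical toy trio-states. `array()` with the original `dim` and `dimnames` is used as an equivalent, slightly shorter way to restore the shape.

```r
# Hypothetical trio-states: 3 trios (rows) by 2 CNV components (columns).
states <- matrix(c("000", "100", "011", "000", "101", "110"), nrow = 3,
                 dimnames = list(c("trio1", "trio2", "trio3"),
                                 c("compA", "compB")))

# Trio-states treated as untransmitted deletions in the text.
untrans.states <- c("100", "010", "110")

# %in% keeps the element order but loses dim and dimnames.
flat <- states %in% untrans.states
print(is.null(dim(flat)))

# Restore the matrix shape; values fill column by column, as in the original.
untrans <- array(flat, dim = dim(states), dimnames = dimnames(states))
print(untrans)
```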

Next we find the IDs of the offspring with at least one untransmitted deletion.

offspring.chr16 <- rownames(untrans.mat)[which(rowSums(untrans.mat) > 0)]
length(offspring.chr16)
[1] 140
head(offspring.chr16)
[1] "12008_01@1008490140" "12014_01@1008490162" "12017_01@1008489083"
[4] "12021_01@1008490126" "12024_01@1008490151" "12027_01@1008490157"
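Because `TRUE` counts as 1, `rowSums()` on a logical matrix gives the number of flagged components per trio. A short sketch on hypothetical toy data shows the selection step, and that the `which()` wrapper makes no difference when there are no missing values.

```r
# Hypothetical flags: TRUE marks an untransmitted deletion.
untrans <- matrix(c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE), nrow = 3,
                  dimnames = list(c("trio1", "trio2", "trio3"),
                                  c("compA", "compB")))

# Number of flagged components per trio.
n.untrans <- rowSums(untrans)

# Logical indexing and which()-based indexing select the same row names here.
ids <- rownames(untrans)[n.untrans > 0]
ids.which <- rownames(untrans)[which(n.untrans > 0)]

print(n.untrans)
print(ids)
print(identical(ids, ids.which))
```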

Finally, the raw intensity and BAF plots per trio.

[plots of chunk logrplot2]

Now for the transmitted deletions in the chr16 region.

trans.mat <- matrix(trioStates.chr16 %in% c("101", "011", "112"), nrow = nrow(trioStates.chr16), 
    ncol = ncol(trioStates.chr16), byrow = FALSE, dimnames = dimnames(trioStates.chr16))
offspring.chr16 <- rownames(trans.mat)[which(rowSums(trans.mat) > 0)]
length(offspring.chr16)
[1] 69
head(offspring.chr16)
[1] "12008_01@1008490140" "12014_01@1008490162" "12054_01@1008494951"
[4] "12062_01@0067868215" "12064_01@0067868240" "12071_01@0067868170"

[plots of chunk logrplot3]
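With both sets of trio-states defined, transmitted and untransmitted events can be tallied per CNV component to see where deletions are passed on less often than they are withheld. The following is a rough sketch on hypothetical toy data, not the analysis used in this vignette. It counts trio-component pairs rather than parental alleles, and the two state sets are the ones quoted in the text, which is explicitly not a complete list.

```r
# Hypothetical trio-states: 4 trios (rows) by 2 CNV components (columns).
states <- matrix(c("100", "101", "000", "010",
                   "100", "110", "010", "011"), nrow = 4,
                 dimnames = list(paste0("trio", 1:4), c("compA", "compB")))

# State sets quoted in the text.
untrans.set <- c("100", "010", "110")
trans.set <- c("101", "011", "112")

# Helper (illustrative): membership test that keeps the matrix shape.
is.in <- function(x, set) array(x %in% set, dim = dim(x), dimnames = dimnames(x))

# Per-component counts of trios showing each kind of event.
tally <- rbind(transmitted   = colSums(is.in(states, trans.set)),
               untransmitted = colSums(is.in(states, untrans.set)))
print(tally)
```

In this toy example compB has one transmitted and three untransmitted events, the kind of imbalance that would flag a component for closer inspection.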

So we see an increase in marker density in the region that likely also has an increased false positive rate. Are false positive rate and marker density correlated? Perhaps, in an effort to better identify known CNPs, the manufacturers increased marker density, which in turn increased both the false positive and false negative rates. Now, with trio data, we can identify such regions by the under-transmission of deletions.
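One simple way to quantify marker density is to count markers in fixed-width windows along the chromosome. The sketch below does this in base R. The first five positions are the markers printed earlier in this vignette, while the remaining positions, the window width, and the window boundaries are hypothetical values chosen only for illustration.

```r
# Marker positions (bp): first five from the printed output, the rest hypothetical.
pos <- c(31405309, 31405382, 31411252, 31428777, 31435321,
         32410000, 32412000, 32415000, 32420000, 32500000)

# Hypothetical 500 kb windows covering 31.0 Mb to 33.0 Mb.
width <- 5e5
breaks <- seq(31000000, 33000000, by = width)

# findInterval() assigns each position to the window [breaks[i], breaks[i + 1]).
bin <- findInterval(pos, breaks)
counts <- tabulate(bin, nbins = length(breaks) - 1)

# Density expressed as markers per kb.
density <- data.frame(
  start     = head(breaks, -1),
  end       = breaks[-1],
  n.markers = counts,
  per.kb    = counts / (width / 1e3)
)
print(density)
```

Window-level densities like these could then be set against window-level counts of untransmitted deletions to check whether the two move together.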

So, now we must find the marker density. Let's find



syounkin/Trioconductor documentation built on May 31, 2019, 12:47 a.m.