1. Use Likelihood to Generate Genotype Probabilities

The likelihood can be used to only calculate genotype probabilities if the gt is omitted as shown below.

# generates likelihood probabilities given allele frequencies
library(SNPcontam)
af <- c(SNP1 = .2, SNP2 = .6, SNP3 = .25, SNP4 = 0.8)
likelihood(af)

The list componenets contam and clean are Null because no genotype was entered.

2. Use Function to Randomly Generate Genotypes

The random_gene can next be used to generate genotypes based on the genotype frequencies computed usking likelihood.

# generates 5 random genotypes
Ran <- random_gene(5,af)

rclean includes genotypes generated assuming no contamination while rcontam includes those generated assuming contamination.

3. Use Function to Compute Lilkelihoods of Random Genotypes

The likehood function can be used to compute the likelihoods of genotypes.

# likelihood for the non contaminated samples
L <- likelihood(af,Ran$rclean)
L
# likelihood for the contaminated samples
LC <- likelihood(af, Ran$rcontam)

LC will have a similar output to L, but for the contaminated assumption.

4. Use Function to Compute Likelihood Ration

The function lratio can be used to compute the likelihood ratio for each individual.

# ratios for the non contaminated individuals
ratio <- lratio(L$clean, L$contam,af)
ratio

# ratios for the contaminated individuals
ratio_c <- lratio(LC$clean, LC$contam,af)
ratio_c

5. Use Function to Compute Heterozygousity

The function hetero can be used to compute the proportion of heterozygous loci in an individual.

# proportion for non contaminated individuals
hetero <- hetero(Ran$rclean,af)
hetero

# proportion of contamintated individuals
hetero_c <- hetero(Ran$rcontam, af)
hetero_c

6. Use Function to Plot ROC Curves

#subplots
par(mfrow=c(1,2))
# plots the ROC for the ratio test
ROC(10,ratio[[1]], ratio_c[[1]])

#plots the ROC for the heterozygosity test
ROC(10,hetero[[1]],hetero_c[[1]])

Histograms of the different likelihood ratio and heterozygosity distributions can also be plotted:

par(mfrow=c(1,2))
# likelihood ratio test
hist(ratio[[1]],col=rgb(1,0,0,0.5),xlim = c(-5,10),main="Likelihood Ratio Histogram")
hist(ratio_c[[1]],col=rgb(0,0,1,0.5),add=T)

# heterozygousity proportions
hist(hetero[[1]],col=rgb(1,0,0,0.5),xlim = c(0,1),main="Heterozygousity Proportion Histogram")
hist(hetero_c[[1]],col=rgb(0,0,1,0.5),add=T)

Example of Output

Below is code that synthesizes the steps above. There are 20 randomly generated alle frequencies and 1000 randomly generated individuals.

af <- runif(20,0,1)
Ran <- random_gene(1000,af)
L <- likelihood(af,Ran$rclean)
LC <- likelihood(af, Ran$rcontam)
ratio <- lratio(L$clean, L$contam,af)
ratio_c <- lratio(LC$clean, LC$contam,af)
hetero <- hetero(Ran$rclean,af)
hetero_c <- hetero(Ran$rcontam, af)
par(mfrow=c(2,2))
ROC(100,ratio[[1]], ratio_c[[1]])
hist(ratio[[1]],col=rgb(1,0,0,0.5),xlim = c(-40,20),main="Likelihood Ratio Histogram")
hist(ratio_c[[1]],col=rgb(0,0,1,0.5),add=T)
ROC(100,hetero[[1]],hetero_c[[1]])
hist(hetero[[1]],col=rgb(1,0,0,0.5),xlim = c(0,1),main="Heterozygousity Histogram")
hist(hetero_c[[1]],col=rgb(0,0,1,0.5),add=T)


eriqande/SNPcontam documentation built on May 16, 2019, 8:44 a.m.