View source: R/miss_plot_hist.R
miss_plot_hist | R Documentation |
Use to visualise missing data with respect to samples and their associated populations.
miss_plot_hist(
dat,
plotBy,
look = "ggplot",
sampCol = "SAMPLE",
locusCol = "LOCUS",
genoCol = "GT",
popCol = NA,
plotColours = "white",
plotNCol = 2
)
dat |
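Data table: Contains genetic information in long format and must have columns for the sampled individual ID (see sampCol), the locus ID (see locusCol), and the genotype (see genoCol).
|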
plotBy |
Character: One of 'samples' or 'loci', the focus of missing data. |
look |
Character: The look of the plot, e.g. "ggplot" or "classic". Default = "ggplot". |
sampCol |
Character: The column name with the sampled
individual ID. Default = "SAMPLE". |
locusCol |
Character: The column name with the locus ID.
Default = "LOCUS". |
genoCol |
Character: The column name with the genotype info.
Default = "GT". |
popCol |
Character: The column name with the population ID.
Optional parameter. Default = NA. |
plotColours |
Character: The fill colour for histogram bars. Default = "white". |
plotNCol |
Integer: The number of columns to arrange individual
population plots into. Only takes effect when popCol is specified. Default = 2. |
When popCol is unspecified, all samples are used to create a single plot.
If it is specified, that column is used to make one plot for
each population. These plots are arranged in rows and columns, and the
user can set the number of columns with the argument plotNCol.
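The tally behind such a plot can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: it assumes missing genotypes are coded as NA, that the plotted quantity is the proportion of missing genotypes per sample, and that the number of plot rows follows from the population count and plotNCol. The toy table and the names toy, isMiss, missBySamp, missByPop and plotNRow are hypothetical.

```r
# Toy long-format genotype table (hypothetical values): one row per sample x locus
toy <- data.frame(
  POP    = rep(c("Pop1", "Pop2"), each = 6),
  SAMPLE = rep(c("Ind1", "Ind2", "Ind3", "Ind4"), each = 3),
  LOCUS  = rep(c("Loc1", "Loc2", "Loc3"), times = 4),
  GT     = c("0/0", NA, "0/1",  "1/1", "0/1", "0/0",
             NA, NA, "0/0",     "0/1", "1/1", NA),
  stringsAsFactors = FALSE
)

# Flag missing genotypes (assumption: missing is coded as NA)
toy$isMiss <- is.na(toy$GT)

# Proportion of missing genotypes per sample, keeping the population label
missBySamp <- aggregate(isMiss ~ POP + SAMPLE, data = toy, FUN = mean)

# One vector of per-sample values per population: the input to each panel
missByPop <- split(missBySamp$isMiss, missBySamp$POP)

# With plotNCol columns, the panels need this many rows (assumed layout rule)
plotNCol <- 2
plotNRow <- ceiling(length(missByPop) / plotNCol)
```

Each element of missByPop could then be passed to hist() to draw one panel per population; swapping SAMPLE for LOCUS in the aggregate() call gives the per-locus view.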
Returns a ggplot object.
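library(genomalicious)
library(data.table)  # for the `:=` and split(by=) syntax used below
library(magrittr)    # for the %>% pipe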
#### MISSING GENOTYPE DATA ####
data(data_Genos)
datGt <- data_Genos
# Add missing values
datGt <- do.call(
  'rbind',
  # Split data table by population, and iterate through populations, Dpop
  split(datGt, by='POP') %>%
    lapply(., function(Dpop){
      # Probability of a missing genotype depends on the population
      pop <- Dpop$POP[1]
      if(pop=='Pop1'){
        pr <- 0.1
      } else if(pop=='Pop2'){
        pr <- 0.2
      } else if(pop %in% c('Pop3','Pop4')){
        pr <- 0.05
      }
      # Numbers of, and unique, loci and samples
      num.loc <- Dpop$LOCUS %>% unique %>% length
      uniq.loc <- Dpop$LOCUS %>% unique
      num.samp <- Dpop$SAMPLE %>% unique %>% length
      uniq.samp <- Dpop$SAMPLE %>% unique
      # Vector of missingness: number of missing loci per sample
      num.miss <- rbinom(n=num.samp, size=num.loc, prob=pr)
      # Iterate through samples and set randomly chosen loci to missing
      for(i in seq_len(num.samp)){
        locs <- sample(uniq.loc, size=num.miss[i], replace=FALSE)
        Dpop[SAMPLE==uniq.samp[i] & LOCUS%in%locs, GT:=NA]
      }
      # Return
      return(Dpop)
    }
  )
)
head(datGt, 10)
#### PLOT MISSING BY SAMPLES ####
# Histograms, ggplot and classic looks
miss_plot_hist(datGt, plotBy='samples', look='ggplot')
miss_plot_hist(datGt, plotBy='samples', look='classic')
# Histograms, by population, specifying colour
miss_plot_hist(datGt, plotBy='samples', look='ggplot',
  popCol='POP', plotColours='deeppink2')
#### PLOT MISSING BY LOCI ####
miss_plot_hist(datGt, plotBy='loci', look='classic',
  popCol='POP', plotColours='deeppink2')
#### CATCH PLOT OUTPUT FOR LATER USE ####
gg4pops <- miss_plot_hist(datGt, plotBy='samples', popCol='POP')
plot(gg4pops)
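Because the return value is documented as a ggplot object, a caught plot can in principle be adjusted with ordinary ggplot2 functions. The sketch below is hedged: to stay runnable without genomalicious it builds a small stand-in histogram from made-up values instead of calling miss_plot_hist(), and the names toyMiss, ggStandIn and ggTitled, as well as the file name in the commented ggsave() call, are hypothetical.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Stand-in for an object returned by miss_plot_hist() (toy values, hypothetical)
  toyMiss <- data.frame(MISS = c(0.05, 0.10, 0.10, 0.20, 0.25))
  ggStandIn <- ggplot(toyMiss, aes(x = MISS)) +
    geom_histogram(bins = 5, fill = "deeppink2", colour = "black")

  # Add a title and relabel the axes after the plot has been created
  ggTitled <- ggStandIn +
    labs(title = "Missing data by sample", x = "Proportion missing", y = "Count")

  # Write to file if desired (file name is only an example)
  # ggsave("missing_by_sample.png", ggTitled, width = 6, height = 4)
}
```

The same pattern should apply to gg4pops from the example above, assuming the returned object behaves like a standard ggplot.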