ghap.ancsvm | R Documentation |
This function uses Support Vector Machines (SVM) to predict ancestry of haplotype alleles in test samples.
ghap.ancsvm(object, blocks, test = NULL, train = NULL, cost = 1, gamma = NULL, tune = FALSE, only.active.samples = TRUE, only.active.markers = TRUE, ncores = 1, verbose = TRUE)
object |
A GHap.phase object. |
blocks |
A data frame containing block boundaries. |
test |
Character vector of individuals to test. |
train |
Character vector of individuals to use as reference samples. |
cost |
A numeric value specifying the C constant of the regularization term in the Lagrange formulation. |
gamma |
A numeric value specifying the gamma parameter of the RBF kernel (default = 1/blocksize). |
tune |
A logical value specifying whether a grid search should be performed over the cost and gamma parameters (default = FALSE). |
only.active.samples |
A logical value specifying whether only active samples should be included in predictions (default = TRUE). |
only.active.markers |
A logical value specifying whether only active markers should be used for predictions (default = TRUE). |
ncores |
A numeric value specifying the number of cores to be used in parallel computing (default = 1). |
verbose |
A logical value specifying whether log messages should be printed (default = TRUE). |
This function predicts haplotype allele ancestry using Support Vector Machines (SVM) together with a Gaussian Radial Basis Function (RBF) kernel. The user can specify the C constant of the regularization term in the Lagrange formulation (default cost = 1) and the gamma parameter of the RBF kernel (default gamma = 1/blocksize).
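To make the role of gamma concrete, the following is a minimal base R sketch of the Gaussian RBF kernel in the form documented by e1071 for its radial kernel, exp(-gamma*|u-v|^2). The 0/1 allele coding, the two haplotype vectors and the helper name `rbf_kernel` are assumptions made for illustration only; they are not taken from the package internals.

```r
# Gaussian RBF kernel: K(u, v) = exp(-gamma * ||u - v||^2)
# (rbf_kernel is a hypothetical helper, not part of GHap or e1071)
rbf_kernel <- function(u, v, gamma) {
  exp(-gamma * sum((u - v)^2))
}

# Two hypothetical haplotypes coded as 0/1 alleles over a block of 8 markers
hap_a <- c(0, 1, 1, 0, 1, 0, 0, 1)
hap_b <- c(0, 1, 0, 0, 1, 1, 0, 1)

# Assumption: blocksize is the number of markers in the block
blocksize <- length(hap_a)
gamma <- 1 / blocksize   # mirrors the documented default, gamma = 1/blocksize

k_same <- rbf_kernel(hap_a, hap_a, gamma)  # identical haplotypes
k_diff <- rbf_kernel(hap_a, hap_b, gamma)  # haplotypes differing at 2 markers
```

With gamma scaled by the block size, the similarity depends on the fraction of mismatching markers rather than on their raw count, which is one plausible reason for the 1/blocksize default.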
If run with tune = FALSE, the function returns a data frame with the following columns:
BLOCK |
Block alias. |
CHR |
Chromosome name. |
BP1 |
Block start position. |
BP2 |
Block end position. |
POP |
Original population label. |
ID |
Individual name. |
HAP1 |
Predicted ancestry of haplotype 1. |
HAP2 |
Predicted ancestry of haplotype 2. |
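As a rough illustration of how a table with these columns might be summarized, the sketch below computes the proportion of block-level ancestry calls per individual. The data frame is a mock with invented values (individual names, block aliases and the ancestry labels "A" and "B" are all hypothetical), and counting every block equally is a simplifying assumption.

```r
# Mock output with the documented columns (all values invented for illustration)
hapadmix <- data.frame(
  BLOCK = c("B1", "B2", "B3", "B1", "B2", "B3"),
  CHR   = "1",
  BP1   = c(1, 1001, 2001, 1, 1001, 2001),
  BP2   = c(1000, 2000, 3000, 1000, 2000, 3000),
  POP   = "Cross",
  ID    = c("ind1", "ind1", "ind1", "ind2", "ind2", "ind2"),
  HAP1  = c("A", "A", "B", "B", "B", "B"),
  HAP2  = c("A", "B", "B", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Stack both haplotypes so every block contributes two ancestry calls
calls <- data.frame(
  ID  = rep(hapadmix$ID, times = 2),
  ANC = c(hapadmix$HAP1, hapadmix$HAP2),
  stringsAsFactors = FALSE
)

# Proportion of calls assigned to each ancestry, per individual (rows sum to 1)
anc_prop <- prop.table(table(calls$ID, calls$ANC), margin = 1)
```

This is only a crude per-individual summary; with overlapping sliding blocks the same markers are counted more than once.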
If tune = TRUE, the function returns a data frame with the following columns:
cost |
The candidate value of the C constant. |
gamma |
The candidate value of the gamma parameter. |
accuracy |
The percentage of correctly assigned ancestries. |
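One way to use such a tuning table is to pick the candidate pair with the highest accuracy. The sketch below does this on a mock data frame; the accuracy values are invented and the object names are hypothetical.

```r
# Mock tuning table with the documented columns (values are invented)
tunesvm <- data.frame(
  cost     = c(1, 1, 1),
  gamma    = c(0.001, 0.01, 0.1),
  accuracy = c(91.2, 97.5, 88.0)
)

# Row with the highest percentage of correctly assigned ancestries
best <- tunesvm[which.max(tunesvm$accuracy), ]
best_cost  <- best$cost
best_gamma <- best$gamma
```

The selected values could then be passed to a second call with tune = FALSE through the cost and gamma arguments. Note that which.max returns the first row in case of ties.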
Yuri Tani Utsunomiya <ytutsunomiya@gmail.com>
R. J. Haasl et al. Genetic ancestry inference using support vector machines, and the active emergence of a unique American population. Eur J Hum Genet. 2013. 21(5):554-62.
D. Meyer et al. e1071: Misc Functions of the Department of Statistics, Probability Theory Group (e1071). TU Wien. 2019. R package version 1.7-0.1. http://cran.r-project.org/web/packages/e1071/index.html.
svm, ghap.ancsmooth, ghap.ancplot, ghap.ancmark
# #### DO NOT RUN IF NOT NECESSARY ###
#
# # Copy phase data in the current working directory
# exfiles <- ghap.makefile(dataset = "example",
#                          format = "phase",
#                          verbose = TRUE)
# file.copy(from = exfiles, to = "./")
#
# # Load phase data
#
# phase <- ghap.loadphase("example")
#
# ### RUN ###
#
# # Calculate marker density
# mrkdist <- diff(phase$bp)
# mrkdist <- mrkdist[which(mrkdist > 0)]
# density <- mean(mrkdist)
#
# # Generate blocks for admixture events up to g = 10 generations in the past
# # Assuming mean block size in Morgans of 1/(2*g)
# # Approximating 1 Morgan ~ 100 Mbp
# g <- 10
# window <- (100e+6)/(2*g)
# window <- ceiling(window/density)
# step <- ceiling(window/4)
# blocks <- ghap.blockgen(phase, windowsize = window,
#                         slide = step, unit = "marker")
#
# # Tune supervised analysis
# train <- unique(phase$id[which(phase$pop != "Cross")])
# ranblocks <- sample(x = 1:nrow(blocks), size = 5, replace = FALSE)
# tunesvm <- ghap.ancsvm(object = phase, blocks = blocks[ranblocks,],
#                        train = train, gamma = 1/window*c(0.1,1,10),
#                        tune = TRUE)
#
# # Supervised analysis with default parameters
# hapadmix <- ghap.ancsvm(object = phase, blocks = blocks,
#                         train = train)
# anctracks <- ghap.ancsmooth(object = phase, admix = hapadmix)
# ghap.ancplot(ancsmooth = anctracks)
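The window arithmetic in the example above can be worked through without loading any data. The sketch below repeats the same steps with an assumed mean marker spacing of 50,000 bp; that spacing is a hypothetical value chosen for round numbers, whereas the example derives it from phase$bp.

```r
g <- 10                 # generations since admixture, as in the example
density <- 50000        # ASSUMED mean marker spacing in bp (hypothetical value)

window_bp <- 100e6 / (2 * g)            # 1/(2*g) Morgans at ~100 Mbp per Morgan
window <- ceiling(window_bp / density)  # block size in markers
step <- ceiling(window / 4)             # slide by a quarter of the window

gamma_grid <- 1 / window * c(0.1, 1, 10)  # candidate gammas around 1/window
```

Under these assumptions the expected block length is 5 Mbp, which gives blocks of 100 markers sliding 25 markers at a time, and a tuning grid spanning one order of magnitude on either side of 1/window.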