Description Usage Arguments Value Author(s) See Also Examples
This function runs JAMPred by analysing many SNP blocks independently in parallel.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | JAMPred(
marginal.betas = NULL,
n.training = NULL,
marginal.logor.ses = NULL,
p.cases.training = NULL,
ref.geno = NULL,
total.snps.genome.wide = NULL,
n.cores = NULL,
n.mil = 0.2,
beta.binom.b.lambda = 1,
beta.binom.a = 1,
snps.blocks = NULL,
initial.block.size = 100,
initial.snps.blocks = NULL,
seed = NULL,
thinning.factor = round(1000/n.mil),
effect.sd.uniform.prior = c(0.05, 2),
residual.var.invgamma.prior = c(0.01, 0.01),
save.path = NULL,
debug = FALSE
)
|
marginal.betas |
Vector of named one-at-a-time SNP effects. NB: This must be a named vector; it is where. the SNP names are derived from. These could be from linear regressions, or log-Odds Ratios. NB: If providing log-ORs, you must also provide marginal.logor.ses and p.cases.training so that a linear transformaiton can be applied. |
n.training |
The sample size in which the marginal.betas were calculated. |
marginal.logor.ses |
IF marginal log-ORs were provided, please provide a named vector of standard errors here. These are necessary to convert them to a linear scale. Do not set if providing linear regression estimates. |
p.cases.training |
IF marginal log-ORs were provided, please provide the proportion of cases in the sample in which they were calculated |
ref.geno |
Reference genotype matrix which will be used to calculate SNP-SNP correlations. Individual's genotype must be coded as a numeric risk allele count 0/1/2. Non-integer values reflecting imputation uncertaimnty may be given. NB: The risk allele coding MUST correspond to that used in marginal.betas and column names must match the names of marginal.betas elements. |
total.snps.genome.wide |
Total number of SNPs being analysed across the genome. Used to determine the level of prior sparsity. |
n.cores |
Number of CPU cores to parralelise the SNP blocks over. If unspecified the number available will be detected. NB: On Windows systems this will be forced to 1, since mclapply does not yet support forking. |
n.mil |
Number of million iterations to run. In the paper we found 0.2 was sufficient to reach adequeate convergence. |
beta.binom.b.lambda |
Spartisty tuning parameter. Suggest a range of values is tried, e.g. 0.01, 0.1, 1, 10. See the paper for more detail. Default is 1. |
beta.binom.a |
First hyper-parameter of the beta-binomial prior on model sparsity. See the paper for more detail. Default 1 can usually be used. |
snps.blocks |
OPTIONAL: A partitioning of the SNPs into blocks, created using the function
|
initial.block.size |
Block size for the initial partitioning. Default is 100. This is ignored when initial.snps.blocks is given. |
initial.snps.blocks |
A user specficied initial partitioning of the SNPs. This must be a list, each element of which is a vector of SNP names that correspond to the columns of ref.geno. This could be defined, for example, according to LD information. |
seed |
An integer specifying the RJMCMC seed. If not set a random number will be used every time this is run. |
thinning.factor |
Determines every ith iteration to use from the RJMCMC sample. The more thinning that is applied the quicker the analysis will be, for the same number of iterations. Leaving at the default should be fine. |
effect.sd.uniform.prior |
Upper and lower uniform hyper-parameters for the prior on the standard deviation of SNP effects. Default is what we used in the paper, a Uniform(0.05, 2) distribution. |
residual.var.invgamma.prior |
Hyper-parameters for the inversegamma prior on the residual variance. Default is what we used in the paper, a Inverse-Gamma(0.01, 0.01) distribution. |
save.path |
Optional path for JAM to store and save its temporary files in. Can help with debugging (default NULL). |
debug |
An option passed to JAM requesting more verbose output (default FALSE). |
A JAMPred results object, which is a list including as elements step1.posterior.mean.snp.weights (which do not adjust for long range LD) and step2.posterior.mean.snp.weights (which do adjust for long range LD). These SNP weights can be used to generate predictions from individual level genotype data.
Paul Newcombe
See JAMPred_SplitIntoPositiveDefiniteBlocks
; this is the function used to create an initial
partitioning of SNPs into blocksm which satisfy the property of all being positvie definite with respect to the
reference genotype matrix (also required by JAMPred). By default this function is used to generate the snps.blocks
list input, if not provided by the user.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 | # --- Load libraries
library(R2BGLiMS)
data("JAMPred_Example")
###########################################################
# --- Simple demonstation of syntax for binary traits --- #
###########################################################
# NB: See third example for modelling different chromosomes and sparsities
# Run JAMPred
snps <- chromosome.snps[[1]] # Only use chromosome 1 data
jampred.res.bin <- JAMPred(
marginal.betas = marginal.logors[snps],
n.training = n.training,
marginal.logor.ses = marginal.logor.ses, # Only necessary for a binary trait
p.cases.training = n.cases.training/n.training, # Only necessary for a binary trait
ref.geno = data.validation[,snps],
total.snps.genome.wide = 500000, # Total SNPs across all chromosomes
n.mil = 0.2,
seed = 1, # For re-producibility. If not set a random seed is used
debug=T
)
# Generate predictions
out.of.sample.predictions <-
data.validation[,jampred.res.bin$snps] %*%
jampred.res.bin$step2.posterior.mean.snp.weights
# Predictive r2
cor(out.of.sample.predictions, data.validation[,"d"])^2 # Should be 0.18
###########################################################
# --- Simple demonstation of syntax for binary traits --- #
###########################################################
# NB: See third example for modelling different chromosomes and sparsities
# Run JAMPred
snps <- chromosome.snps[[1]] # Only use chromosome 1 data
jampred.res.lin <- JAMPred(
marginal.betas = marginal.ctsbetas[snps],
n.training = n.training,
ref.geno = data.validation.cts[,snps],
total.snps.genome.wide = 500000, # Total SNPs across all chromosomes
n.mil = 0.2,
seed = 1 # For re-producibility. If not set a random seed is used
)
# Generate predictions
out.of.sample.predictions <-
0 + # NB: A trait mean could be set here (otherwise it is assumed the outcome is mean-centred)
data.validation.cts[,jampred.res.lin$snps] %*%
jampred.res.lin$step2.posterior.mean.snp.weights
# Predictive r2
cor(out.of.sample.predictions, data.validation.cts[,"y"])^2 # Should be 0.15
##################################################################################
# --- Using JAMPred to model multiple chromosomes under different sparsities --- #
##################################################################################
# Run JAMPred, looping over sparsities and chromosomes
jampred.res.loop <- list() # List to hold all the results
for (lambda in c(0.01, 0.1, 1, 10)) { # Loop over lambda choices (see the paper)
jampred.res.loop[[paste(lambda)]] <- list()
for (chr in c(1:2)) { # Loop over chromosomes
jampred.res.loop[[paste(lambda)]][[chr]] <- JAMPred(
marginal.betas = marginal.logors[chromosome.snps[[chr]]],
n.training = n.training,
marginal.logor.ses = marginal.logor.ses,
p.cases.training = n.cases.training/n.training,
ref.geno = data.validation[,chromosome.snps[[chr]]],
total.snps.genome.wide = 500000,
n.mil = 0.2,
beta.binom.b.lambda = lambda,
initial.block.size = 100, # Default size of SNP blocks
seed = 1 # For re-producibility. If not set a random seed is used
)
}
}
# Generate predictions for each lambda, summing predictive scores over chromosoms
out.of.sample.predictions <- list()
for (lambda in c(0.01, 0.1, 1, 10)) {
out.of.sample.predictions[[paste(lambda)]] <- 0
for (chr in c(1:2)) {
out.of.sample.predictions[[paste(lambda)]] <-
out.of.sample.predictions[[paste(lambda)]] +
data.validation[,jampred.res.loop[[paste(lambda)]][[chr]]$snps] %*%
jampred.res.loop[[paste(lambda)]][[chr]]$step2.posterior.mean.snp.weights
}
}
# Predictive r2
sapply(out.of.sample.predictions, function(preds) cor(preds,data.validation[,"d"])^2)
# Predictive correlation 0.37 and highest at lambda = 0.01
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.