View source: R/logisticRidgeGenotypes.R
logisticRidgeGenotypes | R Documentation |
Fits logistic ridge regression models for genome-wide SNP data. The SNP genotypes are not read into R but file names are passed to the code directly, enabling the analysis of genome-wide SNP data sets which are too big to be read into R.
logisticRidgeGenotypes(genotypesfilename, phenotypesfilename, lambda = -1, thinfilename = NULL, betafilename = NULL, approxfilename = NULL, permfilename = NULL, intercept = TRUE, verbose = FALSE)
genotypesfilename |
character string: path to file containing SNP genotypes coded 0, 1,
2. See |
phenotypesfilename |
character string: path to file containing phenotypes. See |
lambda |
(optional) shrinkage parameter. If not provided, the default denotes automatic choice of the shrinkage parameter using the method of Cule & De Iorio (2012). |
thinfilename |
(optional) character string: path to file containing three columns: SNP name, chromosme and SNP psotion. See |
betafilename |
(optional) character string: path to file where the output will be written. See |
approxfilename |
(optional) character string: path to fine where the approximate test p-values will be written.
Approximate p-values are not computed unless this argument is given. Approximate p-values
are computed using the method of Cule et al (2011). See |
permfilename |
(optional) character string: path to file where the permutation test
p-values will be written.
Permutation test p-values are not computed unless this argument is
given. (See warning). See |
intercept |
Logical: Should the ridge regression model be fitted with an
intercept? Defaults to |
verbose |
Logical: If |
If a file thin
is supplied, and the shrinkage parameter
lambda
is being computed automatically based on the data, then
this file is used to thin the SNP data by SNP position. If this file
is not supplied, SNPs are thinned automatically based on number of SNPs.
The vector of fitted ridge regression coefficients.
If betafilename
is given, the fitted coefficients are written to this
file as well as being returned.
If approxfilename
and/or permfilename
are given, results of approximate
test p-values and/or permutation test p-values are written to the files
given in their arguments.
A header row, plus one row for each individual, one SNP per column. The header row contains SNP names. SNPs are coded as 0, 1, 2 for minor allele count. Missing values are not accommodated. Invariant SNPs in the data cause an error, please remove these from the file before calling the function.
A single column of phenotypes with the individuals in the same order as those in the file genotypesfilename
. Phenotypes must be coded as 0 or 1.
(optional) Three columns and the same number of rows as there are SNPs in the file genotypesfilename
, one row per SNP. First column: SNP names (must match names in genotypesfilename
); second column: chromosome; third column: SNP position in BP.
All output files are optional. Whether or not betafilename
is provided, fitted coefficients are returned to the R workshpace. If betafilename
is provided, fitted coefficients are written to the file specified (in addition).
Two columns: First column is SNP names in same order as in genotypesfilename
, second column is fitted coefficients. If intercept = TRUE
(the default) then the first row is the fitted intercept (with the name Intercept in the first column).
Two columns: First column is SNP names in same order as in genotypesfilename
, second column is approximate p-values.
Two columns: First column is SNP names in same order as in genotypesfilename
, second column is permutation p-values.
When data are large, the permutation test p-values
may take a very long time to compute. It is recommended not to request
permutation test p-values (using the argument permfilename
)
when data are large.
Erika Cule
Significance testing in ridge regression for genetic data. Cule, E. et al (2011) BMC Bioinformatics, 12:372 A semi-automatic method to guide the choice of ridge parameter in ridge regression. Cule, E. and De Iorio, M. (2012) arXiv:1205.0686v1 [stat.AP]
logisticRidge
for fitting logistic ridge regression models
when the data are small enough to be read into R.
linearRidge
and linearRidgeGenotypes
for fitting linear ridge
regression models.
## Not run: genotypesfile <- system.file("extdata","GenBin_genotypes.txt",package = "ridge") phenotypesfile <- system.file("extdata","GenBin_phenotypes.txt",package = "ridge") beta_logisticRidgeGenotypes <- logisticRidgeGenotypes(genotypesfilename = genotypesfile, phenotypesfilename = phenotypesfile) ## compare to output of logisticRidge data(GenBin) ## Same data as in GenBin_genotypes.txt and GenBin_phenotypes.txt beta_logisticRidge <- logisticRidge(Phenotypes ~ ., data = as.data.frame(GenBin)) cbind(round(coef(beta_logisticRidge), 6), beta_logisticRidgeGenotypes) ## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.