krm.most: Kernel-based Regression Model Maximum of adjusted Score Test
In krm: Kernel Based Regression Models

View source: R/krm.most.R

krm.most

R Documentation

Kernel-based Regression Model Maximum of adjusted Score Test

Description

Performs maximum of adjusted score test for kernel-based regression models. Both Euclidean and protein sequence covariates can be used to form kernels.

Usage

krm.most (formula, data, regression.type=c("logistic","linear"), 
    kern.type=c("rbf","mi","mm","prop"), n.rho=10, range.rho=0.99, 
    n.mc=2000, 
    seq.file.name=NULL, formula.kern=NULL, seq.start=NULL, seq.end=NULL,
    inference.method=c("parametric.bootstrap", "perturbation", "Davies"),
    verbose=FALSE)

Arguments

`formula`	a formula object describing the null model
`data`	data frame
`regression.type`	logistic regression or linear regression
`kern.type`	rbf: radial basis function kernel, a kernel type for Euclidean covariates. The other three kernels are for protein sequence covariates (Fong et al. 2014). mm: match-mismatch, prop: physicochemical properties, mi: mutual information,
`n.rho`	integer. Number of rhos to maximize over
`range.rho`	numeric. A number between 0 and 1. It controls the range of rhos to use to compute kernel
`seq.file.name`	There are two ways to provide protein sequence information. One is to supply a sequence file named 'seq.file.name', which contains sequences in fasta format. Two is to supply a formula through formula.kern, and the variable name should be part of data.
`formula.kern`	The formula for the covariates used to form the kernel. It may specify Euclidean covariates or a string covariate that contains protein sequences.
`seq.start`	integer. Start position of subsequence to be used in computing kernel. Only supported when the sequence is specified through formula.kern.
`seq.end`	integer. End position of subsequence to be used in computing kernel. Only supported when the sequence is specified through formula.kern.
`n.mc`	integer. Number of bootstrap samples used to compute p-values.
`inference.method`	parametric.bootstrap implements methods from Fong et al. (2014). perturbation uses methods from Wu et a. (2013) Davies uses the upper bound method from Davies (1987) and Liu et al. (2008).
`verbose`	boolean

Value

A list of class krm

p.values

If inference.method=="Davies", a single p-value. If inference.method=="perturbation" or "parametric.bootstrap", a vector of four p-values, named chiI, chiII, normI, normII. For perturbation, chiII and normII are NA. chiI/chiII p-values are based on chi-squared approximation and normI/normII are based on normal approximations. chiI/normI p-values are based on plugin estimator of mean and variance of score statistic, chiII/normII are based on modified estimator of mean and variance of score statistic. chiII or normII are more powerful than chiI and normI. For more details, see Fong et al. (2014)

References

Fong, Y. and Datta, S. and Georgiev, I. and Kwong, P. and Tomaras, G. (2014) Kernel-based Logistic Regression Model for Protein Sequence Without Vectorialization, Biostatistics.

Wu et al. (2013) Kernel Machine SNP-Set Testing Under Multiple Candidate Kernels, Genetic epidemiology.

Davies, R. (1987) Hypothesis testing when a nuisance parameter is present only under the alternative, Biometrika, 74, 33-43.

Liu, D., Ghosh, D. and Lin, X. (2008) Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models, BMC Bioinformatics, 9, 292.

Examples


# in addition to the examples listed here, there are more examples 
# under folder R/library/krm/unitTests

## Not run: 
# the examples are not run during package build because it takes a little too long to run

# an Euclidean kernel example from Liu et al. (2008)
data=sim.liu.2008 (n=100, a=.1, seed=1) 
test = krm.most(y~x, data, formula.kern=~z.1+z.2+z.3+z.4+z.5, kern.type="rbf")


# a protein sequence kernel example
dat.file.name=paste(system.file(package="krm")[1],'/misc/y1.txt', sep="") 
seq.file.name=paste(system.file(package="krm")[1],'/misc/sim1.fasta', sep="")
dat=read.table(dat.file.name); names(dat)="y"
test = krm.most (y~1, dat, seq.file.name=seq.file.name, kern.type="mi")



## End(Not run)

krm documentation built on Oct. 18, 2022, 9:09 a.m.