Stepwise regression for SNP selection and haplotype testing
snps |
(n, m)-Matrix; n = no. of individuals; m = no. of SNPs; Rohde code |
trait |
numeric; Outcome, phenotype |
famid |
vector; family identifier for every individual; only needed when type=families |
patid |
vector; identifier for every individual; only needed when type=families |
fid |
vector; identifier of the father (0=unknown); only needed when type=families |
mid |
vector; identifier of the mother (0=unknown); only needed when type=families |
adj.var |
(n, m)-Matrix; n = no. of individuals; m = no. of covariates; variables for adjustment |
lim |
numeric; threshold for skipping haplotypes from analysis |
maxSNP |
integer; maximum number of SNPs grouped into a multi-locus genotype |
nt |
integer; number of best hits retained at every step |
sort.by |
the results of each step are sorted by AIC ("AIC"), corrected AIC ("AICc"), or p value ("p.value"). Default = "AICc". |
selection |
0 = none; 1 = improvement on the lowest corrected AIC (AICc) of the previous step; 2 = improvement on the lowest AIC of the previous step; 3 = improvement of the p value; 4 = improvement of the best ten log10(p values); 5 = improvement of the single AICc by adding one SNP to the retained pattern |
p.threshold |
numeric vector; if the global p value is lower than p.threshold[i], the pattern is stored for further processing. Here i indicates the number of SNPs. If the calculation starts with all pairwise SNPs, p.threshold[1] is not used but must still be included. |
pair.begin |
If TRUE, the procedure begins with all two-SNP genotypes. Attention: k SNPs lead to choose(k, 2) = k * (k - 1) / 2 possible pairs |
pattern.begin.mat |
if pattern.begin.mat is not NA, then this is the starting point of
type |
type of dependent variable
baseline.hap |
Choose the baseline haplotype for the statistical test to avoid singularity: "max" for the most frequent haplotype and "min" for the least frequent haplotype |
min.count |
minimum count of rare haplotypes. If the count of estimated haplotypes is < min.count, the combined rare haplotypes are excluded from the analysis of that specific pattern. |
sort |
A logical value (TRUE or FALSE). If TRUE, family data will be sorted. |
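The sort.by and selection arguments rank patterns by AIC or corrected AIC. As a minimal sketch, assuming the standard small-sample correction AICc = AIC + 2k(k + 1)/(n - k - 1) (whether msr uses exactly this form is not stated on this page), ranking candidate patterns and keeping the nt best ones might look like this in Python. The function names and the candidate values are hypothetical and serve only as an illustration.

```python
def aic(log_lik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """AIC with the usual small-sample correction (n = sample size)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)


def best_hits(candidates, n, nt):
    """Return the nt candidates with the lowest AICc.

    candidates: list of (pattern, log_lik, k) tuples, where pattern is a
    tuple of SNP numbers (hypothetical layout, not the msr return value).
    """
    scored = [(aicc(log_lik, k, n), pattern) for pattern, log_lik, k in candidates]
    scored.sort(key=lambda item: item[0])  # lower AICc is better
    return scored[:nt]


# Made-up log-likelihoods for three two-SNP patterns.
candidates = [((1, 2), -130.0, 3), ((1, 3), -128.5, 3), ((2, 3), -131.2, 4)]
top = best_hits(candidates, n=200, nt=2)
```

Here top holds the two lowest-AICc patterns in ascending order, which mirrors what nt and sort.by = "AICc" describe.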
Haplotypes are inferred by the EM algorithm (Excoffier and Slatkin 1995). Family haplotypes are inferred by the modified EM algorithm proposed by Rohde (2001, 2003).
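To illustrate the idea behind EM haplotype inference for independent individuals, here is a minimal Python sketch. It is not the package's implementation and does not cover the family-based modification. It assumes biallelic SNPs coded as 0/1/2 copies of one allele (an assumed coding, not necessarily the Rohde code) and Hardy-Weinberg proportions; all function names are hypothetical.

```python
from itertools import product


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    per_snp = []
    for g in genotype:
        if g == 0:
            per_snp.append([(0, 0)])
        elif g == 2:
            per_snp.append([(1, 1)])
        else:  # heterozygous: phase is unknown
            per_snp.append([(0, 1), (1, 0)])
    pairs = set()
    for phase in product(*per_snp):
        h1 = tuple(a for a, _ in phase)
        h2 = tuple(b for _, b in phase)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-10):
    """Estimate haplotype frequencies from unphased genotypes by EM."""
    m = len(genotypes[0])
    haps = list(product((0, 1), repeat=m))
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform start
    for _ in range(n_iter):
        counts = dict.fromkeys(haps, 0.0)
        for g in genotypes:
            pairs = compatible_pairs(g)
            # E step: posterior weight of each compatible pair under HWE
            weights = [freq[a] * freq[b] * (1 if a == b else 2) for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M step: expected haplotype counts divided by 2n chromosomes
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freq[h] - freq[h]) for h in haps)
        freq = new_freq
        if delta < tol:
            break
    return freq


# Two SNPs, three individuals; the double heterozygote (1, 1) is ambiguous.
freqs = em_haplotype_freqs([(0, 0), (2, 2), (1, 1)])
```

In this toy data set the two homozygous individuals carry haplotypes (0, 0) and (1, 1), so the EM iterations resolve the ambiguous double heterozygote toward that same pair.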
For normally distributed phenotypes from independent individuals we prefer an F test, and for case-control data we prefer the likelihood ratio test (logistic regression). Both compare a full model with genetic and non-genetic factors to a reduced model that includes only the non-genetic variables. If no non-genetic variable is specified, the reduced model contains only the intercept. If one of these tests is significant, we assume a genetic effect. For family data the weighted TDT statistic is used.
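The full-versus-reduced comparison can be sketched with the textbook forms of the two statistics. The Python functions below are hypothetical helpers, not part of the package; they take already fitted quantities (residual sums of squares or log-likelihoods) and return only the test statistic. The p value would then come from the F distribution or the chi-square distribution with the matching degrees of freedom.

```python
def f_statistic(rss_reduced, rss_full, p_reduced, p_full, n):
    """F statistic for nested linear models.

    rss_*: residual sums of squares; p_*: number of estimated
    parameters (including the intercept); n: number of individuals.
    """
    if not p_full > p_reduced:
        raise ValueError("full model must have more parameters")
    numerator = (rss_reduced - rss_full) / (p_full - p_reduced)
    denominator = rss_full / (n - p_full)
    return numerator / denominator


def lr_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for nested (e.g. logistic) models."""
    return 2.0 * (loglik_full - loglik_reduced)


# Illustrative numbers only: intercept-only model vs. two added haplotype terms.
f_value = f_statistic(rss_reduced=120.0, rss_full=100.0, p_reduced=1, p_full=3, n=103)
lr_value = lr_statistic(loglik_reduced=-70.0, loglik_full=-65.0)
```

With these inputs the F statistic has (2, 100) degrees of freedom, and the likelihood ratio statistic would be compared to a chi-square distribution with 2 degrees of freedom if the full model adds two parameters.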
The multi-locus stepwise regression procedure can be time-consuming.
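One reason for the run time is the number of patterns to test. As a rough illustration in Python (the helper name is hypothetical), the number of starting pairs when the procedure begins with all pairwise SNPs is choose(k, 2):

```python
from itertools import combinations
from math import comb  # available in Python 3.8+


def snp_pairs(k):
    """All unordered pairs of SNP numbers 1..k."""
    return list(combinations(range(1, k + 1), 2))


pairs = snp_pairs(4)   # (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)
n_pairs_10 = comb(10, 2)    # 45
n_pairs_100 = comb(100, 2)  # 4950
```

The pair count grows quadratically in k, and each later step adds further SNPs to the stored patterns, so thresholds such as p.threshold and nt help to limit the search.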
msr provides a list with maxSNP components.
list |
one component for every step: SNP numbers and test details. |
Sven Knueppel and Klaus Rohde
Excoffier L, Slatkin M. Mol Biol Evol. 1995 Sep;12(5):921-7.
Rohde K, Fuerst R. Hum Hered. 2003;56(1-3):41-7.
Rohde K, Fuerst R. Hum Mutat. 2001 Apr;17(4):289-95.
Knueppel S, Esparza-Gordillo J, Marenholz I, Holzhuetter HG, Bauerfeind A, Ruether A, Weidinger S, Lee Y-A, Rohde K. Multi-locus stepwise regression: a haplotype-based algorithm for finding genetic associations applied to atopic dermatitis. BMC Med Genet 2012;13(1):8.