Fits a generalized linear mixed-effects model (GLMM) with a logistic link and a normally distributed random intercept for each cluster to test associations between a dichotomous phenotype and all genotyped SNPs in a genotype file, in family data, under a user-specified genetic model. Each pedigree is treated as a cluster.
This function applies the same trait-SNP association test to all SNPs in the genotype data. When analyzing rare variants for dichotomous traits, this GLMM, as implemented by this function, is recommended over other methods such as GEE. The trait-SNP association test is carried out by the glmm.lgst function, which uses the lmer function from package lme4.
glmm.lgst.batch(genfile, phenfile, pedfile, outfile, phen, covars = NULL,
    model = "a", col.names = T, sep.ped = ",", sep.phe = ",", sep.gen = ",")
genfile |
a character string naming the genotype file for reading (see format requirement in details) |
phenfile |
a character string naming the phenotype file for reading (see format requirement in details) |
pedfile |
a character string naming the pedigree file for reading (see format requirement in details) |
outfile |
a character string naming the result file for writing |
phen |
a character string for a phenotype name in phenfile |
covars |
a character vector for covariates in phenfile |
model |
a single character of 'a','d','g', or 'r', with 'a'=additive, 'd'=dominant, 'g'=general and 'r'=recessive models |
col.names |
a logical value indicating whether the output file should contain column names |
sep.ped |
the field separator character for pedigree file |
sep.phe |
the field separator character for phenotype file |
sep.gen |
the field separator character for genotype file |
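The four values of model can be illustrated by recoding a 0/1/2 genotype vector. This is only a sketch under the conventional definitions of additive, dominant, recessive and general coding; recode_genotype is a hypothetical helper written for this illustration and is not part of GWAF, whose internal coding may differ.

```r
# Genotypes coded as the number of copies of the coded allele (0, 1, 2);
# NA marks a missing genotype. All values are made up.
geno <- c(0, 1, 2, 1, 0, NA, 2)

# Hypothetical helper (not part of GWAF): recode a 0/1/2 genotype vector
# under the conventional definition of each genetic model.
recode_genotype <- function(x, model = c("a", "d", "r", "g")) {
  model <- match.arg(model)
  switch(model,
    a = x,                        # additive: allele count used as is
    d = as.numeric(x >= 1),       # dominant: 1 or 2 copies vs. 0 copies
    r = as.numeric(x == 2),       # recessive: 2 copies vs. 0 or 1 copy
    g = factor(x, levels = 0:2)   # general: one level per genotype
  )
}

additive  <- recode_genotype(geno, "a")
dominant  <- recode_genotype(geno, "d")
recessive <- recode_genotype(geno, "r")
general   <- recode_genotype(geno, "g")
```

Under the general coding the SNP enters the model as a three-level factor, which is consistent with the separate beta10, beta20 and beta21 columns reported for model 'g'.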
The glmm.lgst.batch function first reads in and merges the phenotype-covariate, genotype and pedigree files, then tests the association of phen against all SNPs in genfile.
genfile contains unique individual ids and genotype data, with the column names being "id" and the SNP names. For each genotyped SNP, the genotype data should be coded as 0, 1, 2, indicating the number of copies of the coded allele. SNP names in the genotype file should not contain dashes ('-') or other special characters (dots and underscores are OK). phenfile contains unique individual ids, phenotype and covariate data, with the column names being "id" and the phenotype and covariate names. pedfile contains pedigree information, with the column names being "famid","id","fa","mo","sex". In all files, a missing value should be an empty space, except for a missing parental id in pedfile.
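As a rough illustration of the file layout described above, the following sketch writes a toy genotype file and a toy phenotype file with base R and runs a few sanity checks on them. All ids, SNP names (rs1, rs_2), column names (aff, age) and values are made up, and missing values are written as empty fields on the assumption that this is what "an empty space" means. The pedigree file is left out of the sketch.

```r
# Made-up toy data; every name and value here is illustrative only.
geno <- data.frame(
  id   = c(101, 102, 103, 201, 202, 203),
  rs1  = c(0, 1, 1, 2, NA, 0),   # 0/1/2 copies of the coded allele
  rs_2 = c(1, 1, 0, 0, 2, 1)     # underscores are allowed in SNP names
)
phen <- data.frame(
  id  = c(101, 102, 103, 201, 202, 203),
  aff = c(0, 1, 1, 0, NA, 1),    # dichotomous phenotype coded 0/1
  age = c(61, 58, 33, 70, 66, 41)
)

genfile  <- file.path(tempdir(), "toygen.csv")
phenfile <- file.path(tempdir(), "toyphen.csv")

# na = "" writes missing values as empty fields rather than "NA"
write.csv(geno, genfile, row.names = FALSE, na = "")
write.csv(phen, phenfile, row.names = FALSE, na = "")

# Read the files back and check them against the stated requirements
geno_in <- read.csv(genfile)
phen_in <- read.csv(phenfile)

snp_names  <- setdiff(names(geno_in), "id")
# SNP names containing anything other than letters, digits, dots, underscores
bad_names  <- snp_names[grepl("[^A-Za-z0-9._]", snp_names)]
# Genotypes must be 0, 1, 2 or missing
codes_ok   <- all(unlist(geno_in[snp_names]) %in% c(0, 1, 2, NA))
# Individual ids must be unique within each file
ids_unique <- !anyDuplicated(geno_in$id) && !anyDuplicated(phen_in$id)
```

Checks of this kind are cheap to run before a long batch job, since a single badly named SNP column or a stray genotype code is easier to find in the input than in the output.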
Only phenotypes with two categories are analyzed. A phenotype should be coded as
0 and 1, with 1 denoting affected and 0 unaffected. SNPs with low genotype counts
(especially for the minor allele homozygote) may be omitted, analyzed with a dominant model, or
analyzed with logistic regression.
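One way to spot SNPs with low genotype counts is to tabulate genotype by affection status and look at the expected cell counts. The sketch below uses made-up vectors and plain base R; it is not the check GWAF performs internally, although the 'exp count<5' remark in the output suggests a similar threshold.

```r
# Made-up genotype (0/1/2) and affection status (0/1) for 20 individuals.
snp <- c(0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 2, 0, 1, 0, 0, 1, 0, 0, 1, 0)
aff <- c(0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0)

# Phenotype-by-genotype table; fixed levels keep empty cells visible
tab <- table(aff = factor(aff, levels = 0:1),
             snp = factor(snp, levels = 0:2))

# Expected counts under independence: row total * column total / n
expected  <- outer(rowSums(tab), colSums(tab)) / sum(tab)
low_count <- any(expected < 5)

# Collapsing 1 and 2 copies (dominant coding) removes the sparse column
snp_dom <- as.numeric(snp >= 1)
tab_dom <- table(aff = aff, snp = snp_dom)
```

Here only one individual carries two copies of the coded allele, so the general or additive coding leaves a nearly empty column, while the dominant coding does not.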
The glmm.lgst.batch function fits the GLMM, using each pedigree as a cluster, with the glmm.lgst function from the GWAF package and the lmer function from the lme4 package.
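The model described here, a logistic regression with a normally distributed random intercept per pedigree, can be sketched on simulated data. This is not the GWAF code: it calls glmer, the function current versions of lme4 provide for generalized linear mixed models, and it simulates genotypes independently within families, which ignores Mendelian transmission. The effect sizes, family sizes and variable names are all assumptions made for the illustration.

```r
set.seed(1)

# Simulated family data: 40 pedigrees of 5 members each (illustrative only)
n_fam    <- 40
fam_size <- 5
famid    <- rep(seq_len(n_fam), each = fam_size)

# Additive genotype, simulated independently (ignores within-family inheritance)
snp <- rbinom(n_fam * fam_size, size = 2, prob = 0.3)

# Random intercept shared by all members of a pedigree
u   <- rnorm(n_fam, sd = 1)[famid]
eta <- -1 + 0.5 * snp + u                  # linear predictor on the logit scale
aff <- rbinom(length(eta), size = 1, prob = plogis(eta))

dat <- data.frame(famid = factor(famid), snp = snp, aff = aff)

# Fit the GLMM only if lme4 is installed
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(aff ~ snp + (1 | famid), data = dat, family = binomial)
  # Estimate and standard error of the SNP effect (log odds ratio per allele)
  print(coef(summary(fit))["snp", c("Estimate", "Std. Error")])
}
```

The term (1 | famid) is what makes each pedigree a cluster: relatives share one intercept deviation, which induces within-family correlation in the phenotype.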
No value is returned. Instead, results are written to outfile.
When the genetic model is 'a', 'd' or 'r', the result includes the following columns.
When the genetic model is 'g', beta and se are replaced with beta10, beta20, beta21, se10, se20, and se21.
phen |
phenotype name |
snp |
SNP name |
n0 |
the number of individuals with 0 copies of the coded allele |
n1 |
the number of individuals with 1 copy of the coded allele |
n2 |
the number of individuals with 2 copies of the coded allele |
nd0 |
the number of individuals with 0 copies of the coded allele in the affected sample |
nd1 |
the number of individuals with 1 copy of the coded allele in the affected sample |
nd2 |
the number of individuals with 2 copies of the coded allele in the affected sample |
miss.0 |
Genotype missing rate in unaffected sample |
miss.1 |
Genotype missing rate in affected sample |
miss.diff.p |
P-value of differential missingness test between unaffected and affected samples |
beta |
regression coefficient of SNP covariate |
se |
standard error of beta |
chisq |
Chi-square statistic for testing beta |
df |
degrees of freedom of the Chi-square statistic |
model |
model actually used in the analysis |
remark |
warning or additional information for the analysis, 'exp count<5' indicates any expected count is less than 5 in phenotype-genotype table; 'collinearity' indicates collinearity exists between SNP and some covariates |
pval |
p-value of the chi-square statistic |
beta10 |
regression coefficient of genotype with 1 copy of the coded allele vs. that with 0 copies |
beta20 |
regression coefficient of genotype with 2 copies of the coded allele vs. that with 0 copies |
beta21 |
regression coefficient of genotype with 2 copies of the coded allele vs. that with 1 copy |
se10 |
standard error of beta10 |
se20 |
standard error of beta20 |
se21 |
standard error of beta21 |
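The relationship between several of the output columns can be shown with a small calculation. The numbers below are made up, and the Wald form of the chi-square statistic is an assumption, since this page does not state which test produces chisq. The p-value step follows directly from the column descriptions.

```r
# Made-up estimate and standard error for a single SNP (model 'a', 'd' or 'r')
beta <- 0.42
se   <- 0.15

# Assumed Wald statistic: squared ratio of estimate to standard error, 1 df
chisq <- (beta / se)^2
df    <- 1

# Upper-tail probability of the chi-square distribution
pval <- pchisq(chisq, df = df, lower.tail = FALSE)

# Under the general model, the 2-vs-1 contrast is the difference of the
# two contrasts against the 0-copy genotype (made-up values again)
beta10 <- 0.30
beta20 <- 0.75
beta21 <- beta20 - beta10
```

The standard error se21 cannot be recovered from se10 and se20 alone, because it also depends on the covariance between the two estimates.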
Qiong Yang <qyang@bu.edu> and Ming-Huei Chen <mhchen@bu.edu>
## Not run:
glmm.lgst.batch(phenfile = "simphen.csv", genfile = "simgen.csv", pedfile = "simped.csv",
    phen = "SIMQT", model = "d", outfile = "simout.csv", sep.ped = ",", sep.phe = ",", sep.gen = ",")
## End(Not run)