scan1 | R Documentation |
Genome scan with a single-QTL model by Haley-Knott regression or a linear mixed model, with possible allowance for covariates.
scan1(
genoprobs,
pheno,
kinship = NULL,
addcovar = NULL,
Xcovar = NULL,
intcovar = NULL,
weights = NULL,
reml = TRUE,
model = c("normal", "binary"),
hsq = NULL,
cores = 1,
...
)
genoprobs |
Genotype probabilities as calculated by
|
pheno |
A numeric matrix of phenotypes, individuals x phenotypes. |
kinship |
Optional kinship matrix, or a list of kinship matrices (one per chromosome), in order to use the LOCO (leave one chromosome out) method. |
addcovar |
An optional numeric matrix of additive covariates. |
Xcovar |
An optional numeric matrix with additional additive covariates used for null hypothesis when scanning the X chromosome. |
intcovar |
An numeric optional matrix of interactive covariates. |
weights |
An optional numeric vector of positive weights for the
individuals. As with the other inputs, it must have |
reml |
If |
model |
Indicates whether to use a normal model (least
squares) or binary model (logistic regression) for the phenotype.
If |
hsq |
Considered only if |
cores |
Number of CPU cores to use, for parallel calculations.
(If |
... |
Additional control parameters; see Details. |
We first fit the model y = X \beta + \epsilon
where X
is a matrix of covariates (or just an intercept) and
\epsilon
is multivariate normal with mean 0 and covariance
matrix \sigma^2 [h^2 (2 K) + I]
where
K
is the kinship matrix and I
is the identity matrix.
We then take h^2
as fixed and then scan the genome, at
each genomic position fitting the model y = P \alpha + X \beta
+ \epsilon
where P
is a matrix of genotype
probabilities for the current position and again X
is a
matrix of covariates \epsilon
is multivariate normal with
mean 0 and covariance matrix \sigma^2 [h^2 (2 K) +
I]
, taking h^2
to be known.
Note that if weights
are provided, the covariance matrix becomes
\sigma^2 [h^2 (2 K) + D]
where
D
is a diagonal matrix with the reciprocal of the weights.
For each of the inputs, the row names are used as
individual identifiers, to align individuals. The genoprobs
object should have a component "is_x_chr"
that indicates
which of the chromosomes is the X chromosome, if any.
The ...
argument can contain several additional control
parameters; suspended for simplicity (or confusion, depending on
your point of view). tol
is used as a tolerance value for linear
regression by QR decomposition (in determining whether columns are
linearly dependent on others and should be omitted); default
1e-12
. intcovar_method
indicates whether to use a high-memory
(but potentially faster) method or a low-memory (and possibly
slower) method, with values "highmem"
or "lowmem"
; default
"lowmem"
. max_batch
indicates the maximum number of phenotypes
to run together; default is unlimited. maxit
is the maximum
number of iterations for converence of the iterative algorithm
used when model=binary
. bintol
is used as a tolerance for
converence for the iterative algorithm used when model=binary
.
eta_max
is the maximum value for the "linear predictor" in the
case model="binary"
(a bit of a technicality to avoid fitted
values exactly at 0 or 1).
If kinship
is absent, Haley-Knott regression is performed.
If kinship
is provided, a linear mixed model is used, with a
polygenic effect estimated under the null hypothesis of no (major)
QTL, and then taken as fixed as known in the genome scan.
If kinship
is a single matrix, then the hsq
in the results is a vector of heritabilities (one value for each phenotype). If
kinship
is a list (one matrix per chromosome), then
hsq
is a matrix, chromosomes x phenotypes.
An object of class "scan1"
: a matrix of LOD scores, positions x phenotypes.
Also contains one or more of the following attributes:
sample_size
- Vector of sample sizes used for each
phenotype
hsq
- Included if kinship
provided: A matrix of
estimated heritabilities under the null hypothesis of no
QTL. Columns are the phenotypes. If the "loco"
method was
used with calc_kinship()
to calculate a list
of kinship matrices, one per chromosome, the rows of hsq
will be the heritabilities for the different chromosomes (well,
leaving out each one). If Xcovar
was not NULL, there will at
least be an autosome and X chromosome row.
Haley CS, Knott SA (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69:315–324.
Kang HM, Zaitlen NA, Wade CM, Kirby A, Heckerman D, Daly MJ, Eskin E (2008) Efficient control of population structure in model organism association mapping. Genetics 178:1709–1723.
scan1perm()
, scan1coef()
, cbind.scan1()
, rbind.scan1()
, scan1max()
# read data
iron <- read_cross2(system.file("extdata", "iron.zip", package="qtl2"))
# insert pseudomarkers into map
map <- insert_pseudomarkers(iron$gmap, step=1)
# calculate genotype probabilities
probs <- calc_genoprob(iron, map, error_prob=0.002)
# grab phenotypes and covariates; ensure that covariates have names attribute
pheno <- iron$pheno
covar <- match(iron$covar$sex, c("f", "m")) # make numeric
names(covar) <- rownames(iron$covar)
Xcovar <- get_x_covar(iron)
# perform genome scan
out <- scan1(probs, pheno, addcovar=covar, Xcovar=Xcovar)
# leave-one-chromosome-out kinship matrices
kinship <- calc_kinship(probs, "loco")
# genome scan with a linear mixed model
out_lmm <- scan1(probs, pheno, kinship, covar, Xcovar)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.