Description Usage Arguments Details Value Models Methods Covariates X chromosome Author(s) References See Also Examples
Genome scan with a single QTL model, with possible allowance for covariates, using any of several possible models for the phenotype and any of several possible numerical methods.
1 2 3 4 5 6 7  scanone(cross, chr, pheno.col=1, model=c("normal","binary","2part","np"),
method=c("em","imp","hk","ehk","mr","mrimp","mrargmax"),
addcovar=NULL, intcovar=NULL, weights=NULL,
use=c("all.obs", "complete.obs"), upper=FALSE,
ties.random=FALSE, start=NULL, maxit=4000,
tol=1e4, n.perm, perm.Xsp=FALSE, perm.strata=NULL, verbose,
batchsize=250, n.cluster=1, ind.noqtl)

cross 
An object of class 
chr 
Optional vector indicating the chromosomes for which LOD
scores should be calculated. This should be a vector of character
strings referring to chromosomes by name; numeric values are
converted to strings. Refer to chromosomes with a preceding 
pheno.col 
Column number in the phenotype matrix which should be
used as the phenotype. This can be a vector of integers; for methods

model 
The phenotype model: the usual normal model, a model for binary traits, a twopart model or nonparametric analysis 
method 
Indicates whether to use the EM algorithm,
imputation, HaleyKnott regression, the extended HaleyKnott method,
or marker regression. Not all methods are available for all models.
Marker regression is performed either by dropping individuals with
missing genotypes ( 
addcovar 
Additive covariates; allowed only for the normal and binary models. 
intcovar 
Interactive covariates (interact with QTL genotype); allowed only for the normal and binary models. 
weights 
Optional weights of individuals. Should be either NULL
or a vector of length n.ind containing positive weights. Used only
in the case 
use 
In the case that multiple phenotypes are selected to be scanned, this argument indicates whether to use all individuals, including those missing some phenotypes, or just those individuals that have data on all selected phenotypes. 
upper 
Used only for the twopart model; if true, the "undefined" phenotype is the maximum observed phenotype; otherwise, it is the smallest observed phenotype. 
ties.random 
Used only for the nonparametric "model"; if TRUE, ties in the phenotypes are ranked at random. If FALSE, average ranks are used and a corrected LOD score is calculated. 
start 
Used only for the EM algorithm with the normal model and
no covariates. If 
maxit 
Maximum number of iterations for methods 
tol 
Tolerance value for determining convergence for methods

n.perm 
If specified, a permutation test is performed rather than an analysis of the observed data. This argument defines the number of permutation replicates. 
perm.Xsp 
If 
perm.strata 
If 
verbose 
In the case 
batchsize 
The number of phenotypes (or permutations) to be run
as a batch; used only for methods 
n.cluster 
If the package 
ind.noqtl 
Indicates individuals who should not be allowed a QTL effect (used rarely, if at all); this is a logical vector of same length as there are individuals in the cross. 
Use of the EM algorithm, HaleyKnott regression, and the extended
HaleyKnott method require that multipoint genotype probabilities are
first calculated using calc.genoprob
. The
imputation method uses the results of sim.geno
.
Individuals with missing phenotypes are dropped.
In the case that n.perm
>0, so that a permutation
test is performed, the R function scanone
is called repeatedly.
If perm.Xsp=TRUE
, separate permutations are performed for the
autosomes and the X chromosome, so that an Xchromosomespecific
threshold may be calculated. In this case, n.perm
specifies
the number of permutations used for the autosomes; for the X
chromosome, n.perm
* L_A/L_X permutations
will be run, where L_A and L_X are the total genetic
lengths of the autosomes and X chromosome, respectively. More
permutations are needed for the X chromosome in order to obtain
thresholds of similar accuracy.
For further details on the models, the methods and the use of covariates, see below.
If n.perm
is missing, the function returns a data.frame whose
first two columns contain the chromosome IDs and cM positions.
Subsequent columns contain the LOD scores for each phenotype.
In the case of the twopart model, there are three LOD score columns
for each phenotype: LOD(p,mu), LOD(p) and
LOD(mu). The result is given class "scanone"
and
has attributes "model"
, "method"
, and
"type"
(the latter is the type of cross analyzed).
If n.perm
is specified, the function returns the results of a
permutation test and the output has class "scanoneperm"
. If
perm.Xsp=FALSE
, the function returns a matrix with
n.perm
rows, each row containing the genomewide
maximum LOD score for each of the phenotypes. In the case of the
twopart model, there are three columns for each phenotype,
corresponding to the three different LOD scores. If
perm.Xsp=TRUE
, the result contains separate permutation results
for the autosomes and the X chromosome respectively, and an attribute
indicates the lengths of the chromosomes and an indicator of which
chromosome is X.
The normal model is the standard model for QTL mapping (see Lander and Botstein 1989). The residual phenotypic variation is assumed to follow a normal distribution, and analysis is analogous to analysis of variance.
The binary model is for the case of a binary phenotype, which
must have values 0 and 1. The proportions of 1's in the different
genotype groups are compared. Currently only methods em
, hk
, and
mr
are available for this model. See Xu and Atchley (1996) and
Broman (2003).
The twopart model is appropriate for the case of a spike in the phenotype distribution (for example, metastatic density when many individuals show no metastasis, or survival time following an infection when individuals may recover from the infection and fail to die). The twopart model was described by Boyartchuk et al. (2001) and Broman (2003). Individuals with QTL genotype g have probability p[g] of having an undefined phenotype (the spike), while if their phenotype is defined, it comes from a normal distribution with mean mu[g] and common standard deviation s. Three LOD scores are calculated: LOD(p,mu) is for the test of the hypothesis that p[g] = p and mu[g] = mu. LOD(p) is for the test that p[g] = p while the mu[g] may vary. LOD(mu) is for the test that mu[g] = mu while the p[g] may vary.
With the nonparametric "model", an extension of the
KruskalWallis test is used; this is similar to the method described
by Kruglyak and Lander (1995). In the case of incomplete genotype
information (such as at locations between genetic markers), the
KruskalWallis statistic is modified so that the rank for each
individual is weighted by the genotype probabilities, analogous to
HaleyKnott regression. For this method, if the argument
ties.random
is TRUE, ties in the phenotypes are assigned random
ranks; if it is FALSE, average ranks are used and a corrected LOD
score is calculate. Currently the method
argument is ignored
for this model.
em
: maximum likelihood is performed via the
EM algorithm (Dempster et al. 1977), first used in this context by
Lander and Botstein (1989).
imp
: multiple imputation is used, as described by Sen
and Churchill (2001).
hk
: HaleyKnott regression is used (regression of the
phenotypes on the multipoint QTL genotype probabilities), as described
by Haley and Knott (1992).
ehk
: the extended HaleyKnott method is used (like HK,
but taking account of the variances), as described in Feenstra et
al. (2006).
mr
: Marker regression is used. Analysis is performed
only at the genetic markers, and individuals with missing genotypes
are discarded. See Soller et al. (1976).
Covariates are allowed only for the normal and binary models. The normal model is y = b[q] + A g + Z d[q] + e where q is the unknown QTL genotype, A is a matrix of additive covariates, and Z is a matrix of covariates that interact with the QTL genotype. The columns of Z are forced to be contained in the matrix A. The binary model is the logistic regression analog.
The LOD score is calculated comparing the likelihood of the above model to that of the null model y = m + A g + e.
Covariates must be numeric matrices. Individuals with any missing covariates are discarded.
The X chromosome must be treated specially in QTL mapping. See Broman et al. (2006).
If both males and females are included, male hemizygotes are allowed to be different from female homozygotes. Thus, in a backcross, we will fit separate means for the genotype classes AA, AB, AY, and BY. In such cases, sex differences in the phenotype could cause spurious linkage to the X chromosome, and so the null hypothesis must be changed to allow for a sex difference in the phenotype.
Numerous special cases must be considered, as detailed in the following table.
BC  Sexes  Null  Alternative  df  
both sexes  sex  AA/AB/AY/BY  2  
all female  grand mean  AA/AB  1  
all male  grand mean  AY/BY  1  
F2  Direction  Sexes  Null  Alternative  df 
Both  both sexes  femaleF/femaleR/male  AA/ABf/ABr/BB/AY/BY  3  
all female  pgm  AA/ABf/ABr/BB  2  
all male  grand mean  AY/BY  1  
Forward  both sexes  sex  AA/AB/AY/BY  2  
all female  grand mean  AA/AB  1  
all male  grand mean  AY/BY  1  
Backward  both sexes  sex  AB/BB/AY/BY  2  
all female  grand mean  AB/BB  1  
all male  grand mean  AY/BY  1  
In the case that the number of degrees of freedom for the linkage test
for the X chromosome is different from that for autosomes, a separate
Xchromosome LOD threshold is recommended. Autosome and
Xchromosomespecific LOD thresholds may be estimated by permutation
tests with scanone
by setting n.perm
>0 and using
perm.Xsp=TRUE
.
Karl W Broman, [email protected]; Hao Wu
Boyartchuk, V. L., Broman, K. W., Mosher, R. E., D'Orazio S. E. F., Starnbach, M. N. and Dietrich, W. F. (2001) Multigenic control of Listeria monocytogenes susceptibility in mice. Nature Genetics 27, 259–260.
Broman, K. W. (2003) Mapping quantitative trait loci in the case of a spike in the phenotype distribution. Genetics 163, 1169–1175.
Broman, K. W., Sen, Ś, Owens, S. E., Manichaikul, A., SouthardSmith, E. M. and Churchill G. A. (2006) The X chromosome in quantitative trait locus mapping. Genetics, 174, 2151–2158.
Churchill, G. A. and Doerge, R. W. (1994) Empirical threshold values for quantitative trait mapping. Genetics 138, 963–971.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc. B, 39, 1–38.
Feenstra, B., Skovgaard, I. M. and Broman, K. W. (2006) Mapping quantitative trait loci by an extension of the HaleyKnott regression method using estimating equations. Genetics, 173, 2111–2119.
Haley, C. S. and Knott, S. A. (1992) A simple regression method for mapping quantitative trait loci in line crosses using flanking markers. Heredity 69, 315–324.
Kruglyak, L. and Lander, E. S. (1995) A nonparametric approach for mapping quantitative trait loci. Genetics 139, 1421–1428.
Lander, E. S. and Botstein, D. (1989) Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121, 185–199.
Sen, Ś. and Churchill, G. A. (2001) A statistical framework for quantitative trait mapping. Genetics 159, 371–387.
Soller, M., Brody, T. and Genizi, A. (1976) On the power of experimental designs for the detection of linkage between marker loci and quantitative loci in crosses between inbred lines. Theor. Appl. Genet. 47, 35–39.
Xu, S., and Atchley, W.R. (1996) Mapping quantitative trait loci for complex binary diseases using line crosses. Genetics 143, 1417–1424.
plot.scanone
,
summary.scanone
, scantwo
,
calc.genoprob
, sim.geno
,
max.scanone
,
summary.scanoneperm
,
.scanone
, +.scanone
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173  ###################
# Normal Model
###################
data(hyper)
# Genotype probabilities for EM and HK
## Not run: hyper < calc.genoprob(hyper, step=2.5)
out.em < scanone(hyper, method="em")
out.hk < scanone(hyper, method="hk")
# Summarize results: peaks above 3
summary(out.em, thr=3)
summary(out.hk, thr=3)
# An alternate method of summarizing:
# patch them together and then summarize
out < c(out.em, out.hk)
summary(out, thr=3, format="allpeaks")
# Plot the results
plot(out.hk, out.em)
plot(out.hk, out.em, chr=c(1,4), lty=1, col=c("blue","black"))
# Imputation; first need to run sim.geno
# Do just chromosomes 1 and 4, to save time
## Not run: hyper.c1n4 < sim.geno(subset(hyper, chr=c(1,4)),
step=2.5, n.draws=8)
## End(Not run)
out.imp < scanone(hyper.c1n4, method="imp")
summary(out.imp, thr=3)
# Plot all three results
plot(out.imp, out.hk, out.em, chr=c(1,4), lty=1,
col=c("red","blue","black"))
# extended HaleyKnott
out.ehk < scanone(hyper, method="ehk")
plot(out.hk, out.em, out.ehk, chr=c(1,4))
# Permutation tests
## Not run: permo < scanone(hyper, method="hk", n.perm=1000)
# Threshold from the permutation test
summary(permo, alpha=c(0.05, 0.10))
# Results above the 0.05 threshold
summary(out.hk, perms=permo, alpha=0.05)
####################
# scan with squareroot of phenotype
# (Note that pheno.col can be a vector of phenotype values)
####################
out.sqrt < scanone(hyper, pheno.col=sqrt(pull.pheno(hyper, 1)))
plot(out.em  out.sqrt, ylim=c(0.1,0.1),
ylab="Difference in LOD")
abline(h=0, lty=2, col="gray")
####################
# Stratified permutations
####################
extremes < (nmissing(hyper)/totmar(hyper) < 0.5)
## Not run: operm.strat < scanone(hyper, method="hk", n.perm=1000,
perm.strata=extremes)
## End(Not run)
summary(operm.strat)
####################
# Xspecific permutations
####################
data(fake.f2)
## Not run: fake.f2 < calc.genoprob(fake.f2, step=2.5)
# genome scan
out < scanone(fake.f2, method="hk")
# Xchrspecific permutations
## Not run: operm < scanone(fake.f2, method="hk", n.perm=1000, perm.Xsp=TRUE)
# thresholds
summary(operm)
# scanone summary with pvalues
summary(out, perms=operm, alpha=0.05, pvalues=TRUE)
###################
# Nonparametric
###################
out.np < scanone(hyper, model="np")
summary(out.np, thr=3)
# Plot with previous results
plot(out.np, chr=c(1,4), lty=1, col="green")
plot(out.imp, out.hk, out.em, chr=c(1,4), lty=1,
col=c("red","blue","black"), add=TRUE)
###################
# Twopart Model
###################
data(listeria)
## Not run: listeria < calc.genoprob(listeria,step=2.5)
out.2p < scanone(listeria, model="2part", upper=TRUE)
summary(out.2p, thr=c(5,3,3), format="allpeaks")
# Plot all three LOD scores together
plot(out.2p, out.2p, out.2p, lodcolumn=c(2,3,1), lty=1, chr=c(1,5,13),
col=c("red","blue","black"))
# Permutation test
## Not run: permo < scanone(listeria, model="2part", upper=TRUE,
n.perm=1000)
## End(Not run)
# Thresholds
summary(permo)
###################
# Binary model
###################
binphe < as.numeric(pull.pheno(listeria,1)==264)
out.bin < scanone(listeria, pheno.col=binphe, model="binary")
summary(out.bin, thr=3)
# Plot LOD for binary model with LOD(p) from 2part model
plot(out.bin, out.2p, lodcolumn=c(1,2), lty=1, col=c("black", "red"),
chr=c(1,5,13))
# Permutation test
## Not run: permo < scanone(listeria, pheno.col=binphe, model="binary",
n.perm=1000)
## End(Not run)
# Thresholds
summary(permo)
###################
# Covariates
###################
data(fake.bc)
## Not run: fake.bc < calc.genoprob(fake.bc, step=2.5)
# genome scans without covariates
out.nocovar < scanone(fake.bc)
# genome scans with covariates
ac < pull.pheno(fake.bc, c("sex","age"))
ic < pull.pheno(fake.bc, "sex")
out.covar < scanone(fake.bc, pheno.col=1,
addcovar=ac, intcovar=ic)
summary(out.nocovar, thr=3)
summary(out.covar, thr=3)
plot(out.covar, out.nocovar, chr=c(2,5,10))

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.