Simulate data from a mixture of regressions model, as specified by the user or as fitted to a data set. The simulation may be done either in a parametric or “semiparametric” manner.
x | For the default method, this is a numeric vector constituting a predictor for a regression model, or a matrix whose columns form such predictors. The number of columns of … Data from a scalar mixture model may also be generated by specifying … For the …
nobs | Integer scalar, specifying the number of observations to be generated. Used only if argument …
theta | Either a list or a matrix specifying the parameters of the model from which data are to be simulated. If it is a list it should have components …
seed | A numeric scalar. If not an integer it gets rounded to the nearest integer (so …
xNms | Character vector of names for the predictors other than the intercept (if there is one). This vector must be of length equal to the number of (non-intercept) predictors. This is equal to the number of columns of …
yNm | Character scalar; a name for the response.
semiPar | Logical scalar. Should the simulation be done “semiparametrically”? (See Details.)
conditional | Logical scalar; should the component-selection probabilities be determined conditionally upon the observations?
... | Not used.
In this context “parametric” bootstrapping means that the bootstrap data sets are generated by simulating from the fitted ncomp model parameters, using Gaussian errors. In contrast, semiparametric bootstrapping means that the errors are generated by resampling from the residuals. Since at each predictor value there are ncomp residuals, one for each component of the model, the errors are selected from these ncomp possibilities.
If the argument conditional is TRUE then the selection probabilities at this step are the conditional probabilities of the observation being generated by each component of the model, given that observation. If conditional is FALSE then these probabilities are the corresponding entries of lambda (see Value). The residuals are sampled independently in either case. The procedure is termed semiparametric (rather than non-parametric) since the sampling probabilities depend on the parameters of the model. Note that it makes no sense to specify conditional=TRUE if semiPar is FALSE. Doing so will generate an error.
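The distinction drawn in the Details can be illustrated with a small base-R sketch. This is not the package's implementation: the parameter values, the helper names (pick_component, simulate_once) and the exact way the selected residuals are pooled and resampled are assumptions made for illustration only. It shows Gaussian errors in the parametric case and, in the semiparametric case, one residual chosen per observation with probabilities taken either from lambda or from the conditional component probabilities.

```r
# Illustrative two-component model; every value here is an assumption.
set.seed(42)
n      <- 200
x      <- runif(n, 0, 10)
beta   <- rbind(c(1, 2), c(8, -0.5))  # rows: components; columns: intercept, slope
sigma  <- c(1, 0.5)                   # error standard deviations
lambda <- c(0.6, 0.4)                 # mixing probabilities

# "Observed" data, generated only so that the sketch is self-contained.
mu <- cbind(1, x) %*% t(beta)         # n x 2 matrix of component means
k0 <- sample(1:2, n, replace = TRUE, prob = lambda)
y  <- mu[cbind(1:n, k0)] + rnorm(n, sd = sigma[k0])

# One residual per observation per component (n x 2).
res <- y - mu

# Conditional probability of each component given the observation.
dens <- sapply(1:2, function(j) lambda[j] * dnorm(y, mu[, j], sigma[j]))
post <- dens / rowSums(dens)

# Draw one component index per row of a matrix of probabilities.
pick_component <- function(prob_mat) {
  apply(prob_mat, 1, function(p) sample(seq_along(p), 1, prob = p))
}

simulate_once <- function(semiPar = FALSE, conditional = FALSE) {
  if (conditional && !semiPar)
    stop("conditional = TRUE makes no sense when semiPar is FALSE")
  # Component generating each simulated observation.
  k <- sample(1:2, n, replace = TRUE, prob = lambda)
  if (!semiPar) {
    # Parametric: Gaussian errors with the component's standard deviation.
    err <- rnorm(n, sd = sigma[k])
  } else {
    # Semiparametric (assumed reading): select one of the two residuals at
    # each observation, then resample independently from the selected ones.
    probs <- if (conditional) post else matrix(lambda, n, 2, byrow = TRUE)
    j     <- pick_component(probs)
    pool  <- res[cbind(1:n, j)]
    err   <- sample(pool, n, replace = TRUE)
  }
  data.frame(x = x, y = mu[cbind(1:n, k)] + err)
}

sim_par  <- simulate_once()
sim_semi <- simulate_once(semiPar = TRUE, conditional = TRUE)
```

Calling simulate_once(conditional = TRUE) with semiPar left at FALSE stops with an error, mirroring the restriction stated above.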
A data frame whose columns consist of the predictors and the simulated response. For the default method the predictors are the columns of the matrix specified by argument x. They have names given by argument xNms if this was provided, and otherwise by X1, X2, ..., Xn (where n is the number of columns of x) or simply x if there is only a single predictor. For the "mixreg" method the columns are the same as those of x$data, with the response column replaced by the simulated response.
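The naming rule for the default method can be mimicked in a few lines of base R. The helper name_predictors below is hypothetical and is not part of the package; it only reproduces the rule as described above (xNms if supplied, otherwise X1, ..., Xn, or x for a single predictor).

```r
# Hypothetical helper reproducing the documented predictor-naming rule.
name_predictors <- function(x, xNms = NULL) {
  x <- as.matrix(x)
  p <- ncol(x)
  if (!is.null(xNms)) {
    if (length(xNms) != p)
      stop("xNms must have one entry per predictor column")
    nms <- xNms
  } else if (p == 1) {
    nms <- "x"
  } else {
    nms <- paste0("X", seq_len(p))
  }
  colnames(x) <- nms
  as.data.frame(x)
}

names(name_predictors(cbind(1:3, 4:6)))                            # two unnamed predictors
names(name_predictors(1:3))                                        # a single predictor
names(name_predictors(cbind(1:3, 4:6), xNms = c("dose", "age")))  # user-supplied names
```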
Rolf Turner r.turner@auckland.ac.nz
Turner, T. R. (2000) Estimating the rate of spread of a viral infection of potato plants via mixtures of regressions. Applied Statistics 49, Part 3 pp. 371 – 384.
library(mixreg)  # provides mixreg(), rmixreg() and the aphids data

# Simulate from a fitted two-component model.
fit <- mixreg(plntsInf ~ aphRel, ncomp=2, data=aphids)
sim1 <- rmixreg(fit)
with(sim1, plot(aphRel, plntsInf, main="Parametric simulation"))
sim2 <- rmixreg(fit, semiPar=TRUE)
with(sim2, plot(aphRel, plntsInf, main="Semiparametric simulation"))

# Default method: two predictors, parameters supplied as a matrix.
x <- cbind(1:50, rnorm(50))
pmat <- matrix(c(3,5,0.01,1600,0.7,1,2,-0.01,100,0.3), nrow=2, byrow=TRUE)
sim3 <- rmixreg(x, theta=pmat, seed=42)
with(sim3, plot(X1, y, main="Using rmixreg.default; predictor 1"))
with(sim3, plot(X2, y, main="Using rmixreg.default; predictor 2"))

# Scalar mixture model.
pmat <- matrix(c(10,2,0.7,3,1,0.3), nrow=2, byrow=TRUE)
sim4 <- rmixreg(x=rep(1,50), theta=pmat, seed=17)
sim5 <- rmixreg(x=NULL, nobs=50, theta=pmat, seed=17) # Same as sim4 but
                                                      # with no columns of 1s.
chk4 <- mixreg(y~1, data=sim4, ncomp=2, seed=116)
chk5 <- mixreg(y~1, data=sim5, ncomp=2, seed=116) # Same-same.