varPed | R Documentation |
Creates offspring-specific design matrices, the columns of which refer to the explanatory variables of the linear model.
varPed(x, gender=NULL, lag=c(0,0), relational=FALSE, lag_relational=c(0,0), restrict=NULL, keep=FALSE, USvar=NULL, merge=FALSE, NAvar=NULL)
x |
predictor variable; numeric or factor |
gender |
the gender of the parent to which |
lag |
numeric vector of length 2. The time interval over which |
relational |
a character string. If "OFFSPRING", the Euclidean distance between |
lag_relational |
numeric vector of length 2. If |
restrict |
character string designating parents with a zero prior probability of parentage. Only parents for which |
keep |
logical; if |
USvar |
if |
merge |
logical; if |
NAvar |
numeric; replacement for missing values in the predictors. |
The design matrix for each offspring represents the state of each parental (dam/sire) combination for each explanatory variable. The number of rows in the design matrix (the number of parental combinations) is free to vary across offspring, but the number of explanatory variables remains the same. As with standard generalised linear modelling, the columns of the design matrices take on numerical values or indicator values for continuous and categorical variables, respectively. When relational=FALSE, elements of the design matrices referring to specific parental combinations will not vary across offspring (unless longitudinal data are being used), and the associated vector of parameters will relate the explanatory variables to overall fecundity. For these variables the model is essentially the multinomial analogue of the more familiar Poisson model often used to analyse such data. However, the counts of the multinomial are not known with certainty because uncertainty exists around the maternity and/or paternity of each offspring.
Additional variables can be fitted that relate specific parental combinations to specific offspring, or specific dams to specific sires. Elements of the design matrices referring to specific parental combinations are then free to vary across offspring. The most obvious variable of this type is the Mendelian transition probability obtained from the genetic data themselves. However, by specifying relational="OFFSPRING", relational="OFFSPRINGV", relational="MATE" or relational="MATEV", non-genetic variables are also free to vary across offspring. When x is numeric, the Euclidean distances between parents and offspring, or between mates, enter the design matrix when relational="OFFSPRING" or relational="MATE", respectively. When relational="OFFSPRINGV" or relational="MATEV" is specified, a signed vector is calculated rather than a distance. When x is a factor, an indicator variable is set up indicating whether parent and offspring, or mate, factor levels match.
Often, each offspring will have a variable number of candidate parents, as some parents may be excluded a priori. When x is a factor and both relational="OFFSPRING" and restrict="==" are specified, only those potential parents whose factor level matches the offspring factor level are retained. When relational=FALSE, restrict can be set to a factor level, in which case parents with a non-matching factor level are excluded.
If a time variable (timevar) is not passed to PdataPed, the data are assumed to be cross-sectional and each individual is represented only once. If a time variable (timevar) is passed to PdataPed, then lag and lag_relational can be set so that time-specific covariates are used. lag designates time units relative to the offspring record when relational=FALSE; for example, if lag=c(0,0), the value of x is taken for that parent during the same time period as the offspring record. If relational="OFFSPRING" or relational="MATE", then lag determines the time units relative to the record of the offspring or mate to which the focal individual is being compared. This record can be specified using lag_relational, which is always relative to the offspring record. Negative lags refer to previous time intervals (e.g. lag=c(-1,-1) takes x from the previous time step), and if the elements of lag or lag_relational differ, the average value of x during this period is taken (e.g. lag=c(-1,0) averages x over the record matching the offspring record and the one preceding it). Averaging is not applicable when x is a factor, unless restrict takes one of the comparison operators (e.g. "=="), in which case parents are retained when the comparison is TRUE at least once in the specified interval.
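The lag arithmetic for the relational=FALSE case can be sketched in base R as follows. This is only an illustration of the windowing and averaging described above, not the package's internal code; the records data frame and the helper lagged_value are hypothetical.

```r
# Hypothetical longitudinal records for one candidate parent.
records <- data.frame(time = 1:4, x = c(10, 12, 16, 20))
offspring_time <- 3   # time period of the offspring record

# Average x over the window [t + lag[1], t + lag[2]] relative to time t.
lagged_value <- function(records, t, lag = c(0, 0)) {
  in_window <- records$time >= t + lag[1] & records$time <= t + lag[2]
  mean(records$x[in_window])
}

same_period     <- lagged_value(records, offspring_time, lag = c(0, 0))    # x at time 3
previous_period <- lagged_value(records, offspring_time, lag = c(-1, -1))  # x at time 2
averaged        <- lagged_value(records, offspring_time, lag = c(-1, 0))   # mean over times 2 and 3
```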
Below are models that can be fitted using varPed, where x is a univariate continuous variable:
varPed(x, gender="Female")
p(i,j) = exp(b*x(i)...)
varPed(x, gender="Male")
p(i,j) = exp(b*x(j)...)
varPed(x)
p(i,j) = exp(b*(x(i)+x(j))...)
varPed(x, gender="Female", relational="OFFSPRING")
p(i,j) = exp(b*abs(x(i)-x(o))...)
varPed(x, gender="Female", relational="OFFSPRINGV")
p(i,j) = exp(b*(x(i)-x(o))...)
varPed(x, gender="Female", relational="MATE")
p(i,j) = exp(b*abs(x(i)-x(j))...)
varPed(x, gender="Female", relational="MATEV")
p(i,j) = exp(b*(x(i)-x(j))...)
varPed(x, gender="Female", lag=c(-1,-1))
p(i,j) = exp(b*x(i,t-1)...)
varPed(x, gender="Female", lag=c(-1,-1), relational="OFFSPRING")
p(i,j) = exp(b*abs(x(i,t-1)-x(o,t))...)
varPed(x, gender="Female", lag=c(-2,-2), relational="MATE",
lag_relational=c(-1,-1))
p(i,j) = exp(b*(abs(x(i,t-2)-x(j,t-1)))...)
varPed(x, gender="Male", lag=c(-2,-2), relational="OFFSPRING",
lag_relational=c(-1,-1))
p(i,j) = exp(b*(abs(x(j,t-2)-x(o,t-1)))...)
where p(i,j) is the probability that dam i and sire j are the parents of an offspring o, x and b are the variable of interest and its associated parameter, and t is the time period to which the offspring record belongs.
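To make the notation concrete, the sketch below evaluates two of the models listed above for a small set of candidate parents. It is written in base R and does not use varPed. It assumes that x is the only covariate and that the elided terms ("...") amount to normalising over all dam/sire combinations; the values of x_dam, x_sire and b are hypothetical.

```r
# Hypothetical values: two candidate dams and three candidate sires.
x_dam  <- c(1.0, 2.0)
x_sire <- c(0.5, 1.5, 2.5)
b      <- 0.4   # assumed parameter value, for illustration only

# Analogue of varPed(x): linear predictor b*(x(i)+x(j)) for every dam/sire pair.
eta_sum <- b * outer(x_dam, x_sire, "+")
p_sum   <- exp(eta_sum) / sum(exp(eta_sum))   # rows index dams i, columns index sires j

# Analogue of varPed(x, gender="Female", relational="MATE"): b*abs(x(i)-x(j)).
eta_mate <- b * abs(outer(x_dam, x_sire, "-"))
p_mate   <- exp(eta_mate) / sum(exp(eta_mate))
```

Each matrix holds one probability per parental combination and sums to one over all combinations.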
For a categorical variable with two levels (A and B), the model specified by varPed(x, gender="Female") takes the form
p(i,j) = exp(b*I(i)...)
where I(i) is an indicator variable taking the value 1 if x(i) is equal to the first level of x and zero otherwise. b is then the log odds ratio of the two levels of x with respect to maternity. If merge=TRUE is specified, then b may vary across offspring, and b_m is estimated. b_m is related to b by:
b_m = logit((theta*N_A)/(N_A*theta+N_B*(1-theta)))
where theta is the inverse logit transformation of b, and N_A and N_B are the numbers of potential mothers that have level A and B for x. If N_A and N_B are invariant over offspring, the models are functionally equivalent.
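The relationship between b and b_m can be checked numerically with a few lines of base R. The helper names logit, inv_logit and b_merged are hypothetical and are not part of the package. Rearranging the expression gives b_m = b + log(N_A/N_B), so the two parameters differ only by a constant when N_A and N_B are fixed.

```r
logit     <- function(p) log(p / (1 - p))
inv_logit <- function(z) 1 / (1 + exp(-z))

# b_m as a function of b and the numbers of potential mothers at each level.
b_merged <- function(b, N_A, N_B) {
  theta <- inv_logit(b)
  logit((theta * N_A) / (N_A * theta + N_B * (1 - theta)))
}

b <- log(2)                                  # assumed value, for illustration only
bm_equal   <- b_merged(b, N_A = 5,  N_B = 5) # equal counts: b_m equals b
bm_unequal <- b_merged(b, N_A = 10, N_B = 5) # unequal counts: b_m shifts by log(N_A/N_B)
```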
The denominator of the multinomial likelihood is the sum of the exponentiated linear predictors of all possible parents (after setting up a contrast with the baseline parents). Designating the first set of parents as baseline, the contrast for each set of parents is simply:
eta(i,j)=log(p(i,j)/p(1,1))
and the likelihood of b is
Pr(x| b) = prod(no)(exp(eta(d,s))/sum(ni*nj)(exp(eta(i,j))))
where no, ni and nj are the number of offspring, the number of potential mothers for offspring o, and the number of potential fathers for offspring o, respectively. d and s are the actual parents of offspring o. The set of possible parents in the denominator of the multinomial likelihood consists of those that are not excluded using the argument restrict. However, if the argument keep=TRUE is used, then the denominator of the likelihood will include excluded parents despite the fact that d!=i and s!=j.
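The likelihood above can be evaluated directly for a toy problem. The base R sketch below assumes, purely for illustration, that the true parents (d, s) of each offspring are known and that every offspring shares the same candidate dams and sires; in practice parentage is uncertain and candidate sets can differ. All values and object names are hypothetical.

```r
# Hypothetical setup: two candidate dams, three candidate sires, one covariate.
b      <- 0.4
x_dam  <- c(1.0, 2.0)
x_sire <- c(0.5, 1.5, 2.5)

# Linear predictor for each pair, then the contrast with the baseline pair (1,1).
lin <- b * outer(x_dam, x_sire, "+")
eta <- lin - lin[1, 1]

# Assumed known parents (row d, column s) for each of two offspring.
parents <- data.frame(d = c(1, 2), s = c(3, 1))

# Log of the product over offspring of exp(eta(d,s)) / sum(exp(eta(i,j))).
log_lik <- sum(vapply(seq_len(nrow(parents)), function(o) {
  eta[parents$d[o], parents$s[o]] - log(sum(exp(eta)))
}, numeric(1)))
```

Because the baseline contrast subtracts the same constant from the numerator and from every term of the denominator, it leaves the likelihood unchanged.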
In versions 2.31-2.42, DSapprox=TRUE can be passed to MCMCped, which approximates the likelihood of b when a variable specifies the distance between mates (i.e. relational="MATE"). This approximation reduces the computational burden by fixing i=d or j=s in the denominator of the multinomial likelihood. The parent defined as the "MATE" is fixed, so that a varPed expression with gender="Male" has the approximated likelihood:
Pr(x| b) = prod(no)(exp(eta(d,s))/sum(nj)(exp(eta(d,j))))
For certain types of problem this approximation does not work well. In versions 2.43 and later, a different approximation is used, which seems to work better:
Pr(x| b) = prod(no)(exp(eta(d,s))/(sum(nj)(exp(eta(d,j)))+sum(ni)(exp(eta(i,s)))-exp(eta(d,s))))
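The three denominators (exact, earlier approximation, later approximation) can be compared on a toy example. The base R sketch below only illustrates the sums in the formulas above for a single offspring with assumed parents (d, s); it is not the code used by MCMCped, and all values and names are hypothetical.

```r
# Hypothetical setup: linear predictor based on the distance between mates.
b      <- 0.4
x_dam  <- c(1.0, 2.0)
x_sire <- c(0.5, 1.5, 2.5)
eta    <- b * abs(outer(x_dam, x_sire, "-"))   # rows are dams i, columns are sires j

d <- 2   # assumed dam of the offspring
s <- 3   # assumed sire of the offspring

# Exact denominator: every dam/sire combination.
denom_full <- sum(exp(eta))

# Earlier approximation (gender="Male"): dam fixed at d, sum over sires only.
denom_old <- sum(exp(eta[d, ]))

# Later approximation: row d plus column s, with the shared cell (d,s) counted once.
denom_new <- sum(exp(eta[d, ])) + sum(exp(eta[, s])) - exp(eta[d, s])
```

The later approximation covers every combination that shares at least one parent with (d, s), which is a larger subset of the exact sum than the earlier approximation uses.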
A list containing the design matrix for variable x, the identity of retained parents, and the gender of the parents.
Versions >=2.1 accept different arguments for restrict than earlier versions. When relational="OFFSPRING", earlier versions accepted restrict=TRUE and restrict=FALSE, but these have been replaced with restrict="==" and restrict="!=", respectively. In addition, restrict now accepts ">", ">=", "<" and "<=", with parental values on the LHS and offspring values on the RHS.
Versions >=2.1 also accept "OFFSPRINGV" and "MATEV" for relational, in addition to "OFFSPRING" and "MATE". The "V" suffix specifies that the signed vector should be used rather than the Euclidean distance.
Jarrod Hadfield j.hadfield@ed.ac.uk
Hadfield J.D. et al (2006) Molecular Ecology 15 3715-31
MCMCped