dspDat: Specify model variables for day-specific probabilities MCMC...


Description

dspDat is used to create an object of class "dspDat"; the resulting object may then be used as input to the dsp function to sample an MCMC chain for the methodology proposed by Dunson and Stanford in "Bayesian inferences on predictors of conception probabilities" (2005). The dspDat function is essentially a convenience function that merges, if necessary, multiple datasets of varying time-specificities, as is common for the type of fertility data for which the aforementioned methodology is designed.

Usage

dspDat(formula, baseline = NULL, cycle = NULL, daily, idName, cycName,
  sexName, fwName = NULL, fwLen, useNA = "none")

Arguments

formula

An object of class formula (or one that can be coerced to that class). The term on the left-hand side of the formula must be the name of a column in either cycle or daily, with the observations in this column indicating whether the cycle (or the cycle to which the day belongs) resulted in a successful pregnancy. Each term on the right-hand side of the formula must be the name of a column in one of baseline, cycle, or daily.

baseline

Either NULL or a data.frame (or an object that can be coerced to data.frame). If non-NULL, then contains data in the form of one observation (row) per study subject, and may include covariates (columns) such as age at study entry, BMI at study entry, or gravidity status. A value of NULL may mean either that no such variables are to be included in the model, or that such data has already been expanded and is included in the cycle or daily data. If a non-NULL object is supplied, then baseline must include a column with name as specified by the idName parameter which provides a study ID for each observation.

cycle

Either NULL or a data.frame (or an object that can be coerced to data.frame). If non-NULL, then contains data in the form of one observation (row) per study cycle, and may include columns such as a cycle pregnancy indicator, attempt cycle number, or cycle length. A value of NULL indicates that such data has already been expanded and is included in the daily data. If a non-NULL object is supplied, then cycle must include a column with name as specified by the idName parameter which provides a study ID for each observation, and must include a column with name as specified by cycName which provides a cycle number for each observation. If the pregnancy outcome variable is included in this data, then the column must have the name as specified by the left-hand term in the formula parameter.

daily

A data.frame (or an object that can be coerced to data.frame). Contains data in the form of one observation (row) per study day. May include data such as a daily intermenstrual bleeding indicator or daily cervical mucus type. Must include a column with name as specified by the idName parameter which provides a study ID for each observation, a column with name as specified by cycName which provides a cycle number for each observation, and a column with name as specified by sexName which provides an indicator of whether unprotected vaginal intercourse occurred on that day. If a cycle pregnancy outcome column was not provided in the cycle data, then one must be provided in the daily data, and must have the name as specified by the left-hand term in the formula parameter.

idName

A string specifying the name of the column in each of the non-NULL baseline, cycle, or daily objects whose values provide the study ID of the subject to which each observation belongs. The name of the column must be the same for each of the non-NULL datasets.

cycName

A string specifying the name of the column in each of the non-NULL cycle or daily objects whose values provide the cycle number to which each observation belongs. If cycle is non-NULL, then the name of the column must be the same for both the cycle and daily data.

sexName

A string specifying the name of the column in the daily data such that the column observations provide an indicator of whether unprotected vaginal intercourse occurred during that day.

fwName

If non-NULL, then a string specifying the name of the column in the daily data such that the column observations provide an indicator of whether the observation is part of a cycle's fertile window. If NULL, then it is assumed that the daily data has already been restricted to only observations that occurred during each cycle's fertile window.

As a convenience, if the name specified by fwName is included in the formula parameter, then it is interpreted to mean that a factor is to be included in the model corresponding to the day number in the fertile window, i.e. fertile window day 1, fertile window day 2, etc. Warning: this assumes that the observations within a cycle are in chronological order.
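The chronological-order assumption matters because a fertile window day number can only be recovered from row position within a cycle. The following is a minimal sketch of that idea in base R, not the package's internal code; the data and the column names id, cycle, fw, and fwDay are hypothetical.

```r
# Hypothetical daily data: one subject, two cycles of five days each,
# with the fertile window flagged on days 2 through 4 of each cycle.
daily <- data.frame(
  id    = rep(1, 10),
  cycle = rep(1:2, each = 5),
  fw    = rep(c(0, 1, 1, 1, 0), times = 2)
)

# Keep only fertile window days.
fwDays <- daily[daily$fw == 1, ]

# Number the rows within each (id, cycle) group in the order they appear.
# This is only meaningful if rows are already in chronological order.
fwDays$fwDay <- ave(seq_len(nrow(fwDays)), fwDays$id, fwDays$cycle,
                    FUN = seq_along)

# Treat the day number as a factor, as one would for a model term.
fwDays$fwDay <- factor(fwDays$fwDay)
```

If the rows within a cycle were shuffled, the same code would run without error but assign the wrong day numbers, which is why sorting beforehand is worthwhile.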

fwLen

A value specifying the number of days belonging to a cycle's fertile window. The length of the fertile window is assumed to be the same across all cycles.

useNA

One of either "none" or "sex". If "none" then observations with missing data are removed from the model. If "sex" then observations with missing intercourse data are included in the model conditional on no other data missing in the observation. See Data Processing Steps for more details.

Details

The class "dspDat" is equipped with a summary function.

It is natural to record fertility study data in up to three datasets of varying time-specificities: first, a dataset of variables that do not change throughout the study, which we denote the baseline data; second, a dataset of cycle-specific variables, which we denote the cycle data; and third, a dataset of day-specific variables, which we denote the daily data. dspDat is provided as a convenience function which merges all of the provided datasets into one day-specific dataset and creates some internal objects for use by the MCMC sampler function dsp.
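To make the three levels of time-specificity concrete, the sketch below builds toy baseline, cycle, and daily data frames and expands them into one day-specific data frame with base R's merge. It illustrates the idea only and is not the package's implementation; all column names (id, cycle, age, preg, sex) are hypothetical.

```r
# One row per subject.
baseline <- data.frame(id = c(1, 2), age = c(29, 34))

# One row per subject-cycle, including the pregnancy outcome.
cycle <- data.frame(
  id    = c(1, 1, 2),
  cycle = c(1, 2, 1),
  preg  = c(0, 1, 0)
)

# One row per subject-cycle-day, including the intercourse indicator.
daily <- data.frame(
  id    = rep(c(1, 1, 2), each = 2),
  cycle = rep(c(1, 2, 1), each = 2),
  sex   = c(1, 0, 0, 1, 1, 1)
)

# Expand cycle-level variables onto days, then subject-level variables.
comb <- merge(daily, cycle, by = c("id", "cycle"))
comb <- merge(comb, baseline, by = "id")
```

Each row of comb is a single day carrying its cycle's outcome and its subject's baseline covariates.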

At a minimum the daily data must be provided so that daily intercourse data is available. baseline and cycle data are optional, so long as pregnancy information is included in one of either the cycle data or daily data. For example, if the data was collected only in a daily format or has already been combined, then only a day-specific dataset would need to be passed to dspDat.

The usual model.matrix is used to construct the design matrix for the specified model, so any of the usual formula commands are available. In particular, a formula has an implied intercept term which may not be desirable for these types of models. To remove the intercept, use either y ~ x - 1 or y ~ 0 + x.
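The effect of the implied intercept can be checked directly with model.matrix. The following small example uses made-up data with a three-level factor x; the variable names are illustrative only.

```r
# Toy data with a three-level factor predictor.
d <- data.frame(
  y = c(0, 1, 0, 1),
  x = factor(c("a", "b", "c", "a"))
)

# Default formula: an intercept column is added and the first
# factor level ("a") is absorbed into it.
X_int <- model.matrix(y ~ x, data = d)
colnames(X_int)    # "(Intercept)" "xb" "xc"

# Intercept removed: one indicator column per factor level.
X_noint <- model.matrix(y ~ 0 + x, data = d)
colnames(X_noint)  # "xa" "xb" "xc"
```

With the intercept removed, each level of the factor gets its own column, which is the form one would want if, for example, every fertile window day is to have its own coefficient.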

Value

dspDat returns an object of class "dspDat". An object of class "dspDat" is a list containing the following components:

cleanDat

A list containing objects bas, cyc, and day, which are the datasets after removing observations with missing data and reducing the daily data to fertile window days as described in Data Processing Steps. If NULL was supplied for baseline or cycle, then the value of bas or cyc is also NULL.

redDat

A list containing objects bas, cyc, and day, which are the datasets after reducing the cleaned data to the set of IDs and cycles that are common to every non-NULL dataset. If NULL was supplied for baseline or cycle, then the value of bas or cyc is also NULL.

combDat

*******

modelObj

A list containing objects Y, X, U, and id. Y, X, and U are as in the Dunson and Stanford paper, and id is a vector of subject IDs in which each element gives the subject ID for the corresponding observation.

samplerObj

A list containing objects for use by the dsp function when executing the MCMC algorithm.

datInfo

A list containing objects for use by the summary function.

Data Processing Steps

Cleaning data

If either a baseline or cycle dataset is provided, then all observations in it that contain missing data among the model variables are removed. All non-fertile window days are removed from the daily dataset, and any cycles that either contain missing data in the fertile window or have too many or too few fertile window days are also removed.
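The cycle-level part of this cleaning step can be sketched in base R as follows. This is an illustration under assumed data, not the package's code: the daily data is taken to be already restricted to fertile window days, and the covariate column mucus is hypothetical.

```r
fwLen <- 3  # assumed fertile window length

# Hypothetical fertile-window-only daily data for one subject:
# cycle 1 is complete, cycle 2 has a missing covariate value,
# and cycle 3 has only two fertile window days.
daily <- data.frame(
  id    = rep(1, 8),
  cycle = c(1, 1, 1, 2, 2, 2, 3, 3),
  sex   = c(1, 0, 1, 0, 1, 1, 1, 1),
  mucus = c(2, 3, 3, 1, NA, 2, 2, 3)
)

# Per-cycle counts and missingness flags, repeated on every row of the cycle.
nDays  <- ave(seq_len(nrow(daily)), daily$id, daily$cycle, FUN = length)
hasNA  <- ave(is.na(daily$mucus), daily$id, daily$cycle, FUN = any)

# Keep only cycles with exactly fwLen days and no missing covariate data.
clean <- daily[nDays == fwLen & !hasNA, ]
```

Only cycle 1 survives here. Note that a whole cycle is dropped when any one of its fertile window days is unusable, rather than just the offending row.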

Reducing data

Each non-NULL dataset is reduced to the set of IDs and cycles that are common to every non-NULL dataset.
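One way to picture this reduction is as a set intersection on subject IDs followed by an intersection on subject-cycle pairs. The sketch below uses toy data and hypothetical column names id and cycle; it is an assumption about how such a step could be written, not the package's implementation.

```r
baseline <- data.frame(id = c(1, 2, 3))
cycle    <- data.frame(id = c(1, 1, 2, 4), cycle = c(1, 2, 1, 1))
daily    <- data.frame(id = c(1, 1, 2, 2, 3), cycle = c(1, 1, 1, 2, 1))

# Subject IDs present in all three datasets.
commonId <- Reduce(intersect, list(baseline$id, cycle$id, daily$id))

# Subject-cycle pairs present in both the cycle and daily data,
# restricted to the common subjects.
cycKey <- paste(cycle$id, cycle$cycle)
dayKey <- paste(daily$id, daily$cycle)
commonKey <- intersect(cycKey[cycle$id %in% commonId],
                       dayKey[daily$id %in% commonId])

redCyc <- cycle[cycKey %in% commonKey, ]
redDay <- daily[dayKey %in% commonKey, ]
# Keep only subjects that still have at least one cycle left.
redBas <- baseline[baseline$id %in% redDay$id, , drop = FALSE]
```

In this toy data, subject 3 is dropped for lacking cycle data, subject 4 for lacking baseline and daily data, and the unmatched cycles of subjects 1 and 2 are removed.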

Author(s)

David A. Pritchard and Sam Berchuck, 2015

References

Dunson, David B., and Joseph B. Stanford. "Bayesian inferences on predictors of conception probabilities." Biometrics 61.1 (2005): 126-133.


dpritchLibre/DSP_Package documentation built on May 15, 2019, 1:49 p.m.