ds.mice.pmm: Calculates imputations for univariate missing data by...
In gflcampos/dsMiceClient: DataSHIELD client-side functions for the mice package


This function performs imputation by predictive mean matching by calling the `pmmDS` function on the server side.

ds.mice.pmm(y = NULL, ry = NULL, x = NULL, wy = NULL, donors = 5, matchtype = 1L, ridge = 1e-05, checks = TRUE, datasources = NULL, ...)

`y`	Vector to be imputed
`ry`	Logical vector of length `length(y)` indicating the subset `y[ry]` of elements in `y` to which the imputation model is fitted. `ry` generally distinguishes the observed (`TRUE`) and missing (`FALSE`) values in `y`.
`x`	Numeric design matrix with `length(y)` rows containing the predictors for `y`. Matrix `x` must not contain missing values.
`wy`	Logical vector of length `length(y)`. A `TRUE` value indicates locations in `y` for which imputations are created.
`donors`	The size of the donor pool from which a draw is made. The default is `donors = 5L`. Setting `donors = 1L` always selects the closest match, but is not recommended. Values between `3L` and `10L` provide the best results in most cases (Morris et al., 2015).
`matchtype`	Type of matching distance. The default choice (`matchtype = 1L`) calculates the distance between the predicted value of `yobs` and the drawn values of `ymis` (called type-1 matching). Other choices are `matchtype = 0L` (distance between predicted values) and `matchtype = 2L` (distance between drawn values).
`ridge`	The ridge penalty used in `.norm.draw()` to prevent problems with multicollinearity. The default is `ridge = 1e-05`, which means that 0.001 percent of the diagonal is added to the cross-product. Larger ridges may result in more biased estimates. For highly noisy data (e.g. many junk variables), set `ridge = 1e-06` or even lower to reduce bias. For highly collinear data, set `ridge = 1e-04` or higher.
`...`	Other named arguments.
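The effect of the `ridge` argument can be illustrated locally in plain R. The sketch below is not the code of `.norm.draw()` or `pmmDS`; it only assumes, as the argument description states, that a fraction `ridge` of the diagonal is added to the cross-product of the design matrix. The simulated predictors and the helper `cond()` are made up for the illustration.

```r
# Illustration only: how a ridge on the diagonal stabilises a nearly
# singular cross-product. Not the actual .norm.draw() / pmmDS code.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 1e-6)     # almost an exact copy of x1 (collinear)
x  <- cbind(1, x1, x2)             # design matrix with an intercept column

xtx   <- crossprod(x)              # cross-product X'X
ridge <- 1e-05
pen   <- ridge * diag(xtx)         # penalty proportional to each diagonal entry
xtx_ridge <- xtx + diag(pen)

# Hypothetical helper: condition number as ratio of extreme singular values
cond <- function(m) {
  d <- svd(m)$d
  max(d) / min(d)
}

cond_plain <- cond(xtx)            # very large: inversion is unstable
cond_ridge <- cond(xtx_ridge)      # much smaller: inversion is stable
```

Comparing `cond_plain` with `cond_ridge` shows why a larger ridge helps with collinear predictors, at the price of shrinking the estimates slightly.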

Vector with imputed data, same type as y, and of length sum(wy)
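To make the shape of the return value concrete, here is a simplified, purely local sketch of predictive mean matching in base R. It is not the server-side `pmmDS` implementation: `pmm_sketch()` is a hypothetical helper, it matches predicted values against predicted values without drawing regression parameters (closer in spirit to `matchtype = 0L` than to the default), and it assumes `wy` defaults to `!ry`. The data are simulated.

```r
# Simplified local predictive mean matching (illustration only).
# pmm_sketch() is a hypothetical helper, not part of dsMiceClient or mice.
pmm_sketch <- function(y, ry, x, wy = !ry, donors = 5L) {
  x <- cbind(1, as.matrix(x))                        # add an intercept column
  x_obs <- x[ry, , drop = FALSE]
  beta <- qr.coef(qr(x_obs), y[ry])                  # least-squares fit on observed rows
  yhat_obs <- drop(x_obs %*% beta)                   # predictions for observed rows
  yhat_mis <- drop(x[wy, , drop = FALSE] %*% beta)   # predictions for rows to impute
  y_obs <- y[ry]
  k <- min(donors, length(y_obs))
  vapply(yhat_mis, function(target) {
    pool <- order(abs(yhat_obs - target))[seq_len(k)]  # k closest donors
    y_obs[pool[sample.int(k, 1L)]]                     # draw one donor's observed value
  }, FUN.VALUE = y_obs[1])
}

# Simulated data: 100 rows, 20 of them missing in y
set.seed(123)
n <- 100
x <- cbind(age = runif(n, 0, 20))
y <- 2 + 0.5 * x[, "age"] + rnorm(n)
y[sample.int(n, 20)] <- NA
ry <- !is.na(y)

imp <- pmm_sketch(y, ry, x)
length(imp) == sum(!ry)   # one imputed value per requested location
all(imp %in% y[ry])       # every imputation is an observed donor value
```

Because each imputation is copied from an observed donor, the result has the same type as `y` and never contains values outside the observed range.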

# In this example, we assume that the Opal server to which we are connecting
# has a table that contains the 'boys' data from the original mice package.

# Load DataSHIELD libraries
library(dsBaseClient)
library(dsMiceClient)

# Build login information
server <- c("server_name")
url <- c("opal_url")
user <- "username"
password <- "password"
table <- c("project_name.table_name")
logindata <- data.frame(server, url, user, password, table)

# Log in and assign the 'boys' dataset to variable 'D' on the server side
opals <- datashield.login(logins=logindata, assign=TRUE)

datashield.assign(opals, symbol="xname", value=as.symbol("c('age', 'hgt', 'wgt')"))
datashield.assign(opals, symbol="r", value=as.symbol("complete.cases(D[, xname])"))
datashield.assign(opals, symbol="x", value=as.symbol("D[r, xname]"))
datashield.assign(opals, symbol="y", value=as.symbol("D[r, 'tv']"))
datashield.assign(opals, symbol="ry", value=as.symbol("notNaDS(y)"))

# Impute missing tv data
yimp <- ds.mice.pmm('y','ry','x')
length(yimp)
table(yimp)
hist(table(yimp), xlab = 'Imputed missing tv')