ImputeSuperIndividuals: Impute missing super-individual data

ImputeSuperIndividuals R Documentation

Impute missing super-individual data

Description

For each (target) individual with a missing value in ImputeAtMissing, identify all (source) individuals in the same haul for which ImputeAtMissing is non-missing and for which the values of ImputeByEqual are identical to those of the target individual. Then sample one of these source individuals and copy its values of ToImpute to the target individual. Only non-missing values are copied from the sampled individual, and only missing values in the target individual are replaced. If no source individuals are found in the haul, expand the search to the stratum, and finally to the survey. If no source individuals are found in the survey, leave the target individual unchanged.
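
The search described above can be sketched in a few lines of base R. This is an illustration only and not the RstoxBase implementation: the function impute_sketch(), the example data, and the column names Haul, Stratum and Survey are hypothetical, and the sketch assumes that sources are always taken from the original (non-imputed) data.

```r
# Illustrative sketch only; not the RstoxBase implementation.
# impute_sketch() and the Haul/Stratum/Survey column names are hypothetical.
impute_sketch <- function(d, at_missing, by_equal, to_impute,
                          levels = c("Haul", "Stratum", "Survey"), seed = 1) {
  set.seed(seed)
  out <- d
  # Assumption: targets and sources are determined from the original data,
  # so imputed individuals are never used as sources.
  targets <- which(is.na(d[[at_missing]]))
  has_value <- !is.na(d[[at_missing]])
  for (i in targets) {
    # Sources must match the target on every by_equal variable.
    same_key <- rep(TRUE, nrow(d))
    for (v in by_equal) {
      same_key <- same_key & !is.na(d[[v]]) & !is.na(d[[v]][i]) &
        d[[v]] == d[[v]][i]
    }
    # Widen the search level by level and stop at the first level with sources.
    for (lev in levels) {
      candidates <- which(has_value & same_key & d[[lev]] == d[[lev]][i])
      if (length(candidates) > 0) {
        # Index into candidates so a single candidate is handled correctly.
        src <- candidates[sample.int(length(candidates), 1)]
        for (v in to_impute) {
          # Copy only non-missing source values into missing target values.
          if (is.na(out[[v]][i]) && !is.na(d[[v]][src])) {
            out[[v]][i] <- d[[v]][src]
          }
        }
        break
      }
    }
  }
  out
}

d <- data.frame(
  Survey = "S1",
  Stratum = c("A", "A", "A", "B"),
  Haul = c("H1", "H1", "H2", "H3"),
  SpeciesCategory = "cod",
  IndividualTotalLength = c(30, 30, 30, 45),
  IndividualAge = c(3, NA, NA, NA),
  stringsAsFactors = FALSE
)

res <- impute_sketch(
  d,
  at_missing = "IndividualAge",
  by_equal = c("SpeciesCategory", "IndividualTotalLength"),
  to_impute = "IndividualAge"
)
```

In this toy data the second individual finds its source in the same haul, the third only at the stratum level, and the fourth has no individual of equal length anywhere in the survey and is therefore left unchanged.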

Usage

ImputeSuperIndividuals(
  SuperIndividualsData,
  ImputationMethod = c("RandomSampling", "Regression"),
  ImputeAtMissing = character(),
  ImputeByEqual = character(),
  ToImpute = character(),
  ImputationLevels = c("Haul", "Stratum", "Survey"),
  Seed = 1,
  RegressionDefinition = c("FunctionParameter", "FunctionInput"),
  GroupingVariables = character(),
  RegressionModel = c("SimpleLinear", "Power"),
  RegressionTable = data.table::data.table(),
  Regression
)

Arguments

SuperIndividualsData

The SuperIndividualsData data.

ImputationMethod

The method to use for the imputation. Currently, only "RandomSampling" is implemented, but it may be accompanied by "Regression" in a coming release.

ImputeAtMissing

A single string naming the variable which, when missing, identifies the target individuals to impute data to. I.e., if ImputeAtMissing is missing for an individual, the imputation is performed for that individual. In StoX 3.0.0 and older, ImputeAtMissing was hard coded to IndividualAge.

ImputeByEqual

A vector of strings naming the variable(s) which, when identical to the target individual, identifies the source individuals to impute data from. The source individuals need also to have non-missing ImputeAtMissing. In StoX 3.0.0 and older, ImputeByEqual was hard coded to c("SpeciesCategory","IndividualTotalLength").

ToImpute

A vector of strings naming the variable(s) to impute (copy to the target individual). Values that are already present in the target individual are not replaced. Note that values are only imputed when ImputeAtMissing is missing, so including many variables in ToImpute is only recommended if all of these are present for the individuals (see Details). In StoX 3.0.0 and older, ToImpute was hard coded to all available variables of the BioticData contained in the SuperIndividualsData.

ImputationLevels

A vector of strings naming the levels at which to impute, defaulting to c("Haul", "Stratum", "Survey"). To prevent imputation at the Survey level, use c("Haul", "Stratum").

Seed

An integer giving the seed to use for the random sampling used to obtain the imputed data.

RegressionDefinition

Character: A string naming the method to use, one of FunctionParameter to define the Regression on the fly in this function (using GroupingVariables, RegressionModel and RegressionTable), or FunctionInput to import Regression process data from a previously run process using the function

GroupingVariables

An optional vector of strings defining variables serving as grouping variables in the RegressionTable. Setting this adds its elements as columns in the RegressionTable in the GUI.

RegressionModel

Character: A string naming the model to use for the regression. See Details for options.

RegressionTable

A table with one row defining the name of the dependent variable (column name DependentVariable), the name of the independent variable (column name IndependentVariable), and the model parameters: Intersect and Slope if RegressionModel = "SimpleLinear", or Factor and Exponent if RegressionModel = "Power".

Regression

The Regression process data.
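
As a rough illustration of how a one-row RegressionTable could map onto the two values of RegressionModel, the sketch below applies the conventional forms of a simple linear model and a power model. The helper predict_sketch(), the parameter values and the use of a plain data.frame are hypothetical, and the formulas are an assumption based on the model names, not a statement of what the function computes.

```r
# Hypothetical helper; the formulas are the conventional forms of the two
# model names and are an assumption, not taken from RstoxBase.
predict_sketch <- function(x, RegressionModel, RegressionTable) {
  switch(RegressionModel,
    SimpleLinear = RegressionTable$Intersect + RegressionTable$Slope * x,
    Power = RegressionTable$Factor * x^RegressionTable$Exponent,
    stop("Unknown RegressionModel: ", RegressionModel)
  )
}

# One-row tables using the column names listed for RegressionTable.
# The parameter values are made up for illustration.
linear_table <- data.frame(
  DependentVariable = "IndividualRoundWeight",
  IndependentVariable = "IndividualTotalLength",
  Intersect = 5,
  Slope = 2
)
power_table <- data.frame(
  DependentVariable = "IndividualRoundWeight",
  IndependentVariable = "IndividualTotalLength",
  Factor = 2,
  Exponent = 3
)

linear_pred <- predict_sketch(c(10, 20), "SimpleLinear", linear_table)
power_pred <- predict_sketch(c(10, 20), "Power", power_table)
```

With these made-up parameters, linear_pred is 5 + 2 * x and power_pred is 2 * x^3 for each length x.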

Details

When ToImpute contains variables other than the one given by ImputeAtMissing, there is a risk that values remain missing even after successful imputation. E.g., if ImputeAtMissing is IndividualAge and ToImpute includes IndividualRoundWeight, then the weight is only imputed when the age is missing. Super-individuals with age but without weight will then still have missing weight. Variables that are naturally connected, such as IndividualRoundWeight and WeightMeasurement, or IndividualTotalLength and LengthResolution, should both be included in ToImpute.
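
The risk described above can be checked on a small, made-up table. The data below are hypothetical; the sketch only identifies which individuals would be targets and which missing weights would consequently never be filled.

```r
# Made-up data: the third individual has an age but no weight.
d <- data.frame(
  IndividualAge = c(4, NA, 5),
  IndividualRoundWeight = c(350, NA, NA)
)

ImputeAtMissing <- "IndividualAge"

# Only individuals with missing ImputeAtMissing are targets for imputation.
is_target <- is.na(d[[ImputeAtMissing]])

# Missing weights on non-target individuals are never imputed.
stays_missing <- !is_target & is.na(d$IndividualRoundWeight)
which(stays_missing)
```

Only the second individual is a target, so the missing weight of the third individual is untouched regardless of how many sources are available.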

Value

An object of StoX data type SuperIndividualsData.

See Also

SuperIndividuals for distributing Abundance to the Individuals.
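
A call using the arguments documented above might look like the following sketch. The argument values are taken from this page (the former hard-coded defaults and the variables named in Details). The object SuperIndividualsData is assumed to come from a previously run SuperIndividuals process, and the call is guarded so that it is skipped when RstoxBase or that object is not available.

```r
# Argument values taken from this page. ImputationLevels is restricted to
# "Haul" and "Stratum" to prevent imputation at the Survey level.
imputation_args <- list(
  ImputationMethod = "RandomSampling",
  ImputeAtMissing = "IndividualAge",
  ImputeByEqual = c("SpeciesCategory", "IndividualTotalLength"),
  ToImpute = c("IndividualAge", "IndividualRoundWeight", "WeightMeasurement"),
  ImputationLevels = c("Haul", "Stratum"),
  Seed = 1
)

# Assumption: SuperIndividualsData already exists in the session, e.g. as
# the output of a SuperIndividuals process. Otherwise the call is skipped.
if (requireNamespace("RstoxBase", quietly = TRUE) &&
    exists("SuperIndividualsData")) {
  imputed <- do.call(
    RstoxBase::ImputeSuperIndividuals,
    c(list(SuperIndividualsData = SuperIndividualsData), imputation_args)
  )
}
```

Keeping the arguments in a list makes it easy to rerun the imputation with a different Seed and compare the results.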


StoXProject/RstoxBase documentation built on Dec. 21, 2024, 7:26 p.m.