RemoveSamplesWithInstability: Remove samples that have multiple values for a single column...


View source: R/preprocessing.R

Description

This function calls StabilityTestingAcrossVisits() and uses its results to subset the input data, keeping only rows that belong to stable samples.

Usage

RemoveSamplesWithInstability(
  inputted.data,
  col.name.of.unique.identifier,
  value.to.evaluate,
  standard.deviation.threshold
)

Arguments

inputted.data

A dataframe

col.name.of.unique.identifier

A string that specifies the name of the column in inputted.data containing unique identifiers.

value.to.evaluate

A string that specifies the name of the column in inputted.data whose values are checked for stability.

standard.deviation.threshold

A numeric value that specifies the standard deviation above which the visits for a single sample are considered too unstable.

Details

Samples with only a single visit are removed. Samples whose values vary too much across visits (standard deviation greater than standard.deviation.threshold) are also removed.
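
The filtering rule described above can be sketched in base R. This is an illustration only and not the package's implementation: the helper name remove_unstable_sketch and its argument names are hypothetical, and it assumes that a sample is kept when its standard deviation is defined and does not exceed the threshold.

```r
# Hypothetical helper for illustration; not part of machinelearnr.
remove_unstable_sketch <- function(df, id.col, value.col, sd.threshold) {
  # Standard deviation of the values for each identifier.
  # sd() returns NA when an identifier has only a single value.
  sd.by.id <- tapply(df[[value.col]], df[[id.col]], sd)

  # Keep identifiers whose sd is defined and does not exceed the threshold.
  stable.ids <- names(sd.by.id)[!is.na(sd.by.id) & sd.by.id <= sd.threshold]

  # Return only the rows that belong to stable identifiers.
  df[df[[id.col]] %in% stable.ids, , drop = FALSE]
}

toy <- data.frame(
  id = c("a", "a", "a", "b", "b", "b", "c"),
  value = c(1, 2, 3, 1, 1, 1, 5)
)
kept <- remove_unstable_sketch(toy, "id", "value", 0.5)
```

In this toy data, sample "a" has a standard deviation of 1 (above the 0.5 threshold) and sample "c" has a single visit, so only the rows for sample "b" are kept.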

Value

A dataframe where only rows from stable samples remain.

See Also

Other Preprocessing functions: AddColBinnedToBinary(), AddColBinnedToQuartiles(), AddPCsToEnd(), ConvertDataToPercentiles(), CorAssoTestMultipleWithErrorHandling(), DownSampleDataframe(), GenerateElbowPlotPCA(), GeneratePC1andPC2PlotsWithAndWithoutOutliers(), Log2TargetDensityPlotComparison(), LookAtPCFeatureLoadings(), MultipleColumnsNormalCheckThenBoxCox(), NormalCheckThenBoxCoxTransform(), RanomlySelectOneRowForEach(), RecodeIdentifier(), RemoveColWithAllZeros(), RemoveRowsBasedOnCol(), SplitIntoTrainTest(), StabilityTestingAcrossVisits(), SubsetDataByContinuousCol(), TwoSampleTTest(), ZScoreChallengeOutliers(), captureSessionInfo(), correlation.association.test(), describeNumericalColumnsWithLevels(), describeNumericalColumns(), generate.descriptive.plots.save.pdf(), generate.descriptive.plots()

Examples

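# Three visits each for samples "a" and "b", and a single visit for sample "c".
identifier.col <- c("a", "a", "a", "b", "b", "b", "c")
value.col <- c(1, 2, 3, 1, 1, 1, 5)
# Use data.frame() rather than as.data.frame(cbind(...)): cbind() would coerce
# the numeric values to character.
input.data.frame <- data.frame(identifier.col, value.col)

results <- RemoveSamplesWithInstability(input.data.frame, "identifier.col", "value.col", 0.5)

# Per the Details section, only the rows for sample "b" should remain.
results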

yhhc2/machinelearnr documentation built on Dec. 23, 2021, 7:19 p.m.