View source: R/preprocessing.R
If multiple rows share the same identifier in a given column, randomly select just one of those rows. This is done for every identifier, and the output is a new dataframe in which each identifier appears in exactly one row.
RanomlySelectOneRowForEach(inputted.data, col.name.of.unique.identifier, seed)
inputted.data | A dataframe.
col.name.of.unique.identifier | Name of column in inputted.data containing identifiers.
seed | Number indicating the seed to set for random number generation.
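The role of the seed argument can be illustrated with base R alone. The short sketch below assumes the function passes seed to set.seed() before sampling, which the argument description suggests but this page does not confirm. The variable names are made up for illustration.

```r
# Assumption: the seed is handed to set.seed() before any random draw.
# Resetting the same seed makes the same random draw repeatable.
set.seed(1)
first.draw <- sample(1:10, 3)

set.seed(1)
second.draw <- sample(1:10, 3)

# Both draws were made from the same generator state, so they match.
identical(first.draw, second.draw)
```

Under that assumption, calling the function twice with the same data and the same seed should keep the same rows, while a different seed may keep different ones.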
A dataframe where a single row remains for each identifier.
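To make the described behavior concrete, here is a minimal base R sketch of one way to keep a single random row per identifier. It is not the package's source code; the function name select_one_row_per_id and its argument names are hypothetical and chosen only for this illustration.

```r
# Hypothetical stand-in for illustration; not the package's implementation.
select_one_row_per_id <- function(data, id.col, seed) {
  set.seed(seed)
  # Group row indices by identifier value (rows with an NA identifier are dropped by split()).
  rows.by.id <- split(seq_len(nrow(data)), data[[id.col]])
  # Draw one row index per group; sample.int() avoids the surprise that
  # sample(x, 1) samples from 1:x when x is a single number.
  chosen <- vapply(
    rows.by.id,
    function(idx) idx[sample.int(length(idx), 1)],
    integer(1)
  )
  # Return the chosen rows in their original order.
  data[sort(unname(chosen)), , drop = FALSE]
}

df <- data.frame(
  id = c("a", "a", "a", "b", "b", "b", "c"),
  value = c(1, 2, 3, 1, 1, 1, 5)
)
out <- select_one_row_per_id(df, "id", seed = 1)
```

In this sketch, out has one row for each of "a", "b", and "c", and repeating the call with the same seed returns the same rows.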
Other Preprocessing functions: AddColBinnedToBinary(), AddColBinnedToQuartiles(), AddPCsToEnd(), ConvertDataToPercentiles(), CorAssoTestMultipleWithErrorHandling(), DownSampleDataframe(), GenerateElbowPlotPCA(), GeneratePC1andPC2PlotsWithAndWithoutOutliers(), Log2TargetDensityPlotComparison(), LookAtPCFeatureLoadings(), MultipleColumnsNormalCheckThenBoxCox(), NormalCheckThenBoxCoxTransform(), RecodeIdentifier(), RemoveColWithAllZeros(), RemoveRowsBasedOnCol(), RemoveSamplesWithInstability(), SplitIntoTrainTest(), StabilityTestingAcrossVisits(), SubsetDataByContinuousCol(), TwoSampleTTest(), ZScoreChallengeOutliers(), captureSessionInfo(), correlation.association.test(), describeNumericalColumnsWithLevels(), describeNumericalColumns(), generate.descriptive.plots.save.pdf(), generate.descriptive.plots()
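# Build a small dataframe in which identifiers "a" and "b" each appear three times.
identifier.col <- c("a", "a", "a", "b", "b", "b", "c")
value.col <- c(1, 2, 3, 1, 1, 1, 5)
# data.frame() keeps value.col numeric; as.data.frame(cbind(...)) would coerce it to character.
input.data.frame <- data.frame(identifier.col, value.col)
# Keep one randomly chosen row per identifier, with the seed set to 1.
results <- RanomlySelectOneRowForEach(input.data.frame, "identifier.col", 1)
results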