verifyDoppelgangers: Verifies the functionality of Doppelgangers
In lr98769/doppelgangerIdentifier: Identifies Doppelgangers Between Datasets With PPCC And Meta data

verifyDoppelgangers

R Documentation

Verifies the functionality of Doppelgangers

Description

The user constructs a csv file of training-validation set pairs, ideally with the number of doppelgangers between the training and validation sets increasing from one pair to the next. For each training-validation set pair, 12 models with different feature sets are trained (with the default settings): 10 models on random feature sets and 2 models on the feature sets of highest and lowest variance. If the validation accuracy of the 10 random models increases as the number of doppelgangers increases, we can conclude that the included doppelgangers are functional doppelgangers.
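The feature set scheme described above can be sketched roughly as follows. This is an illustrative sketch only, not the package's internal code: the `counts` matrix, its gene and sample names, and the variable `feature_sets` are all hypothetical, and the way the package actually ranks or samples features may differ.

```r
# Illustrative sketch only; not the package's internal implementation.
# `counts` is a hypothetical genes-by-samples matrix.
set.seed(2021)  # mirrors the default seed_num
counts <- matrix(
  rnorm(50 * 6),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("s", 1:6))
)

feature_set_portion <- 0.1
num_random_feature_sets <- 10
set_size <- floor(feature_set_portion * nrow(counts))  # features per set

# Rank features by their variance across samples.
gene_variance <- apply(counts, 1, var)
by_variance <- names(sort(gene_variance, decreasing = TRUE))

# Two variance-based feature sets.
feature_sets <- list(
  top_variance = head(by_variance, set_size),
  bottom_variance = tail(by_variance, set_size)
)

# Add the random feature sets, each drawn without replacement.
for (i in seq_len(num_random_feature_sets)) {
  feature_sets[[paste0("random_", i)]] <- sample(rownames(counts), set_size)
}

length(feature_sets)  # 10 random sets plus 2 variance-based sets
```

With the defaults this gives 12 feature sets per training-validation pair, which matches the 12 models mentioned in the description.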

Usage

verifyDoppelgangers(
  experiment_plan_filename,
  raw_data,
  meta_data,
  feature_set_portion = 0.1,
  seed_num = 2021,
  separator = "\\.",
  do_batch_corr = TRUE,
  k = 5,
  num_random_feature_sets = 10,
  size_of_val_set = 8,
  batch_corr_method = "ComBat",
  neg_con_seed = 10
)

Arguments

`experiment_plan_filename`	Name of the file containing the csv experiment plan. The csv file has a header row with the names of the training and validation sets (e.g. "Doppel_0.train" or "Doppel_0.valid"). Each column (e.g. the "Doppel_0.train" column) lists the names of all samples in that training or validation set.
`raw_data`	Dataframe of count matrix before batch correction
`meta_data`	Dataframe of meta data
`feature_set_portion`	Proportion of variables to be used for feature set generation
`seed_num`	Seed number for random feature set generation
`separator`	The character separating the name of the training-validation pair (e.g. "Doppel_0") from the "train" or "valid" label. Each column name should be in the format "Doppel_0.train" if . is used as the separator
`do_batch_corr`	If FALSE, no batch correction is carried out
`k`	k hyperparameter for KNN classification models
`num_random_feature_sets`	Number of random feature sets for each training-validation set
`size_of_val_set`	Size of each validation set. All validation sets are assumed to be the same size; this value is used for the binomial model
`batch_corr_method`	Batch correction method used. Only two options are accepted: "ComBat" or "ComBat_seq".
`neg_con_seed`	Seed used for negative control
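
To see how the `separator` argument relates to the column names, the sketch below splits some hypothetical headers with base R's `strsplit`. It assumes the separator is treated as a regular expression (the default `"\\."` is the pattern for a literal period); how the package itself parses headers is not shown in this documentation.

```r
# Hypothetical column headers from an experiment plan.
headers <- c("Doppel_0.train", "Doppel_0.valid",
             "Doppel_2.train", "Doppel_2.valid")
separator <- "\\."  # regular expression matching a literal period

# Split each header into the pair name and the train/valid label.
parts <- strsplit(headers, separator)
pair_name <- vapply(parts, `[`, character(1), 1)
set_label <- vapply(parts, `[`, character(1), 2)

# Each header should split into exactly two pieces with a valid label.
stopifnot(
  all(lengths(parts) == 2),
  all(set_label %in% c("train", "valid"))
)

parsed <- data.frame(header = headers, pair = pair_name, label = set_label)
parsed

# A pair name that contains the separator splits into too many pieces.
bad_pieces <- lengths(strsplit("Doppel.0.train", separator))
bad_pieces  # 3 pieces instead of 2
```

This is why a name such as "Doppel.0" cannot be combined with a period separator.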

Details

Troubleshooting tips:

Ensure that none of the headers contain spaces.
If Excel is used for planning, save the spreadsheet as "CSV (MS-DOS) (*.csv)".
Use the exact labels "train" and "valid" (all lowercase).
Ensure the separator does not appear in the name of the training-validation set (e.g. "Doppel.0" is not allowed when . is the separator).
Place the training and validation columns of each pair next to each other and leave no empty columns between them.
Refer to the csv file in the tutorial on the GitHub README.
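
As an alternative to a spreadsheet, an experiment plan that follows the tips above could be written from R. The sketch below is a minimal example under stated assumptions: the sample names are hypothetical placeholders, and padding shorter columns with empty strings is an assumption about how columns of unequal length are stored. Check the tutorial csv for the layout the package expects.

```r
# Hypothetical sample names; replace them with sample names from your data.
train_samples <- paste0("sample_", 1:12)
valid_0 <- paste0("sample_", 13:20)  # 8 samples, matching size_of_val_set
valid_2 <- c(paste0("doppel_", 1:2), paste0("sample_", 15:20))

# Headers: no spaces, exact "train"/"valid" labels, pairs side by side.
columns <- list(
  "Doppel_0.train" = train_samples,
  "Doppel_0.valid" = valid_0,
  "Doppel_2.train" = train_samples,
  "Doppel_2.valid" = valid_2
)

# Assumption: shorter columns are padded with empty strings.
n_rows <- max(lengths(columns))
padded <- lapply(columns, function(x) c(x, rep("", n_rows - length(x))))
plan <- as.data.frame(padded, stringsAsFactors = FALSE)

# Write the plan to a temporary file.
plan_file <- file.path(tempdir(), "experimentPlan.csv")
write.csv(plan, plan_file, row.names = FALSE)

readLines(plan_file, n = 1)  # header row of the written file
```

The resulting path can then be passed as `experiment_plan_filename`.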

Value

Validation Accuracies

Examples

## Not run: 
verificationResults <- verifyDoppelgangers(
  experiment_plan_filename = "tutorial/experimentPlan.csv",
  raw_data = rc,
  meta_data = rc_metadata
)

## End(Not run)

lr98769/doppelgangerIdentifier documentation built on Aug. 2, 2022, 9:41 a.m.
