Create.TrainingDataSet: Create a new training data set

View source: R/Create.TrainingDataSet.R

Description

Create a new training data set from a set of feature files. The function concatenates all feature files into a single data.table object. Read labels are assigned from the corresponding organism labels. The results are saved in the specified folder.

Usage

Create.TrainingDataSet(Path2Files = NULL, pattern = "Features",
  OSlabels = NULL, savePath = file.path(Path2Files, "TrainingData"),
  CompressOption = T)

Arguments

Path2Files

A complete path to the read folder (i.e. it contains a subfolder 'Features')

pattern

A string providing a distinct search pattern for all feature files (default = "Features")

OSlabels

A named vector whose values are either 'HP' or 'NP' and whose names are the organism identifiers (Bioproject IDs)

savePath

The name of the newly created training data set folder

CompressOption

Logical. Should the saved files be compressed? Compression takes time but saves disk space.
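As an illustration of the OSlabels format described above, a named character vector could be built as in the sketch below. The Bioproject IDs are hypothetical placeholders and do not refer to real organisms; the sanity checks are a suggestion, not something the function is documented to require.

```r
# Hypothetical Bioproject IDs, used purely for illustration
bioproject_ids <- c("PRJNA000001", "PRJNA000002", "PRJNA000003")

# Each element is either "HP" or "NP"; the names carry the organism identifier
OSlabels <- setNames(c("HP", "NP", "HP"), bioproject_ids)

# Basic sanity checks before passing the vector on
stopifnot(all(OSlabels %in% c("HP", "NP")))
stopifnot(!is.null(names(OSlabels)), !anyDuplicated(names(OSlabels)))

# Look up the label of one organism by its identifier
OSlabels[["PRJNA000002"]]
```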

Details

Saves three files in the specified folder "TrainingFolder":

FeatureTable.rds, containing the features in a data.frame object

FeatureRowDescription.rds, containing the read description, which allows identification of organism, chromosome and read

ReadLabel_OS.rds, containing a label for each read based on the organism's label
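The three files listed above are ordinary .rds files, so they can presumably be loaded again with base R's readRDS. The helper below is a minimal sketch: read_training_data is a hypothetical name that is not part of the package, and the internal structure of each object is not assumed beyond what this page states.

```r
# Hypothetical helper (not part of the package): load the three saved files
# from a training data folder into a named list.
read_training_data <- function(training_folder) {
  file_names <- c(
    features    = "FeatureTable.rds",
    description = "FeatureRowDescription.rds",
    labels      = "ReadLabel_OS.rds"
  )
  paths <- file.path(training_folder, file_names)

  # Fail early with a readable message if any expected file is absent
  missing <- !file.exists(paths)
  if (any(missing)) {
    stop("Missing files: ", paste(file_names[missing], collapse = ", "))
  }

  # readRDS handles compressed and uncompressed .rds files alike
  lapply(setNames(paths, names(file_names)), readRDS)
}
```

A call such as read_training_data(savePath) would then return a list with the elements features, description and labels.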

This function uses the data.table package to join many large data.frames efficiently. For even larger files, Linux command line tools are recommended.
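To show the kind of operation data.table makes efficient, the sketch below stacks several small tables with rbindlist. It is an illustration only and not necessarily how the function is implemented; the table names and values are made up.

```r
library(data.table)

# Toy stand-ins for per-organism feature tables (names and values are made up)
feature_tables <- list(
  OrganismA = data.frame(f1 = c(0.1, 0.2), f2 = c(1L, 0L)),
  OrganismB = data.frame(f1 = 0.3, f2 = 1L)
)

# rbindlist stacks all tables in one pass;
# idcol adds a column recording which list element each row came from
combined <- rbindlist(feature_tables, use.names = TRUE, idcol = "Organism")

print(combined)
```

Compared with repeated calls to rbind in a loop, a single rbindlist call avoids copying the growing table at every step, which matters when many large tables are combined.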

Value

Returns TRUE if completed. Feature and label files are saved in the specified folder (see Details).

Examples

## Not run: 
Create.TrainingDataSet(Path2Files = NULL, OSlabels = NULL, savePath = NULL, CompressOption = T)
## End(Not run)

Author(s)

Carlus Deneke

See Also

Other TrainingFunctions: Run.Training, SelectFeatureSubset


crarlus/paprbag documentation built on May 14, 2019, 11:31 a.m.