optimize.sd_selection: Optimization of sd selection
In xyang2uchicago/NPS: BioTIP: An R package for characterization of Biological Tipping-Point

View source: R/BioTIP_update_04202022.R

optimize.sd_selection

R Documentation

Optimization of sd selection

Description

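optimize.sd_selection filters a multi-state dataset by a per-state standard deviation cutoff and optimizes the selection by repeating the filter over `B` rounds of sample subsetting, keeping the transcripts that are selected consistently. By default, a cutoff value of 0.01 is used. Recommended when each state contains more than 10 samples.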
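To make the per-state filter concrete, here is a minimal base R sketch of keeping the top fraction of transcripts by within-state standard deviation. It is not the package's implementation: the helper `top_sd_per_state`, the toy data, and the state names are hypothetical, and it only mimics an `itself`-style ranking without the resampling step.

```r
# Toy expression table: 200 transcripts x 12 samples (all names are hypothetical).
set.seed(1)
df <- as.data.frame(matrix(rnorm(200 * 12), nrow = 200,
                           dimnames = list(paste0("gene", 1:200),
                                           paste0("s", 1:12))))
samplesL <- list(stateA = paste0("s", 1:6), stateB = paste0("s", 7:12))

# Every sample name in samplesL must be a column name of df.
stopifnot(all(unlist(samplesL) %in% colnames(df)))

# Hypothetical helper (an assumption, not a BioTIP function): for each state,
# keep the top `cutoff` fraction of transcripts ranked by standard deviation.
top_sd_per_state <- function(df, samplesL, cutoff = 0.01) {
  lapply(samplesL, function(ids) {
    sub <- df[, ids, drop = FALSE]
    sds <- apply(sub, 1, sd)                       # one sd per transcript
    n_keep <- max(1, floor(cutoff * nrow(sub)))    # size of the top fraction
    keep <- names(sort(sds, decreasing = TRUE))[seq_len(n_keep)]
    sub[keep, , drop = FALSE]
  })
}

filtered <- top_sd_per_state(df, samplesL, cutoff = 0.01)
sapply(filtered, nrow)  # number of transcripts kept in each state
```

The result is a named list with one dataframe per state, each a row subset of the input restricted to that state's samples.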

Usage

optimize.sd_selection(
  df,
  samplesL,
  B = 100,
  percent = 0.8,
  times = 0.8,
  cutoff = 0.01,
  method = c("other", "reference", "previous", "itself", "longitudinal reference"),
  control_df = NULL,
  control_samplesL = NULL
)

Arguments

`df`	A dataframe of numerics. The rows and columns represent unique transcript IDs (geneID) and sample names, respectively.
`samplesL`	A list of n vectors, where n equals the number of states. Each vector gives the sample names in a state. Note that the sample names in these vectors have to be among the column names of the R object 'df'.
`B`	An integer indicating the number of times to run this optimization, default 100.
`percent`	A numeric value indicating the fraction of samples that will be selected in each round of simulation, default 0.8.
`times`	A numeric value indicating the fraction of the `B` rounds in which a transcript needs to be selected in order to be considered a stable signature, default 0.8.
`cutoff`	A positive numeric value. Default is 0.01. If < 1, the function automatically selects the top fraction of transcripts as ranked by the selected method (relative to the `reference`, `other`, or `previous` stage); e.g. by default it selects the top 1% of the transcripts.
`method`	The selection method, one of `other`, `reference`, `previous`, `itself`, or `longitudinal reference`; the default is `other`. Partial matching is enabled. Specific requirements for each option: for `reference`, the reference has to be the first; for `previous`, make sure `samplesL` is in the right order from benign to malignant; for `itself`, make sure the cutoff is smaller than 1; for `longitudinal reference`, make sure `control_df` and `control_samplesL` are not NULL, that `control_df` has the same number of rows as `df`, and that all transcripts in `df` are also in `control_df`.
`control_df`	A count matrix with unique loci as row names and the names of control samples as column names; only used for method `longitudinal reference`.
`control_samplesL`	A list of character vectors giving the control sample names, with stages as the list names; required for method `longitudinal reference`.
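The interplay of `B`, `percent`, `times`, and `cutoff` can be illustrated with a small base R sketch of resampling-based stability selection for a single state. This is an assumption about the general technique, not the package's code: the function `stable_by_resampling` and the toy matrix `expr` are hypothetical, and transcripts are ranked by their own standard deviation only.

```r
# Toy data for one state: 100 transcripts x 10 samples (names are hypothetical).
set.seed(42)
expr <- matrix(rnorm(100 * 10), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("s", 1:10)))
expr[1:3, ] <- expr[1:3, ] * 10   # make three transcripts clearly more variable

# Hypothetical helper (not a BioTIP function): repeat a top-sd filter on random
# sample subsets and keep the transcripts that are selected often enough.
stable_by_resampling <- function(expr, B = 100, percent = 0.8,
                                 times = 0.8, cutoff = 0.05) {
  n_draw <- max(2, floor(percent * ncol(expr)))   # samples drawn per round
  n_keep <- max(1, floor(cutoff * nrow(expr)))    # transcripts kept per round
  counts <- setNames(integer(nrow(expr)), rownames(expr))
  for (b in seq_len(B)) {
    cols <- sample(ncol(expr), n_draw)            # subset without replacement
    sds <- apply(expr[, cols, drop = FALSE], 1, sd)
    top <- names(sort(sds, decreasing = TRUE))[seq_len(n_keep)]
    counts[top] <- counts[top] + 1L
  }
  # A transcript is "stable" if selected in at least times * B rounds.
  names(counts)[counts >= times * B]
}

stable <- stable_by_resampling(expr)
stable
```

In this sketch, raising `times` makes the signature stricter, while lowering `percent` makes each round noisier, so fewer transcripts tend to pass the threshold.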

Value

A list of dataframes of filtered transcripts, in which the transcripts with the highest standard deviation are selected from df based on the assigned cutoff value. Each resulting dataframe represents a subset of the raw input df.

Author(s)

Zhezhen Wang zhezhen@uchicago.edu
