View source: R/BioTIP_update_04202022.R
optimize.sd_selection | R Documentation |
optimize.sd_selection filters a multi-state dataset based on a per-state
standard deviation cutoff and optimizes the selection over repeated rounds
of sample subsampling.
By default, a cutoff value of 0.01 is used. This function is suggested when each state contains more than 10 samples.
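The idea described above can be sketched in base R. This is a minimal illustration, not the BioTIP implementation: `sketch_stable_sd` is a hypothetical helper, it ranks transcripts by plain within-state standard deviation (ignoring the `method` options), and it assumes that `times` is the fraction of the `B` rounds in which a transcript must be selected to be kept. The rounding rules for `percent` and `cutoff` are also assumptions.

```r
# Hypothetical helper, not part of BioTIP: subsampling-based SD selection.
sketch_stable_sd <- function(df, samplesL, B = 100, percent = 0.8,
                             times = 0.8, cutoff = 0.01) {
  lapply(samplesL, function(samples) {
    # Samples drawn per round and transcripts kept per round (assumed rounding).
    n_pick <- min(length(samples), max(2, floor(length(samples) * percent)))
    n_top  <- max(1, ceiling(nrow(df) * cutoff))
    hits <- setNames(integer(nrow(df)), rownames(df))
    for (b in seq_len(B)) {
      picked <- sample(samples, n_pick)
      # Per-transcript SD across the subsampled columns of this state.
      sds <- apply(as.matrix(df[, picked, drop = FALSE]), 1, sd)
      top <- names(sort(sds, decreasing = TRUE))[seq_len(n_top)]
      hits[top] <- hits[top] + 1L
    }
    # Keep transcripts selected in at least times * B rounds.
    stable <- names(hits)[hits >= times * B]
    df[stable, samples, drop = FALSE]
  })
}

# Toy data (illustrative only): 200 transcripts, 2 states of 12 samples each.
set.seed(1)
toy <- as.data.frame(matrix(rnorm(200 * 24), nrow = 200,
                            dimnames = list(paste0("gene", 1:200),
                                            paste0("s", 1:24))))
toy[1:5, 1:12] <- toy[1:5, 1:12] * 10   # make 5 transcripts highly variable in state A
states <- list(A = paste0("s", 1:12), B = paste0("s", 13:24))
res <- sketch_stable_sd(toy, states, B = 50, cutoff = 0.05)
sapply(res, dim)
```

The result is a list with one dataframe per state, each a row subset of the input restricted to that state's samples, which mirrors the shape of the value this page documents.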
optimize.sd_selection(
df,
samplesL,
B = 100,
percent = 0.8,
times = 0.8,
cutoff = 0.01,
method = c("other", "reference", "previous", "itself", "longitudinal reference"),
control_df = NULL,
control_samplesL = NULL
)
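A call following the signature above might look like the sketch below. The toy counts, sample names, and state labels are assumptions made up for illustration, and the call is only attempted when the BioTIP package is installed.

```r
set.seed(42)
# Toy data (illustrative only): 50 transcripts x 36 samples, 3 states of 12 samples.
counts <- matrix(rpois(50 * 36, lambda = 20), nrow = 50,
                 dimnames = list(paste0("tx", 1:50), paste0("s", 1:36)))
df <- as.data.frame(counts)
# One character vector of sample names per state.
samplesL <- split(colnames(df), rep(c("state1", "state2", "state3"), each = 12))

if (requireNamespace("BioTIP", quietly = TRUE)) {
  # A small B keeps the toy run short; argument names follow the usage above.
  res <- BioTIP::optimize.sd_selection(df, samplesL, B = 10, percent = 0.8,
                                       times = 0.8, cutoff = 0.1,
                                       method = "other")
  str(res, max.level = 1)
}
```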
df |
A dataframe of numerics. The rows and columns represent unique transcript IDs (geneID) and sample names, respectively. |
samplesL |
A list of n vectors, where n equals the number of states. Each vector gives the sample names in a state. Note that the sample names in these vectors have to be among the column names of the R object 'df'. |
B |
An integer indicating the number of times to run this optimization. Default is 100. |
percent |
A numeric value indicating the proportion of samples to be selected in each round of simulation. Default is 0.8. |
times |
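A numeric value indicating the percentage of B times a transcript needs to be selected in order to be considered a stable signature. Default is 0.8. |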
cutoff |
A positive numeric value. Default is 0.01. If < 1, the top x fraction of
transcripts is selected using the chosen selection method (see method); for
example, the default of 0.01 selects the top 1% of transcripts. |
method |
Selection of methods from 'other', 'reference', 'previous', 'itself', or
'longitudinal reference'. Default is 'other'. |
control_df |
A count matrix with unique loci as row names and the sample names
of control samples as column names, only used for method 'longitudinal reference'. |
control_samplesL |
A list of character vectors giving the control sample names, with stages as the list names; required for method 'longitudinal reference'. |
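The constraints on df and samplesL described above can be checked before calling the function. The sketch below is a hypothetical helper (`check_sd_inputs` is not part of BioTIP) that verifies every listed sample is a column of df, that df is numeric, and reports how many samples each state contains.

```r
# Hypothetical helper, not part of BioTIP: sanity-check df and samplesL.
check_sd_inputs <- function(df, samplesL) {
  stopifnot(is.list(samplesL))
  all_samples <- unlist(samplesL, use.names = FALSE)
  # Every sample name must be among the column names of df.
  missing <- setdiff(all_samples, colnames(df))
  if (length(missing) > 0) {
    stop("Samples not found in colnames(df): ", paste(missing, collapse = ", "))
  }
  # df should be a dataframe of numerics.
  is_num <- vapply(as.data.frame(df), is.numeric, logical(1))
  if (!all(is_num)) stop("df must contain only numeric columns")
  # Samples per state; the documentation suggests more than 10 per state.
  lengths(samplesL)
}

# Toy input (illustrative only).
df <- data.frame(s1 = c(1, 2), s2 = c(3, 4), s3 = c(5, 6),
                 row.names = c("txA", "txB"))
sizes <- check_sd_inputs(df, list(early = c("s1", "s2"), late = "s3"))

# A sample name that is not a column of df is reported as an error.
bad <- tryCatch(check_sd_inputs(df, list(early = c("s1", "sX"))),
                error = function(e) conditionMessage(e))
```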
A list of dataframes containing the filtered transcripts with the highest
standard deviation, selected from df based on the assigned cutoff value.
Each resulting dataframe represents a subset of the raw input df.
Zhezhen Wang zhezhen@uchicago.edu
sd_selection