step_select_genes: Gene selection by differential expression analysis.


View source: R/step-select-genes.R

Description

step_select_genes() creates a specification of a recipe step that selects genes by differential expression analysis, discarding those whose adjusted p-value exceeds padj_cutoff. The analysis is currently performed by cor_de().

Usage

step_select_genes(
  recipe,
  ...,
  role = NA,
  trained = FALSE,
  condition = NULL,
  genes_pass = NULL,
  padj_cutoff = 0.05,
  max_n_genes = NULL,
  min_n_genes = NULL,
  options = list(method = "spearman", padj_method = stats::p.adjust.methods[1]),
  skip = FALSE,
  id = recipes::rand_id("select_genes")
)

Arguments

recipe

A recipe object. The step will be added to the sequence of operations for this recipe.

...

One or more selector functions to choose which variables (genes) are candidates for selection by this step. See selections() for more details. For the tidy method, these are not currently used.

role

For model terms created by this step, what analysis role should they be assigned? This step selects among existing columns rather than creating new ones, so the default (NA) can normally be left unchanged.

trained

A logical to indicate if the quantities for preprocessing have been estimated.

condition

The condition for the differential expression. See cor_de().

genes_pass

This should not be specified by the end user. Information about which genes do and don't pass the p-value threshold in the differential expression analysis is stored here.

padj_cutoff

Genes with an adjusted p-value less than or equal to padj_cutoff in the differential expression analysis are kept. The rest are discarded.
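The filtering rule can be illustrated with base R alone. The following is a minimal sketch, not the mirmodels implementation: it assumes a per-gene Spearman correlation test against a numeric condition, and the simulated data, the variable names (expr, condition, genes_kept) and the choice of stats::cor.test() are all illustrative assumptions.

```r
# Base-R sketch of an adjusted-p-value filter; NOT the package's own code.
set.seed(1)
n <- 40
condition <- rep(c(0, 1), each = n / 2)  # hypothetical two-group condition

# Hypothetical expression matrix: samples in rows, genes in columns.
expr <- matrix(rnorm(n * 5), nrow = n,
               dimnames = list(NULL, paste0("gene", 1:5)))
expr[, "gene1"] <- expr[, "gene1"] + 3 * condition  # one strongly associated gene

# Spearman correlation test of each gene against the condition.
# Ties in `condition` trigger an "exact p-value" warning, silenced here.
pvals <- apply(expr, 2, function(g) {
  suppressWarnings(stats::cor.test(g, condition, method = "spearman")$p.value)
})

# Adjust for multiple testing, then keep genes at or below the cutoff.
padj <- stats::p.adjust(pvals, method = "holm")
padj_cutoff <- 0.05
genes_kept <- names(padj)[padj <= padj_cutoff]
```

Note that the comparison is "less than or equal to", matching the description above, so a gene whose adjusted p-value equals padj_cutoff exactly is kept.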

max_n_genes

A positive integer. The maximum number of genes selected by this step.

min_n_genes

A positive integer. The minimum number of genes selected. This guarantees that at least this many genes are returned, even if fewer (or none) pass padj_cutoff.
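One way max_n_genes and min_n_genes could interact with padj_cutoff is sketched below. The helper select_by_padj() is hypothetical, and ranking genes by adjusted p-value to decide which ones fill the cap or the floor is an assumption; the package documentation does not state how ties or ordering are handled.

```r
# Hypothetical helper; the name and the rank-by-adjusted-p-value rule are
# assumptions made for illustration, not mirmodels code.
select_by_padj <- function(padj, padj_cutoff = 0.05,
                           max_n_genes = NULL, min_n_genes = NULL) {
  ranked <- sort(padj)                    # smallest adjusted p-value first
  n_keep <- sum(ranked <= padj_cutoff)    # genes passing the threshold
  if (!is.null(max_n_genes)) n_keep <- min(n_keep, max_n_genes)  # apply cap
  if (!is.null(min_n_genes)) n_keep <- max(n_keep, min_n_genes)  # apply floor
  n_keep <- min(n_keep, length(ranked))   # cannot exceed the genes available
  names(ranked)[seq_len(n_keep)]
}

padj <- c(a = 0.001, b = 0.02, c = 0.04, d = 0.30, e = 0.80)

passing <- select_by_padj(padj)                                     # threshold only
capped  <- select_by_padj(padj, max_n_genes = 2)                    # cap at two genes
floored <- select_by_padj(padj, padj_cutoff = 1e-4, min_n_genes = 2)  # none pass; floor applies
```

In this sketch the floor is applied after the cap, so min_n_genes wins if the two conflict; that ordering is a design choice of the example only.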

options

A list with two elements named method and padj_method, both of which are passed to cor_de(). The threshold for keeping genes is set separately by padj_cutoff.
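For orientation, the default padj_method shown in the usage, stats::p.adjust.methods[1], resolves to Holm's method. The sketch below shows how a list of this shape could feed base-R calls; how cor_de() actually consumes these values is not documented here, so the use of stats::cor.test() and stats::p.adjust() is an assumption, and opts, x and y are illustrative names.

```r
# Default options as given in the usage; `opts` is an illustrative name.
opts <- list(method = "spearman", padj_method = stats::p.adjust.methods[1])
opts$padj_method  # "holm", the first entry of stats::p.adjust.methods

# Assumed usage: a correlation test followed by multiple-testing adjustment.
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 1, 4, 3, 6, 5)
p_raw <- stats::cor.test(x, y, method = opts$method)$p.value

# Adjust this p-value together with a second, made-up p-value of 0.5.
p_adj <- stats::p.adjust(c(p_raw, 0.5), method = opts$padj_method)
```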

skip

A logical. Should the step be skipped when the recipe is baked by bake.recipe()? While all operations are baked when prep.recipe() is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using skip = TRUE as it may affect the computations for subsequent operations.

id

A character string that is unique to this step to identify it.

Value

An updated version of recipe with the new step added to the sequence of existing steps (if any).


mirvie/mirmodels documentation built on Jan. 14, 2022, 11:12 a.m.