discretize_gene_supervised: discretize_gene_supervised

View source: R/discretize_gene_supervised.R

Description

Applies several discretization methods to a variable (gene) and selects the one that performs best against a target class, as measured by equivocation. Note that set.seed() should be called beforehand to make the results reproducible; otherwise, the inner kmeans function would give different results on each run.
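
The selection step described above can be illustrated with a minimal base R sketch. It assumes that "equivocation" means the conditional entropy H(target | discretized gene), and it compares only two candidate splits; the helper `conditional_entropy` and the toy data are hypothetical and are not taken from the FCBF source, which may score candidates differently (for example by symmetrical uncertainty).

```r
# Conditional entropy H(target | feature) in bits.
# Lower values mean the discretized feature leaves less uncertainty
# about the target class.
conditional_entropy <- function(target, feature) {
  joint <- table(feature, target) / length(target)  # P(feature, target)
  marginal <- rowSums(joint)                        # P(feature)
  cond <- joint / marginal                          # P(target | feature), row-wise
  terms <- joint * log2(cond)
  -sum(terms[joint > 0])                            # skip empty cells (0 * log 0)
}

# Toy data (an assumption for illustration): two classes with shifted means.
set.seed(1)
target <- rep(c("ctrl", "infected"), each = 50)
gene <- c(rnorm(50, mean = 0), rnorm(50, mean = 2))

# Two candidate discretizations of the same gene.
candidates <- list(
  by_median = ifelse(gene > median(gene), "high", "low"),
  by_mean   = ifelse(gene > mean(gene), "high", "low")
)

# Score each candidate and keep the one with the lowest equivocation.
scores <- vapply(candidates,
                 function(d) conditional_entropy(target, d),
                 numeric(1))
best <- names(which.min(scores))
print(scores)
print(best)
```

A score of 0 would mean the discretized gene determines the class completely, while a score equal to the entropy of the target would mean it carries no information about it.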

Usage

discretize_gene_supervised(
  gene,
  target,
  output = "discretized_vector",
  discs = c(".split_vector_in_two_by_median", ".split_vector_in_two_by_mean",
    ".split_vector_by_kmeans", ".split_vector_in_three_by_mean_sd",
    ".split_vector_in_two_by_min_max_thresh"),
  vw_params = c(0.25, 0.5, 0.75),
  kmeans_centers = c(2, 3, 4),
  sd_alpha = c(0.75, 1, 1.25)
)

Arguments

gene

A previously normalized gene expression vector

target

A series of labels matching each of the values in the gene vector

output

If it is equal to 'discretized_vector', the output is the discretized vector. If it is 'su', a data frame is returned. Defaults to 'discretized_vector'.

discs

Names of the discretization functions to try. Defaults to c(".split_vector_in_two_by_median", ".split_vector_in_two_by_mean", ".split_vector_by_kmeans", ".split_vector_in_three_by_mean_sd", ".split_vector_in_two_by_min_max_thresh")

vw_params

Cutoff parameters for the varying width function. Defaults to 0.25, 0.5 and 0.75

kmeans_centers

Numeric vector with the number of centers to use for kmeans. Defaults to 2, 3 and 4

sd_alpha

Parameter for adjusting the 'medium' level of the mean +- sd discretization. Defaults to sd_alpha = c(0.75, 1, 1.25)
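
To make the roles of sd_alpha and vw_params more concrete, here is a rough base R sketch of what a mean +- sd split and a range-based cutoff could look like. The helpers `split_by_mean_sd` and `split_by_range_fraction` are hypothetical names introduced for illustration; in particular, reading the vw_params cutoff as a fraction of the min-max range is an assumption and may not match how the package's internal functions behave.

```r
# Hypothetical helper: three levels around mean +/- alpha * sd.
# A larger alpha widens the "medium" band.
# Assumes sd(x) > 0; otherwise the break points would not be unique.
split_by_mean_sd <- function(x, alpha = 1) {
  lower <- mean(x) - alpha * sd(x)
  upper <- mean(x) + alpha * sd(x)
  cut(x,
      breaks = c(-Inf, lower, upper, Inf),
      labels = c("low", "medium", "high"))
}

# Hypothetical helper: two levels, split at a fraction of the observed range
# (ASSUMPTION: this is one possible reading of the vw_params cutoff).
split_by_range_fraction <- function(x, cutoff = 0.5) {
  threshold <- min(x) + cutoff * (max(x) - min(x))
  factor(ifelse(x > threshold, "high", "low"), levels = c("low", "high"))
}

x <- c(0, 1, 2, 3, 10)

# Try each default-style parameter value and inspect the level counts.
for (alpha in c(0.75, 1, 1.25)) {
  print(table(split_by_mean_sd(x, alpha = alpha)))
}
for (cutoff in c(0.25, 0.5, 0.75)) {
  print(table(split_by_range_fraction(x, cutoff = cutoff)))
}
```

Passing several values for each parameter, as the defaults do, gives the function several candidate discretizations per method to choose from.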

Details

Note that a seed for random values has to be set for reproducibility. Otherwise, the "kmeans" value might vary from iteration to iteration.
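
The effect of the seed can be checked with base R's kmeans directly. The following is a small sketch on simulated data, not on the package's internals; the wrapper `cluster_once` is a hypothetical helper used only to fix the seed before each call.

```r
# Simulated expression values from two groups (illustrative data only).
set.seed(42)
x <- c(rnorm(30, mean = 0), rnorm(30, mean = 5))

# Hypothetical wrapper: fix the seed, then cluster the values.
cluster_once <- function(values, centers, seed) {
  set.seed(seed)
  kmeans(values, centers = centers)$cluster
}

first  <- cluster_once(x, centers = 3, seed = 3)
second <- cluster_once(x, centers = 3, seed = 3)

# Same seed, same random starting centers, same cluster assignment.
print(identical(first, second))
```

Without the set.seed() call, kmeans picks its starting centers at random, so cluster labels (and sometimes cluster membership) can change between runs.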

Value

A data frame with the discretized features, in the same order as in the input

Examples

 data(scDengue)
 exprs <- as.data.frame(SummarizedExperiment::assay(scDengue, 'logcounts'))
 gene <- exprs['ENSG00000166825',]
 infection <- SummarizedExperiment::colData(scDengue)
 target <- infection$infection
 set.seed(3)
 discrete_expression <- as.data.frame(discretize_gene_supervised(gene, target))
 table(discrete_expression)

FCBF documentation built on Nov. 8, 2020, 8:30 p.m.