training: Main model training function for finding the best model that...


View source: R/MainLassoLDATraining.R

Description

Trains on half of all cells to find the optimal ElasticNet and LDA models for predicting a subpopulation.

Usage

training(genes = NULL, cluster_mixedpop1 = NULL, mixedpop1 = NULL,
  mixedpop2 = NULL, c_selectID = NULL, listData = list(),
  out_idx = 1, standardize = TRUE, trainset_ratio = 0.5,
  LDA_run = FALSE)

Arguments

genes

a vector of gene names (for ElasticNet shrinkage); gene symbols must be in the same format as the gene names in subpop2. Genes should be listed in order of importance, e.g. the most significant differentially expressed genes first, because if the gene list contains too many genes, only the top 500 are used.

cluster_mixedpop1

a vector of cluster assignments for the cells in mixedpop1

mixedpop1

a SingleCellExperiment object from the training mixed population

mixedpop2

a SingleCellExperiment object from the target mixed population

c_selectID

a number specifying which subpopulation to use for training

listData

a list in which to store the output

out_idx

a number specifying the index at which results are written into the output list. This is needed when running bootstraps.

standardize

a logical value specifying whether or not to standardize the training matrix

trainset_ratio

a number specifying the proportion of cells to be part of the training subpopulation

LDA_run

a logical value specifying whether to also run LDA for comparison with ElasticNet
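
As a rough illustration of how genes, c_selectID and trainset_ratio interact, the following base R sketch truncates a ranked gene list to its top 500 entries and samples a fraction of one subpopulation's cells for training. The data are simulated, and the variable names (ranked_genes, clusters, max_genes, train_idx, holdout_idx) are assumptions made for this example; this is not the package's internal code, which may select cells differently.

```r
# Illustrative sketch only: simulated data, not the scGPS implementation.
set.seed(1)

# Hypothetical ranked gene list (most important first) and cluster labels.
ranked_genes <- paste0("gene", seq_len(800))
clusters <- sample(1:3, size = 120, replace = TRUE)

# Keep at most the top 500 genes, preserving their order of importance.
max_genes <- 500
genes_used <- head(ranked_genes, max_genes)

# Pick the cells of the selected subpopulation, then sample a
# trainset_ratio fraction of them as the training subset.
c_selectID <- 1
trainset_ratio <- 0.5
subpop_idx <- which(clusters == c_selectID)
n_train <- round(length(subpop_idx) * trainset_ratio)
train_idx <- subpop_idx[sample.int(length(subpop_idx), n_train)]
holdout_idx <- setdiff(subpop_idx, train_idx)

length(genes_used)
length(train_idx)
length(holdout_idx)
```

Because the list is truncated from the top, the order in which genes are supplied determines which ones survive the cut.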

Value

a list with the prediction results written into the position indexed by out_idx
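
The listData and out_idx arguments suggest an accumulate-by-index pattern for bootstrap runs: the list returned by one run is passed into the next, and each run writes at its own index. The sketch below mimics that pattern in base R with a hypothetical stand-in function, fake_training, and made-up accuracy values. It assumes that a component such as Accuracy is indexed by out_idx, which is a guess based on the example output names rather than a statement about the package internals.

```r
# Illustrative sketch only: `fake_training` is a hypothetical stand-in
# that mimics the listData/out_idx contract, not the scGPS function.
fake_training <- function(listData = list(), out_idx = 1) {
  # Create the result slot on the first run.
  if (is.null(listData$Accuracy)) {
    listData$Accuracy <- list()
  }
  # Store one made-up result per run at position out_idx.
  listData$Accuracy[[out_idx]] <- runif(1)
  listData
}

set.seed(42)
n_boot <- 3
listData <- list()
for (i in seq_len(n_boot)) {
  # Pass the growing list back in so earlier runs are preserved.
  listData <- fake_training(listData = listData, out_idx = i)
}

length(listData$Accuracy)
```

Passing listData = list() on every call, as a single run does, would discard the results of earlier iterations.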

Author(s)

Quan Nguyen, 2017-11-25

Examples

library(scGPS)

c_selectID <- 1
out_idx <- 1

# Build the training (day 2) and target (day 5) mixed-population objects
day2 <- day_2_cardio_cell_sample
mixedpop1 <- new_scGPS_object(ExpressionMatrix = day2$dat2_counts,
    GeneMetadata = day2$dat2geneInfo, CellMetadata = day2$dat2_clusters)
day5 <- day_5_cardio_cell_sample
mixedpop2 <- new_scGPS_object(ExpressionMatrix = day5$dat5_counts,
    GeneMetadata = day5$dat5geneInfo, CellMetadata = day5$dat5_clusters)

# Gene list used for ElasticNet shrinkage
genes <- training_gene_sample$Merged_unique

# Train on subpopulation c_selectID of mixedpop1
listData <- training(genes,
    cluster_mixedpop1 = colData(mixedpop1)[, 1],
    mixedpop1 = mixedpop1, mixedpop2 = mixedpop2,
    c_selectID = c_selectID, listData = list(), out_idx = out_idx,
    trainset_ratio = 0.5)
names(listData)
listData$Accuracy

scGPS documentation built on Nov. 8, 2020, 5:22 p.m.