callOverallSelectedFeatures: Wrapper to call selected features

View source: R/helper.R

Description

Wrapper to call selected features

Usage

callOverallSelectedFeatures(
  featScores,
  featureSelCutoff,
  featureSelPct,
  cleanNames = TRUE
)

Arguments

featScores

(list of lists) Feature scores across all splits, grouped by patient label. First level: patient labels. Second level: matrix of scores for the corresponding label.

featureSelCutoff

(integer) Cutoff score for feature selection. To pass, a feature must score at least this value in the specified fraction of splits (see featureSelPct).

featureSelPct

(numeric between 0 and 1) Cutoff fraction for feature selection. To pass, a feature must score at least featureSelCutoff in this fraction of train/test splits.

cleanNames

(logical) If TRUE, remove internal suffixes for human readability.

Details

Calls features that are consistently high-scoring for predicting each class. The context is as follows: the original model runs feature selection over multiple train/test splits of the data, and each split generates scores for all features. This function identifies features whose scores reach a threshold in at least a given fraction of the train/test splits; both the threshold (featureSelCutoff) and the fraction (featureSelPct) are user-specified. This function is called by the wrapper getResults(), which returns both the matrix of feature scores across splits and the list of features that pass the user-specified cutoffs.
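The selection rule described above can be illustrated with a small base R sketch. This is not the netDx implementation; it only assumes the rule "score at least featureSelCutoff in at least featureSelPct of splits". The toy matrix, the feature names, and the variables fracPassing and selected are hypothetical.

```r
# Hypothetical toy data: 4 features (rows) scored in 5 splits (columns).
scores <- matrix(
  c(9, 8, 7, 9, 2,   # FEAT_A: reaches the cutoff in 4 of 5 splits
    1, 2, 6, 0, 3,   # FEAT_B: reaches the cutoff in 1 of 5 splits
    5, 5, 4, 4, 5,   # FEAT_C: reaches the cutoff in 3 of 5 splits
    0, 0, 0, 0, 0),  # FEAT_D: never reaches the cutoff
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("FEAT_", LETTERS[1:4]), paste0("Split", 1:5))
)

featureSelCutoff <- 5    # minimum score a feature needs within a split
featureSelPct    <- 0.5  # minimum fraction of splits that must reach it

# Fraction of splits in which each feature scores at least the cutoff.
# Assumption: the comparison is inclusive (>=), following the wording
# "minimum of this score" in the argument description.
fracPassing <- rowMeans(scores >= featureSelCutoff)

# Features that reach the cutoff in a large enough fraction of splits.
selected <- names(fracPassing)[fracPassing >= featureSelPct]
print(selected)
```

With these toy values, FEAT_A and FEAT_C are selected. In the real function this kind of rule would be applied once per patient label, yielding one set of selected features per class.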

Value

(list) Feature scores for all splits, plus the features passing selection for the overall predictor:

featScores: (matrix) feature scores for each split

selectedFeatures: (list) features passing selection for each class; one key per class

Examples

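library(netDx)

# Simulate scores for 100 pathways across 10 train/test splits,
# separately for each of two patient labels.
pathways <- paste0("PATHWAY_", 1:100)
highrisk <- list()
lowrisk <- list()
for (k in 1:10) {
  # Random integer scores from 0 to 9 for each pathway in this split
  highrisk[[k]] <- data.frame(
    PATHWAY_NAME = pathways,
    SCORE = floor(runif(length(pathways), min = 0, max = 10)),
    stringsAsFactors = FALSE
  )
  lowrisk[[k]] <- data.frame(
    PATHWAY_NAME = pathways,
    SCORE = floor(runif(length(pathways), min = 0, max = 10)),
    stringsAsFactors = FALSE
  )
}
names(highrisk) <- sprintf("Split%i", seq_along(highrisk))
names(lowrisk) <- sprintf("Split%i", seq_along(lowrisk))

# Call features scoring at least 5 in at least 50% of splits
callOverallSelectedFeatures(list(highrisk = highrisk, lowrisk = lowrisk), 5, 0.5)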

BaderLab/netDx documentation built on Sept. 26, 2021, 9:13 a.m.