PSOL_NegativeExpansion: Negative expansion


Description

This function expands the negative sample set using the PSOL algorithm.
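The general idea of negative expansion can be sketched with a toy loop in base R. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, a logistic regression (glm) is swapped in for the random forest classifier, the seed negatives are chosen by a simple heuristic, and the cutoff rule (the 2% quantile of positive scores, mimicking TPR = 0.98) is an assumption.

```r
set.seed(42)
# Toy data (all made up): one feature, 30 positives and 200 unlabeled samples
featureMat <- matrix(c(rnorm(30, mean = 2), rnorm(200, mean = 0)), ncol = 1,
                     dimnames = list(paste0("s", 1:230), "f1"))
positives <- rownames(featureMat)[1:30]
unlabels  <- rownames(featureMat)[31:230]

# Seed negatives (assumption): the 30 unlabeled samples with the lowest feature value
negatives <- unlabels[order(featureMat[unlabels, 1])][1:30]
unlabels  <- setdiff(unlabels, negatives)
nInitial  <- length(negatives)

for (i in seq_len(10)) {
  # Train a stand-in classifier on positives (1) versus current negatives (0)
  train <- data.frame(f1 = featureMat[c(positives, negatives), 1],
                      y  = rep(c(1, 0), c(length(positives), length(negatives))))
  fit <- suppressWarnings(glm(y ~ f1, data = train, family = binomial))
  score <- function(ids) {
    predict(fit, data.frame(f1 = featureMat[ids, 1]), type = "response")
  }
  # Cutoff that keeps roughly 98% of the positives above it
  cutoff <- quantile(score(positives), probs = 1 - 0.98, names = FALSE)
  # Unlabeled samples scoring below the cutoff become new negatives
  newNeg <- unlabels[score(unlabels) < cutoff]
  if (length(newNeg) == 0) break
  negatives <- c(negatives, newNeg)
  unlabels  <- setdiff(unlabels, newNeg)
}

length(negatives)  # size of the expanded negative set
```

In this sketch the loop stops early once no unlabeled sample falls below the cutoff; the iterator argument of the real function bounds the number of rounds in a comparable way.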

Usage

PSOL_NegativeExpansion(featureMat, positives, negatives, unlabels, cpus = 1, 
                       iterator = 50, cross = 5, TPR = 0.98, method = "randomForest", 
                       plot = TRUE, trace = TRUE, PSOLResDic, ...)

Arguments

featureMat

a feature matrix recording the feature values for all samples.

positives

a character vector recording the names of the positive samples.

negatives

a character vector recording the names of the negative samples.

unlabels

a character vector recording the names of the unlabeled samples.

cpus

an integer value, the number of CPUs to use.

iterator

an integer value, the number of iterations.

cross

an integer value, the number of folds used for cross-validation.
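To make the meaning of this argument concrete, the following sketch assigns samples to folds in base R. The sample names and the assignment scheme are illustrative assumptions; the package may partition samples differently.

```r
set.seed(1)
# Hypothetical sample names; in practice these would be row names of featureMat
sampleNames <- paste0("gene", 1:20)
cross <- 5

# Randomly assign each sample to one of `cross` folds of near-equal size
foldId <- sample(rep(seq_len(cross), length.out = length(sampleNames)))
folds  <- split(sampleNames, foldId)

# Each fold serves once as the test set while the remaining folds train the model
lengths(folds)
```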

TPR

a numeric value ranging from 0 to 1.0, the true positive rate used to select the prediction score cutoff.
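One way a TPR value can translate into a score cutoff is sketched below: the cutoff is the lowest score among the top-scoring fraction of positive samples, so that at least that fraction of positives scores at or above it. The helper tprCutoff and the scores are hypothetical and not part of mlDNA; the package's internal rule may differ.

```r
# Hypothetical helper (not part of mlDNA): choose a cutoff so that at least
# a fraction `TPR` of the positive samples score at or above it.
tprCutoff <- function(positiveScores, TPR = 0.98) {
  stopifnot(TPR > 0, TPR <= 1, length(positiveScores) > 0)
  sortedScores <- sort(positiveScores, decreasing = TRUE)
  # Number of positives that must be retained; the small offset guards
  # against floating-point error in the product
  nKeep <- ceiling(TPR * length(sortedScores) - 1e-9)
  sortedScores[nKeep]
}

# Made-up prediction scores for ten positive samples
posScores <- c(0.95, 0.91, 0.88, 0.86, 0.80, 0.77, 0.74, 0.60, 0.52, 0.30)
cutoff <- tprCutoff(posScores, TPR = 0.9)

# Fraction of positives retained at this cutoff
mean(posScores >= cutoff)
```

A higher TPR lowers the cutoff, so fewer unlabeled samples fall below it and the negative set grows more conservatively.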

method

a character string, the machine learning method.

plot

a logical value specifying whether the score distribution of positive and unlabeled samples will be plotted.

trace

a logical value. If TRUE, the intermediate results are saved in ".RData" format.

PSOLResDic

a character string, the directory in which the PSOL results are stored.

...

Further parameters used in PSOL_ExpandSelection. See the further parameters of the function classifier.

Value

The PSOL-related results are written to the directory specified by PSOLResDic.
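Because the results are written to disk rather than returned, they have to be read back from the result directory. The sketch below shows one way to list and load ".RData" files with base R. The directory and the file "example.RData" are placeholders created so the code runs on its own; the actual file names produced by the package are not stated here and are not assumed.

```r
# Stand-in directory; in practice this would be the path passed as PSOLResDic
PSOLResDic <- file.path(tempdir(), "PSOL_sketch")
dir.create(PSOLResDic, showWarnings = FALSE, recursive = TRUE)

# Placeholder file, NOT a real PSOL output, so the example is self-contained
dummy <- 1:3
save(dummy, file = file.path(PSOLResDic, "example.RData"))

# List every .RData file in the result directory
resFiles <- list.files(PSOLResDic, pattern = "\\.RData$", full.names = TRUE)

# Load the first file into a separate environment to avoid overwriting objects
resEnv <- new.env()
loadedNames <- load(resFiles[1], envir = resEnv)
loadedNames  # names of the objects stored in that file
```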

Author(s)

Chuang Ma, Xiangfeng Wang.

Examples

## Not run: 

   ##generate expression feature matrix
   sampleVec1 <- c(1, 2, 3, 4, 5, 6)
   sampleVec2 <- c(1, 2, 3, 4, 5, 6)
   featureMat <- expFeatureMatrix( expMat1 = ControlExpMat, 
                                   sampleVec1 = sampleVec1, 
                                   expMat2 = SaltExpMat, 
                                   sampleVec2 = sampleVec2, 
                                   logTransformed = TRUE, 
                                   base = 2,
                                   features = c("zscore", 
                                   "foldchange", "cv", 
                                   "expression"))

   ##positive samples
   positiveSamples <- as.character(sampleData$KnownSaltGenes)
   ##unlabeled samples
   unlabelSamples <- setdiff( rownames(featureMat), positiveSamples )
  
   ##selecting an initial set of negative samples 
   ##for building ML-based classification model
   ##suppose the PSOL results will be stored in:
   PSOLResDic <- "/home/wanglab/mlDNA/PSOL/"
   res <- PSOL_InitialNegativeSelection(featureMatrix = featureMat, 
                                        positives = positiveSamples, 
                                        unlabels = unlabelSamples, 
                                        negNum = length(positiveSamples), 
                                        cpus = 6, PSOLResDic = PSOLResDic)

   ##initial negative samples extracted from unlabelled samples with PSOL algorithm
   negatives <- res$negatives

   ##negative sample expansion
   PSOL_NegativeExpansion(featureMat = featureMat, positives = positiveSamples, 
                          negatives = res$negatives, unlabels = res$unlabels, 
                          cpus = 2, iterator = 50, cross = 5, TPR = 0.98, 
                          method = "randomForest", plot = TRUE, trace = TRUE, 
                          PSOLResDic = PSOLResDic,
                          ntrees = 200 ) # parameters for ML-based classifier 


## End(Not run)

mlDNA documentation built on May 2, 2019, 2:15 p.m.