Filter non-informative trajectories

Description

Function to remove non-informative trajectories

Usage

filterNoise(data, noise, RTCutoff, RICutoff, propMissingCutoff, fcCutoff)


## S4 method for signature
## 'matrixOrframe,noise,missingOrnumeric,missingOrnumeric,missingOrnumeric,missingOrnumeric'
filterNoise(data, noise, RTCutoff, RICutoff, propMissingCutoff, fcCutoff)

Arguments

data

data.frame or matrix containing the samples as rows and features as columns.

noise

an object of class noise containing the time to molecule and individual to molecule sd ratios, the number of missing values, and the maximum fold changes.

RTCutoff

numeric; the R_T cutoff used to remove non-informative trajectories.

RICutoff

numeric; the R_I cutoff used to remove non-informative trajectories.

propMissingCutoff

numeric; the maximum proportion of missing values allowed in a trajectory.

fcCutoff

numeric; the minimum fold change that must be observed between the means of any two time points.
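
The last two arguments refer to per-feature summaries. As a rough illustration only, and not the package's internal code, the proportion of missing values and the largest fold change between time-point means could be computed in base R as sketched below. The toy matrix x, the vector time, and the helper maxFoldChange are hypothetical, and the fold change is assumed here to be a ratio of means on the raw scale; the package may define it differently (for example on log-transformed data).

```r
# Toy data: 6 samples (rows) x 2 features (columns), measured at 3 time points.
# All names here are illustrative and not part of the package.
time <- c(1, 1, 2, 2, 3, 3)
x <- cbind(
  featA = c(1, 1, 2, 2, 4, 4),
  featB = c(5, NA, 5, 5, NA, 5)
)

# Proportion of missing values per feature (column).
propMissing <- colMeans(is.na(x))

# Assumed definition: largest ratio between the means of any two time points.
maxFoldChange <- function(values, time) {
  timeMeans <- tapply(values, time, mean, na.rm = TRUE)
  max(timeMeans) / min(timeMeans)
}
foldChange <- apply(x, 2, maxFoldChange, time = time)

print(propMissing)
print(foldChange)
```

Under these assumptions featA changes fourfold across time points while featB stays flat, so a fold-change cutoff above 1 would flag featB as non-informative.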

Details

filterNoise removes noisy or non-informative profiles based on the selected thresholds for R_I and R_T (Straube et al. 2015), the maximum fold change, and/or the proportion of missing values.
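
The following is a minimal sketch of this kind of threshold filtering. It assumes that a feature is kept only when its R_T, R_I and proportion of missing values do not exceed their cutoffs and its fold change reaches the fold-change cutoff. The vectors and the helper thresholdFilter are hypothetical stand-ins for the values stored in a noise object; the actual filterNoise implementation may combine the criteria differently.

```r
# Hypothetical per-feature summaries (one entry per column of the data).
RT          <- c(0.2, 1.5, 0.4, 0.8)
RI          <- c(0.1, 0.2, 0.6, 0.2)
propMissing <- c(0.0, 0.1, 0.0, 0.7)
foldChange  <- c(2.0, 1.1, 3.0, 2.5)

# Toy data: 5 samples (rows) x 4 features (columns).
x <- matrix(seq_len(20), nrow = 5,
            dimnames = list(NULL, paste0("feat", 1:4)))

# Illustrative helper, not the package implementation.
# Assumption: small R_T, small R_I and few missing values mark informative features.
thresholdFilter <- function(x, RT, RI, propMissing, foldChange,
                            RTCutoff = Inf, RICutoff = Inf,
                            propMissingCutoff = 1, fcCutoff = 0) {
  keep <- RT <= RTCutoff & RI <= RICutoff &
    propMissing <= propMissingCutoff & foldChange >= fcCutoff
  list(data = x[, keep, drop = FALSE], removedIndices = which(!keep))
}

res <- thresholdFilter(x, RT, RI, propMissing, foldChange,
                       RTCutoff = 0.9, RICutoff = 0.3,
                       propMissingCutoff = 0.5)
print(colnames(res$data))
print(res$removedIndices)
```

The defaults in the sketch make every criterion optional, which mirrors the "and/or" wording above: a cutoff that is not supplied does not remove anything.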

Value

filterNoise returns an object of class list containing the following components:

data

numeric; the filtered data.

removedIndices

numeric; the indices of the removed trajectories.

References

Straube J., Gorse A.-D., Huang B.E., Le Cao K.-A. (2015). A linear mixed model spline framework for analyzing time course 'omics' data. PLoS ONE, 10(8), e0134540.

See Also

investNoise

Examples

## Not run: 
data(kidneySimTimeGroup)
# Select the samples that belong to group G1
G1 <- kidneySimTimeGroup$group == "G1"
# Estimate the noise measures for each feature
noiseTest <- investNoise(data = kidneySimTimeGroup$data[G1, ],
                         time = kidneySimTimeGroup$time[G1],
                         sampleID = kidneySimTimeGroup$sampleID[G1])
# Filter with fixed cutoffs and keep the filtered data
data <- filterNoise(data = kidneySimTimeGroup$data[G1, ], noise = noiseTest,
                    RTCutoff = 0.9, RICutoff = 0.3,
                    propMissingCutoff = 0.5)$data

# Alternatively, model-based clustering can be used for filtering
library(mclust)
clusterFilter <- Mclust(cbind(noiseTest@RT, noiseTest@RI), G = 2)
plot(clusterFilter, what = "classification")
# Keep the features in the cluster with the lowest mean R_T
meanRTCluster <- tapply(noiseTest@RT, clusterFilter$classification, mean)
bestCluster <- names(meanRTCluster[which.min(meanRTCluster)])
filterdata <- kidneySimTimeGroup$data[G1, clusterFilter$classification == bestCluster]

## End(Not run)