filterTaxonMatrix: Filter taxa in an abundance matrix

Description

Discards taxa that occur in fewer than the given minimum number of samples. The function originates from the seqtime package.

Usage

filterTaxonMatrix(x, minocc = 0, dependency = FALSE, keepSum = FALSE,
    return.filtered.indices = FALSE)

Arguments

x

Taxon abundance matrix; rows are taxa, columns are samples.

minocc

Minimum occurrence: the minimum number of samples in which a taxon must have non-zero abundance to be kept. Taxa present in fewer samples are discarded.
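The occurrence count behind minocc can be reproduced in base R. The following is a minimal sketch on a made-up matrix; the taxon and sample names are hypothetical and the code does not call the package itself.

```r
# Toy abundance matrix: 3 taxa (rows) x 4 samples (columns); values are made up
x <- matrix(c(5, 0, 2, 1,
              0, 0, 3, 0,
              4, 1, 1, 2),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("taxonA", "taxonB", "taxonC"),
                            paste0("s", 1:4)))

# Occurrence = number of samples in which a taxon has non-zero abundance
occurrence <- rowSums(x > 0)

# With minocc = 2, taxa seen in fewer than 2 samples are dropped
minocc <- 2
kept <- x[occurrence >= minocc, , drop = FALSE]
```

Here taxonB occurs in a single sample and is the only taxon removed.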

dependency

If TRUE, additionally remove all taxa whose periodogram, in log scale, has a slope above -0.5 or is non-linear. Samples are assumed to represent equidistant time points.
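To illustrate what a periodogram slope in log scale looks like, the sketch below estimates it for two simulated series. It is only an illustration: the helper periodogram_slope is hypothetical, and using stats::spec.pgram with a linear fit is an assumption, not necessarily how the package classifies taxa.

```r
# Hypothetical helper (not part of the package): slope of the periodogram
# in log-log scale for one equidistant time series
periodogram_slope <- function(series) {
  pg <- stats::spec.pgram(series, taper = 0, plot = FALSE)
  fit <- stats::lm(log(as.vector(pg$spec)) ~ log(pg$freq))
  unname(stats::coef(fit)[2])
}

set.seed(1)
white <- rnorm(512)          # uncorrelated noise: slope expected near 0
brown <- cumsum(rnorm(512))  # random walk: slope expected to be strongly negative

slope_white <- periodogram_slope(white)
slope_brown <- periodogram_slope(brown)
```

Under the rule described above, a series like white (slope above -0.5) would be removed, while a series like brown would be kept.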

keepSum

If TRUE, the discarded rows are summed column-wise and the sum is appended as a final row named summed-nonfeat-rows.
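The effect of keepSum can be sketched in base R as follows. The matrix and the discarded row indices are made up for illustration, and the code mirrors the described behaviour rather than calling the package.

```r
# Toy abundance matrix: 4 taxa (rows) x 4 samples (columns); values are made up
x <- matrix(c(5, 0, 2, 1,
              0, 0, 3, 0,
              0, 1, 0, 0,
              4, 1, 1, 2),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("taxonA", "taxonB", "taxonC", "taxonD"),
                            paste0("s", 1:4)))

# Assume taxonB and taxonC (rows 2 and 3) were selected for removal
discard <- c(2, 3)
kept <- x[-discard, , drop = FALSE]

# Sum the discarded rows per sample and append them as one extra row
summed <- colSums(x[discard, , drop = FALSE])
result <- rbind(kept, summed)
rownames(result)[nrow(result)] <- "summed-nonfeat-rows"
```

Because the discarded counts are retained in the extra row, the per-sample totals of result equal those of x.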

return.filtered.indices

If TRUE, return a list with the filtered abundance matrix in mat and the indices (in the original matrix) of the removed taxa in filtered.indices.

Value

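The filtered abundance matrix or, if return.filtered.indices is TRUE, a list with the elements mat (the filtered abundance matrix) and filtered.indices (the indices of the removed taxa).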

Examples

## The function is currently defined as
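function (x, minocc = 0, dependency = FALSE, keepSum = FALSE,
    return.filtered.indices = FALSE)
{
    toFilter = c()
    # Binarize a copy of x and count, per taxon, the samples with non-zero abundance
    xcopy = x
    xcopy[xcopy > 0] = 1
    rowsums = apply(xcopy, 1, sum)
    # Mark taxa that occur in fewer than minocc samples for removal
    toFilter = which(rowsums < minocc)
    if (dependency == TRUE) {
        # Keep only taxa classified as pink, brown or black noise;
        # a taxon failing both criteria appears twice in toFilter
        nt = identifyNoisetypes(x, epsilon = 0.5)
        toKeep = c(nt$pink, nt$brown, nt$black)
        toFilter = c(toFilter, setdiff(c(1:nrow(x)), toKeep))
    }
    indices.tokeep = setdiff(c(1:nrow(x)), toFilter)
    if (keepSum == TRUE) {
        # Sum the discarded rows per sample and append the sum as an extra row
        filtered = x[toFilter, ]
        x = x[indices.tokeep, ]
        rownames = rownames(x)
        sums.filtered = apply(filtered, 2, sum)
        x = rbind(x, sums.filtered)
        rownames = append(rownames, "summed-nonfeat-rows")
        rownames(x) = rownames
    }
    else {
        # Drop the discarded rows
        x = x[indices.tokeep, ]
    }
    if (return.filtered.indices == TRUE) {
        # Return the filtered matrix together with the indices of the removed taxa
        res = list(x, toFilter)
        names(res) = c("mat", "filtered.indices")
        return(res)
    }
    else {
        return(x)
    }
}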
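
A call on a small matrix might look as follows. This is a sketch under the assumption that filterTaxonMatrix has already been loaded into the session; the guard skips the call otherwise, and the toy data are made up.

```r
# Toy abundance matrix: 4 taxa (rows) x 4 samples (columns); values are made up
x <- matrix(c(5, 0, 2, 1,
              0, 0, 3, 0,
              0, 1, 0, 0,
              4, 1, 1, 2),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("taxonA", "taxonB", "taxonC", "taxonD"),
                            paste0("s", 1:4)))

# Only run the call if the function is available in the current session
if (exists("filterTaxonMatrix", mode = "function")) {
  res <- filterTaxonMatrix(x, minocc = 2, keepSum = TRUE,
                           return.filtered.indices = TRUE)
  print(res$mat)               # kept taxa plus the summed-nonfeat-rows row
  print(res$filtered.indices)  # row indices of the removed taxa in x
}
```

In this toy matrix two taxa occur in fewer than two samples, so with minocc = 2 they are the rows expected to be removed and summed.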

ConesaLab/MDM documentation built on Aug. 1, 2020, 11:47 a.m.