View source: R/replicateStructure.R
replicateStructure | R Documentation |
This function was designed for mining annotation information organized in multiple columns to identify the (potential) grouping of multiple samples, i.e. to determine factor levels.
The argument method
allows further fine-tuning: whether a high or low number of groups should be preferred, whether multiple columns may be combined, or whether a particular custom column should be chosen for designating factor levels.
replicateStructure(
x,
method = "median",
sep = "__",
exclNoRepl = TRUE,
trimNames = FALSE,
includeOther = FALSE,
silent = FALSE,
callFrom = NULL,
debug = FALSE
)
x |
(matrix or data.frame) the annotation to inspect; each column is supposed to describe another set of annotation/metadata for the rows of |
method |
(character, length=1) the procedure to choose column(s) with properties of information, may be |
sep |
(character) separator used when a method combining multiple columns (e.g. combAll, combNonOrth) is chosen (should not appear anywhere in |
exclNoRepl |
(logical) decide whether columns with all values different (i.e. no replicates, or maximum divergence) should be excluded |
trimNames |
(logical) optional trimming of names in |
includeOther |
(logical) include $allCols with pattern of (all) other columns |
silent |
(logical) suppress messages |
callFrom |
(character) allow easier tracking of messages produced |
debug |
(logical) additional messages for debugging |
Statistical tests require specifying which samples should be considered replicates of each other. In some cases, like the Sdrf-format, automatic mining of such annotation to identify an experiment's underlying structure of replicates may be challenging, since the key information may not always be found in the same column. For this reason this function allows inspecting all columns of a matrix or data.frame to identify which columns may serve to describe groups of replicates.
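The idea of inspecting each column for a replicate pattern can be illustrated in base R. This is only a minimal sketch and not the code used by replicateStructure; the table `annot` and the names `nLev`, `hasRepl` and `levPattern` are hypothetical and chosen for illustration.

```r
# Hypothetical annotation table (not from the package): 6 samples, 3 columns
annot <- data.frame(
  id    = paste0("s", 1:6),                   # unique per row, no replicates
  cond  = rep(c("ctrl", "treat"), each = 3),  # 2 groups of 3
  batch = rep(c("A", "B", "C"), times = 2),   # 3 groups of 2
  stringsAsFactors = FALSE
)

# Number of distinct values (candidate factor levels) per column
nLev <- vapply(annot, function(v) length(unique(v)), integer(1))

# A column can describe replicates only if at least one value occurs more than once
hasRepl <- vapply(annot, function(v) any(duplicated(v)), logical(1))

# Abstract level labels: index of the first occurrence of each value
levPattern <- lapply(annot, function(v) match(v, unique(v)))

nLev
hasRepl
levPattern$batch
```

In this sketch the column `id` would be a candidate for exclusion because every row carries a different value, while `cond` and `batch` both describe groups of replicates with a different number of levels.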
The argument exclNoRepl=TRUE
allows excluding all columns with different content for each line (like line-numbers), i.e. information without any replicates.
It is set by default to TRUE
to exclude such columns, since statistical tests usually do require some replicates.
When using method="combAll"
, there is a risk that all lines (samples) will be considered different and no replicates remain.
To avoid this situation the argument can be set to method="combNonOrth"
.
In this mode the function checks whether adding more columns would lead to a complete loss of replicates and, if so, omits the columns concerned.
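The difference between combining all columns and combining only those that preserve replicates can be sketched in base R. The greedy loop below is an assumption made for illustration, not the package's actual implementation of "combNonOrth"; the table `annot` and the names `combAll`, `keep` and `grp` are hypothetical.

```r
# Hypothetical annotation: 8 samples described by 3 columns
annot <- data.frame(
  cond = rep(c("ctrl", "treat"), each = 4),
  time = rep(rep(c("t0", "t1"), each = 2), times = 2),
  pos  = rep(c("a", "b"), times = 4),
  stringsAsFactors = FALSE
)

# Combining all columns (separator as in the default of 'sep'):
# here every sample ends up in its own group, so no replicates remain
combAll <- do.call(paste, c(annot, sep = "__"))
any(duplicated(combAll))

# Greedy alternative: add a column only if some replicates survive the combination
keep <- character(0)
for (cn in names(annot)) {
  trial <- do.call(paste, c(annot[, c(keep, cn), drop = FALSE], sep = "__"))
  if (any(duplicated(trial))) keep <- c(keep, cn)
}

# Grouping based on the retained columns only
grp <- do.call(paste, c(annot[, keep, drop = FALSE], sep = "__"))
keep
table(grp)
```

In this sketch the column `pos` is skipped, since including it would make all eight combined labels unique; the result of such a greedy scheme depends on the order in which columns are tried.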
This function returns a list with $col (column index relative to x
), $lev (abstract labels of level),
$meth (note of method finally used) and $allCols with general replicate structure of all columns of x
duplicated
, uses trimRedundText
## a is all different, b is groups of 2,
## c & d are groups of 2 but NOT 'same general' pattern as b
strX <- data.frame(a=letters[18:11], b=letters[rep(c(3:1,4), each=2)],
c=letters[rep(c(5,8:6), each=2)], d=letters[c(1:2,1:3,3:4,4)],
e=letters[rep(c(4,8,4,7),each=2)], f=rep("z",8) )
strX
replicateStructure(strX[,1:2])
replicateStructure(strX[,1:4], method="combAll")
replicateStructure(strX[,1:4], method="combAll", exclNoRepl=FALSE)
replicateStructure(strX[,1:4], method="combNonOrth", exclNoRepl=TRUE)
replicateStructure(strX, method="lowest")
replicateStructure(strX, method=3, includeOther=TRUE) # custom choice of 3rd column