replicateStructure: Search and Select Groups of Replicates

Description

This function was designed for mining annotation information organized in multiple columns to identify the (potential) grouping of multiple samples, i.e. to determine factor levels. The argument method allows further fine-tuning: whether a high or a low number of groups should be preferred, whether multiple columns may be combined, or whether a particular custom column should be chosen for designating factor levels.

Usage

replicateStructure(
  x,
  method = "median",
  sep = "__",
  exclNoRepl = TRUE,
  trimNames = FALSE,
  includeOther = FALSE,
  silent = FALSE,
  callFrom = NULL,
  debug = FALSE
)

Arguments

x

(matrix or data.frame) the annotation to inspect; each column is supposed to describe another set of annotation/metadata for the rows of x (min 1 row and 1 column)

method

(character, length=1) the procedure for choosing the column(s) that carry the grouping information; may be highest or max (maximum number of levels), lowest or min (minimum number of levels), median (median of all options for number of levels), combAll (combine all columns of x) or combNonOrth (combine only non-orthogonal columns of x, to avoid ending up with n lines with n levels); lazy evaluation of the argument is possible

sep

(character) separator used when a method combining multiple columns (e.g. combAll, combNonOrth) is chosen (should not appear anywhere in x)

exclNoRepl

(logical) decide whether columns where all values are different (i.e. no replicates, or maximum divergence) should be excluded

trimNames

(logical) optional trimming of names in x by removing redundant leading and trailing text

includeOther

(logical) include $allCols, with the pattern of (all) other columns, in the output

silent

(logical) suppress messages

callFrom

(character) allow easier tracking of messages produced

debug

(logical) additional messages for debugging

Details

Statistical tests require specifying which samples should be considered replicates of each other. In some cases, like the Sdrf-format, automatic mining of such annotation to identify an experiment's underlying structure of replicates may be challenging, since the key information may not always be found in the same column. For this reason this function allows inspecting all columns of a matrix or data.frame to identify which columns may serve to describe groups of replicates.

The argument exclNoRepl=TRUE allows excluding all columns with different content for each line (like line-numbers), i.e. information without any replicates. It is set to TRUE by default, since statistical tests usually do require some replicates.
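
The idea of excluding columns without replicates can be illustrated with base R alone. The following is only a minimal sketch, not the package's internal code; the table annot and its column names are hypothetical and were made up for this illustration.

```r
## Hypothetical annotation (assumption, not from the package): 6 samples, 3 columns
annot <- data.frame(
  id    = paste0("s", 1:6),                  # different on every line, no replicates
  treat = rep(c("ctrl", "drug"), each = 3),  # 2 groups of 3
  batch = rep(c("A", "B", "C"), times = 2)   # 3 groups of 2
)

## number of distinct values (candidate factor levels) per column
nLev <- sapply(annot, function(col) length(unique(col)))
nLev

## a column with as many distinct values as lines carries no replicates
noRepl <- nLev == nrow(annot)

## keep only the columns that may describe groups of replicates
annot[, !noRepl, drop = FALSE]
```

Here the column id would be dropped, while treat and batch remain as candidates for describing groups of replicates.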

With method="combAll" there is a risk that all lines (samples) will be considered different, so that no replicates remain. To avoid this situation the argument can be set to method="combNonOrth". In this mode the function checks whether adding further columns would lead to a complete loss of replicates and, if so, omits the columns concerned.
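
The loss of replicates when combining orthogonal columns can be reproduced with base R. This is a rough sketch under stated assumptions: the table annot is hypothetical, and the greedy loop is merely one simple way to skip columns that would remove all replicates; it is not claimed to be the procedure used by the package.

```r
## Hypothetical annotation (assumption): every column has replicates on its own
annot <- data.frame(
  grp  = c("a", "a", "b", "b"),
  cond = c("c1", "c1", "c2", "c2"),  # same pattern as grp
  run  = c("x", "y", "x", "y")       # orthogonal to grp
)
sep <- "__"   # same default separator as shown in Usage

## combining all columns: every line becomes unique, no replicates remain
combAll <- do.call(paste, c(annot, sep = sep))
any(duplicated(combAll))    # FALSE

## greedy sketch: add columns one by one, skip those that remove all replicates
keep <- character(0)
for (cn in names(annot)) {
  trial <- do.call(paste, c(annot[, c(keep, cn), drop = FALSE], sep = sep))
  if (any(duplicated(trial))) keep <- c(keep, cn)
}
keep                        # "grp" "cond"
```

In this toy case the column run is orthogonal to grp, so including it leaves four distinct lines; omitting it keeps two groups of two.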

Value

This function returns a list with $col (column index relative to x), $lev (abstract labels of level), $meth (note of method finally used) and $allCols with the general replicate structure of all columns of x (see argument includeOther).

See Also

duplicated, uses trimRedundText

Examples

## a is all different, b is groups of 2,
## c & d are groups of 2 but NOT 'same general' pattern as b
strX <- data.frame(a=letters[18:11], b=letters[rep(c(3:1,4), each=2)],
 c=letters[rep(c(5,8:6), each=2)], d=letters[c(1:2,1:3,3:4,4)],
 e=letters[rep(c(4,8,4,7),each=2)], f=rep("z",8) )
strX
replicateStructure(strX[,1:2])
replicateStructure(strX[,1:4], method="combAll")
replicateStructure(strX[,1:4], method="combAll", exclNoRepl=FALSE)
replicateStructure(strX[,1:4], method="combNonOrth", exclNoRepl=TRUE)
replicateStructure(strX, method="lowest")
replicateStructure(strX, method=3, includeOther=TRUE)   # custom choice of 3rd column




wrMisc documentation built on Sept. 11, 2024, 6:10 p.m.