makeNRedMatr: Make non-redundant matrix

makeNRedMatr    R Documentation

Make non-redundant matrix

Description

This function takes a matrix or data.frame 'dat' and summarizes redundant lines (as identified by the column given in argument iniID) using the method specified in summarizeRedAs. All lines sharing the same iniID are treated by the same approach (eg with 'maxOfRef' the values of all columns are taken from the line where the reference column is at its max). If no name is given for the reference column, the function will take the last numeric column (factors may be used; they will be read as levels).

Usage

makeNRedMatr(
  dat,
  summarizeRedAs,
  iniID = "iniID",
  retDataFrame = TRUE,
  nEqu = FALSE,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Arguments

dat

(matrix or data.frame) main input, ie the data to be made non-redundant

summarizeRedAs

(character) summarization method(s), typical choices are 'median', 'mean', 'min' or 'maxOfRef'. Basic usage like summarizeRedAs='mean' will pick the mean independently for each (numeric) column. It is also possible to specify a different method for each of the columns (the length of summarizeRedAs should then be equal to the number of numeric columns). Special methods (eg 'maxOfRef') look at a single reference column to decide which line should be picked and have its values reported; these are not compatible with specifying different methods for different columns.

iniID

(character) column-name used as reference for determining groups of redundant lines (default="iniID")

retDataFrame

(logical) if TRUE, check whether text-columns can be converted to numeric and return the result as data.frame

nEqu

(logical) if TRUE, add an additional column indicating the number of equal lines available for the choice (only with min or max)

silent

(logical) suppress messages

debug

(logical) additional messages for debugging

callFrom

(character) allows easier tracking of messages produced
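
As a rough illustration of the column-wise summarization described for summarizeRedAs, the following minimal sketch uses base R only; it does not call makeNRedMatr and does not claim to reproduce its exact output format. The toy data, the column names (ref, X1, X2) and the objects byMean and byMixed are assumptions made for this example.

```r
# Toy data (assumed for illustration): 'ref' plays the role of iniID,
# X1 and X2 are the numeric columns to summarize
dat <- data.frame(ref = c(11, 12, 11, 12, 13),
                  X1  = c(1, 2, 3, 6, 5),
                  X2  = c(10, 20, 30, 40, 50))

# Base-R analogue of summarizeRedAs="mean": each numeric column is
# averaged independently within each group of identical 'ref' values
byMean <- aggregate(dat[, c("X1", "X2")], by = list(ref = dat$ref), FUN = mean)
byMean

# Base-R analogue of one method per numeric column
# (here: median for X1, min for X2)
byMixed <- data.frame(ref = sort(unique(dat$ref)),
                      X1  = as.numeric(tapply(dat$X1, dat$ref, median)),
                      X2  = as.numeric(tapply(dat$X2, dat$ref, min)))
byMixed
```

In both cases the result has one line per distinct value of ref, which is the sense in which the output is 'non-redundant'.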

Details

When using this function to select a single initial line, give the character-string of argument summarizeRedAs a name (eg summ=c(X1="minOfRef")), so that the function will use ONLY the column specified via the name for determining which line should be used/kept.

It is possible to base the choice from 'redundant' lines on a single reference-column. For example, when summarizeRedAs='maxOfRef', summarizing of all (numeric) columns will be performed according to one single column (ie the line where the last numeric column is at its max). Otherwise, a name can be assigned to indicate the reference column to be used (eg see the last example using summarizeRedAs=c(X1='maxOfRef')).
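
The idea of choosing one line per group via a reference column can be sketched in base R as follows. This is only an illustration of the principle, not the code used inside makeNRedMatr; the helper pickMaxOfRef, its arguments idCol and refCol, and the toy data are hypothetical names introduced for this example.

```r
# Toy data (assumed for illustration): 'ref' is the grouping column,
# 'tx' a text column, X1 and X2 numeric columns
dat <- data.frame(ref = c(11, 12, 11, 12, 13),
                  tx  = c("a", "b", "c", "d", "e"),
                  X1  = c(1, 6, 3, 2, 5),
                  X2  = c(30, 20, 10, 40, 50),
                  stringsAsFactors = FALSE)

# Hypothetical helper (not part of wrMisc): for each group of identical
# idCol values keep the entire line where refCol is at its maximum.
# which.max() returns the first position in case of ties.
pickMaxOfRef <- function(x, idCol, refCol) {
  perGroup <- lapply(split(x, x[[idCol]]),
                     function(g) g[which.max(g[[refCol]]), , drop = FALSE])
  out <- do.call(rbind, perGroup)
  rownames(out) <- NULL
  out
}

# Reference column chosen by name (compare to summ=c(X1="maxOfRef"))
byX1 <- pickMaxOfRef(dat, idCol = "ref", refCol = "X1")
byX1

# Reference column is the last numeric column (compare to the default
# behaviour described for summ="maxOfRef")
byLast <- pickMaxOfRef(dat, idCol = "ref", refCol = "X2")
byLast
```

Note that the two calls keep different lines for the groups with ref equal to 11 and 12, which is why naming the reference column matters whenever the numeric columns do not peak on the same line.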

Value

This function returns a (numeric) matrix or data.frame with the summarized data and an additional column with the number of initial redundant lines

See Also

simple/partial functionality in summarizeCols, checkSimValueInSer

Examples

t3 <- data.frame(ref=rep(11:15,3),tx=letters[1:15],
  matrix(round(runif(30,-3,2),1),nc=2),stringsAsFactors=FALSE)
by(t3,t3[,1],function(x) x)
t(sapply(by(t3,t3[,1],function(x) x), summarizeCols, me="maxAbsOfRef"))
# calculate the mean of all columns for the lines concerned:
(xt3 <- makeNRedMatr(t3, summ="mean", iniID="ref"))
# choose lines based only on content of column 'X1' (here: max):
(xt3 <- makeNRedMatr(t3, summ=c(X1="maxOfRef"), iniID="ref")) 

wrMisc documentation built on Sept. 11, 2024, 6:10 p.m.