select.colnames.lsd    R Documentation

Select a subset of an LSD results matrix (by column/variable names)

Description

This function selects a subset of an LSD results matrix (as produced by read.raw.lsd) by the column (variable) names, considering only the name part of the column labels.

Usage

select.colnames.lsd( dataSet, col.names = NULL, instance = 0,
                     check.names = TRUE, posit = NULL,
                     posit.match = c( "fixed", "glob", "regex" ) )

Arguments

dataSet

matrix produced by the invocation of the read.raw.lsd, read.single.lsd, read.multi.lsd or read.list.lsd functions (a single matrix at a time).

col.names

a vector of optional names for the variables. The default is to read all variables. The names must be in LSD/C++ format, without dots (".") in the name. Any dot (and trailing characters) will be automatically removed.

instance

integer: the instance of the variable to be read, for variables that exist in more than one object. This number is based on the relative position (column) of the variable in the results file. The default (0) is to read all instances.

check.names

logical. If TRUE, the names of the variables are checked to ensure that they are syntactically valid variable names. If necessary, they are adjusted to ensure that there are no duplicates.

posit

a string, a vector of strings or an integer vector describing the LSD object position of the variable(s) to select. If an integer vector, it should define the position of a SINGLE LSD object. If a string or vector of strings, each element should define one or more different LSD objects, so the returned matrix will contain variables from more than one object. By setting posit.match, globbing (wildcards) and regular expressions can be used to select multiple objects at once; in this case, all matching objects are returned. This option only operates if dataSet was generated by read.raw.lsd WITHOUT argument clean.names = TRUE.

posit.match

a string defining how the posit argument, if provided, should be matched against the LSD object positions. If equal to "fixed", the default, only exact matching is done. "glob" allows using simple wildcard characters ('*' and '?') in posit for matching. "regex" interprets posit as POSIX 1003.2 extended regular expression(s). See regular expressions for details of the different types of regular expressions. Options can be abbreviated.
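The three posit.match modes can be illustrated with base R alone. The sketch below is not the package's internal code: the position strings are hypothetical, and the use of %in%, utils::glob2rx and grepl is only an assumption about how such matching can be reproduced.

```r
# Hypothetical LSD object positions, for illustration only
positions <- c( "1", "2", "1_1", "1_2", "1_10", "2_1" )

# "fixed": only exact matches are kept
fixedSel <- positions[ positions %in% c( "1_2", "2_1" ) ]

# "glob": wildcards are translated to an anchored regular expression
globSel <- positions[ grepl( utils::glob2rx( "1_*" ), positions ) ]

# "regex": the pattern is used directly as a regular expression
regexSel <- positions[ grepl( "^[0-9]+$", positions ) ]

print( fixedSel )
print( globSel )
print( regexSel )
```

In this toy case, the glob pattern picks every object nested under position 1, while the regular expression keeps only positions without an underscore, that is, top-level objects.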

Details

Selection restriction arguments can be provided as needed; when not specified, all available cases are selected.

Selecting specific posit object positions requires full detail in the dataSet column names, as produced by read.raw.lsd when clean.names = TRUE is NOT used. Other read.XXXX.lsd functions do NOT produce the required detail in the data matrices to do object position selection. If such datasets are fed to this function and posit is set, the return value will be NULL. In this case, consider using select.colattrs.lsd, or specifying posit when calling read.XXXX.lsd functions.

When posit is supplied together with other attribute filters, the selection process is done in two steps. First, the columns matching the other attribute filters are selected. Second, the instances defined by posit are selected from the first selection set.
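A minimal sketch of this two-step logic, using only base R on a toy matrix. The "name.position" column labels are a simplified, assumed format chosen for illustration (real read.raw.lsd labels carry more detail), and all the helper variables are hypothetical.

```r
# Toy matrix; the "name.position" labels are an assumed, simplified format
toy <- matrix( 1 : 12, nrow = 2,
               dimnames = list( NULL, c( "A.1_1", "A.1_2", "B.1_1",
                                         "B.1_2", "C.1_1", "C.1_2" ) ) )

# split each label into its name part (before the first dot)
# and its position part (after the first dot)
colLabels <- colnames( toy )
varName <- sub( "\\..*$", "", colLabels )
varPos <- sub( "^[^.]*\\.", "", colLabels )

# step 1: keep the columns whose name part was requested
step1 <- varName %in% c( "A", "B" )

# step 2: among those, keep only the requested object positions
step2 <- step1 & varPos %in% "1_2"

selected <- toy[ , step2, drop = FALSE ]
print( colnames( selected ) )
```

Here the name filter first narrows the set to variables A and B, and the position filter then keeps only their instances in the object at position "1_2".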

See also the read.XXXX.lsd functions, which can select specific col.names columns, instance instances, or posit positions when loading LSD results. If only a single set of columns/instances/positions is required, this may be more efficient than using this function.

Value

Returns a single matrix containing the time series of the selected variables from the original data set.

Note

The variable/column names must be valid R or LSD column names.

Author(s)

Marcelo C. Pereira

See Also

list.files.lsd(), select.colattrs.lsd(), read.raw.lsd()

Examples

# get the list of file names of example LSD results
files <- list.files.lsd( system.file( "extdata", package = "LSDinterface" ) )

# read all instances of all variables in first file
bigTable <- read.raw.lsd( files[ 1 ] )
print( bigTable[ 1 : 10, 1 : 7 ] )

# extract all instances of a set of variables named '_A1p' and '_growth1'
abTable <- select.colnames.lsd( bigTable, c( "_A1p", "_growth1" ) )
print( abTable[ 11 : 15, ] )

# extract specific instances of a set of variables named '_A1p' and '_growth1'
abFirst2 <- select.colnames.lsd( bigTable, c( "_A1p", "_growth1" ),
                                 posit = c( "1_2", "1_5" ) )
print( abFirst2[ 50 : 60, ] )

# extract all second-level object instances of all variables
aSec <- select.colnames.lsd( bigTable, posit = "*_*", posit.match = "glob" )
print( aSec[ 1 : 10, ] )

# extract just the variables of top-level object instances
aTop <- select.colnames.lsd( bigTable, posit = "^[0-9]+$", posit.match = "regex" )
print( aTop[ 1 : 10, ] )

LSDinterface documentation built on May 14, 2022, 1:05 a.m.