extractData: Function . for Selecting Rows/Columns with base R Extract

. R Documentation

Function . for Selecting Rows/Columns with base R Extract

Description

With the base R Extract function, use the unobtrusive function name, ., to express a subsetting operation as
d[.(rows), .(cols)]
for a less cumbersome experience. When . expresses a logical criterion to select rows, there is no need to prepend the data frame name and $ to the variable names in the expression, as Extract otherwise requires. A random selection of rows is also available. For columns, variable names need not be quoted, variable ranges can be defined with a colon, :, and a leading - excludes the designated columns. Unlike Extract used directly, . also does not list rows of missing data that were not requested.
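The last point appears to refer to how plain Extract treats a logical index that contains NA. A minimal base R sketch follows, using a small made-up data frame (not the Employee data) and no lessR functions. Here which() stands in for the filtering; that is an assumption about the effect, not a description of how . is implemented.

```r
# Hypothetical data, for illustration only
d <- data.frame(
  Gender = c("M", "W", "M", "W"),
  Post   = c(95, NA, 88, 97)
)

# Post > 90 evaluates to NA in row 2, so plain Extract
# returns an extra row filled with NA
with_na <- d[d$Post > 90, ]
print(nrow(with_na))   # 3 rows, one of them all NA

# which() keeps only the rows where the condition is TRUE
no_na <- d[which(d$Post > 90), ]
print(nrow(no_na))     # 2 rows
```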

Usage

.(x, ...)

Arguments

x

Expression to select rows, typically a logical criterion, or to select columns.

...

Allows multiple expressions when selecting columns.

Details

Eliminates the need to prepend the data frame name and a dollar sign to each variable name in the logical expression that selects rows. For columns, variable names need not be quoted and variable ranges are allowed.
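For comparison, a minimal base R sketch of the plain Extract forms that . abbreviates. The data frame d is made up for illustration and only borrows variable names from the Examples; it is not the Employee data, and no lessR functions are used.

```r
# Hypothetical data, for illustration only (not the Employee data set)
d <- data.frame(
  Gender = c("M", "W", "M", "W", "M"),
  Post   = c(95, 91, 88, NA, 97),
  Salary = c(53000, 61000, 47000, 58000, 72000)
)

# Rows in plain Extract: each variable carries the data frame name and $
plain <- d[d$Gender == "M" & d$Post > 90, ]

# with() evaluates the expression inside d, so base R can also drop the prefix
scoped <- d[with(d, Gender == "M" & Post > 90), ]
print(identical(plain, scoped))

# Columns in plain Extract: names are quoted, and a range of adjacent
# columns is built from the positions of its end points
by_name  <- d[, c("Gender", "Salary")]
by_range <- d[, match("Post", names(d)):match("Salary", names(d))]
print(names(by_range))
```

The quoting and the position lookup in the last two statements are the steps that .(Post:Salary) is meant to spare the user.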

A character string called rows can express the logic of row selection, and a character string called cols can express the logic of column (variable) selection. To negate the rows expression, use .(!rows). Use -.(cols) to exclude the designated variables.

Select rows at random with the function random(n), called inside of ., where n is the number of random rows to select from the full data frame or, written as a proportion, .n, the proportion of rows to select.
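In base R terms, a random selection of rows amounts to sampling row indices. A minimal sketch with sample() and a made-up data frame follows. It illustrates the idea only and does not claim to reproduce how random() is implemented; in particular, rounding the proportion to a whole number of rows is an assumption of this sketch.

```r
# Hypothetical data, for illustration only
d <- data.frame(id = 1:20, score = seq(50, 88, by = 2))

set.seed(123)  # only to make the sketch reproducible

# a fixed number of rows, sampled without replacement
n_rows <- d[sample(nrow(d), 4), ]

# a proportion of the rows, here 25 percent
# (rounding to a whole number of rows is an assumption of this sketch)
p_rows <- d[sample(nrow(d), round(0.25 * nrow(d))), ]

print(nrow(n_rows))
print(nrow(p_rows))
```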

Value

The row names or column names of the rows or columns of data that satisfy the specified conditions.

Author(s)

David W. Gerbing (Portland State University; gerbing@pdx.edu)

See Also

Extract, subset.

Examples

# see vignette

d <- Read("Employee", quiet=TRUE)

# no data frame name attached to variable names
#   as variables assumed in the data frame
d[.(Gender=="M" & Post>90), ]

# include first three rows and only the specified variables
# variable range permitted
d[1:3, .(Years:Salary, Post)]

# include first three rows and exclude the specified variables
d[1:3, -.(Years:Salary, Post)]

# select rows and columns
d[.(Gender=="M" & Post>90), .(Years:Salary, Post)]

# because of the default for the base R Extract function [ ],
# if only one variable retained,
# then add drop=FALSE to retain the result as a data frame
d[1:3, .(Salary), drop=FALSE]

# define character string arguments
cols <- "Gender:Salary, Post"
rows <- "Gender=='M' & Post>93"
d[.(rows), .(cols)]
# negate
d[.(!rows), -.(cols)]

# random selection of 4 rows, retain all variables
d[.(random(4)), ]

lessR documentation built on Nov. 12, 2023, 1:08 a.m.