pick: Extract Elements From a (Atomic) Vector

View source: R/str2str_functions.R

pick    R Documentation

Extract Elements From a (Atomic) Vector

Description

pick extracts the elements of an (atomic) vector that meet certain criteria: 1) using exact values or regular expressions (pat), 2) inclusion vs. exclusion of the value/expression (not), 3) based on elements or names (nm). It is intended primarily for character vectors, but can be used with vectors of other types (see typeof).

Usage

pick(x, val, pat = FALSE, not = FALSE, nm = FALSE, fixed = FALSE)

Arguments

x

atomic vector or an object with names (e.g., data.frame) if nm = TRUE.

val

atomic vector specifying which elements of x will be extracted. If pat = FALSE (default), then val should be an atomic vector of the same typeof as x, can have length > 1, and exact matching will be done via is.element (essentially match). If pat = TRUE, then val has to be a character vector of length 1 and partial matching will be done via grepl with the option of regular expressions if fixed = FALSE (default). Note, if nm = TRUE, then val should refer to names of x to determine which elements of x should be extracted.

pat

logical vector of length 1 specifying whether val should be matched exactly (FALSE) via is.element (essentially match) or partially (TRUE) via grepl, optionally as a regular expression. See Details for a brief description of some common symbols and help(regex) for more.

not

logical vector of length 1 specifying whether val indicates values that should be retained (FALSE) or removed (TRUE).

nm

logical vector of length 1 specifying whether val refers to the names of x (TRUE) rather than the elements of x themselves (FALSE).

fixed

logical vector of length 1 specifying whether val refers to values as is (TRUE) or a regular expression (FALSE). Only used if pat = TRUE.

Details

pick allows for 8 different ways to extract elements from an (atomic) vector, created by the 2x2x2 combination of the logical arguments pat, not, and nm. When pat = FALSE (default), pick uses is.element (essentially match) and requires exact matching of val in x. When pat = TRUE, pick uses grepl and allows for partial matching of val in x and/or regular expressions if fixed = FALSE (default).
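The 2x2x2 logic described above can be sketched with base R alone. The helper below, pick_sketch, is a hypothetical name introduced only for illustration; it is an approximation of the documented behaviour under the assumption that the arguments combine as described, not the str2str source code.

```r
# Hypothetical re-creation of the documented behaviour using base R only;
# this is NOT the str2str implementation.
pick_sketch <- function(x, val, pat = FALSE, not = FALSE, nm = FALSE, fixed = FALSE) {
  # nm decides whether matching is done on the names or on the elements
  target <- if (nm) names(x) else x
  # pat decides between partial/regex matching (grepl) and exact matching (is.element)
  hit <- if (pat) {
    grepl(pattern = val, x = target, fixed = fixed)
  } else {
    is.element(target, val)
  }
  # not flips retention into removal
  if (not) hit <- !hit
  x[hit]
}

chr <- setNames(c("one", "two", "three", "four", "five"), as.character(1:5))
pick_sketch(chr, val = c("one", "five"))               # exact match on elements
pick_sketch(chr, val = "1|5", pat = TRUE, nm = TRUE)   # regex match on names
pick_sketch(chr, val = "t|r", pat = TRUE, not = TRUE)  # drop elements containing "t" or "r"
```

All three calls should return the elements "one" and "five" (named "1" and "5"), which mirrors cases 1, 6, and 7 in the Examples section.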

When dealing with regular expressions via pat = TRUE and fixed = FALSE, certain symbols within val are not interpreted as literal characters and instead have special meanings. Some of the most commonly used symbols are "." = any character, "|" = logical or, "^" = starts with, "\n" = new line, "\t" = tab.
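The effect of these symbols can be checked directly with grepl, the base R function that pick is documented to use when pat = TRUE. The vector x below is made up for illustration.

```r
# Illustrative strings; not taken from any dataset
x <- c("a.b", "axb", "USA", "mtUS")

grepl("a.b", x)                # "." matches any character:  TRUE  TRUE FALSE FALSE
grepl("a.b", x, fixed = TRUE)  # "." taken literally:        TRUE FALSE FALSE FALSE
grepl("^US", x)                # "^" anchors at the start:  FALSE FALSE  TRUE FALSE
grepl("US|xb", x)              # "|" means either pattern:  FALSE  TRUE  TRUE  TRUE
```

Setting fixed = TRUE is the simplest way to search for a value that contains such symbols without escaping them.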

Value

a subset of x that only includes the elements which meet the criteria specified by the function call.

Examples

# pedagogical cases
chr <- setNames(object = c("one","two","three","four","five"), nm = as.character(1:5))
# 1) pat = FALSE, not = FALSE, nm = FALSE
pick(x = chr, val = c("one","five"), pat = FALSE, not = FALSE, nm = FALSE)
# 2) pat = FALSE, not = FALSE, nm = TRUE
pick(x = chr, val = c("1","5"), pat = FALSE, not = FALSE, nm = TRUE)
# 3) pat = FALSE, not = TRUE, nm = FALSE
pick(x = chr, val = c("two","three","four"), pat = FALSE, not = TRUE, nm = FALSE)
# 4) pat = FALSE, not = TRUE, nm = TRUE
pick(x = chr, val = c("2","3","4"), pat = FALSE, not = TRUE, nm = TRUE)
# 5) pat = TRUE, not = FALSE, nm = FALSE
pick(x = chr, val = "n|v", pat = TRUE, not = FALSE, nm = FALSE)
# 6) pat = TRUE, not = FALSE, nm = TRUE
pick(x = chr, val = "1|5", pat = TRUE, not = FALSE, nm = TRUE)
# 7) pat = TRUE, not = TRUE, nm = FALSE
pick(x = chr, val = "t|r", pat = TRUE, not = TRUE, nm = FALSE)
# 8) pat = TRUE, not = TRUE, nm = TRUE
pick(x = chr, val = "2|3|4", pat = TRUE, not = TRUE, nm = TRUE)
# actual use cases
datasets <- data()[["results"]][, "Item"] # names of the available datasets
pick(x = datasets, val = c("attitude","mtcars","airquality"),
   not = TRUE) # all but the three most common datasets used in `str2str` package examples
pick(x = datasets, val = "state", pat = TRUE) # only datasets that contain "state"
pick(x = datasets, val = "state.*state", pat = TRUE) # only datasets that have
   # "state" twice in their name
pick(x = datasets, val = "US|UK", pat = TRUE) # only datasets that contain
   # "US" or "UK"
pick(x = datasets, val = "^US|^UK", pat = TRUE) # only datasets that start with
   # "US" or "UK"
pick(x = datasets, val = "k.*o|o.*k", pat = TRUE) # only datasets containing both
   # "k" and "o"

str2str documentation built on Nov. 21, 2023, 1:08 a.m.