gict: Grep by Individual Constrained in Time

View source: R/gict.R

gict R Documentation

Grep by Individual Constrained in Time

Description

In a dataset with one or more variables (typically containing text) associated with a date, find matches in those variables for specific individuals within specified time frames.

Usage

gict(
  pattern,
  x,
  data,
  id,
  date,
  units,
  units.id = id,
  begin = NULL,
  end = NULL,
  include = c(TRUE, TRUE),
  ...,
  data.keep = NULL,
  verbose = TRUE
)

Arguments

pattern

a vector of search strings (regular expressions); if the vector has a names attribute, the names are used as aliases
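
As an illustration of the alias rule, here is a minimal sketch in base R. The helper alias_of is hypothetical and not part of the package; it only mirrors the documented behaviour (use names if present, else p1, p2, etc.). How gict treats a partially named vector is an assumption here.

```r
# Hypothetical helper, not part of datma: derive an alias for each pattern.
alias_of <- function(pattern) {
  a <- names(pattern)
  fallback <- paste0("p", seq_along(pattern))
  if (is.null(a)) return(fallback)
  # Assumption: unnamed entries in a partially named vector get the fallback.
  ifelse(is.na(a) | a == "", fallback, a)
}

named   <- alias_of(c(mi = "^I21", stroke = "^I6[0-4]"))
unnamed <- alias_of(c("^I21", "^I6[0-4]"))
print(named)
print(unnamed)
```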

x

names of variables to search in (given in order of importance)

data

a data frame

id

name of id variable (in 'data')

date

name of associated date variable (in 'data')

units

a data frame containing ids and, optionally, 'begin' and 'end' variables

units.id

variable name in 'units' to use as id (by default the same as 'id'). N.B. a unit can appear several times and is then identified by its id together with 'begin' and 'end'; a soft warning is given if these three variables are not enough for uniqueness

begin

variable name in 'units' to use as the begin date; if missing, it is set to the earliest date in 'data'

end

variable name in 'units' to use as the end date; if missing, it is set to the latest date in 'data'

include

length 2 logical vector specifying if lower (first entry) and upper (second entry) bounds are inclusive (TRUE) or not (FALSE)
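
The effect of 'include' can be sketched in base R as follows. The function in_window is hypothetical and is not the package's implementation; it only shows how inclusive and exclusive bounds change which dates fall inside a time frame.

```r
# Hypothetical helper, not part of datma: test dates against a time frame.
in_window <- function(date, begin, end, include = c(TRUE, TRUE)) {
  lower <- if (include[1]) date >= begin else date > begin
  upper <- if (include[2]) date <= end else date < end
  lower & upper
}

d <- as.Date(c("2020-01-01", "2020-02-01", "2020-03-31"))
b <- as.Date("2020-01-01")
e <- as.Date("2020-03-31")

both_incl  <- in_window(d, b, e)                            # bounds count as inside
lower_excl <- in_window(d, b, e, include = c(FALSE, TRUE))  # a date equal to begin is dropped
upper_excl <- in_window(d, b, e, include = c(TRUE, FALSE))  # a date equal to end is dropped
```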

...

arguments passed to data.table::like for identifying matches
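
For orientation, this is what data.table::like does on its own. The strings are made-up examples; ignore.case and fixed are arguments of like in recent data.table versions, and presumably the kind of argument one would forward through '...'.

```r
# Made-up example strings; requires the data.table package.
codes <- c("I21 acute MI", "i21.9", "J18 pneumonia")

# Case-insensitive regular expression match
ci <- data.table::like(codes, "^i21", ignore.case = TRUE)

# Literal (non-regex) match: the dot only matches a dot
lit <- data.table::like(c("a.b", "axb"), "a.b", fixed = TRUE)

print(ci)
print(lit)
```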

data.keep

character vector of variables you want to keep from 'data'

verbose

if TRUE the function will give helpful and/or annoying messages

Value

A data frame with

  • id the id variable

  • alias the name of the pattern searched for (p1, p2, etc. if unnamed)

  • date the date of the associated match

  • time days from 'begin' to 'date'

  • event indicator for a match

  • begin the begin date (may be specific to the individual)

  • end the end date (may be specific to the individual)

  • match.in the variable the match was found in

  • match the match found

  • first.id indicator for the first occurrence of the associated id/begin/end combination

  • first.id_date indicator for the first occurrence of the associated id/begin/end AND date combination

  • pattern the pattern searched for

  • ... variables selected with 'data.keep'.
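
The derived columns 'time', 'first.id' and 'first.id_date' can be illustrated with a small base R sketch. The data frame hits is made up, and the code only approximates the documented meaning of the columns; the ordering the function itself uses when deciding what counts as "first" is an assumption.

```r
# Made-up matches: individual 1 has two matches on the same date and a later one.
hits <- data.frame(
  id    = c(1, 1, 1, 2),
  begin = as.Date(rep("2020-01-01", 4)),
  end   = as.Date(rep("2020-12-31", 4)),
  date  = as.Date(c("2020-01-10", "2020-01-10", "2020-06-01", "2020-02-15"))
)

# time: days from 'begin' to 'date'
hits$time <- as.numeric(hits$date - hits$begin)

# first.id: first row for each id/begin/end combination (rows assumed sorted by date)
hits$first.id <- !duplicated(hits[, c("id", "begin", "end")])

# first.id_date: first row for each id/begin/end and date combination
hits$first.id_date <- !duplicated(hits[, c("id", "begin", "end", "date")])

print(hits)
```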

Note that any individual can have more than one match. See the vignette for examples.
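
A minimal sketch of a call, assuming the datma package is installed and exports gict. The data frames records and windows, their column names and the pattern are all hypothetical; the call is skipped if the package is not available.

```r
# Hypothetical toy data: one row per dated text record.
records <- data.frame(
  id   = c(1, 1, 2, 2),
  date = as.Date(c("2020-01-10", "2020-06-01", "2020-02-15", "2021-03-01")),
  diag = c("I21 acute MI", "J18 pneumonia", "I21 acute MI", "E11 diabetes"),
  stringsAsFactors = FALSE
)

# Hypothetical time frames, one row per individual.
windows <- data.frame(
  id    = c(1, 2),
  start = as.Date(c("2020-01-01", "2020-01-01")),
  stop  = as.Date(c("2020-03-31", "2020-12-31"))
)

# Run only if datma is installed (assumption: gict is exported).
if (requireNamespace("datma", quietly = TRUE)) {
  res <- datma::gict(
    pattern = c(mi = "^I21"),  # named pattern, so the alias should be "mi"
    x       = "diag",
    data    = records,
    id      = "id",
    date    = "date",
    units   = windows,
    begin   = "start",
    end     = "stop",
    verbose = FALSE
  )
  print(res)
}
```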


renlund/datma documentation built on June 2, 2025, 5:12 a.m.