persistence: Calculate Persistence


Description

The persistence rate is defined as the percentage of units (IDs) in a given period that are also found in the subsequent period. This function will either:

1) Calculate the persistence rate for each period

2) Create an indicator variable on the original dataframe that identifies whether the ID persisted (1) or not (0)
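To make the definition concrete, the following base R sketch computes the rate by hand on a small toy dataset. It is an illustration of the definition only, not the package's implementation: the names `toy` and `persistence_rate_sketch` are hypothetical, and the rate is returned as a proportion between 0 and 1, which may differ from how `persistence()` scales its output.

```r
# Hypothetical toy data (not part of the package): one row per ID per period
toy <- data.frame(
  ID   = c("A", "B", "C", "A", "B", "D", "A", "D"),
  RANK = c(1, 1, 1, 2, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of IDs in each period that reappear in the next one
persistence_rate_sketch <- function(df) {
  periods <- sort(unique(df$RANK))
  starts  <- head(periods, -1)  # the final period has no successor, so skip it
  rates <- vapply(starts, function(p) {
    current <- unique(df$ID[df$RANK == p])
    nxt     <- unique(df$ID[df$RANK == periods[match(p, periods) + 1]])
    mean(current %in% nxt)      # proportion of current IDs found in the next period
  }, numeric(1))
  data.frame(RANK = starts, rate = rates)
}

result <- persistence_rate_sketch(toy)
print(result)
```

In the toy data, two of the three IDs in period 1 (A and B) appear again in period 2, and two of the three IDs in period 2 (A and D) appear again in period 3.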

Usage

persistence(df, id, rank, period, ..., overall = TRUE,
  calculate = TRUE)

Arguments

df

A dataframe

id

Unique ID variable

rank

Numeric or ordered factor variable that indicates the sequence of periods

period

Optional pretty name of the rank variable

...

Optional grouping variables

overall

Logical; whether to include the overall persistence rate

calculate

Logical; if TRUE, calculate persistence rates; if FALSE, add the indicator variable to the original dataframe

Details

If calculate == TRUE, persistence() removes the last period from the output, since the last period has no subsequent period to compare against.

Value

Returns either a summarised dataframe of persistence rates (calculate = TRUE) or the original dataframe with an extra indicator column (calculate = FALSE).
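As a rough picture of the second return shape, the base R sketch below adds a 1/0 indicator column to a toy dataframe. It is a sketch under assumptions rather than the package's code: the names `toy` and `persisted` are hypothetical, and marking rows in the final period as NA is an assumption made here because those rows have no subsequent period; `persistence()` may treat them differently.

```r
# Hypothetical toy data (not part of the package): one row per ID per period
toy <- data.frame(
  ID   = c("A", "B", "C", "A", "B", "D", "A", "D"),
  RANK = c(1, 1, 1, 2, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

periods   <- sort(unique(toy$RANK))
# Rank of the period that follows each row's period (NA for the final period)
next_rank <- periods[match(toy$RANK, periods) + 1]

# Build ID/period keys; the "|" separator assumes IDs do not contain "|"
key_now  <- paste(toy$ID, toy$RANK, sep = "|")
key_next <- paste(toy$ID, next_rank, sep = "|")

# 1 if the same ID appears in the next period, 0 if not,
# NA where there is no next period (an assumption of this sketch)
toy$persisted <- ifelse(is.na(next_rank),
                        NA_integer_,
                        as.integer(key_next %in% key_now))
print(toy)
```

Keeping the indicator on the original rows, instead of summarising, makes it possible to filter or group by other columns afterwards.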

Examples

# Example data: one row per ID per period
dataFrame <- data.frame(ID     = c("A", "B", "C", "A", "B", "D", "A", "D"),
                        RANK   = c(1, 1, 1, 2, 2, 2, 3, 3),
                        PERIOD = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3"),
                        GROUP  = c("G1", "G2", "G1", "G1", "G2", "G3", "G1", "G3"),
                        stringsAsFactors = FALSE)

# calculate = TRUE (the default): persistence rate for each period
persistence(dataFrame, ID, RANK, PERIOD)

# calculate = FALSE: original dataframe with the persistence indicator added
dataFrame <- persistence(dataFrame, ID, RANK, PERIOD, calculate = FALSE)

christian-million/researchR documentation built on May 15, 2019, 12:45 p.m.