preprocess.data: Prepare data frame for analysis


View source: R/parse_infected.R

Description

Prepare data frame for analysis

Usage

preprocess.data(
  data,
  infected_in = c("Wuhan", "Outside"),
  symptom_impute = FALSE
)

Arguments

data

A data frame

infected_in

Either "Wuhan" (cases returned from Wuhan) or "Outside" (cases infected outside of Wuhan).

symptom_impute

Logical. Whether to use the initial medical visit and confirmation to impute a missing symptom onset.

Details

A summary of the procedures:

  1. Convert all dates to number of days since 1-Dec-2019.

  2. Separate the data into cases returned from Wuhan and cases infected outside of Wuhan.

  3. Restrict to cases with a known symptom onset date.

  4. Parse column 'Infected' into two columns: Infected_first and Infected_last.

  5. For all cases, set Infected_first to 1 if it is missing.

  6. For outside cases, set Infected_last to be no later than symptom onset.

  7. For Wuhan-exported cases, set Infected_last to be no later than both symptom onset and the end of the Wuhan stay.
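
The date conversion and the bounds in steps 1 and 5 to 7 can be illustrated with a minimal sketch on made-up rows. This is not the package's implementation: the helper `date_to_day` and the toy values are hypothetical, and the sketch assumes that 1-Dec-2019 is counted as day 1 (which would be consistent with the default of 1 in step 5).

```r
# Hypothetical helper: day index, ASSUMING 1-Dec-2019 is day 1.
date_to_day <- function(x) {
  as.numeric(as.Date(x) - as.Date("2019-11-30"))
}

# Toy rows; column names follow this page, values are invented.
toy <- data.frame(
  Infected_first = c(NA, 40),
  Infected_last  = c(60, 50),
  Symptom        = date_to_day(c("2020-01-20", "2020-01-25")),
  End_Wuhan      = date_to_day(c("2020-01-18", "2020-01-22"))
)

# Step 5: a missing Infected_first defaults to 1.
toy$Infected_first[is.na(toy$Infected_first)] <- 1

# Step 6 (outside cases): cap Infected_last by symptom onset only.
outside_last <- pmin(toy$Infected_last, toy$Symptom)

# Step 7 (Wuhan-exported cases): cap by symptom onset and end of Wuhan stay.
wuhan_last <- pmin(toy$Infected_last, toy$Symptom, toy$End_Wuhan)
```

`pmin` takes the element-wise minimum, so each case's Infected_last is capped by that case's own bounds rather than by a single global value.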

Value

A data frame

Author(s)

Nianqiao Ju <nju@g.harvard.edu>, Qingyuan Zhao <qyzhao@statslab.cam.ac.uk>

Examples

data(covid19_data)
data <- preprocess.data(covid19_data)
head(data)

## This is how the wuhan_exported data frame is created
## Keep cases with a finite symptom onset day
data <- subset(data, Symptom < Inf)
## Keep cases that arrived no later than day 54
data <- subset(data, Arrived <= 54)
## The location is the part of the case ID before the first "-"
data$Location <- do.call(rbind, strsplit(as.character(data$Case), "-"))[, 1]
wuhan_exported <- data.frame(Location = data$Location,
                             B = data$Begin_Wuhan,
                             E = data$End_Wuhan,
                             S = data$Symptom)
## devtools::use_data(wuhan_exported)

bets.covid19 documentation built on July 2, 2020, 2:14 a.m.