My data analysis

options(knitr.kable.NA = "")

Load packages and data

I load my dataset, s_data, which is located in the smallsets package.

library(smallsets)
library(knitr)

head(s_data) |> kable(booktabs = TRUE)
SmallsetTimeline <- Smallset_Timeline(data = s_data,
                                      code = system.file("s_data_preprocess.Rmd", package = "smallsets"))

Preprocessing

I need to preprocess the dataset before I can build a model.

# smallsets snap s_data caption[Remove rows where C2 is FALSE.]caption
s_data <- s_data[s_data$C2 == TRUE,]

# smallsets snap +2 s_data caption[Replace missing values in C6 and C8 with column
# means. Drop C7 because there are too many missing values.]caption
s_data$C6[is.na(s_data$C6)] <- mean(s_data$C6, na.rm = TRUE)
s_data$C8[is.na(s_data$C8)] <- mean(s_data$C8, na.rm = TRUE)
s_data$C7 <- NULL

# smallsets snap +1 s_data caption[Create a new column, C9, by summing C3 and
# C4.]caption
s_data$C9 <- s_data$C3 + s_data$C4

Below is a Smallset Timeline, visualising my preprocessing decisions executed above.

SmallsetTimeline

Modelling

I build a model.

# code to build a model...


Try the smallsets package in your browser

Any scripts or data that you put into this service are public.

smallsets documentation built on May 29, 2024, 8:18 a.m.