knitr::opts_chunk$set(echo = FALSE)
Ride collection:
Three main issues in the data (that we are concerned with here):
Other issues:
We combined the ride data with
This entire analysis is available as an R package in a GitHub repository
We have $n$ observations of rides.
\begin{equation}
y_i = \begin{cases}
1, & \text{if ride } i \text{ was given a negative rating;}\\
0, & \text{otherwise.}
\end{cases}
\end{equation}
Define predictors,
for $i = 1, \ldots, n$. All but the last we represent together as matrix $X$.
\begin{table}[htb]
\centering
\caption{Fit summaries for Models 1--6.\label{tab:modelfits}}
\begin{tabular}{lrrr}
\toprule
\textbf{Model} & \textbf{$\log (\mathcal{L})$} & \textbf{AIC} & \textbf{AUC}$_{\text{CV}}$\footnotemark\\
\midrule
\rowcolor{red} Model 1 & -4,786 & 9,586 & 0.552\\
Model 2 & -3,971 & 7,957 & 0.797\\
Model 3 & -3,923 & 7,877 & 0.802\\
Model 4 & -3,930 & 7,870 & 0.802\\
Model 5 & -3,928 & 7,878 & 0.803\\
\rowcolor{red} Model 6 & -4,713 & 9,455 & 0.601\\
\bottomrule
\end{tabular}
\end{table}
\footnotetext{Area under ROC curve estimated with 10-fold cross-validation.}
For riders $j = 1, \ldots, l$, we have rider-level predictors
Variables were standardized and then clustered using $k$-means.
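As a sketch of what this step computes, write $u_{jp}$ for the value of rider-level predictor $p$ for rider $j$. This symbol, along with $z_j$, $C_c$, and $\mu_c$, is our own notation for illustration, and we assume the usual standardization (centering and scaling by the standard deviation):

\begin{equation}
z_{jp} = \frac{u_{jp} - \bar{u}_p}{s_p}, \qquad
\min_{C_1, \ldots, C_k} \sum_{c = 1}^{k} \sum_{j \in C_c} \| z_j - \mu_c \|^2, \qquad
\mu_c = \frac{1}{|C_c|} \sum_{j \in C_c} z_j,
\end{equation}

where $\bar{u}_p$ and $s_p$ are the sample mean and standard deviation of predictor $p$ over the $l$ riders, and $C_1, \ldots, C_k$ partition the riders into clusters. Standardizing first keeps predictors measured on large scales from dominating the Euclidean distances.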
Of $n = 25{,}397$ rides, $11{,}365$ (about 45%) were not rated.
Let,
\begin{equation}
r_i = \begin{cases}
1, & \text{if ride } i \text{ is missing a rating;}\\
0, & \text{otherwise.}
\end{cases}
\end{equation}
Rubin classifies missing data into three situations^[@little1987 (page 14)]:
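In our notation, with $r_i$ the missingness indicator and $x_i$ the predictors for ride $i$, the three situations are commonly written as follows. This is our paraphrase of the standard definitions (missing completely at random, missing at random, nonignorable), not a quotation:

\begin{equation}
\begin{aligned}
\text{MCAR:} \quad & P(r_i = 1 \mid y_i, x_i) = P(r_i = 1), \\
\text{MAR:} \quad & P(r_i = 1 \mid y_i, x_i) = P(r_i = 1 \mid x_i), \\
\text{Nonignorable:} \quad & P(r_i = 1 \mid y_i, x_i) \text{ depends on } y_i.
\end{aligned}
\end{equation}

Only the rating can be missing here, so the predictors $x_i$ are always observed.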
We believe the missing ratings are nonignorable.
Create augmented data:
Fit data model and missing data model separately using augmented data
\begin{figure}[htb]
\centering
\caption{Creation of augmented data set for the weighted method of the EM algorithm for missing response data. \label{fig:augmented-data}}
\begin{tabular}{lcl}
\toprule
\textbf{Original Data} & & \textbf{Augmented Data}\\
\midrule
\begin{tabular}{lll}
$y_i$ & $x_i$ & $r_i$\\
\midrule
1 & 2.4 & 0\\
0 & 1.3 & 0\\
NA & -0.4 & 1\\
 & &
\end{tabular}
& $\to$ &
\begin{tabular}{llll}
$y_i$ & $x_i$ & $r_i$ & $w_i$\\
\midrule
1 & 2.4 & 0 & 1\\
0 & 1.3 & 0 & 1\\
1 & -0.4 & 1 & 0.2\\
0 & -0.4 & 1 & 0.8
\end{tabular}\\
\bottomrule
\end{tabular}
\end{figure}
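The weights $w_i$ in Figure \ref{fig:augmented-data} can be sketched as follows, assuming the weighted EM method in its usual form. The parameter symbols $\beta$ (data model) and $\alpha$ (missing data model) and the superscript $(t)$ for the current iteration are our notation. A ride with an observed rating keeps weight $1$. A ride with a missing rating contributes one row for each possible value $y \in \{0, 1\}$, weighted by the conditional probability of that value given what was observed:

\begin{equation}
w_i(y) = P(y_i = y \mid x_i, r_i = 1; \beta^{(t)}, \alpha^{(t)})
= \frac{P(y_i = y \mid x_i; \beta^{(t)}) \, P(r_i = 1 \mid y_i = y, x_i; \alpha^{(t)})}
{\sum_{y' \in \{0, 1\}} P(y_i = y' \mid x_i; \beta^{(t)}) \, P(r_i = 1 \mid y_i = y', x_i; \alpha^{(t)})}.
\end{equation}

The two weights for a ride sum to $1$, as with the $0.2$ and $0.8$ in the figure. Each iteration recomputes the weights from the current fits and then refits both models on the augmented data using these weights.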
\begin{table}[htb]
\centering
\begin{tabular}{lrrrr}
\toprule
\textbf{Parameter} & \textbf{Model 4} & \textbf{EM Model}\\
\midrule
Log(Length) & -0.147 & 0.205\\
 & \footnotesize (-0.290, -0.005) & \footnotesize (0.106, 0.304)\\
Mean Temperature & 0.142 & 0.100\\
 & \footnotesize (0.004, 0.281) & \footnotesize (0.005, 0.196)\\
Mean Wind Speed & 0.002 & -0.026\\
 & \footnotesize (-0.054, 0.057) & \footnotesize (-0.069, 0.016)\\
Max Gust Speed & -0.005 & 0.020\\
 & \footnotesize (-0.031, 0.021) & \footnotesize (0.001, 0.039)\\
Rainfall & 0.050 & 0.051\\
 & \footnotesize (-0.017, 0.117) & \footnotesize (0.009, 0.093)\\
Rainfall 4-Hour & 0.022 & 0.017\\
 & \footnotesize (0.003, 0.041) & \footnotesize (0.003, 0.030)\\
Intercept & -2.792 & -3.144\\
 & \footnotesize (-3.334, -2.250) & \footnotesize (-3.604, -2.684)\\
\bottomrule
\end{tabular}
\end{table}
\begin{table}[htb]
\centering
\begin{tabular}{lrrrr}
\toprule
\textbf{Parameter} & \textbf{Basic Model} & \textbf{EM Model}\\
\midrule
$y$ & 0.730 & 1.035\\
 & \footnotesize (0.235, 1.224) & \footnotesize (0.493, 1.577)\\
Log(Length) & -0.297 & -0.327\\
 & \footnotesize (-0.362, -0.232) & \footnotesize (-0.393, -0.262)\\
Mean Temperature & 0.200 & 0.139\\
 & \footnotesize (0.139, 0.262) & \footnotesize (0.077, -0.262)\\
Mean Wind Speed & 0.032 & 0.031\\
 & \footnotesize (0.003, 0.060) & \footnotesize (0.001, 0.061)\\
Max Gust Speed & -0.003 & -0.007\\
 & \footnotesize (-0.016, 0.010) & \footnotesize (-0.021, 0.006)\\
Rainfall & 0.007 & -0.024\\
 & \footnotesize (-0.028, 0.041) & \footnotesize (-0.057, 0.009)\\
Rainfall 4-Hour & -0.002 & 0.010\\
 & \footnotesize (-0.012, 0.009) & \footnotesize (-0.001, 0.021)\\
Intercept & -0.927 & -0.967\\
 & \footnotesize (-1.124, -0.729) & \footnotesize (-1.163, -0.771)\\
\bottomrule
\end{tabular}
\end{table}
nocite: | @stan, @lme4, @gamm4, @Rlang, @wunderground, @pdxrain, @ridereport ...