Prediction

Five films from 2016 were selected from the BoxOfficeMojo.com, IMDb, and Rotten Tomatoes websites and predictions were rendered using the four of the top ten models from the model evaluation section:
Empirical Bayes-Local (Median Prediction Model)
Zellner's g-prior (Median Predictive Model)
AIC (Median Predictive Model)
BIC (Median Predictive Model)

Prediction Data

The five movies selected for prediction (r kfigr::figr(label = "cases", prefix = TRUE, link = TRUE, type="Table")) were so chosen to present a range of values for audience score and predictor values across the parameter space. Audience scores ranged from 34 to 87. The number of IMDb votes varied from approximately 8000 votes for Synchronicity, to over 460,000 for Suicide Squad. Similarly, critics' scores stretched from a low 26 points to a near perfect 98. Results at the Academy varied significantly among the chosen films in order to provide a reasonably diverse sampling. The data was obtained from the IMDb and boxofficemojo web sites.

r kfigr::figr(label = "cases", prefix = TRUE, link = TRUE, type="Table"): Movies selected for prediction

names(cases) <- c("Title", "Feature", "Drama", "Run", "R", "Year", "Oscar", "Summer", "IMDb Rating", "Critics",
                  "Critics (Log)","BP Nom", "BP Win", "Actor", "Actress", "Dir", "Top", "# Votes", "# Votes (Log)", 
                  "# Votes (sqrt)", "Audience")
knitr::kable(cases, digits = 1) %>%
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

Prediction Results

r kfigr::figr(label = "results", prefix = TRUE, link = TRUE, type="Table") summarizes the audience score predictions, true values, and mean squared errors for each prior.

r kfigr::figr(label = "results", prefix = TRUE, link = TRUE, type="Table"): Prediction Results

knitr::kable(predictions, digits = 1) %>%
  kableExtra::kable_styling(bootstrap_options = c("hover", "condensed", "responsive"), full_width = T, position = "center")

All models under-predicted audience scores for Split, a 'middle-of-the-pack' film with no Academy awards and critics scores and IMDB ratings in the 70th percentile. By contrast, all models over-predicted audience scores for Moonlight, a film enjoyed by the critics and the Academy, but had the 2nd lowest number of votes. All models under-predicted audience score for Rogue One, a blockbuster film in the 200 box office category, with over 400,000 votes, and the highest IMDb rating of the group. The largest prediction errors were for Synchronicity, a relatively unknown outlier which was panned by the critics and audiences alike. Predictions for Suicide Squad, the most frequently rated of the five, were significantly lower than the true audience scores.




DataScienceSalon/Bayesian-Regression documentation built on May 29, 2019, 12:06 a.m.