The idea is to create one general pollyvote object (a list) per election. It holds all data in one data set, together with functions that describe how to create predictions and calculate prediction errors. The created object is assigned the object class "pollyvote".
Another special feature is that only the raw data (e.g. poll data, election results, ...) and the functions that create aggregated predictions from it (prediction functions, aggregation functions, error calculation functions, ...) are saved. Calculated results such as predictions or error calculations are not saved in the pollyvote object. This allows the data underlying the predictions to be updated at any time.
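The design described above can be illustrated with a small base R sketch. It is not the package's implementation: the names `make_container`, `add_rows`, `run_prediction` and the class `toy_container` are hypothetical, and the numbers are made up. The sketch only shows why storing functions instead of results keeps predictions in sync with the data.

```r
# Hypothetical miniature container; NOT the pollyvoter implementation.
make_container <- function() {
  obj <- list(
    # raw data only
    data = data.frame(party = character(0), percent = numeric(0)),
    # predictions are stored as functions of the raw data
    predictions = list(
      mean_share = function(data) {
        aggregate(percent ~ party, data = data, FUN = mean)
      }
    )
  )
  class(obj) <- "toy_container"
  obj
}

# append new raw data to the container
add_rows <- function(obj, newdata) {
  obj$data <- rbind(obj$data, newdata)
  obj
}

# evaluate a stored prediction function on the current data
run_prediction <- function(obj, method) {
  obj$predictions[[method]](obj$data)
}

toy <- make_container()
toy <- add_rows(toy, data.frame(party = c("A", "B"), percent = c(40, 60)))
first <- run_prediction(toy, "mean_share")

# after new data arrives, the same call reflects the update
toy <- add_rows(toy, data.frame(party = c("A", "B"), percent = c(50, 50)))
second <- run_prediction(toy, "mean_share")
```

Because no result is cached in the object, `second` differs from `first` without any extra bookkeeping.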
This section shows the most important functionalities of the pollyvote package using examples.
    library(devtools)
    load_all()
With the function create_pollyvote you can create an empty pollyvote object. It is also possible to define permitted values for the respective columns of the data in this pollyvote object.
    # load the pollyvoter package
    library("pollyvoter")
    # create an empty pollyvote object that only allows 'D' as a country
    pv = create_pollyvote(perm_countries = "D")
Use add_data to include election data. Additional arguments allow you to specify information about the data, e.g. country, region, or source type.
    # load example data from the package
    data("polls_individual")
    # add data to the pollyvote object
    # note how additional columns in the data set can be specified
    # as additional arguments to add_data()
    pv = add_data(pv, newdata = polls_individual, country = "D",
                  region = "national", source_type = "poll", election = "BTW")
    head(get_data(pv))
Use add_election_result to add election results to the pollyvote container. Election results are necessary for error calculation. The data frame of election results must contain a 'date' column, which can be POSIXct, POSIXlt, or a character string in the format %Y-%m-%d.
    # load the election results of Bundestagswahl 2013
    data("election_result")
    election_result$date = strptime("2013-09-22", format = "%Y-%m-%d")
    # add the election result to the pollyvote container
    pv = add_election_result(pv, "BTW 2013", election_result)
    head(get_election_result(pv, election_name = "BTW 2013"))
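To make the three accepted date representations concrete, here is a minimal base R sketch of how such a column could be normalised. The helper `normalize_date` is hypothetical and is not a pollyvoter function, and the choice of UTC as the time zone is an assumption made only for the example.

```r
# Hypothetical helper, NOT part of pollyvoter: bring the three accepted
# date representations to POSIXct.
normalize_date <- function(x) {
  if (inherits(x, "POSIXct")) return(x)
  if (inherits(x, "POSIXlt")) return(as.POSIXct(x))
  if (is.character(x)) {
    # time zone fixed to UTC for the example (an assumption)
    parsed <- as.POSIXct(strptime(x, format = "%Y-%m-%d", tz = "UTC"))
    if (anyNA(parsed)) stop("dates must use the format %Y-%m-%d")
    return(parsed)
  }
  stop("unsupported class for the date column")
}

from_character <- normalize_date("2013-09-22")
from_posixlt <- normalize_date(strptime("2013-09-22", format = "%Y-%m-%d", tz = "UTC"))
```

A string in another layout, such as "22.09.2013", fails to parse and triggers the error branch instead of silently producing a missing value.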
Predictions are saved as functions in the pollyvote container. They can be called using predict.pollyvote(). A set of predefined prediction functions is available as soon as a pollyvote object is created. Aggregation functions are handled as predictions.
    # see which predictions are available
    names(pv$predictions)
    # # or use the print function
    # pv
    # pollyvoter:::print.pollyvote(pv)

    # see what the 'pollyvote' function does exactly
    pv$predictions$pollyvote
    # # there is also a help page for initial functions
    # ?initial_prediction_pollyvote

    # call this function on your pollyvote container
    pred <- predict(pv, method = "pollyvote")
    head(pred, 20)
Apply the plot function to create a graphical representation of a prediction. Since prediction results are not stored, the prediction method must be specified.
    # returns a ggplot2 object that can be further modified
    p <- plot(pv, .prediction_method = "pollyvote")
    library("ggplot2")
    p + scale_colour_brewer(type = "qual") + theme_bw(15)
The calculated error is the absolute difference between election results and predicted values.
    error_calc_pred <- error_calc(pv, "prediction_election",
                                  prediction = "pollyvote", election = "BTW 2013")
    error_calc_pred[1:10, c("party", "percent", "percent.true", "government", "error")]
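The error definition itself, the absolute difference between predicted and true shares, can be written in a few lines of base R. The sketch below uses made-up numbers for three fictional parties and does not call the package; only the column names `percent`, `percent.true` and `error` are borrowed from the output shown above.

```r
# Made-up predictions and results for three fictional parties
pred <- data.frame(party = c("A", "B", "C"),
                   percent = c(41.0, 25.5, 8.0))
result <- data.frame(party = c("A", "B", "C"),
                     percent.true = c(41.5, 25.7, 4.8))

# match predictions with results by party
joined <- merge(pred, result, by = "party")

# absolute difference between predicted and true share
joined$error <- abs(joined$percent - joined$percent.true)
joined
```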
A confidence interval is just a special case of error calculation. Use an error calculation function with ci = TRUE and an alpha value.
    error_pred_conf_int <- error_calc(pv, "prediction_election",
                                      prediction = "pollyvote", election = "BTW 2013",
                                      ci = TRUE, alpha = 0.3)
    error_pred_conf_int[1:10, c("party", "percent", "percent.true", "error",
                                "mean_error", "ci_lower", "ci_upper")]
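To show what role alpha plays, the following base R sketch builds a (1 - alpha) interval around a mean error. This is an assumption made for illustration: the text does not state how the package derives ci_lower and ci_upper, so the t-based formula and the made-up error values below should not be read as the package's method.

```r
# Made-up absolute errors, e.g. from several past predictions
errors <- c(0.5, 1.2, 0.8, 1.5, 1.0)
alpha <- 0.3  # gives a 70% interval

n <- length(errors)
mean_error <- mean(errors)

# ASSUMPTION: t-based interval around the mean error;
# the package may compute its interval differently
half_width <- qt(1 - alpha / 2, df = n - 1) * sd(errors) / sqrt(n)
ci_lower <- mean_error - half_width
ci_upper <- mean_error + half_width

# a smaller alpha gives a wider interval
half_width_95 <- qt(1 - 0.05 / 2, df = n - 1) * sd(errors) / sqrt(n)
```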
This feature allows the calculation of predictions for coalitions, where a coalition is made up of one or more parties. Important optional arguments for this function are threshold and threshold_handle. threshold is the minimum value that a party needs to reach on a given day in order to be considered in coalition calculations. threshold_handle specifies how coalitions that contain parties with a prediction below the threshold should be handled:
- If threshold_handle = "omit", the prediction is not calculated for coalitions in which at least one party has a share below the threshold. NA is returned instead.
- If threshold_handle = "ignore", predictions are made for all coalitions. Parties with a share below the threshold are not included in the overall score of the coalition.
    coalitions = list(c("CDU/CSU", "SPD"), c("spd", "linke", "afd"))
    coalitions_percentages = calc_coalitions(pv, coalitions)
    # Coalition percentages in a format appropriate for visualisation
    coalitions_percentages = calc_coalitions(pv, coalitions, for.ggplot2 = TRUE)
    # Coalition percentages with a threshold value. By default, threshold_handle = "omit"
    coalitions_percentages = calc_coalitions(pv, coalitions, threshold = 5, for.ggplot2 = TRUE)
    # Limit the days in the prediction of coalition percentages
    coalitions_percentages = calc_coalitions(pv, coalitions, limitdays = 10, for.ggplot2 = TRUE)
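The two threshold rules can be made concrete with a small base R sketch for a single day. The function `coalition_share` is hypothetical and only mimics the behaviour described in the text; it is not the package's calc_coalitions, and the shares are made up.

```r
# Hypothetical illustration of the "omit" and "ignore" rules;
# NOT the pollyvoter implementation.
coalition_share <- function(shares, coalition, threshold = 0,
                            threshold_handle = c("omit", "ignore")) {
  threshold_handle <- match.arg(threshold_handle)
  members <- shares[coalition]
  below <- members < threshold
  if (any(below)) {
    # "omit": no prediction for a coalition with a party below the threshold
    if (threshold_handle == "omit") return(NA_real_)
    # "ignore": drop the parties below the threshold and sum the rest
    members <- members[!below]
  }
  sum(members)
}

# made-up shares for three fictional parties on one day
shares <- c(A = 40, B = 25, C = 4)

all_above <- coalition_share(shares, c("A", "B"), threshold = 5)
omitted <- coalition_share(shares, c("B", "C"), threshold = 5)
ignored <- coalition_share(shares, c("B", "C"), threshold = 5,
                           threshold_handle = "ignore")
```

With a threshold of 5, the coalition of A and B sums both shares, the coalition of B and C yields NA under "omit", and under "ignore" it counts only B.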
To save the results of predictions or error calculations, use the function write.pollyvote(). It is important to specify either the prediction method (argument prediction) or the error calculation method (argument error_calculation).
    write.pollyvote(pv, file = "pred.csv", method = "write.table",
                    prediction = "pollyvote")
    write.pollyvote(pv, file = "error_calc.csv", method = "write.table",
                    error_calculation = "prediction_election")