knitr::opts_chunk$set( collapse = TRUE, warning = FALSE, message = FALSE, comment = "#>" )
An Election is one of the main pillars of democracy. it presents the will of the people and determines a country's leadership.
In Indonesia, elections are highly anticipated, because they provide an opportunity for all people to influence the direction of their country. The results and voting patterns in an election are of great public interest, but often data is not easy to access or work with. This project aims to increase the accessibility of Indonesian election data, which would be particularly beneficial to researchers, journalists and NGOs.
This project will discuss the Indonesian Legislative Election and Presidential Election of 2014. This Indonesian Legislative Election was held in Indonesia on 9 April 2014 to elect members of the Regional Representative Council (DPD), Members of People's Representative Council (DPR) and members of the regional assemblies at the provinces.
The 2014 Presidential Election were held in Indonesia on 9 July. According to the 2008 election law, the nominations of the candidates for the presidential election may only be made by a party (or coalition of parties) which has at least 20% of the seats in the national parliament (the DPR) or which received 25% of national votes in the previous national legislative election for the DPR.
library(csv) library(here) library(dplyr) library(purrr) library(readxl) library(ggplot2) library(scales) library(knitr) library(indoelectdata) data(Legislative_Election) data(totalvotes) data(Totalseats)
data_dictionary <- data.frame( Variable = names(Legislative_Election), details= c( "Names", "PartyNm", "Electorate", "Provinces" ))
library(knitr) kable(data_dictionary)
glimpse(Legislative_Election) %>% head %>% kable
in this part we can summarise the details information about the party that won the legislative Election
who_won <- Legislative_Election %>% group_by(PartyNm) %>% tally() %>% arrange(desc(n)) #inspect who_won %>% kable()
ggplot(who_won, aes(reorder(PartyNm, n), n)) + geom_point(size = 4) + coord_flip() + scale_y_continuous(labels = comma) + theme_bw() + ylab("Total number of electorates") + xlab("Political Parties") + theme(text = element_text(size = 8))
The result above matches with the official legislative election result in Indonesia 2014, showing that PDIP party winning the most (109)
in this part is an alternative way to see the party received the most votes in the Legislative election
totalvotes <- totalvotes %>% mutate(Votes = as.numeric(Votes)) total_votes_for_parties <- totalvotes %>% select(PartyNm, Votes) %>% group_by(PartyNm) %>% summarise(tvotes = sum(Votes)) %>% ungroup() %>% arrange(desc(tvotes)) #inspect total_votes_for_parties %>% kable
the result above shows a similar result, which the PDIP party received the most ordinary votes. and can show in the plot below:
ggplot(total_votes_for_parties, aes(reorder(PartyNm, tvotes), tvotes)) + geom_point(size = 4) + coord_flip() + scale_y_continuous(labels = comma) + theme_bw() + ylab("Total votes") + xlab("Political Parties") + theme(text = element_text(size = 8))
in this part we can see the party that getting the higest number of seats in parliament.
Totalseats <- Totalseats %>% mutate(Totseats = as.numeric(Totseats)) Total_seats_for_parties <- Totalseats %>% select(PartyNm, Totseats) %>% group_by(PartyNm) %>% summarise(tot_seats = sum(Totseats)) %>% ungroup() %>% arrange(desc(tot_seats)) #inspect Total_seats_for_parties %>% kable
from the result above shows that PDIP Party getting the number of seats in Indonesia Parliament, and we can see in the plot below
ggplot(Total_seats_for_parties, aes(reorder(PartyNm, tot_seats), tot_seats)) + geom_point(size = 4) + coord_flip() + scale_y_continuous(labels = comma) + theme_bw() + ylab("Total seats") + xlab("Political Parties") + theme(text = element_text(size=8))
this is the interesting part, we can see which provinces getting the most seats in Indonesia parliament, because 33 provinces joined the Legislative election in 2014
Total_seats_for_provinces <- Totalseats %>% select(Provinces, Totseats) %>% group_by(Provinces) %>% summarise(tot_seats = sum(Totseats)) %>% ungroup() %>% arrange(desc(tot_seats)) Total_seats_for_provinces %>% kable
From the results above shows that Jawa Barat get the highest number seats in Indonesia Parliament with total 91 seats, we can see from the plot below as well.
ggplot(Total_seats_for_provinces, aes(reorder(Provinces, tot_seats), tot_seats)) + geom_point(size = 4) + coord_flip() + scale_y_continuous(labels = comma) + theme_bw() + ylab("Total seats") + xlab("Provinces") + theme(text = element_text(size=8))
in this part to answer the question for details information for each candidates, who did the best
totalvotes <- totalvotes %>% mutate(Votes = as.numeric(Votes)) total_votes_for_candidate <- totalvotes %>% select(Name, Votes) %>% group_by(Name) %>% summarise(tvotes = sum(Votes)) %>% ungroup() %>% arrange(desc(tvotes)) total_votes_for_candidate %>% head %>% kable
the results showed that My Esti Wijayati get the most votes 553
next, we can look at the proportion of votes, in this case we can see who won most votes and who won least votes
who_won_most_votes_prop <- totalvotes %>% group_by(Electorate, Name) %>% summarise(sum_votes = sum(Votes)) %>% mutate(prop_votes = round(sum_votes / sum(sum_votes), 3), sum_votes = prettyNum(sum_votes, ",")) %>% ungroup %>% arrange(desc(prop_votes)) who_won_most_votes_prop %>% head %>% kable
from the result above we can see that Elnino M Husein Mohi from gorontalo won the most votes proportion.
who_won_least_votes_prop <- totalvotes %>% filter(Provinces == "Bengkulu") %>% group_by(Electorate, Name) %>% summarise(sum_votes = sum(Votes)) %>% mutate(prop_votes = round(sum_votes / sum(sum_votes), 3), sum_votes = prettyNum(sum_votes, ",")) %>% ungroup %>% arrange(desc(prop_votes)) who_won_least_votes_prop %>% head %>% kable
from the result above we can see that Elva Hartati received the smallest proportion votes with 11%
totalvotes <- totalvotes %>% mutate(Votes = as.numeric(Votes)) P <- totalvotes %>% group_by(Electorate, Provinces) %>% mutate(Totalvotes = sum(Votes)) %>% mutate(ProportionPDIP = round(sum(Votes[PartyNm == "Partai Demokrasi Indonesia Perjuangan"]) / Totalvotes, 3)) %>% arrange(desc(ProportionPDIP)) %>% distinct_at(vars(Electorate, Provinces, ProportionPDIP, Totalvotes)) province_plots <- P %>% ungroup %>% group_nest(Provinces) %>% # only provinces with >1 electorates filter(map_int(data, nrow) > 1) %>% mutate(province_plots = map(data, ~ggplot(.x, aes( x = ProportionPDIP, y = reorder(Electorate, ProportionPDIP), size = Totalvotes, )) + geom_point(size = 3) + ylab("Electorate") + theme_bw(base_size = 7) + theme(legend.position="none") )) cowplot::plot_grid(plotlist = province_plots$province_plots)
from results above shows the plots proportion PDIP party for each electorate/provinces
below will show the map of the total seats for each provinces
library(rgdal) library(rgeos) library(ggthemes) # Read in shapefile my_map <- readOGR("Shapefile/INDONESIA_PROP.shp") # Fortify (ggplot2 object) nat_map <- ggplot2::fortify(my_map %>% gSimplify(tol = 0.001)) # Add some details to differentiate between group and piece nat_map$group <- paste("g",nat_map$group,sep=".") nat_map$piece <- paste("p",nat_map$piece,sep=".") # Add Propinsi names nms <- my_map@data %>% select(ID, Propinsi) %>% rename(id = ID) %>% mutate(id = as.numeric(id) - 1) nat_map$id <- as.numeric(nat_map$id) nat_map <- left_join(nat_map, nms, by="id") # Get centroids for each region polys <- as(my_map, "SpatialPolygons") library(purrr) centroid <- function(i, polys) { ctr <- sp::Polygon(polys[i])@labpt data.frame(long_c=ctr[1], lat_c=ctr[2]) } centroids <- seq_along(polys) %>% purrr::map_df(centroid, polys=polys) # Load nat_data ### NOTE: ADD DATA TO THE `nat_data` DATAFRAME THAT YOU WANT TO USE IN THE PLOT ### NOTE: THE `nat_map` CONTAINS THE POLYGONS AND `nat_data` CONTAINS THE DATA totalseats <- Total_seats_for_provinces %>% slice(1:32) %>% select(tot_seats) nat_data <- my_map@data %>% rename(id = ID) nat_data <- data.frame(centroids, nat_data, totalseats) # Plot ggplot(aes(map_id= id), data= nat_data) + geom_map(aes(fill = Propinsi), map= nat_map) + expand_limits(x=nat_map$long, y=nat_map$lat) + theme_map() + coord_equal() + guides(fill = F) + labs ( title = "Total seats for provinces in Election 2014 ", subtitle = "The mapping of the provinces who got the highest number of seats in Legislative Election 2014", caption= "data source : Indonesia Central Bureau of Statistics") #plot total seats for provinces ggplot(aes(map_id= id), data= nat_data) + geom_map(aes(fill = tot_seats ), map= nat_map) + expand_limits(x=nat_map$long, y=nat_map$lat) + theme_map() + coord_equal() + labs ( title = "Total seats for provinces in Election 2014 ", subtitle = "The mapping of the provinces who got the highest number of seats in Legislative Election 2014", caption= "data source : Indonesia Central Bureau of Statistics")
in this part to answer the question how the election result correlate with census data, we want to know whether there is positive correlation or negative correlation.
load('../data/IDNS2010.rdata') combined <- sort(union(levels(totalvotes$Provinces), levels(IDNS2010$Provinces))) census_and_electionIDN <- left_join(mutate(totalvotes, Provinces=factor(Provinces, levels=combined)), mutate(IDNS2010, Provinces=factor(Provinces, levels=combined))) #subset only the columns we want for the model census_and_election_subset <- census_and_electionIDN %>% ungroup %>% select(Provinces, Votes, PopulationIDN:PopulationProjection2035) %>% rename(tot_votes = Votes) %>% distinct(Provinces, .keep_all = TRUE) census_and_election_subsetplot <- census_and_election_subset %>% ungroup %>% select(tot_votes, PopulationIDN:PopulationProjection2035) census_and_election_subsetplot$tot_votes <- as.numeric(census_and_election_subsetplot$tot_votes) census_and_election_subsetplot$Medianage <- as.numeric(census_and_election_subsetplot$Medianage) library(broom) library(purrrlyr) library(corrplot) IDN <- cor(census_and_election_subsetplot [, c(2:ncol(census_and_election_subsetplot))], use = "pairwise.complete.obs") corrplot.mixed(IDN, lower = "ellipse", upper = "number", tl.pos = "lt", tl.cex = 0.5, tl.col = "black", number.cex= 0.5) #more readable numbers options(scipen = 10) census_variables <- names(IDNS2010)[-c(1:3)] #compute the multiple regressions census_and_electionIDN$Votes <- as.numeric(census_and_electionIDN$Votes) census_and_electionIDN$Medianage <- as.numeric(census_and_electionIDN$Medianage) multiple_regression_model <- census_and_electionIDN %>% ungroup %>% select( Votes, PopulationIDN:PopulationProjection2035) %>% lm(Votes ~ ., data = .) multiple_regression_model %>% glance %>% dmap(round, 3) %>% kable
from the result above we can see that there is negative strong correlation between election and census data because the P-value bigger than 0,05
#find the variables with a significant effect #in this part we want to find the varialbes with a significant effect multiple_regression_model %>% tidy %>% filter(p.value < 0.05) %>% dmap_if (is.numeric, round, 3) %>% arrange(p.value) %>% kable
the result above showed that the variable Unemployment2018 and Incomeff has the significant effect with election with the p-value less than 0,05
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.