knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE,
  message = FALSE,
  comment = "#>"
)

Exploring Election and Cencus Highly Informative Data Nationally for Indonesia

An Election is one of the main pillars of democracy. it presents the will of the people and determines a country's leadership.

In Indonesia, elections are highly anticipated, because they provide an opportunity for all people to influence the direction of their country. The results and voting patterns in an election are of great public interest, but often data is not easy to access or work with. This project aims to increase the accessibility of Indonesian election data, which would be particularly beneficial to researchers, journalists and NGOs.

This project will discuss the Indonesian Legislative Election and Presidential Election of 2014. This Indonesian Legislative Election was held in Indonesia on 9 April 2014 to elect members of the Regional Representative Council (DPD), Members of People's Representative Council (DPR) and members of the regional assemblies at the provinces.

The 2014 Presidential Election were held in Indonesia on 9 July. According to the 2008 election law, the nominations of the candidates for the presidential election may only be made by a party (or coalition of parties) which has at least 20% of the seats in the national parliament (the DPR) or which received 25% of national votes in the previous national legislative election for the DPR.

Import csv files

library(csv)
library(here)
library(dplyr)
library(purrr)
library(readxl)
library(ggplot2)
library(scales)
library(knitr)
library(indoelectdata)

data(Legislative_Election)
data(totalvotes)
data(Totalseats)
data_dictionary <- data.frame(
  Variable = names(Legislative_Election), 
  details= c(
    "Names",
    "PartyNm", 
    "Electorate",
    "Provinces"
    ))
library(knitr)
kable(data_dictionary)

Parties

glimpse(Legislative_Election) %>% head %>% kable

Exploring the election results

Party that won the Legislative Election

in this part we can summarise the details information about the party that won the legislative Election

who_won <-
  Legislative_Election %>%
  group_by(PartyNm) %>%
  tally() %>%
  arrange(desc(n))

#inspect
who_won %>%
  kable()

plot the party that won

ggplot(who_won,
       aes(reorder(PartyNm, n),
           n)) +
  geom_point(size = 4) +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  theme_bw() +
  ylab("Total number of electorates") +
  xlab("Political Parties") +
  theme(text = element_text(size = 8))

The result above matches with the official legislative election result in Indonesia 2014, showing that PDIP party winning the most (109)

Party receiving the most votes in the Legislative Election

in this part is an alternative way to see the party received the most votes in the Legislative election

totalvotes <- 
  totalvotes %>% 
  mutate(Votes = as.numeric(Votes))

total_votes_for_parties <- 
  totalvotes %>%
  select(PartyNm, Votes) %>%
  group_by(PartyNm) %>%
  summarise(tvotes = sum(Votes)) %>%
  ungroup() %>%
  arrange(desc(tvotes))

#inspect
total_votes_for_parties %>%
  kable

the result above shows a similar result, which the PDIP party received the most ordinary votes. and can show in the plot below:

ggplot(total_votes_for_parties,
       aes(reorder(PartyNm, tvotes),
           tvotes)) +
  geom_point(size = 4) +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  theme_bw() +
  ylab("Total votes") +
  xlab("Political Parties") +
  theme(text = element_text(size = 8))

Party getting the highest number of seats in Parliament

in this part we can see the party that getting the higest number of seats in parliament.

Totalseats <-
  Totalseats %>%
  mutate(Totseats = as.numeric(Totseats))

Total_seats_for_parties <- 
  Totalseats %>%
  select(PartyNm, Totseats) %>%
  group_by(PartyNm) %>%
  summarise(tot_seats = sum(Totseats)) %>%
  ungroup() %>%
  arrange(desc(tot_seats))

#inspect
Total_seats_for_parties  %>%
  kable

from the result above shows that PDIP Party getting the number of seats in Indonesia Parliament, and we can see in the plot below

ggplot(Total_seats_for_parties,
       aes(reorder(PartyNm, tot_seats),
           tot_seats)) +
  geom_point(size = 4) +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  theme_bw() +
  ylab("Total seats") +
  xlab("Political Parties") +
  theme(text = element_text(size=8))

Provinces getting the most seats in parliament

this is the interesting part, we can see which provinces getting the most seats in Indonesia parliament, because 33 provinces joined the Legislative election in 2014

Total_seats_for_provinces <- 
  Totalseats %>%
  select(Provinces, Totseats) %>%
  group_by(Provinces) %>%
  summarise(tot_seats = sum(Totseats)) %>%
  ungroup() %>%
  arrange(desc(tot_seats))

Total_seats_for_provinces %>%
  kable

From the results above shows that Jawa Barat get the highest number seats in Indonesia Parliament with total 91 seats, we can see from the plot below as well.

ggplot(Total_seats_for_provinces,
       aes(reorder(Provinces, tot_seats),
           tot_seats)) +
  geom_point(size = 4) +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  theme_bw() +
  ylab("Total seats") +
  xlab("Provinces") +
  theme(text = element_text(size=8))

in this part to answer the question for details information for each candidates, who did the best

Which candidate did the best?

totalvotes <-
  totalvotes %>%
  mutate(Votes = as.numeric(Votes))

total_votes_for_candidate <- totalvotes %>%
  select(Name, Votes) %>%
  group_by(Name) %>%
  summarise(tvotes = sum(Votes)) %>%
  ungroup() %>%
  arrange(desc(tvotes))

total_votes_for_candidate %>%
  head %>% 
  kable

the results showed that My Esti Wijayati get the most votes 553

next, we can look at the proportion of votes, in this case we can see who won most votes and who won least votes

who_won_most_votes_prop <-
  totalvotes %>%
  group_by(Electorate, Name) %>%
  summarise(sum_votes = sum(Votes)) %>%
  mutate(prop_votes = round(sum_votes / sum(sum_votes), 3),
         sum_votes = prettyNum(sum_votes, ",")) %>%
  ungroup %>%
  arrange(desc(prop_votes))

who_won_most_votes_prop %>%
  head %>%
  kable

from the result above we can see that Elnino M Husein Mohi from gorontalo won the most votes proportion.

who_won_least_votes_prop <-
  totalvotes %>%
  filter(Provinces == "Bengkulu") %>%
  group_by(Electorate, Name) %>%
  summarise(sum_votes = sum(Votes)) %>%
  mutate(prop_votes = round(sum_votes / sum(sum_votes), 3),
         sum_votes = prettyNum(sum_votes, ",")) %>%
  ungroup %>%
  arrange(desc(prop_votes))

who_won_least_votes_prop %>%
  head %>%
  kable

from the result above we can see that Elva Hartati received the smallest proportion votes with 11%

How did each electorate vote in each province?

Summarise and compute proportion of votes for PDIP party

totalvotes <- totalvotes %>%
  mutate(Votes = as.numeric(Votes))

P <- totalvotes %>%
  group_by(Electorate, Provinces) %>%
  mutate(Totalvotes = sum(Votes)) %>%
  mutate(ProportionPDIP = round(sum(Votes[PartyNm == "Partai Demokrasi Indonesia Perjuangan"]) / Totalvotes, 3)) %>%
  arrange(desc(ProportionPDIP)) %>% 
  distinct_at(vars(Electorate, Provinces, ProportionPDIP, Totalvotes))

province_plots <- 
P %>% 
  ungroup %>% 
  group_nest(Provinces) %>% 
  # only provinces with >1 electorates
  filter(map_int(data, nrow) > 1) %>% 
  mutate(province_plots = map(data, 
                              ~ggplot(.x,
                                      aes(
                 x = ProportionPDIP,
                 y = reorder(Electorate,
                             ProportionPDIP),
                 size = Totalvotes,
               )) +
        geom_point(size = 3) +
        ylab("Electorate") +
        theme_bw(base_size = 7) +
        theme(legend.position="none")
        )) 

cowplot::plot_grid(plotlist = province_plots$province_plots)

from results above shows the plots proportion PDIP party for each electorate/provinces

below will show the map of the total seats for each provinces

library(rgdal)
library(rgeos)
library(ggthemes)

# Read in shapefile
my_map <- readOGR("Shapefile/INDONESIA_PROP.shp") 

# Fortify (ggplot2 object)
nat_map <- ggplot2::fortify(my_map %>% gSimplify(tol = 0.001))

# Add some details to differentiate between group and piece
nat_map$group <- paste("g",nat_map$group,sep=".")
nat_map$piece <- paste("p",nat_map$piece,sep=".")

# Add Propinsi names
nms <- my_map@data %>% select(ID, Propinsi) %>% rename(id = ID) %>% mutate(id = as.numeric(id) - 1)
nat_map$id <- as.numeric(nat_map$id)
nat_map <- left_join(nat_map, nms, by="id")

# Get centroids for each region
polys <- as(my_map, "SpatialPolygons")

library(purrr)
centroid <- function(i, polys) {
  ctr <- sp::Polygon(polys[i])@labpt
  data.frame(long_c=ctr[1], lat_c=ctr[2])
}

centroids <- seq_along(polys) %>% purrr::map_df(centroid, polys=polys)

# Load nat_data
### NOTE: ADD DATA TO THE `nat_data` DATAFRAME THAT YOU WANT TO USE IN THE PLOT
### NOTE: THE `nat_map` CONTAINS THE POLYGONS AND `nat_data` CONTAINS THE DATA
totalseats <- Total_seats_for_provinces %>% slice(1:32) %>% select(tot_seats)
nat_data <- my_map@data %>% rename(id = ID)
nat_data <- data.frame(centroids, nat_data, totalseats)

# Plot
ggplot(aes(map_id= id), data= nat_data) + 
  geom_map(aes(fill = Propinsi), map= nat_map) +
  expand_limits(x=nat_map$long, y=nat_map$lat) + 
  theme_map() + coord_equal() +
  guides(fill = F) +
  labs ( title = "Total seats for provinces in Election 2014 ",
         subtitle = "The mapping of the provinces who got the highest number of seats in Legislative Election 2014",
         caption= "data source : Indonesia Central Bureau of Statistics")

#plot total seats for provinces
ggplot(aes(map_id= id), data= nat_data) + 
  geom_map(aes(fill = tot_seats ), map= nat_map) +
  expand_limits(x=nat_map$long, y=nat_map$lat) + 
  theme_map() + coord_equal() +
  labs ( title = "Total seats for provinces in Election 2014 ",
         subtitle = "The mapping of the provinces who got the highest number of seats in Legislative Election 2014",
         caption= "data source : Indonesia Central Bureau of Statistics")

in this part to answer the question how the election result correlate with census data, we want to know whether there is positive correlation or negative correlation.

how do election results correlate with Census data?

load('../data/IDNS2010.rdata')
combined <- sort(union(levels(totalvotes$Provinces), levels(IDNS2010$Provinces)))
census_and_electionIDN <- left_join(mutate(totalvotes, Provinces=factor(Provinces, levels=combined)),
                      mutate(IDNS2010, Provinces=factor(Provinces, levels=combined)))

#subset only the columns we want for the model
census_and_election_subset <-
  census_and_electionIDN %>%
  ungroup %>%
  select(Provinces,
         Votes,
         PopulationIDN:PopulationProjection2035) %>%
  rename(tot_votes = Votes) %>%
  distinct(Provinces, .keep_all = TRUE)

census_and_election_subsetplot <- census_and_election_subset %>%
  ungroup %>%
  select(tot_votes,
         PopulationIDN:PopulationProjection2035)

census_and_election_subsetplot$tot_votes <- as.numeric(census_and_election_subsetplot$tot_votes)
census_and_election_subsetplot$Medianage <- as.numeric(census_and_election_subsetplot$Medianage)

library(broom)
library(purrrlyr)

library(corrplot)
IDN <- cor(census_and_election_subsetplot [, c(2:ncol(census_and_election_subsetplot))],
         use = "pairwise.complete.obs")
corrplot.mixed(IDN,
               lower = "ellipse",
               upper = "number",
               tl.pos = "lt",
               tl.cex = 0.5,
               tl.col = "black",
               number.cex= 0.5)

#more readable numbers

options(scipen = 10) 
census_variables <- names(IDNS2010)[-c(1:3)]

#compute the multiple regressions
census_and_electionIDN$Votes <- as.numeric(census_and_electionIDN$Votes)
census_and_electionIDN$Medianage <- as.numeric(census_and_electionIDN$Medianage)

multiple_regression_model <-
  census_and_electionIDN %>%
  ungroup %>%
  select(
    Votes,
    PopulationIDN:PopulationProjection2035) %>%
  lm(Votes ~ .,
     data = .)

multiple_regression_model %>%
  glance %>%
  dmap(round, 3) %>%
  kable

from the result above we can see that there is negative strong correlation between election and census data because the P-value bigger than 0,05

#find the variables with a significant effect
#in this part we want to find the varialbes with a significant effect

multiple_regression_model %>% 
  tidy %>%
  filter(p.value < 0.05) %>%
  dmap_if (is.numeric, round, 3) %>%
  arrange(p.value) %>%
  kable

the result above showed that the variable Unemployment2018 and Incomeff has the significant effect with election with the p-value less than 0,05



FahroziFahrozi/indoelectdata documentation built on Nov. 21, 2019, 1:55 a.m.