To get up and running for the rain prediction kaggle competition, you can first install the kgRainPredictR
package from github, via:
devtools::install_github("potterzot/kgRainPredictR")
It might be better though to just clone the git repository, since that will give you easy access to the package directory, data folder, tests, etc... If you do so, then from R studio you may want to build and reload the package by clicking on the button in Rstudio or just issuing R CMD INSTALL --no-multiarch --with-keep.source kgRainPredictR
from your shell.
you should first download the competition data and save it in a directory. The data-raw
directory of this repository comes with a utility script to convert the data from zipped csv files to ff files. This is not strictly necessary. ffdf files may provide some speed boost, at least for initial load, but fread will allow for loading only select columns.
Once you have it in either csv or ffdf form, you can load it with: ``` {r eval=FALSE} library(ff) ffload("data/train", overwrite = TRUE) df_train <- list( data = as.data.frame(train[,c("Id", "minutes_past", "radardist_km", "Ref", "Expected")]), labels = as.data.frame(train[,c("Expected")]) )
If instead you have csv files, you can use `data.table` and `fread`: ``` {r eval=FALSE} library(data.table) df_train <- list( data = fread("data/train.csv", select = c(1,2,4)) labels = fread("data/train.csv", select = c(24)) )
kgRainPredictR
includes marshall_palmer
, a function to reproduce the data in the sample submission file. This calculates the expected rain as a function of the reflectivity, weighted by the time of measure.
``` {r eval=FALSE} library(ff) library(dplyr) library(lazyeval) library(kgRainPredictR) ffload("data/test", overwrite = TRUE, rootpath = "data/") df_test <- as.data.frame(test[,c("Id", "minutes_past", "Ref")])
res <- group_by(df_test, Id) %>% summarize(Expected = marshall_palmer(Ref, minutes_past))
create_submission(res, "submission.csv") ```
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.