Project : Visualisation
Create a web-based interactive visualization app that interacts with R. Your app should include, at a minimum:
For this project, we use the Forest Cover Type Prediction data provided by Kaggle (https://www.kaggle.com/c/forest-cover-type-prediction/data).
Clone this project. You should have R and RStudio installed on your system.
- From the terminal, go to the folder "/visualisation/R" and run 'Rscript install.R' to install the packages that "app.R" needs, i.e. shiny, plotly, gridExtra and randomForest.
- In RStudio, run the command shiny::runGitHub('visualisation', 'mathurshikhar'), or open "app.R" and run it directly.
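The contents of "install.R" are not reproduced here. As a rough sketch only, a script of this kind might check which of the four packages are missing and install just those. The helper names missing_packages and install_missing, and the choice of CRAN mirror, are assumptions made for illustration.

```r
# Hypothetical sketch of an install script; the real install.R may differ.
required <- c("shiny", "plotly", "gridExtra", "randomForest")

# Return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Install whatever is missing (assumes a CRAN mirror is reachable).
install_missing <- function(pkgs = required) {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) {
    install.packages(todo, repos = "https://cloud.r-project.org")
  }
  invisible(todo)
}
```

Calling install_missing() at the end of such a script would perform the installation; nothing is installed until it is called.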
The study area includes four wilderness areas located in the Roosevelt National Forest of Northern Colorado. Each observation is a 30m x 30m patch. We are asked to predict an integer classification for the forest cover type. The seven types are:
1 - Spruce/Fir
2 - Lodgepole Pine
3 - Ponderosa Pine
4 - Cottonwood/Willow
5 - Aspen
6 - Douglas-fir
7 - Krummholz
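For plotting, it can help to turn the integer codes above into a labelled factor. The sketch below uses made-up codes rather than real data, and the helper name label_cover_type is hypothetical.

```r
# Labels taken from the list above, in code order 1..7.
cover_labels <- c("Spruce/Fir", "Lodgepole Pine", "Ponderosa Pine",
                  "Cottonwood/Willow", "Aspen", "Douglas-fir", "Krummholz")

# Convert integer codes to a labelled factor.
label_cover_type <- function(codes) {
  factor(codes, levels = seq_along(cover_labels), labels = cover_labels)
}

# Toy input (made-up codes, not real data).
codes <- c(1, 7, 3, 3, 5)
labelled <- label_cover_type(codes)
print(table(labelled))
```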
The training set ("train.csv") contains both features and the Cover_Type. The test set contains only the features. We must predict the Cover_Type for every row in the test set (565892 observations).
The data fields are:
Elevation - Elevation in meters
Aspect - Aspect in degrees azimuth
Slope - Slope in degrees
Horizontal_Distance_To_Hydrology - Horizontal distance to nearest surface water features
Vertical_Distance_To_Hydrology - Vertical distance to nearest surface water features
Horizontal_Distance_To_Roadways - Horizontal distance to nearest roadway
Hillshade_9am (0 to 255 index) - Hillshade index at 9am, summer solstice
Hillshade_Noon (0 to 255 index) - Hillshade index at noon, summer solstice
Hillshade_3pm (0 to 255 index) - Hillshade index at 3pm, summer solstice
Horizontal_Distance_To_Fire_Points - Horizontal distance to nearest wildfire ignition points
Wilderness_Area (4 binary columns, 0 = absence or 1 = presence) - Wilderness area designation
Soil_Type (40 binary columns, 0 = absence or 1 = presence) - Soil Type designation
Cover_Type (7 types, integers 1 to 7) - Forest Cover Type designation
The wilderness areas are:
1 - Rawah Wilderness Area
2 - Neota Wilderness Area
3 - Comanche Peak Wilderness Area
4 - Cache la Poudre Wilderness Area
The soil types are:
1 Cathedral family - Rock outcrop complex, extremely stony.
2 Vanet - Ratake families complex, very stony.
3 Haploborolis - Rock outcrop complex, rubbly.
4 Ratake family - Rock outcrop complex, rubbly.
5 Vanet family - Rock outcrop complex, rubbly.
6 Vanet - Wetmore families - Rock outcrop complex, stony.
7 Gothic family.
8 Supervisor - Limber families complex.
9 Troutville family, very stony.
10 Bullwark - Catamount families - Rock outcrop complex, rubbly.
11 Bullwark - Catamount families - Rock land complex, rubbly.
12 Legault family - Rock land complex, stony.
13 Catamount family - Rock land - Bullwark family complex, rubbly.
14 Pachic Argiborolis - Aquolis complex.
15 unspecified in the USFS Soil and ELU Survey.
16 Cryaquolis - Cryoborolis complex.
17 Gateview family - Cryaquolis complex.
18 Rogert family, very stony.
19 Typic Cryaquolis - Borohemists complex.
20 Typic Cryaquepts - Typic Cryaquolls complex.
21 Typic Cryaquolls - Leighcan family, till substratum complex.
22 Leighcan family, till substratum, extremely bouldery.
23 Leighcan family, till substratum - Typic Cryaquolls complex.
24 Leighcan family, extremely stony.
25 Leighcan family, warm, extremely stony.
26 Granile - Catamount families complex, very stony.
27 Leighcan family, warm - Rock outcrop complex, extremely stony.
28 Leighcan family - Rock outcrop complex, extremely stony.
29 Como - Legault families complex, extremely stony.
30 Como family - Rock land - Legault family complex, extremely stony.
31 Leighcan - Catamount families complex, extremely stony.
32 Catamount family - Rock outcrop - Leighcan family complex, extremely stony.
33 Leighcan - Catamount families - Rock outcrop complex, extremely stony.
34 Cryorthents - Rock land complex, extremely stony.
35 Cryumbrepts - Rock outcrop - Cryaquepts complex.
36 Bross family - Rock land - Cryumbrepts complex, extremely stony.
37 Rock outcrop - Cryumbrepts - Cryorthents complex, extremely stony.
38 Leighcan - Moran families - Cryaquolls complex, extremely stony.
39 Moran family - Cryorthents - Leighcan family complex, extremely stony.
40 Moran family - Cryorthents - Rock land complex, extremely stony.
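Because Wilderness_Area and Soil_Type are each spread over several binary columns, it can be convenient to collapse a block of them into a single code per row. The sketch below assumes the columns are named with a numeric suffix (Wilderness_Area1 to Wilderness_Area4, Soil_Type1 to Soil_Type40) and that exactly one column per block is 1 in every row. The toy values are made up and the helper name collapse_indicators is hypothetical.

```r
# Collapse a block of 0/1 indicator columns into one integer code per row.
# Assumes exactly one column in the block is 1 for each row.
collapse_indicators <- function(df, prefix) {
  cols <- grep(paste0("^", prefix, "[0-9]+$"), names(df), value = TRUE)
  # Order columns by numeric suffix so the code matches the column number.
  suffix <- as.integer(sub(prefix, "", cols, fixed = TRUE))
  cols <- cols[order(suffix)]
  sort(suffix)[max.col(as.matrix(df[cols]), ties.method = "first")]
}

# Toy data with made-up values (column names are an assumption).
toy <- data.frame(
  Wilderness_Area1 = c(1, 0, 0),
  Wilderness_Area2 = c(0, 0, 1),
  Wilderness_Area3 = c(0, 0, 0),
  Wilderness_Area4 = c(0, 1, 0)
)
area_names <- c("Rawah", "Neota", "Comanche Peak", "Cache la Poudre")
toy$Wilderness_Area <- factor(collapse_indicators(toy, "Wilderness_Area"),
                              levels = 1:4, labels = area_names)
print(toy$Wilderness_Area)
```

The same helper could be applied with the prefix "Soil_Type" to obtain a soil type code between 1 and 40.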
I made the predictions in "predict.R" using a random forest and wrote the data together with the predictions to "out.csv". The Shiny app is in "app.R"; Shiny is used to display all the graphs/visualisations.
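The code of "predict.R" is not shown here. As an illustration of the general pattern only (fit a random forest on labelled rows, predict the unlabelled rows, write the result to a CSV), here is a small sketch on synthetic data. The three features, the toy target rule, and the temporary output path are all assumptions.

```r
library(randomForest)

set.seed(1)  # for a reproducible toy example

# Synthetic stand-in for train.csv / test.csv (made-up values, 3 features only).
make_toy <- function(n) {
  data.frame(
    Elevation = runif(n, 1800, 3800),
    Aspect    = runif(n, 0, 360),
    Slope     = runif(n, 0, 50)
  )
}
train <- make_toy(200)
# Toy target: the cover type depends on the elevation band only.
train$Cover_Type <- cut(train$Elevation, breaks = c(1800, 2500, 3200, 3800),
                        labels = c("3", "2", "1"), include.lowest = TRUE)
test <- make_toy(50)

# Fit a random forest classifier and predict the class of each test row.
fit <- randomForest(Cover_Type ~ ., data = train, ntree = 100)
test$Cover_Type <- predict(fit, newdata = test)

# Write features plus predictions, analogous to out.csv.
out_file <- file.path(tempdir(), "out.csv")
write.csv(test, out_file, row.names = FALSE)
```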
The first graph is built using Plotly and ggplot2. It lets you select different x- and/or y-axis variables from pull-down menus. To zoom in, brush the area of interest; to zoom out, double-click. This is a dynamic graph.
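A minimal sketch of such a graph is given below, assuming the pull-down menus are Shiny selectInput controls and the plot is a ggplot2 scatter plot converted with ggplotly. It uses made-up data and is not the code from "app.R". The input ids "x" and "y" and the output id "scatter" are hypothetical.

```r
library(shiny)
library(ggplot2)
library(plotly)

# Toy data standing in for out.csv (made-up values).
set.seed(1)
df <- data.frame(
  Elevation  = runif(100, 1800, 3800),
  Aspect     = runif(100, 0, 360),
  Slope      = runif(100, 0, 50),
  Cover_Type = factor(sample(1:7, 100, replace = TRUE))
)
num_vars <- setdiff(names(df), "Cover_Type")

ui <- fluidPage(
  selectInput("x", "X axis", choices = num_vars, selected = "Elevation"),
  selectInput("y", "Y axis", choices = num_vars, selected = "Slope"),
  plotlyOutput("scatter")
)

server <- function(input, output, session) {
  output$scatter <- renderPlotly({
    # .data[[...]] looks up the column whose name is chosen in the menu.
    p <- ggplot(df, aes(x = .data[[input$x]], y = .data[[input$y]],
                        colour = Cover_Type)) +
      geom_point() +
      labs(x = input$x, y = input$y)
    # ggplotly converts the ggplot into an interactive plotly widget.
    ggplotly(p)
  })
}

app <- shinyApp(ui, server)
```

Running runApp(app) in an interactive session starts the app. With plotly's default settings, dragging a box over the plot zooms in and a double click resets the view.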
The second is a histogram, again built using Plotly. It compares the different forest cover types that were predicted.
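One way to draw a histogram of this kind with Plotly is sketched here on made-up predictions; the axis titles are assumptions and the app's own code may differ.

```r
library(plotly)

# Made-up predicted cover types standing in for the Cover_Type column of out.csv.
set.seed(1)
pred <- data.frame(
  Cover_Type = factor(sample(1:7, 500, replace = TRUE), levels = 1:7)
)

# A histogram trace on a categorical x counts the rows in each class.
fig <- plot_ly(pred, x = ~Cover_Type, type = "histogram")
fig <- plotly::layout(fig,
                      xaxis = list(title = "Predicted cover type"),
                      yaxis = list(title = "Count"))
```

Printing fig in an interactive session, or returning it from renderPlotly in a Shiny server function, displays the chart.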
The third is a boxplot of the factors such as Elevation, Aspect etc., showing the value range of each column.
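The per-column summary that a boxplot displays can be sketched with base R as follows. This is an illustration on made-up values and not necessarily how the app draws its boxplot; the output file name is an assumption.

```r
# Made-up values for three of the numeric columns.
set.seed(1)
df <- data.frame(
  Elevation = runif(200, 1800, 3800),
  Aspect    = runif(200, 0, 360),
  Slope     = runif(200, 0, 50)
)

# Draw one box per column into a temporary PDF file.
# boxplot() also returns the statistics behind each box.
pdf(file.path(tempdir(), "boxplot.pdf"))
bp <- boxplot(df, main = "Value range per column")
invisible(dev.off())

# Rows of bp$stats: lower whisker, lower hinge, median, upper hinge, upper whisker.
print(bp$stats)
```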
The fourth is a varImpPlot, which shows the importance of the factors that were used to predict the forest cover type.
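As a sketch of how variable importance can be obtained from a randomForest fit, the example below trains on synthetic data in which the class depends on Elevation only, so Elevation should come out as the most important variable. The data, the toy target rule, and the output file name are assumptions.

```r
library(randomForest)

set.seed(1)
n <- 300
toy <- data.frame(
  Elevation = runif(n, 1800, 3800),
  Aspect    = runif(n, 0, 360),
  Slope     = runif(n, 0, 50)
)
# Toy target: the class is determined by the elevation band alone.
toy$Cover_Type <- cut(toy$Elevation, breaks = c(1800, 2500, 3200, 3800),
                      labels = c("3", "2", "1"), include.lowest = TRUE)

# importance = TRUE also records the permutation-based importance measure.
fit <- randomForest(Cover_Type ~ ., data = toy, ntree = 100, importance = TRUE)

# type = 2 selects the mean decrease in node impurity (Gini).
imp <- importance(fit, type = 2)
print(imp[order(imp[, 1], decreasing = TRUE), , drop = FALSE])

# Draw the importance plot into a temporary PDF file.
pdf(file.path(tempdir(), "varimp.pdf"))
varImpPlot(fit)
invisible(dev.off())
```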
The fifth is a grid of four graphs plotted with ggplot2, which show the density of four factors, namely Elevation, Aspect, Horizontal Distance To Roadways and Horizontal Distance To Fire Points, for each forest cover type.
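A grid like this can be sketched with ggplot2 and gridExtra as below, assuming one density plot per variable with one filled curve per cover type. The data is made up and the helper name density_plot is hypothetical.

```r
library(ggplot2)
library(gridExtra)

# Made-up values standing in for the four columns and the predicted class.
set.seed(1)
n <- 300
df <- data.frame(
  Elevation                          = runif(n, 1800, 3800),
  Aspect                             = runif(n, 0, 360),
  Horizontal_Distance_To_Roadways    = runif(n, 0, 7000),
  Horizontal_Distance_To_Fire_Points = runif(n, 0, 7000),
  Cover_Type                         = factor(sample(1:7, n, replace = TRUE))
)
vars <- c("Elevation", "Aspect",
          "Horizontal_Distance_To_Roadways",
          "Horizontal_Distance_To_Fire_Points")

# One density plot per variable, with one semi-transparent curve per cover type.
density_plot <- function(var) {
  ggplot(df, aes(x = .data[[var]], fill = Cover_Type)) +
    geom_density(alpha = 0.4) +
    labs(x = var)
}
plots <- lapply(vars, density_plot)

# Arrange the four plots in a 2 x 2 grid without drawing it yet.
grid_of_plots <- arrangeGrob(grobs = plots, ncol = 2)
```

Calling grid::grid.draw(grid_of_plots), or grid.arrange(grobs = plots, ncol = 2), draws the grid on the current graphics device.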