R interface to the Data Retriever.
The Data Retriever automates the tasks of finding, downloading, and cleaning up publicly available data, and then stores the results in a local database or as CSV files. This lets data analysts spend less time cleaning up and managing data, and more time analyzing it.
This package lets you access the Retriever using R, so that the Retriever's data handling can easily be integrated into R workflows.
`rdataretriever` is an R wrapper for the Python-based Data Retriever. This means that Python and the `retriever` package need to be installed first.
Use this if you are new to Python or don't have a local Python installation
Install the `reticulate` package:
install.packages("reticulate")
Install the `retriever` Python package:
library(reticulate)
py_available(initialize = TRUE)
py_install("retriever")
Install the `rdataretriever` R package:
devtools::install_github("ropensci/rdataretriever")
Use this if you are already familiar with Python and have a local Python installation
Install the `reticulate` package:
install.packages("reticulate")
Run the following commands (replacing "/path/to/python" with the path to your Python executable) to install the `retriever` Python package:
library(reticulate)
use_python("/path/to/python")
py_install("retriever")
Install the `rdataretriever` R package:
devtools::install_github("ropensci/rdataretriever")
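After either installation route, it can help to confirm that R can see the `retriever` Python module before loading `rdataretriever`. The following is a minimal sketch of such a check. `python_module_ready()` is a hypothetical helper name, not part of either package; it relies on `reticulate::py_module_available()` and treats any error as "not available".

```r
# Hypothetical helper: TRUE only when reticulate is installed and the
# Python module `module` can be imported from the Python that R uses.
python_module_ready <- function(module = "retriever") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)
  }
  isTRUE(tryCatch(
    reticulate::py_module_available(module),
    error = function(e) FALSE
  ))
}

if (python_module_ready("retriever")) {
  message("The retriever Python package is available.")
} else {
  message("retriever was not found; revisit the installation steps.")
}
```

If the check fails even though `py_install("retriever")` succeeded, one possible cause is that R is pointing at a different Python installation than the one the package was installed into.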
library(rdataretriever)
# List the datasets available via the Retriever
rdataretriever::datasets()
# Install the portal dataset as CSV files in your working directory
rdataretriever::install_csv('portal')
# Download the raw portal dataset files without any processing to the
# subdirectory named data
rdataretriever::download('portal', './data/')
# Install and load a dataset as a list
portal <- rdataretriever::fetch('portal')
names(portal)
head(portal$species)
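Because `fetch()` returns the dataset as a named list, ordinary list tools apply to it. The sketch below assumes each element of that list is a data frame (as the `head(portal$species)` call suggests) and summarises the size of every table. `summarise_tables()` is a hypothetical helper, and the small stand-in list is invented for illustration; the real portal tables and columns will differ.

```r
# Hypothetical helper: one row per table, with its row and column counts.
summarise_tables <- function(dataset) {
  data.frame(
    table = names(dataset),
    rows = unname(vapply(dataset, nrow, integer(1))),
    columns = unname(vapply(dataset, ncol, integer(1))),
    stringsAsFactors = FALSE
  )
}

# Stand-in for the result of rdataretriever::fetch('portal');
# this is made-up data, not the actual dataset contents.
example_dataset <- list(
  species = data.frame(id = c("DM", "DO"), genus = c("Dipodomys", "Dipodomys")),
  plots = data.frame(plot_id = 1:3)
)

table_summary <- summarise_tables(example_dataset)
print(table_summary)
```

With a real dataset, the same call would be `summarise_tables(portal)`.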
Set-up and Requirements
Tools
The `rdataretriever` package supports installation of spatial data into the Postgres DBMS.
Install PostgreSQL and PostGIS
To install PostgreSQL with PostGIS for use with spatial data, please refer to the OSGeo Postgres installation instructions.
We recommend storing your PostgreSQL login information in a `.pgpass` file to avoid supplying the password every time. See the `.pgpass` documentation for more details.
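As a rough illustration, each line of a `.pgpass` file has the form `hostname:port:database:username:password`, with any `:` or `\` inside a field escaped by a backslash, and on Unix-like systems the file must not be readable by other users. The sketch below builds such a line from R. `pgpass_entry()` is a hypothetical helper, the connection values are placeholders, and the entry is written to a temporary file rather than to your real `~/.pgpass`.

```r
# Hypothetical helper: build one .pgpass line, escaping ':' and '\'
# inside each field with a backslash.
pgpass_entry <- function(host, port, database, user, password) {
  escape <- function(x) gsub("([\\\\:])", "\\\\\\1", as.character(x), perl = TRUE)
  fields <- vapply(list(host, port, database, user, password), escape, character(1))
  paste(fields, collapse = ":")
}

# Placeholder values; substitute your own connection details.
entry <- pgpass_entry("localhost", 5432, "yourdatabase", "postgres", "secret:pw")

# Written to a temporary file for demonstration purposes only.
pgpass_path <- tempfile("pgpass")
writeLines(entry, pgpass_path)
Sys.chmod(pgpass_path, mode = "0600")  # restrict access to the owner
```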
After installation, make sure the paths to these tools are added to your system's `PATH`. For help with changing or adding `PATH` variables, please consult an operating system expert.
For example, the paths exported on macOS might look like this:
```shell
export PATH="/Applications/Postgres.app/Contents/MacOS/bin:${PATH}"
export PATH="$PATH:/Applications/Postgres.app/Contents/Versions/10/bin"
```
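One way to check the result from within R is `Sys.which()`, which reports where an executable is found on the `PATH` that the R session sees. The sketch below is illustrative only: `find_tools()` is a hypothetical helper, and the tool list (`psql` plus the PostGIS loaders `raster2pgsql` and `shp2pgsql`) is an assumption about which command-line tools are needed, not something this README states.

```r
# Hypothetical helper: report which command-line tools are on the PATH
# of the current R session.
find_tools <- function(tools) {
  paths <- Sys.which(tools)
  data.frame(
    tool = tools,
    path = unname(paths),
    found = unname(nzchar(paths)),
    stringsAsFactors = FALSE
  )
}

# Assumed tool names; adjust to match your installation.
tool_status <- find_tools(c("psql", "raster2pgsql", "shp2pgsql"))
print(tool_status)
```

Any row with `found = FALSE` points to a tool whose directory is probably still missing from `PATH`.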
Enable PostGIS extensions
If you have Postgres set up, enable the PostGIS extensions. To do this, run the following commands using either the Postgres CLI (`psql`) or a GUI (pgAdmin).
For the `psql` CLI:
psql -d yourdatabase -c "CREATE EXTENSION postgis;"
psql -d yourdatabase -c "CREATE EXTENSION postgis_topology;"
For a GUI (pgAdmin):
CREATE EXTENSION postgis;
CREATE EXTENSION postgis_topology;
For more details, refer to the PostGIS docs.
Sample commands
rdataretriever::install_postgres('harvard-forest') # Vector data
rdataretriever::install_postgres('bioclim') # Raster data
# Install only the USGS elevation data within the given extent
rdataretriever::install_postgres('usgs-elevation', list(-94.98704597353938, 39.027001800158615, -94.3599408119917, 40.69577051867074))
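The extent in the last command is a plain list of four numbers. The sketch below assumes they are ordered as (xmin, ymin, xmax, ymax), which matches the values shown but is not stated in this README, and checks that ordering before the list is passed on. `make_bbox()` is a hypothetical helper, not part of `rdataretriever`.

```r
# Hypothetical helper: build an extent list, assuming the order
# (xmin, ymin, xmax, ymax), and fail early if the corners are swapped.
make_bbox <- function(xmin, ymin, xmax, ymax) {
  stopifnot(
    is.numeric(c(xmin, ymin, xmax, ymax)),
    xmin < xmax,
    ymin < ymax
  )
  list(xmin, ymin, xmax, ymax)
}

bbox <- make_bbox(-94.98704597353938, 39.027001800158615,
                  -94.3599408119917, 40.69577051867074)

# The list can then be passed as in the sample command (requires a
# running Postgres server with PostGIS enabled):
# rdataretriever::install_postgres('usgs-elevation', bbox)
```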
To run the image interactively
docker-compose run --service-ports rdata /bin/bash
To run tests
docker-compose run rdata Rscript load_and_test.R
To get citation information for `rdataretriever` in R, use `citation(package = 'rdataretriever')`.
A big thanks to Ben Morris for helping to develop the Data Retriever. Thanks to the rOpenSci team with special thanks to Gavin Simpson, Scott Chamberlain, and Karthik Ram who gave helpful advice and fostered the development of this R package. Development of this software was funded by the National Science Foundation as part of a CAREER award to Ethan White.