Data Portal brings together a large family of applications that allow users to explore and visualize POC-managed data.
The Data Portal team requires that all applications have their data source and generation process documented and - ideally - automated. With rare exceptions (e.g., the COS application), all applications should receive data ready for consumption either directly from a POC SQL database or from an R-supported update process.
pocr has a number of functions that support updating application data for those applications that are dependent on R data processing. These functions are tied together by get_portal_app_data(), a wrapper that calls the R processes needed to retrieve and format data for R-dependent Data Portal applications.
This vignette will walk the user through the process of updating R-dependent Data Portal applications, including using the supporting pocr functions.
Prior to starting the update process, you need to prepare the following tools.
First things first, you need a GitHub account, you need to be part of pocdata, and you need permissions to view/access any apps you intend to update. Membership in the "Data Portal" team should get you the permissions.
Now you need a way to receive files from and pass files to GitHub. Install Git (specifically we want Git Bash) using all the default settings.
This is an R vignette, so I'm guessing you already have R installed. But just in case you don't, you need it to use the pocr package.
RStudio is not required, but it makes working with R much more pleasant. You want the free desktop version.
devtools
devtools is an R package that supports building packages. We need it to install and build the pocr package.
Open RStudio or an R console and install devtools from the R command line.
install.packages("devtools", dependencies = TRUE)
pocr is the POC R package with the data retrieval/formatting functions, along with a variety of other useful functions for POC data work.
Open RStudio or an R console and install pocr from the R command line.
devtools::install_github("pocdata/pocr")
Connections to the POC SQL server and the annie MySQL server
Talk to Gregor or your senior colleagues about how to set these up. Record the names you give the ODBC connections. I suggest using "POC" for the POC SQL server and "annie" for the annie MySQL server.
Once you've installed the above materials, you will need to set up Git with your credentials.
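As a minimal sketch of that setup, the commands below record the name and email address that Git attaches to your commits. Both values are placeholders, and this covers only commit identity: how you authenticate to GitHub (for example, being prompted for a login on your first push) depends on your Git version and platform.

```shell
# Placeholder identity: substitute your own name and the email address
# tied to your GitHub account.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"

# Confirm what Git recorded.
git config --global --get user.name
git config --global --get user.email
```

Run these once in Git Bash; the settings apply to every repo you clone afterwards.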
Depending on your permissions and platform, you may also need to configure your R library path. This process varies by operating system. Pester a fellow Data Portal team member and/or Google to set up your path.
Git allows us to create local, linked copies (aka "clones") of the repositories on GitHub. We can make changes to these copies - such as changing the data files - and then "push" our changes to the GitHub repository.
Once you have your update tools installed and configured, you will want to create clones of all the repos you want to update.
For a list of applications - and their repo names - that are currently supported by pocr, please see the Data Portal application portfolio.
I recommend that you create all your clones in a common location (e.g., C:/Projects/) so that they are easy to locate and work with.
To clone a repo:
1. cd to your target directory (e.g., cd C:/Projects)
2. git clone repo-url (e.g., git clone https://github.com/pocdata/pocr)
3. ls and you should see a copy of the target repo (e.g., C:/Projects/pocr/)
TBD
You can ignore this step for now. We are still resolving where in the process this should occur and what this check should entail.
get_portal_app_data() to Retrieve/Format Current Data
At this point, you need to generate new data files for the applications you are aiming to update.
Open an R console or RStudio, load pocr, and execute get_portal_app_data().
library(pocr)
get_portal_app_data()
You may need to adjust the arguments to get_portal_app_data() to match the names you gave to the POC and annie ODBC connections and/or to match the specific applications you want to update.
By default:
Observe the update process: get_portal_app_data() will report which apps it is trying to retrieve/format data for and whether each appears to succeed.
If you observe any errors, you will need to inspect them to try to determine the cause of the issue. This may require that you look up the helper function associated with a given application (e.g., get_county_dashboard_data() is the function called by get_portal_app_data() to update the County Dashboard app).
get_portal_app_data() will create a folder called app_data (or a numbered variant, such as app_data1) each time it is run. This folder will have a subfolder for each target app repo (e.g., app_data/county_dashboard).
app_data folder has been created and subfolders have been made for all targeted apps.
For each app you want to update, you need to replace the data files in your local copy of the repo with the new data files.
As an example, updating county_dashboard places a bundle of .csv files in app_data/county_dashboard. To update your local copy of the County Dashboard repo, you would:
1. Open the data folder in your local copy of the repo (e.g., C:/Projects/county_dashboard/data)
2. Replace its data files with the new files from app_data/county_dashboard
Once you have updated your local copies of the target app repos, you will need to push your changes to GitHub.
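1. Navigate to your local copy of the repo (e.g., cd C:/Projects/county_dashboard)
2. Stage your changes (git add -A)
3. Commit your changes with a descriptive message (e.g., git commit -m "8-15-2015 data update")
4. Push the commit to GitHub (git push)
Complete the above process for all the apps you have generated new data for and VOILA you're just about done!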
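When several apps need updating, the same add/commit/push sequence can be wrapped in a small function and run once per repo. This is only a sketch: push_data_update is a made-up helper name, the second repo name in the example loop is a placeholder, and it assumes each clone already has a remote it can push to.

```shell
# push_data_update REPO_DIR MESSAGE  (hypothetical helper)
# Stage, commit, and push whatever has changed in one local clone.
push_data_update() {
  (
    cd "$1" || exit 1
    git add -A
    # Skip the commit and push when the new data files match the old ones.
    if git diff --cached --quiet; then
      echo "No changes to commit in $1"
      exit 0
    fi
    git commit -m "$2" && git push
  )
}

# Example: update several clones kept under one common folder.
# another_app_repo is a placeholder, not a real repo name.
# for app in county_dashboard another_app_repo; do
#   push_data_update "C:/Projects/$app" "8-15-2015 data update"
# done
```

The body runs in a subshell, so the cd does not change the directory of your Git Bash session between repos.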
Contact the Data Portal site manager (Erika Deal: edeal@uw.edu) and let her know which apps have updated data. She will complete the process to move the data to the live apps.