getwd()
More details on the server configuration in this google doc
Rmarkdown requires the latest pandoc version, explanation here
To include the latest required version of pandoc:
setup: At least the following dependencies are missing:
http-client >=0.3.2 && <0.4 && ==0.4.5
cabal: Error: some packages failed to install:
pandoc-1.13.1 failed during the configure step. The exception was:
ExitFailure 1
md5sum rstudio-server-0.98.1091-amd64.deb
930eca2738ce335791df41c3f6302fae
RStudio Server requires the installation of a recent libssl version. md5sum should no longer be used, for security reasons; authenticity should be checked with
sha256sum libssl0.9.8_0.9.8o-4squeeze14_amd64.deb
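When no published checksum is at hand, two copies of the same file obtained from different locations can at least be compared against each other. The following is a minimal sketch of that comparison; the file names copy_a.deb and copy_b.deb and their contents are placeholders created for illustration, not real downloads.

```shell
# Placeholder files standing in for two downloads of the same package
# (hypothetical names and contents, for illustration only).
workdir=$(mktemp -d)
printf 'example package contents\n' > "$workdir/copy_a.deb"
printf 'example package contents\n' > "$workdir/copy_b.deb"

# sha256sum prints "<hash>  <filename>"; keep only the hash field.
sum_a=$(sha256sum "$workdir/copy_a.deb" | cut -d ' ' -f 1)
sum_b=$(sha256sum "$workdir/copy_b.deb" | cut -d ' ' -f 1)

# Compare the two hashes.
if [ "$sum_a" = "$sum_b" ]; then
    result="match"
else
    result="mismatch"
fi
echo "$result"

rm -rf "$workdir"
```

Identical sums only show that both locations serve the same file; they do not prove that the file is the one the publisher released.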
I couldn't find the key, so I downloaded the package from 2 locations and checked that both copies had the same sum.
To manually stop, start, and restart the server, use the following commands:
$ sudo rstudio-server stop
$ sudo rstudio-server start
$ sudo rstudio-server restart
Inspired by the way Hadley prepares this flight planes data. The package includes a training dataset: sawnwood bilateral trade data for European countries.
Location inspired by the rapport package. See also their function rapport.ls, which lists templates, and their function rapport.read, which reads templates from files or from package-bundled templates.
Created based on instructions from Hadley.
devtools::load_all()
or Cmd + Shift + L, reloads all code in the package.
Add packages to the list of required packages
devtools::use_package("dplyr")
devtools::use_package("ggplot2", "suggests")
For data I followed his recommendations in r-pkgs/data.rmd
devtools::use_data(mtcars)
devtools::use_data_raw()
# To create a data-raw/ folder and add it to .Rbuildignore
Use Ctrl+Shift+T to run the package tests in RStudio.
The test_check function documentation tells us that tests should be placed in tests/testthat.
Example of testing for the devtools package
The covr package can be used to measure code coverage.
covr::package_coverage()
Shows test coverage of scripts in the ./R directory.
Visualise coverage in a shiny application:
x <- package_coverage()
shine(x)
Git command to revert a file one revision back in the "develop" branch:
git checkout develop~1 R/clean.R
# Experiment with something, then
# come back to the latest revision
git checkout develop R/clean.R
Use this, for example, to check that a test failed in the past and that it no longer fails.
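The revert-and-restore sequence can be tried safely in a throwaway repository. The sketch below assumes git is installed; the temporary repository, the commit messages, the user identity and the file contents are all made up for illustration.

```shell
# Build a throwaway repository with a "develop" branch and two revisions
# of R/clean.R (all names and contents here are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b develop
git config user.email "dev@example.com"
git config user.name "Example Developer"

mkdir R
echo "version 1" > R/clean.R
git add R/clean.R
git commit -q -m "First revision of clean.R"
echo "version 2" > R/clean.R
git commit -q -am "Second revision of clean.R"

# Revert the file one revision back, without moving the branch.
git checkout -q develop~1 -- R/clean.R
old=$(cat R/clean.R)

# Come back to the latest revision of the file.
git checkout -q develop -- R/clean.R
new=$(cat R/clean.R)

echo "previous: $old / latest: $new"
```

Only the named file changes in the working tree; the branch itself stays on its latest commit, which is why the second checkout restores the current version.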
dplyr uses non-standard evaluation (NSE). See vignette("nse"). NSE is powered by the lazyeval package.
# standard evaluation
sawnwood %>% select_(.dots = c("yr", "rtCode")) %>% head
# is the same as
# lazy evaluation
sawnwood %>% select(yr, rtCode) %>% head
demo(error.catching)
How to create package vignettes.
To create a vignette, use the command use_vignette(name)
You can build all vignettes from the console with devtools::build_vignettes()
RStudio’s “Build & reload” does not build vignettes to save time. Similarly, devtools::install_github() (and friends) will not build vignettes by default because they’re time consuming and may require additional packages. You can force building with devtools::install_github(build_vignettes = TRUE). This will also install all suggested packages.
To export the documentation as a PDF document, run the following at the command line in the man folder:
R CMD Rd2pdf *
You should be able to see the documentation of exported functions by placing a question mark before the function name at the R command prompt.
inspired by the documentation of roxygenize
https://github.com/yihui/roxygen2/blob/master/R/roxygenize.R
vignette("namespace", package = "roxygen2")
says:
If you are using just a few functions from another package, the recommended option is to note the package name in the Imports: field of the DESCRIPTION file and call the function(s) explicitly using ::, e.g., pkg::fun(). Alternatively, though no longer recommended due to its poorer readability, use @importFrom, e.g., @importFrom pgk fun, and call the function(s) without ::. If you are using many functions from another package, use @import package to import them all and make available without using ::.
But Hadley says:
Alternatively, if you are repeatedly using many functions from another package, you can import them in one command with @import package. This is the least recommended solution: it makes your code harder to read (because you can’t tell where a function is coming from), and if you @import many packages, the chance of a conflicting function names increases.
The way packages are called might have to be changed to follow Hadley's recommendations on package namespaces: http://r-pkgs.had.co.nz/namespace.html. See also vignette("namespace", package = "roxygen2").
require(RJSONIO)
require(dplyr)
The .git repository is backed up on Bitbucket. Use devtools::install_bitbucket() to install the package.
A demonstration with time series plot and bar chart will be made with shiny and the ggplot2 package, based on the diamond example using.
Use screen to keep a long process running on a server after you close the ssh session. I started a screen session with:
screen -S sessionname
To find the screen session later, you might want to rename it using the sessionname command, or pass the -S flag on the first screen invocation: -S sessionname
I started the R software in this screen session, started a long running process. Then detached the session with:
CTRL-A, then D
I could re-attach the session later with:
screen -r sessionname
If the session was not detached properly, it might be necessary to detach it and reattach it:
screen -d -r sessionname
I will try to change the package's version number each time I commit a change that impacts the cleaning procedure. I will also try to tag those versions in git.
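One way to keep the git tag in step with the package version is to read the Version field from the DESCRIPTION file and tag the commit with it. This is a minimal sketch in a throwaway repository; the version number 0.1.0, the "v" tag prefix, the user identity and the commit message are assumptions for illustration, not the project's actual convention.

```shell
# Throwaway repository with a minimal DESCRIPTION file
# (version number and identity are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Example Developer"

printf 'Package: tradeflows\nVersion: 0.1.0\n' > DESCRIPTION
git add DESCRIPTION
git commit -q -m "Bump version to 0.1.0"

# Extract the version number from the DESCRIPTION file.
version=$(sed -n 's/^Version: *//p' DESCRIPTION)

# Create an annotated tag named after the version (the "v" prefix is an assumption).
git tag -a "v$version" -m "Cleaning procedure version $version"

tags=$(git tag --list)
echo "$tags"
```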
It would be nice to clarify the interface: what R functions are used by the PHP code and bash scripts? This would enable code refactoring. For example, the parameter called outputdir is not consistent with inputpath. It would be preferable to call them both "dir" or "path". outputdir is named after the rmarkdown::render() parameter output_dir. What is inputpath named after?
See vignette/installation.Rmd for installation and configuration steps.
Which directories should I read at https://bitbucket.org/paul4forest/tradeflows/? You want to look at the files in the R folders.
The configuration table columnnames, located in config/column_names.csv, now contains 2 columns specifying which column names are used in the trade flows database: "raw_flow" and "validated_flow".
The database configuration file and column names are located in the package's config directory. To find its location from a shell command prompt, run:
Rscript -e 'library(tradeflows)' -e 'system.file("config", package="tradeflows")'
This is managed by a PHP program. The data to load is contained in this instruction
itto <- classificationitto %>%
    filter(productcodecomtrade > 10000 & nomenclature == "HS12") %>%
    select(product, productcodeitto, productcodecomtrade)
write.csv(itto, file = "data-raw/ittoproducts.csv", row.names = FALSE)
The function cleandb() will feed data into the database table(s) validated_flow. Updates will be done on a product basis, at the 6 digit level. The cleaning script will:
The main clean instruction can also be run directly from a system shell:
Rscript -e 'library(tradeflows)' -e 'cleandbproduct(440799, tableread = "raw_flow_yearly", tablewrite = "validated_flow_yearly")'
createreportfromdb(productcode = , template = "", )
There are 6373 distinct bilateral trade flows in the 440799 yearly dataset. Some flows occur only in one year, others are repeated every year. Six thousand plots cannot be easily represented in one report. This requires an interface.
20141208 A bug in Lyx prevents me from generating pdfs when the text contains a euro € sign.
- Load monthly data
- Load yearly data
- Rename columns
- Copy into a database
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'tradeflows'
AND TABLE_NAME = 'validated_flow_yearly';
instead of loading a data frame to check the column names, because loading a data frame doesn't work when the table is empty.
- discrepancy report in the server function, add a parameter to the loadcomtrade_bycode function to render this optional
- log validation status of JSON files with
fileConn <- file("output.txt")
writeLines(c("Hello", "World"), fileConn)
close(fileConn)
- Change the docs/development folder into vignettes.
- Calculate median prices by region partner and see how using partner prices for conversion impacts the world trade flows
20151103 Methodology report: add a paragraph on the different types of predefined automated reports that can be generated, with indication of the parameters that can be set. Four different report types:
- completeness report
- discrepancy report
- overview report
- trade network analysis
3: Add abstract to country overview report
20151023 commit a58d6e4fb Overview report should be based on the validated data and include quantity besides trade values
20151009 Section titles in the overview report should be those JFSQ-1 names. Generate overview report plots according to JFSQ product codes.
20151009 Overview report: list the 10 largest exporters and 10 largest importers in all plots.
20150904 Include partner data into the quantity estimation for those which have missing partner data, in commit c3d92e77e33008eef2eef64fb465c77d0829bb73
git diff fd724fa080cc c3d92e77e330 # View changes introduced
See the function addmissingmirrorflow()