2020-04-21: Manage data with pins

https://github.com/2DegreesInvesting/ds-incubator/issues/38

Who is the audience?

Analysts, data managers, software developers at 2DII and beyond.

Why is this important?

Following a discussion on managing and using data (#35), we concluded that we can improve.

Before we invest in any one approach, we may want to explore a number of potentially good alternatives. A system would be a good candidate if it has these properties:

  1. Allows us to control permissions to read and write data.
  2. Supports version control with Git and GitHub -- tools we already know.
  3. Hosts data online yet allows using data from a cache stored locally.
  4. Can handle datasets of the maximum size we need.
  5. Plays well with R, and maybe Python.
  6. Is low cost or, better, free.
  7. Implements tools that are as familiar as possible, e.g. git and GitHub as opposed

What should be covered?

  1. Show how the pins package meets these requirements.
  2. Discuss what requirements I forgot to list.
  3. Questions and answers.

Suggested speakers or contributors

I plan to run a demo, and then expect questions and comments from everyone else.

Resources

Q&A and discussion

Questions:

Suggestions:

Compare the different alternatives in a table to better see pros and cons (thanks @2diiKlaus)



2DegreesInvesting/ds-incubator documentation built on Oct. 13, 2021, 10:09 a.m.