
InfluenzaNet R Analysis Framework

Purpose

This package provides a set of functions to support data analysis on the InfluenzaNet database.

Aims are:

Some libraries wrap other R libraries, so that scripts can be written without depending on which R library is actually used or where files are really located.

We recommend a specific workspace organization (see workspace).

Configuration

Several configuration settings must be provided for the package to work correctly.

The configuration is stored as a list in R's options(), under the entry named 'ifn'.

It can be set before the package is loaded (in this case some initialization can be done automatically) or after the package is loaded.

The option entries are documented in the help page of share.option().

For example:

options(ifn=list(
  'db_driver'='RODBC', # Database driver: RODBC or RPostgreSQL
  'db_dsn'=list(user='', password='...', database='mydb'), # Database connection parameters
  'platform'='eu', # Name of the platform file (hereafter called [platform_id])
  'platform.path'='/home/influenzanet/R/platform/', # Where the platform definition file is located
  'autoconnect'=TRUE, # Connect to the database automatically
  'autoload.platform'=TRUE, # Load the platform file at startup
  'base.out.path'='/home/influenzanet/output' # Where to put the output files (see init.path() & my.path())
))

library(ifnBase)

Some functions give access to the package-specific options.

Basically, they get or modify the 'ifn' entry in R's base options().
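The get/modify pattern described above can be sketched in base R. This is only an illustration under stated assumptions: sketch_get_option() and sketch_set_option() are hypothetical names, not functions of the package (see the help page of share.option() for the real interface).

```r
# Hypothetical helpers (NOT part of ifnBase) illustrating how a list stored
# under the 'ifn' entry of options() can be read and updated.

# Return the whole 'ifn' list, or a single entry when a name is given
sketch_get_option <- function(name = NULL) {
  opts <- getOption("ifn", default = list())
  if (is.null(name)) opts else opts[[name]]
}

# Update some entries while keeping the other entries untouched
sketch_set_option <- function(...) {
  opts <- utils::modifyList(getOption("ifn", default = list()), list(...))
  options(ifn = opts)
  invisible(opts)
}

sketch_set_option(platform = "eu", autoconnect = FALSE)
sketch_get_option("platform")
```

Merging into the existing list, rather than calling options(ifn = ...) with a fresh list, avoids silently dropping entries that were set earlier.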

Managing file path

Please never use absolute paths (they are allowed only in location.r).

To write a good program, assume that the program never knows where files outside the project directory are located; instead, it should ask the framework for the path of each file. The framework provides a set of functions to do that:

We provide several functions for using files from a location outside a given project. They assume an external location (called "share", as it is intended to be shared across several projects).

This location can contain several directories:

Output path

The package options contain a base.out.path entry, which allows all files and graphs to be written to a global output directory. If base.out.path is set to an empty string, output goes to the current (working) directory. Because this is defined in the global configuration, a script in a project should remain agnostic of the actual location.

To do that, use the two functions init.path() and my.path() in your script.

For example, at the beginning of a script you may want to put all the generated files into a subdirectory of the output path. Just call:

  init.path('web')

All further output (using my.path()) will go into a "web" subdirectory of the current output path. If the "web" directory does not exist, it will be created.

To get the path of a given file in the output path (to load the file or to write data), wrap the file name in the my.path() function:

  my.path('myfile.csv')

my.path() always returns the full path of a file name inside the current output path (even if the file does not exist).
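The behavior described above can be approximated with a few lines of base R. This is a sketch, not the package's code: sketch_init_path(), sketch_my_path() and the sketch_state environment are hypothetical names, and the handling of an empty base path follows the description of base.out.path given earlier.

```r
# Hypothetical re-implementation for illustration only (NOT ifnBase code)
sketch_state <- new.env()
sketch_state$out_path <- "."

# Set the current output path to a subdirectory of the base output path,
# creating the directory when it does not exist yet
sketch_init_path <- function(subdir, base = getOption("ifn")$base.out.path) {
  if (is.null(base) || base == "") base <- "."  # empty base: current directory
  path <- file.path(base, subdir)
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  sketch_state$out_path <- path
  invisible(path)
}

# Build the path of a file inside the current output path;
# the file itself does not need to exist
sketch_my_path <- function(...) {
  file.path(sketch_state$out_path, ...)
}

sketch_init_path("web", base = tempdir())
sketch_my_path("myfile.csv")
```

Keeping the current output path in one place means a script only names files, and the location can change through configuration alone.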

Database functions

Database configuration

For RODBC, db_dsn can have a 'file' entry pointing to a file that contains all the connection parameters.

Graphics functions

These functions wrap the graphics device functions. By default they use png(), but other graphics devices can be used depending on the configuration.

You can also call a device function directly, but if you later need to change the driver or the type of output file, you will have to change all your scripts.

Moreover, the graph.xxx() functions handle "hooks": functions that are automatically called after the graphic has been created.
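To make the wrapper and hook idea concrete, here is a minimal sketch in base R. All names (sketch_graph, sketch_graph_open(), sketch_graph_close()) are hypothetical and the signatures are assumptions; the real graph.xxx() functions may work differently.

```r
# Hypothetical device wrapper with hooks (NOT the package's graph.xxx() code)
sketch_graph <- new.env()
sketch_graph$hooks <- list()

# Open a graphics device chosen by 'type'; the script never calls png()/pdf() itself
sketch_graph_open <- function(file, type = "png", width = 800, height = 600) {
  path <- paste0(file, ".", type)
  switch(type,
    png = grDevices::png(path, width = width, height = height),      # pixels
    pdf = grDevices::pdf(path, width = width / 100, height = height / 100),  # inches
    stop("unsupported type: ", type)
  )
  invisible(path)
}

# Close the device, then call every registered hook with the file path
sketch_graph_close <- function(path) {
  grDevices::dev.off()
  for (hook in sketch_graph$hooks) hook(path)
  invisible(path)
}

# Usage: register a hook, draw, close
sketch_graph$hooks <- list(function(path) message("created: ", path))
out <- sketch_graph_open(file.path(tempdir(), "curve"), type = "pdf")
plot(1:10)
sketch_graph_close(out)
```

With this design, switching every figure of a project from one device to another is a change of one default, not an edit of each script.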

ISO 8601 week number computation functions

These functions manipulate ISO 8601 week numbers.

A week is represented as year_of_the_week * 100 + week_number (called a yearweek number). For example, week 5 of 2023 is 202305.
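As an illustration of the yearweek encoding, the following base R sketch computes it from a date using the ISO 8601 rule that a week belongs to the year containing its Thursday. sketch_yearweek() is a hypothetical name, not one of the package's functions.

```r
# Hypothetical helper (NOT part of ifnBase): ISO 8601 yearweek of a date
sketch_yearweek <- function(date) {
  date <- as.Date(date)
  wday <- as.POSIXlt(date)$wday            # 0 = Sunday ... 6 = Saturday
  iso_wday <- (wday + 6) %% 7 + 1          # 1 = Monday ... 7 = Sunday
  # The Thursday of the same ISO week decides the year of the week
  thursday <- as.POSIXlt(date + (4 - iso_wday))
  iso_year <- thursday$year + 1900
  iso_week <- thursday$yday %/% 7 + 1      # yday is 0-based
  iso_year * 100 + iso_week
}

yw <- sketch_yearweek("2021-01-03")  # a Sunday that still belongs to 2020
yw %/% 100   # year of the week
yw %% 100    # week number
```

The example date shows why the year of the week is stored rather than the calendar year: the first days of January can belong to the last week of the previous year.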

Geographic levels

The package provides a set of functions to handle geographic levels, which describe how a territory (say, a country) is divided into areas in a hierarchical way. You can have several levels (country, regions, ...). Each level is divided into areas, each of which can be divided into sub-areas at the lower level.

The only assumption is that an area belongs to a single area at the upper level (overlaps are not handled). Several level hierarchies can be handled, however.

Concepts and functions are described in the geography vignette.
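The single-parent assumption can be illustrated with a small lookup table in base R. The level names, the area codes and the function sketch_upper_area() are all hypothetical, chosen only for this example; the package's own functions are described in the geography vignette.

```r
# Hypothetical hierarchy, for illustration only: one row per lowest-level area,
# with the code of its parent area at each upper level
sketch_levels <- data.frame(
  department = c("D1", "D2", "D3"),
  region     = c("R1", "R1", "R2"),
  country    = c("C1", "C1", "C1"),
  stringsAsFactors = FALSE
)

# Map area codes from one level to the corresponding area at an upper level
sketch_upper_area <- function(codes, from, to, map = sketch_levels) {
  pairs <- unique(map[, c(from, to)])
  # Single-parent assumption: each 'from' code has exactly one 'to' code
  stopifnot(!anyDuplicated(pairs[[from]]))
  pairs[[to]][match(codes, pairs[[from]])]
}

sketch_upper_area(c("D3", "D1"), from = "department", to = "region")
```

Because each area has exactly one parent, moving data from a lower level to an upper level is a plain lookup, with no weighting or splitting needed.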

Loading libraries functions

To load the functions of a library located in share/lib/, use the share.lib function.

  share.lib('survey') # load survey functions
  share.lib(c('survey','syndrome','i18n')) # load several

If you want a platform-specific library (or a platform-specific implementation of a library), call share.lib with the platform argument. This loads the library from the platform library directory (in [share]/platform/[platform_id]/).

  share.lib('survey', platform=TRUE) # load survey library using the platform specific implementation


cturbelin/ifnBase documentation built on Nov. 5, 2023, 12:54 p.m.