README.md

Production Domain Overview

This repository contains both the package and the modules for processing the production domain.

The repository also contains information on the complete processing cycle and research materials.

Production Domain Cycle

The complete cycle contains 4 stages:

1. Data collection

This phase collects data and inputs from various sources and merges them into a single final information set.

No R module is involved in this phase.

2. Data validation

In this phase, the input data are checked, corrected, and validated prior to imputation of missing data. The entire process is automated with algorithms.

R module: Production Input Validation

This module performs input validation of the production domain and, at the same time, auto-correction of the data according to given rules.
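The combination of validation and rule-based auto-correction can be pictured with a minimal sketch. It is written in Python purely for illustration and is not the code of the R module. The rule shown (negative values are treated as invalid and set to missing) and the names `RULES` and `validate_and_correct` are assumptions made for this example; the actual rules used by Production Input Validation are not listed in this document.

```python
from typing import Callable, List, Optional, Tuple

Value = Optional[float]
# A rule is (name, test for an invalid value, correction to apply).
Rule = Tuple[str, Callable[[Value], bool], Callable[[Value], Value]]

# Hypothetical rule set: a negative value is invalid and is reset to missing.
RULES: List[Rule] = [
    ("negative value", lambda v: v is not None and v < 0, lambda v: None),
]


def validate_and_correct(values: List[Value], rules: List[Rule] = RULES):
    """Apply every rule to every value; return corrected values and a log."""
    corrected: List[Value] = []
    log: List[Tuple[int, str]] = []
    for index, value in enumerate(values):
        for name, is_invalid, fix in rules:
            if is_invalid(value):
                log.append((index, name))  # record which rule fired, and where
                value = fix(value)
        corrected.append(value)
    return corrected, log


corrected, log = validate_and_correct([10.0, -5.0, None])
```

Keeping a log of every correction alongside the corrected values makes each automated change traceable to the rule that triggered it.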

3. Imputation

In this phase, the missing records are imputed. The entire process is automated with algorithms. The imputation consists of two modules, each of which imputes a different basket of commodities depending on their nature. The module Impute Livestock performs imputation on livestock items, while the module Impute Non-livestock operates on non-livestock commodities.

R module: Impute Livestock

This module performs imputation on the livestock commodities and, at the same time, ensures that slaughtered animals are synchronised across all related parent/child commodities.

R module: Impute Non-livestock

This module creates the imputed values for the non-livestock items.

4. Post validation

During this phase, the processed dataset is investigated, and imputed values can be corrected manually. However, all corrections must be scientifically justified, explained in the metadata, and reported to team A for continual improvement of the algorithm.

After any manual intervention, the Balance Production Identity module must be run to ensure that production is balanced.

R module: Balance Production Identity

This module re-calculates the production identity. This ensures that the relationship Production = Area Harvested x Yield holds when new changes are introduced in the post-validation phase.
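A minimal sketch of re-balancing the identity is given below, in Python for illustration only (the module itself is written in R). The function name `balance_identity` is hypothetical. The sketch also makes two assumptions that this document does not confirm: when all three elements are present, production is the element that gets recomputed, and no unit conversion factor is applied to the yield.

```python
from typing import Optional, Tuple

Triple = Tuple[Optional[float], Optional[float], Optional[float]]


def balance_identity(area: Optional[float],
                     yield_: Optional[float],
                     production: Optional[float]) -> Triple:
    """Make Production = Area Harvested x Yield hold where possible."""
    missing = [v is None for v in (area, yield_, production)]
    if sum(missing) > 1:
        # Fewer than two elements are known: nothing can be derived.
        return area, yield_, production
    if production is None or not any(missing):
        # Assumption: production is recomputed when area and yield are known.
        production = area * yield_
    elif area is None:
        area = production / yield_ if yield_ else None  # avoid division by zero
    else:
        yield_ = production / area if area else None  # avoid division by zero
    return area, yield_, production
```

For example, `balance_identity(10.0, 2.5, 30.0)` returns a triple whose production is 25.0, so a manual change to area or yield is propagated to production.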

Production Work Flow

Auxiliary Datasets

In addition to the main production data (agriculture:aproduction), the production modules depend on several auxiliary datasets, detailed below:

Production Processing

During the production imputation phase, certain values will be removed and replaced with new imputed values, depending on their observation status and method flag. Below we give a description of the flags being processed.
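The removal step can be sketched as follows, in Python for illustration only. The flag values used here are placeholders: the set `REPLACEABLE`, the missing-value flags in `MISSING_FLAGS`, and the names `Obs` and `clear_replaceable` are all assumptions made for this example, and they do not reproduce the actual flag combinations processed for 'aproduction'.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Obs(NamedTuple):
    value: Optional[float]
    obs_flag: str      # observation status flag
    method_flag: str   # method flag


# Placeholder flag combinations whose values are removed before re-imputation.
REPLACEABLE: Set[Tuple[str, str]] = {("I", "e")}
# Placeholder flags assigned to a value once it has been removed.
MISSING_FLAGS: Tuple[str, str] = ("M", "u")


def clear_replaceable(records: List[Obs],
                      replaceable: Set[Tuple[str, str]] = REPLACEABLE) -> List[Obs]:
    """Blank out values whose flag pair marks them as replaceable."""
    return [
        Obs(None, *MISSING_FLAGS)
        if (rec.obs_flag, rec.method_flag) in replaceable
        else rec
        for rec in records
    ]


records = [Obs(100.0, "", "q"), Obs(120.0, "I", "e")]
cleared = clear_replaceable(records)
```

In this sketch only the second record is blanked out; the first is returned unchanged, so values that do not match a replaceable flag pair are never overwritten by the subsequent imputation.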

When processing production data, the method flag is the main flag that determines the process. The following list provides a guideline for processing the flags. Note that the description is only valid for the dataset 'aproduction'.

The following flag combinations are of particular interest to the processing of the production domain. Additional information is provided.

All work under this repository represents the latest status of development and is made public for collaboration purposes. It does not reflect the current state of the system and use of the program is at the discretion of the users.



SWS-Methodology/faoswsProduction documentation built on March 21, 2023, 8:27 p.m.