In SWS-Methodology/faoswsIndustrial: Package to perform the imputation of industrial data

## Load required functions
library(data.table)
library(ggplot2)
library(scales)
library(knitr)
library(faosws)
library(faoswsUtil)
library(faoswsFlag)
library(magrittr)
library(igraph)
library(dplyr)

library(data.table)
library(countrycode)

## eurostat data: biodiesel

setwd("~/Github/faoswsIndustrial/Data")

# ratio: the piece of information about "ethanol" came from ebb and for "biodiesel" 
# from one website (http://www.paulchefurka.ca/Biofuels.html).
ratioAdditionalSupplyBiofuelDemand = fread("ratioAdditionalSupplyBiofuelDemand.csv")
ratioAdditionalSupplyBiofuelDemand

# structure : this dataset came from ebb
biofuelStructure = fread("biofuelStructure.csv")
biofuelStructure

# Biodiesel from eurostat
eurostatBiodieselTOE = fread("eurostatBiodieselTOE.csv")
head(eurostatBiodieselTOE)

eurostatBiodieselTOE[, m49 := countrycode(country, "country.name", "un")]
eurostatBiodieselTOE = eurostatBiodieselTOE[!is.na(m49)]

eurostatBiodieselTOE

# melt
eurostatBiodieselTOE = melt.data.table(eurostatBiodieselTOE, 
                                       id.vars = c("country", "m49"), 
                                       measure.vars = paste0("year", 2003:2014))

eurostatBiodieselTOE = eurostatBiodieselTOE[!(value == ":")]
eurostatBiodieselTOE[, year := substr(variable, 5, 8)]
eurostatBiodieselTOE[, c("variable") := NULL]
setcolorder(eurostatBiodieselTOE, c("country", "m49", "year", "value"))
eurostatBiodieselTOE[, biofuel := "biodiesel"]

# Biogasoline from eurostat
eurostatBiogasolineTOE = fread("eurostatBiogasolineTOE.csv")
head(eurostatBiogasolineTOE)

eurostatBiogasolineTOE[, m49 := countrycode(country, "country.name", "un")]
eurostatBiogasolineTOE = eurostatBiogasolineTOE[!is.na(m49)]

eurostatBiogasolineTOE

# melt
eurostatBiogasolineTOE = melt.data.table(eurostatBiogasolineTOE, 
                                       id.vars = c("country", "m49"), 
                                       measure.vars = paste0("year", 2003:2014))

eurostatBiogasolineTOE = eurostatBiogasolineTOE[!(value == ":")]
eurostatBiogasolineTOE[, year := substr(variable, 5, 8)]
eurostatBiogasolineTOE[, c("variable") := NULL]
setcolorder(eurostatBiogasolineTOE, c("country", "m49", "year", "value"))
eurostatBiogasolineTOE[, biofuel := "bioethanol"]

## join both biodiesel and biogasoline data

euroStatBiofuel = rbind(eurostatBiodieselTOE, eurostatBiogasolineTOE)

Abstract

This document provides detailed description of how we are using the data sets biodiesel and biogasoline from EuroStat for spliting the figures into the commodity level to the countries of European Union. We are considering that biogasoline is bioethanol.

Approach 1

In this approach we are using the share's table from European Biodiesel Board (EBB) for spliting the production of biofuels for country/commodity level.

1.1 Sources of data sets

We are pulling the data sets biodiesel and biogasoline between 2003 and 2014 from the EuroStat:

http://ec.europa.eu/eurostat/tgm/refreshTableAction.do?tab=table&plugin=1&pcode=ten00081&language=en

After combine both, we get this data set below. In this example, France produced 1.618 (1000 tonnes of oil equivalent) of biodiesel and 423,7 (1000 tonnes of oil equivalent) of bioethanol.

bioethanol = head(euroStatBiofuel[biofuel == "bioethanol" & country == "France" & year == 2011], 3)
biodiesel = head(euroStatBiofuel[biofuel == "biodiesel" & country == "France" & year == 2011], 3)

kable(rbind(biodiesel, bioethanol), format = "markdown", padding = 0, 
      col.names=c("country", "m49", "year", "value", "biofuel"))

This piece of information from EuroStat is the primary production of biodiesel and bioethanol just for the country level and we need to figure out how many tonnes of each commodity was used to produce biofuel. In order to do that we pulled a piece of information from others sources. First, we pulled the Conversion Efficiency (Litres/tonne) from FAO (http://www.greenfacts.org/en/biofuels/figtableboxes/biofuel-yields-countries.htm) and from another website (http://www.paulchefurka.ca/Biofuels.html). From this table we can see that 1 tonne of Sugar Beet produces 110 litres of biodiesel.

kable(ratioAdditionalSupplyBiofuelDemand, format = "markdown", padding = 0, 
      col.names=c("commodity", "biofuel", "litres2tonne"))

We also pulled the average proportion of each commodity used to produce biodiesel and bioethanol from European Biodiesel Board (EBB) and the density of each commodity from FAO (http://www.fao.org/docrep/017/ap815e/ap815e.pdf). So, 53.1% of biodiesel came from Rapeseed and its density is 0,92g/ml. Also we can say that 4.7% of all biodiesel was produced from Sunflower.

biofuelStructure[, refScenario2 := 100 * round(refScenario, 3)]
kable(biofuelStructure[, c("commodity", "biofuel", "timePointYears", "refScenario2", "densityG_ml"), with = F], format = "markdown", padding = 0, 
      col.names=c("commodity", "biofuel", "year", "%(EBB)", "density"))

1.2 Spliting production by commodity

We merged the data sets to figure out how many tonnes were produced from different commodities based on EBB's table. So, in the table below you find the valueByCommodityTonnes what means the amount of biofuel produced from the specific commodity. For example, we can say that 169,953 tonnes of biodiesel came from Palm Fruit in France in 2011.

```r

merge with biofuelStructure

euroStatBiofuel = merge(euroStatBiofuel, biofuelStructure[, c("commodity", "refScenario", "biofuel", "densityG_ml"), with = F], by = "biofuel", all.x=T, allow.cartesian=TRUE)

euroStatBiofuel[, value := as.numeric(value)] euroStatBiofuel[, valueByCommodityTonnes := value * refScenario * 1000]

merge eurostatBiodieselTOE with the conversion factors

euroStatBiofuel = merge(euroStatBiofuel, ratioAdditionalSupplyBiofuelDemand[, c("commodity", "conversion_efficiency_litres_tonne"), with = F], by = "commodity", all.x=T)

````r
bioethanol = euroStatBiofuel[biofuel == "bioethanol" & country == "France" & year == 2011, c("country", "biofuel", "commodity", "year", "value", "refScenario", "densityG_ml", "conversion_efficiency_litres_tonne", "valueByCommodityTonnes"), with=F]

biodiesel = euroStatBiofuel[biofuel == "biodiesel" & country == "France" & year == 2011, c("country", "biofuel", "commodity", "year", "value", "refScenario", "densityG_ml", "conversion_efficiency_litres_tonne", "valueByCommodityTonnes"), with=F]

biofuelRbind = rbind(biodiesel, bioethanol)
biofuelRbind[, refScenario2 := 100 * round(refScenario, 3)]

kable(biofuelRbind[, c("country", "biofuel", "commodity", "year", "value", "refScenario2", "densityG_ml", "conversion_efficiency_litres_tonne", "valueByCommodityTonnes"), with=F], format = "markdown", padding = 0, 
      col.names=c("country", "biofuel", "commodity", "year", "totalValue", "%(EBB)", "density", "litres2tonne", "valueByCommodityTonnes"))

1.3 Calculating how many tonnes of commodity was used

We can calculate how many tonnes of each commodity was used to produce the amount above. In order to do that, we have to convert from litres to tonnes using the density. So, we know for example that 212,176.03 tonnes of Palm Fruit was used to produce 1,618.6 (1000 tonnes of oil equivalent).

# we need to convert from litres to tonnes

euroStatBiofuel[, conversion_litres_tonne := (densityG_ml * conversion_efficiency_litres_tonne)/1000]

## Now we can calculate how many tonnes of each commodity was used
euroStatBiofuel[, prodByCommodity := valueByCommodityTonnes/conversion_litres_tonne]

bioethanol = euroStatBiofuel[biofuel == "bioethanol" & country == "France" & year == 2011, c("country", "biofuel", "commodity", "year", "value", "refScenario", "densityG_ml", "prodByCommodity"), with=F]

biodiesel = euroStatBiofuel[biofuel == "biodiesel" & country == "France" & year == 2011, c("country", "biofuel", "commodity", "year", "value", "refScenario", "densityG_ml", "prodByCommodity"), with=F]

kable(rbind(biodiesel, bioethanol), format = "markdown", padding = 0, 
      col.names=c("country", "biofuel", "commodity", "year", "totalValue", "%(EBB)", "density", "prodByCommodity"))

Approach 2

We also could use the Production and Net Trade to split by country/commodity level the amount of biodiesel and bioethanol produced. Let's do that only for biodiesel. We'll pull the data from de old system for 2011 with the variables: Production, Imports, Stock variation, Exports, Feed, Food and Other Util for the commodities Oil palm fruit, Palm oil, Oil of Palm Kernel, Cake of Palm kernel, Rapeseed or canola oil, crude, Cake of Rapeseed, Soya bean oil, Cake of Soya beans, Sunflower-seed oil, crude and Cake of Sunflower seed.

```r

if(!exists("DEBUG_MODE") || DEBUG_MODE == "") { if(Sys.info()[7] == "josh"){ # Josh's work computer R_SWS_SHARE_PATH <- "/media/hqlprsws1_qa/" SetClientFiles(dir = "~/R certificate files/QA/") files = dir("~/Documents/Github/faoswsFood/R", full.names = TRUE) token = "557f0e65-5f84-43b0-a021-fe3bf3f02316" } else if(Sys.info()[7] == "caetano"){ # bruno's work computer SetClientFiles(dir = "~/.R/QA/") R_SWS_SHARE_PATH = "//hqlprsws1.hq.un.fao.org/sws_r_share" files = dir("~/Github/faoswsFood/R", full.names = TRUE) token = "66a36f31-1a29-4a49-8626-ae62117c251a" } else { stop("User not yet implemented!") }

GetTestEnvironment( baseUrl = "https://hqlqasws1.hq.un.fao.org:8181/sws", token = token ) }

This code below is pulling data from the old system for "Other util"

51 production (Mt)

61 imports (Mt)

71 stock variation (Mt)

91 exports (Mt)

101 feed (Mt)

131 processed (Mt)

141 food (Mt)

151 other util

elements = c("51", "61", "71", "91", "101", "131", "141", "151")

As we want to analyse just the results for biodiesel, let's pull just the

commodities used to produce it. The names of the commodities are: "Oil palm fruit",

"Palm oil", "Oil of Palm Kernel", "Cake of Palm kernel", "Rapeseed or canola oil, crude",

"Cake of Rapeseed", "Soya bean oil", "Cake of Soya beans", "Sunflower-seed oil, crude",

"Cake of Sunflower seed"

items = c("254", "257", "258", "259", "271", "272", "237", "238", "268", "269")

key = DatasetKey(domain = "faostat_one", dataset = "FS1_SUA_UPD", dimensions = list( Dimension(name = "geographicAreaFS", keys = GetCodeList("faostat_one", "FS1_SUA_UPD", "geographicAreaFS")[, code]), Dimension(name = "measuredElementFS", keys = elements), Dimension(name = "timePointYears", keys = as.character(2011)), Dimension(name = "measuredItemFS", keys = items)) ) data = GetData(key)

dcast.data.table

data <- dcast.data.table(data, geographicAreaFS + measuredItemFS + timePointYears ~ measuredElementFS, value.var = "Value")

setnames(data, old=c("101", "131", "141", "151", "51", "61", "71", "91"), new=c("feed", "processed", "food", "otherUtil", "production", "imports", "stockVariation", "exports"))

data[, geographicAreaM49 := fs2m49(as.character(geographicAreaFS))]

data[, measuredItemFS := formatC(as.numeric(measuredItemFS), width = 4, flag = "0")] data[, measuredItemCPC := fcl2cpc(as.character(measuredItemFS))] data

names of the commodities and countries

data <- nameData("agriculture", "aproduction", data)

data[grepl("palm", tolower(measuredItemCPC_description))][, .N, c("measuredItemCPC_description", "measuredItemFS")] data[grepl("palm", tolower(measuredItemCPC_description)), commodity := "Palm Fruit"]

data[grepl("rapeseed", tolower(measuredItemCPC_description))][, .N, c("measuredItemCPC_description", "measuredItemFS")] data[grepl("rapeseed", tolower(measuredItemCPC_description)), commodity := "Rapeseed"]

data[grepl("soya", tolower(measuredItemCPC_description))][, .N, c("measuredItemCPC_description", "measuredItemFS")] data[grepl("soya", tolower(measuredItemCPC_description)), commodity := "Soybeans"]

data[grepl("sunflower", tolower(measuredItemCPC_description))][, .N, c("measuredItemCPC_description", "measuredItemFS")] data[grepl("sunflower", tolower(measuredItemCPC_description)), commodity := "Sunflower"]

no NA's

for(cname in c("feed", "processed", "food", "otherUtil", "production", "imports", "stockVariation", "exports")){ data[is.na(get(cname)), c(cname) := 0] }

data1 = data[, list(production = sum(production), imports = sum(imports), exports = sum(exports), stockVariation = sum(stockVariation)), by=list(geographicAreaM49_description, geographicAreaM49, timePointYears, measuredItemCPC,commodity)]

data1[, prodNetTrade := production + imports - exports - stockVariation] data1[, geographicAreaM49:= as.factor(geographicAreaM49)]

## 2.1 Calculating the food availability

The food availability is Production + Imports - Exports - Stock_variation. So, in this case we have 614,162 tonnes of **Soybeans** available in France in 2011.
````r

kable(data1[geographicAreaM49_description == "France" & timePointYears == 2011, c("geographicAreaM49_description", "commodity", "timePointYears", "production", "imports", "exports", "stockVariation", "prodNetTrade"), with=F], format = "markdown", padding = 0, 
      col.names=c("country", "commodity", "year", "production", "imports", "exports", "stockVariation", "prodNetTrade"))

```r

let's read the table with the flag to EU28

setwd("C:/Users/caetano/Documents/Github/faoswsIndustrial/sandbox") europeanCountryCode = fread("europeanCountryCode.csv") europeanCountryCode[, geographicAreaM49 := as.factor(countrycode(countryName, "country.name", "un"))] europeanCountryCode = europeanCountryCode[flagEU28 == 1]

merge data with the the flag of EU28

data1 = merge(data1, europeanCountryCode[, c("geographicAreaM49", "flagEU28"), with = F], by = "geographicAreaM49", all.x=T)

data1[is.na(flagEU28), flagEU28 := 0] data1[, flagEU28 := as.factor(flagEU28)]

by country, flagEU28 and commodity

countryByCommodity = data1[, list(prodNetTrade = sum(prodNetTrade)), by=list(geographicAreaM49, geographicAreaM49_description, flagEU28, commodity)]

countryByCommodity[, biofuelStructByCountry := prodNetTrade/sum(prodNetTrade), by = geographicAreaM49]

by flagEU28 and commodity

flagEU28Commodity = data1[, list(prodNetTrade = sum(prodNetTrade)), by=list(flagEU28, commodity)]

flagEU28Commodity[, biofuelStructEU28 := prodNetTrade/sum(prodNetTrade), by = flagEU28]

Now, we have to merge the table countryByCommodity with the table

euroStatBiofuel.

keys = c("geographicAreaM49", "commodity")

setnames(euroStatBiofuel, "m49", "geographicAreaM49") euroStatBiofuel[, geographicAreaM49 := as.factor(geographicAreaM49)]

euroStatBiofuel = merge(euroStatBiofuel, countryByCommodity[, c("geographicAreaM49", "commodity", "biofuelStructByCountry"), with = F], by=keys, all.x=T)

## 2.2 Merge tables

Now we can merge this data set with the data set previous with the **density** and **conversion efficiency** information. We see that 39.9% of **biodiesel** came from **Rapeseed**.

````r

euroStatBiofuel[, biofuelStructByCountry2 := 100*round(biofuelStructByCountry, 3)]

kable(euroStatBiofuel[country == "France" & biofuel == "biodiesel" & year == 2011, c("country", "commodity", "year", "densityG_ml", "conversion_efficiency_litres_tonne", "biofuelStructByCountry2"), with=F], format = "markdown", padding = 0, 
      col.names=c("country", "commodity", "year", "density", "conversion_efficiency_litres_tonne", "% ProdNetTrade"))

```r

euroStatBiofuel[biofuel == "biodiesel", valueBiodiselByCommodityTonnes := value * biofuelStructByCountry * 1000]

we need to convert from litres to tonnes

Now we can calculate how many tonnes of each commodity was used

euroStatBiofuel[biofuel == "biodiesel", prodBiodiselByCommodity := valueBiodiselByCommodityTonnes/conversion_efficiency_litres_tonne] euroStatBiofuel

## 2.3 Calculating how many tonnes of commodity was used

Using the same rules applied before, we get the result below. In this case
63.81488 tonnes of Palm Fruit was used to produce 1,618.6 (1000 tonnes of oil equivalent).

````r

kable(euroStatBiofuel[country == "France" & biofuel == "biodiesel" & year == 2011, c("country", "commodity", "year", "value", "biofuelStructByCountry2","densityG_ml", "prodBiodiselByCommodity"), with=F], format = "markdown", padding = 0, 
      col.names=c("country", "commodity", "year", "totalValue", "% ProdNetTrade", "density", "prodByCommodity"))

SWS-Methodology/faoswsIndustrial documentation built on May 9, 2019, 11:48 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

SWS-Methodology/faoswsIndustrial
Package to perform the imputation of industrial data

In SWS-Methodology/faoswsIndustrial: Package to perform the imputation of industrial data

Abstract

Approach 1

1.1 Sources of data sets

1.2 Spliting production by commodity

merge with biofuelStructure

merge eurostatBiodieselTOE with the conversion factors

1.3 Calculating how many tonnes of commodity was used

Approach 2

This code below is pulling data from the old system for "Other util"

51 production (Mt)

61 imports (Mt)

71 stock variation (Mt)

91 exports (Mt)

101 feed (Mt)

131 processed (Mt)

141 food (Mt)

151 other util

As we want to analyse just the results for biodiesel, let's pull just the

commodities used to produce it. The names of the commodities are: "Oil palm fruit",

"Palm oil", "Oil of Palm Kernel", "Cake of Palm kernel", "Rapeseed or canola oil, crude",

"Cake of Rapeseed", "Soya bean oil", "Cake of Soya beans", "Sunflower-seed oil, crude",

"Cake of Sunflower seed"

dcast.data.table

names of the commodities and countries

no NA's

let's read the table with the flag to EU28

merge data with the the flag of EU28

by country, flagEU28 and commodity

by flagEU28 and commodity

Now, we have to merge the table countryByCommodity with the table

euroStatBiofuel.

we need to convert from litres to tonnes

Now we can calculate how many tonnes of each commodity was used

R Package Documentation

Browse R Packages

We want your feedback!

SWS-Methodology/faoswsIndustrial Package to perform the imputation of industrial data

In SWS-Methodology/faoswsIndustrial: Package to perform the imputation of industrial data

Abstract

Approach 1

1.1 Sources of data sets

1.2 Spliting production by commodity

merge with biofuelStructure

merge eurostatBiodieselTOE with the conversion factors

1.3 Calculating how many tonnes of commodity was used

Approach 2

This code below is pulling data from the old system for "Other util"

51 production (Mt)

61 imports (Mt)

71 stock variation (Mt)

91 exports (Mt)

101 feed (Mt)

131 processed (Mt)

141 food (Mt)

151 other util

As we want to analyse just the results for biodiesel, let's pull just the

commodities used to produce it. The names of the commodities are: "Oil palm fruit",

"Palm oil", "Oil of Palm Kernel", "Cake of Palm kernel", "Rapeseed or canola oil, crude",

"Cake of Rapeseed", "Soya bean oil", "Cake of Soya beans", "Sunflower-seed oil, crude",

"Cake of Sunflower seed"

dcast.data.table

names of the commodities and countries

no NA's

let's read the table with the flag to EU28

merge data with the the flag of EU28

by country, flagEU28 and commodity

by flagEU28 and commodity

Now, we have to merge the table countryByCommodity with the table

euroStatBiofuel.

we need to convert from litres to tonnes

Now we can calculate how many tonnes of each commodity was used

R Package Documentation

Browse R Packages

We want your feedback!

SWS-Methodology/faoswsIndustrial
Package to perform the imputation of industrial data