library(magrittr)
library(ggplot2)
library(dplyr)
library(igraph)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
Our company databases have codes that indicate a company's economic activity. Economic activities are indicated by the NACE, SBI or SIC codes, depending on the country. Although this classification is helpful when looking at a client's customer portfolio, it contains so many groups, and in some cases so few customers per group, that it is hard to make a proper portfolio analysis to spot a client's opportunities. With this package you can use the hierarchical nature of the economic activity coding to make fewer and larger groups.
First let's load the package:
library(graydon.package)
The package contains a data frame, tbl_SBI_count, which is an example of the SBI codes together with a count of the Dutch companies associated with each SBI code. The data frame contains the following columns:
data.frame(`Column names` = names(tbl_SBI_count)) %>% knitr::kable()
To do any calculation or plotting on the economic activity hierarchy, we have to create a graph. The parameters col_id and col_id_parent are the column names that contain the economic activity code and the parent code respectively.
graph_SBI <- create_economic_activity_graph(tbl_SBI_count, col_id = "code_SBI", col_id_parent = "code_SBI_parent")
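To make the idea of a hierarchy graph concrete, here is a minimal sketch that uses plain igraph on a made-up table. The toy codes and the names `tbl_toy` and `graph_toy` are illustrative assumptions, and this is not necessarily how create_economic_activity_graph builds its graph internally.

```r
library(igraph)

# Hypothetical toy hierarchy: each row is a code with its parent code.
tbl_toy <- data.frame(
  code        = c("01", "011", "012", "0111", "0112"),
  code_parent = c("A",  "01",  "01",  "011",  "011"),
  stringsAsFactors = FALSE
)

# In this sketch edges point from child to parent (an assumption);
# the root "A" only occurs as a parent.
graph_toy <- graph_from_data_frame(tbl_toy[, c("code", "code_parent")], directed = TRUE)

vcount(graph_toy)  # number of codes, including the root
ecount(graph_toy)  # one edge per child-parent pair
```

The first two columns of the data frame are read as the edge list, so any code that appears only as a parent still becomes a vertex.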
By visualizing the hierarchy with the package's plot_graydon_graph function we can see it is very complex:
plot_graydon_graph(graph_SBI, vertex.label = "", vertex.size = 3, edge.arrow.size = 0)
One way to reduce the complexity of the hierarchy is to aggregate subcodes into higher-level codes whenever they do not meet a minimum value requirement. Take the number of companies as an example: whenever a subcode has fewer than 5,000 companies, those companies get the code one level higher up in the hierarchy:
graph_SBI_rolled <- roll_up_hierarchy_by_minimum(graph_tree = graph_SBI, name_attribute = "qty_companies", name_propagated = "qty_companies_cum", threshold = 5000)
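The roll-up rule can be illustrated on a tiny hand-made tree in base R. This is only a sketch of one plausible reading of the rule, with hypothetical codes, counts and variable names; the algorithm inside roll_up_hierarchy_by_minimum may differ in its details.

```r
# Toy hierarchy (hypothetical codes and counts).
codes  <- c("A", "01", "02", "011", "012")
parent <- c(NA,  "A",  "A",  "01",  "01")
qty    <- c(0,   100,  8000, 6000,  300)
depth  <- c(0,   1,    1,    2,     2)
threshold <- 5000

names(parent) <- names(qty) <- codes
qty_cum   <- qty                      # running totals after roll-up
rolled_to <- setNames(codes, codes)   # code each original code ends up in

# Visit the deepest codes first so totals propagate upwards.
for (code in codes[order(depth, decreasing = TRUE)]) {
  p <- parent[[code]]
  if (!is.na(p) && qty_cum[[code]] < threshold) {
    qty_cum[[p]]    <- qty_cum[[p]] + qty_cum[[code]]
    qty_cum[[code]] <- 0
    # Move this code, and anything already rolled into it, to the parent.
    rolled_to[rolled_to == code] <- p
  }
}

rolled_to
qty_cum
```

In this toy case "012" and "01" both fall below the threshold and end up under "A", while "011" and "02" are large enough to stay as they are. The total number of companies is unchanged by the roll-up.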
Maybe it doesn't look as simple as you might want it to be, but it is a lot simpler than you'd think. The smaller networks you now see are the 'translation' networks: they specify which subcode should be recoded to which 'higher up' code. The highest code gets a loop back to itself. So each 'subnetwork' in this plot is actually one code in the newly acquired aggregation. The 'higher up' codes are marked in orange in this graph.
V(graph_SBI_rolled)$color <- ifelse(V(graph_SBI_rolled)$is_root, 1, 2)
V(graph_SBI_rolled)$label <- ifelse(V(graph_SBI_rolled)$is_root, V(graph_SBI_rolled)$name, "")
plot_graydon_graph(graph_SBI_rolled, vertex.size = 3, edge.arrow.size = 0)
V(graph_SBI_rolled)$label <- format_number(V(graph_SBI_rolled)$qty_companies_cum)
V(graph_SBI_rolled)$label <- ifelse(V(graph_SBI_rolled)$label == 0, "", V(graph_SBI_rolled)$label)
plot_graydon_graph(graph_SBI_rolled, vertex.size = 3, edge.arrow.size = 0)
We could isolate these clusters and inspect them further if we wanted to:
list_graphs <- decompose(graph_SBI_rolled)
# Getting all ending activity codes
idx_searched <- names(sapply(list_graphs, function(x) igraph::V(x)[1]))
plot_graydon_graph(list_graphs[[1]])
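As a small illustration of what igraph's decompose does, the sketch below splits a made-up graph with two translation clusters into its connected components. The codes and the names `edges_toy` and `graph_toy_rolled` are hypothetical.

```r
library(igraph)

# Hypothetical rolled-up graph: two clusters, each with a self-loop on its top code.
edges_toy <- data.frame(
  from = c("011", "012", "A", "021", "02"),
  to   = c("A",   "A",   "A", "02",  "02"),
  stringsAsFactors = FALSE
)
graph_toy_rolled <- graph_from_data_frame(edges_toy, directed = TRUE)

# Each weakly connected component becomes its own graph.
list_toy <- decompose(graph_toy_rolled)

length(list_toy)            # number of clusters
sapply(list_toy, vcount)    # number of codes in each cluster
```

Each element of the resulting list is an ordinary igraph object, so it can be plotted or inspected on its own.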
This graph is the basis for translating the original economic activity codes into the rolled-up codes:
df_translation_codes <- rolled_up_as_data_frame(graph_SBI_rolled)
df_translation_codes %>% head() %>% knitr::kable()
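A translation table like this is typically applied to a customer portfolio with a lookup. The sketch below shows the idea in base R; the column names `code_original` and `code_rolled_up` and both toy data frames are assumptions for illustration, so check the actual columns of df_translation_codes before adapting it.

```r
# Hypothetical translation table; the real column names may differ.
df_translation_toy <- data.frame(
  code_original  = c("011", "012", "01", "02"),
  code_rolled_up = c("011", "A",   "A",  "02"),
  stringsAsFactors = FALSE
)

# Hypothetical customer portfolio with one activity code per company.
df_companies <- data.frame(
  id_company = 1:5,
  code       = c("012", "011", "02", "01", "012"),
  stringsAsFactors = FALSE
)

# Look up the rolled-up code for every company.
idx <- match(df_companies$code, df_translation_toy$code_original)
df_companies$code_rolled_up <- df_translation_toy$code_rolled_up[idx]

# Portfolio counts per aggregated code.
table(df_companies$code_rolled_up)
```

Using match keeps the portfolio rows in their original order; a company whose code is missing from the translation table would get NA rather than being dropped.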
When looking at which sectors a group of companies occupies, you quickly drown in data if you look at the most specific NACE or SBI level. In this case you might want to go up the tree a few notches. The function hierarchy_get_higher_level helps you do this. It takes an economic activity hierarchy, SBI (tbl_SBI) or NACE (tbl_NACE), and delivers a data frame with the recoding and descriptions of the specified higher level.
Let's take an example in which every NACE code at level 2 or lower gets the information of its level 2 code:
tbl_NACE_recoded <- hierarchy_get_higher_level(tbl_NACE,
                                               level_no = 2,
                                               col_code = "code_NACE",
                                               col_code_parent = "code_NACE_parent",
                                               col_layer_no = "hierarchy_layer")
tbl_NACE_recoded %>% head(10) %>% knitr::kable()
Here you see that you get the original code code_NACE with all the other fields from the higher level code code_NACE_higher_level.
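The idea of going up the tree can be sketched in base R by following parent codes until the requested layer is reached. The toy table, the helper `climb_to_level` and the choice to leave codes above the requested layer unchanged are all assumptions of this sketch, not a description of how hierarchy_get_higher_level works internally.

```r
# Hypothetical toy hierarchy with an explicit layer number per code.
tbl_toy <- data.frame(
  code        = c("A", "01", "011", "0111", "02"),
  code_parent = c(NA,  "A",  "01",  "011",  "A"),
  layer       = c(1,   2,    3,     4,      2),
  stringsAsFactors = FALSE
)
level_no <- 2

# Climb from one code to its ancestor at the requested layer.
# Codes already at or above that layer are returned unchanged (an assumption).
climb_to_level <- function(code) {
  row <- match(code, tbl_toy$code)
  while (tbl_toy$layer[row] > level_no) {
    row <- match(tbl_toy$code_parent[row], tbl_toy$code)
  }
  tbl_toy$code[row]
}

tbl_toy$code_higher_level <- vapply(tbl_toy$code, climb_to_level,
                                    character(1), USE.NAMES = FALSE)
tbl_toy
```

Once every code has its higher-level counterpart, the descriptions of that level can be attached with an ordinary join on `code_higher_level`.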