    NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true")
    knitr::opts_chunk$set(
      collapse = TRUE,
      comment = "#>",
      eval = NOT_CRAN,
      fig.width = 7,
      fig.asp = 0.618,
      fig.align = "center",
      message = FALSE,
      warning = FALSE
    )

This vignette provides a minimal introduction to the realestatebr package and shows how to use its core functions. Because realestatebr returns tibbles by default, we recommend using it together with the dplyr package, though conversion to data.table is trivial.

    library(realestatebr)
    library(dplyr)

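As a minimal sketch of the data.table conversion mentioned above: tibbles are data frames, so `data.table::as.data.table()` should accept them directly. The `toy` object below is a made-up stand-in for a `get_dataset()` result, not real package output, and the conversion only runs if data.table is installed.

```r
# Stand-in for a get_dataset() result; the values are made up.
toy <- data.frame(
  date = as.Date(c("2023-01-31", "2023-02-28")),
  value = c(1.5, 2.5)
)

# Convert only when data.table is available.
if (requireNamespace("data.table", quietly = TRUE)) {
  toy_dt <- data.table::as.data.table(toy)
  print(class(toy_dt))
}
```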
The code below defines a common theme for all plots in this vignette. It is needed only to reproduce the plots exactly as shown; otherwise it is entirely optional and can be omitted.

    #| code-fold: true
    library(ggplot2)

    color_palette <- c(
      "#1E3A5F", "#DD6B20", "#2C7A7B", "#D69E2E", "#805AD5", "#C53030"
    )

    theme_series <- function() {
      theme_minimal(
        # swap for other font if needed
        base_family = "Avenir",
        base_size = 10
      ) +
        theme(
          plot.title = element_text(size = 16),
          panel.grid.minor = element_blank(),
          panel.grid.major.x = element_blank(),
          axis.line.x = element_line(color = "gray10", linewidth = 0.5),
          axis.ticks.x = element_line(color = "gray10", linewidth = 0.5),
          axis.title.x = element_blank(),
          legend.position = "bottom",
          palette.color.discrete = color_palette
        )
    }


    #| include: false
    library(knitr)
    library(kableExtra)

The goal of realestatebr is to provide a unified interface to Brazilian real estate data from multiple public sources. All datasets are returned as tidy tibble objects. The package is centered around a key function: get_dataset(name, table) which retrieves any dataset by name. Without a table argument it returns the default table; use table to select a specific sub-table.
get_dataset(): the main function to retrieve datasets.

    #| eval: false
    # Default table
    abecip <- get_dataset("abecip")

    # Specific table
    sbpe <- get_dataset("abecip", table = "units")

In order to explore which datasets are available, use list_datasets() and get_dataset_info().
list_datasets() returns a catalogue of all available datasets and their tables.

    ds <- list_datasets()


    #| echo: false
    ds |>
      select(name, title, source, available_tables, frequency) |>
      kable() |>
      kable_styling(bootstrap_options = "striped") |>
      scroll_box(width = "100%", height = "400px")

get_dataset_info() shows available tables and metadata for a given dataset.

    #| eval: false
    info <- get_dataset_info("abecip")
    names(info$categories)
    #> [1] "sbpe" "units" "cgi"

source Argument

The source argument of get_dataset() controls where the data comes from. The default ("auto") reads the in-session memo if present, falls back to the package's GitHub release, and finally falls back to a fresh download from the original source. Typically the default is fine. Use "github" to force the pre-processed asset, or "fresh" to always pull from the original source (slower but guaranteed up-to-date).

    #| eval: false
    get_dataset("abecip", source = "github") # pre-processed asset from GitHub release
    get_dataset("abecip", source = "fresh")  # direct from the original source

Repeated calls within one R session are served from an in-memory memo, so
fetching the same dataset twice does not re-download. Use
clear_session_cache() to drop the memo without restarting R.
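The lookup order described for source = "auto" (memo, then GitHub release, then original source) can be illustrated with a small base R sketch. This is not the package's implementation: `fetch_with_fallback()`, `clear_memo()`, `github_down()`, and `original_source()` are hypothetical names invented for the illustration, and the toy fetchers only record the order in which they are called.

```r
# Hypothetical sketch: none of these names are realestatebr functions.
memo <- new.env()

fetch_with_fallback <- function(name, from_github, from_source) {
  # 1. Serve from the in-session memo when the dataset is already there.
  if (exists(name, envir = memo, inherits = FALSE)) {
    return(get(name, envir = memo))
  }
  # 2. Try the pre-processed asset; treat any error as "not available".
  result <- tryCatch(from_github(name), error = function(e) NULL)
  # 3. Fall back to the original source.
  if (is.null(result)) {
    result <- from_source(name)
  }
  assign(name, result, envir = memo)
  result
}

# Drop everything stored in the memo without restarting R.
clear_memo <- function() rm(list = ls(memo), envir = memo)

# Toy fetchers that record the order in which they are called.
calls <- character()
github_down <- function(name) {
  calls <<- c(calls, "github")
  stop("release asset unavailable")
}
original_source <- function(name) {
  calls <<- c(calls, "fresh")
  data.frame(date = as.Date("2023-01-31"), value = 1)
}

first <- fetch_with_fallback("demo", github_down, original_source)
second <- fetch_with_fallback("demo", github_down, original_source)

calls                    # the second call never reaches either fetcher
identical(first, second)
```

In this sketch the second call is answered from the memo, so each fetcher runs only once. Calling `clear_memo()` would force the next request through the fallback chain again.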
SBPE (Sistema Brasileiro de Poupança e Empréstimo) is the primary funding
mechanism for residential mortgages in Brazil. The sbpe table from abecip tracks deposits to and withdrawals from savings accounts, which help finance real estate construction and acquisition.

    sbpe <- get_dataset("abecip", table = "sbpe")
    glimpse(sbpe)

The plot below shows the annual net savings flow in recent years.

    #| code-fold: true
    # Annual net savings flow
    sbpe_annual <- sbpe |>
      filter(date >= as.Date("2019-01-01")) |>
      mutate(year = lubridate::year(date)) |>
      summarise(net_flow = sum(sbpe_netflow, na.rm = TRUE) / 1e3, .by = year) |>
      mutate(
        label_num = format(round(net_flow, 1)),
        ypos = if_else(net_flow > 0, net_flow + 10, net_flow - 10)
      )

    ggplot(sbpe_annual, aes(year, net_flow)) +
      geom_col(fill = color_palette[1], alpha = 0.9, width = 0.8) +
      geom_text(aes(y = ypos, label = label_num), size = 3) +
      geom_hline(yintercept = 0) +
      scale_x_continuous(breaks = 2019:2026) +
      labs(
        title = "Annual Net Savings Flow (SBPE)",
        x = NULL,
        y = "R$ billions"
      ) +
      theme_series()

The companion table "units" contains monthly counts of financed units.

    units <- get_dataset("abecip", table = "units")
    glimpse(units)

The plot below shows the number of units financed per month, together with a LOESS trend line.

    #| code-fold: true
    # SBPE units financed per month, 2019 onward
    units_recent <- units |>
      filter(date >= as.Date("2019-01-01"))

    ggplot(units_recent, aes(date, units_total)) +
      geom_point(alpha = 0.5, size = 0.8, color = color_palette[1]) +
      geom_smooth(
        color = color_palette[1],
        lwd = 0.7,
        se = FALSE,
        method = stats::loess,
        method.args = list(span = 0.4)
      ) +
      scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
      labs(
        title = "Monthly Financed Units",
        y = "Units"
      ) +
      theme_series()

The bcb_realestate dataset imports all real estate statistics from the Brazilian Central Bank. This is a relatively large dataset, and exploring it can be cumbersome. Each observation is uniquely identified by date and series_info. The helper columns v1, v2, ..., v5, abbrev_state, category, and type are provided to simplify filtering.
The code below shows how to access a specific series and also how to fetch a group of related series.

    bcb <- get_dataset("bcb_realestate")

    # Get a specific series
    sfh_pf <- bcb |>
      filter(series_info == "credito_estoque_carteira_credito_pf_sfh_br")

    # Get all the related series for 'estoque_carteira_credito_pf'
    credit_stock <- bcb |>
      filter(
        category == "credito",
        type == "estoque",
        v1 == "carteira",
        v2 == "credito",
        v3 == "pf",
        # since v4 is left blank, we get all credit lines
        v5 == "br"
      )

    # The helper columns essentially separate the 'series_info' column, allowing
    # for easier filtering. It's equivalent to filtering by regex
    credit_stock <- bcb |>
      filter(grepl(
        "(?<=credito_estoque_carteira_credito_pf_).+_br$",
        series_info,
        perl = TRUE
      ))

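To make the equivalence between the helper columns and the regex concrete, here is a small base R sketch on toy data. It assumes the helper columns correspond to the underscore-separated parts of series_info, which matches the series names used above but may not hold for every series in the real dataset. The last two names in the vector are hypothetical and included only as non-matching cases.

```r
# Toy series names; the last two are hypothetical non-matching cases.
series_info <- c(
  "credito_estoque_carteira_credito_pf_sfh_br",
  "credito_estoque_carteira_credito_pf_fgts_br",
  "credito_estoque_carteira_credito_pf_livre_br",
  "credito_estoque_carteira_credito_pj_sfh_br",
  "credito_estoque_carteira_credito_pf_sfh_sp"
)

# Split each name on underscores (assumed layout: seven parts per name).
parts <- do.call(rbind, strsplit(series_info, "_", fixed = TRUE))
colnames(parts) <- c("category", "type", "v1", "v2", "v3", "v4", "v5")
toy <- data.frame(series_info, parts, stringsAsFactors = FALSE)

# Filter on the helper columns, leaving v4 unconstrained.
by_columns <- toy$category == "credito" & toy$type == "estoque" &
  toy$v1 == "carteira" & toy$v2 == "credito" &
  toy$v3 == "pf" & toy$v5 == "br"

# Filter with the same regex used on the full dataset.
by_regex <- grepl(
  "(?<=credito_estoque_carteira_credito_pf_).+_br$",
  toy$series_info,
  perl = TRUE
)

identical(by_columns, by_regex)
toy$v4[by_columns]
```

Both filters select the same rows in this toy example. The column-based version is usually easier to adapt, for instance by changing only v3 or v5.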
The single series shows only the values from SFH (specific credit line).

    #| code-fold: true
    ggplot(sfh_pf, aes(date, value / 1e9)) +
      geom_line(lwd = 0.7, color = color_palette[1]) +
      labs(title = "SFH", y = "R$ (billions)") +
      theme_series()

The grouped series show the entire household credit stock by credit line.

    #| code-fold: true
    credit_labels <- c(
      "Home Equity" = "home-equity",
      "Comercial" = "comercial",
      "Livre" = "livre",
      "FGTS" = "fgts",
      "SFH" = "sfh"
    )

    credit_stock <- credit_stock |>
      mutate(
        credit_line_label = factor(
          v4,
          levels = credit_labels,
          labels = names(credit_labels)
        )
      )

    ggplot(credit_stock, aes(date, value / 1e9)) +
      geom_area(aes(fill = credit_line_label), alpha = 0.9) +
      scale_fill_manual(values = rev(color_palette[1:5])) +
      scale_x_date(expand = expansion(mult = c(0.01))) +
      scale_y_continuous(expand = expansion(mult = c(0, 0.05))) +
      labs(
        title = "Real Estate Credit Stock",
        subtitle = "Household real estate credit stock (total debt) by credit line",
        y = "R$ (billions)",
        fill = NULL
      ) +
      theme_series()

As a final warning, note that the bcb_realestate dataset dates each observation to the last day of the month (e.g. 2023-01-31, in YYYY-MM-DD format). This can cause issues when merging with other datasets, since dating monthly observations to the first day of the month is the more common convention (e.g. 2023-01-01).
To avoid this, use lubridate::floor_date(date, 'month'). Future versions of realestatebr might provide this as a default behavior.
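The mismatch and its fix can be seen on two toy data frames (made-up values, not package output). To keep the sketch dependency-free it uses a small hypothetical helper, `floor_month()`, written in base R; for Date input it should give the same result as lubridate::floor_date(date, 'month').

```r
# Month-end dates, as in bcb_realestate (toy values).
bcb_like <- data.frame(
  date = as.Date(c("2023-01-31", "2023-02-28")),
  credit = c(10, 12)
)

# Month-start dates, as in many other monthly datasets (toy values).
other <- data.frame(
  date = as.Date(c("2023-01-01", "2023-02-01")),
  units = c(100, 110)
)

# Joining on the raw dates finds no matching rows.
unaligned <- merge(bcb_like, other, by = "date")
nrow(unaligned)

# Hypothetical helper: move every date to the first day of its month.
floor_month <- function(x) as.Date(format(x, "%Y-%m-01"))

bcb_like$date <- floor_month(bcb_like$date)
aligned <- merge(bcb_like, other, by = "date")
nrow(aligned)
```

After flooring, both tables share the same monthly key and the join keeps every month.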
The available datasets are listed below.
| Dataset | Source | Tables | Status |
|---------|--------|--------|--------|
| abecip | ABECIP | sbpe, units, cgi | Active |
| abrainc | ABRAINC / FIPE | indicator, radar, leading | Active |
| bcb_realestate | Banco Central do Brasil | accounting, application, indices, sources, units | Active |
| bcb_series | Banco Central do Brasil | core, primary, secondary, tertiary, full | Active |
| fgv_ibre | FGV IBRE | — | Active |
| rppi | FIPE/ZAP, IVGR, IGMI, IQA, IVAR, SECOVI-SP | sale, rent, fipezap, ivgr, igmi, iqa, iqaiw, ivar, secovi_sp | Active |
| rppi_bis | Bank for International Settlements | selected, detailed_monthly, detailed_quarterly, detailed_annual, detailed_halfyearly | Active |
| secovi | SECOVI-SP | condo, rent, launch, sale | Active |