bdplyr | R Documentation |
Allows you to explore and operate on Base dos Dados datasets without writing SQL. The bdplyr() function creates lazy variables that are connected directly to the desired Base dos Dados table on Google BigQuery and can be handled with the dplyr::dplyr-package's verbs, just as is traditionally done with local tables. See also: bigrquery::src_bigquery.
It is therefore possible, without using SQL, to select columns with dplyr::select(), filter rows with dplyr::filter(), perform operations on columns with dplyr::mutate(), join tables with dplyr::left_join(), and use other verbs from the {dplyr} package.
The data will be downloaded automatically from Google BigQuery in the background as necessary, but will not be loaded into memory or written to disk unless expressly requested.
To do so, use functions such as bd_collect() or bd_write(). To load the resulting data into local memory, use bd_collect(). To save the result to disk, use the more general function bd_write(), or its derivatives bd_write_csv() and bd_write_rds(), which save in .csv and .rds format respectively.
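The lazy behaviour described above can be tried locally, without BigQuery credentials. The following is a minimal sketch and not Base dos Dados code: it assumes the {dplyr}, {dbplyr} and {RSQLite} packages are installed, and it uses an in-memory SQLite table with made-up rows as a stand-in for a table returned by bdplyr(). With a real bdplyr() table, bd_collect() would play the role that dplyr::collect() plays here.

```r
library(dplyr)

# Stand-in for a remote table: an in-memory SQLite table.
# The rows below are illustrative only, not real Base dos Dados data.
municipios <- dbplyr::memdb_frame(
  id_municipio = c("1200013", "1200054", "3550308"),
  sigla_uf     = c("AC", "AC", "SP")
)

# Nothing is fetched yet: the verbs only build up a query.
acre_lazy <- municipios %>%
  filter(sigla_uf == "AC") %>%
  select(id_municipio, sigla_uf)

# Inspect the SQL that will be sent to the backend.
show_query(acre_lazy)

# collect() runs the query and returns an ordinary local tibble.
acre_local <- collect(acre_lazy)
```

Until collect() is called, acre_lazy is only a description of a query, which is why the filtering and selection steps cost nothing locally.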
bdplyr(
  table,
  billing_project_id = basedosdados::get_billing_id(),
  query_project_id = "basedosdados"
)
table |
String in the format |
billing_project_id |
a string containing your billing project id.
If you've run |
query_project_id |
The project name at GoogleBigQuery. By default
|
A lazy tibble, which can be handled (almost) as if it were a local database. Once the result is satisfactory, it must be loaded into memory using bd_collect() or written to disk using bd_write() or its derivatives.
bd_collect(), bd_write(), bd_write_rds(), bd_write_csv(), bigrquery::src_bigquery
## Not run: 
# set project billing id
basedosdados::set_billing_id("avalidprojectbillingid")

# connects to the remote table I want
base_sim <- bdplyr("br_ms_sim.municipio_causa_idade")

# connects to another remote table
municipios <- bdplyr("br_bd_diretorios_brasil.municipio")

# explore data
base_sim %>%
  dplyr::glimpse()

# use normal `{dplyr}` operations
municipios %>%
  head()

# filter
base_sim_acre <- base_sim %>%
  dplyr::mutate(ano = as.numeric(ano)) %>%
  dplyr::filter(sigla_uf == "AC", ano >= 2018)

municipios_acre <- municipios %>%
  dplyr::filter(sigla_uf == "AC") %>%
  dplyr::select(id_municipio, municipio, regiao)

# join
base_junta <- base_sim_acre %>%
  dplyr::left_join(municipios_acre, by = "id_municipio")

# tests whether the result is satisfactory
base_junta

# collect the result
base_final <- base_junta %>%
  basedosdados::bd_collect()

# alternatively, write the result to disk
base_final %>%
  basedosdados::bd_write_rds(path = "data-raw/data.rds")

## End(Not run)