View source: R/09_report_generate.R
| generate_report | R Documentation |
Produces an HTML report combining the eligibility flowchart, the codebook, and a per-variable inspection panel. Supports two inspection modes:
generate_report(
  data,
  type = c("cross_sectional", "longitudinal"),
  id_var = NULL,
  time_var = NULL,
  variables = NULL,
  labels = NULL,
  treat_as_categorical = NULL,
  output_html,
  output_dir = NULL,
  export_codebook_editable = TRUE,
  cache_data = TRUE,
  title = NULL,
  n_bins = 30,
  top_n_cat = 20
)
data |
A Spark DataFrame (tbl_spark) or local data frame. |
type |
One of |
id_var |
Character. Name of the ID column. For |
time_var |
Character or NULL. Name of the time/wave column.
Used in |
variables |
Optional character vector. If provided, inspects only these variables. Default: NULL (all except id_var/time_var). |
labels |
Optional named list (variable -> label). If NULL, uses labels from the codebook when available. |
treat_as_categorical |
Character vector of variable names to treat
as categorical even when their R class is numeric or integer. Useful
for coded variables (e.g. |
output_html |
File path for the HTML output. There is no default:
the destination must be supplied explicitly (e.g. a file under
|
output_dir |
Optional directory for ancillary files (codebook.xlsx, codebook.docx, etc.). If NULL, derived from output_html. |
export_codebook_editable |
Logical. Also export codebook as
.docx and .xlsx in |
cache_data |
Logical. If TRUE and |
title |
Optional title for the report. |
n_bins |
Number of bins for numeric histograms. Default: 30. |
top_n_cat |
Max categories shown in categorical plots. Default: 20. |
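As a rough illustration of what treat_as_categorical, n_bins, and top_n_cat control, the base R sketch below summarises a coded integer column both ways. It does not call generate_report or reproduce its internals; the data and the names `coded`, `as_numeric`, `counts`, and `as_categorical` are made up for the example.

```r
# Hypothetical data: a sex code stored as an integer (1/2).
set.seed(1)
coded <- sample(c(1L, 2L), 50, replace = TRUE)

# Treated as numeric, the column is binned like a continuous variable.
# A single number passed to `breaks` is only a suggestion to hist().
n_bins <- 30
as_numeric <- hist(coded, breaks = n_bins, plot = FALSE)

# Treated as categorical, each code becomes a level and is counted;
# only the most frequent `top_n_cat` levels are kept for display.
top_n_cat <- 20
counts <- sort(table(factor(coded)), decreasing = TRUE)
as_categorical <- head(counts, top_n_cat)

print(as_categorical)
```

For a two-level code the categorical summary is a two-row count table, which is usually easier to read than a histogram over the numeric codes.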
cross_sectional: one plot per variable (histogram / bar / time).
longitudinal: three plots per variable (global distribution, intra-ID
variation, missingness by time) plus a meta plot of observations per ID.
All aggregations happen in Spark/dplyr; only small summaries are collected.
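The longitudinal summaries described above can be approximated in base R on a small local data frame. This is only a sketch under assumptions: the package itself aggregates with Spark/dplyr, the data frame `df_long` and its columns are hypothetical, and the within-ID standard deviation is just one plausible measure of intra-ID variation, not necessarily the one the report uses.

```r
# Hypothetical long-format data: 4 IDs observed over up to 3 waves.
df_long <- data.frame(
  id_indiv = c("ID001", "ID001", "ID001", "ID002", "ID002",
               "ID003", "ID003", "ID003", "ID004"),
  wave     = c(1, 2, 3, 1, 2, 1, 2, 3, 1),
  idade    = c(30, 31, NA, 45, 46, NA, 61, 62, 25)
)

# Observations per ID (the kind of input a "meta" plot would need).
obs_per_id <- table(df_long$id_indiv)

# Missingness by time: share of NA values of the variable in each wave.
miss_by_time <- tapply(is.na(df_long$idade), df_long$wave, mean)

# Intra-ID variation (assumed measure): within-ID standard deviation,
# ignoring NAs. IDs with a single observation yield NA.
sd_within_id <- tapply(df_long$idade, df_long$id_indiv, sd, na.rm = TRUE)

print(obs_per_id)
print(miss_by_time)
print(sd_within_id)
```

Each result has one entry per ID or per wave, so it stays small no matter how many rows the underlying table has, which is what makes it cheap to collect from a remote backend.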
Returns, invisibly, a list with the paths to all generated files.
# Rendering the HTML report needs rmarkdown + pandoc and a few plotting
# packages (all in Suggests); it also takes more than 5 seconds, so the
# example is wrapped in \donttest and writes only to tempdir().
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    requireNamespace("knitr", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("patchwork", quietly = TRUE) &&
    requireNamespace("scales", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  cb_init(id_col = "id_indiv")
  df_baseline <- data.frame(
    id_indiv = sprintf("ID%03d", 1:50),
    cod_sexo = sample(c(1L, 2L), 50, replace = TRUE),
    idade    = sample(18:80, 50, replace = TRUE)
  )
  # Write to a dedicated subdir of tempdir() and clean everything up after:
  out_dir <- file.path(tempdir(), "autocodebook_report_demo")
  generate_report(df_baseline, type = "cross_sectional",
                  id_var = "id_indiv",
                  treat_as_categorical = "cod_sexo",
                  output_html = file.path(out_dir, "report_baseline.html"))
  unlink(out_dir, recursive = TRUE)
}