In SPSS and Stata, labelled data are datasets in which each variable (column) and its values carry descriptive labels. These labels provide context, such as descriptions or categories, making the data easier to understand and analyze. For instance, a variable representing gender might store the numeric codes 1 and 2 with the value labels "Male" and "Female". Researchers can then work with the descriptive labels rather than deciphering raw codes, which makes statistical results easier to interpret and communicate.
The R ecosystem, through packages like foreign and haven, supports importing labelled data from software such as SPSS and Stata. The bulkreadr package builds on haven to streamline this process further: it automatically converts labelled data into R's factor data type, so no manual recoding is needed before analysis.
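To make the conversion concrete, here is a minimal base R sketch of turning labelled codes into a factor. The vector, its labels, and the attribute layout are illustrative assumptions; this is not how haven or bulkreadr implement the conversion internally.

```r
# Hypothetical survey column: numeric codes as they might arrive from SPSS or Stata
gender <- c(1, 2, 2, 1)

# Assumption for illustration: labels are kept as attributes on the vector
attr(gender, "label") <- "Gender of respondent"    # variable label
attr(gender, "labels") <- c(Male = 1, Female = 2)  # value labels

# Conversion: map each numeric code to its value label
value_labels <- attr(gender, "labels")
gender_factor <- factor(
  gender,
  levels = unname(value_labels),
  labels = names(value_labels)
)

print(gender_factor)
table(gender_factor)
```

After the conversion, summaries such as table() report "Male" and "Female" rather than 1 and 2.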
Most of the examples below use example data shipped with the bulkreadr package, which can be located with the system.file() function. To use your own data from a local directory instead, set the appropriate file path before calling any bulkreadr function.
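As a rough sketch of choosing between the packaged example and your own file, the snippet below uses only base R. The local path data/my_survey.sav is a hypothetical placeholder, not a file that ships with anything.

```r
# Locate the example file shipped with the package.
# system.file() returns "" (an empty string) when the package or file is missing.
example_path <- system.file("extdata", "Wages.sav", package = "bulkreadr")

# Hypothetical location of your own data; replace with a real path
local_path <- file.path("data", "my_survey.sav")

# Prefer the local file when it exists, otherwise fall back to the packaged example
file_path <- if (file.exists(local_path)) local_path else example_path

if (!nzchar(file_path)) {
  message("No data file found: install bulkreadr or correct the local path.")
} else {
  message("Using data file: ", file_path)
}
```

Checking the path up front gives a clearer message than letting a reader function fail on an empty or mistyped path.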
read_spss_data() imports data from SPSS data files (.sav or .zsav). It converts labelled variables into factors, which makes the data easier to manipulate and analyze in R.
Read the SPSS data file without using variable labels as column names

    library(bulkreadr)

    # Path to the example SPSS file shipped with bulkreadr
    file_path <- system.file("extdata", "Wages.sav", package = "bulkreadr")

    data <- read_spss_data(file = file_path)
    data
Read the SPSS data file and use variable labels as column names

    # label = TRUE uses the variable labels as column names
    data <- read_spss_data(file = file_path, label = TRUE)
    data
read_stata_data() reads a Stata data file (.dta) into an R data frame, converting labelled variables into factors.
Read the Stata data file without using variable labels as column names

    # Path to the example Stata file shipped with bulkreadr
    file_path <- system.file("extdata", "Wages.dta", package = "bulkreadr")

    data <- read_stata_data(file = file_path)
    data
Read the Stata data file and use variable labels as column names

    # label = TRUE uses the variable labels as column names
    data <- read_stata_data(file = file_path, label = TRUE)
    data
generate_dictionary() creates a data dictionary from a specified data frame. This function is particularly useful for understanding and documenting the structure of your dataset, similar to data dictionaries in Stata or SPSS.
    # Creating a data dictionary from an SPSS file
    file_path <- system.file("extdata", "Wages.sav", package = "bulkreadr")
    wage_data <- read_spss_data(file = file_path)

    generate_dictionary(wage_data)
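For readers curious about what a data dictionary contains, the base R sketch below builds a small one by hand. The data frame, its labels, the chosen columns, and the helper name sketch_dictionary() are all hypothetical; the output of generate_dictionary() may be organized differently.

```r
# Hypothetical data frame with a variable label attached to each column
survey <- data.frame(
  wage = c(12.5, 18.0, 9.75),
  south = factor(c("yes", "no", "no"))
)
attr(survey$wage, "label") <- "Hourly wage"
attr(survey$south, "label") <- "Lives in the South"

# Hypothetical helper: one dictionary row per column
sketch_dictionary <- function(df) {
  # Collect the variable label of each column, or NA when none is set
  labels <- vapply(df, function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))

  data.frame(
    position = seq_along(df),
    variable = names(df),
    label = unname(labels),
    class = vapply(df, function(col) class(col)[1], character(1), USE.NAMES = FALSE),
    n_missing = vapply(df, function(col) sum(is.na(col)), integer(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

dictionary <- sketch_dictionary(survey)
print(dictionary)
```

Each row describes one variable, which is the general shape of the dictionaries that Stata and SPSS users are used to.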
The look_for() function emulates the Stata lookfor command in R. It searches a dataset's variable names, variable label descriptions, factor levels, and value labels for a keyword. This is handy when working with large, complex datasets, because it lets you quickly locate the variables of interest.
    # Look for a single keyword.
    look_for(wage_data, "south")

    # Look for a pattern; "^s" is written as a regular expression anchored at the start
    look_for(wage_data, "^s")
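The search idea can be sketched in base R as a case-insensitive regular-expression match over variable names and variable labels. The data, the labels, and the helper name sketch_look_for() are hypothetical, and this simplified version ignores factor levels and value labels, which look_for() also covers.

```r
# Hypothetical data frame with variable labels stored as attributes
survey <- data.frame(
  wage = c(12.5, 18.0, 9.75),
  south = c(1, 0, 0),
  sex = c(1, 2, 2)
)
attr(survey$wage, "label") <- "Hourly wage in dollars"
attr(survey$south, "label") <- "Lives in the southern region"
attr(survey$sex, "label") <- "Sex of respondent"

# Hypothetical helper: return variables whose name or label matches the pattern
sketch_look_for <- function(df, pattern) {
  # Use an empty string for columns that have no variable label
  labels <- vapply(df, function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) "" else as.character(lab)
  }, character(1))

  hit <- grepl(pattern, names(df), ignore.case = TRUE) |
    grepl(pattern, labels, ignore.case = TRUE)

  data.frame(
    variable = names(df)[hit],
    label = unname(labels)[hit],
    stringsAsFactors = FALSE
  )
}

sketch_look_for(survey, "south")  # keyword search
sketch_look_for(survey, "^s")     # names or labels starting with "s"
```

Matching names and labels together is what lets a keyword find a variable even when its column name is cryptic.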