wfs_extract: Data extraction


View source: R/wfs_extract.R

Description

Extract a set of variables from a dataset, reading it either from the DHS data archive or from the local file system.

Usage

wfs_extract(varlist, dataset, source = "", convert.factors = TRUE)

Arguments

varlist

A required string with a comma-separated list of variables, for example "v010, v110". May include ranges, wildcards or keywords, as explained under Details.

dataset

A required string with the name of a dataset, for example "cosr02"

source

An optional string specifying a local folder in which to find the files. Leave it blank to read the files from the DHS data archive.

convert.factors

An optional logical value. If TRUE (the default), value labels are used to create factors.

Value

A data frame with attributes.

Details

The varlist is a comma-separated list of variable names, all in lowercase (even if dictionaries use uppercase). The list may include a range such as v701-v705 to extract variables v701 to v705. It may also include the wildcards ? and * to match one or more characters, so m??2 extracts the date of union for all unions. You may also use the keywords unions to extract the union history, births for the birth history, or all for all variables.
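To make the varlist rules concrete, here is a rough base-R sketch of how such a list could be expanded against the variable names in a dictionary. It is not the package's implementation: expand_varlist() and the toy names are hypothetical, ranges are assumed to follow dictionary order, only the keyword all is handled, and wildcards go through utils::glob2rx(), whose * also matches zero characters.

```r
# Hypothetical helper: expand a varlist string against known variable names.
expand_varlist <- function(varlist, dict_names) {
  items <- trimws(strsplit(varlist, ",", fixed = TRUE)[[1]])
  picked <- character(0)
  for (item in items) {
    if (item == "all") {
      # keyword: every variable in the dictionary
      hits <- dict_names
    } else if (grepl("-", item, fixed = TRUE)) {
      # range: assumed to mean first-to-last in dictionary order
      ends <- match(trimws(strsplit(item, "-", fixed = TRUE)[[1]]), dict_names)
      stopifnot(length(ends) == 2, !anyNA(ends))
      hits <- dict_names[ends[1]:ends[2]]
    } else {
      # plain name or wildcard pattern, translated to an anchored regex
      hits <- grep(utils::glob2rx(item), dict_names, value = TRUE)
    }
    picked <- c(picked, hits)
  }
  unique(picked)
}

# Toy dictionary names, for illustration only
dict_names <- c("v010", "v110", "m012", "m022", "v701", "v702", "v703")
expanded <- expand_varlist("v010, m??2, v701-v703", dict_names)
print(expanded)
```

In this toy dictionary the pattern m??2 picks up m012 and m022, and the range v701-v703 picks up the three consecutive names.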

The dataset is required and must be the name of a dataset, for example cosr02. The dataset consists of a dictionary file with extension .dct and an ASCII data file with extension .dat.

The source is the name of a local folder where the two files mentioned above may be found. If left blank the function will download the dictionary and data files directly from the DHS data archive.
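Before pointing source at a local folder, it can help to check that both files are there. The sketch below assumes the files are named after the dataset (cosr02.dct and cosr02.dat), which this page implies but does not state. has_wfs_files() is a hypothetical helper, and the temporary folder holds empty placeholder files only.

```r
# Hypothetical helper: are the dictionary and data files both present?
has_wfs_files <- function(dataset, source) {
  paths <- file.path(source, paste0(dataset, c(".dct", ".dat")))
  all(file.exists(paths))
}

# Demonstrate with a temporary folder and empty placeholder files
folder <- tempfile("wfs_demo")
dir.create(folder)
has_before <- has_wfs_files("cosr02", folder)  # nothing in the folder yet

file.create(file.path(folder, c("cosr02.dct", "cosr02.dat")))
has_after <- has_wfs_files("cosr02", folder)   # both placeholders now exist

print(c(before = has_before, after = has_after))
```

With a folder that holds the real files, the corresponding call would be wfs_extract("v011, v111", "cosr02", source = folder).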

By default convert.factors is TRUE, and we convert a variable to a factor if the dictionary specifies value labels and all values in the data other than NA have a corresponding value label. The conversion may be turned off by setting the flag to FALSE.
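The conversion rule can be illustrated with a small base-R sketch. It mirrors the rule as described here, not the package's actual code. maybe_factor() and the toy labels are hypothetical, and labels are assumed to be a named vector mapping label text to codes.

```r
# Hypothetical helper: convert to a factor only if every non-missing
# value has a matching value label; otherwise return x unchanged.
maybe_factor <- function(x, labels) {
  observed <- unique(x[!is.na(x)])
  if (length(labels) > 0 && all(observed %in% labels)) {
    factor(x, levels = unname(labels), labels = names(labels))
  } else {
    x
  }
}

labs <- c(Urban = 1, Rural = 2)      # toy value labels
residence <- c(1, 2, 2, NA, 1)

converted <- maybe_factor(residence, labs)        # all codes labelled
kept <- maybe_factor(c(residence, 3), labs)       # code 3 has no label

print(converted)
print(kept)
```

The first vector becomes a factor with levels Urban and Rural, while the second stays numeric because the value 3 has no label.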

The function returns a data frame.

Each variable with value labels has a "labels" attribute holding those labels, unless it was converted to a factor. This information can be used to convert the variable to a factor at a later time using the function labelled::to_factor().
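To see which variables still carry value labels, you can look for the "labels" attribute on each column. The base-R sketch below uses a toy data frame standing in for the result of wfs_extract(). The column contents and labels are invented for illustration, and the labels are assumed to be a named vector of codes.

```r
# Toy stand-in for an extracted data frame (values are made up)
toy <- data.frame(v701 = c(1, 2, 9), v702 = c(10, 20, 30))

# Pretend v701 kept its labels because code 9 has no label
attr(toy$v701, "labels") <- c(Yes = 1, No = 2)

# Flag the columns that carry a "labels" attribute
has_labels <- vapply(toy, function(col) !is.null(attr(col, "labels")), logical(1))
labelled_vars <- names(toy)[has_labels]

print(labelled_vars)
print(attr(toy$v701, "labels"))
```

The flagged columns are the candidates for a later conversion to factors.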

Dictionaries may specify missing values and special codes. We recode all missing values to NA. For variables that are not converted to factors, we add the special code, if any, as a "special" attribute of the variable. Any numeric values greater than or equal to the special code require special treatment in analysis.
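One possible treatment, sketched below in base R, is to flag values at or above the special code and set them aside before computing summaries. The variable and its special code of 88 are invented for illustration. Whether special codes should be dropped, recoded, or analysed separately depends on the variable.

```r
# Toy variable with a made-up special code stored as an attribute
age <- c(23, 31, 88, 45, NA, 99)
attr(age, "special") <- 88

special <- attr(age, "special")
values <- as.numeric(age)            # plain copy without attributes

# Values at or above the special code need separate handling
is_special <- !is.na(values) & values >= special

# Here we simply set them to NA before summarising
values[is_special] <- NA
clean_mean <- mean(values, na.rm = TRUE)

print(sum(is_special))
print(clean_mean)
```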

Example

wfs_extract("v011, v111", "cosr02")


grodri/wfs documentation built on July 16, 2020, 11:11 p.m.