extract_feature_ind: Extract a binary indicator for some feature

Description

Some columns scraped from product detail pages contain a delimited list of several features. This function extracts a binary indicator for one such feature by searching the given column for a regular expression.

Usage

extract_feature_ind(dat_gh, col, regex)

Arguments

dat_gh

A tibble (data.frame), usually obtained via get_geizhals_data.

col

A character vector of length one, specifying the name of the column in dat_gh that should be parsed for the feature.

regex

A character vector of length one with a regular expression. The column col is scanned for that regular expression.

Value

A vector of length nrow(dat_gh), containing 1 if a feature is present in a given product (i.e., if the regular expression is found), 0 if the feature is not found, and NA if that column is missing (i.e., that category was not present in the detailed product description page).
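
The 1/0/NA coding described above can be illustrated with a minimal base R sketch. This is not the package's implementation: feature_ind_sketch and dat_demo are hypothetical names, and the demo data are made up to stand in for a tibble returned by get_geizhals_data.

```r
# Hypothetical demo data; the values are invented for illustration only.
dat_demo <- data.frame(
  Ausstattung = c(
    "wartungsfreier Kondensator, Startzeitvorwahl",
    "Trommelbeleuchtung",
    NA
  ),
  stringsAsFactors = FALSE
)

# Sketch of an indicator helper (an assumption, not the package's code):
# 1 if the regex matches, 0 if it does not, NA if the value is missing.
feature_ind_sketch <- function(dat, col, regex) {
  x <- dat[[col]]
  if (is.null(x)) {
    # The column does not exist in the data at all.
    return(rep(NA_integer_, nrow(dat)))
  }
  ind <- as.integer(grepl(regex, x))
  # grepl() returns FALSE for NA input, so restore the missing values.
  ind[is.na(x)] <- NA_integer_
  ind
}

feature_ind_sketch(dat_demo, "Ausstattung", "wartungsfreier Kondensator")
# 1 0 NA
```

Resetting the missing entries explicitly matters here because grepl alone would report them as 0, which would wrongly read as "feature absent" rather than "not known".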

Examples

## Not run: 
url_geizhals <- "https://geizhals.at/?cat=hwaeschtr"
dat_gh <- get_geizhals_data(url_geizhals, max_pages = 1)
extract_feature_ind(dat_gh, col = "Ausstattung", regex = "wartungsfreier Kondensator")
extract_feature_ind(dat_gh, col = "Ausstattung", regex = "Anschlussmöglichkeit")
extract_feature_ind(dat_gh, col = "Ausstattung",
  regex = "Anschlussmöglichkeit.*Kondenswasserablauf")

## End(Not run)

ingonader/rgeizhals documentation built on May 29, 2019, 3:05 a.m.