clean_html: Clean HTML and whitespace from a string

Description

This function uses three regular expressions in sequence to strip HTML from a block of text. The first, "(&[a-z]*;|<.*?>)", matches either an HTML entity (a substring that starts with & and ends with ;, with zero or more lowercase letters in between) or a tag (a substring enclosed in < and >, matched non-greedily so that each tag is matched separately). Each match is replaced with a space character. The second, "\s+", matches any run of one or more whitespace characters and replaces it with a single space. The third, "^\s+|\s+$", matches whitespace at the beginning or end of the text and removes it.
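
The three steps described above can be sketched in R as follows. This is a hypothetical re-implementation for illustration only, not the package's source: the name clean_html_sketch and the use of perl = TRUE are assumptions, and the real clean_html may be written differently.

```r
# Hypothetical sketch of the three-step cleaning described above.
# Not the QualtricsTools source; the function name and perl = TRUE are assumptions.
clean_html_sketch <- function(text) {
  # Step 1: replace HTML entities (e.g. "&nbsp;") and tags (e.g. "<b>") with a space.
  text <- gsub("(&[a-z]*;|<.*?>)", " ", text, perl = TRUE)
  # Step 2: collapse each run of whitespace into a single space.
  text <- gsub("\\s+", " ", text, perl = TRUE)
  # Step 3: remove whitespace at the beginning and end of the text.
  text <- gsub("^\\s+|\\s+$", "", text, perl = TRUE)
  text
}

clean_html_sketch("  <p>Hello&nbsp;<b>world</b></p>\n")
# returns "Hello world"
```

Because the entity pattern only allows lowercase letters between & and ;, numeric entities such as &#39; would pass through a sketch like this unchanged.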

Usage

clean_html(text)

Arguments

text

Any text string that might contain HTML or whitespace that needs to be stripped.

Value

The text with all HTML and extraneous whitespace removed.


ctesta01/QualtricsTools documentation built on May 14, 2019, 12:27 p.m.