calculate_str_boundary | R Documentation |
calculate_str_boundary
Uses a pair of boundary patterns and a target pattern located between them to identify a chunk of interest within a string.
calculate_str_boundary(
  string,
  boundaries,
  target,
  match_index = 1,
  return_as_index = TRUE
)
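string | A character object of length 1. |
boundaries | A character object of length 2 (opening and closing boundary, concatenated with c()). |
target | A character object for the REGEX match within the boundaries. |
match_index | Integer, determines which match to use if more than one is found (default: 1). |
return_as_index | Logical value; if set to TRUE (default), the start and end positions of the match are returned, otherwise the text content of the matched range. |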
Although regular expressions can be used directly to achieve similar results (lookaheads, etc.), this function provides a simple way to find a pattern within a particular boundary. This can be useful when editing HTML files, where one wants to excise or adjust text between tags (e.g. <script></script>). The logic is as follows: (a) identify all points in the string where the boundaries and target are found, (b) calculate the distances between the target and every boundary match, (c) determine which boundaries are closest to the start and end of the target match, (d) return the entire range spanned by those boundaries and the target, either as a vector of start/end locations or as the text content of the match.
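Steps (a) to (d) can be sketched in base R roughly as follows. This is an illustrative approximation, not the package's actual implementation: the helper name find_bounded_range is hypothetical, and it assumes the boundaries are literal strings (matched with fixed = TRUE) while the target is a regular expression.

```r
# Hypothetical helper illustrating steps (a)-(d); not the package's own code.
find_bounded_range <- function(string, boundaries, target, match_index = 1) {
  # (a) Locate every occurrence of the opening boundary, closing boundary, and target.
  # Assumption: boundaries are literal text, the target is a regular expression.
  open_m   <- gregexpr(boundaries[1], string, fixed = TRUE)[[1]]
  close_m  <- gregexpr(boundaries[2], string, fixed = TRUE)[[1]]
  target_m <- gregexpr(target, string, perl = TRUE)[[1]]
  if (open_m[1] == -1 || close_m[1] == -1 || target_m[1] == -1) return(NULL)
  if (match_index > length(target_m)) return(NULL)

  # Start and end position of the requested target match
  t_start <- target_m[match_index]
  t_end   <- t_start + attr(target_m, "match.length")[match_index] - 1

  # Start and end positions of every boundary match
  open_starts  <- as.integer(open_m)
  open_ends    <- open_starts + attr(open_m, "match.length") - 1
  close_starts <- as.integer(close_m)
  close_ends   <- close_starts + attr(close_m, "match.length") - 1

  # (b) + (c) Keep opening boundaries that end before the target and closing
  # boundaries that begin after it; the closest ones are the max and min.
  open_ok  <- open_starts[open_ends < t_start]
  close_ok <- close_ends[close_starts > t_end]
  if (length(open_ok) == 0 || length(close_ok) == 0) return(NULL)

  # (d) Return the full range, from the first character of the opening
  # boundary to the last character of the closing boundary.
  c(start = max(open_ok), end = min(close_ok))
}

html <- "<head><script>RANDOMTEXT</script><script>TARGET.TEXT, OTHER RANDOMTEXT</script><script>RANDOMTEXT</script></head>"
rng <- find_bounded_range(html, c("<script>", "</script>"), "TARGET\\.TEXT")
matched_text <- substr(html, rng[1], rng[2])
```

Here matched_text holds the whole script element that contains the target, including both tags.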
To vectorize over several strings and patterns, it is recommended to use a for loop, the apply family, or purrr functions (e.g. pmap).
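One possible way to vectorize with the apply family is sketched below using mapply. So that the example runs without the package installed, it uses a simplified stand-in named boundary_stub (hypothetical, and it assumes every target and boundary is found); in practice calculate_str_boundary would take its place with the same three arguments.

```r
# Hypothetical, simplified stand-in for calculate_str_boundary().
# Assumes literal boundaries and that target and boundaries are always present.
boundary_stub <- function(string, boundaries, target) {
  t_pos  <- as.integer(regexpr(target, string, perl = TRUE))
  opens  <- as.integer(gregexpr(boundaries[1], string, fixed = TRUE)[[1]])
  closes <- as.integer(gregexpr(boundaries[2], string, fixed = TRUE)[[1]])
  start  <- max(opens[opens < t_pos])
  end    <- min(closes[closes > t_pos]) + nchar(boundaries[2]) - 1
  c(start, end)
}

# One string, one boundary pair, and one target per case
strings <- c("<p>keep</p><p>DROP me</p>", "<b>x</b><b>REMOVE</b>")
bounds  <- list(c("<p>", "</p>"), c("<b>", "</b>"))
targets <- c("DROP", "REMOVE")

# Iterate over the three inputs in parallel; keep the result as a list of ranges
ranges <- mapply(boundary_stub, strings, bounds, targets,
                 SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Remove each matched range from its source string
cleaned <- mapply(
  function(s, r) paste0(substr(s, 1, r[1] - 1), substr(s, r[2] + 1, nchar(s))),
  strings, ranges, USE.NAMES = FALSE
)
```

The same pattern carries over to purrr::pmap by passing the three inputs as a list.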
Either a vector of start and end points for the match, or a character value of the entire matched range in the provided string.
## Not run:
# Load libraries
library(dplyr); library(stringr); library(magrittr)
# Create fake text
test_data <- '<head><script>RANDOMTEXT</script><script>TARGET.TEXT, OTHER RANDOMTEXT</script><script>RANDOMTEXT</script></head>'
# Determine match
target_chunk <- calculate_str_boundary(string = test_data,
                                       boundaries = c('<script>', '</script>'),
                                       target = 'TARGET\\.TEXT')
# Delete the matched range (boundaries included) from the initial text
stringr::str_sub(test_data, target_chunk[1], target_chunk[2]) <- ''
## End(Not run)