match_in_substr: Match strings that have a needle near the start, end, or middle

View source: R/string_tools.R

match_in_substr    R Documentation

Match strings that have a needle near the start, end, or middle

Description

Match strings that have a needle near the start, end, or middle

Usage

match_in_substr(str, query, buffer = 0.25, from = "s", values = FALSE)

Arguments

str

(Character) The vector to be searched.

query

(Character) The regular expression to look for. If length(query) > 1, the elements are collapsed into a single regular expression of the form "(item1|item2|item3...)".

buffer

(Numeric) The length of the substring to search. If given as a whole number, it will be that many characters long. If given as a decimal number, it will be used as a proportion of the length of each element in str, e.g. buffer = 0.20 is 20% of each element's length.

from

(Character) If "start" or "s" (default), the search will be done from the start of each string. If "end" or "e", it will be from the end. If "middle" or "m", the middle of the string will be searched.

values

(Logical) If FALSE (default), returns a Logical vector of whether a match was found in each element of str. If TRUE, returns a Character vector containing only the elements of str that matched.
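
The handling of query and buffer described above can be sketched in base R. This is an illustration only, not the package's source: the helper names collapse_query() and buffer_to_length() are hypothetical, and two details are assumptions, namely that any buffer below 1 counts as a proportion and that proportional lengths are rounded up with ceiling().

```r
# Hypothetical helpers for illustration; they are not part of desiderata.

# Collapse several patterns into one alternation: "(item1|item2|item3)".
collapse_query <- function(query) {
    if (length(query) > 1) {
        paste0("(", paste(query, collapse = "|"), ")")
    } else {
        query
    }
}

# Resolve `buffer` to a number of characters for each element of `str`.
# Assumption: buffer < 1 is a proportion of each element's length, rounded up;
# anything else is taken as a fixed number of characters.
buffer_to_length <- function(str, buffer) {
    if (buffer < 1) {
        ceiling(nchar(str) * buffer)
    } else {
        rep(buffer, length(str))
    }
}

collapse_query(c("needle", "pin"))
buffer_to_length(c("abcdefgh", "abcdefghij"), 0.25)
buffer_to_length(c("abcdefgh", "abcdefghij"), 3)
```

With a proportional buffer, each element gets its own substring length, so longer strings are searched over a longer window than shorter ones.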

Value

A Logical vector if values == FALSE (default), or a Character vector if values == TRUE.
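
The two return types parallel grepl() and logical subsetting in base R. The sketch below shows one way the search window and the return value could fit together. It is not the desiderata implementation: search_window() is a hypothetical name, n is a whole-number window length only, and the centred window used for "middle" is an assumption about how the middle is defined.

```r
# Hypothetical re-implementation for illustration; not the desiderata source.
# `n` is the window length in characters (whole-number buffer only).
search_window <- function(str, query, n, from = "s", values = FALSE) {
    len <- nchar(str)

    # First character of the window, chosen by the first letter of `from`.
    first <- switch(substr(from, 1, 1),
        s = rep(1, length(str)),
        e = pmax(len - n + 1, 1),
        m = pmax(floor((len - n) / 2) + 1, 1),  # assumption: centred window
        stop("`from` must be 'start', 'end', or 'middle'.")
    )

    window <- substr(str, first, first + n - 1)
    found  <- grepl(query, window)

    # Logical vector by default; the matching elements if values = TRUE.
    if (values) str[found] else found
}

x <- c("needle in front", "ends with needle", "a needle b")

search_window(x, "needle", 6, "s")
search_window(x, "needle", 6, "e")
search_window(x, "needle", 6, "m", values = TRUE)
```

Because the final step only switches between found and str[found], the Character result always has one element per TRUE in the Logical result.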

Examples

sentences <- c(
    "The word 'needle' appears at the start of this sentence.",
    "But in this sentence, 'needle' doesn't.",
    "If 'needle' appears several times in a sentence, then we have a lot of needles!",
    "And in here, the word we want to find (needle) is near the middle of the sentence."
    )

# Within the first 20 characters of the string
match_in_substr(sentences, "needle", 20, "s")

#> [1] TRUE FALSE TRUE FALSE

# In the last 25% of the string
match_in_substr(sentences, "needle", 0.25, "e")

#> [1] FALSE FALSE TRUE FALSE

# In the middle third of the string
match_in_substr(sentences, "needle", 1/3, "m")

#> [1] FALSE FALSE FALSE TRUE

# Return the matching strings themselves instead of a logical vector
match_in_substr(sentences, "needle", 1/3, "m", values = TRUE)
#> [1] "And in here, the word we want to find (needle) is near the middle of the sentence."


DesiQuintans/desiderata documentation built on April 9, 2023, 5:43 a.m.