That solution worked in this case, but it was not very elegant and might not work in all cases (what if there were a 'great aunt' in the list?).
Here is a more specific case for this data set.
How many patients have a father with a history of disease, not counting grandfathers?
We can use regular expressions, also known as regex, to solve this.
Think of regex as a separate language, with its own code, syntax, and rules.
Regex rules allow complex matching patterns for strings, so that you match exactly the content you want.
Regex is far too complex to cover in its entirety here, but here is one specific example.
GOAL: identify all of the patients that have a father with a history of disease, excluding grandfathers from the results.
We start with the literal pattern father.
But then we want to make sure that we capture both Father and father. To accept either case of f in the first position we add (F|f), so now our regex looks like (F|f)ather.
Lastly, we want this pattern to appear at the very beginning of the string, so that grandfather is not matched, so we add the regex ^ anchor.
Our completed regex looks like:
str_count(heart_joined$family_history, "^(F|f)ather")
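To see how each step of the pattern behaves, here is a minimal sketch on a small made-up vector. The vector family_history and its values are assumptions for illustration only, standing in for heart_joined$family_history; the real column may contain different values.

```r
library(stringr)

# Hypothetical stand-in for heart_joined$family_history (values are made up)
family_history <- c("father", "Father", "grandfather", "mother", NA, "Grandfather")

# Step 1: the literal pattern also matches inside "grandfather"
step1 <- str_detect(family_history, "father")

# Step 2: accept an upper- or lower-case f (still matches "grandfather")
step2 <- str_detect(family_history, "(F|f)ather")

# Step 3: anchor the pattern to the start of the string
is_father <- str_detect(family_history, "^(F|f)ather")

# Total number of entries that start with father/Father; NA entries are skipped
n_fathers <- sum(is_father, na.rm = TRUE)
n_fathers
```

In this toy vector n_fathers is 2. Note that str_count() returns one count per element rather than a single total, so wrapping it in sum(..., na.rm = TRUE) is one way to get an overall number; because ^ can match at most once per string, each per-element count here is 0 or 1.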
Go to code/
Open 09_stringr.Rmd
Complete the exercise to count mothers.