query-matches-and-captures: Query matches and captures

query-matches-and-captures    R Documentation

Query matches and captures

Description

These two functions execute a query on a given node, and return the captures of the query for further use. Both functions return the same information, just structured differently depending on your use case.

  • query_matches() returns the captures first grouped by pattern, and further grouped by match within each pattern. This is useful if you include multiple patterns in your query.

  • query_captures() returns a flat list of captures ordered by their node location in the original text. This is normally the easiest structure to use if you have a single pattern without any alternations that would benefit from having individual captures split by match.

Both also return the capture name, i.e. the ⁠@name⁠ you specified in your query.
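To make the difference between the two return structures concrete, here is a minimal sketch with two patterns in one query. It assumes the treesitter.r grammar names call nodes call, and that each match returned by query_matches() carries name and node elements like the flat result does. Treat the indexing shown as illustrative rather than definitive.

```r
library(treesitter)

language <- treesitter.r::language()

# Two patterns in a single query: identifiers and calls.
# Assumption: the R grammar exposes a `call` node type.
source <- "
(identifier) @id
(call) @call
"

text <- "foo + b\nand(a)"

query <- query(language, source)
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Flat structure: captures ordered by location in the text
captures <- query_captures(query, node)
captures$name
vapply(captures$node, node_text, character(1))

# Nested structure: pattern, then match, then that match's captures
matches <- query_matches(query, node)
first_pattern <- matches[[1]]
first_match <- first_pattern[[1]]
first_match$name
```

The flat form is convenient for looping over every captured node, while the nested form keeps captures that belong to the same match together.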

Usage

query_matches(x, node, ..., range = NULL)

query_captures(x, node, ..., range = NULL)

Arguments

x

⁠[tree_sitter_query]⁠

A query.

node

⁠[tree_sitter_node]⁠

A node to run the query over.

...

These dots are for future extensions and must be empty.

range

⁠[tree_sitter_range / NULL]⁠

An optional range to restrict the query to.
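As an illustration of range, the sketch below restricts a query to the span of a single child node. It assumes that node_child() and node_range() from this package behave as their names suggest, and that the second child of the root node is the and(a) call. Both are assumptions of this example, not statements from the text above.

```r
library(treesitter)

language <- treesitter.r::language()
query <- query(language, "(identifier) @id")
parser <- parser(language)

text <- "foo + b\nand(a)"
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Assumption: the second top-level expression is the `and(a)` call
second <- node_child(node, 2)

# Captures over the whole tree
everything <- query_captures(query, node)

# Captures restricted to the range covered by `second`
restricted <- query_captures(query, node, range = node_range(second))

vapply(everything$node, node_text, character(1))
vapply(restricted$node, node_text, character(1))
```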

Predicates

There are 3 core types of predicates supported:

  • ⁠#eq? @capture "string"⁠

  • ⁠#eq? @capture1 @capture2⁠

  • ⁠#match? @capture "regex"⁠

Each of these predicates can also be inverted with a ⁠not-⁠ prefix, i.e. ⁠#not-eq?⁠ and ⁠#not-match?⁠.

String double quotes

The underlying tree-sitter predicate parser requires that strings supplied in a query use double quotes, i.e. "string" not 'string'. If you try to use single quotes, you will get a query error.
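A minimal sketch of a predicate in use: wrapping the query in a single-quoted R string lets the predicate's double quotes be written without escaping. The text and capture name are made up for illustration, and node_text() is used here only to inspect the result.

```r
library(treesitter)

language <- treesitter.r::language()

# Single quotes delimit the R string; double quotes delimit the
# string inside the predicate, as the predicate parser requires.
source <- '((identifier) @id (#eq? @id "a"))'

text <- "foo + b + a + ab\nand(a)"

query <- query(language, source)
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# Only identifiers whose text is exactly "a" should be captured
captures <- query_captures(query, node)
texts <- vapply(captures$node, node_text, character(1))
texts
```

Swapping in #not-eq? should keep every identifier except those equal to "a".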

⁠#match?⁠ regex

The regex support provided by ⁠#match?⁠ is powered by grepl().

Escapes are a little tricky to get right within these match regex strings. To use something like ⁠\s⁠ in the regex, the literal text ⁠\\s⁠ must appear in the query string, so that tree-sitter treats the first backslash as an escape and you end up with just ⁠\s⁠ in the regex itself.

This means the R string itself must contain two literal backslash characters, which can be accomplished with either "\\\\s" or a raw string like r'["\\s"]', which is typically a little easier.

You can also write your queries in a separate file (typically called queries.scm) and read them into R. This is more straightforward still, because you can write something like ⁠(#match? @id "^\\s$")⁠ directly and it will be read in correctly.
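The escaping rules can be checked in base R without running a query. This sketch builds the same query text three ways (a regular string, a raw string, and a file) and compares them. The temporary file stands in for queries.scm, and raw strings require R 4.0.0 or later.

```r
# Regular string: four backslashes in the source give two literal ones,
# and the predicate's double quotes must be escaped as well.
regular <- "((identifier) @id (#match? @id \"^\\\\s$\"))"

# Raw string: the query text is written exactly as it should appear.
raw <- r'[((identifier) @id (#match? @id "^\\s$"))]'

# Both spellings produce the same literal text
identical(regular, raw)

# Show the literal text that would be handed to query()
writeLines(raw)

# Round trip through a file: what is written is what is read back
path <- tempfile(fileext = ".scm")
writeLines(raw, path)
from_file <- paste(readLines(path), collapse = "\n")
identical(from_file, raw)
```

Any of the three strings could then be passed as the source argument to query().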

Examples


library(treesitter)

text <- "
foo + b + a + ab
and(a)
"

source <- "(identifier) @id"

language <- treesitter.r::language()

query <- query(language, source)
parser <- parser(language)
tree <- parser_parse(parser, text)
node <- tree_root_node(tree)

# A flat ordered list of captures, that's most useful here since
# we only have 1 pattern!
captures <- query_captures(query, node)
captures$node


treesitter documentation built on June 24, 2024, 5:07 p.m.