cbind_dependencies: Add the dependency parsing information to an annotated...
In udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

cbind_dependencies

R Documentation

Add the dependency parsing information to an annotated dataset

Description

The annotated results of udpipe_annotate contain dependency parsing results, which indicate how each word is linked to another word and the relation between the two words.
This information is available in the fields token_id, head_token_id and dep_rel, which indicate how each token is linked to its parent. The types of relation (dep_rel) are defined at https://universaldependencies.org/u/dep/index.html.
For example, in the text 'The economy is weak but the outlook is bright', the term economy is linked to weak because economy is the nominal subject of weak.

This function adds the parent or child information to the annotated data.frame.
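
To make the link between token_id and head_token_id concrete, the following is a minimal base R sketch of a parent lookup. It does not call cbind_dependencies and is not its implementation. The data.frame `ann` is a hand-written parse of the example sentence (an assumption for illustration, not actual model output), and the ids are stored as character strings, which is also an assumption here. The column name token_parent follows the one used in the Examples section.

```r
# Hand-written parse for illustration only; real values come from udpipe_annotate().
ann <- data.frame(
  token_id      = as.character(1:9),
  token         = c("The", "economy", "is", "weak", "but",
                    "the", "outlook", "is", "bright"),
  head_token_id = c("2", "4", "4", "0", "9", "7", "9", "9", "4"),
  dep_rel       = c("det", "nsubj", "cop", "root", "cc",
                    "det", "nsubj", "cop", "conj"),
  stringsAsFactors = FALSE
)

# Find the row of each token's parent by matching head_token_id to token_id.
# The root has head_token_id "0", which matches no token and yields NA.
parent_row <- match(ann$head_token_id, ann$token_id)
ann$token_parent <- ann$token[parent_row]

# Nominal subjects together with the word they are the subject of.
ann[ann$dep_rel == "nsubj", c("dep_rel", "token", "token_parent")]
```

With real annotations the same lookup has to be done within each combination of doc_id, paragraph_id and sentence_id, since token ids restart in every sentence.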

Usage

cbind_dependencies(
  x,
  type = c("parent", "child", "parent_rowid", "child_rowid"),
  recursive = FALSE
)

Arguments

`x`	a data.frame or data.table as returned by `as.data.frame(udpipe_annotate(...))`
`type`	one of 'parent', 'child', 'parent_rowid', 'child_rowid'. See the Value section for more information on how these differ. Defaults to 'parent', which adds the information of the head_token_id to the dataset
`recursive`	when `type` is set to 'parent_rowid' or 'child_rowid', whether to include the parent of the parent of the parent, ... or the child of the child of the child, ... Defaults to FALSE, which includes only the direct parent or children.

Details

Note that this function is experimental and that its output might change in subsequent releases.

Value

a data.frame/data.table in the same order as x, with extra information added as follows:

If type is set to 'parent': the token/lemma/upos/xpos/feats information of the parent (head dependency) is added to the data.frame. See the examples.
If type is set to 'child': the token/lemma/upos/xpos/feats/dep_rel information of all the children is put into a column called 'children', which is added to the data.frame. This is a list column where each list element is a data.table with these columns: token/lemma/upos/xpos/dep_rel. See the examples.
If type is set to 'parent_rowid': a new list column is added to x containing the row numbers, within each combination of doc_id, paragraph_id, sentence_id, of the parents of the token.
If recursive is set to TRUE, the new column added to the data.frame is called parent_rowids; otherwise it is called parent_rowid. See the examples.
If type is set to 'child_rowid': a new list column is added to x containing the row numbers, within each combination of doc_id, paragraph_id, sentence_id, of the children of the token.
If recursive is set to TRUE, the new column added to the data.frame is called child_rowids; otherwise it is called child_rowid. See the examples.
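
The difference between direct and recursive links can be sketched in base R for a single sentence. This is a conceptual illustration only: the vector `head_row` and the lists `children` and `ancestors` are hypothetical names, the parse is hand-written, and the exact contents and ordering of the list columns produced by cbind_dependencies may differ.

```r
# Row number of each token's head within one sentence (hand-written parse of
# 'The economy is weak but the outlook is bright'); NA marks the root.
head_row <- c(2L, 4L, 4L, NA, 9L, 7L, 9L, 9L, 4L)

# Direct children of row i: all rows whose head is row i.
children <- lapply(seq_along(head_row), function(i) which(head_row == i))

# Recursive parents of row i: follow the head links until the root is reached.
ancestors <- lapply(seq_along(head_row), function(i) {
  out <- integer(0)
  p <- head_row[i]
  while (!is.na(p)) {
    out <- c(out, p)
    p <- head_row[p]
  }
  out
})

children[[4]]   # rows 2, 3 and 9 depend directly on 'weak'
ancestors[[6]]  # 'the' -> 'outlook' (7) -> 'bright' (9) -> 'weak' (4)
```

The direct variant stops after one step along the head links, while the recursive variant keeps following them, which is the distinction the recursive argument controls.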

Examples

## Not run: 
library(udpipe)

## Download and load a model, then annotate the example sentence
udmodel <- udpipe_download_model(language = "english-ewt")
udmodel <- udpipe_load_model(file = udmodel$file_model)
x <- udpipe_annotate(udmodel, 
                     x = "The economy is weak but the outlook is bright")
x <- as.data.frame(x)

## The raw dependency fields
x[, c("token_id", "token", "head_token_id", "dep_rel")]

## Add the parent information and look at the nominal subjects
x <- cbind_dependencies(x, type = "parent")
nominalsubject <- subset(x, dep_rel %in% c("nsubj"))
nominalsubject <- nominalsubject[, c("dep_rel", "token", "token_parent")]
nominalsubject

## Add the children and the row ids of parents/children, direct and recursive
x <- cbind_dependencies(x, type = "child")
x <- cbind_dependencies(x, type = "parent_rowid")
x <- cbind_dependencies(x, type = "parent_rowid", recursive = TRUE)
x <- cbind_dependencies(x, type = "child_rowid")
x <- cbind_dependencies(x, type = "child_rowid", recursive = TRUE)
x

## For each token, show the rows of its direct children
lapply(x$child_rowid, FUN = function(i) x[sort(i), ])

## End(Not run)

udpipe documentation built on Jan. 6, 2023, 5:06 p.m.
