brussels_reviews_anno: Reviews of the AirBnB customers which are tokenised, POS...
In udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

brussels_reviews_anno

R Documentation

Reviews of the AirBnB customers which are tokenised, POS tagged and lemmatised

Description

Reviews of the AirBnB customerswhich are tokenised, POS tagged and lemmatised. The data contains 1 row per document/token and contains the fields doc_id, language, sentence_id, token_id, token, lemma, xpos.
Data has been converted from UTF-8 to ASCII as in iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT") in order to be able to comply to CRAN policies.

Source

http://insideairbnb.com/brussels: information of 2015-10-03

Examples

## brussels_reviews_anno
data(brussels_reviews_anno)
head(brussels_reviews_anno)
sort(table(brussels_reviews_anno$xpos))

## Not run: 

##
## If you want to construct a similar dataset as the 
## brussels_reviews_anno dataset based on the udpipe library, do as follows
##

library(udpipe)
library(data.table)
data(brussels_reviews)

## The brussels_reviews contains comments on Airbnb sites in 3 languages: es, fr and nl
table(brussels_reviews$language)
bxl_anno <- split(brussels_reviews, brussels_reviews$language)

## Annotate the Spanish comments
m <- udpipe_download_model(language = "spanish-ancora")
m <- udpipe_load_model(file = m$file_model)
bxl_anno$es <- udpipe_annotate(object = m, x = bxl_anno$es$feedback, doc_id = bxl_anno$es$id)

## Annotate the French comments
m <- udpipe_download_model(language = "french-partut")
m <- udpipe_load_model(file = m$file_model)
bxl_anno$fr <- udpipe_annotate(object = m, x = bxl_anno$fr$feedback, doc_id = bxl_anno$fr$id)

## Annotate the Dutch comments
m <- udpipe_download_model(language = "dutch-lassysmall")
m <- udpipe_load_model(file = m$file_model)
bxl_anno$nl <- udpipe_annotate(object = m, x = bxl_anno$nl$feedback, doc_id = bxl_anno$nl$id)

brussels_reviews_anno <- lapply(bxl_anno, as.data.frame)
brussels_reviews_anno <- rbindlist(brussels_reviews_anno)
str(brussels_reviews_anno)

## End(Not run)

udpipe documentation built on Jan. 6, 2023, 5:06 p.m.

udpipe index

README.md UDPipe Natural Language Processing - Basic Analytical Use Cases UDPipe Natural Language Processing - Model Building UDPipe Natural Language Processing - Parallel UDPipe Natural Language Processing - Text Annotation UDPipe Natural Language Processing - Topic Modelling Use Cases UDPipe Natural Language Processing - Try it out UDPipe Natural Language Processing - Universe

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

udpipe
Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

brussels_reviews_anno: Reviews of the AirBnB customers which are tokenised, POS...
In udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

Reviews of the AirBnB customers which are tokenised, POS tagged and lemmatised

Description

Source

See Also

Examples

Related to brussels_reviews_anno in udpipe...

R Package Documentation

Browse R Packages

We want your feedback!

udpipe Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

brussels_reviews_anno: Reviews of the AirBnB customers which are tokenised, POS... In udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

Reviews of the AirBnB customers which are tokenised, POS tagged and lemmatised

Description

Source

See Also

Examples

Related to brussels_reviews_anno in udpipe...

R Package Documentation

Browse R Packages

We want your feedback!

udpipe
Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit

brussels_reviews_anno: Reviews of the AirBnB customers which are tokenised, POS...
In udpipe: Tokenization, Parts of Speech Tagging, Lemmatization and Dependency Parsing with the 'UDPipe' 'NLP' Toolkit