cl_rework | R Documentation |
Wrappers for CWB Corpus Library functions suited for writing performance code.
s_attr(corpus, s_attribute, registry)
p_attr(corpus, p_attribute, registry)
p_attr_size(p_attr)
s_attr_size(s_attr)
p_attr_lexicon_size(p_attr)
cpos_to_struc(s_attr, cpos)
cpos_to_str(p_attr, cpos)
cpos_to_id(p_attr, cpos)
struc_to_cpos(s_attr, struc)
struc_to_str(s_attr, struc)
regex_to_id(p_attr, regex)
str_to_id(p_attr, str)
id_to_freq(p_attr, id)
id_to_cpos(p_attr, id)
cpos_to_lbound(s_attr, cpos)
cpos_to_rbound(s_attr, cpos)
corpus | ID of a CWB corpus (length-one character vector). |
s_attribute | A structural attribute (length-one character vector). |
registry | Registry directory. |
p_attribute | A positional attribute (length-one character vector). |
p_attr | A |
s_attr | A |
cpos | An |
struc | A length-one |
regex | A regular expression. |
str | A |
id | An |
The default cl_* R wrappers for the functions of the CWB Corpus Library
look up a corpus and its p- or s-attributes (using the corpus ID, registry
and attribute given as length-one character vectors) every time one of these
functions is called. It is more efficient to look up an attribute only once.
This set of functions passes objects of class "externalptr" to reference
attributes that have already been looked up. A relevant scenario is writing
functions with a C++ implementation that are compiled and linked using
Rcpp::cppFunction() or Rcpp::sourceCpp().
library(Rcpp)

cppFunction(
  'Rcpp::StringVector get_str(
     SEXP corpus,
     SEXP p_attribute,
     SEXP registry,
     Rcpp::IntegerVector cpos
   ){
     SEXP attr;
     Rcpp::StringVector result;
     // Look up the p-attribute once; attr is an external pointer to it.
     attr = RcppCWB::p_attr(corpus, p_attribute, registry);
     // Reuse the pointer to decode the corpus positions.
     result = RcppCWB::cpos_to_str(attr, cpos);
     return(result);
   }',
  depends = "RcppCWB"
)

# Decode the first 51 corpus positions of the REUTERS corpus.
result <- get_str("REUTERS", "word", RcppCWB::get_tmp_registry(), 0:50)
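
For comparison, the same tokens can be retrieved with one of the default cl_* wrappers, which identify the corpus and attribute by character vectors and therefore look up the attribute on each call. This is a minimal sketch, assuming that cl_cpos2str() accepts the arguments corpus, p_attribute, cpos and registry, and that the REUTERS sample corpus is available in the temporary registry, as in the example above.

```r
library(RcppCWB)

# Default wrapper: corpus and attribute are given as character vectors,
# so the attribute is looked up again every time the function is called.
registry <- get_tmp_registry()
tokens <- cl_cpos2str(
  corpus = "REUTERS",
  p_attribute = "word",
  cpos = 0:50,
  registry = registry
)
```

A single call like this is unlikely to show a noticeable difference. The pointer-based functions are more likely to pay off when many calls against the same attribute are made from compiled code.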