This repository was published for review purposes and is now only useful for replicating the published results. Please see http://github.com/vanatteveldt/rsyntax for an updated version of the module.
You can install the package directly from GitHub:
library(devtools)
install_github("anon-author/clauses")
The functions in this module assume that you have a list of tokens in a data frame. A simple example is provided with the module:
library(rsyntax)
data(example_tokens)
tokens
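| word | parent | sentence | coref | pos | entity | lemma | relation | offset | aid | id | pos1 | attack |
|------|--------|----------|-------|-----|--------|-------|----------|--------|-----|----|------|--------|
| John | 2 | 1 | 1 | NNP | PERSON | John | nsubj | 0 | 156884180 | 1 | M | FALSE |
| says | NA | 1 | NA | VBZ | | say | | 5 | 156884180 | 2 | V | FALSE |
| that | 5 | 1 | NA | IN | | that | mark | 10 | 156884180 | 3 | P | FALSE |
| Mary | 5 | 1 | NA | NNP | PERSON | Mary | nsubj | 15 | 156884180 | 4 | M | FALSE |
| hit | 2 | 1 | NA | VBD | | hit | ccomp | 20 | 156884180 | 5 | V | FALSE |
| him | 5 | 1 | 1 | PRP | | he | dobj | 24 | 156884180 | 6 | O | FALSE |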
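If you want to experiment without loading the bundled data, a token list of the same shape can be built by hand. The following is a minimal base R sketch that copies the example table; it assumes that the blank entity and relation cells are empty strings, which the printed table does not confirm, and it makes no claim about which columns the module's functions actually require.

```r
# Hand-built copy of the example token list shown above (base R only).
# Assumption: blank cells in the printed table are empty strings.
tokens <- data.frame(
  word     = c("John", "says", "that", "Mary", "hit", "him"),
  parent   = c(2, NA, 5, 5, 2, 5),
  sentence = 1,
  coref    = c(1, NA, NA, NA, NA, 1),
  pos      = c("NNP", "VBZ", "IN", "NNP", "VBD", "PRP"),
  entity   = c("PERSON", "", "", "PERSON", "", ""),
  lemma    = c("John", "say", "that", "Mary", "hit", "he"),
  relation = c("nsubj", "", "mark", "nsubj", "ccomp", "dobj"),
  offset   = c(0, 5, 10, 15, 20, 24),
  aid      = 156884180,
  id       = 1:6,
  pos1     = c("M", "V", "P", "M", "V", "O"),
  attack   = FALSE,
  stringsAsFactors = FALSE
)

# Each non-root token points at the id of its parent token.
tokens[, c("id", "word", "parent", "relation")]
```

The parent column is what encodes the dependency tree: "says" has no parent (NA) and is therefore the root of the sentence.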
Get the text of a sentence, optionally specifying which column(s) to use:
get_text(tokens)
## [1] "John says that Mary hit him"
get_text(tokens, word.column = c("lemma", "pos"))
## [1] "John/NNP say/VBZ that/IN Mary/NNP hit/VBD he/PRP"
Plot the syntactic structure of a sentence (note: if you have multiple sentences in one token list, you should either filter it down to one sentence or provide a sentence= argument):
g = graph_from_sentence(tokens)
plot(g)
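Filtering a multi-sentence token list down to one sentence, as the note above suggests, can be done with ordinary data frame subsetting. The sketch below uses a small hypothetical two-sentence token list with only the columns needed for the illustration; it relies solely on the sentence column shown in the example data.

```r
# Toy token list with two sentences (hypothetical data, base R only).
tokens <- data.frame(
  id       = 1:5,
  sentence = c(1, 1, 1, 2, 2),
  word     = c("Mary", "hit", "him", "John", "left"),
  stringsAsFactors = FALSE
)

# Keep only the tokens of the first sentence before plotting.
first <- tokens[tokens$sentence == 1, ]
first$word
```

With a full token list, the filtered data frame would then be passed to graph_from_sentence in place of the complete list.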
You can use the get_quotes function to extract quotes and paraphrases from the sentences. Note that for this, the token ids need to be globally unique. If that is not the case, you can use the unique_ids function to make them unique:
tokens = unique_ids(tokens)
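To see whether this step is needed at all, you can check the id column for duplicates first. The sketch below is a base R check on a hypothetical token list from two articles whose ids restart at 1; it only tests for uniqueness and does not show how unique_ids itself renumbers the tokens.

```r
# Toy token list from two articles whose ids restart at 1 (hypothetical data).
tokens <- data.frame(
  aid = c(101, 101, 102, 102),
  id  = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# anyDuplicated() returns 0 when every id occurs only once.
ids_are_unique <- anyDuplicated(tokens$id) == 0
ids_are_unique
```

Here the check reports FALSE, because ids 1 and 2 each occur in both articles.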
You can get the quotes from the tokens with get_quotes:
quotes = get_quotes(tokens)
quotes
| quote_id | key | quote_role | id |
|----------|-----|------------|----|
| 1 | 2 | source | 1 |
| 1 | 2 | quote | 3 |
| 1 | 2 | quote | 4 |
| 1 | 2 | quote | 6 |
| 1 | 2 | quote | 5 |
A single quote was found, with node 2 ("say") as the key, node 1 ("John") as the source, and nodes 3 through 6 ("that Mary hit him") as the quote.
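The quotes table only contains token ids, so reading it as text means joining it back to the token list. The following is a minimal base R sketch of that join; it retypes the ids, words, and quote rows from the example output rather than calling get_quotes, and the variable names merged and quote_text are only illustrative.

```r
# Token ids and words from the example sentence (base R only).
tokens <- data.frame(
  id   = 1:6,
  word = c("John", "says", "that", "Mary", "hit", "him"),
  stringsAsFactors = FALSE
)
# The quotes table as printed above.
quotes <- data.frame(
  quote_id   = 1,
  key        = 2,
  quote_role = c("source", "quote", "quote", "quote", "quote"),
  id         = c(1, 3, 4, 6, 5),
  stringsAsFactors = FALSE
)

# Join on the token id, restore sentence order, and paste the words per role.
merged <- merge(quotes, tokens, by = "id")
merged <- merged[order(merged$id), ]
quote_text <- sapply(split(merged$word, merged$quote_role), paste, collapse = " ")
quote_text[["source"]]
quote_text[["quote"]]
```

Sorting by id before pasting matters because the quote rows are not returned in sentence order (token 6 is listed before token 5).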
To find the clauses, you can use the get_clauses function, which takes the quotes as an optional argument to make sure that speech actions are not listed as clauses:
clauses = get_clauses(tokens, quotes=quotes)
clauses
| clause_id | clause_role | id |
|-----------|-------------|----|
| 1 | subject | 4 |
| 1 | predicate | 3 |
| 1 | predicate | 6 |
| 1 | predicate | 5 |
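To read the clause output in terms of words, the ids can be looked up in the token list. The sketch below uses base R's match for the lookup; the token ids, words, and clause rows are retyped from the example output rather than produced by get_clauses, and the name roles is only illustrative.

```r
# Token ids and words from the example sentence (base R only).
tokens <- data.frame(
  id   = 1:6,
  word = c("John", "says", "that", "Mary", "hit", "him"),
  stringsAsFactors = FALSE
)
# The clauses table as printed above.
clauses <- data.frame(
  clause_id   = 1,
  clause_role = c("subject", "predicate", "predicate", "predicate"),
  id          = c(4, 3, 6, 5),
  stringsAsFactors = FALSE
)

# Look up each clause token's word by id, then list the words per role
# in sentence order.
clauses$word <- tokens$word[match(clauses$id, tokens$id)]
clauses <- clauses[order(clauses$id), ]
roles <- split(clauses$word, clauses$clause_role)
roles
```

In this example the subject is "Mary" and the predicate consists of "that", "hit" and "him"; tokens 1 and 2 ("John says") do not appear because they belong to the quote's source and key rather than to a clause.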
Finally, you can also provide the quotes and clauses to the graph_from_sentence function. This fills the nodes of each clause in a desaturated rainbow colour, with the subject drawn as a circle and the predicate as a rectangle. Quotes are represented with a bright node for the source, and the nodes of the quote get a border in the same colour.
g = graph_from_sentence(tokens, quotes = quotes, clauses = clauses)
plot(g)