Description Usage Arguments Value Examples
View source: R/as_transcript.R
Coerce text into a transcript.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
text |
Character string: if file is not supplied and this is, then data are read from the value of text. Notice that a literal string can be used to include (small) data sets within R code. |
person.regex |
A capturing regex describing what is a person portion of a string. |
col.names |
A character vector specifying the column names of the transcript columns. |
text.var |
A character string specifying the name of the text variable
will ensure that variable is classed as character. If |
merge.broke.tot |
logical. If |
header |
logical. If |
dash |
A character string to replace the en and em dashes special characters (default is to remove). |
ellipsis |
A character string to replace the ellipsis special characters. |
quote2bracket |
logical. If |
rm.empty.rows |
logical. If |
na |
A character string to be interpreted as an |
sep |
The field separator character. Values on each line of the file are
separated by this character. The default of |
skip |
Integer; the number of lines of the data file to skip before beginning to read data. |
comment.char |
A character vector of length one containing a single
character or an empty string. Use |
max.person.nchar |
The max number of characters long names are expected to be. This information is used to warn the user if a separator appears beyond this length in the text. |
... |
Further arguments to be passed to |
Returns a dataframe of dialogue and people.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 | ## EXAMPLE 1
as_transcript("34 The New York Times reports a lot of words here.
12 Greenwire reports a lot of words.
31 Only three words.
2 The Financial Times reports a lot of words.
9 Greenwire short.
13 The New York Times reports a lot of words again.",
col.names = c("NO", "ARTICLE"), sep = " ")
## EXAMPLE 2
as_transcript("34.. The New York Times reports a lot of words here.
12.. Greenwire reports a lot of words.
31.. Only three words.
2.. The Financial Times reports a lot of words.
9.. Greenwire short.
13.. The New York Times reports a lot of words again.",
col.names = c("NO", "ARTICLE"), sep = "\\.\\.")
## EXAMPLE 3
as_transcript("JAKE The New York Times reports a lot of words here.
JIM Greenwire reports a lot of words.
JILL Only three words.
GRACE The Financial Times reports a lot of words.
JIM Greenwire short.
JILL The New York Times reports a lot of words again.",
person.regex = '(^[A-Z]{3,})'
)
|
Table: [6 x 2]
NO ARTICLE
1 34 The New York Times reports a lot of word
2 12 Greenwire reports a lot of words.
3 31 Only three words.
4 2 The Financial Times reports a lot of wor
5 9 Greenwire short.
6 13 The New York Times reports a lot of word
. .. ...
Table: [6 x 2]
NO ARTICLE
1 34 The New York Times reports a lot of word
2 12 Greenwire reports a lot of words.
3 31 Only three words.
4 2 The Financial Times reports a lot of wor
5 9 Greenwire short.
6 13 The New York Times reports a lot of word
. .. ...
Table: [6 x 2]
Person Dialogue
1 JAKE The New York Times reports a lot of word
2 JIM Greenwire reports a lot of words.
3 JILL Only three words.
4 GRACE The Financial Times reports a lot of wor
5 JIM Greenwire short.
6 JILL The New York Times reports a lot of word
. ... ...
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.