parse_qcodes: Parse coded text


View source: R/parse_qcodes.R

Description

Take a data frame of coded text documents and return a data frame of the codes captured within.

Usage

parse_qcodes(x, ...)

Arguments

x

A data frame containing the text to be coded; requires columns "doc_id" and "document_text"

...

Other parameters optionally passed in
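As a minimal sketch of a suitable input, the data frame below has the two required columns. The document IDs and texts are invented for illustration and are not taken from the package.

```r
# Illustrative input only: the IDs and texts are made up.
my_documents <- data.frame(
  doc_id = c("interview_1", "interview_2"),
  document_text = c(
    "Intro (QCODE)I liked the course(/QCODE){#positive} and then we moved on.",
    "This document has no coded passages."
  ),
  stringsAsFactors = FALSE
)

# Check that the columns parse_qcodes() requires are present.
has_required <- all(c("doc_id", "document_text") %in% names(my_documents))
```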

Details

This function takes a text document containing coded text of the form:

"stuff to ignore (QCODE) coded text we care about (/QCODE){#my_code}
more stuff to ignore"

and turns it into a data frame with one row per coded item, with the columns "doc", "qcode", and "text".
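The markup pattern can be illustrated with base R regular expressions. The sketch below is not the qcoder implementation: extract_coded is a hypothetical helper, it handles only a single code tag per coded span, it trims surrounding whitespace, and it does not perform the newline replacement described in this section. The output column names follow the Value section.

```r
# Hypothetical helper, NOT part of qcoder: illustrates the markup pattern only.
extract_coded <- function(doc_id, document_text) {
  # (?s) lets "." match newlines; group 1 is the coded text, group 2 the code name.
  pattern <- "(?s)\\(QCODE\\)(.*?)\\(/QCODE\\)\\{#([A-Za-z0-9_]+)\\}"
  hits <- regmatches(document_text,
                     gregexpr(pattern, document_text, perl = TRUE))[[1]]
  if (length(hits) == 0) {
    # No coded spans: return a data frame with no rows.
    return(data.frame(doc = character(0), qcode = character(0),
                      text = character(0), stringsAsFactors = FALSE))
  }
  data.frame(
    doc = doc_id,
    qcode = sub(pattern, "\\2", hits, perl = TRUE),
    text = trimws(sub(pattern, "\\1", hits, perl = TRUE)),
    stringsAsFactors = FALSE
  )
}

example_text <- paste0(
  "stuff to ignore (QCODE) coded text we care about (/QCODE){#my_code}\n",
  "more stuff to ignore"
)
result <- extract_coded("doc1", example_text)
```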

parse_qcodes assumes that it is being passed a data frame. It calls the parse_one_document function to do the heavy lifting of extracting the coded text from the document_text column.

Newline characters are replaced with an HTML <br> in the captured text.
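The effect of that replacement can be reproduced with base R. This is only an illustration of the behavior, since the package's own call is not shown here; the sample string is invented.

```r
# Illustration of the newline-to-<br> replacement on a made-up string.
captured <- "first line\nsecond line"

# fixed = TRUE treats "\n" as a literal newline character, not a regex.
with_breaks <- gsub("\n", "<br>", captured, fixed = TRUE)
```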

If no valid qcodes are found, parse_qcodes returns an empty data frame (no rows).
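Callers may want to guard against the empty case before working with the result. In the sketch below, has_codes is a hypothetical helper and the empty data frame is a stand-in for the function's return value; whether the real empty result carries any columns is an assumption not confirmed here.

```r
# Hypothetical helper: TRUE when at least one coded item was returned.
has_codes <- function(coded) {
  is.data.frame(coded) && nrow(coded) > 0
}

# Stand-in for an empty result (column layout is an assumption).
empty_result <- data.frame(doc = character(0), qcode = character(0),
                           text = character(0), stringsAsFactors = FALSE)

if (!has_codes(empty_result)) {
  message("No coded text found; skipping further processing.")
}
```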

Value

If the data frame contains coded text in the document_text column, output will be a data frame with three columns: "doc", "qcode", and "text".

"doc" is the "doc_id" from the input data frame.

"qcode" is the code that the captured text was marked up with.

"text" is the text that was captured.

Examples

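library(qcoder)

# my_documents: a data frame with "doc_id" and "document_text" columns
parse_qcodes(my_documents)

# Data frames can be piped into this function
# (%>% comes from magrittr, which must be attached)
library(magrittr)
my_documents %>%
  parse_qcodes()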

ropenscilabs/qcoder documentation built on Dec. 31, 2021, 9:11 p.m.