findProceduralStatements: Find procedural statements

Description

Some statements hold little value and should be removed. This function finds such statements. TODO: remove the statements as well.

Usage

findProceduralStatements(train_proc, train_actual, full_data, html_file, ...)

Arguments

train_actual

A text document where each row represents a non-procedural statement from candidate(s). The file should be UTF-8 encoded.

full_data

A dataset where each line represents data from candidate(s) to be classified as a procedural or non-procedural statement. The file should be UTF-8 encoded.

...

Additional arguments passed to train_model.

train_proc

A text document where each row represents a procedural statement from candidate(s). The file should be UTF-8 encoded.
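
The input files described above hold one statement per line in UTF-8. As a minimal sketch of that format, assuming plain-text files, they could be read with base R as shown below. The helper name read_statements is hypothetical and is not part of the package; the function may read its inputs differently.

```r
# Hypothetical helper (not part of the package): read one statement per line
# from a UTF-8 encoded text file.
read_statements <- function(path) {
  con <- file(path, open = "r", encoding = "UTF-8")
  on.exit(close(con))
  lines <- readLines(con, warn = FALSE)
  # Trim surrounding whitespace and drop empty lines
  lines <- trimws(lines)
  lines[nzchar(lines)]
}

# Demo with a temporary file standing in for a real training file
tmp <- tempfile(fileext = ".txt")
writeLines(c("First statement.", "", "Second statement."), tmp)
statements <- read_statements(tmp)
length(statements)
```

Dropping empty lines is a choice made for this sketch, so that blank rows are not treated as statements.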

Details

You currently need the custom create_matrix.R, because the version from GitHub causes problems in create_container.

Value

An HTML file in which each statement is colored according to the group the model predicted for it. Green means non-procedural; red means procedural (and should possibly be removed).
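
The color coding described above can be illustrated with a short base R sketch. It is not the package's implementation: the helper statements_to_html, the label values "procedural" and "non-procedural", the sample statements, and the output file name are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): wrap each statement in a
# paragraph colored by its predicted label.
statements_to_html <- function(statements, labels) {
  stopifnot(length(statements) == length(labels))
  # Escape characters with special meaning in HTML
  escaped <- gsub("&", "&amp;", statements, fixed = TRUE)
  escaped <- gsub("<", "&lt;", escaped, fixed = TRUE)
  escaped <- gsub(">", "&gt;", escaped, fixed = TRUE)
  # Green for non-procedural, red for procedural (assumed label values)
  colors <- ifelse(labels == "procedural", "red", "green")
  body <- sprintf('<p style="color:%s">%s</p>', colors, escaped)
  c("<html><head><meta charset=\"UTF-8\"></head><body>", body, "</body></html>")
}

# Made-up statements and labels, for illustration only
html <- statements_to_html(
  c("I move to adjourn.", "The tax rate should be lowered."),
  c("procedural", "non-procedural")
)
writeLines(html, file.path(tempdir(), "result.html"))
```

Opening the written file in a browser shows the first statement in red and the second in green.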

Examples

## Not run: 
library(tm)
out <- findProceduralStatements(train_proc = "./data/izjave_proc.txt",
                                train_actual = "./data/izjave_vsebina.txt",
                                full_data = "./data/izjava_HAINZ PRIMOZ.txt",
                                html_file = "rezultat.html")


## End(Not run)

romunov/zakonodaja documentation built on May 27, 2019, 1:50 p.m.