create_training_set: Create the training set for the automatic classification

View source: R/Model.R

create_training_set R Documentation

Create the training set for the automatic classification

Description

Given an Annotation file, creates and joins Document Term Matrices (DTMs) for the title, abstract, authors, keywords and MeSH terms of each record. Only terms whose document frequency among the negatively labelled records is above a given threshold, and which appear in at least two records, are kept.

Usage

create_training_set(Records, min_freq = 0.05)

Arguments

Records

An Annotation data frame.

min_freq

Minimum document frequency (a proportion between 0 and 1) among the negatively labelled records; only terms above this threshold are kept.
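The filtering rule controlled by min_freq can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy matrix, the label coding ("n" for negative, "y" for positive), and the use of >= rather than a strict > at the threshold are all assumptions made for illustration.

```r
# Toy binary DTM (hypothetical data): rows are records, columns are terms.
dtm <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    1, 0, 1,
    0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("trial", "cohort", "mouse"))
)

# Hypothetical label coding: "n" = negative, "y" = positive.
label <- c("n", "n", "y", "n")
min_freq <- 0.5

# Proportion of negatively labelled records that contain each term.
neg_freq <- colMeans(dtm[label == "n", , drop = FALSE] > 0)

# Number of records (regardless of label) that contain each term.
n_records <- colSums(dtm > 0)

# Keep terms that reach the threshold among negatives (assumed >=)
# and that occur in at least two records overall.
keep <- neg_freq >= min_freq & n_records >= 2
filtered <- dtm[, keep, drop = FALSE]

colnames(filtered)
```

In this toy case only "trial" survives: it appears in two of the three negative records and in three records overall, while "cohort" and "mouse" each appear in a single record.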

Value

An enriched DTM obtained by merging the title, abstract, authors, keywords and MeSH terms DTMs, plus the record ID and, if present, the record label.

Examples

## Not run: 

Records <- import_data(get_session_files("Session1")$Records)

DTM <- create_training_set(Records)

## End(Not run)

bakaburg1/BaySREn documentation built on March 30, 2022, 12:16 a.m.