my_leipzig_sample: Corpus data


Description

The corpus is a random subset of 25,000 sentences drawn from one of the Indonesian Leipzig Corpora files, "ind_news_2008_300K-sentences.txt". The original file contains 300,000 sentences from Indonesian online newspapers.
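As a rough illustration of what "random subset" means here, the sketch below draws sentences without replacement from a larger character vector. The vector `full_corpus`, the seed, and the reduced sizes are all hypothetical stand-ins; the actual procedure and seed used to build `my_leipzig_sample` are not documented on this page.

```r
# Hypothetical stand-in for the sentences of the full corpus file
# (300 invented placeholder strings instead of 300,000 real sentences).
full_corpus <- paste("Kalimat nomor", 1:300)

# Arbitrary seed, assumed here only so the sketch is reproducible.
set.seed(1)

# Draw a random subset without replacement (25 here; 25,000 in the real data).
corpus_sample <- sample(full_corpus, size = 25)

length(corpus_sample)  # number of sampled sentences
```

Because `sample()` draws without replacement by default, no sentence position is selected twice.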

Usage

my_leipzig_sample

Format

A character vector of 25,000 elements, each containing one sentence.
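A minimal sketch of how a character vector of sentences like this one might be inspected in base R. The object `sentences` is a two-element stand-in with invented placeholder text, used so the code runs without the package installed; with the package loaded, `my_leipzig_sample` could presumably be substituted for it.

```r
# Stand-in for my_leipzig_sample; the sentences are invented placeholders,
# not taken from the corpus.
sentences <- c("Ini kalimat pertama.",
               "Ini kalimat kedua yang lebih panjang.")

is.character(sentences)  # checks that the object is a character vector
length(sentences)        # number of sentences (25,000 for the real data)
head(sentences, 1)       # first sentence

# Naive whitespace tokenisation to count words per sentence.
word_counts <- lengths(strsplit(sentences, "\\s+"))
```

Splitting on whitespace is only a crude approximation of tokenisation, since punctuation stays attached to the neighbouring word.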

Source

http://wortschatz.uni-leipzig.de/en/download


gederajeg/wordpairs documentation built on May 23, 2019, 2:46 p.m.