data_magyar_nemzet_large: Magyar Nemzet front page articles

data_magyar_nemzet_large    R Documentation

Magyar Nemzet front page articles

Description

The dataset contains 35 021 front-page articles from the Hungarian print daily Magyar Nemzet. It is used in Chapter 8 of the textbook (https://tankonyv.poltextlab.com/embedding.html).

Usage

data_magyar_nemzet_large

Format

A data.frame with 35 021 observations and 2 variables:

doc_id

A unique document ID; in this dataset it is the source file name, which follows the pattern dailyname_year_month_day_nr.txt

text

The unprocessed article text
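
Because doc_id encodes the publication date, it can be split into separate columns. The following is a minimal sketch, not part of the package: it uses a small made-up data.frame with the documented structure (the "mn" prefix, the dates, and the texts are hypothetical placeholders), and it assumes that the dailyname part contains no underscore and that year, month, and day are numeric. With the package installed, the same steps would presumably apply to HunMineR::data_magyar_nemzet_large.

```r
# Hypothetical rows that mimic the documented structure (doc_id, text).
# The "mn" prefix and the article texts are invented for illustration.
toy <- data.frame(
  doc_id = c("mn_2005_03_14_1.txt", "mn_2005_03_14_2.txt"),
  text   = c("First placeholder article.", "Second placeholder article."),
  stringsAsFactors = FALSE
)

# Drop the ".txt" extension, then split each id on underscores.
# Assumption: the dailyname part itself contains no underscore.
parts  <- strsplit(sub("\\.txt$", "", toy$doc_id), "_", fixed = TRUE)
fields <- do.call(rbind, parts)

# Store the components as typed columns.
toy$daily <- fields[, 1]
toy$date  <- as.Date(paste(fields[, 2], fields[, 3], fields[, 4], sep = "-"))
toy$nr    <- as.integer(fields[, 5])

str(toy)
```

If a doc_id does not match the assumed pattern, the split yields a different number of pieces and the rbind step no longer lines up, so it may be worth checking lengths(parts) before binding.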

Source

https://cap.tk.hu/en/dataoverview

References

Sebők, Miklós, and Zoltán Kacsuk (2021). The Multiclass Classification of Newspaper Articles with Machine Learning: The Hybrid Binary Snowball Approach. Political Analysis, 29(2): 236-249.


aakosm/HunMineR documentation built on Sept. 27, 2024, 5:22 p.m.