file_metadata_1830: File metadata for the decade 1830

file_metadata_1830 R Documentation

File metadata for the decade 1830

Description

The Hansard corpus file metadata retains source information from the digitized debates hosted by UK Parliament. It also includes indexing information added by the author. This data set can be used for locating data within original debates, or for citing the debates. The variables are as follows:

Usage

file_metadata_1830

Format

A data frame with 957327 rows and 6 variables:

sentence_id A unique ID assigned to each sentence of the Hansard corpus.

speech_id A unique ID assigned to each run of consecutive sentences spoken by a single speaker during a debate.

debate_id A unique ID assigned to each debate of the Hansard corpus.

src_file_id An ID assigned to the digitized file from which the present dataset was scraped, taken from the digitized parliamentary debates.

src_image An ID assigned to the image of the digitized file, taken from the digitized parliamentary debates.

src_column The column of the sentence, taken from the digitized parliamentary debates.
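
As a rough illustration of how these variables might be used to trace a sentence back to its source, the sketch below builds a small stand-in data frame with the same column names and looks up one sentence. All values, the column types, and the helper locate_sentence() are assumptions made for this example; they are not taken from the real data set or from the package.

```r
# Toy stand-in for file_metadata_1830. The column names follow the
# documentation above, but every value and type here is invented.
toy_metadata <- data.frame(
  sentence_id = c("S1", "S2", "S3"),
  speech_id   = c("SP1", "SP1", "SP2"),
  debate_id   = c("D1", "D1", "D1"),
  src_file_id = c("F1", "F1", "F1"),
  src_image   = c("I10", "I10", "I11"),
  src_column  = c(101, 101, 102),
  stringsAsFactors = FALSE
)

# locate_sentence() is a hypothetical helper, not part of the package.
# It returns the source file, image, and column for one sentence ID.
locate_sentence <- function(metadata, id) {
  metadata[metadata$sentence_id == id,
           c("src_file_id", "src_image", "src_column"),
           drop = FALSE]
}

loc <- locate_sentence(toy_metadata, "S3")
print(loc)

# Count how many sentences belong to each speech.
sentences_per_speech <- table(toy_metadata$speech_id)
print(sentences_per_speech)
```

With the real data, the same subsetting would presumably apply to file_metadata_1830 directly, using whatever ID format the corpus actually uses.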

Source

Harvard Dataverse: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/ZCYJH8

References

Buongiorno, Steph; Kalescky, Robert; Godat, Eric; Cerpa, Omar Alexander; Guldi, Jo (2021). https://doi.org/10.7910/DVN/ZCYJH8

Examples

data(file_metadata_1830)
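
A slightly fuller sketch, assuming the data set ships with an installed package named hansardr (as the page footer suggests): it loads the data only if the package is available, then checks that the documented columns are present. The expected_cols vector is simply copied from the Format section above.

```r
# Column names as listed in the Format section of this page.
expected_cols <- c("sentence_id", "speech_id", "debate_id",
                   "src_file_id", "src_image", "src_column")

# Assumption: the data set is provided by a package called hansardr.
if (requireNamespace("hansardr", quietly = TRUE)) {
  data("file_metadata_1830", package = "hansardr")

  # Inspect the structure without printing every row.
  str(file_metadata_1830)

  # Report any documented column that is absent from the loaded data.
  missing_cols <- setdiff(expected_cols, names(file_metadata_1830))
  if (length(missing_cols) > 0) {
    warning("Columns not found: ", paste(missing_cols, collapse = ", "))
  }
} else {
  message("hansardr is not installed; skipping the check.")
}
```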


stephbuon/hansardr documentation built on March 1, 2023, 6:42 p.m.