fst_child: Child Barometer 2016 Bullying response data in CoNLL-U format...

fst_child R Documentation

Child Barometer 2016 Bullying response data in CoNLL-U format with NLTK stopwords removed and background variables

Description

This dataset contains the responses to q7 "Kertoisitko, mitä sinun mielestäsi kiusaaminen on? (Avokysymys)" ("Could you tell us what you think bullying is? (Open-ended question)") in the FSD3134 Lapsibarometri 2016 (Child Barometer 2016) dataset. The responses are in CoNLL-U format with NLTK stopwords and punctuation removed, plus weights and background variables.

Usage

fst_child
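
As a minimal sketch of loading the data, the snippet below assumes the finnsurveytext package is installed and ships fst_child as a package dataset. The requireNamespace() guard is only an illustration choice so that the snippet fails gracefully when the package is missing.

```r
# Sketch: load and inspect fst_child if finnsurveytext is available.
if (requireNamespace("finnsurveytext", quietly = TRUE)) {
  # Load the dataset from the package into the current environment
  data("fst_child", package = "finnsurveytext", envir = environment())
  # This page documents 1580 rows and 18 columns
  print(dim(fst_child))
  # One row per token: show a few of the token-level columns
  print(head(fst_child[, c("doc_id", "token", "lemma", "upos")]))
} else {
  message("finnsurveytext is not installed; install it to access fst_child")
}
```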

Format

fst_child is a data frame with 1580 rows and 18 columns:

doc_id

the identifier of the document

paragraph_id

the identifier of the paragraph

sentence_id

the identifier of the sentence

sentence

the text of the sentence that this token is part of

token_id

Word index, integer starting at 1 for each new sentence; may be a range for multi-word tokens; may be a decimal number for empty nodes.

token

Word form or punctuation symbol.

lemma

Lemma or stem of word form.

upos

Universal part-of-speech tag.

xpos

Language-specific part-of-speech tag; underscore if not available.

feats

List of morphological features from the universal feature inventory or from a defined language-specific extension; underscore if not available.

head_token_id

Head of the current word, which is either a value of token_id or zero (0).

dep_rel

Universal dependency relation to the HEAD (root iff HEAD = 0) or a defined language-specific subtype of one.

deps

Enhanced dependency graph in the form of a list of head-deprel pairs.

misc

Any other annotation.

weight

Weight

gender

Gender

major_region

Major region

daycare_before_school

Daycare before pre-school
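
The columns above can be combined in the usual token-level way. The sketch below uses a small toy data frame whose column names follow this page but whose values are invented for illustration (including the gender codes, which are hypothetical). It also assumes that weight repeats the respondent's survey weight on every token row, which this page does not state explicitly.

```r
# Toy stand-in for fst_child: only the column names come from this page,
# all values (lemmas, weights, gender codes) are invented.
toy <- data.frame(
  doc_id = c("1", "1", "2", "2", "3"),
  lemma  = c("satuttaa", "haukkua", "satuttaa", "kiusata", "haukkua"),
  upos   = c("VERB", "VERB", "VERB", "VERB", "VERB"),
  weight = c(1.2, 1.2, 0.8, 0.8, 1.0),
  gender = c("a", "a", "b", "b", "a"),  # hypothetical codes
  stringsAsFactors = FALSE
)

# Unweighted frequency: number of token rows per lemma
raw_counts <- table(toy$lemma)

# Weighted frequency: sum of the weight column over each lemma's rows
weighted_counts <- tapply(toy$weight, toy$lemma, sum)

# Weighted frequency split by a background variable
by_gender <- aggregate(weight ~ lemma + gender, data = toy, FUN = sum)

print(raw_counts)
print(weighted_counts)
print(by_gender)
```

The same calls should work on fst_child itself by replacing toy, provided the assumption about the weight column holds.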

Source

<https://urn.fi/urn:nbn:fi:fsd:T-FSD3134>


finnsurveytext documentation built on April 4, 2025, 5:07 a.m.