knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) knitr::opts_chunk$set(warning = FALSE, message = FALSE)
library(finnsurveytext) library(survey)
The new updated version of finnsurveytext
works with svydesign
objects which can be created with the survey
R package. There are two ways that svydesign objects can be used:
svydesign
object for use in this tutorial:We will use the dev_coop
sample dataset for the tutorial, and create a svydesign
object from this sample data.
svy_d <- survey::svydesign(id = ~1, weights = ~paino, data = dev_coop)
The relevant function here is fst_prepare_svydesign()
.
svydesign
: the name of the svydesign objectquestion
: the column of 'data' that contains the open-ended questionid
: the ID column in 'data'model
: a language model available for udpipe, "ftb" or "tdt". (If you do not provide a value, this will default to "ftb".)stopword_list
: a list of stopwords to remove from your 'data'. (To find relevant stopwords lists, you can run fst_find_stopwords()
. If you do not provide a value, this will default to "nltk". "manual" can be used to indicate that a manual list will be provided.)language
This should be the two-letter ISO code for the language for the stopword list. The default language
is "fi". use_weights
: optional, a boolean as to whether to include weights from the svydesign. (If you do not provide a value, this will default to TRUE.)add_columns
: any columns you want to add to the formatted data from the svydesign object, such as for use in comparison functions. (If you do not provide any values, this will default to NULL.)manual
: an optional boolean to indicate that a manual list will be provided. (stopword_list = "manual"
can also or instead be used. If you do not provide any values, this will default to FALSE.)manual_list
: the manual list of stopwords if you choose to provide one. (If you do not provide any values, this will default to an empty list.)df <- fst_prepare_svydesign(svydesign = svy_d, question = 'q11_3', id = 'fsd_id', model = 'tdt', stopword_list = 'snowball', use_weights = TRUE, add_cols = c('gender','region') )
The data is now formatted:
knitr::kable(head(df, 5))
svydesign
object in data explorationThe svydesign
object can be used to add weights and other columns during data exploration.
First, let's create formatted data without weights and additional columns ready to use with our svydesign
object.
df2 <- fst_prepare(data = dev_coop, question = 'q11_3', id = 'fsd_id', model = 'ftb', stopword_list = 'nltk', weights = NULL, add_cols = NULL)
Within the data analysis functions, there are 3 parameters (which are in each function) which are used to add information from the svydesign
object. These are:
use_svydesign_weights
: should be set as TRUE if we want to use weights from within a svydesign object.id
: only required if weights are coming from a svydesign
objectsvydesign
: the svydesign
objectWithin the initial functions (the ones which are not used for comparison between groups) these are used to add weights from the svydesign
object.
For example,
fst_wordcloud(df2, pos_filter = c("NOUN", "VERB", "ADJ", "ADV"), max=50, use_svydesign_weights = TRUE, id = 'fsd_id', svydesign = svy_d) fst_freq(df2, number = 10, norm = NULL, pos_filter = NULL, strict = TRUE, name = NULL, use_svydesign_weights = TRUE, id = "fsd_id", svydesign = svy_d, use_column_weights = FALSE)
Within the comparison functions, we have the following additional parameter:
use_svydesign_field
: set to TRUE if you want to get field
for splitting the data from the svydesign
objectfst_ngrams_compare(fst_dev_coop_2, field = 'gender', number = 10, ngrams = 1, norm = NULL, pos_filter = NULL, strict = TRUE, use_svydesign_weights = TRUE, use_svydesign_field = TRUE, id = "fsd_id", svydesign = svy_d, use_column_weights = FALSE, exclude_nulls = TRUE, rename_nulls = 'null_data', unique_colour = "indianred", title_size = 20, subtitle_size = 15)
All of these functions call the function fst_use_svydesign()
in the background to add the svydesign
data to your formatted dataframe.
# FUNCTION DEFINITION: fst_use_svydesign <- function(data, svydesign, id, add_cols = NULL, add_weights = TRUE)
unlink('finnish-ftb-ud-2.5-191206.udpipe') unlink("finnish-tdt-ud-2.5-191206.udpipe")
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.