opi_impact: Statistical assessment of impacts of a specified theme from a...

View source: R/opi_impact.R

Description

This function assesses the impacts of a theme (or subject) on the overall opinion computed for a DTD. Different themes in a DTD can be identified by the keywords used in it. These keywords can be extracted by any analytical means available to the user, e.g. the word_imp function. The keywords must be collated and supplied to this function through the theme_keys argument (see below).

Usage

opi_impact(textdoc, theme_keys=NULL, metric = 1,
fun = NULL, nsim = 99, alternative="two.sided",
quiet=TRUE)

Arguments

textdoc

An n x 1 list (dataframe) of individual text records, where n is the total number of individual records.

theme_keys

(a list) A one-column dataframe (of any length) containing a list of keywords relating to the theme or secondary subject to be investigated. The keywords can also be supplied as a character vector.

metric

(an integer) Specify the metric to utilize for the calculation of opinion score. Default: 1. See detailed documentation in the opi_score function.

fun

A user-defined function, required when the metric parameter (above) is set to 5. See detailed documentation in the opi_score function.

nsim

(an integer) Number of replicas (ESD) to generate. See detailed documentation in the opi_sim function. Default: 99.

alternative

(a character) Default: "two.sided", indicating a two-tailed test. A user can override this default by specifying "less" or "greater" to run the analysis as a one-tailed test when the observed score is located in the lower or upper region of the expectation distribution, respectively. Note: for metric = 1, the alternative parameter should be set to "two.sided" because the opinion score is bounded by both positive and negative values. For an opinion score bounded by positive values, such as when metric = 2, 3 or 4, the alternative parameter should be set to "greater", and to "less" otherwise. If the metric parameter is set to 5, with a user-defined opinion score function (i.e. fun is not NULL), the user is required to determine the limits of the opinion scores and set the alternative argument appropriately.

quiet

(TRUE or FALSE) To suppress processing messages. Default: TRUE.
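The argument descriptions above can be illustrated with a short base-R sketch of how the inputs might be prepared. The text records and keywords are made up for illustration, and suggest_alternative is a hypothetical helper (not part of opitools) that only restates the guidance given for metric values 1 to 4.

```r
# Made-up text records: an n x 1 data frame, one record per row.
textdoc <- data.frame(
  text = c("The coffee shop at the station is great",
           "Trains were delayed again this morning",
           "Loved the new cafe near platform 3"),
  stringsAsFactors = FALSE
)

# theme_keys as a one-column data frame ...
theme_keys_df <- data.frame(keys = c("coffee", "cafe"),
                            stringsAsFactors = FALSE)
# ... or as a character vector, as the argument description allows.
theme_keys_vec <- c("coffee", "cafe")

# Hypothetical helper: restates the documented guidance for the
# built-in metrics. For metric = 5 the user must decide.
suggest_alternative <- function(metric) {
  if (metric == 1) {
    "two.sided"   # score bounded by positive and negative values
  } else if (metric %in% 2:4) {
    "greater"     # score bounded by positive values
  } else {
    stop("metric = 5: set `alternative` from the limits of your own score function")
  }
}

alt <- suggest_alternative(1)
```

The resulting objects could then be passed as textdoc, theme_keys and alternative in a call to opi_impact.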

Details

This function calculates the statistical significance value (p-value) of an opinion score by comparing the observed score (from the opi_score function) with the distribution of expected scores (from the opi_sim function). The formula is p = (S.beat + 1)/(S.total + 1), where S.total is the total number of replicas (nsim) specified and S.beat is the number of replicas whose expected scores are more extreme than the observed score (see further details in Adepeju and Jimoh, 2021).
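The formula can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: mc_pvalue is a hypothetical helper, the observed and expected values are made up to stand in for the outputs of opi_score and opi_sim, and the way replicas are counted (at least as extreme as the observed score, with the two-sided case measured as distance from the mean of the expected scores) is an assumption.

```r
# Hypothetical helper: p = (S.beat + 1) / (S.total + 1)
mc_pvalue <- function(observed, expected,
                      alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  s_total <- length(expected)   # S.total: number of replicas (nsim)
  centre <- mean(expected)      # assumed reference point for two-sided tests
  # S.beat: replicas at least as extreme as the observed score (assumption)
  s_beat <- switch(alternative,
    greater   = sum(expected >= observed),
    less      = sum(expected <= observed),
    two.sided = sum(abs(expected - centre) >= abs(observed - centre))
  )
  (s_beat + 1) / (s_total + 1)
}

# Made-up scores for illustration only (nsim = 9).
expected <- c(-12, -8, -5, -3, -1, 0, 2, 4, 7)
observed <- -10

p_two  <- mc_pvalue(observed, expected, "two.sided")
p_less <- mc_pvalue(observed, expected, "less")
```

Adding 1 to both numerator and denominator counts the observed score as one member of the reference distribution, so the p-value can never be exactly zero; with nsim = 99 the smallest attainable value is 0.01.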

Value

Details of statistical significance of impacts of a secondary subject B on the opinion concerning the primary subject A.

References

(1) Adepeju, M. and Jimoh, F. (2021). An Analytical Framework for Measuring Inequality in the Public Opinions on Policing – Assessing the impacts of COVID-19 Pandemic using Twitter Data. https://doi.org/10.31235/osf.io/c32qh

Examples

# Application in marketing:

#`data` -> 'reviews_dtd'
#`theme_keys` -> 'refreshment_theme'

#RQ2a: "Do the refreshment outlets impact customers'
#opinion of the services at the Piccadilly train station?"

##execute function
output <- opi_impact(textdoc = reviews_dtd,
          theme_keys=refreshment_theme, metric = 1,
          fun = NULL, nsim = 99, alternative="two.sided",
          quiet=TRUE)

#To print results
print(output)

#extracting the pvalue in order to answer RQ2a
output$pvalue

opitools documentation built on July 29, 2021, 5:06 p.m.