# adventr: hypothesis testing In adventr: Interactive R Tutorials to Accompany Field (2016), "An Adventure in Statistics"

library(learnr)
library(tidyverse)
library(BayesFactor)
library(effsize)

knitr::opts_chunk$set(echo = FALSE, warning = FALSE)
tutorial_options(exercise.cap = "Exercise")

#Read data files needed for the tutorial
teddy_tib <- adventr::teddy_dat
zhang_female_tib <- adventr::zhang_female_dat

#Setup objects for code blocks
teddy_sum <- teddy_tib %>%
  dplyr::group_by(study_n, group) %>%
  dplyr::summarize(
    mean = mean(self_esteem),
    sd = sd(self_esteem),
    ci_low = ggplot2::mean_cl_normal(self_esteem)$ymin,
    ci_upp = ggplot2::mean_cl_normal(self_esteem)$ymax
  )

Now try plotting an error bar graph using facet_wrap() to display graphs for the two studies side by side. Use coord_cartesian() to set the y-axis limits to 0-20, scale_y_continuous() to place tick marks every two units (e.g., 0, 2, 4, 6, ...), and labs() to define labels for the x- and y-axes.

ted_plot <- ggplot2::ggplot(teddy_tib, aes(group, self_esteem))
ted_plot +
  stat_summary(fun.data = "mean_cl_normal", size = 1) +
  facet_wrap(~ study_n) +
  coord_cartesian(ylim = c(0, 20)) +
  scale_y_continuous(breaks = seq(0, 20, 2)) +
  labs(x = "Experimental condition", y = "Self-esteem (0-20)") +
  theme_bw()

## Effect sizes

### Cohen's d

A useful measure of effect size is Cohen's d, which is the difference between two means divided by some estimate of the standard deviation of those means:

$$\hat{d} = \frac{\bar{X}_1-\bar{X}_2}{s}$$

I have put a hat on the d to remind us that we're really interested in the effect size in the population, but because we can't measure that directly we estimate it from the samples (the hat means 'estimate of'). By dividing by the standard deviation we express the difference between means in standard deviation units (a bit like a z-score). The standard deviation is a measure of 'error' or 'noise' in the data, so d is effectively a signal-to-noise ratio. However, if we're using two means there will be a standard deviation associated with each of them, so which one should we use? There are three choices:

1. If one of the groups is a control group it makes sense to use that group's standard deviation to compute d (the argument being that the experimental manipulation might affect the standard deviation of the experimental group, so the control group SD is a 'purer' measure of natural variation in scores).
2. Sometimes we assume that group variances (and therefore standard deviations) are equal (homogeneity of variance), and if they are we can pick the standard deviation from either group because it won't matter.
3. We use what's known as a 'pooled estimate', which is the weighted average of the two group variances. It is given by the following equation:

$$s_p = \sqrt{\frac{(N_1-1) s_1^2+(N_2-1) s_2^2}{N_1+N_2-2}}$$

Say we wanted to estimate d for the difference in self-esteem after cuddling the teddy compared to the box (control). Assuming you successfully created the tibble (teddy_sum), you should have this information:

knitr::kable(teddy_sum, caption = "Summary statistics for the teddy and box groups within the two experiments")

We have a logical control group, so d is simply:

$$\hat{d}_\text{teddy vs box} = \frac{15-10}{5.96} = 0.839$$

If we use the pooled estimate then we'd get:

$$\begin{aligned}
s_p &= \sqrt{\frac{(N_1-1) s_1^2+(N_2-1) s_2^2}{N_1+N_2-2}} \\
&= \sqrt{\frac{(10-1)5.96^2+(10-1)6.02^2}{10+10-2}} \\
&= \sqrt{\frac{645.86}{18}} \\
&= 5.99
\end{aligned}$$

Basically, because the group standard deviations are more or less the same (5.96 and 6.02) the pooled estimate is very similar too (it falls between those two values). Consequently, the resulting effect size does not change much when we use the pooled estimate:

$$\hat{d}_\text{teddy vs box} = \frac{15-10}{5.99} = 0.835$$

Let's look at computing Cohen's d using the pooled estimate in R. We're going to do this separately for the two studies in our tibble (teddy_tib). As a brief recap, this tibble contains four variables:

• id: Participant ID
• group: whether the participant cuddled a teddy bear or the cardboard box that contained the teddy bear
• study_n: Factor that distinguishes the two studies in the data set. The first was based on a total N of 20 and the second on a total N of 200. We're going to use this variable to dplyr::filter() the tibble before we compute d.
• self_esteem: Self-reported self-esteem scores

Like many things in R, there are numerous packages that offer functions to compute effect sizes. We're going to use the cohen.d() function from the effsize package [@RN11406], primarily because it computes the effect size from the raw data (rather than from summary statistics) and because it has an option for designs that use repeated measures (i.e. the means across conditions are dependent). The general format of this function is:

cohen.d(formula = outcome ~ group_variable, pooled = TRUE, paired = FALSE, na.rm = FALSE, hedges.correction = FALSE, conf.level = 0.95)

The arguments within this function are:

• formula = outcome ~ group_variable: this allows us to specify the outcome variable (in this case self_esteem) and the variable that defines the two groups (in this case group). This format of specifying a model as outcome ~ predictor will become very familiar to you as we progress through these tutorials.
• pooled = TRUE: by default the pooled standard deviation, $s_p$, will be used. If you set this option to FALSE then the standard deviation of the whole sample will be used. Typically you'd leave this option alone (i.e., use the pooled estimate).
• paired = FALSE: by default the function considers scores as coming from different entities (i.e. an independent design). That's the design our teddy bear experiments use. However, if you have a repeated measures design (i.e. in our example, participants cuddle both a teddy and a box at different points in time) then set paired to TRUE.
• na.rm = FALSE: we've seen this option before. By default missing values are not removed and you should set this option to TRUE if you have missing values.
• hedges.correction = FALSE: this option determines whether a small-sample correction is applied to d, converting it to something called Hedges' g. By default the correction isn't applied, but it is a good idea to apply it.
• conf.level = 0.95: by default the confidence interval will be a 95% one, but you can adjust this option to change the width of the confidence interval. The default is fine.
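Before handing the work over to cohen.d(), it can be reassuring to reproduce the hand calculation from the worked example above in a few lines of base R (the variable names below are just for illustration):

```r
# Hand computation of the pooled standard deviation and Cohen's d using the
# summary statistics from the worked example above
# (means 15 and 10, SDs 5.96 and 6.02, n = 10 per group).
m1 <- 15; m2 <- 10       # teddy and box means
s1 <- 5.96; s2 <- 6.02   # group standard deviations
n1 <- 10; n2 <- 10       # group sizes

# Weighted average of the two group variances, then square root
s_p <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2))
d   <- (m1 - m2) / s_p

round(s_p, 2)  # 5.99
round(d, 3)    # 0.835
```

Both values match the worked example, which is a useful sanity check before trusting a packaged function.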

The function returns an object that contains the value of d, the confidence interval and some other information. We're going to use a pipe (%>%) to link:

• teddy_tib: our input will be the tibble of raw data
• dplyr::filter(study_n == "Total N = 20"): we're going to filter that tibble to extract the experiment based on a sample size of 20
• cohen.d(): the function described above will compute the effect size.
teddy_tib %>%
dplyr::filter(study_n == "Total N = 20") %>%
effsize::cohen.d(formula = self_esteem ~ group, data = .)


This command takes the teddy_tib tibble, extracts the study based on 20 participants, then applies the cohen.d() function. Within this function we have left the defaults as they are, our formula specifies that we're predicting self_esteem from the variable group (which defines whether the individual hugs a teddy or a box), and we specify the data within the function with a period (data = .). By using a period we're telling the function to use the data coming through from the pipe. Copy this command in the code box below and run the code. Then edit it to apply Hedges' correction, then edit the code again to get the effect size for the experiment that had 200 participants.



# To get the Hedges' correction we'd add hedges.correction = T:

teddy_tib %>%
dplyr::filter(study_n == "Total N = 20") %>%
effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = T)

# To get the effect size for the larger study we'd change the filter() function to extract the study based on 200 participants:

teddy_tib %>%
dplyr::filter(study_n == "Total N = 200") %>%
effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = T)


Applying the Hedges' correction to both studies, we'd get these outputs:

teddy_tib %>%
dplyr::filter(study_n == "Total N = 20") %>%
effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = T)

teddy_tib %>%
dplyr::filter(study_n == "Total N = 200") %>%
effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = T)


This shows that for the smaller study, the effect of cuddling a teddy compared to a box on self-esteem was g = -0.80 [-1.78, 0.18]. Whether d (or g) has a positive or negative sign reflects which way around you subtracted the group means. The teddy group had a mean of 15 and the box group a mean of 10. The difference between these means will be 5 if you subtract the box mean from the teddy mean (15 - 10 = 5), but -5 if you subtract the teddy mean from the box mean (10 - 15 = -5). So, to interpret the effect size look at the group means (not the plus or minus sign of g). In this case, self-esteem is 0.80 of a standard deviation higher after cuddling a teddy than after cuddling a box. The confidence interval for this effect size suggests (if we assume this sample is one of the 95% that contain the population value) that the population effect could range between -1.78 and 0.18. Crucially, this means the effect could be 0, and could reflect higher self-esteem in either the box group or the teddy group.
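The amount by which Hedges' correction shrinks d can be sketched with a widely used approximation (assumed here for illustration; the exact correction a given package applies may differ slightly, and the helper name d_to_g is made up):

```r
# Hedges' correction shrinks d slightly to remove small-sample bias.
# This is a common approximation, not necessarily the exact formula
# any particular package implements.
d_to_g <- function(d, n1, n2) d * (1 - 3 / (4 * (n1 + n2) - 9))

d_to_g(0.835, 10, 10)   # about 0.80, matching the magnitude of g above
d_to_g(0.835, 100, 100) # with n = 100 per group the correction is tiny
```

This also explains why the correction matters most for the smaller study: as the total sample size grows, the multiplier approaches 1 and g converges on d.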

  question("In the larger study the confidence interval ranged from -1.13 to -0.54. What does this tell us?",
answer("If this confidence interval is one of the 95% that contains the population value then the population value of the difference between group means lies between -1.13 to -0.54.", correct = TRUE),
answer("There is a 95% chance that the population value of the difference between group means lies between -1.13 to -0.54.", message = "You cannot make probability statements from a confidence interval. We don't know whether this particular CI is one of the 95% that contains the population value of the difference between means."),
answer("The probability of this confidence interval containing the population value is 0.95.", message = "The probability of this confidence interval containing the population value is either 0 (it doesn't) or 1 (it does) but it's impossible to know which."),
answer("I can be 95% confident that the population value of the difference between group means lies between -1.13 to -0.54.", message = "Confidence intervals do not quantify your subjective confidence."),
correct = "Correct - well done!",
allow_retry = T
)


### An optional tricky section

I've struggled to find a function that will automatically calculate d using the standard deviation of the control group. However, it is simple enough to get R to subtract two values and then divide by another value. For example, to add x and y and then divide by z you could execute (x + y)/z. Simple. What's not simple is re-arranging the data so that we have the group means in columns. It can be done but involves restructuring the data in a crazy pipe involving lots of functions that we haven't learnt! I'm going to just plough through how it's done, explain what everything does but not cover the new functions in detail. So, by all means ignore this section if you like.

Earlier we created a tibble called teddy_sum that includes the means, standard deviations and confidence intervals of those means. The problem is that the means and sds for the box and teddy groups are in different rows and we want them in different columns so that we can subtract them. This is to remind you of what the tibble currently contains:

knitr::kable(teddy_sum, caption = "Summary statistics for the teddy and box groups within the two experiments")


I'm going to restructure this tibble so that all of the information from a given experiment is in a single row (that is, create columns containing the means, SDs and confidence intervals for each group). That's the hard part. Then I will add a column that computes d (that's the easy part!).

Restructuring data using the tidyverse package tidyr is something that I find endlessly confusing, so if you find it confusing too then don't worry - you're in good company! To restructure the data we're going to use this pipe:

teddy_wide <- teddy_sum %>%
tidyr::gather(measure, value, -c(study_n, group)) %>%
tidyr::unite(gp_measure, group, measure) %>%
tidyr::spread(gp_measure, value)

It's a pipe that could carry gas across the Atlantic. Let's break it down.

• teddy_sum %>%: we begin with the teddy_sum tibble and carry that into the next function using the pipe operator (%>%).
• tidyr::gather(measure, value, -c(study_n, group)): this takes each variable in teddy_sum and places its value in a variable called value (you can name it what you like, I chose 'value') and creates a variable called measure (again, you can name it differently) that identifies what the value relates to. For example, if the value is from the column mean then the value placed in measure will be the word 'mean'. Crucially, I have used -c(study_n, group) to exclude the variables study_n and group from this process. These variables retain their existing values. To get a feel for what's happening execute the pipe just to this point in the code box (i.e., execute teddy_sum %>% tidyr::gather(measure, value, -c(study_n, group))). Essentially the variables mean, sd, ci_low and ci_upp, which were in 4 columns, have been replaced by two columns: one (measure) contains the name of the original column and the other (value) contains the value.
• tidyr::unite(gp_measure, group, measure) is used to unite the variables group and measure into a single variable called gp_measure (again, choose a different name if you like). It basically combines the values from the columns and separates them with an underscore. For example, if the value of group is Box and the value of measure is mean then the resulting value in gp_measure will be Box_mean. In effect, we're creating a variable to contain column names to use when we transform the data back. Again, if you want to see what's going on use the code box to run the pipe up to this point (teddy_sum %>% tidyr::gather(measure, value, -c(study_n, group)) %>% tidyr::unite(gp_measure, group, measure)).
• tidyr::spread(gp_measure, value): Finally, we spread the data back out into separate columns using gp_measure to define the column names and value to define the values placed with each column.
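If the tidyr functions still feel opaque, the same long-to-wide idea can be sketched in base R with stats::reshape() on a made-up toy data frame (the names long and wide below are purely illustrative, not part of the tutorial data):

```r
# A toy long-format summary: one row per study-by-group combination
long <- data.frame(
  study = c(1, 1, 2, 2),
  group = c("Box", "Teddy", "Box", "Teddy"),
  mean  = c(10, 15, 10, 15)
)

# Spread the groups into columns so each study occupies a single row;
# sep = "_" glues the measure name to the group name, like tidyr::unite()
wide <- reshape(long, idvar = "study", timevar = "group",
                direction = "wide", sep = "_")
names(wide)  # the group means now sit in separate columns
```

The combined column names (mean_Box, mean_Teddy) play the same role as the gp_measure variable in the tidyr pipe.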


teddy_wide <- teddy_sum %>%
tidyr::gather(measure, value, -c(study_n, group)) %>%
tidyr::unite(gp_measure, group, measure) %>%
tidyr::spread(gp_measure, value)

teddy_wide


Run the full pipe to create this new tibble (which I've called teddy_wide). Execute the name of the tibble to look at it. The resulting tibble contains these variables:

• study_n: Factor that distinguishes the two studies in the data set. The first was based on a total N of 20 and the second was based on a total N of 200.
• Box_ci_low: lower limit of the confidence interval in the box group.
• Box_ci_upp: upper limit of the confidence interval in the box group.
• Box_mean: mean self-esteem in the box group.
• Box_sd: standard deviation of self-esteem in the box group.
• Teddy Bear_ci_low: lower limit of the confidence interval in the teddy group.
• Teddy Bear_ci_upp: upper limit of the confidence interval in the teddy group.
• Teddy Bear_mean: mean self-esteem in the teddy group.
• Teddy Bear_sd: standard deviation of self-esteem in the teddy group.

The variables for the teddy bear group have been named a bit clunkily (because we'd defined the group as 'Teddy Bear'). Let's tidy these names up using the rename() function:

teddy_wide <- teddy_wide %>%
rename(
teddy_ci_low = `Teddy Bear_ci_low`,
teddy_ci_upp = `Teddy Bear_ci_upp`,
teddy_mean = `Teddy Bear_mean`,
teddy_sd = `Teddy Bear_sd`
)


This code recreates the teddy_wide tibble from itself but creates a variable teddy_ci_low from the variable currently called Teddy Bear_ci_low, and so on. In effect I'm renaming the 'Teddy Bear' variables to omit spaces and upper-case letters (and to be shorter). Because the original names contain spaces, they need to be enclosed in backticks inside rename(). Try this in the code box and inspect the resulting tibble.

teddy_wide <- teddy_sum %>%
tidyr::gather(measure, value, -c(study_n, group)) %>%
tidyr::unite(gp_measure, group, measure) %>%
tidyr::spread(gp_measure, value)


teddy_wide <- teddy_wide %>%
rename(
teddy_ci_low = `Teddy Bear_ci_low`,
teddy_ci_upp = `Teddy Bear_ci_upp`,
teddy_mean = `Teddy Bear_mean`,
teddy_sd = `Teddy Bear_sd`
)
teddy_wide


Well done if you're still with me! Let's now do the (relatively) easy bit. Now we have the group means in columns, it's relatively straightforward to use mutate() to add a column that contains d:

teddy_wide <- teddy_wide %>%
dplyr::mutate(
d = (teddy_mean - Box_mean)/Box_sd
)


This code re-creates the teddy_wide tibble from itself, but then uses mutate() to add a variable called d, defined as (teddy_mean - Box_mean)/Box_sd. That is, it subtracts the scores in the column Box_mean from the scores in the column teddy_mean and divides the result by the scores in the column Box_sd: the difference between the group means divided by the standard deviation of the control group! Try this below and then view the resulting tibble to see the values of d.

teddy_wide <- teddy_sum %>%
tidyr::gather(measure, value, -c(study_n, group)) %>%
tidyr::unite(gp_measure, group, measure) %>%
tidyr::spread(gp_measure, value)

teddy_wide <- teddy_wide %>%
rename(
teddy_ci_low = `Teddy Bear_ci_low`,
teddy_ci_upp = `Teddy Bear_ci_upp`,
teddy_mean = `Teddy Bear_mean`,
teddy_sd = `Teddy Bear_sd`
)



teddy_wide <- teddy_wide %>%
dplyr::mutate(
d = (teddy_mean - Box_mean)/Box_sd
)
teddy_wide

question("Which of these statements about Cohen's *d* is **NOT** correct?",
answer("The value of *d* cannot exceed 1.", correct = TRUE, message = "This statement is false and so is the correct answer."),
answer("*d* is the difference between two means expressed in standard deviation units.", message = "This statement is true so is not the correct answer."),
answer("A *d* of 0.2 would be considered small", message = "This statement is true so is not the correct answer."),
answer("*d* can be computed using a control group standard deviation, the standard deviation of all scores or a pooled standard deviation.", message = "This statement is true so is not the correct answer."),
correct = "Correct - well done!",
allow_retry = T
)


## Bayes factors

### Bayes factors and priors

The Bayes factor quantifies the probability of the data given the alternative hypothesis (in this case that self-esteem differs in the teddy and box groups) relative to the probability of the data given the null hypothesis (in this case that self-esteem is the same in the teddy and box groups):

$$\text{Bayes factor} = \frac{p(\text{data}|\text{alternative})}{p(\text{data}|\text{null})}$$

The Bayesian t-test uses a default prior distribution based on a value, r, which scales the distribution [@RN9316]. The advantage of this approach is that it enables people to use a Bayesian method without needing to choose a prior from an almost endless set of possibilities. By constraining the decisions, some of the hard thinking is removed. The disadvantage is that defaults can tempt people to use them without any thought, and it's tricky to understand what's going on under the hood. To encourage you to give some thought to your prior, let's look under the hood (just a bit). The prior distribution is constrained to be from a family known as the Cauchy distribution. Members of the Cauchy family look a bit like normal distributions, but they're not. Figure 1 shows Cauchy distributions with different scale factors: r = 0.5 (left), r = 0.7071 (middle), and r = 1 (right). Each curve has a shaded region with boundaries from −r to +r, and the region under the curve between these boundaries represents 50% of the area. Note that this region gets wider as r increases. When you set the value of r you are assigning a 50% probability that the effect size (Cohen's d) lies between −r and +r. Given the distribution is symmetrical, you're also assigning a 25% prior belief that d is greater than r and the remaining 25% that d is less than −r.

If you use the default value of r = 0.7071 you are, therefore, saying that your prior belief is that there’s a 50% probability that the effect size (Cohen’s d) lies between −0.7071 and +0.7071, a 25% probability that it is greater than +0.7071 and a 25% probability that it is less than −0.7071. Remembering what d represents, you’re saying that there’s a 50% probability that the difference between the two means lies between −0.7071 and +0.7071 standard deviations. If you set r = 0.5 (Figure 1, left) the Cauchy distribution becomes narrower and you are similarly narrowing your prior beliefs to a 50% probability that the difference between the group means lies between −0.5 and +0.5 standard deviations. Setting r = 1 (Figure 1, right) widens the distribution, representing wider beliefs. You’d be placing a 50% probability that the difference between the group means lies between −1 and +1 standard deviations. Remembering that d = 0.8 is a large effect, you’d be assigning a 50% probability to the difference between means being somewhere between a very large effect in one direction and a very large effect in the opposite direction. This is too wide a belief for most situations (bearing in mind that another 50% of the probability is assigned to the effect being outside those limits).
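You can verify the 50% claim directly with base R's pcauchy(): the area under a Cauchy distribution with scale r between −r and +r is exactly one half, whatever value r takes (the helper name prior_mass is just for illustration).

```r
# Probability mass a Cauchy(location = 0, scale = r) prior places between -r and +r
prior_mass <- function(r) pcauchy(r, scale = r) - pcauchy(-r, scale = r)

prior_mass(sqrt(2) / 2)  # the default 'medium' prior: 0.5
prior_mass(0.5)          # 0.5
prior_mass(1)            # 0.5
```

This is why changing r doesn't change the 50% figure itself, only how wide the interval is over which that 50% of prior belief is spread.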

### Bayes factors in R

The BayesFactor package contains various functions for computing Bayes factors. We'll use it at various points for other models. For now, we'll use the ttestBF() function, which is designed for situations where you want a Bayes factor that quantifies differences between two means. It has a general form of:

ttestBF(formula = outcome ~ group_variable, mu = 0, paired = FALSE, data = NULL, rscale = "medium")

It has some other arguments not listed above, but these are the key ones for our purposes:

• formula = outcome ~ group_variable: this allows us to specify the outcome variable (in this case self_esteem) and the variable that defines the two groups (in this case group). This is basically the same as the cohen.d() function.
• mu = 0: for this example we can ignore this but if you want to use the function on a one-sample or paired design then you can use this to specify the null value of the mean (or mean difference in a paired design). The default of 0 is usually what you would want.
• paired = FALSE: by default the function considers scores as coming from different entities (i.e. an independent designs). That's the design our teddy bear experiments use. However, if you have a repeated measures design (i.e. in our example participants cuddle both a teddy and a box at different points in time) then set paired to TRUE.
• data = NULL: use this option to specify the tibble containing the data.
• rscale = "medium": This option specifies the prior. You can input a numeric value or use one of three pre-defined default priors. These defaults are "medium", "wide", and "ultrawide", which correspond to values of sqrt(2)/2 (i.e., 0.707), 1, and sqrt(2) respectively. I explained these priors earlier.

As when we computed an effect size, we're going to use a pipe (%>%) to link:

• teddy_tib: our input will be the tibble of raw data
• dplyr::filter(study_n == "Total N = 20"): we're going to filter that tibble to extract the experiment based on a sample size of 20
• ttestBF(): the function described above will compute the Bayes factor.
teddy_tib %>%
dplyr::filter(study_n == "Total N = 20") %>%
BayesFactor::ttestBF(formula = self_esteem ~ group, data = .)


This command takes the teddy_tib tibble, extracts the study based on 20 participants, then applies the ttestBF() function. You'll get a warning about the data being coerced from a tibble to a data frame; the function works with data frames, so it converts your tibble to one. The warning is nothing to worry about.

Within this function we have left the defaults as they are (i.e. a 'medium' prior scale value of 0.707), our formula specifies that we're predicting self_esteem from the variable group (which defines whether the individual hugs a teddy or a box), and we specify the data within the function with a period (data = .). By using a period we're telling the function to use the data coming through from the pipe. Copy this command in the code box below and run the code. Then edit it to get the Bayes factor for the experiment that had 200 participants.



teddy_tib %>%
dplyr::filter(study_n == "Total N = 200") %>%
BayesFactor::ttestBF(formula = self_esteem ~ group, data = .)


The resulting Bayes factor is 1.27, which means that the probability of the data given the alternative hypothesis is about the same as the probability of the data given the null. This value suggests that we should barely change our prior beliefs. By using the default prior we assigned a 50% probability to the effect size (d) lying between −0.7071 and +0.7071, and this Bayes factor tells us not to change this belief.

For the larger sample the Bayes factor is 754768.4, which is huge. The probability of the data given the alternative hypothesis is 754768.4 times greater than the probability of the data given the null. This value suggests that we should shift our prior beliefs towards the alternative hypothesis by a factor of 754768.4. In other words, having looked at the data we should hold a very much stronger belief that the means are different.

Remember that the effect sizes in the two studies were basically identical, so it may seem strange that the Bayes factors are so different. However, if you think about it, Bayesian methods update prior beliefs using the data. A small amount of data will have less impact on beliefs than a huge sample (if the priors are the same). Imagine you're asked the probability that a sports team will win their next game. You guess a 33% chance that they'll win (you assume win, lose or draw are equally likely). Take two scenarios. In the first, you're told that the team won their last game. This might increase your belief in a win, but probably not by much - perhaps that game was a fluke. In the second, you're told that they have won their last 20 games. Your belief will probably shift quite strongly towards a win - they have consistently shown a winning mentality. In both cases you're told that they have won 100% of their past games, but in the first case that claim is based on a tiny amount of data and in the second it's based on a lot of data. This illustrates the principle that (other things being equal) larger data sets have greater influence on beliefs.
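The sports-team intuition can be made concrete with a toy Bayesian update (this example and the posterior_mean helper are illustrative, not part of the tutorial): with a uniform Beta(1, 1) prior on the probability of a win, the posterior mean after w wins in n games is (w + 1)/(n + 2).

```r
# Posterior mean of the win probability under a uniform Beta(1, 1) prior:
# the same evidence (100% wins) moves belief more when backed by more games
posterior_mean <- function(wins, games) (wins + 1) / (games + 2)

posterior_mean(1, 1)    # one win barely moves belief: 2/3
posterior_mean(20, 20)  # twenty straight wins move it a lot: 21/22
```

The prior mean of 0.5 shifts only to 0.667 after one win, but to about 0.955 after twenty, mirroring how the two teddy studies produced such different Bayes factors from similar effect sizes.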

question("What is a Bayes factor?",
answer("The relative probability of the data given the alternative to the data given the null.", correct = TRUE),
answer("The ratio of the probability of the alternative hypothesis given the data to the probability of the null hypothesis given the data.", message = "This statement describes the posterior odds."),
answer("The ratio of the probability of the alternative hypothesis to the probability of the null hypothesis.", message = "This statement describes the prior odds."),
answer("Tell me a terrible and possibly inappropriate joke about Bayesians.", message = "Q. Why do Bayesians make good proctologists? A. Because they're always looking at the posterior."),
correct = "Correct - well done!",
allow_retry = T

)


The tibble zhang_female_tib contains a small random subsample of females from a study that looked at performance on a maths test when the test is taken under the participant's own name, or a fake name. There are two variables:

• name: Specifies whether participants completed the test under their own name or a fake name
• accuracy: Accuracy on the maths test (%)

In the code box, produce some code to:

• Produce an error bar plot of the two group means and their confidence intervals
• Compute Hedges' g for the difference between mean accuracy scores in the two groups
• Compute a Bayes factor for the difference between the group means


zhang_plot <- ggplot2::ggplot(zhang_female_tib, aes(name, accuracy))
zhang_plot +
stat_summary(fun.data = "mean_cl_normal", size = 1) +
coord_cartesian(ylim = c(0,100)) +
scale_y_continuous(breaks = seq(0, 100, 10)) +
labs(x = "Experimental condition", y = "Accuracy on maths test (%)") +
theme_bw()

#to get g:
zhang_female_tib %>%
effsize::cohen.d(formula = accuracy ~ name, data = ., hedges.correction = T)

#to get a Bayes factor:

zhang_female_tib %>%
BayesFactor::ttestBF(formula = accuracy ~ name, data = .)


## Other resources

### Statistics

• The tutorials typically follow examples described in detail in @RN10163, so for most of them there's a thorough account in there. You might also find @RN4832 useful for the R stuff.
• There are free lectures and screencasts on my YouTube channel
• There are free statistical resources on my website www.discoveringstatistics.com