Two-Sample or Paired-Sample Randomization (Permutation) Test for Location
Description
Perform a two-sample or paired-sample randomization (permutation) test for location based on either means or medians.
Usage
Arguments
x 
numeric vector of observations from population 1.
Missing ( 
y 
numeric vector of observations from population 2.
Missing ( In the case when 
fcn 
character string indicating which location parameter to compare between the two
groups. The possible values are 
alternative 
character string indicating the kind of alternative hypothesis. The possible values
are 
mu1.minus.mu2 
numeric scalar indicating the hypothesized value of the difference between the
means or medians. The default value is 
paired 
logical scalar indicating whether to perform a paired or two-sample permutation
test. The possible values are 
exact 
logical scalar indicating whether to perform the exact permutation test (i.e.,
enumerate all possible permutations) or simply sample from the permutation
distribution. The default value is 
n.permutations 
integer indicating how many times to sample from the permutation distribution when

seed 
positive integer to pass to the R function 
tol 
numeric scalar indicating the tolerance to use for computing the p-value for the
two-sample permutation test. The default value is 
Details
Randomization Tests
In 1935, R.A. Fisher introduced the idea of a randomization test
(Manly, 2007, p. 107; Efron and Tibshirani, 1993, Chapter 15), which is based on
trying to answer the question: “Did the observed pattern happen by chance,
or does the pattern indicate the null hypothesis is not true?” A randomization
test works by simply enumerating all of the possible outcomes under the null
hypothesis, then seeing where the observed outcome fits in. A randomization test
is also called a permutation test, because it involves permuting the
observations during the enumeration procedure (Manly, 2007, p. 3).
In the past, randomization tests have not been used as extensively as they are now
because of the “large” computing resources needed to enumerate all of the
possible outcomes, especially for large sample sizes. The advent of more powerful
personal computers and software has allowed randomization tests to become much
easier to perform. Depending on the sample size, however, it may still be too
time consuming to enumerate all possible outcomes. In this case, the randomization
test can still be performed by sampling from the randomization distribution, and
comparing the observed outcome to this sampled permutation distribution.
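The sampling approach described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the code used by this function: the helper name sampled_perm_pvalue, its arguments, and the data are hypothetical, the statistic is the difference in means, and no floating-point tolerance is applied.

```r
# Hypothetical helper (not part of any package): approximate a one-sided
# upper permutation p-value by sampling random group assignments.
sampled_perm_pvalue <- function(x, y, n_perm = 5000, seed = 1) {
  set.seed(seed)                          # make the sampled permutations reproducible
  pooled <- c(x, y)
  n1 <- length(x)
  observed <- mean(x) - mean(y)
  perm_diffs <- replicate(n_perm, {
    idx <- sample(length(pooled), n1)     # random choice of who is in "group 1"
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  # Proportion of sampled differences at least as large as the observed one
  mean(perm_diffs >= observed)
}

# Made-up data for illustration only
x <- c(12.1, 9.8, 11.4, 13.0, 10.7, 12.6)
y <- c(8.9, 10.2, 9.1, 8.4, 9.9, 10.5, 9.0)
p_upper <- sampled_perm_pvalue(x, y)
p_upper
```

Because the permutations are sampled, the p-value is an estimate whose precision improves as n_perm grows.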
Two-Sample Randomization Test for Location (paired=FALSE)
Let \underline{x} = x_1, x_2, …, x_{n1} be a vector of n1
independent and identically distributed (i.i.d.) observations
from some distribution with location parameter (e.g., mean or median) θ_1,
and let \underline{y} = y_1, y_2, …, y_{n2} be a vector of n2
i.i.d. observations from the same distribution with possibly different location
parameter θ_2.
Consider the test of the null hypothesis that the difference in the location parameters is equal to some specified value:
H_0: δ = δ_0 \;\;\;\;\;\; (1)
where
δ = θ_1 - θ_2 \;\;\;\;\;\; (2)
and δ_0 denotes the hypothesized difference in the measures of location (usually δ_0 = 0).
The three possible alternative hypotheses are the upper one-sided alternative
(alternative="greater")
H_a: δ > δ_0 \;\;\;\;\;\; (3)
the lower one-sided alternative (alternative="less")
H_a: δ < δ_0 \;\;\;\;\;\; (4)
and the two-sided alternative
H_a: δ \ne δ_0 \;\;\;\;\;\; (5)
To perform the test of the null hypothesis (1) versus any of the three alternatives (3)-(5), you can use the two-sample permutation test. The two-sample permutation test is based on trying to answer the question, “Did the observed difference in means or medians happen by chance, or does the observed difference indicate that the null hypothesis is not true?” Under the null hypothesis, the underlying distributions for each group are the same, so it should make no difference which group an observation gets assigned to. The two-sample permutation test works by simply enumerating all possible permutations of group assignments, and for each permutation computing the difference between the measures of location for each group (Manly, 2007, p. 113; Efron and Tibshirani, 1993, p. 202). The measure of location for a group could be the mean, median, or any other measure you want to use.
For example, if the observations from Group 1 are 3 and 5, and the observations from Group 2 are 4, 6, and 7, then there are 10 different ways of splitting these five observations into one group of size 2 and another group of size 3. The table below lists all of the possible group assignments, along with the differences in the group means.
Group 1   Group 2   Mean 1 - Mean 2
3, 4      5, 6, 7   -2.5
3, 5      4, 6, 7   -1.67
3, 6      4, 5, 7   -0.83
3, 7      4, 5, 6    0
4, 5      3, 6, 7   -0.83
4, 6      3, 5, 7    0
4, 7      3, 5, 6    0.83
5, 6      3, 4, 7    0.83
5, 7      3, 4, 6    1.67
6, 7      3, 4, 5    2.5
In this example, the observed group assignments and difference in means are shown in the second row of the table.
For a one-sided upper alternative (Equation (3)), the p-value is computed as the proportion of times that the differences of the means (or medians) in the permutation distribution are greater than or equal to the observed difference in means (or medians). For a one-sided lower alternative hypothesis (Equation (4)), the p-value is computed as the proportion of times that the differences in the means (or medians) in the permutation distribution are less than or equal to the observed difference in the means (or medians). For a two-sided alternative hypothesis (Equation (5)), the p-value is computed as the proportion of times the absolute values of the differences in the means (or medians) in the permutation distribution are greater than or equal to the absolute value of the observed difference in the means (or medians).
For this simple example, the one-sided upper, one-sided lower, and two-sided p-values are 0.9, 0.2, and 0.4, respectively.
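The five-observation example can be checked by brute force in base R. The sketch below is an independent illustration rather than the function's internal code: it uses combn() to enumerate the 10 group assignments, and the variable names and the tolerance value are assumptions made for this example.

```r
pooled <- c(3, 5, 4, 6, 7)     # Group 1 = 3, 5; Group 2 = 4, 6, 7
n1 <- 2                        # size of Group 1
observed <- mean(c(3, 5)) - mean(c(4, 6, 7))

# Each column of combn() holds the indices assigned to Group 1
assignments <- combn(length(pooled), n1)
perm_diffs <- apply(assignments, 2, function(idx) {
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Small tolerance (an assumed value) so that ties with the observed
# difference are not lost to floating-point rounding
tol <- sqrt(.Machine$double.eps)
p_upper <- mean(perm_diffs >= observed - tol)
p_lower <- mean(perm_diffs <= observed + tol)
p_two   <- mean(abs(perm_diffs) >= abs(observed) - tol)

length(perm_diffs)             # 10 possible assignments
c(p_upper, p_lower, p_two)     # 0.9 0.2 0.4
```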
Note: Because of the nature of machine arithmetic and how the permutation
distribution is computed, a one-sided upper p-value is computed as the proportion
of times that the differences of the means (or medians) in the permutation
distribution are greater than or equal to
[the observed difference in means (or medians) - a small tolerance value], where the
tolerance value is determined by the argument tol. Similarly, a one-sided
lower p-value is computed as the proportion of times that the differences in the
means (or medians) in the permutation distribution are less than or equal to
[the observed difference in the means (or medians) + a small tolerance value].
Finally, a two-sided p-value is computed as the proportion of times the absolute
values of the differences in the means (or medians) in the permutation distribution
are greater than or equal to
[the absolute value of the observed difference in the means (or medians) - a small tolerance value].
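A two-line experiment shows why the tolerance matters. In the sketch below, the numbers and the tolerance value are assumptions chosen for illustration: two quantities that are equal on paper differ in double precision, so a plain comparison would drop a tie from the p-value count.

```r
observed  <- 0.2
perm_diff <- 0.3 - 0.1     # equals 0.2 on paper, but is stored slightly below 0.2

tie_missed <- perm_diff >= observed          # FALSE: the tie is not counted
tol <- sqrt(.Machine$double.eps)             # assumed tolerance for this sketch
tie_counted <- perm_diff >= observed - tol   # TRUE: the tie is counted

c(tie_missed, tie_counted)
```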
In this simple example, we assumed the hypothesized difference in the means under the null hypothesis was δ_0 = 0. If we had hypothesized a different value for δ_0, then we would have had to subtract this value from each of the observations in Group 1 before permuting the group assignments to compute the permutation distribution of the differences of the means. As in the case of the one-sample permutation test, if the sample sizes for the groups become too large to compute all possible permutations of the group assignments, the permutation test can still be performed by sampling from the permutation distribution and comparing the observed difference in locations to the sampled permutation distribution of the difference in locations.
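The shift for a nonzero δ_0 can be sketched with the same five observations. The value delta0 = -1 is hypothetical and chosen only for illustration, and the code is a base R sketch rather than the function's own implementation.

```r
x <- c(3, 5)
y <- c(4, 6, 7)
delta0 <- -1                   # hypothetical hypothesized difference

# Subtract delta0 from Group 1, then permute the shifted data as usual
x_shifted <- x - delta0
pooled <- c(x_shifted, y)
n1 <- length(x)
observed <- mean(x_shifted) - mean(y)

perm_diffs <- apply(combn(length(pooled), n1), 2, function(idx) {
  mean(pooled[idx]) - mean(pooled[-idx])
})

tol <- sqrt(.Machine$double.eps)   # assumed tolerance for this sketch
p_two <- mean(abs(perm_diffs) >= abs(observed) - tol)
p_two
```

After the shift, the observed statistic equals (mean of x - mean of y) - δ_0, so the test asks whether the data depart from the hypothesized difference rather than from zero.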
Unlike the two-sample Student's t-test, we do not have to worry about the normality assumption when we use a permutation test. The permutation test still assumes, however, that under the null hypothesis, the distributions of the observations from each group are exactly the same, and under the alternative hypothesis there is simply a shift in location (that is, the whole distribution of group 1 is shifted by some constant relative to the distribution of group 2). Mathematically, this can be written as follows:
F_1(t) = F_2(t - δ), \;\; -∞ < t < ∞ \;\;\;\;\; (6)
where F_1 and F_2 denote the cumulative distribution functions for
group 1 and group 2, respectively. If δ > 0, this implies that the
observations in group 1 tend to be larger than the observations in group 2, and
if δ < 0, this implies that the observations in group 1 tend to be
smaller than the observations in group 2. Thus, the shape and spread (variance)
of the two distributions should be the same whether the null hypothesis is true or
not. Therefore, the Type I error rate for a permutation test can be affected by
differences in variances between the two groups.
Confidence Intervals for the Difference in Means or Medians
Based on the relationship between hypothesis tests and confidence intervals, it is
possible to construct a two-sided or one-sided (1-α)100\% confidence
interval for the difference in means or medians based on the two-sample permutation
test by finding the values of δ_0 that correspond to obtaining a
p-value of α (Manly, 2007, pp. 18–20, 114). A confidence interval
based on the bootstrap, however, will yield a similar type of confidence interval
(Efron and Tibshirani, 1993, p. 214); see the help file for boot
in the R package boot.
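One rough way to invert the test is a grid search over δ_0, keeping the values that are not rejected at level α. The sketch below assumes exact enumeration, a difference in means, made-up data, and a hypothetical helper perm_p_two_sided; the grid search is a stand-in chosen for simplicity, not necessarily the procedure described by Manly (2007), and the resulting endpoints are only as fine as the grid.

```r
# Hypothetical helper: exact two-sided permutation p-value for H0: delta = delta0
perm_p_two_sided <- function(x, y, delta0 = 0, tol = sqrt(.Machine$double.eps)) {
  x_shifted <- x - delta0
  pooled <- c(x_shifted, y)
  n1 <- length(x)
  observed <- mean(x_shifted) - mean(y)
  perm_diffs <- apply(combn(length(pooled), n1), 2, function(idx) {
    mean(pooled[idx]) - mean(pooled[-idx])
  })
  mean(abs(perm_diffs) >= abs(observed) - tol)
}

# Made-up data for illustration only
x <- c(12.1, 9.8, 11.4, 13.0, 10.7, 12.6)
y <- c(8.9, 10.2, 9.1, 8.4, 9.9, 10.5, 9.0)
alpha <- 0.05

# Evaluate the p-value over a grid of hypothesized differences
delta_grid <- seq(-2, 6, by = 0.05)
p_values <- vapply(delta_grid, function(d) perm_p_two_sided(x, y, d), numeric(1))

# Approximate interval: range of delta0 values that are not rejected
ci <- range(delta_grid[p_values > alpha])
ci
```

The interval always contains the observed difference, because the two-sided p-value equals 1 when δ_0 matches it.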
Paired-Sample Randomization Test for Location (paired=TRUE)
When the argument paired=TRUE, the arguments x and y are
assumed to have the same length, and the n1 = n2 = n differences
d_i = x_i - y_i, i = 1, 2, …, n are assumed to be independent
observations from some symmetric distribution with mean μ. The
one-sample permutation test can then be applied
to the differences.
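A common construction for a one-sample permutation test on paired differences, given the symmetry assumption, is to enumerate every pattern of sign flips. The sketch below assumes that construction with a hypothesized mean of 0; the data and variable names are made up, and the details of the package's one-sample test may differ.

```r
# Made-up paired measurements for illustration only
x <- c(10.2, 11.5,  9.8, 12.1, 10.9, 11.8)
y <- c( 9.6, 10.9, 10.0, 11.0, 10.1, 11.1)
d <- x - y
n <- length(d)

# Under symmetry about 0, each difference is equally likely to be + or -,
# so enumerate all 2^n sign patterns (one pattern per row)
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), n)))
perm_means <- as.vector(signs %*% d) / n

observed <- mean(d)
tol <- sqrt(.Machine$double.eps)   # assumed tolerance for this sketch
p_upper <- mean(perm_means >= observed - tol)
p_upper
```

With six pairs there are 64 sign patterns, so the smallest attainable one-sided p-value is 1/64.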
Value
A list of class "permutationTest" containing the results of the hypothesis
test. See the help file for permutationTest.object for details.
Note
A frequent question in environmental statistics is “Is the concentration of chemical X in Area A greater than the concentration of chemical X in Area B?” For example, in groundwater detection monitoring at hazardous and solid waste sites, the concentration of a chemical in the groundwater at a downgradient well must be compared to “background”. If the concentration is “above” the background, then the site enters assessment monitoring. As another example, soil cleanup at a Superfund site may involve comparing the concentration of a chemical in the soil at a “cleaned up” site with the concentration at a “background” site. If the concentration at the “cleaned up” site is “greater” than the background concentration, then further investigation and remedial action may be required. Determining what it means for the chemical concentration to be “greater” than background is a policy decision: you may want to compare averages, medians, 95th percentiles, etc.
Hypothesis tests you can use to compare “location” between two groups include: Student's t-test, Fisher's randomization test (described in this help file), the Wilcoxon rank sum test, other two-sample linear rank tests, the quantile test, and a test based on a bootstrap confidence interval.
Author(s)
Steven P. Millard (EnvStats@ProbStatInfo.com)
References
Efron, B., and R.J. Tibshirani. (1993). An Introduction to the Bootstrap. Chapman and Hall, New York, Chapter 15.
Manly, B.F.J. (2007). Randomization, Bootstrap and Monte Carlo Methods in Biology. Third Edition. Chapman & Hall, New York, Chapter 6.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL, pp. 426–431.
See Also
permutationTest.object, plot.permutationTest, oneSamplePermutationTest,
twoSamplePermutationTestProportion, Hypothesis Tests, boot.
Examples
# Generate 10 observations from a lognormal distribution with parameters
# mean=5 and cv=2, and 20 observations from a lognormal distribution with
# parameters mean=10 and cv=2. Test the null hypothesis that the means of the
# two distributions are the same against the alternative that the mean for
# group 1 is less than the mean for group 2.
# (Note: the call to set.seed allows you to reproduce the same data
# (dat1 and dat2), and setting the argument seed=732 in the call to
# twoSamplePermutationTestLocation() lets you reproduce this example by
# getting the same sample from the permutation distribution).
library(EnvStats)  # provides rlnormAlt() and twoSamplePermutationTestLocation()
set.seed(256)
dat1 <- rlnormAlt(10, mean = 5, cv = 2)
dat2 <- rlnormAlt(20, mean = 10, cv = 2)
test.list <- twoSamplePermutationTestLocation(dat1, dat2,
  alternative = "less", seed = 732)
# Print the results of the test
#
test.list
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis:                 mu.x-mu.y = 0
#
#Alternative Hypothesis:          True mu.x-mu.y is less than 0
#
#Test Name:                       Two-Sample Permutation Test
#                                 Based on Differences in Means
#                                 (Based on Sampling
#                                 Permutation Distribution
#                                 5000 Times)
#
#Estimated Parameter(s):          mean of x =  2.253439
#                                 mean of y = 11.825430
#
#Data:                            x = dat1
#                                 y = dat2
#
#Sample Sizes:                    nx = 10
#                                 ny = 20
#
#Test Statistic:                  mean.x - mean.y = -9.571991
#
#P-value:                         0.001
# Plot the results of the test
#
dev.new()
plot(test.list)
#==========
# The guidance document "Statistical Methods for Evaluating the Attainment of
# Cleanup Standards, Volume 3: Reference-Based Standards for Soils and Solid
# Media" (USEPA, 1994b, pp. 6.22-6.25) contains observations of
# 1,2,3,4-Tetrachlorobenzene (TcCB) in ppb at a Reference Area and a Cleanup Area.
# These data are stored in the data frame EPA.94b.tccb.df. Use the
# two-sample permutation test to test for a difference in means between the
# two areas vs. the alternative that the mean in the Cleanup Area is greater.
# Do the same thing for the medians.
#
# The permutation test based on comparing means shows a significant difference,
# while the one based on comparing medians does not.
# First test for a difference in the means.
#
mean.list <- with(EPA.94b.tccb.df,
  twoSamplePermutationTestLocation(
    TcCB[Area=="Cleanup"], TcCB[Area=="Reference"],
    alternative = "greater", seed = 47))
mean.list
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis:                 mu.x-mu.y = 0
#
#Alternative Hypothesis:          True mu.x-mu.y is greater than 0
#
#Test Name:                       Two-Sample Permutation Test
#                                 Based on Differences in Means
#                                 (Based on Sampling
#                                 Permutation Distribution
#                                 5000 Times)
#
#Estimated Parameter(s):          mean of x = 3.9151948
#                                 mean of y = 0.5985106
#
#Data:                            x = TcCB[Area == "Cleanup"]
#                                 y = TcCB[Area == "Reference"]
#
#Sample Sizes:                    nx = 77
#                                 ny = 47
#
#Test Statistic:                  mean.x - mean.y = 3.316684
#
#P-value:                         0.0206
dev.new()
plot(mean.list)
#
# Now test for a difference in the medians.
#
median.list <- with(EPA.94b.tccb.df,
  twoSamplePermutationTestLocation(
    TcCB[Area=="Cleanup"], TcCB[Area=="Reference"],
    fcn = "median", alternative = "greater", seed = 47))
median.list
#Results of Hypothesis Test
#--------------------------
#
#Null Hypothesis:                 mu.x-mu.y = 0
#
#Alternative Hypothesis:          True mu.x-mu.y is greater than 0
#
#Test Name:                       Two-Sample Permutation Test
#                                 Based on Differences in Medians
#                                 (Based on Sampling
#                                 Permutation Distribution
#                                 5000 Times)
#
#Estimated Parameter(s):          median of x = 0.43
#                                 median of y = 0.54
#
#Data:                            x = TcCB[Area == "Cleanup"]
#                                 y = TcCB[Area == "Reference"]
#
#Sample Sizes:                    nx = 77
#                                 ny = 47
#
#Test Statistic:                  median.x - median.y = -0.11
#
#P-value:                         0.936
dev.new()
plot(median.list)
#==========
# Clean up
#
rm(test.list, mean.list, median.list)
graphics.off()
