This function provides four graph-based two-sample tests for discrete data.
g.tests_discrete(E, counts, test.type = "all", maxtype.kappa = 1.14, perm = 0)
E |
An edge matrix representing a similarity graph on the distinct values. The matrix has 2 columns and one row per edge in the similarity graph. Each row records the subject indices of the two ends of an edge. |
counts |
A K by 2 matrix, where K is the number of distinct values. It specifies the counts in the K distinct values for the two samples. |
test.type |
The default value is "all", which means all four tests are performed: the original edge-count test (Chen and Zhang (2013)), the extension of the generalized edge-count test (Chen and Friedman (2016)), the extension of the weighted edge-count test (Chen, Chen and Su (2016)) and the extension of the maxtype edge-count tests (Zhang and Chen (2017)). Set this value to "original" or "o" to perform only the original edge-count test; to "generalized" or "g" to perform only the extension of the generalized edge-count test; to "weighted" or "w" to perform only the extension of the weighted edge-count test; and to "maxtype" or "m" to perform only the extension of the maxtype edge-count tests. |
maxtype.kappa |
The value of the parameter kappa in the extension of the maxtype edge-count tests. The default value is 1.14. |
perm |
The number of permutations performed to calculate the p-value of the test. The default value is 0, which means no permutation is performed and only the approximate p-value based on asymptotic theory is provided. Permutation can be time-consuming, so be cautious when setting this value above 10,000. |
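As a rough illustration of the shapes described above, the following base R sketch builds a toy `counts` matrix and edge matrix `E` by hand. The data, the path-shaped graph, and the reading that the indices in `E` refer to rows of `counts` are assumptions made for illustration; in practice `E` would usually come from `getGraph`, as in the examples further down.

```r
# Hypothetical toy data: K = 4 distinct values, two samples.
# Row k holds the counts of distinct value k in sample 1 and sample 2.
counts <- matrix(c(5, 1,
                   3, 2,
                   2, 4,
                   0, 6),
                 ncol = 2, byrow = TRUE)

# Hypothetical similarity graph on the 4 distinct values: a path 1 - 2 - 3 - 4.
# One row per edge, two columns holding the indices of the edge's end points.
E <- matrix(c(1, 2,
              2, 3,
              3, 4),
            ncol = 2, byrow = TRUE)

# Basic shape checks (assumes the indices in E refer to rows of counts).
K <- nrow(counts)
stopifnot(ncol(counts) == 2,
          ncol(E) == 2,
          all(E >= 1),
          all(E <= K))

# With the package loaded, the call would then take the form:
# g.tests_discrete(E, counts, test.type = "g")
```

The final call is left commented out because the toy data are only meant to show the expected input shapes, not to produce meaningful test results.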
test.statistic_a |
The test statistic when the 'average' method is used to construct the graph. |
test.statistic_u |
The test statistic when the 'union' method is used to construct the graph. |
pval.approx_a |
The approximate p-value based on asymptotic theory when the 'average' method is used to construct the graph. |
pval.approx_u |
The approximate p-value based on asymptotic theory when the 'union' method is used to construct the graph. |
pval.perm_a |
The permutation p-value when the 'average' method is used to construct the graph; returned when argument 'perm' is positive. |
pval.perm_u |
The permutation p-value when the 'union' method is used to construct the graph; returned when argument 'perm' is positive. |
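The returned object is a nested list with one element per test, as the example output on this page shows. The sketch below does not call the package: it uses a mock list `res` whose values are copied from the first example output, and shows one possible way to collect the approximate p-values into a small table.

```r
# Mock of the returned list; values copied from the first example output on this page.
res <- list(
  original    = list(test.statistic_a = -1.296305, pval.approx_a = 0.09743521,
                     test.statistic_u = -1.043946, pval.approx_u = 0.1482552),
  generalized = list(test.statistic_a = 5.794162,  pval.approx_a = 0.05518408,
                     test.statistic_u = 17.08936,  pval.approx_u = 0.0001945777),
  weighted    = list(test.statistic_a = 2.401853,  pval.approx_a = 0.008156133,
                     test.statistic_u = 4.130076,  pval.approx_u = 1.813213e-05),
  maxtype     = list(test.statistic_a = 2.738112,  pval.approx_a = 0.01428503,
                     test.statistic_u = 4.708287,  pval.approx_u = 2.063016e-05)
)

# A single value is reached with nested `$` access.
res$generalized$pval.approx_u

# One row per test, one column per graph construction method.
pvals <- t(sapply(res, function(x) c(average = x$pval.approx_a,
                                     union   = x$pval.approx_u)))
print(pvals)

# Tests whose approximate p-value is below 0.05 under both methods.
sig <- names(which(apply(pvals < 0.05, 1, all)))
print(sig)
```

When `perm` is positive, the same pattern should extend to `pval.perm_a` and `pval.perm_u`.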
Friedman, J. and Rafsky, L. Multivariate generalizations of the Wald-Wolfowitz and Smirnov two-sample tests. The Annals of Statistics, 7(4):697-717, 1979.
Chen, H. and Zhang, N. R. Graph-based tests for two-sample comparisons of categorical data. Statistica Sinica, 2013.
Chen, H. and Friedman, J. H. A new graph-based two-sample test for multivariate and object data. Journal of the American Statistical Association, 2016.
Chen, H., Chen, X. and Su, Y. A weighted edge-count two sample test for multivariate and object data. Journal of the American Statistical Association, 2017.
Zhang, J. and Chen, H. Graph-based two-sample tests for discrete data.
# the "example_discrete" data contains three two-sample count data sets
# represented in matrix form: counts1, counts2, counts3
# and the corresponding distance matrix on the distinct values: ds1, ds2, ds3.
data(example_discrete)
# counts1 is a K by 2 matrix, where K is the number of distinct values.
# It specifies the counts in the K distinct values for the two samples.
# ds1 is the corresponding distance matrix on the distinct values.
# The data is generated from two samples with mean shift.
Knnl = 3
E1 = getGraph(counts1, ds1, Knnl, graph = "nnlink")
g.tests_discrete(E1, counts1)
# counts2 is a K by 2 matrix, where K is the number of distinct values.
# It specifies the counts in the K distinct values for the two samples.
# ds2 is the corresponding distance matrix on the distinct values.
# The data is generated from two samples with spread difference.
Kmst = 6
E2 = getGraph(counts2, ds2, Kmst, graph = "mstree")
g.tests_discrete(E2, counts2)
# counts3 is a K by 2 matrix, where K is the number of distinct values.
# It specifies the counts in the K distinct values for the two samples.
# ds3 is the corresponding distance matrix on the distinct values.
# The data is generated from two samples with mean shift and spread difference.
Knnl = 3
E3 = getGraph(counts3, ds3, Knnl, graph = "nnlink")
g.tests_discrete(E3, counts3)
## Uncomment the following lines to get permutation p-values with 300 permutations.
# Knnl = 3
# E1 = getGraph(counts1, ds1, Knnl, graph = "nnlink")
# g.tests_discrete(E1, counts1, test.type = "all", maxtype.kappa = 1.31, perm = 300)
$original
$original$test.statistic_a
[1] -1.296305
$original$pval.approx_a
[1] 0.09743521
$original$test.statistic_u
[1] -1.043946
$original$pval.approx_u
[1] 0.1482552
$generalized
$generalized$test.statistic_a
[1] 5.794162
$generalized$pval.approx_a
[1] 0.05518408
$generalized$test.statistic_u
[1] 17.08936
$generalized$pval.approx_u
[1] 0.0001945777
$weighted
$weighted$test.statistic_a
[1] 2.401853
$weighted$pval.approx_a
[1] 0.008156133
$weighted$test.statistic_u
[1] 4.130076
$weighted$pval.approx_u
[1] 1.813213e-05
$maxtype
$maxtype$test.statistic_a
[1] 2.738112
$maxtype$pval.approx_a
[1] 0.01428503
$maxtype$test.statistic_u
[1] 4.708287
$maxtype$pval.approx_u
[1] 2.063016e-05
$original
$original$test.statistic_a
[1] -1.103914
$original$pval.approx_a
[1] 0.1348153
$original$test.statistic_u
[1] 2.157404
$original$pval.approx_u
[1] 0.9845129
$generalized
$generalized$test.statistic_a
[1] 2.683046
$generalized$pval.approx_a
[1] 0.2614471
$generalized$test.statistic_u
[1] 14.12116
$generalized$pval.approx_u
[1] 0.0008582815
$weighted
$weighted$test.statistic_a
[1] 0.3813054
$weighted$pval.approx_a
[1] 0.3514883
$weighted$test.statistic_u
[1] 0.5212234
$weighted$pval.approx_u
[1] 0.3011056
$maxtype
$maxtype$test.statistic_a
[1] 1.593001
$maxtype$pval.approx_a
[1] 0.1832904
$maxtype$test.statistic_u
[1] 3.721489
$maxtype$pval.approx_u
[1] 0.000746299
$original
$original$test.statistic_a
[1] -1.318211
$original$pval.approx_a
[1] 0.09371657
$original$test.statistic_u
[1] -0.6559303
$original$pval.approx_u
[1] 0.2559345
$generalized
$generalized$test.statistic_a
[1] 6.955576
$generalized$pval.approx_a
[1] 0.03087563
$generalized$test.statistic_u
[1] 10.0928
$generalized$pval.approx_u
[1] 0.006432458
$weighted
$weighted$test.statistic_a
[1] 2.588199
$weighted$pval.approx_a
[1] 0.004823963
$weighted$test.statistic_u
[1] 3.110183
$weighted$pval.approx_u
[1] 0.0009348578
$maxtype
$maxtype$test.statistic_a
[1] 2.950547
$maxtype$pval.approx_a
[1] 0.007980781
$maxtype$test.statistic_u
[1] 3.545608
$maxtype$pval.approx_u
[1] 0.001326199
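The printed statistics and approximate p-values can be cross-checked with base R. The sketch below takes the 'average' values from the first output block and recomputes three of the p-values under assumed reference distributions: a lower-tail standard normal for the original test, an upper-tail chi-square with 2 degrees of freedom for the generalized test, and an upper-tail standard normal for the weighted test. These tail conventions are inferred from the printed numbers only, not taken from the package source; the maxtype p-value is not covered here.

```r
# Approximate p-values copied from the first output block ('average' graph).
reported <- c(original    = 0.09743521,
              generalized = 0.05518408,
              weighted    = 0.008156133)

# Recompute from the printed test statistics under the assumed distributions.
recomputed <- c(
  original    = pnorm(-1.296305),                              # lower-tail normal
  generalized = pchisq(5.794162, df = 2, lower.tail = FALSE),  # upper-tail chi-square, 2 df
  weighted    = pnorm(2.401853, lower.tail = FALSE)            # upper-tail normal
)

# Side-by-side comparison of reported and recomputed values.
print(cbind(reported, recomputed))
```

Under this reading, a strongly negative original statistic, or a large generalized or weighted statistic, is evidence against the two samples coming from the same distribution.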