View source: R/multivariance-functions.R
Determines the dependence structure as described in [3].
x | matrix, each row of the matrix is treated as one sample
vec | vector, it indicates which columns are initially treated together as one sample
verbose | boolean, if
detection.aim |
type | the method used for the detection, one of '
structure.type | either the '
c.factor | numeric, larger than 0, a constant factor used in the case of '
list.cdm | not required, the list of doubly centered distance matrices corresponding to
alpha | numeric between 0 and 1, the significance level used for the tests
p.adjust.method | a string indicating the p-value adjustment for multiple testing, see
stop.too.many | numeric, upper limit for the number of tested tuples. A warning is issued if it is used. Use
... | these are passed to
Performs the detection of the dependence structure as described in [3]. In the clustered structure, variables are clustered and treated as one variable as soon as a dependence is detected; the full structure always treats each variable separately. The detection is based either on tests with significance level alpha or on a consistent estimator. The latter yields, in the limit for increasing sample size and under very mild conditions, always the correct dependence structure (but the convergence might be very slow).
If fixed.rejection.level is not provided, the significance level alpha is used to determine which multivariances are significant using the distribution-free rejection level. By default the Holm method is used to correct the p-values for multiple testing.
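As a rough illustration of the p-value correction mentioned above, the following base R sketch applies the Holm method via p.adjust to a handful of made-up p-values and compares the result with alpha. The p-values and variable names are illustrative assumptions; this is not the package's internal code.

```r
# Hypothetical raw p-values of four tests (illustrative only)
p.raw <- c(0.01, 0.04, 0.03, 0.005)
alpha <- 0.05

# Holm adjustment as provided by stats::p.adjust
p.holm <- p.adjust(p.raw, method = "holm")

# A test counts as significant if its adjusted p-value is below alpha
significant <- p.holm < alpha

print(p.holm)
print(significant)
```

Note that the second and third tests have raw p-values below alpha but are no longer significant after the adjustment.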
The resulting graph can be simplified (pairwise dependence can be represented by edges instead of vertices) using clean.graph.
Advanced:
The argument detection.aim is currently only implemented for structure.type = clustered. It can be used to check whether an expected dependence structure was detected. This might be useful for simulation studies to determine the empirical power of the detection algorithm. For this, detection.aim is set to a list of vectors which indicate the expected detected dependence structures (one for each run of find.cluster). The first element of each vector is the k for which k-tuples are detected (for this aim the detection stops without success if no k-tuple is found); the other elements indicate to which clusters all present vertices belong after the detection. For example, c(3,2,2,1,2,1,1,2,1) expects that 3-tuples are detected and that the graph contains 8 vertices (including those representing the detected 3 dependencies); the order of the 2's and 1's indicates which vertices belong to which cluster. If detection.aim is provided, the vector representing the actual detection is printed, so one can copy and paste the output to fix the expected detection aims successively.
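To make the layout of a detection aim vector more concrete, the following base R sketch takes the example vector from the paragraph above apart. The variable names are illustrative assumptions and the code only mirrors the description given here; it is not how the package processes the vector internally.

```r
# Example aim vector taken from the text above
aim <- c(3, 2, 2, 1, 2, 1, 1, 2, 1)

k <- aim[1]            # order of the tuples expected to be detected
membership <- aim[-1]  # cluster label of each vertex after the detection

n.vertices <- length(membership)

# Vertex indices grouped by the cluster they are expected to belong to
clusters <- split(seq_along(membership), membership)

print(k)
print(n.vertices)
print(clusters)
```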
Note that a failed detection might invoke a warning.
returns a list with elements:
multivariances: calculated multivariances,
cdms: calculated doubly centered distance matrices,
graph: graph representing the dependence structure,
detected: boolean, this is only included if a detection.aim is given,
number.of.dep.tuples: vector, with the number of dependent tuples for each tested order. For the full dependence structure a value of -1 indicates that all tuples of this order are already lower order dependent, a value of -2 indicates that there were more than stop.too.many tuples,
structure.type: either clustered or full,
type: the type of p-value estimation or consistent estimation used,
total.number.of.tests: numeric vector, with the number of tests for each group of tests,
typeI.error.prob: estimated probability of a type I error,
alpha: significance level used if a p-value estimation procedure is used,
c.factor: factor used if a consistent estimation procedure is used,
parameter.range: significance levels (or 'c.factor' values) which yield the same detection result.
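A minimal sketch of how the returned list might be inspected is given below. It assumes that the multivariance package is installed (otherwise nothing is run) and uses coins(100) as in the examples of this page; which elements are present and what they contain depends on the arguments and on the random sample.

```r
# Assumes the 'multivariance' package is installed; otherwise nothing is run
if (requireNamespace("multivariance", quietly = TRUE)) {
  x <- multivariance::coins(100)  # sample data as used in the examples
  res <- multivariance::dependence.structure(x, verbose = FALSE)

  print(names(res))                # elements of the returned list
  print(res$structure.type)        # structure type that was used
  print(res$number.of.dep.tuples)  # dependent tuples per tested order
}
```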
For the theoretic background see the reference [3] given on the main help page of this package: multivariance-package.
# structures for the datasets included in the package
dependence.structure(dep_struct_several_26_100)
dependence.structure(dep_struct_star_9_100)
dependence.structure(dep_struct_iterated_13_100)
dependence.structure(dep_struct_ring_15_100)
# basic examples:
x = coins(100) # 3-dependent
dependence.structure(x)
colnames(x) = c("A","B","C")
dependence.structure(x) # names of variables are used as labels
dependence.structure(coins(100),vec = c(1,1,2))
# 3-dependent rv of which the first two rv are used together as one rv, thus 2-dependence.
dependence.structure(x,vec = c(1,1,2)) # names of variables are used as labels
dependence.structure(cbind(coins(200),coins(200,k=5)),verbose = TRUE)
#1,2,3 are 3-dependent, 4,..,9 are 6-dependent
# similar to the previous example, but
# the pair 1,3 is treated as one sample,
# analogously the pair 2,4. In the resulting structure one can no
# longer see that the dependence of 1,2,3,4 with the rest is due
# to 4.
dependence.structure(cbind(coins(200),coins(200,k=5)),
vec = c(1,2,1,2,3,4,5,6,7),verbose = TRUE)
### Advanced:
# How to check the empirical power of the detection algorithm?
# Use a dataset for which the structure is detected, e.g. dep_struct_several_26_100.
# run:
dependence.structure(dep_struct_several_26_100,
detection.aim = list(c(ncol(dep_struct_several_26_100))))
# The output provides the first detection aim. Now we run the same line with the added
# detection aim
dependence.structure(dep_struct_several_26_100,detection.aim = list(c(3,1, 1, 1, 2, 2, 2, 3, 4,
5, 6, 7, 8, 8, 8, 9, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 1, 2, 8, 9),
c(ncol(dep_struct_several_26_100))))
# and get the next detection aim ... thus we finally obtain all detection aims.
# now we can run the code with new sample data ....
N = 100
dependence.structure(cbind(coins(N,2),tetrahedron(N),coins(N,4),tetrahedron(N),
tetrahedron(N),coins(N,3),coins(N,3),rnorm(N)),
detection.aim = list(c(3,1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 8,
9, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 1, 2, 8, 9),
c(4,1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11,
11, 12, 1, 2, 8, 9, 10, 11),
c(5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 1,
2, 4, 5, 6, 7, 3),
c(5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 1,
2, 4, 5, 6, 7, 3)))$detected
# ... and one could start to store the results and compute the rate of successes.
# ... or one could try to check how many samples are necessary for the detection:
re = numeric(100)
for (i in 2:100) {
re[i] =
dependence.structure(dep_struct_several_26_100[1:i,],verbose = FALSE,
detection.aim = list(c(3,1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8,
8, 8, 9, 9, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 1, 2, 8, 9),
c(4,1, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 10, 11, 11,
11, 11, 12, 1, 2, 8, 9, 10, 11),
c(5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7,
8, 1, 2, 4, 5, 6, 7, 3),
c(5, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7,
8, 1, 2, 4, 5, 6, 7, 3)))$detected
print(paste("First", i,"samples. Detected?", re[i]==1))
}
cat(paste("Given the 1 to k'th row the structure is not detected for k =",which(re == FALSE),"\n"))
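The idea of storing the results and computing the rate of successes, mentioned in the comments above, could be sketched as follows. The helper empirical.power and its arguments are hypothetical names introduced only for this illustration; the sketch uses base R and a stand-in for the detection run, so it does not call the package itself.

```r
# Hypothetical helper: repeats a detection run and summarises the success rate.
# 'run.once' is any function without arguments that returns TRUE or FALSE.
empirical.power <- function(run.once, n.runs = 100) {
  detected <- vapply(seq_len(n.runs),
                     function(i) isTRUE(as.logical(run.once())),
                     logical(1))
  list(
    rate = mean(detected),                                 # share of successes
    conf.int = binom.test(sum(detected), n.runs)$conf.int  # exact binomial CI
  )
}

# Stand-in for a real detection run (always succeeds here). With the package
# one might instead pass something like
#   function() dependence.structure(new.sample(), verbose = FALSE,
#                                   detection.aim = aims)$detected
# where 'new.sample' and 'aims' are placeholders to be supplied by the user.
res <- empirical.power(function() TRUE, n.runs = 20)
print(res$rate)
print(res$conf.int)
```

Passing the detection run as a function keeps the bookkeeping separate from the data generation, so the same helper can be reused for different sample sizes or detection aims.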