
Calculates statistically significant differences in co-occurrence counts.


`A` |
A |

`B` |
A |

`nodes` |
A |

`fdr` |
The desired level at which to control the False Discovery Rate.
Default value is |

`collocates` |
A |

This function implements the method described in Hennessey and Wiegand (2017).

`A` and `B` are `data.frame`s that encapsulate the co-occurrence counts for the `(x, y)` term pairs within a corpus. For a description of the columns, see the Details section of the `surface` function.

The `nodes` essentially act as a filter on the `A$x` and
`B$x` columns. For a description of the use of nodes see
Hennessey and Wiegand (2017).
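As a rough illustration of what "act as a filter" could mean, the sketch below keeps only the rows whose `x` term is one of the nodes. It is not the package's implementation; the toy data and the column names `H` and `M` are assumptions made for the example.

```r
# Toy co-occurrence counts. The columns x, y, H and M are assumed here
# for illustration; see the `surface` documentation for the real layout.
A <- data.frame(
  x = c("back", "back", "eye", "hand"),
  y = c("his", "the", "his", "her"),
  H = c(12L, 7L, 5L, 9L),
  M = c(88L, 93L, 45L, 61L),
  stringsAsFactors = FALSE
)

# Hypothetical node terms of interest.
nodes <- c("back", "eye")

# Keep only the term pairs whose x term is one of the nodes.
A_filtered <- A[A$x %in% nodes, ]
print(A_filtered)
```

The same subsetting would apply to `B$x`, so that both data sets are compared on the same set of node terms.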

`fdr` indicates the level at which the False Discovery Rate will be
controlled. For a description of the form of FDR used, see
Benjamini and Hochberg (1995). For a description of the use of FDR in
this context, see Hennessey and Wiegand (2017). For a description of the
`p_adjusted` column in the returned structure, see `p.adjust`.
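The following sketch shows how raw p-values might be adjusted and thresholded with base R's `p.adjust`. It assumes the `"BH"` method, because Benjamini and Hochberg (1995) is the cited reference; the p-values, the `fdr` value of 0.01 and the use of `<=` for the cut-off are illustrative assumptions, not the package's documented behaviour.

```r
# Illustrative unadjusted p-values, one per (x, y) term pair.
p_value <- c(0.0004, 0.012, 0.03, 0.2, 0.74)

# Illustrative FDR level (an assumption, not the function's default).
fdr <- 0.01

# Benjamini-Hochberg adjustment from the stats package.
p_adjusted <- p.adjust(p_value, method = "BH")

# Term pairs retained when the FDR is controlled at the chosen level.
significant <- p_adjusted <= fdr

print(data.frame(p_value, p_adjusted, significant))
```

With these inputs only the first term pair survives the adjustment, which shows how controlling the FDR can discard results that look significant on their raw p-values.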

The returned data structure is a `data.table`.
A `data.table` is also a `data.frame` and will behave exactly
as such if the `data.table` package is not loaded.

The returned `data.table` contains details of all the
co-occurrences for which there is evidence of a difference in
co-occurrence between the two supplied data sets.
The effect size is calculated as the log base 2 of the odds ratio.
The effect size and its confidence interval are captured in the
`effect_size`, `CI_lower` and `CI_upper` columns.
The `p_value` column contains the non-adjusted p-value from
Fisher's Exact Test.
For more details see Hennessey and Wiegand (2017).
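A minimal sketch of these quantities for a single term pair, using base R's `fisher.test`, is given below. The counts are invented, and several points are assumptions rather than documented behaviour: the reading of the `H_*` and `M_*` columns as the two cells of each corpus's count pair, the orientation of the odds ratio (A relative to B) and the 95% confidence level. Note also that `fisher.test` reports the conditional maximum likelihood estimate of the odds ratio, which may differ slightly from the simple cross-product ratio.

```r
# Invented counts for one (x, y) term pair in corpora A and B.
H_A <- 30L; M_A <- 70L
H_B <- 10L; M_B <- 90L

# 2 x 2 contingency table: one column per corpus.
counts <- matrix(
  c(H_A, M_A, H_B, M_B),
  nrow = 2,
  dimnames = list(c("H", "M"), c("A", "B"))
)

# Fisher's Exact Test (default 95% confidence interval).
test <- fisher.test(counts)

# Effect size and interval on the log2 odds-ratio scale.
effect_size <- log2(unname(test$estimate))
CI_lower <- log2(test$conf.int[1])
CI_upper <- log2(test$conf.int[2])
p_value <- test$p.value

print(c(effect_size = effect_size, CI_lower = CI_lower,
        CI_upper = CI_upper, p_value = p_value))
```

On the log2 scale an effect size of 1 corresponds to a doubling of the odds, and a confidence interval that excludes 0 indicates a difference between the two corpora at the chosen level.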

For an example of usage see the ‘Proof of Concept’ vignette.

A `data.table` of the form

```
Classes ‘data.table’ and 'data.frame': 11 variables:
 $ x          : chr
 $ y          : chr
 $ H_A        : int
 $ M_A        : int
 $ H_B        : int
 $ M_B        : int
 $ effect_size: num
 $ CI_lower   : num
 $ CI_upper   : num
 $ p_value    : num
 $ p_adjusted : num
 - attr(*, "sorted")= chr "x" "y"
 - attr(*, ".internal.selfref")=<externalptr>
 - attr(*, "coco_metadata")=List of 4
  ..$ nodes          : chr
  ..$ fdr            : num
  ..$ PACKAGE_VERSION:Classes 'package_version', 'numeric_version'
  .. ..$ : int
  ..$ date           : Date, format: "2016-11-01"
```
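The `coco_metadata` attribute shown in the structure can be read with base R's `attr`. The sketch below builds a small mock object rather than calling the package, so the values are invented and only three of the four metadata fields are reproduced.

```r
# Mock result carrying a "coco_metadata" attribute. This is a stand-in
# for a real returned object; all values are made up.
result <- data.frame(
  x = "back",
  y = "his",
  effect_size = 1.2,
  stringsAsFactors = FALSE
)
attr(result, "coco_metadata") <- list(
  nodes = c("back"),
  fdr = 0.01,
  date = as.Date("2016-11-01")
)

# Retrieve the metadata list and inspect individual fields.
meta <- attr(result, "coco_metadata")
print(meta$nodes)
print(meta$fdr)
print(format(meta$date))
```

Keeping the analysis parameters alongside the results in this way makes it possible to check later which nodes and FDR level produced a given table.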

Y. Benjamini and Y. Hochberg (1995) *Controlling the False Discovery
Rate: A Practical and Powerful Approach to Multiple Testing.*
Journal of the Royal Statistical Society. Series B (Methodological)
**57**(1), 289–300.

A. Hennessey, V. Wiegand, C. R. Tench and M. Mahlberg (2017)
*Comparing co-occurrences between corpora.* In preparation.

ravingmantis/CorporaCoCo documentation built on March 19, 2018, 9:08 a.m.
