estimateSamplingUnit
Estimates the sampling unit for corpus sampling
estimateSamplingUnit(korpus, sampleSizes = c(100, 500, 1000, 2000),
  numSamples = 30)
korpus | List containing the metadata for the corpus
sampleSizes | Integer vector of sample sizes to be evaluated
numSamples | Integer indicating the number of samples to evaluate
Using the korpus metadata and the POS tags selected for this analysis, this function compares the distributions of lexical features across pairs of samples of varying sizes. For each selected feature, the results of the chi-squared tests are averaged over the samples. The function returns the average chi-squared p-values for each feature and sampling unit size.
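The comparison described above can be sketched in base R. This is a minimal illustration and not the package's implementation: the simulated tag vector `corpusTags`, the helpers `pairPValues` and `averagePValues`, and the choice of a 2 x 2 test per feature (feature versus all other tokens) are assumptions made for the example.

```r
# Hypothetical stand-in for a POS-tagged corpus: one tag per token.
set.seed(42)
features <- c("NOUN", "VERB", "ADJ", "ADV")
corpusTags <- sample(features, size = 50000, replace = TRUE,
                     prob = c(0.4, 0.3, 0.2, 0.1))

# Draw one pair of samples and test each feature with a 2 x 2 chi-squared test
# (assumed design: count of the feature vs. count of all other tokens).
pairPValues <- function(tokens, features, sampleSize) {
  a <- sample(tokens, sampleSize)
  b <- sample(tokens, sampleSize)
  vapply(features, function(f) {
    counts <- rbind(
      c(sum(a == f), sampleSize - sum(a == f)),
      c(sum(b == f), sampleSize - sum(b == f))
    )
    suppressWarnings(chisq.test(counts)$p.value)
  }, numeric(1))
}

# Average the p-values over numSamples pairs for every sample size.
averagePValues <- function(tokens, features, sampleSizes, numSamples = 30) {
  rows <- lapply(sampleSizes, function(n) {
    # One column of p-values per pair; matrix() keeps the shape for one feature.
    p <- matrix(replicate(numSamples, pairPValues(tokens, features, n)),
                nrow = length(features))
    data.frame(sampleSize = n, feature = features, meanPValue = rowMeans(p))
  })
  do.call(rbind, rows)
}

result <- averagePValues(corpusTags, features,
                         sampleSizes = c(100, 500, 1000, 2000))
print(result)
```

Because both samples in each pair come from the same simulated distribution, the averaged p-values here only show the mechanics; with real corpus data, the values indicate how consistently samples of a given size reproduce each feature's frequency.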
analysis | A list containing:
sampleSize | Sample size being tested
scores(long) | Long data frame of chi-squared scores at various sample sizes
scores(wide) | Wide data frame of chi-squared scores at various sample sizes
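To illustrate how a long and a wide layout of the same scores relate, the following sketch reshapes a small long data frame with base R's `reshape`. The column names `feature`, `sampleSize`, and `score`, and the numbers themselves, are placeholders invented for the example; the actual columns returned by the function may differ.

```r
# Long layout: one row per feature and sample size (illustrative values only).
long <- data.frame(
  feature    = rep(c("NOUN", "VERB"), times = 2),
  sampleSize = rep(c(100, 500), each = 2),
  score      = c(0.41, 0.38, 0.52, 0.47)
)

# Wide layout: one row per feature, one score column per sample size.
wide <- reshape(long, idvar = "feature", timevar = "sampleSize",
                direction = "wide")
print(wide)
```

The long form is convenient for plotting scores against sample size, while the wide form makes it easier to compare sample sizes side by side for a given feature.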
John James, j2sdatalab@gmail.com
analyzeLexicalFeatures
text2spc.fnc
lnre
lnre.spc
N
V
EV
chisq.test
Other sample size estimate functions: estimateCorpusSize, estimateRegisterSize, estimateSampleSize