estimateCorpusSize: estimateCorpusSize


View source: R/B04.estimateCorpusSize.R

Description

Estimates corpus size based upon lexical features.

Usage

estimateCorpusSize(korpus, sampleSize = 2000, numSamples = 100)

Arguments

korpus

List containing the corpus meta data

sampleSize

Numeric indicating the sampling unit size in words (default 2000)

numSamples

Numeric indicating the number of samples to analyze (default 100)

Details

This function takes as its parameters the korpus meta data and the POS tags selected for this analysis, then returns an estimate of total corpus size based upon the distribution of lexical features per n000-word samples of the text. This analysis is based upon Biber (1993), "Representativeness in Corpus Design": https://www.researchgate.net/publication/31460364_Representativeness_in_Corpus_Design
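
To make the idea concrete, the following is a minimal sketch of a standard sample-size calculation in the spirit of Biber (1993): the number of samples needed is derived from the variability of a feature count across samples and a tolerable error around its mean. It is not taken from the package source. The helper requiredSamples, its tolerance and confidence arguments, and the simulated featureCounts are all hypothetical names and values chosen for illustration, and the package may compute its estimate differently.

```r
# Illustrative sketch only; NOT the implementation of estimateCorpusSize.
# requiredSamples(), tolerance, confidence and featureCounts are hypothetical.
set.seed(42)

sampleSize <- 2000   # words per sample (the documented default)
numSamples <- 100    # number of samples (the documented default)

# Assumed input: occurrences of one lexical feature (e.g. nouns) in each
# sample. Simulated here because the structure of korpus is not documented.
featureCounts <- rpois(numSamples, lambda = 400)

# Number of samples needed so that the mean feature count is estimated
# within +/- tolerance * mean at the given confidence level:
#   n = (z * s / tolerableError)^2
requiredSamples <- function(counts, tolerance = 0.05, confidence = 0.95) {
  m <- mean(counts)                       # mean count per sample
  s <- sd(counts)                         # standard deviation across samples
  z <- qnorm(1 - (1 - confidence) / 2)    # two-sided normal critical value
  tolerableError <- tolerance * m         # acceptable error, in count units
  ceiling((z * s / tolerableError)^2)
}

n <- requiredSamples(featureCounts)

# Convert the number of samples into a total word count.
estimatedWords <- n * sampleSize
```

In this sketch, a rarer or more variable feature yields a larger n, so an estimate covering several features would presumably be driven by the most demanding one.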

Value

corpusSize List containing:

Author(s)

John James, j2sdatalab@gmail.com

See Also

analyzeLexicalFeatures

Other sample size estimate functions: estimateRegisterSize, estimateSampleSize, estimateSamplingUnit


DataScienceSalon/predictifyR.3.0 documentation built on May 23, 2019, 8:25 p.m.