manual.md
In countfitteR: Comprehensive Automatized Evaluation of Distribution Models for Count Data

countfitteR

General tips

Charts

Each chart may be saved in the .svg format. A download button is always located at the top of a chart.

Tables

Tables may be downloaded in a specified format or printed using the buttons at the top of the table.

Data upload

The data format suitable for upload is a .csv file in which each column corresponds to a single count, that is, one set of raw count values. Use raw counts instead of a pivot table. If your file does not have a header row, indicate this using the "Header" checkbox. In this case, countfitteR will automatically name your counts with the index number of a count prefixed by 'C'.
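If the data is only available as a pivot table (count value and its frequency), it has to be expanded back into raw counts before upload. The following is a minimal sketch, not part of countfitteR: the helper names `pivot_to_raw` and `write_counts_csv`, the column names and the data are hypothetical. Padding shorter columns with empty cells is an assumption; this manual does not state how the application treats columns of unequal length.

```python
import csv
import io


def pivot_to_raw(pivot):
    """Expand a pivot table {count value: frequency} into raw counts."""
    raw = []
    for value in sorted(pivot):
        raw.extend([value] * pivot[value])
    return raw


def write_counts_csv(columns, handle):
    """Write one CSV column per count; shorter columns get empty cells."""
    names = list(columns)
    n_rows = max(len(columns[name]) for name in names)
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(names)  # header row
    for i in range(n_rows):
        writer.writerow(
            [columns[name][i] if i < len(columns[name]) else "" for name in names]
        )


# Hypothetical data: two experiments summarized as pivot tables.
columns = {
    "exp1": pivot_to_raw({0: 2, 1: 1, 3: 1}),
    "exp2": pivot_to_raw({1: 1, 2: 1}),
}
buffer = io.StringIO()
write_counts_csv(columns, buffer)
print(buffer.getvalue())
```

To write a real file, pass `open("counts.csv", "w", newline="")` instead of the in-memory buffer.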

Count data

This panel contains descriptive statistics and summaries of the input data. It consists of three subpanels:

1. Count table: the input dataset in tabular format, allowing sanity checks and manual modification.
2. Summary: summary statistics of the input data (in order: mean, standard deviation, median, median absolute deviation, minimum, maximum, number of elements).
3. Distribution: a bar chart of the count distribution followed by a pivot table.
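The summary statistics listed above can be reproduced outside the application for a quick cross-check. This is a sketch under assumptions: it uses the sample standard deviation (n - 1 denominator) and the unscaled median absolute deviation. R's `mad()` multiplies by 1.4826 by default, so the application's value may differ by that factor; the manual does not say which variant is reported. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(counts):
    """Return the summary statistics for one column of raw counts."""
    med = statistics.median(counts)
    # Unscaled MAD: median of absolute deviations from the median.
    mad = statistics.median([abs(x - med) for x in counts])
    return {
        "mean": statistics.mean(counts),
        "sd": statistics.stdev(counts),  # sample SD, n - 1 denominator
        "median": med,
        "mad": mad,
        "min": min(counts),
        "max": max(counts),
        "n": len(counts),
    }


summary = summarize([0, 1, 1, 2, 6])  # hypothetical counts
print(summary)
```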

Fitted models

- Mean value estimates: the estimated mean ($\lambda$) and its confidence interval. The BIC indicates the most appropriate distribution (the lower the BIC, the better the model).
- Coefficients: coefficients of the fitted models.
- Decision: the most appropriate model and the strength of the evidence (Kass and Raftery, 1995).
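The model choice described above can be illustrated with a small sketch. It computes the BIC from a log-likelihood and labels a BIC difference with the evidence categories of Kass and Raftery (1995), which are defined for twice the log Bayes factor; a BIC difference approximates that quantity. Whether countfitteR uses exactly these thresholds is an assumption, and the function names and the BIC values in the usage lines are hypothetical.

```python
import math


def poisson_loglik(counts, lam):
    """Log-likelihood of raw counts under a Poisson model with mean lam > 0."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1) for k in counts)


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better model."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def evidence_strength(delta_bic):
    """Kass and Raftery (1995) category for an absolute BIC difference."""
    d = abs(delta_bic)
    if d < 2:
        return "not worth more than a bare mention"
    if d < 6:
        return "positive"
    if d < 10:
        return "strong"
    return "very strong"


# Hypothetical BIC values of two fitted models.
bic_poisson, bic_nb = 412.3, 404.9
best = "NB" if bic_nb < bic_poisson else "Poisson"
print(best, evidence_strength(bic_poisson - bic_nb))
```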

Compare distributions

The bar charts represent the theoretical counts under the chosen distribution. Red dots mark the observed number of counts. The visualized data is also available in tabular format.
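The comparison of theoretical and observed counts can be sketched for the Poisson case. The example assumes a maximum-likelihood fit, for which the Poisson estimate of $\lambda$ is the sample mean, and computes the theoretical count of each value as the sample size times the pmf. It is an illustration only, not the application's code; `observed_vs_expected` is a hypothetical name.

```python
import math
from collections import Counter


def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def observed_vs_expected(counts):
    """Return (value, observed count, theoretical count) for each value."""
    n = len(counts)
    lam = sum(counts) / n  # maximum-likelihood estimate of the Poisson mean
    observed = Counter(counts)
    return [
        (k, observed.get(k, 0), n * poisson_pmf(k, lam))
        for k in range(max(counts) + 1)
    ]


for value, obs, theo in observed_vs_expected([0, 0, 1, 1, 2]):
    print(value, obs, round(theo, 3))
```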

Settings

Separate experiments: if this checkbox is marked, countfitteR assumes that the experiments are separate and fits the distributions to each of them individually and independently. Otherwise, all counts are fitted simultaneously to a single model. The latter is more appropriate when all samples can be assumed to follow the same distribution, for example in the case of technical replicates.

Confidence level: the confidence level used to compute the confidence intervals.

Count distributions: the distributions fitted to the data: Poisson, Negative Binomial (NB), Zero-Inflated Poisson (ZIP), Zero-Inflated Negative Binomial (ZINB). The available distributions are described further in the manual.
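The effect of the "Separate experiments" setting can be sketched with the simplest model, a Poisson fit, whose maximum-likelihood estimate of $\lambda$ is the sample mean. Treating the single-model mode as pooling all counts is an assumption made for illustration; the function names and the data are hypothetical and this is not the package's fitting routine.

```python
def poisson_mle(counts):
    """Maximum-likelihood estimate of the Poisson mean: the sample mean."""
    return sum(counts) / len(counts)


def fit_separately(experiments):
    """One estimate per experiment, each fitted independently."""
    return {name: poisson_mle(counts) for name, counts in experiments.items()}


def fit_pooled(experiments):
    """A single estimate from all counts taken together."""
    pooled = [x for counts in experiments.values() for x in counts]
    return poisson_mle(pooled)


experiments = {"C1": [0, 1, 2], "C2": [4, 6]}  # hypothetical counts
print(fit_separately(experiments))
print(fit_pooled(experiments))
```

When the separate estimates differ as much as in this example, a single shared model is unlikely to describe both experiments well.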

Overdispersion

One of the important features of the Poisson distribution is the equality of variance and expected value. Although count data is commonly assumed to be Poisson-distributed, we often encounter overdispersed datasets, in which the variance is bigger than the mean. Three distributions included in countfitteR model overdispersed counts: Zero-Inflated Poisson (ZIP), Negative Binomial (NB) and Zero-Inflated Negative Binomial (ZINB).

Overdispersion may be caused by increased variability of the counts, for example when a counting algorithm both under- and overcounts. In such a situation, the data might follow the NB distribution. The other cause of overdispersion is called zero-inflation and occurs in datasets where some factor introduces faulty zeros, meaning that some counts are recorded as zeros regardless of their real state. In this case, the data follow the ZIP distribution. If both faulty zeros and increased variance affect the data, the ZINB distribution is the most appropriate.
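A quick way to look for the two causes described above is to compare the sample variance with the sample mean and the observed fraction of zeros with the fraction a Poisson model would predict. This is a rough diagnostic sketch, not a substitute for the model comparison performed by the application; the function names and the data are hypothetical.

```python
import math
import statistics


def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 suggest overdispersion."""
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("dispersion index is undefined when all counts are 0")
    return statistics.variance(counts) / mean


def excess_zero_fraction(counts):
    """Observed zero fraction minus the Poisson prediction exp(-mean)."""
    mean = statistics.mean(counts)
    observed = sum(1 for x in counts if x == 0) / len(counts)
    return observed - math.exp(-mean)


counts = [0, 0, 0, 0, 5, 5]  # hypothetical counts with many zeros
print(dispersion_index(counts), excess_zero_fraction(counts))
```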

Overdispersed count data distributions

Parameters:

- $\lambda$ - Poisson parameter (average number of foci per cell).
- $r$ - zero inflation (fraction of cells treated by the system as having no foci regardless of their real state).
- $\theta$ - dispersion parameter.

Usually the NB distribution is parameterized using $\mu$ and $\theta$, but to make the comparison clearer, we use $\lambda$ instead of $\mu$. In this parameterization, NB and ZINB are treated as mixtures of Poisson and Gamma ($\Gamma$) distributions.

Distribution name | pmf
-------------------|-------------
Poisson | $$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
Zero-inflated Poisson | $$P(X = k) = \begin{cases} r + (1 - r) e^{-\lambda}, & \text{if } k = 0 \\ (1 - r) \frac{\lambda^k e^{-\lambda}}{k!}, & \text{if } k = 1, 2, \ldots \end{cases}$$
Negative Binomial | $$P(X = k) = \frac{\Gamma(\theta + k)}{\Gamma(\theta) k!} \left( \frac{\theta}{\theta + \lambda} \right)^\theta \left( \frac{\lambda}{\theta + \lambda} \right)^k$$
Zero-inflated Negative Binomial | $$P(X = k) = \begin{cases} r + (1 - r) \left( \frac{\theta}{\theta + \lambda} \right)^\theta, & \text{if } k = 0 \\ (1 - r) \frac{\Gamma(\theta + k)}{\Gamma(\theta) k!} \left( \frac{\theta}{\theta + \lambda} \right)^\theta \left( \frac{\lambda}{\theta + \lambda} \right)^k, & \text{if } k = 1, 2, \ldots \end{cases}$$
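The four probability mass functions can be written down directly in the $\lambda$, $\theta$, $r$ parameterization used here. The sketch below is an illustration and not the package's implementation: the function names are hypothetical, and the zero-inflated variants assume weight $r$ on the extra zeros and weight $1 - r$ on the underlying Poisson or NB distribution, which is the form consistent with the expected value $(1 - r) \lambda$. Probabilities are evaluated on the log scale to avoid overflow in the Gamma function.

```python
import math


def pois_pmf(k, lam):
    """Poisson pmf with mean lam > 0."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def nb_pmf(k, lam, theta):
    """Negative Binomial pmf with mean lam > 0 and dispersion theta > 0."""
    log_p = (
        math.lgamma(theta + k) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + lam))
        + k * math.log(lam / (theta + lam))
    )
    return math.exp(log_p)


def zero_inflate(p_k, k, r):
    """Mix a base probability p_k with a point mass r at zero."""
    return r + (1 - r) * p_k if k == 0 else (1 - r) * p_k


def zip_pmf(k, lam, r):
    """Zero-inflated Poisson pmf."""
    return zero_inflate(pois_pmf(k, lam), k, r)


def zinb_pmf(k, lam, theta, r):
    """Zero-inflated Negative Binomial pmf."""
    return zero_inflate(nb_pmf(k, lam, theta), k, r)


# Each pmf should sum to (approximately) 1 over a long enough range.
total = sum(zinb_pmf(k, 2.5, 1.5, 0.3) for k in range(500))
print(round(total, 6))
```

Setting $r = 0$ in `zip_pmf` or `zinb_pmf` recovers the Poisson and NB distributions, which is a convenient sanity check.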

The Poisson and Negative Binomial distributions have the same expected value, $\lambda$. For ZIP and ZINB, the expected value is smaller than the real average number of foci per cell, because a fraction $r$ of the cells is recorded as zero.

Distribution name | Expected value
-------------------|-------------
Poisson | $$E(X) = \lambda$$
ZIP | $$E(X) = (1 - r) \lambda$$
NB | $$E(X) = \lambda$$
ZINB | $$E(X) = (1 - r) \lambda$$
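The zero-inflated expected values follow from the pmf in one step. A short derivation for ZIP, assuming weight $1 - r$ on the Poisson part (the ZINB case is identical, with the NB mean $\lambda$ in place of the Poisson mean):

$$
\begin{aligned}
E(X) &= \sum_{k=0}^{\infty} k \, P(X = k) = \sum_{k=1}^{\infty} k \, (1 - r) \frac{\lambda^k e^{-\lambda}}{k!} \\
&= (1 - r) \sum_{k=0}^{\infty} k \, \frac{\lambda^k e^{-\lambda}}{k!} = (1 - r) \lambda
\end{aligned}
$$

The extra mass $r$ at $k = 0$ does not contribute to the sum, so only the factor $1 - r$ remains.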

Depending on the value of $r$, the variance of ZIP and ZINB may be smaller or bigger than the variance of the Poisson distribution. For the NB distribution, the variance is always bigger than for the Poisson distribution, although the difference, $\lambda^2 / \theta$, becomes negligible when $\theta$ is much bigger than $\lambda^2$.

Distribution name | Variance
-------------------|-------------
Poisson | $$\textrm{var}(X) = \lambda$$
ZIP | $$\textrm{var}(X) = \lambda (1 - r)(1 + \lambda r)$$
NB | $$\textrm{var}(X) = \lambda + \frac{\lambda^2}{\theta}$$
ZINB | $$\textrm{var}(X) = (1 - r) \lambda \left( 1 + r \lambda + \frac{\lambda}{\theta} \right)$$
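The closed-form moments can be checked numerically by summing over the pmf. The sketch below is an illustration with hypothetical function names and parameter values. For ZINB it compares against $(1 - r) \lambda \left( 1 + r \lambda + \lambda / \theta \right)$, the form that reduces to the NB variance $\lambda + \lambda^2 / \theta$ at $r = 0$ and to the ZIP variance as $\theta \to \infty$.

```python
import math


def nb_pmf(k, lam, theta):
    """Negative Binomial pmf with mean lam and dispersion theta."""
    return math.exp(
        math.lgamma(theta + k) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + lam))
        + k * math.log(lam / (theta + lam))
    )


def zinb_pmf(k, lam, theta, r):
    """Zero-inflated NB pmf; r = 0 gives the plain NB distribution."""
    base = (1 - r) * nb_pmf(k, lam, theta)
    return r + base if k == 0 else base


def moments(pmf, k_max=500):
    """Mean and variance obtained by summing over k = 0, ..., k_max - 1."""
    mean = sum(k * pmf(k) for k in range(k_max))
    second = sum(k * k * pmf(k) for k in range(k_max))
    return mean, second - mean ** 2


lam, theta, r = 2.5, 1.5, 0.3  # hypothetical parameter values
mean, var = moments(lambda k: zinb_pmf(k, lam, theta, r))
closed_mean = (1 - r) * lam
closed_var = (1 - r) * lam * (1 + r * lam + lam / theta)
print(math.isclose(mean, closed_mean, rel_tol=1e-8))
print(math.isclose(var, closed_var, rel_tol=1e-8))
```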

countfitteR documentation built on July 1, 2025, 5:08 p.m.

inst/countfitteR/readmes/manual.md
In countfitteR: Comprehensive Automatized Evaluation of Distribution Models for Count Data

countfitteR

General tips

Charts

Tables

Data upload

Count data

Fitted models

Compare distributions

Settings

Overdispersion

Overdispersed count data distributions
