Epiconcept is made up of a team of doctors, epidemiologists, data scientists and digital specialists. For more than 20 years, Epiconcept has contributed to the improvement of public health by writing software, carrying out epidemiological studies, research, evaluation and training to better detect, monitor and prevent disease and to improve treatment.
The EpiStats package is a set of functions aimed at epidemiologists. They include commands for measures of association and impact for case control studies and cohort studies. They may be particularly useful for outbreak investigations and include univariate and stratified analyses.
The generic function crossTable provides a contingency table, with the optional parameters percent and statistic.
The functions for cohort studies include the CS, CSTable and CSInter commands.
The functions for case control studies include the CC, CCTable and CCInter commands.
All variables used need to be numeric binary variables and coded as 0 and 1 or as factors.
The cohort study functions relate to cohort studies that measure risks, rather than rates in person-time.
The CS function provides a 2 by 2 table and measures the association between the outcome and one exposure. It includes the risk ratio and its 95% confidence intervals, the attributable fraction among the exposed and unexposed, and a chi square test and its p-value.
The CSTable function displays the measures of association between the outcome and a set of exposures in a table (risk ratios, confidence intervals and p-values). This helps the researcher to compare between exposures and provides a nice table for reports.
The CSInter function investigates the effect of a third variable on the association between an exposure and the outcome. It presents two by two tables stratified by the levels of the third variable. It provides the Woolf test for homogeneity between stratum-specific risk ratios. It provides the crude risk ratio between an exposure and an outcome and the risk ratio adjusted for the third variable. CSInter helps the researcher understand whether a third variable may have an effect-modifying or confounding effect on the association between an exposure and the outcome.
The CC function provides a 2 by 2 table and measures the association between the outcome and one exposure. It includes the odds ratio and its 95% confidence intervals, the attributable fraction among the exposed, and a chi square test and its p-value.
The CCTable function displays the measures of association between the outcome and a set of exposures in a table (odds ratios, confidence intervals and p-values). This helps the researcher to compare between exposures and provides a nice table for reports.
The CCInter function investigates the effect of a third variable on the association between an exposure and the outcome. It presents two by two tables stratified by the levels of the third variable. It provides the Woolf test for homogeneity between stratum-specific odds ratios. It provides the crude odds ratio between an exposure and an outcome and the odds ratio adjusted for the third variable. CCInter helps the researcher understand whether a third variable may have an effect-modifying or confounding effect on the association between an exposure and the outcome.
The dataset used in this vignette is from an outbreak investigation carried out in Germany in 1998 by Anja Hauri, Robert Koch Institute. It is used in case studies by organisations including EPIET, ECDC and Epiconcept.
The CSTable, CSInter, CCTable and CCInter functions are based on commands written in Stata by Gilles Desve, whom we gratefully acknowledge.
library(EpiStats)
library(dplyr)
library(knitr)
options(knitr.kable.NA = '')
#options(width=200)
data(Tiramisu)
DF <- Tiramisu
DF <- DF %>%
  # Recoding all variables as binary '0' or '1'
  mutate(salmon = ifelse(salmon == 9, NA, salmon)) %>%
  mutate(horseradish = ifelse(horseradish == 9, NA, horseradish)) %>%
  mutate(pork = ifelse(pork == 9, NA, pork)) %>%
  mutate(sex01 = case_when(sex == "females" ~ 1, sex == "males" ~ 0)) %>%
  mutate(agegroup = case_when(age < 30 ~ 0, age >= 30 ~ 1)) %>%
  mutate(tportion = case_when(tportion == 0 ~ 0, tportion == 1 ~ 1, tportion >= 2 ~ 2)) %>%
  mutate(tportion = as.factor(tportion)) %>%
  as.data.frame(stringsAsFactors=TRUE)
Colnames <- DF %>%
  select(-ill, -age, -sex, -dateonset, -uniquekey, -tportion, -mportion) %>%
  colnames()
crossTable creates a contingency table of a variable of interest against an exposure. Percentages by row or by column are optional. It can also report an optional test statistic (Fisher's exact test or chi-square).
crossTable(data, var1, var2, percent="none", statistic="none")
Recoding some data to have ordered factors
DF2 <- DF
DF2$ill <- factor(DF2$ill, levels=c(1,0), ordered = TRUE)
DF2$beer <- factor(DF2$beer, levels=c(1,0), ordered = TRUE)
DF2$tira <- factor(DF2$tira, levels=c(1,0), ordered = TRUE)
DF2$sex <- factor(DF2$sex, levels = c("males", "females"), ordered = TRUE)

ret <- crossTable(DF2, var1="ill", var2="tira")
ret
kable(ret, align="r")

ret <- crossTable(DF2, "ill", "sex", "col", "chi2")
kable(ret, align="r", caption = "with columns %")
NB: All parameters are unquoted
ret <- crossTable(DF2, ill, sex, row, fisher)
ret
kable(ret, align="r")
NB: All parameters are unquoted
ret <- crossTable(DF2, beer, sex, both, chi2)
ret
kable(ret, align="r", caption = "% rows and columns")
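As a rough cross-check, the counts and percentages that crossTable reports can be approximated with base R alone. The sketch below uses a small made-up data frame (`toy`, which is hypothetical and not the Tiramisu data) together with the standard `table`, `prop.table` and `fisher.test` functions. It illustrates the idea only and is not a description of how crossTable is implemented.

```r
# Hypothetical toy data (illustrative only, not the Tiramisu dataset)
toy <- data.frame(
  ill  = c(1, 1, 1, 0, 0, 1, 0, 0, 1, 0),
  tira = c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)
)

# Contingency table of outcome (rows) by exposure (columns)
tab <- table(ill = toy$ill, tira = toy$tira)

# Row and column percentages
row_pct <- 100 * prop.table(tab, margin = 1)
col_pct <- 100 * prop.table(tab, margin = 2)

# Fisher's exact test on the same table
fisher_p <- fisher.test(tab)$p.value

print(tab)
print(round(row_pct, 1))
print(round(col_pct, 1))
```

Row percentages sum to 100 within each row and column percentages within each column, which is a quick way to confirm which margin a percentage refers to.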
CS analyses cohort studies with equal follow-up time per subject. The risk (the proportion of individuals who become cases) is calculated overall and among the exposed and unexposed. Note that all variables need to be numeric and binary and coded as "0" and "1".
Point estimates and confidence intervals for the risk ratio and risk difference are calculated, along with attributable or preventive fractions for the exposed and the total population. Additionally you can select if you want to display the Fisher's exact test, by specifying exact = TRUE. If you specify full = TRUE you can easily access useful statistics from the output tables.
CS(x, cases, exposure, exact, full=FALSE)
CS(DF, "ill", "mousse", exact = FALSE)
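To make the reported quantities concrete, the sketch below computes a risk ratio, risk difference and attributable fraction among the exposed by hand in base R. The counts are hypothetical and the confidence interval uses the common log-scale Wald approximation, which is an assumption here; CS may use a different interval method.

```r
# Hypothetical cohort counts (illustrative only, not from the Tiramisu data)
exposed_cases   <- 30
exposed_total   <- 100
unexposed_cases <- 10
unexposed_total <- 100

# Risks (attack rates) in each group
risk_exposed   <- exposed_cases / exposed_total
risk_unexposed <- unexposed_cases / unexposed_total

# Risk ratio and risk difference
rr <- risk_exposed / risk_unexposed
rd <- risk_exposed - risk_unexposed

# 95% CI for the risk ratio, log-scale Wald approximation (assumed method)
se_log_rr <- sqrt(1 / exposed_cases - 1 / exposed_total +
                  1 / unexposed_cases - 1 / unexposed_total)
rr_ci <- exp(log(rr) + c(-1, 1) * qnorm(0.975) * se_log_rr)

# Attributable fraction among the exposed: (RR - 1) / RR
af_exposed <- (rr - 1) / rr

print(round(c(RR = rr, lower = rr_ci[1], upper = rr_ci[2], AFe = af_exposed), 2))
```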
The following result tables are rendered in Markdown using the kable function.
result <- CS(DF, "ill", "beer", exact = TRUE, full = TRUE)
kable(result$df1, align = "r")
kable(result$df2, align = result$df2.align)
By storing the results in the object "result", you are able to use the result tables in Markdown as shown above. By specifying "full = TRUE" you can also easily use individual elements of the results. For example if you would like to view just the risk ratio, you can view it by typing:
result$st$risk_ratio$point_estimate
CSTable is used for univariate analysis of cohort studies with several exposures. The results are summarised in one table with one row per exposure making comparisons between exposures easier and providing a useful table for integrating into reports. Note that all variables need to be numeric and binary and coded as "0" and "1".
The results of this function contain: the names of the exposure variables, the total number of exposed, the number of exposed cases, the attack rate among the exposed, the total number of unexposed, the number of unexposed cases, the attack rate among the unexposed, risk ratios, 95% confidence intervals and p-values.
You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.
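The difference between the two p-values can be seen with base R's `chisq.test` and `fisher.test` on a single 2 by 2 table. The counts below are hypothetical, and the use of `correct = FALSE` (no continuity correction) is an assumption made for this sketch; the vignette does not state which variant of the chi-square test CSTable applies.

```r
# Hypothetical 2 by 2 table: rows = exposure, columns = outcome
tab <- matrix(c(30, 10, 70, 90), nrow = 2,
              dimnames = list(exposure = c("exposed", "unexposed"),
                              outcome  = c("case", "non-case")))

# Pearson chi-square test without continuity correction (assumed variant)
chi2_res <- chisq.test(tab, correct = FALSE)

# Fisher's exact test on the same table
fisher_res <- fisher.test(tab)

print(c(chi2 = chi2_res$p.value, fisher = fisher_res$p.value))
```

With large cell counts the two p-values are usually close; Fisher's exact test is generally preferred when expected cell counts are small.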
You can specify the sort order, with the option sort="rr" to order by risk ratios. The default sort order is by p-values.
The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".
CSTable(x, cases, exposure=c(), exact=FALSE, sort = "pvalue", full=FALSE)
CSTable(DF, "ill", exposure = c("sex01", "agegroup", "tira", "beer", "mousse", "wmousse", "dmousse", "redjelly", "fruitsalad", "tomato", "mince", "salmon", "horseradish", "chickenwin", "roastbeef", "pork"))
The following result tables are rendered in Markdown using the kable function.
res <- CSTable(DF, "ill", sort = "rr", exposure = Colnames, full = TRUE)
kable(res$df, digits=res$digits, align=res$align)
The following result tables are rendered in Markdown using the kable function.
res <- CSTable(DF, "ill", exact = TRUE, exposure = Colnames, full = TRUE)
kable(res$df, digits=res$digits, align=res$align)
By storing the results in the object "res", you are able to use the result table in Markdown as shown above. You can also use individual elements of the results. For example if you would like to view just the risk ratio, you can view it by typing (for example):
res$df$RR[2]
CSInter is useful to determine the effects of a third variable on the association between an exposure and an outcome. CSInter produces 2 by 2 tables with stratum-specific risk ratios, attributable risk among exposed and population attributable risk. Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2".
CSInter displays a summary with the crude RR, the Mantel-Haenszel adjusted RR and the result of a Woolf test for homogeneity of stratum-specific RR.
The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".
CSInter(x, cases, exposure, by, full=FALSE)
CSInter(DF, cases="ill", exposure = "wmousse", by = "tira")
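The relationship between the crude, stratum-specific and Mantel-Haenszel risk ratios can be sketched in base R. The stratum counts below are hypothetical and chosen so that the third variable acts as a confounder: both strata have the same risk ratio, but the crude estimate is larger. The formula is the standard Mantel-Haenszel estimator for risk ratios; whether CSInter computes it in exactly this way is an assumption.

```r
# Hypothetical stratified cohort counts (illustrative only)
strata <- data.frame(
  stratum         = c("level 0", "level 1"),
  exposed_cases   = c(40, 4),
  exposed_total   = c(80, 20),
  unexposed_cases = c(5, 8),
  unexposed_total = c(20, 80)
)

# Stratum-specific risk ratios
rr_stratum <- with(strata,
  (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total))

# Crude risk ratio, ignoring the stratifying variable
rr_crude <- with(strata,
  (sum(exposed_cases) / sum(exposed_total)) /
  (sum(unexposed_cases) / sum(unexposed_total)))

# Mantel-Haenszel adjusted risk ratio
n <- strata$exposed_total + strata$unexposed_total
rr_mh <- with(strata,
  sum(exposed_cases * unexposed_total / n) /
  sum(unexposed_cases * exposed_total / n))

print(round(c(crude = rr_crude, adjusted = rr_mh), 2))
```

A crude estimate that differs clearly from the adjusted one while the stratum-specific estimates agree with each other is the usual pattern for confounding; stratum-specific estimates that differ from each other point instead towards effect modification.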
The following result tables are rendered in Markdown using the kable function.
res <- CSInter(DF, "ill", "beer", "tira", full = TRUE)
kable(res$df1, align="r")
kable(res$df2, align="r")
The following result tables are rendered in Markdown using the kable function.
res <- CSInter(DF, "ill", "beer", "tportion", full = TRUE)
kable(res$df1, align="r")
kable(res$df2, align="r")
By storing the results in the object "res", you are able to use the result table in Markdown as shown above. You can also use individual elements of the results. For example if you would like to view just the Mantel-Haenszel risk ratio for beer adjusted for tportion, you can view it by typing:
res$df2$Stats[3]
CC is used for case control studies to determine the association between an exposure and an outcome. Variables need to be binary and coded as "0" and "1". Point estimates and confidence intervals for the odds ratio are calculated along with attributable or preventive fractions for the exposed and total population. Additionally you can select if you want to display the Fisher's exact test, by specifying exact = TRUE. If you specify full = TRUE you can easily access useful statistics from the output tables.
CC(x, cases, exposure, exact, full=FALSE)
CC(DF, "ill", "mousse", exact = TRUE)
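For orientation, the odds ratio and the attributable fraction among the exposed can be computed by hand in base R. The counts below are hypothetical, and the confidence interval uses the Woolf (log-scale Wald) approximation, which is an assumption for this sketch; CC may report an interval based on a different method.

```r
# Hypothetical case control counts (illustrative only)
cases_exposed      <- 40
cases_unexposed    <- 20
controls_exposed   <- 30
controls_unexposed <- 60

# Odds ratio: cross-product of the 2 by 2 table
odds_ratio <- (cases_exposed * controls_unexposed) /
              (cases_unexposed * controls_exposed)

# 95% CI, Woolf approximation on the log scale (assumed method)
se_log_or <- sqrt(1 / cases_exposed + 1 / cases_unexposed +
                  1 / controls_exposed + 1 / controls_unexposed)
or_ci <- exp(log(odds_ratio) + c(-1, 1) * qnorm(0.975) * se_log_or)

# Attributable fraction among the exposed: (OR - 1) / OR
af_exposed <- (odds_ratio - 1) / odds_ratio

print(round(c(OR = odds_ratio, lower = or_ci[1], upper = or_ci[2],
              AFe = af_exposed), 2))
```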
The following result tables are rendered in Markdown using the kable function.
result <- CC(DF, "ill", "beer", exact = TRUE, full = TRUE)
kable(result$df1, align="r")
kable(result$df2, align=result$df2.align)
By storing the results in the object "result", you are able to use the result tables in Markdown as shown above. By specifying "full = TRUE" you can also easily use individual elements of the results. For example if you would like to view just the odds ratio, you can view it by typing:
result$st$odds_ratio$point_estimate
CCTable is used for univariate analysis of case control studies with several exposures. The results are summarised in one table with one row per exposure making comparisons between exposures easier and providing a useful table for integrating into reports. Note that all variables need to be numeric and binary and coded as "0" and "1".
The results of this function contain: the names of the exposure variables, the total number of cases, the number of exposed cases, the percentage of exposed among cases, the number of controls, the number of exposed controls, the percentage of exposed among controls, odds ratios, 95% confidence intervals and p-values.
You can optionally choose to display the Fisher's exact p-value instead of the Chi squared p-value, with the option exact = TRUE.
You can specify the sort order, with the option sort="or" to order by odds ratios. The default sort order is by p-values.
The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".
CCTable(x, cases, exposure=c(), exact=FALSE, sort = "pvalue", full=FALSE)
CCTable(DF, "ill", exposure = c("sex01", "agegroup", "tira", "beer", "mousse", "wmousse", "dmousse", "redjelly", "fruitsalad", "tomato", "mince", "salmon", "horseradish", "chickenwin", "roastbeef", "pork"))
The following result tables are rendered in Markdown using the kable function.
res <- CCTable(DF, "ill", sort = "or", exposure = Colnames)
kable(res$df)
The following result tables are rendered in Markdown using the kable function.
res <- CCTable(DF, "ill", exposure = Colnames, exact=TRUE)
kable(res$df)
By storing the results in the object "res", you are able to use the result table in Markdown as shown above. You can also use individual elements of the results. For example if you would like to view just the odds ratio, you can view it by typing (for example):
res$df$OR[1]
CCInter is useful to determine the effects of a third variable on the association between an exposure and an outcome. CCInter produces 2 by 2 tables with stratum specific odds ratios, attributable risk among exposed and population attributable risk.
Note that the outcome and exposure variables need to be numeric and binary and coded as "0" and "1". The third variable needs to be numeric, but may have more categories, such as "0", "1" and "2".
CCInter displays a summary with the crude OR, the Mantel-Haenszel adjusted OR and the result of a Woolf test for homogeneity of stratum-specific OR.
The option "full = TRUE" provides you with useful formatting information, which can be handy if you're using "markdown".
CCInter(x, cases, exposure, by, full=FALSE)
CCInter(DF, cases="ill", exposure = "wmousse", by = "tira")
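The Mantel-Haenszel odds ratio and a Woolf-type homogeneity test can be reproduced in base R as a sanity check. The 2 by 2 by 2 array below holds hypothetical counts. The hand-computed estimate is compared with the one returned by `stats::mantelhaen.test`, and the homogeneity statistic follows the usual inverse-variance weighting of the stratum log odds ratios. Whether CCInter computes its Woolf test in exactly this form is an assumption.

```r
# Hypothetical stratified counts: exposure x outcome x stratum
tab <- array(
  c(20, 10, 10, 20,    # stratum "level 0": OR = 4
    15, 30,  5, 20),   # stratum "level 1": OR = 2
  dim = c(2, 2, 2),
  dimnames = list(exposure = c("exposed", "unexposed"),
                  outcome  = c("case", "control"),
                  stratum  = c("level 0", "level 1"))
)

exp_case   <- tab[1, 1, ]
exp_ctrl   <- tab[1, 2, ]
unexp_case <- tab[2, 1, ]
unexp_ctrl <- tab[2, 2, ]
n          <- apply(tab, 3, sum)

# Stratum-specific and Mantel-Haenszel odds ratios
or_stratum <- (exp_case * unexp_ctrl) / (exp_ctrl * unexp_case)
or_mh      <- sum(exp_case * unexp_ctrl / n) / sum(exp_ctrl * unexp_case / n)

# Same estimate from base R's Mantel-Haenszel test
or_mh_builtin <- unname(mantelhaen.test(tab)$estimate)

# Woolf-type homogeneity test on the log odds ratios
w        <- 1 / (1 / exp_case + 1 / exp_ctrl + 1 / unexp_case + 1 / unexp_ctrl)
log_or   <- log(or_stratum)
pooled   <- sum(w * log_or) / sum(w)
woolf_x2 <- sum(w * (log_or - pooled)^2)
woolf_p  <- pchisq(woolf_x2, df = length(w) - 1, lower.tail = FALSE)

print(round(c(OR_MH = or_mh, Woolf_X2 = woolf_x2, Woolf_p = woolf_p), 3))
```

A small p-value from the homogeneity test suggests that the stratum-specific odds ratios differ, in which case a single adjusted estimate may be a poor summary.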
The following result tables are rendered in Markdown using the kable function.
res <- CCInter(DF, cases="ill", exposure = "beer", by = "tira", full = TRUE)
kable(res$df1, align=res$df1.align)
kable(res$df2)
The following result tables are rendered in Markdown using the kable function.
res <- CCInter(DF, cases="ill", exposure = "beer", by = "tportion", full = TRUE)
kable(res$df1, align=res$df1.align)
kable(res$df2, align=res$df2.align)
By storing the results in the object "res", you are able to use the result table in Markdown as shown above. You can also use individual elements of the results. For example if you would like to view just the Mantel-Haenszel odds ratio for beer adjusted for tportion, you can view it by typing:
res$df2$Stats[3]