Centaur is an R package for the implementation of observational cohort studies. It provides several alternative workflows to control for observed confounding by providing a set of configurable options to compute propensity scores, balance covariates between two exposure groups, evaluate the quality of balance and perform outcome analysis.

Centaur takes a more traditional approach than the OHDSI Cohort Method: the user is responsible for including all observed covariates likely to affect the treatment choice or the outcome. The cohort data is currently loaded by the user as an R data frame and can come from any source, so it is up to the user to design the cohort appropriately. In contrast, the OHDSI Cohort Method creates the cohort dataset through a direct, configurable query to a CDM instance and then includes all possible exposures, conditions, etc. as covariates by default.
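To make the idea of a user-supplied cohort concrete, here is a minimal sketch in base R of a cohort data frame and a propensity score estimated by logistic regression. It does not use Centaur's own functions; the simulated data and every column name (`age`, `sex`, `diabetes`, `treatment`, `ps`) are assumptions made for illustration.

```r
# Illustrative only: base R (stats), not Centaur's API.
# All column names and the simulated data are hypothetical.
set.seed(42)
n <- 500
cohort <- data.frame(
  age      = round(rnorm(n, mean = 60, sd = 10)),
  sex      = rbinom(n, 1, 0.5),
  diabetes = rbinom(n, 1, 0.3)
)

# Treatment assignment depends on covariates, i.e. observed confounding.
lin <- -3 + 0.04 * cohort$age + 0.5 * cohort$diabetes
cohort$treatment <- rbinom(n, 1, plogis(lin))

# Propensity score: estimated probability of treatment given the covariates.
ps_model <- glm(treatment ~ age + sex + diabetes,
                data = cohort, family = binomial())
cohort$ps <- fitted(ps_model)

summary(cohort$ps)
```

Any covariate left out of the data frame cannot be adjusted for, which is why the choice of columns is left to the user in this approach.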

At the same time, the OHDSI Cohort Method can also be directly called within Centaur to facilitate using regularized regression for cohorts with large numbers of covariates and to compare results using different methods to calculate a propensity score.


Features In Progress

Workflow Summary



The default available methods are determined by the number of covariates in the dataset, and the total number of subjects. These limits have largely been determined empirically based on performance. Depending on your available hardware, it may be feasible to use a given method with more (or fewer) covariates and/or subjects. Each of these limits can be overridden.


Simple visual inspection of the area of common support.

Score Distribution
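As a rough illustration of what such an inspection involves, the following base R sketch overlays the propensity score densities of the two groups and marks the range of scores observed in both. It is not Centaur's plotting code; the simulated scores and the variable names are assumptions.

```r
# Illustrative only: base R, not Centaur's API. Scores are simulated.
set.seed(1)
n <- 400
treatment <- rbinom(n, 1, 0.4)
# Treated subjects tend to have higher propensity scores.
ps <- ifelse(treatment == 1, rbeta(n, 4, 3), rbeta(n, 3, 4))

# Common support: the range of scores observed in both groups.
lower <- max(min(ps[treatment == 1]), min(ps[treatment == 0]))
upper <- min(max(ps[treatment == 1]), max(ps[treatment == 0]))
in_support <- ps >= lower & ps <= upper

# Overlaid densities provide the visual check.
plot(density(ps[treatment == 0]), xlim = c(0, 1),
     main = "Propensity score by group", xlab = "Propensity score")
lines(density(ps[treatment == 1]), lty = 2)
abline(v = c(lower, upper), col = "grey")
legend("topright", legend = c("Control", "Treatment"), lty = c(1, 2))
```

Subjects with `in_support` equal to `FALSE` have no counterpart with a comparable score in the other group.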

"Violin" plots show the distribution of matched and unmatched control and treatment propensity scores.

Score Distribution

Using the stratification approach, compare the distribution of a single covariate between the treatment and control groups within each stratum.

Age Distribution
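The comparison can be sketched in base R as follows: subjects are split into five strata at the propensity score quintiles, and the mean of one covariate is tabulated by stratum and group. This is not Centaur's implementation; the number of strata, the simulated data and the variable names are assumptions.

```r
# Illustrative only: base R, not Centaur's API. Data are simulated.
set.seed(7)
n <- 1000
age <- rnorm(n, mean = 60, sd = 10)
treatment <- rbinom(n, 1, plogis(-3 + 0.05 * age))
ps <- fitted(glm(treatment ~ age, family = binomial()))

# Five strata cut at the propensity score quintiles.
breaks <- quantile(ps, probs = seq(0, 1, by = 0.2))
stratum <- cut(ps, breaks = breaks, include.lowest = TRUE, labels = 1:5)

# Mean age by stratum (rows) and treatment group (columns).
# Similar values within a row suggest the covariate is balanced there.
balance <- tapply(age, list(stratum = stratum, treatment = treatment), mean)
print(round(balance, 1))
```

Means are only one summary; a per-stratum box plot or density plot of the covariate shows the full distribution.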


?? R package ??

System Requirements

System requirements depend heavily on the size of the dataset being analyzed. For any "real-world" dataset, we recommend at least an Intel Core i7 (or equivalent) processor and at least 8 GB of RAM.


(list of packages)

Getting Started

  1. On Windows, make sure Rtools is installed.
  2. In R, use the following commands to download and install Centaur:

```r
install.packages("devtools")
library(devtools)
install_github("ohdsi/Centaur")
```

Read the whitepaper and try the vignette! (Coming soon!)

Installation instructions for various systems

Getting Involved

Links to vignette, whitepaper and manual.



Build Status



The authors acknowledge the following team from AstraZeneca Pharmaceuticals: Robert LoCasale, Michael Goodman, Ramin Arani, Yiduo Zhang, and Sudeep Karve, for contributing to the requirements with their expertise in epidemiology, safety informatics, health economics and biostatistics, and for reviewing the final product. The authors also acknowledge Jonathan Herz and Pramod Kumar for help with testing early versions of the package.

OHDSI/Centaur documentation built on May 9, 2017, 3:24 p.m.