stargazer: stargazer


stargazer R Documentation

stargazer

Description

The stargazer command produces LaTeX code, HTML code and ASCII text for well-formatted tables that hold regression analysis results from several models side-by-side. It can also output summary statistics and data frame content. stargazer supports a large number of model objects from a variety of packages. Please see stargazer models.

Usage

stargazer(  ..., 
            type = "latex", title = "", style = "default", 
            summary = NULL, out = NULL, out.header = FALSE,
            column.labels = NULL, column.separate = NULL,
            covariate.labels = NULL, dep.var.caption = NULL,
            dep.var.labels = NULL, dep.var.labels.include = TRUE,
            align = FALSE, 
            coef = NULL, se = NULL, t = NULL, p = NULL,
            t.auto = TRUE, p.auto = TRUE,
            ci = FALSE, ci.custom = NULL,
            ci.level = 0.95, ci.separator = NULL,
            add.lines = NULL, 
            apply.coef = NULL, apply.se = NULL, 
            apply.t = NULL, apply.p = NULL, apply.ci = NULL,
            colnames = NULL,
            column.sep.width = "5pt",
            decimal.mark = NULL, df = TRUE,
            digit.separate = NULL, digit.separator = NULL,
            digits = NULL, digits.extra = NULL, flip = FALSE,
            float = TRUE, float.env="table",
            font.size = NULL, header = TRUE,
            initial.zero = NULL, 
            intercept.bottom = TRUE, intercept.top = FALSE, 
            keep = NULL, keep.stat = NULL,
            label = "", model.names = NULL, 
            model.numbers = NULL, multicolumn = TRUE,
            no.space = NULL,
            notes = NULL, notes.align = NULL, 
            notes.append = TRUE, notes.label = NULL, 
            object.names = FALSE,
            omit = NULL, omit.labels = NULL, 
            omit.stat = NULL, omit.summary.stat = NULL,
            omit.table.layout = NULL,
            omit.yes.no = c("Yes", "No"), 
            order = NULL, ord.intercepts = FALSE, 
            perl = FALSE, report = NULL, rownames = NULL,
            rq.se = "nid", selection.equation = FALSE, 
            single.row = FALSE,
            star.char = NULL, star.cutoffs = NULL, 
            suppress.errors = FALSE, 
            table.layout = NULL, table.placement = "!htbp",
            zero.component = FALSE, 
            summary.logical = TRUE, summary.stat = NULL,
            nobs = TRUE, mean.sd = TRUE, min.max = TRUE, 
            median = FALSE, iqr = FALSE )

Arguments

...

one or more model objects (for regression analysis tables) or data frames/vectors/matrices (for summary statistics, or direct output of content). They can also be included as lists (or even lists within lists).

type

a character vector that specifies what type of output the command should produce. The possible values are "latex" (default) for LaTeX code, "html" for HTML/CSS code, and "text" for ASCII text output.

title

a character vector with titles for the tables.

style

a character string that specifies what style, typically designed to resemble an existing academic journal, should be used in producing the tables. This argument is not case-sensitive. See list of supported styles.

summary

a logical value indicating whether the package should output a summary statistics table when given a data frame. If FALSE, the package will instead output the contents of the data frame.

out

a character vector that contains the path(s) of output files. Depending on the file extension (.tex, .txt, .htm or .html), either a LaTeX/HTML source file or an ASCII text output file will be produced. For any other file extension, the value of the type argument will determine the type of output file.

out.header

a logical value that indicates whether the LaTeX or HTML file output should contain a code header (if TRUE) or just the chunk of code that creates the output (if FALSE).

column.labels

a character vector of labels for columns in regression tables. Their layout, in terms of the number of columns associated with each label, is given by the argument column.separate.

column.separate

a numeric vector that specifies how column.labels should be laid out across regression table columns. A value of c(2, 1, 3), for instance, will apply the first label to the first two columns, the second label to the third column, and the third label to the following three columns (i.e., columns number four, five and six). If the argument's value is NULL or the regression table contains more columns than are referred to in column.separate, a value of 1 is assumed for each "excess" column label.
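
As a rough illustration of how column.separate spreads labels across columns, the sketch below fits three models on the built-in attitude data. The object names (m1 to m3, labelled) and the labels "Short" and "Full" are made up for this example.

```r
library(stargazer)

# Three illustrative OLS models on the built-in 'attitude' data
m1 <- lm(rating ~ complaints, data = attitude)
m2 <- lm(rating ~ complaints + learning, data = attitude)
m3 <- lm(rating ~ complaints + learning + raises, data = attitude)

# "Short" is applied to the first two columns, "Full" to the third
labelled <- stargazer(m1, m2, m3, type = "text",
                      column.labels = c("Short", "Full"),
                      column.separate = c(2, 1))
```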

covariate.labels

a character vector of labels for covariates in regression tables. A value of NA for any element means that stargazer will print the corresponding variable name. In the default case of NULL, variable names are printed.

dep.var.caption

a character vector that specifies the caption to appear above dependent variable labels. A value of NULL denotes the default caption for the chosen style. An empty string (i.e., "") will lead stargazer to omit the caption.

dep.var.labels

a character vector of labels for the dependent variables in regression tables. A value of NA for any element means that stargazer will print the corresponding variable name. In the default case of NULL, variable names are printed.

dep.var.labels.include

a logical value that toggles whether dependent variable labels will be included in the regression table.

align

a logical value indicating whether numeric values in the same column should be aligned at the decimal mark in LaTeX output. Requires \usepackage{dcolumn} in LaTeX preamble.

coef

a list of numeric vectors that will replace the default coefficient values for each model. Element names will be used to match coefficients to individual covariates, and should therefore match covariate names. A NULL vector indicates that, for a given model, the default set of coefficients should be used. By contrast, an NA vector means that all of the model's coefficients should be left blank.

se

a list of numeric vectors that will replace the default standard errors for each model. Behaves exactly like the argument coef.

t

a list of numeric vectors that will replace the default test statistics (e.g., t-scores, or z-scores) for each model. Like coef and se, test statistics are matched to covariates by their element names.

p

a list of numeric vectors that will replace the default p-values for each model. Matched by element names. These will form the basis of decisions about significance stars.

t.auto

a logical value that indicates whether stargazer should calculate the test statistics (i.e., the z-scores) automatically if coefficients or standard errors are supplied by the user (from arguments coef and se) or modified by a function (from arguments apply.coef or apply.se). If FALSE, the package will use the model's default values if t is NULL.

p.auto

a logical value that indicates whether stargazer should calculate the p-values, using the standard normal distribution, if coefficients or standard errors are supplied by the user (from arguments coef and se) or modified by a function (from arguments apply.coef or apply.se). If FALSE, the package will use the model's default values if p is NULL.
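
A minimal sketch of supplying user-defined standard errors through se. The inflated standard errors below are only a stand-in (an assumption for illustration); in practice they might come from a robust variance estimator. The names fit, custom.se and with.custom.se are hypothetical.

```r
library(stargazer)

fit <- lm(rating ~ complaints + learning, data = attitude)

# Stand-in for user-supplied standard errors: the usual ones inflated by 50%
custom.se <- sqrt(diag(vcov(fit))) * 1.5

# The element names carry over from vcov() and match the covariate names,
# which is what stargazer uses to align the values
names(custom.se)

# se takes a list with one vector per model; t.auto and p.auto are left
# at their default of TRUE
with.custom.se <- stargazer(fit, type = "text", se = list(custom.se))
```

Per the descriptions of t.auto and p.auto above, the test statistics and p-values behind the significance stars should then be recalculated from the supplied values.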

ci

a logical vector that indicates, for each column, whether stargazer should, in regression tables, replace standard errors by confidence intervals. If the value is NA or unspecified, then the value from the last preceding specified column is used.

ci.custom

a list of two-column numeric matrices that will replace the default confidence intervals for each model. The first and second columns represent the lower and the upper bounds, respectively. Matched by element names.

ci.level

a numeric vector that specifies, for each column, the confidence level to be used in regression tables when argument ci is set to TRUE. By default, stargazer will report 95 percent confidence intervals. If the value is NA or unspecified, then the value from the last preceding specified column is used.

ci.separator

a character string that will serve as the separator between the lower and upper bounds of reported confidence intervals.
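
The following sketch shows one way ci.custom might be filled, using confint() from base R to build the two-column matrix of bounds. The names fit, ci.90 and with.ci, and the separator " to ", are chosen for this example only.

```r
library(stargazer)

fit <- lm(rating ~ complaints + learning, data = attitude)

# confint() returns a two-column matrix (lower, upper bounds) whose
# row names are the covariate names
ci.90 <- confint(fit, level = 0.90)

# One matrix per model, wrapped in a list
with.ci <- stargazer(fit, type = "text", ci = TRUE,
                     ci.custom = list(ci.90), ci.separator = " to ")
```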

add.lines

a list of vectors (one vector per line) containing additional lines to be included in the table. Each element of the listed vectors will be put into a separate column.
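
For instance, a single extra line could be added as sketched below. The row label "Department controls" and its "No" entries are purely illustrative, and the object names are hypothetical.

```r
library(stargazer)

m1 <- lm(rating ~ complaints, data = attitude)
m2 <- lm(rating ~ complaints + learning, data = attitude)

# One vector per added line: the first element lands in the label column,
# the remaining elements in the model columns
with.lines <- stargazer(m1, m2, type = "text",
                        add.lines = list(c("Department controls", "No", "No")))
```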

apply.coef

a function that will be applied to the coefficients.

apply.se

a function that will be applied to the standard errors.

apply.t

a function that will be applied to the test statistics.

apply.p

a function that will be applied to the p-values.

apply.ci

a function that will be applied to the lower and upper bounds of the confidence intervals.

colnames

a logical value that toggles column names on or off when printing data frames, vectors or matrices.

column.sep.width

a character string that specifies, in LaTeX code, the width of the space that separates columns in LaTeX tables. The default value is "5pt".

decimal.mark

a character string that will serve as the decimal mark. For instance, the string "," will represent decimal commas, while "." means tables will use decimal points.

df

a logical value that indicates whether the degrees of freedom of model statistics should be reported.

digit.separate

a numeric vector that indicates where digit separators should be placed. The first element of the vector indicates the number of digits (counted from the decimal mark to the left) that will be separated. The second element indicates the number of digits that will be separated from that 'first' separator, and so on. A value of 3 corresponds to a thousands separator, while a value of 0 indicates no separation. Alternatively, digit.separate can be one of the following character strings: "lakh" (equivalent to c(3,2)), "china" or "japan" (both equivalent to a value of 4).

digit.separator

a character string that will serve as the digit (e.g., thousands) separator. Commonly used strings include "," for a comma separator, " " for a single space separator, and "" for no separation.
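
To make the grouping rule concrete, here is a small base R sketch. group_digits() is a hypothetical helper written for this illustration and is not part of stargazer. It assumes that the last element of the vector is reused once the vector is exhausted, and that a value of 0 stops any further separation.

```r
# Hypothetical helper: group the digits of an integer string from right to left
group_digits <- function(x, separate = 3, separator = ",") {
  digits <- strsplit(x, "")[[1]]
  if (length(separate) == 0 || separate[1] == 0) {
    return(paste(digits, collapse = ""))   # 0 means no separation
  }
  groups <- character(0)
  i <- 1
  while (length(digits) > 0) {
    n <- length(digits)
    # Reuse the last group size once 'separate' is exhausted (assumption)
    size <- separate[min(i, length(separate))]
    if (size <= 0) size <- n               # stop separating from here on
    take <- min(size, n)
    groups <- c(paste(digits[(n - take + 1):n], collapse = ""), groups)
    digits <- digits[seq_len(n - take)]
    i <- i + 1
  }
  paste(groups, collapse = separator)
}

group_digits("1234567", 3)         # "1,234,567"
group_digits("1234567", c(3, 2))   # "12,34,567"
group_digits("1234567", 4)         # "123,4567"
```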

digits

an integer that indicates how many decimal places should be used. A value of NA indicates that no rounding should be done at all, and that all available decimal places should be reported.

digits.extra

an integer indicating the maximum number of additional decimal places to be used if a number, rounded to digits decimal places, is equal to zero.

flip

a logical value that flips the vertical and horizontal axes when printing summary statistic tables or vector, matrix and data frame content.

float

a logical value that indicates whether the resulting table will be a floating table (set off, for instance, by \begin{table} and \end{table}).

float.env

a character string that specifies the floating environment of the resulting LaTeX table (when argument float is set to TRUE). Possible values are "table" (default), "table*" and "sidewaystable" (requires \usepackage{rotating} in LaTeX preamble).

font.size

a character string that specifies the font size used in the table. The font can be one of the following: "tiny", "scriptsize", "footnotesize", "small", "normalsize", "large", "Large", "LARGE", "huge", "Huge". If NULL (default), no particular font is imposed.

header

a logical value indicating whether a header (containing the name and version of the package, the author's name and contact information, and the date and time of table creation) should appear in comments at the beginning of the LaTeX code.

initial.zero

a logical value indicating whether an initial zero should be printed before the decimal mark if a number is between 0 and 1.

intercept.bottom

a logical value indicating whether the intercept (or constant) coefficients should be on the bottom of the table.

intercept.top

a logical value indicating whether the intercept (or constant) coefficients should be on the top of the table.

keep

a vector of regular expressions that specifies which of the explanatory variables should be kept in the table. Alternatively, this argument can be a numeric vector whose elements indicate which variables (from top to bottom, or left to right) should be kept. The default value of NULL means that all variables will be kept.

keep.stat

a character vector that specifies which model statistics should be kept in the regression table output. For instance, keep.stat = c("n","ll") will produce a table that only includes statistics for the number of observations and log likelihood. See the list of statistic codes. This argument is not case-sensitive.

label

a character string containing the \label{} TeX markers for the tables.

model.names

a logical value indicating whether model names (e.g., "OLS" or "probit") should be included in the table.

model.numbers

a logical value indicating whether models should be numbered. No number is used whenever a regression table includes only one model.

multicolumn

a logical value indicating whether dependent variables and model names (e.g., "OLS" or "probit") should be reported across several columns if they remain identical.

no.space

a logical value indicating whether all empty lines should be removed from the table.

notes

a character vector containing notes to be included below the table. The character strings can include special substrings that will be replaced by the corresponding cutoffs for statistical significance 'stars': [*], [**], and [***] will be replaced by the cutoffs, in percentage terms, for one, two and three 'stars,' respectively (e.g., 10, 5, and 1). Similarly, [0.*], [0.**] and [0.***] will be replaced by the numeric value of cutoffs for one, two and three 'stars' (e.g., 0.1, 0.05, and 0.01). [.*], [.**] and [.***] will omit the leading zeros (e.g., .1, .05, .01).
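
A short sketch of a custom note that uses the bracketed placeholders described above. The wording of the note and the object names are made up for this example.

```r
library(stargazer)

fit <- lm(rating ~ complaints + learning, data = attitude)

# [*], [**] and [***] are placeholders for the cutoffs in percentage terms;
# notes.append = FALSE replaces the style's default note instead of adding to it
noted <- stargazer(fit, type = "text",
                   notes = "Stars mark the [*], [**] and [***] percent levels.",
                   notes.append = FALSE)
```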

notes.align

a character string that specifies how notes should be aligned under the table. One of three strings can be used: "l" for left alignment, "r" for right alignment, and "c" for centering. This argument is not case-sensitive.

notes.append

a logical value that indicates whether notes should be appended to the standard note(s) associated with the table's style (typically an explanation of significance cutoffs). If the argument's value is set to FALSE, the character strings provided in notes will replace any existing/default notes.

notes.label

a character string containing a label for the notes section of the table.

object.names

a logical value indicating whether object names should be included in the table.

omit

a vector of regular expressions that specifies which of the explanatory variables should be omitted from presentation in the table. Alternatively, this argument can be a numeric vector whose elements indicate which variables (from top to bottom, or left to right) should be omitted. This argument might be used, for instance, to exclude fixed effects dummies from being presented. The default value of NULL means that no variables will be excluded.

omit.labels

a character vector of labels that correspond to each of the regular expressions in omit, and that will be used in a sub-table that indicates whether variables have been omitted from a given model. omit and omit.labels must be equal in length.
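
As a hedged example, the sketch below hides one covariate and reports its presence through omit.labels. The label "Privileges" and the object names are chosen for illustration.

```r
library(stargazer)

# Only the first model contains 'privileges'
m1 <- lm(rating ~ complaints + privileges, data = attitude)
m2 <- lm(rating ~ complaints, data = attitude)

# One regular expression in omit, one matching label in omit.labels
omitted <- stargazer(m1, m2, type = "text",
                     omit = "privileges",
                     omit.labels = "Privileges")
```

The coefficient on privileges is dropped from the coefficient table, and a row with the given label carries a yes/no entry (see omit.yes.no) for each model.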

omit.stat

a character vector that specifies which model statistics should be omitted from regression table output. For instance, omit.stat = c("ll","rsq") will omit the log-likelihood and the R-squared statistics. See the list of statistic codes. This argument is not case-sensitive.

omit.summary.stat

a character vector that specifies which summary statistics should be omitted from summary statistics table output. See the list of summary statistic codes. This argument is not case-sensitive.

omit.table.layout

a character string that specifies which parts of the table should be omitted from the output. Each letter in the string indicates a particular part of the table, as specified by the table layout characters. For instance, omit.table.layout = "sn" will omit the model statistics and notes.

omit.yes.no

a character vector of length 2 that contains the 'yes' and 'no' strings to indicate whether, in any specific model, variables were omitted from the table, as specified by "omit".

order

a vector of regular expressions (or of numerical indexes) that indicates the order in which variables will appear in the output.

ord.intercepts

a logical value indicating whether intercepts for models with ordered dependent variables (such as ordered probit, or ordered logit) are included in the table.

perl

a logical value indicating whether perl-compatible regular expressions should be used. If FALSE, the package will assume the default extended regular expressions.

report

a character string containing only elements of "v", "c", "s", "t", "p", "*" that determines whether, and in which order, variable names ("v"), coefficients ("c"), standard errors/confidence intervals ("s"), test statistics ("t") and p-values ("p") should be reported in regression tables. If one of the aforementioned letters is followed by an asterisk ("*"), significance stars will be reported next to the corresponding statistic.
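
For example, the string "vc*p" asks for variable names, coefficients with stars, and p-values, in that order. A minimal sketch, with hypothetical object names:

```r
library(stargazer)

fit <- lm(rating ~ complaints + learning, data = attitude)

# "v": variable names, "c*": coefficients followed by significance stars,
# "p": p-values; standard errors ("s") are left out
reported <- stargazer(fit, type = "text", report = "vc*p")
```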

rownames

a logical value that toggles row names on or off when printing data frames, vectors or matrices.

rq.se

a character string that specifies the method used to compute standard errors for rq (quantile regression) objects. Possible values are "iid", "nid", "ker" and "boot".

single.row

a logical value that indicates whether regression coefficients and standard errors (or confidence intervals) should be reported on the same row. For convenience in formatting the resulting table, argument no.space is automatically set to TRUE when single.row is TRUE.

selection.equation

a logical value that indicates whether the selection equation (when argument is set to TRUE) or the outcome equation (default) will be reported for heckit and selection models from the package sampleSelection.

star.char

a character string to be used as the 'star' to denote statistical significance.

star.cutoffs

a numeric vector that indicates the statistical significance cutoffs for the statistical significance 'stars.' For elements with NA values, the corresponding 'star' will not be used.
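
A brief sketch of changing the cutoffs to 5, 1 and 0.1 percent. The cutoff values and object names are chosen only for illustration.

```r
library(stargazer)

fit <- lm(rating ~ complaints + learning, data = attitude)

# One, two and three stars at p < 0.05, p < 0.01 and p < 0.001
starred <- stargazer(fit, type = "text",
                     star.cutoffs = c(0.05, 0.01, 0.001))
```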

suppress.errors

a logical value that indicates whether stargazer should suppress the output of its error messages.

table.layout

a character string that specifies which parts of the table should be included in the output, in the order provided by the user. Each letter in the string indicates a particular part of the table, as specified by the table layout characters. For instance, table.layout = "#tn" will report the model numbers, coefficient table and notes only.

table.placement

a character string containing only elements of "h", "t", "b", "p", "!", "H" that determines the table placement in its LaTeX floating environment.

zero.component

a logical value indicating whether to report coefficients for the zero component of zeroinfl and hurdle estimation results. If FALSE, the count component is displayed.

summary.logical

a logical value indicating whether logical variables should be reported in summary statistics table. If so, they will be treated as if they had values of 0 (corresponding to FALSE) and 1 (TRUE).

summary.stat

a character vector that specifies which summary statistics should be included in the summary statistics table output. See the list of summary statistic codes. This argument is not case-sensitive.

nobs

a logical value that toggles whether the number of observations (N) for each variable is shown in summary statistics tables.

mean.sd

a logical value that toggles whether variable means and standard deviations are shown in summary statistics tables.

min.max

a logical value that toggles whether variable minima and maxima are shown in summary statistics tables.

median

a logical value that toggles whether variable medians are shown in summary statistics tables.

iqr

a logical value that toggles whether the 25th and 75th percentiles for each variable are shown in summary statistics tables. ('iqr' stands for interquartile range.)
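
The summary statistics toggles can be combined, as in the sketch below, which adds medians and quartiles and drops minima and maxima. The object name summary.tab is hypothetical.

```r
library(stargazer)

# Summary statistics for the built-in 'attitude' data frame:
# keep N, mean and standard deviation, add median and 25th/75th percentiles
summary.tab <- stargazer(attitude, type = "text",
                         median = TRUE, iqr = TRUE, min.max = FALSE)
```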

Details

Arguments with a value of NULL will use the default settings of the requested style.

Value

stargazer uses cat() to output LaTeX/HTML code or ASCII text for the table. To allow for further processing of this output, stargazer also returns the same output invisibly as a character vector. You can include the produced tables in your paper by inserting stargazer LaTeX output into your publication's TeX source. Alternatively, you can use the out argument to save the output in a .tex or .txt file.
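
A small sketch of working with the invisibly returned character vector. capture.output() is used here only to keep the table from being printed while the value is stored; the name tab is hypothetical.

```r
library(stargazer)

# Store the returned character vector without printing the table
invisible(capture.output(
  tab <- stargazer(attitude, type = "text")
))

# Each element of 'tab' is one line of the table
length(tab)
writeLines(tab)
```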

To include stargazer tables in Microsoft Word documents (e.g., .doc or .docx), use the following procedure: Use the out argument to save output into an .htm or .html file. Open the resulting file in your web browser. Copy and paste the table from the web browser to your Microsoft Word document.
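
The first step of this procedure might look as follows. The file name and the use of a temporary directory are assumptions made for this example.

```r
library(stargazer)

fit <- lm(rating ~ complaints + learning, data = attitude)

# Hypothetical output path in the session's temporary directory
html.file <- file.path(tempdir(), "attitude-models.html")

# The .html extension selects HTML output for the file
stargazer(fit, type = "html", out = html.file)

file.exists(html.file)
```

The saved file can then be opened in a web browser and the table copied from there.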

Acknowledgments and New Features

I would like to thank everyone who has tested this package, or provided useful comments and suggestions. Please see stargazer package acknowledgments.

See stargazer news for a list of new models and features in each release of stargazer.

Please cite as:

Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables. R package version 5.2.3. https://CRAN.R-project.org/package=stargazer

Author(s)

Dr. Marek Hlavac < marek.hlavac at gmail.com >
Social Policy Institute, Bratislava, Slovakia

Examples

## create summary statistics table for 'attitude' data frame
stargazer(attitude)

## list the content of the data frame 'attitude'
stargazer(attitude, summary=FALSE)

##  2 OLS models
linear.1 <- lm(rating ~ complaints + privileges + learning 
                        + raises + critical, data=attitude)

linear.2 <- lm(rating ~ complaints + privileges + learning, data=attitude)

## create an indicator dependent variable, and run a probit model
 
attitude$high.rating <- (attitude$rating > 70)
probit.model <- glm(high.rating ~ learning + critical + advance, data=attitude,
                    family = binomial(link = "probit"))
 
stargazer(linear.1, linear.2, probit.model, title="Regression Results")

## report ASCII text for a table with 90 percent confidence
## intervals reported on the same row as coefficients
## and omitting F statistics and the residual standard error

stargazer(linear.1, linear.2, probit.model, type="text",
          title="Regression Results", single.row=TRUE,
          ci=TRUE, ci.level=0.9, omit.stat=c("f", "ser"))
          
### re-order the models and only keep explanatory
### variables that contain "complaints", "learning", 
### "raises" and "critical"; report these with standard
### errors, and put "learning" and "raises" before
### the other explanatory variables; of the summary
### statistics, only keep the number of observations

stargazer(probit.model, linear.1, linear.2, type="text",
          keep=c("complaints","learning","raises","critical"),
          keep.stat="n", order=c("learning", "raises"))

### apply a function to the coefficients and standard errors
### that will multiply them by ten; you can think of this
### as a change in units

multiply.by.10 <- function(x) (x * 10)

stargazer(probit.model, linear.1, linear.2,
          apply.coef=multiply.by.10, apply.se=multiply.by.10)
          
### print out HTML code for a correlation matrix

correlation.matrix <- cor(attitude)
stargazer(correlation.matrix, type="html")




stargazer documentation built on March 18, 2022, 7:13 p.m.