vtable: Variable Table Function


vtable R Documentation

Variable Table Function

Description

This function outputs a descriptive variable table either to the console or as an HTML file that can stay open for reference while you work with the data. vt() is a shorthand for vtable() that requires fewer key presses to type.

Usage

vtable(
  data,
  out = NA,
  file = NA,
  labels = NA,
  class = TRUE,
  values = TRUE,
  missing = FALSE,
  index = FALSE,
  factor.limit = 5,
  char.values = FALSE,
  data.title = NA,
  desc = NA,
  note = NA,
  note.align = "l",
  anchor = NA,
  col.width = NA,
  col.align = NA,
  align = NA,
  fit.page = NA,
  summ = NA,
  lush = FALSE,
  opts = list()
)

vt(
  data,
  out = NA,
  file = NA,
  labels = NA,
  class = TRUE,
  values = TRUE,
  missing = FALSE,
  index = FALSE,
  factor.limit = 5,
  char.values = FALSE,
  data.title = NA,
  desc = NA,
  note = NA,
  note.align = "l",
  anchor = NA,
  col.width = NA,
  col.align = NA,
  align = NA,
  fit.page = NA,
  summ = NA,
  lush = FALSE,
  opts = list()
)

Arguments

data

Data set; accepts any format with column names. If variable labels are set with the haven package, set_label() from sjlabelled, or label() from Hmisc, vtable will extract them automatically.

out

Determines where the completed table is sent. Set to "browser" to open the HTML file in a browser using browseURL(), or "viewer" to open it in the RStudio viewer using viewer(), if available. Use "htmlreturn" to return the HTML code to R. Use "return" to return the completed variable table to R as a data frame, or "kable" to return it as a knitr::kable(). Additional options include "csv" to write to CSV in conjunction with file (although this will drop most additional formatting), "latex" for a LaTeX table, or "latexpage" for a full buildable LaTeX page. Defaults to "viewer" if RStudio is running, "browser" if it isn't, or a "kable" passed through kableExtra::kable_styling() defaults if the table is being built in an RMarkdown document with knitr.

file

Saves the completed variable table file to HTML or .tex with this filepath. May be combined with any value of out, although note that out = "return" and out = "kable" will still save the standard vtable HTML file as with out = "viewer" or out = "browser".

labels

Variable labels. labels will accept three formats: (1) A vector of the same length as the number of variables in the data, in the same order as the variables in the data set, (2) A matrix or data frame with two columns and more than one row, where the first column contains variable names (in any order) and the second contains labels, or (3) A matrix or data frame where the column names (in any order) contain variable names and the first row contains labels. Setting the labels parameter will override any variable labels already in the data. Set to "omit" if the data set has embedded labels but you don't want any labels in the table.

class

Set to TRUE to include variable classes in the variable table. Defaults to TRUE.

values

Set to TRUE to include the range of values of each variable: min and max for numeric variables, list of levels for factor or ordered variables, and 'TRUE FALSE' for logicals. values will detect and use value labels set by the sjlabelled or haven packages, as long as every value is labelled. Defaults to TRUE.

missing

Set to TRUE to include the number of NAs in the variable. Defaults to FALSE.

index

Set to TRUE to include the index number of the column with the variable name. Defaults to FALSE.

factor.limit

Sets the maximum number of factor levels that will be included if values = TRUE. Set to 0 for no limit. Defaults to 5.

char.values

Set to TRUE to include values of character variables as though they were factors, if values = TRUE. Or, set to a character vector of variable names to list values of only those character variables. Defaults to FALSE. Has no effect if values = FALSE.

data.title

Character variable with the title of the dataset.

desc

Character variable offering a brief description of the dataset itself. This will by default include information on the number of observations and the number of columns. To remove this, set desc='omit', or include any description and then include 'omit' as the last four characters.

note

Table note to go after the last row of the table.

note.align

Set the alignment for the multi-column table note. Usually "l", but if you have a long note in LaTeX you might want to set it with "p".

anchor

Character variable to be used to set an anchor link in HTML tables, or a label tag in LaTeX.

col.width

Vector of page-width percentages, on 0-100 scale, overriding default column widths in HTML table. Must have a number of elements equal to the number of columns in the resulting table.

col.align

For HTML output, a character vector indicating the HTML text-align attributes to be used in the table (for example col.align = c('left','center','center')). Defaults to all left-aligned. If you want to get tricky, you can add a ";" afterwards and keep putting in whatever CSS attributes you want. They will be applied to the whole column.

align

For LaTeX output, a string indicating the alignment of each column. Use standard LaTeX syntax (e.g. l|ccc). Defaults to all p{} columns with widths set using the same defaults as with col.width. Be sure to escape special characters, in particular backslashes (e.g. p{.25\\textwidth} instead of p{.25\textwidth}).

fit.page

For LaTeX output, uses a resizebox to force the table to a certain width. Set to NA to omit. Often '\\textwidth' (with the backslash escaped, as for align).

summ

Character vector of summary statistics to include for numeric and logical variables, in the form 'function(x)'. This option is flexible and allows any summary statistic function that takes in a column and returns a single number. For example, summ=c('mean(x)','mean(log(x))') will provide the mean of each variable as well as the mean of the log of each variable. The vtable package also provides helper functions designed specifically for this option: propNA, countNA, and notNA, which report proportions of NAs, counts of NAs, and counts of not-NAs in the vectors; nuniq, which reports the number of unique values; and pctile, which returns a vector of the 100 percentiles of the variable. NAs will be omitted from all calculations other than propNA(x) and countNA(x).

lush

Set to TRUE to select a set of options with more information: sets char.values and missing to TRUE, and sets summ to c('mean(x)', 'sd(x)', 'nuniq(x)'). summ can be overwritten by setting summ to something else.

opts

The same vtable options as above, but in a named list format. Useful for applying the same set of options to multiple vtables.
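
As a rough sketch of how opts, summ, and out can work together: the example below stores a set of options in a named list and reuses it for two data sets, returning each table as a data frame instead of opening a viewer. The data frames df_a and df_b and the list name shared_opts are invented for illustration.

```r
library(vtable)

# Hypothetical example data; one column contains a missing value.
df_a <- data.frame(height = c(150, 160, NA, 175),
                   smoker = c(TRUE, FALSE, FALSE, TRUE))
df_b <- data.frame(weight = c(55, 62, 70),
                   group = factor(c('a', 'b', 'b')))

# Options shared across tables, in the named-list form accepted by opts.
shared_opts <- list(missing = TRUE,
                    summ = c('mean(x)', 'propNA(x)'))

# out = "return" hands back the variable table as a data frame.
vt_a <- vtable(df_a, out = "return", opts = shared_opts)
vt_b <- vtable(df_b, out = "return", opts = shared_opts)

print(vt_a)
print(vt_b)
```

Keeping the shared settings in one list means a change to the summary statistics only has to be made in one place.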

Details

Outputting the variable table as a help file will make it easy to search through variable names or labels, or to refer to information about the variables easily.
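
One way to get a searchable reference file, sketched under the assumption that a temporary path is acceptable: write the table to HTML with file while also returning the data frame with out = "return". The data frame df and the path html_path are illustrative names, not part of the package.

```r
library(vtable)

# Hypothetical example data.
df <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.8))

# Temporary location for the HTML copy of the variable table.
html_path <- tempfile(fileext = ".html")

# file can be combined with any out value; here the table is also returned.
vt_df <- vtable(df, out = "return", file = html_path)

# Check that the HTML file was written.
file.exists(html_path)
```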

This function is in a similar spirit to promptData(), but focuses on variable documentation rather than dataset documentation.

If you would like to include a vtable in an RMarkdown document, it should just work! If you leave out unspecified, it will default to a nicely-formatted knitr::kable(), although this will drop some formatting elements like multi-column cells (or set out="kable" to get an unformatted kable that you can format yourself). If you prefer the vtable package formatting, then use out="latex" if outputting to LaTeX or out="htmlreturn" for HTML, both with results="asis" in the code chunk. Alternatively, in HTML, you can use the file option to write to file and use an <iframe> to include it.
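
A minimal sketch of the out="kable" route, assuming knitr is installed (the data frame df is made up for illustration). Inside an RMarkdown chunk the returned object would be printed by knitr; here it is only stored and inspected.

```r
library(vtable)

# Hypothetical example data.
df <- data.frame(id = 1:3, flag = c(TRUE, FALSE, TRUE))

# Request an unformatted kable that can be styled further by hand.
kb <- vtable(df, out = "kable")

# Inspect the class of the returned object.
class(kb)
```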

Examples


if(interactive()){
df <- data.frame(var1 = 1:4,var2=5:8,var3=c('A','B','C','D'),
    var4=as.factor(c('A','B','C','C')),var5=c(TRUE,TRUE,FALSE,FALSE))

#Demonstrating different options:
vtable(df,labels=c('Number 1','Number 2','Some Letters',
    'Some Labels','You Good?'))
vtable(subset(df,select=c(1,2,5)),
    labels=c('Number 1','Number 2','You Good?'),class=FALSE,values=FALSE)
vtable(subset(df,select=c('var1','var4')),
    labels=c('Number 1','Some Labels'),
    factor.limit=1,col.width=c(10,10,40,35))

#Different methods of applying variable labels:
labelsmethod2 <- data.frame(var1='Number 1',var2='Number 2',
    var3='Some Letters',var4='Some Labels',var5='You Good?')
vtable(df,labels=labelsmethod2)
labelsmethod3 <- data.frame(a =c("var1","var2","var3","var4","var5"),
    b=c('Number 1','Number 2','Some Letters','Some Labels','You Good?'))
vtable(df,labels=labelsmethod3)

#Using value labels and pre-labeled data:
library(sjlabelled)
df <- set_label(df,c('Number 1','Number 2','Some Letters',
    'Some Labels','You Good?'))
df$var1 <- set_labels(df$var1,labels=c('A little','Some more',
'Even more','A lot'))
vtable(df)

#efc is data with embedded variable and value labels from the sjlabelled package
library(sjlabelled)
data(efc)
vtable(efc)

#Displaying the values of a character vector
data(USJudgeRatings)
USJudgeRatings$Judge <- row.names(USJudgeRatings)
vtable(USJudgeRatings,char.values=c('Judge'))

#Adding summary statistics for variable mean and proportion of data that is missing.
vtable(efc,summ=c('mean(x)','propNA(x)'))

}

vtable documentation built on Oct. 26, 2023, 5:08 p.m.