Before R became available, I was a heavy user of S-PLUS™, a commercial implementation of S, the language from which R descends. It had a very practical function called objects.summary(), which listed the objects in an environment in tabular form (essentially as a data.frame) along with some useful attributes, including class, mode, dimensions, and size. I couldn't find an equivalent in R, so I wrote one 😊
You can install the current stable version of osum from CRAN:
install.packages("osum")
Windows and macOS binary packages are available from here.
You can install the development version of osum, including the latest features, from GitHub:
# requires the remotes package
require(remotes)
install_github("zivankaraman/osum")
library(osum)
First, we need to populate the session environment with a few objects.
a <- month.name
b <- sample(c("FALSE", "TRUE"), size = 5, replace = TRUE)
cars <- mtcars
.hidden <- -1L
.secret <- "Shhht!"
x1 <- rnorm(n = 10)
x2 <- runif(n = 20)
x3 <- rbinom(n = 30, size = 10, prob = 0.5)
lst <- list(first = x1, second = x2, third = x3)
fun <- function(x) {sqrt(x)}
By default, the environment of the call to objects.summary is used, here .GlobalEnv.
objects.summary()
Hidden objects (those whose names begin with a dot) are not shown by default. To see them, set the argument all.objects = TRUE (not unlike the all.names argument of the ls function).
objects.summary(all.objects = TRUE)
If objects.summary is called inside a function, the environment of that calling function is used by default.
# shows an empty list because no variables are defined inside myfunc
myfunc <- function() {objects.summary()}
myfunc()
# define a local variable inside myfunc
myfunc <- function() {y <- 1; objects.summary()}
myfunc()
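The default described above depends on a function being able to look up the environment of its caller. The following base-R sketch illustrates that mechanism with a hypothetical helper, list_caller_objects(), which is not part of osum and is not necessarily how objects.summary is implemented:

```r
# Hypothetical helper (not part of osum): list the objects defined in the
# environment of whichever function, or top-level session, called it.
list_caller_objects <- function() {
  caller <- parent.frame()
  ls(envir = caller)
}

outer_empty <- function() {
  list_caller_objects()   # no local variables exist in outer_empty
}

outer_with_y <- function() {
  y <- 1
  list_caller_objects()   # the local variable y is visible
}

empty_result <- outer_empty()
y_result <- outer_with_y()
```

Here empty_result is an empty character vector and y_result contains only "y", mirroring the two myfunc calls.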
We can limit the output to objects whose names match the regular expression provided as the pattern argument. Alternatively, we can provide a character vector naming the objects to summarize in the names argument.
objects.summary(pattern = "^x")
objects.summary(names = c("a", "b"))
We can list the objects from any environment, not just the current one. The environment can be provided as an integer giving its position in the search list, or as a character string giving the name of an environment in the search list.
idx <- grep("package:graphics", search())
objects.summary(idx, pattern = "^plot")
objects.summary("package:graphics", pattern = "^plot")
We can also explicitly provide an environment.
e <- new.env()
e$a <- 1:10
e$b <- rnorm(25)
e$df <- iris
e$arr <- iris3
objects.summary(e)
rm(e, myfunc)
Unless an explicit environment is provided, the where argument should designate an element of the search list. However, if it is a character string of the form "package:pkg_name" and the package named "pkg_name" is installed but not attached, the package is silently loaded, its objects are retrieved, and it is unloaded again when the function exits. Depending on the time it takes to load the package, this can be slower than getting the information about an attached package.
# check if the package foreign is attached
length(grep("package:foreign", search())) > 0L
objects.summary("package:foreign", pattern = "^write")
# check again whether the package foreign is attached
length(grep("package:foreign", search())) > 0L
We don't need to display all the attributes; the what argument controls which information is returned. Partial matching is used, so only enough initial letters of each string are needed to identify it uniquely, for example "data[.class]", "stor[age.mode]", "ext[ent]", and "obj[ect.size]".
objects.summary(what = c("data.class", "storage.mode", "extent", "object.size"))
objects.summary(what = c("data", "stor", "ext", "obj"))
In fact, just providing the first letter is sufficient, since all the possible values start with a different letter. The order of the columns in the summary follows the order in which their names are listed in the what argument.
objects.summary(what = c("m", "s", "t", "o", "d", "e"))
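Partial matching of this kind can be reproduced in base R with pmatch(). The sketch below only illustrates the matching rule against the six attribute names; it is an assumption that osum resolves the what argument in a comparable way.

```r
# The six attribute names that 'what' can select from
choices <- c("data.class", "storage.mode", "mode",
             "typeof", "extent", "object.size")

# Abbreviations are resolved by unique partial matching;
# the result keeps the order in which they were requested
abbreviated <- c("dat", "stor", "ext", "obj")
matched <- choices[pmatch(abbreviated, choices)]

# Single letters are enough because every name starts with a different letter
letters_only <- c("m", "s", "t", "o", "d", "e")
matched_letters <- choices[pmatch(letters_only, choices)]
```

matched_letters comes back in the requested order (mode, storage.mode, typeof, object.size, data.class, extent), which is also the column order of the summary.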
Note that the attributes storage.mode, mode, and typeof are somewhat redundant, so you can select only those that are relevant to you. You can set your personal preferences using the osum.options function, as explained in [Options].
The subset of objects from the environment where that should be selected for the summary is specified either with an explicit vector of names provided in the names argument, or with some combination of the subsetting criteria pattern (as seen in [Restricting the Objects List]), data.class, storage.mode, mode, and typeof. If the names argument is given, the other criteria are ignored. If more than one criterion is given, only objects that satisfy all of them are selected. In the absence of both names and criteria, all objects in where are selected.
objects.summary("package:datasets", pattern = "^[sU]", what = c("dat", "typ", "ext", "obj"), data.class = c("data.frame", "matrix"))
Objects can have more than one class, but only the first element of the class vector is used by default. Specifying all.classes = TRUE makes it possible to consider the entire class vector of an object, both in the selection based on the data.class argument and in the returned summary.
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"), data.class = "array")
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"), all.classes = TRUE, data.class = "array")
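The difference between matching on the first class only and matching on the whole class vector can be sketched in base R. The objects and the class name "groupedData" below are made up for illustration, and using class() to obtain the class vector is an assumption about what osum does internally.

```r
# Toy objects: a plain data frame, a data frame with an extra leading class
# (the class name is hypothetical), and an integer vector
objs <- list(
  plain   = data.frame(x = 1:3),
  grouped = structure(data.frame(x = 1:3),
                      class = c("groupedData", "data.frame")),
  v       = 1:3
)

# Default behaviour: compare only the first element of the class vector
first_class <- vapply(objs, function(x) class(x)[1L], character(1))
by_first <- names(objs)[first_class == "data.frame"]

# all.classes-style behaviour: look anywhere in the class vector
by_any <- names(objs)[vapply(objs, function(x) "data.frame" %in% class(x),
                             logical(1))]
```

by_first contains only "plain", whereas by_any also picks up "grouped", whose data.frame class is not the first element.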
Besides simple filtering on attribute values, we can also filter on a logical expression indicating the elements (rows) to keep. The expression is evaluated in the data frame of object attributes, so columns should be referred to by their unquoted attribute names, as variables in the expression (not unlike the subset argument of the base subset function). This is particularly helpful when we want to exclude some values without explicitly listing all the other possible values, as in the example below.
objects.summary("package:grDevices", filter = mode != "function")
The filter expression can involve more than one attribute.
objects.summary("package:datasets", filter = mode != storage.mode)[1:10, ]
It can also be quite complex, as long as it yields a logical value for every object (row).
objects.summary("package:datasets", all.classes = TRUE, filter = sapply(data.class, length) > 2L)
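Evaluating an unquoted expression inside a data frame is a standard base-R technique built on substitute() and eval(). The sketch below uses a hypothetical helper, keep_rows(), and a small hand-made attribute table; it illustrates the idea and is not osum's actual code.

```r
# Hypothetical helper: keep the rows of df for which the unquoted
# expression 'filter' evaluates to TRUE, with column names taking
# precedence over objects of the same name elsewhere
keep_rows <- function(df, filter) {
  caller <- parent.frame()
  cond <- eval(substitute(filter), df, caller)
  df[cond, , drop = FALSE]
}

# Hand-made attribute table for three objects (illustrative values)
attrs <- data.frame(
  mode         = c("function", "numeric", "list"),
  storage.mode = c("function", "double", "list"),
  row.names    = c("fun", "x1", "lst"),
  stringsAsFactors = FALSE
)

no_functions <- keep_rows(attrs, mode != "function")
differing    <- keep_rows(attrs, mode != storage.mode)
```

Inside the expression, mode refers to the column rather than to the base function of the same name, because the data frame is searched first.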
By default, the object entries (printed as rows) in the summary are sorted alphabetically by object name. By providing the order argument, they can be sorted on any other column or columns. The order argument should consist of unquoted column names. For numeric columns, the name can be preceded by "-" to sort in descending order, with the expression enclosed in parentheses (see examples). To sort on more than one column, the expression must be provided as a vector c(., .) (again, see examples). This feature is inspired by the standard R order function.
# filter on 'mode' and sort on 'data.class'
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"), mode = "numeric", order = data.class)[1:10, ]
# filter on 'mode' and sort (descending) on 'object.size'
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"), mode = "numeric", order = (-object.size))[1:10, ]
# sort on 'data.class' and then (descending) on 'object.size'
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"), order = c(data.class, -object.size))[1:10, ]
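One way to support an unquoted, possibly multi-column sort key is to split a c(., .) call into its arguments, evaluate each in the data frame, and pass the results to the base order function. The helper sort_rows() and the data below are hypothetical and only sketch the idea under those assumptions.

```r
# Hypothetical helper: sort the rows of df by one or more unquoted keys
sort_rows <- function(df, order) {
  caller <- parent.frame()
  expr <- substitute(order)
  # A call to c(...) is treated as a list of separate sort keys
  keys <- if (is.call(expr) && identical(expr[[1L]], as.name("c"))) {
    as.list(expr)[-1L]
  } else {
    list(expr)
  }
  cols <- lapply(keys, function(k) eval(k, df, caller))
  df[do.call(base::order, cols), , drop = FALSE]
}

# Illustrative attribute table
sizes <- data.frame(
  data.class  = c("numeric", "character", "numeric"),
  object.size = c(100, 50, 300),
  row.names   = c("x1", "a", "x2"),
  stringsAsFactors = FALSE
)

by_size_desc  <- sort_rows(sizes, (-object.size))
by_class_size <- sort_rows(sizes, c(data.class, -object.size))
```

The second call sorts by class first and breaks ties by decreasing size, so "a" comes first, followed by "x2" and "x1".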
Note that although the extent is by default printed (by the specific print method for objects of class objects.summary) as a product of dimensions (d1 x d2), it is internally stored as a list, which allows sorting on the number of rows or columns, for example.
# get all two-dimensional objects from the datasets package with more than 7 columns,
# sorted by number of columns (ascending) and then by number of rows (descending)
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"),
    filter = sapply(extent, length) == 2L & sapply(extent, "[", 2L) > 7L,
    order = c(sapply(extent, "[", 2L), -sapply(extent, "[", 1L)))
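A list column of this kind can be pictured with a few base-R objects. In the sketch below the extent of an object is taken to be its dim() when it has one and its length() otherwise; that definition is an assumption made for illustration.

```r
# Three objects with one, two, and three dimensions
objs <- list(a = month.name, cars = mtcars, arr = iris3)

# One integer vector per object, kept in a list
extent <- lapply(objs, function(x) if (is.null(dim(x))) length(x) else dim(x))

# Printed form: dimensions joined as a product
printed <- vapply(extent, paste, character(1), collapse = " x ")

# Because the stored form is a list, individual dimensions stay accessible
n_dims <- sapply(extent, length)
n_cols <- sapply(extent, "[", 2L)   # NA for one-dimensional objects
```

The printed form reads "12", "32 x 11", and "50 x 4 x 3", while n_cols can still be used in a filter or as a sort key.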
The entries are sorted in ascending order by default. They can be sorted in descending order by specifying reverse = TRUE.
# get the five biggest objects from package datasets
objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj"), order = object.size, reverse = TRUE)[1:5, ]
Note that the objects in the summary can be filtered and/or sorted by columns that are not part of the summary (i.e. are not listed in the what argument).
objects.summary("package:datasets", what = c("dat", "typ", "ext"), pattern = "st", filter = mode %in% c("list", "numeric"), order = object.size)
The objects.summary function creates an object of class objects.summary, which is an extension of the data.frame class. The purpose of this class is to provide custom print and summary methods.
The number of rows printed can be limited by the max.rows argument, which allows more straightforward control than the max argument of print.data.frame.
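Extending data.frame with a dedicated print method follows the usual S3 pattern. The sketch below uses a made-up class name, toy.summary, and a simplified max.rows argument; it shows the pattern and is not the osum implementation.

```r
# Hypothetical constructor: add a class in front of "data.frame"
new_toy_summary <- function(df) {
  structure(df, class = c("toy.summary", "data.frame"))
}

# Custom print method: show at most max.rows rows, then delegate
# to the regular data frame printing for the rest
print.toy.summary <- function(x, max.rows = 5L, ...) {
  n <- min(nrow(x), max.rows)
  shown <- x[seq_len(n), , drop = FALSE]
  class(shown) <- "data.frame"
  print(shown, ...)
  if (n < nrow(x)) cat("...", nrow(x) - n, "more rows not shown\n")
  invisible(x)
}

toy <- new_toy_summary(data.frame(size = c(10, 20, 30, 40),
                                  row.names = c("a", "b", "c", "d")))
print(toy, max.rows = 2)
```

Because the object still inherits from data.frame, all the usual data frame operations (subsetting, sorting, and so on) continue to work on it.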
When the all.classes argument is set to TRUE, the entire class vector is returned, and the data.class column is a list of character vectors. When such data is printed, the output is limited to a fixed number of characters (12 by default), with longer strings shown as e.g. "matrix, ..." or "nfnGroup....". The data.class.width argument of the print method allows users to change this value (most likely to increase it) in order to see all, or almost all, of the classes.
os <- objects.summary("package:datasets", what = c("dat", "ext", "obj"), all.classes = TRUE, order = object.size, reverse = TRUE)
print(os, data.class.width = 25, max.rows = 12)
multi_class_objects <- row.names(objects.summary("package:datasets", all.classes = TRUE, filter = sapply(data.class, length) > 1L))
os <- objects.summary("package:datasets", names = multi_class_objects, all.classes = TRUE, what = c("dat", "ext", "obj"))
print(os, data.class.width = 32, max.rows = 12)
As already mentioned in [Sorting Objects], the extent column is internally stored as a list, and we can explicitly control how it is printed with the format.extent argument.
multi_dim_objects <- row.names(objects.summary("package:datasets", all.classes = TRUE, data.class = c("array", "table")))
os <- objects.summary("package:datasets", names = multi_dim_objects, what = c("dat", "ext", "obj"))
print(os[rev(order(sapply(os$extent, length))), ], format.extent = TRUE, max.rows = 12) # default
print(os[rev(order(sapply(os$extent, length))), ], format.extent = FALSE, max.rows = 12)
Other options can be passed down to the print.data.frame function (not necessarily very useful).
print(objects.summary("package:datasets", what = c("dat", "typ", "ext", "obj")), format.extent = TRUE, max.rows = 12, right = FALSE, quote = TRUE)
The summary method shares the same specific arguments as the print method, except for max.rows.
os <- objects.summary("package:datasets", all.classes = TRUE, what = c("dat", "ext", "obj"), filter = sapply(data.class, length) > 1L)
summary(os, data.class.width = 32, format.extent = FALSE)
Again, other options can be passed down to the summary.data.frame function.
summary(os, data.class.width = 32, maxsum = 10, quantile.type = 5)
There are a few custom options dedicated to the package. The function osum.options, modelled on the options function from the base package, allows the user to set and examine them. The custom options mainly provide default values for the specific arguments of the print and summary methods (data.class.width, format.extent, and max.rows), as seen in [Printing and Summarizing].
# see all current options
osum.options()
# set some values
old_opt <- osum.options(osum.data.class.width = 12, osum.max.rows = 25)
# previous values of the changed 'osum' options
old_opt
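The set-and-remember pattern shown above is the same one used by the base options function, which returns the previous values of whatever it changes. The sketch below demonstrates that convention with base options only; whether osum.options also accepts the saved list to restore earlier values is not stated here and should not be assumed.

```r
# Base R convention: setting an option returns its previous value (invisibly)
original_digits <- getOption("digits")
old <- options(digits = 4)
changed_digits <- getOption("digits")

# Passing the saved list back restores the earlier setting
options(old)
restored_digits <- getOption("digits")
```

Saving the return value when changing a setting makes it easy to undo the change at the end of a script or function.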
It is also possible to select what information is returned by default by the objects.summary function. It must be a subset of c("data.class", "storage.mode", "mode", "typeof", "extent", "object.size"); partial matching is allowed.
# set which attributes are retrieved by default
osum.options(osum.information = c("dat", "mod", "ext", "obj"))
# get the current value of the option
osum.options("osum.information")
# if the argument 'what' is not specified, the new default values are used
objects.summary("package:base", filter = data.class != "function")
*Created on `r format(Sys.Date(), "%Y-%m-%d")`.*