rdsearch: Search Available Datasets

Description

Search available datasets from the Rdatasets archive by regular expression.

Usage

rdsearch(
  pattern,
  field = NULL,
  fixed = FALSE,
  perl = FALSE,
  ignore.case = FALSE
)

Arguments

pattern

String. Search pattern. Can be a regular expression or literal string depending on the fixed argument.

field

String. Which field to search in. One of "package", "dataset", "title". If NULL (default), searches in all three fields.

fixed

Logical. If TRUE, pattern is a string to be matched as is. Overrides all conflicting arguments.

perl

Logical. Whether to use Perl-compatible regular expressions.

ignore.case

Logical. If FALSE, pattern matching is case sensitive; if TRUE, case is ignored during matching.
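The fixed, perl, and ignore.case arguments share their names and descriptions with the arguments of base R's grepl(). The sketch below assumes, without confirmation from the package source, that rdsearch() applies the same matching rules. It calls grepl() directly on a toy character vector, so it illustrates the matching semantics rather than rdsearch() itself.

```r
# Toy strings standing in for dataset names (illustrative, not real index entries)
x <- c("a.b", "axb", "Iris", "iris3")

# Default: the pattern is a regular expression, so "." matches any character
regex_hits <- grepl("a.b", x)

# fixed = TRUE: the pattern is matched literally, so only "a.b" qualifies
fixed_hits <- grepl("a.b", x, fixed = TRUE)

# ignore.case = TRUE: "iris" matches both "Iris" and "iris3"
nocase_hits <- grepl("iris", x, ignore.case = TRUE)

# perl = TRUE enables PCRE features such as negative lookahead:
# "iris" (any case) that is not followed by a digit
perl_hits <- grepl("iris(?![0-9])", x, perl = TRUE, ignore.case = TRUE)

print(x[regex_hits])
print(x[fixed_hits])
print(x[nocase_hits])
print(x[perl_hits])
```

In base grepl(), combining fixed = TRUE with ignore.case = TRUE produces a warning and the case flag is ignored, which is consistent with the note that fixed overrides conflicting arguments.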

Value

A data frame containing matching datasets with the following columns:

  • Package: Character. The name of the R package that contains the dataset

  • Dataset: Character. The name of the dataset

  • Title: Character. A descriptive title for the dataset

  • Rows: Integer. Number of rows in the dataset

  • Cols: Integer. Number of columns in the dataset

  • n_binary: Integer. Number of binary variables in the dataset

  • n_character: Integer. Number of character variables in the dataset

  • n_factor: Integer. Number of factor variables in the dataset

  • n_logical: Integer. Number of logical variables in the dataset

  • n_numeric: Integer. Number of numeric variables in the dataset

  • CSV: Character. URL to download the dataset in CSV format

  • Doc: Character. URL to the dataset's documentation
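Because the result is an ordinary data frame, it can be filtered and sorted with base R. The sketch below uses a small hand-made data frame as a stand-in for an rdsearch() result: the package names, dataset names, and counts are invented for illustration, and only a subset of the documented columns is included.

```r
# Hypothetical stand-in for an rdsearch() result; every value here is made up
res <- data.frame(
  Package   = c("pkgA", "pkgA", "pkgB"),
  Dataset   = c("small", "wide", "long"),
  Title     = c("A small example", "A wide example", "A long example"),
  Rows      = c(20L, 50L, 5000L),
  Cols      = c(3L, 40L, 6L),
  n_numeric = c(3L, 35L, 2L),
  stringsAsFactors = FALSE
)

# Keep datasets with at least 50 rows
big <- res[res$Rows >= 50L, ]

# Order from largest to smallest and show a few columns
big <- big[order(-big$Rows), c("Package", "Dataset", "Rows", "Cols")]
print(big)
```

With a real result, a URL taken from the CSV column could be passed to read.csv(), which accepts URLs as well as file paths.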

Global Options

The following global options control package behavior:

  • Rdatasets_cache: Logical

    • Whether to cache downloaded data and the index for faster subsequent access. Default: TRUE. Keep this option TRUE: caching makes repeated access faster and avoids overloading the Rdatasets server. Set it to FALSE only if local memory is severely limited.

    • Ex: options(Rdatasets_cache = TRUE)

  • Rdatasets_class: String

    • Output class of the returned data. One of "data.frame" (default), "tibble", or "data.table". The "tibble" and "data.table" formats require the respective packages to be installed.

    • Ex: options(Rdatasets_class = "tibble")

  • Rdataset_path: String

    • Base URL for the Rdatasets archive. Default: "https://vincentarelbundock.github.io/Rdatasets/". Advanced users can set this to use a different mirror or local copy.

    • Ex: options(Rdataset_path = "https://vincentarelbundock.github.io/Rdatasets/")
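These options are set through base R's options() mechanism, so a temporary change can be undone by saving the value that options() returns. The sketch below only sets and reads the option and does not load the Rdatasets package; how and when the package reads the option is not shown here.

```r
# options() returns the previous values, which makes restoring them easy
old <- options(Rdatasets_class = "tibble")

# getOption() reads the current value; the second argument is a fallback
# used when the option is unset
current_class <- getOption("Rdatasets_class", default = "data.frame")
print(current_class)

# Restore whatever was set before (an unset option becomes unset again)
options(old)
restored_class <- getOption("Rdatasets_class", default = "data.frame")
print(restored_class)
```

In a session where Rdatasets_class was not set beforehand, the second print() falls back to "data.frame".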

Examples

# Search all fields (default behavior)
rdsearch("iris")

# Case-insensitive search (inline flag; ignore.case = TRUE is the argument-based alternative)
rdsearch("(?i)titanic")

# Search only in package names
rdsearch("datasets", field = "package")

# Search only in dataset names
rdsearch("iris", field = "dataset")

# Search only in titles
rdsearch("Edgar Anderson", field = "title")

Rdatasets documentation built on June 8, 2025, 11:48 a.m.