| search_data | R Documentation |
Search and filter the dataSDA dataset catalog by metadata criteria including sample size, number of variables, subject area, symbolic format, analytical tasks, keywords, and book reference.
search_data(...)
... | Filter expressions. Each argument is a comparison expression evaluated against the dataset metadata. Supported columns: n and p (numeric); subject, type, task, tag, and book (character). |
For character columns (subject, type, task, tag,
book), the == operator performs a case-insensitive substring
match (using grepl). The type column uses short suffix-based
labels that match the dataset name suffix (e.g., type == "int"
matches all .int datasets).
For numeric columns (n, p), standard comparison operators
are used with exact semantics.
When no arguments are provided, or when tag == "all" is used,
all datasets are returned.
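The matching rules above can be sketched in base R on a toy metadata table. This is only an illustration of the described semantics, not the package's actual implementation: the data frame meta, its rows, and the helper match_chr are hypothetical, and combining several filters with logical AND is an assumption inferred from the examples.

```r
# Toy metadata table; dataset names and values are made up for illustration.
meta <- data.frame(
  name    = c("toy_a.int", "toy_b.hist", "toy_c.int"),
  n       = c(12, 50, 200),
  p       = c(3, 8, 15),
  subject = c("Biology", "Economics", "Marine biology"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: case-insensitive substring match built on grepl().
# Lower-casing both sides and using fixed = TRUE (literal pattern) is an
# assumption about how such a match could be done.
match_chr <- function(column, pattern) {
  grepl(tolower(pattern), tolower(column), fixed = TRUE)
}

# Roughly what subject == "biology" would select under substring semantics:
# both "Biology" and "Marine biology" contain the pattern.
bio <- meta[match_chr(meta$subject, "biology"), ]

# Numeric columns use ordinary comparisons with exact semantics.
mid <- meta[meta$n >= 20 & meta$n <= 100 & meta$p < 10, ]

# Several filters combined with logical AND (assumed behaviour).
both <- meta[match_chr(meta$subject, "bio") & meta$n > 10, ]

print(bio$name)
print(mid$name)
```

Because character matching is by substring, a short pattern such as "bio" can match more rows than intended, so a longer pattern may be needed to narrow the result.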
A data frame with one row per matching dataset and the following
columns: name, n, p, subject, type,
task, tag, book.
Billard, L. and Diday, E. (2006). Symbolic Data Analysis: Conceptual Statistics and Data Mining. Wiley, Chichester.
Billard, L. and Diday, E. (2020). Clustering Methodology for Symbolic Data. Wiley.
Diday, E. and Noirhomme-Fraiture, M. (Eds.) (2008). Symbolic Data Analysis and the SODAS Software. Wiley.
# List all datasets
search_data()
# Filter by symbolic format (suffix-based)
search_data(type == "hist")
# Filter by analytical task and size
search_data(task == "Regression", n > 10)
# Filter by book reference
search_data(book == "SDA_2006")
# Combine multiple filters
search_data(type == "int", task == "Clustering", subject == "Biology")
# Filter by size range
search_data(n >= 20, n <= 100, p < 10)