rf100_document_collection: RF100 Document Collection Datasets

rf100_document_collection R Documentation

RF100 Document Collection Datasets

Description

Collection of RoboFlow 100 (RF100) document datasets for object detection.

Usage

rf100_document_collection(
  dataset,
  split = c("train", "test", "valid"),
  transform = NULL,
  target_transform = NULL,
  download = FALSE
)

Arguments

dataset

Name of the dataset to load. One of c("tweeter_post", "tweeter_profile", "document_part", "activity_diagram", "signature", "paper_part", "tabular_data", "paragraph").

split

The subset of the dataset to load. One of c("train", "test", "valid").

transform

Optional transform function applied to the image.

target_transform

Optional transform function applied to the target.

download

Logical. If TRUE, downloads the dataset if it is not already available locally.
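
The vector default for split in the Usage section follows a common R idiom in which the first element acts as the default value. The sketch below illustrates that idiom, assuming (this page does not confirm it) that the argument is resolved with match.arg(). resolve_split() is a hypothetical helper written for illustration, not part of torchvision.

```r
# Hypothetical helper: shows how a vector default is typically resolved.
# Whether rf100_document_collection() does this internally is an assumption.
resolve_split <- function(split = c("train", "test", "valid")) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and raises an error for values outside the allowed set.
  match.arg(split)
}

default_split <- resolve_split()         # "train"
valid_split   <- resolve_split("valid")  # "valid"
```

Under this assumption, omitting split would select the training subset, and a misspelled value would fail early instead of silently loading nothing.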

Details

Loads one of the RoboFlow 100 Document datasets with COCO-style bounding box annotations for object detection tasks.
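
COCO annotation files store each box as (x, y, width, height), measured from the top-left corner, whereas the Value section documents boxes in corner format. The following base R sketch shows one way such a conversion can be written. coco_to_corners() is a hypothetical helper, and how the package converts boxes internally is not stated here.

```r
# Hypothetical helper: convert COCO-style boxes (x, y, width, height)
# into corner format (xmin, ymin, xmax, ymax).
coco_to_corners <- function(boxes) {
  stopifnot(is.matrix(boxes), ncol(boxes) == 4)
  cbind(
    xmin = boxes[, 1],
    ymin = boxes[, 2],
    xmax = boxes[, 1] + boxes[, 3],  # x + width
    ymax = boxes[, 2] + boxes[, 4]   # y + height
  )
}

# Two made-up boxes, one per row.
coco_boxes <- rbind(c(10, 20, 30, 40), c(0, 0, 5, 5))
corner_boxes <- coco_to_corners(coco_boxes)
# First row becomes (10, 20, 40, 60); second row becomes (0, 0, 5, 5).
```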

Value

A torch dataset. Each element is a named list with:

  • x: H x W x 3 array representing the image.

  • y: a list containing the target with:

    • image_id: numeric identifier of the x image.

    • labels: numeric class identifiers, one for each of the N bounding boxes.

    • boxes: a torch_tensor of shape (N, 4) with bounding boxes, each in (xmin, ymin, xmax, ymax) format.

The returned item inherits the class image_with_bounding_box so it can be visualised with helper functions such as draw_bounding_boxes().
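
To make the item structure concrete, the sketch below builds a mock item with made-up values and derives box sizes from the corner coordinates. A plain matrix stands in for the torch_tensor so the code runs without torch. With a real item, the boxes would first need converting to an R array, for example with torch::as_array().

```r
# Mock item mirroring the documented structure; all values are made up.
item <- list(
  x = array(0, dim = c(100, 200, 3)),  # H x W x 3 image
  y = list(
    image_id = 1,
    labels = c(2, 5),
    # Plain matrix standing in for the (N, 4) torch_tensor of boxes.
    boxes = rbind(c(10, 20, 40, 60), c(0, 0, 5, 5))
  )
)

boxes   <- item$y$boxes
widths  <- boxes[, 3] - boxes[, 1]  # xmax - xmin
heights <- boxes[, 4] - boxes[, 2]  # ymax - ymin
areas   <- widths * heights         # 1200 and 25

# Check that every box lies within the image (dims are H, W, channels).
inside <- boxes[, 3] <= dim(item$x)[2] & boxes[, 4] <= dim(item$x)[1]
```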

See Also

Other detection_dataset: coco_detection_dataset(), pascal_voc_datasets, rf100_biology_collection(), rf100_damage_collection(), rf100_infrared_collection(), rf100_medical_collection(), rf100_underwater_collection()

Examples

## Not run: 
ds <- rf100_document_collection(
  dataset = "tweeter_post",
  split = "train",
  transform = transform_to_tensor,
  download = TRUE
)

# Retrieve a sample and inspect annotations
item <- ds[1]
item$y$labels
item$y$boxes

# Draw bounding boxes and display the image
boxed_img <- draw_bounding_boxes(item)
tensor_image_browse(boxed_img)

## End(Not run)


torchvision documentation built on Nov. 6, 2025, 9:07 a.m.