classify_files: Classify FTMS files into categories based on filename...

View source: R/classify_files.R

classify_files    R Documentation

Classify FTMS files into categories based on filename patterns

Description

Classifies the rows of fi into categories (blank, standard, pool, sample, …) by applying pattern rules to a configurable search column (search_col). The column whose values are returned for each category is also configurable (id_col).

Usage

classify_files(
  fi,
  search_col = "link_rawdata",
  id_col = "file_id",
  patterns = list(blank = c("blk", "blank", "MQ"), standard = c("srfa", "standard"),
    pool = c("pool")),
  include_blank_check = TRUE,
  return = c("list", "table")
)

Arguments

fi

data.table. Must contain the columns specified in search_col and id_col.

search_col

Character. Name of the column used for pattern matching. Defaults to "link_rawdata".

id_col

Character. Name of the column whose values are returned for each category. Defaults to "file_id".

patterns

Named list of character vectors. Each list entry is a category name, and its value is a vector of patterns.

include_blank_check

Logical. If TRUE and fi contains a column named blank_check, rows with blank_check == "blank" are assigned to the "blank" category in addition to any pattern matches.

return

Either "list" (default) or "table".

  • "list" → named list of ID vectors

  • "table" → fi with added column category_analysis
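
The two return shapes carry the same information in different layouts. As a rough illustration only, assuming the list form simply groups the id_col values by category (the package may order or name the entries differently), a classified table can be collapsed into a list with base R's split(). The table tbl below is hypothetical demo data, not output of classify_files():

```r
# Hypothetical classified table, shaped like the "table" return described above.
tbl <- data.frame(
  file_id           = 1:4,
  category_analysis = c("blank", "standard", "sample", "sample"),
  stringsAsFactors  = FALSE
)

# Collapse to a named list of ID vectors, one entry per category.
ids_by_category <- split(tbl$file_id, tbl$category_analysis)

# IDs of all rows classified as "sample"
ids_by_category$sample
```

The list form is convenient for subsetting by category, while the table form keeps every other column of fi next to the assigned category.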

Details

Default behavior:

  • "blank": blank_check == "blank" or pattern "blk", "blank", or "MQ"

  • "standard": pattern "srfa" or "standard"

  • "pool": pattern "pool"

  • "sample": everything unmatched

Pattern matching is case-insensitive.
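
The rules above can be sketched in base R. This is a hypothetical re-implementation for illustration, not the package's source code: the function name classify_sketch, the use of grepl() with patterns joined by "|", and the precedence between categories when several match are all assumptions.

```r
# Hypothetical sketch of the classification rules; NOT the ume implementation.
classify_sketch <- function(fi,
                            search_col = "link_rawdata",
                            id_col = "file_id",
                            patterns = list(blank    = c("blk", "blank", "MQ"),
                                            standard = c("srfa", "standard"),
                                            pool     = c("pool")),
                            include_blank_check = TRUE) {
  # Every row starts as "sample"; matched rows are overwritten below.
  category <- rep("sample", nrow(fi))

  # Loop in reverse so earlier list entries win when several categories match
  # (this precedence is an assumption of the sketch).
  for (cat_name in rev(names(patterns))) {
    regex <- paste(patterns[[cat_name]], collapse = "|")
    hit <- grepl(regex, fi[[search_col]], ignore.case = TRUE)  # case-insensitive
    category[hit] <- cat_name
  }

  # Optional override from an existing blank_check column.
  if (include_blank_check && "blank_check" %in% names(fi)) {
    category[!is.na(fi$blank_check) & fi$blank_check == "blank"] <- "blank"
  }

  # Named list of ID vectors, one entry per category.
  split(fi[[id_col]], category)
}

# Demo data mirroring the Examples section (plain data.frame to avoid dependencies)
fi <- data.frame(
  file_id      = 1:6,
  link_rawdata = c("NS_blk_01.raw", "SRFA_20.raw", "Pool_A.raw",
                   "Sample_01.raw", "Sample_02.raw", "MQ_blank.raw"),
  blank_check  = c("blank", NA, NA, NA, NA, "blank"),
  stringsAsFactors = FALSE
)

res <- classify_sketch(fi)
res$blank   # file_ids 1 and 6
res$sample  # file_ids 4 and 5
```

Because the patterns are treated as regular expressions here, a short pattern such as "MQ" also matches inside longer names; whether classify_files() uses regular expressions or fixed strings is not stated in this page.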

Value

Depending on return: a named list of ID vectors (one entry per category), or fi as a data.table with the added column category_analysis.

Examples

# Minimal demo data
fi <- data.table::data.table(
  file_id       = 1:6,
  filename      = c("NS_blk_01.raw", "SRFA_20.raw", "Pool_A.raw",
                    "Sample_01.raw", "Sample_02.raw", "MQ_blank.raw"),
  blank_check   = c("blank", NA, NA, NA, NA, "blank"),  # optional column
  link_rawdata  = c("NS_blk_01.raw", "SRFA_20.raw", "Pool_A.raw",
                    "Sample_01.raw", "Sample_02.raw", "MQ_blank.raw")
)

# 1) Default behavior: return named list of file_ids by category
classify_files(fi)

# 2) Use a different column for pattern matching
classify_files(fi, search_col = "filename")

# 3) Return another ID field (here: file_id → stays the same for demo)
classify_files(fi, id_col = "file_id")

# 4) Return the full table with new category column
classify_files(fi, return = "table")

ume documentation built on Dec. 13, 2025, 1:06 a.m.