sanitize_metadata: Given an expressionset, sanitize pData columns of interest.

View source: R/metadata.R

sanitize_metadata    R Documentation

Given an expressionset, sanitize pData columns of interest.

Description

I wrote this function after spending a couple of hours confused because one cell in my metadata said 'cure ' (with a trailing space) instead of 'cure', and I could not figure out why chaos reigned in my analyses. A sister function elsewhere checks that the levels of a metadata factor match the expected ones; it exists because in another analysis a cell said 'cyre' and a similar data explosion occurred.
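The trailing-space problem described above can be reproduced in a few lines of base R. This is only an illustrative sketch: the data frame and its column names are invented, and the cleanup uses base `trimws()` and `tolower()` rather than sanitize_metadata() itself, whose internals may differ.

```r
# Hypothetical metadata: one cell has a trailing space, another a capital letter.
meta <- data.frame(
  sample = c("s1", "s2", "s3", "s4"),
  outcome = c("cure", "cure ", "fail", "Fail"),
  stringsAsFactors = FALSE
)

# Without cleaning, each variant spelling becomes its own factor level.
raw_levels <- levels(factor(meta$outcome))

# Trim surrounding whitespace and lowercase before building the factor.
cleaned <- tolower(trimws(meta$outcome))
clean_levels <- levels(factor(cleaned))

print(length(raw_levels))
print(clean_levels)
```

Any model fit against the uncleaned column would treat 'cure' and 'cure ' as different conditions, which is the kind of silent breakage this function is meant to prevent.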

Usage

sanitize_metadata(
  meta,
  columns = NULL,
  na_value = "notapplicable",
  lower = TRUE,
  punct = TRUE,
  factorize = "heuristic",
  max_levels = NULL,
  spaces = FALSE,
  numbers = NULL,
  numeric = FALSE
)

Arguments

meta

Input metadata

columns

Set of columns to check; if left NULL, all columns will be sanitized.

na_value

Fill NA values with a string.

lower

Set everything to lowercase?

punct

Remove punctuation?

factorize

Set some columns to factors? If set to a vector of length >=1, then set all of the provided columns to factors. When set to 'heuristic', set any columns with <= max_levels different elements to factors.

max_levels

When heuristically setting factors, use this as the threshold; when NULL, it defaults to the number of samples / 6.

spaces

Remove any spaces in the selected columns?

numbers

Sanitize numbers by adding a prefix character to them?

numeric

Recast the values as numeric when possible?
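The 'heuristic' behaviour described for factorize and max_levels can be sketched in base R as follows. The helper name `heuristic_factorize` is hypothetical and this is not the hpgltools implementation; it only assumes what the argument descriptions state, namely that a column becomes a factor when it has at most max_levels distinct values and that a NULL max_levels falls back to the number of samples divided by 6.

```r
# Hypothetical helper illustrating the documented heuristic; not the package code.
heuristic_factorize <- function(meta, max_levels = NULL) {
  # Assumed default from the argument description: number of samples / 6.
  if (is.null(max_levels)) {
    max_levels <- nrow(meta) / 6
  }
  for (col in colnames(meta)) {
    n_distinct <- length(unique(meta[[col]]))
    # Columns with few distinct values are treated as categorical.
    if (n_distinct <= max_levels) {
      meta[[col]] <- as.factor(meta[[col]])
    }
  }
  meta
}

# Invented example: 12 samples, so the default threshold is 2 distinct values.
meta <- data.frame(
  sampleid = paste0("s", 1:12),
  condition = rep(c("cure", "fail"), times = 6),
  stringsAsFactors = FALSE
)
result <- heuristic_factorize(meta)
print(sapply(result, class))
```

In this sketch the two-valued condition column is converted to a factor, while the sampleid column, with one distinct value per sample, stays character.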


elsayed-lab/hpgltools documentation built on May 9, 2024, 5:02 a.m.