find_colname: Find colname by string or pattern

find_colname    R Documentation

Find colname by string or pattern

Description

Find colname by string or pattern, with option to require non-NA values.

Usage

find_colname(
  pattern,
  x,
  max = 1,
  index = FALSE,
  require_non_na = TRUE,
  col_types = NULL,
  exclude_pattern = NULL,
  verbose = FALSE,
  ...
)

Arguments

pattern

character vector of text strings and/or regular expression patterns.

x

data.frame or other object for which colnames(x) is defined.

max

integer maximum number of entries to return.

index

logical indicating whether to return the column index (the column number) instead of the colname.

require_non_na

logical indicating whether to require at least one non-NA value in the matching colname. When require_non_na=TRUE and all values in a column are NA, that colname is not returned by this function.

exclude_pattern

character vector of colnames or patterns to exclude from returned results.

verbose

logical indicating whether to print verbose output.

...

additional arguments are ignored.
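The require_non_na argument described above can be pictured with a few lines of base R. This is only a sketch of the idea, assuming a column counts as usable when it has at least one non-NA value; the data.frame `df` and the variable names are hypothetical and are not taken from the jamma source.

```r
# Hypothetical data: "pvalue" is entirely NA, "P.Value" has real values
df <- data.frame(
   pvalue = c(NA_real_, NA_real_),
   P.Value = c(0.01, 0.2))

# TRUE for each column with at least one non-NA value
has_data <- vapply(df, function(col) any(!is.na(col)), logical(1))

# colnames that would remain eligible under this assumption
keep <- colnames(df)[has_data]
print(keep)
```

Under this assumption only "P.Value" remains eligible, even though "pvalue" would otherwise be a closer match to a pattern such as "pvalue".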

Details

This is a simple utility function that finds the most appropriate matching colname given one or more character strings or patterns.

By default (max=1) it returns only the best match. Use a larger max, for example max=Inf, to return multiple matches in order of preference.

The order of matching:

  1. Match the exact colname.

  2. Match case-insensitive colname.

  3. Match the beginning of each colname.

  4. Match the end of each colname.

  5. Match anywhere in each colname.
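The matching order above can be approximated in base R. The following is a minimal sketch, not the jamma implementation: `sketch_find_colname()` is a hypothetical helper, and it assumes each tier is tried for every pattern before moving on to the next tier. It ignores require_non_na, exclude_pattern, and index.

```r
# Hypothetical helper (not part of jamma): rank colnames by match tier
sketch_find_colname <- function(pattern, cn, max = 1) {
   hits <- character(0)
   # tier 1: exact match
   hits <- c(hits, cn[cn %in% pattern])
   # tier 2: case-insensitive exact match
   hits <- c(hits, cn[tolower(cn) %in% tolower(pattern)])
   # tiers 3 to 5: regex anchored at the start, at the end, then unanchored
   for (fmt in c("^%s", "%s$", "%s")) {
      for (p in pattern) {
         hits <- c(hits,
            grep(sprintf(fmt, p), cn, ignore.case = TRUE, value = TRUE))
      }
   }
   # drop duplicates, keeping the first (best tier) occurrence
   hits <- unique(hits)
   hits[seq_len(min(max, length(hits)))]
}

cn <- c("adj.P.Val", "P.Value")
sketch_find_colname("p.val", cn)
# "P.Value"
sketch_find_colname("p.val", cn, max = Inf)
# "P.Value" "adj.P.Val"
```

Collecting hits tier by tier and then calling unique() keeps each colname at the position of its best tier, which is what produces the order of preference in this sketch.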

The goal is to supply a pattern vector such as c("p.value", "pvalue", "pval") and find colnames with variations such as:

  • P.Value

  • P.Value Group-Control

  • Group-Control P.Value

  • pvalue

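Even if the data contains c("P.Value", "adj.P.Val"), as returned by limma::topTable() for example, the pattern c("p.val") will preferentially match "P.Value" and not "adj.P.Val". This follows from the matching order: "P.Value" matches at the beginning of the colname (step 3), while "adj.P.Val" only matches at the end (step 4).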
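The preference for "P.Value" can be checked by hand with base R regular expressions. This sketch assumes the begin and end steps behave like case-insensitive patterns anchored with "^" and "$"; it does not call find_colname() itself.

```r
cn <- c("P.Value", "adj.P.Val")

# does the colname begin with the pattern? (earlier, preferred step)
starts <- grepl("^p.val", cn, ignore.case = TRUE)

# does the colname end with the pattern? (later step)
ends <- grepl("p.val$", cn, ignore.case = TRUE)

print(data.frame(colname = cn, starts_with = starts, ends_with = ends))
```

Only "P.Value" matches at the beginning, so it ranks ahead of "adj.P.Val", which matches only at the end. Note that "." is a regular expression wildcard here, so "p.val" would also match text such as "p_val".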

See Also

Other jam utility functions: blockArrowMargin(), fold_to_log2fold(), get_se_assaydata(), gradient_rect(), handle_highlightPoints(), log2fold_to_fold(), logAxis(), outer_legend(), points2polygonHull(), update_function_params(), update_list_elements()

Examples

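## simulate a small results table; set.seed() makes the random values reproducible
set.seed(123);
x <- data.frame(
   `Gene`=paste0("gene", LETTERS[1:25]),
   `log2fold Group-Control`=rnorm(25)*2,
   `P.Value Group-Control`=10^-rnorm(25)^2,
   check.names=FALSE);
## add a fold change column, and a placeholder adjusted P-value column
## (copied from the P.Value column)
x[["fold Group-Control"]] <- log2fold_to_fold(x[["log2fold Group-Control"]]);
x[["adj.P.Val Group-Control"]] <- x[["P.Value Group-Control"]];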

print(head(x));
find_colname(c("p.val", "pval"), x);
find_colname(c("fold", "fc", "ratio"), x);
find_colname(c("logfold", "log2fold", "lfc", "log2ratio", "logratio"), x);

## use exclude_pattern
## y omits column 3, so it has no "P.Value" column but still has "adj.P.Val"
y <- x[,c(1,2,4,5)];
print(head(y));
find_colname(c("p.val"), y, exclude_pattern=c("adj"));


jmw86069/jamma documentation built on Oct. 11, 2024, 7:08 a.m.