locfdr_filter: Filtering variable based on local false discovery rate

View source: R/mt_extra.R

locfdr_filter    R Documentation

Description

Filter data based on the local false discovery rate. By default, the function uses z.2, as returned by locfdr, as the filtering threshold.

Usage

locfdr_filter(x, plot = 1, thres = NULL, ...)

Arguments

x

a data matrix

plot

an integer controlling plotting. 0 gives no plots. 1 gives a single plot showing the histogram of zz and the fitted densities f and p0*f0.

thres

a user-defined threshold for filtering. The default is NULL, in which case the z.2 interval estimated by locfdr is used as the threshold.

...

other parameters to be passed to locfdr.
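The choice between a user-supplied thres and the z.2 estimate can be sketched as follows. This is only an illustration, not the package source: resolve_thres is a hypothetical helper, the two-element (lower, upper) form of the threshold is assumed from the description of z.2, and the fallback interval c(-2, 2) is arbitrary. The locfdr call runs only if that package is installed.

```r
# Hypothetical helper (not part of mtExtra): choose the threshold interval.
# `user_thres` plays the role of `thres`; `z2` stands for locfdr's z.2.
resolve_thres <- function(user_thres = NULL, z2 = c(NA_real_, NA_real_)) {
  if (!is.null(user_thres)) {
    return(user_thres)            # a user-supplied threshold takes priority
  }
  if (anyNA(z2)) {
    # Assumption: a z.2 with any NA bound is treated as unusable here.
    stop("z.2 contains NA; please supply 'thres' explicitly")
  }
  z2
}

# Optional: obtain z.2 from locfdr when the package is available.
if (requireNamespace("locfdr", quietly = TRUE)) {
  set.seed(1)
  zz <- c(rnorm(1900), rnorm(100, mean = 3))   # simulated z-values
  fit <- locfdr::locfdr(zz, plot = 0)
  print(fit$z.2)
  # Fall back to an arbitrary interval if either bound is NA.
  thres <- tryCatch(resolve_thres(z2 = fit$z.2),
                    error = function(e) c(-2, 2))
  print(thres)
}
```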

Details

  • Keep the variables that have at least one significant element. An element is significant if it lies outside the threshold interval, i.e. it is less than the lower threshold or greater than the upper threshold.

  • The threshold can be given by the user or estimated by locfdr, in which case the returned z.2 is used as the threshold. locfdr is not guaranteed to return a usable z.2; if it does not, the user must provide the threshold.

  • From R package locfdr vignette: z.2 is the interval along the zz-axis outside of which fdr(z) < 0.2, the locations of the yellow triangles in the histogram plot. If no elements of zz on the left or right satisfy the criterion, the corresponding element of z.2 is NA, and the corresponding triangle does not appear.
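A minimal base R sketch of the filtering rule, under stated assumptions: since fdr(z) < 0.2 holds outside the z.2 interval, values outside the interval are treated as significant; variables are taken to be the columns of x, as the Examples suggest. The helper outside_any, the toy matrix, and the interval c(-2, 2) are hypothetical and are not taken from the package source.

```r
# Hypothetical helper (not the mtExtra source): flag the columns of `x`
# that contain at least one value outside the interval [lower, upper].
outside_any <- function(x, thres) {
  apply(x, 2, function(v) any(v < thres[1] | v > thres[2], na.rm = TRUE))
}

# Toy matrix: 3 observations (rows) by 4 variables (columns),
# filled column by column.
x <- matrix(c( 0.1, -0.5,  0.3,
               2.9,  0.2, -0.1,
              -0.4,  0.0,  0.6,
               0.5, -3.2,  1.0),
            nrow = 3,
            dimnames = list(NULL, c("a", "b", "c", "d")))

thres <- c(-2, 2)                 # assumed (lower, upper) interval
idx <- outside_any(x, thres)      # a = FALSE, b = TRUE, c = FALSE, d = TRUE
x_kept <- x[, idx, drop = FALSE]  # keeps columns "b" and "d"
print(idx)
```

Using drop = FALSE keeps the result a matrix even when only one variable passes the filter.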

Value

a list with the following contents:

  • dat the filtered data matrix.

  • idx the index vector used for filtering.

  • thres threshold used for filtering.

See Also

locfdr()

Other variable filters: blank_filter(), mv_filter(), qc_filter(), rsd_filter(), var_filter()

Examples

## Not run: 
library(dplyr)
library(tidyr)
library(purrr)
library(readr)

## get ionomics data
dat <- read_csv("https://github.com/wanchanglin/ionflow/raw/master/extra/paper_ko.csv")
dim(dat)

## fill missing values in each numeric column with the column mean
dat <- dat %>% 
  mutate(across(where(is.numeric), function(x) {
    m <- mean(x, na.rm = TRUE)
    x[is.na(x)] <- m
    x
  }))
dat

res <- locfdr_filter(t(dat[, -1]), plot = 1)
res$thres

## filter data
dat <- dat[res$idx, , drop = FALSE]

## symbolise data
dat_sym <- dat %>% 
  mutate(across(where(is.numeric), ~ dat_symb(., thres = res$thres)))

## End(Not run)

wanchanglin/mtExtra documentation built on Aug. 2, 2024, 5:47 p.m.