prune: R-squared pruning

R-squared pruning

Description

Pruning features using an R-squared threshold and maximum distance

Usage

Prune(X, alpha = 0.95, 
      pos = NULL, d.max = NULL, 
      centered = FALSE, scaled = FALSE,
      verbose = FALSE) 

Arguments

X

(numeric matrix) A matrix with observations in rows and features (e.g., SNPs) in columns

alpha

(numeric) R-squared threshold used to determine connected sets

pos

(numeric vector) Optional vector with positions (e.g., bp) of features

d.max

(numeric) Maximum distance, in the units of pos, within which features can be treated as connected

centered

TRUE or FALSE, indicating whether the columns of X are centered (mean zero)

scaled

TRUE or FALSE, indicating whether the columns of X are scaled (unit standard deviation)

verbose

TRUE or FALSE, indicating whether to show progress

Details

The algorithm identifies sets of connected features, defined as features that share an R^2 > alpha, and retains only one feature (the first to appear) from each set.

When pos and d.max are provided, the sets can be restricted to features that lie within a distance less than or equal to d.max of each other.
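
The pruning idea described above can be illustrated with a minimal base-R sketch. This is not the SFSI implementation: prune_sketch is a hypothetical helper, and it assumes that a feature is dropped only when it is directly linked (R^2 > alpha) to an earlier retained feature, which may differ from how Prune builds its connected sets. The toy data are simulated for illustration only.

```r
# Hypothetical helper, NOT part of SFSI: greedy R^2 pruning in base R.
# Assumption: only direct links to an earlier retained feature cause a drop.
prune_sketch <- function(X, alpha = 0.95, pos = NULL, d.max = NULL) {
  R2 <- unname(cor(X)^2)              # pairwise squared correlations
  p <- ncol(X)
  dropped <- rep(FALSE, p)
  for (i in seq_len(p)) {
    if (dropped[i]) next              # already linked to an earlier retained feature
    linked <- R2[i, ] > alpha         # features connected to feature i
    if (!is.null(pos) && !is.null(d.max)) {
      # Optionally require connected features to lie within d.max of feature i
      linked <- linked & (abs(pos - pos[i]) <= d.max)
    }
    linked[seq_len(i)] <- FALSE       # never drop feature i or earlier features here
    dropped <- dropped | linked
  }
  list(prune.in = which(!dropped), prune.out = which(dropped))
}

# Simulated toy data: columns 2 and 4 are (near-)perfectly correlated with column 1
set.seed(1)
z <- rnorm(50)
X <- cbind(z, z + rnorm(50, sd = 0.01), rnorm(50), -z)

res <- prune_sketch(X, alpha = 0.9)
res$prune.in    # columns 1 and 3 are retained
res$prune.out   # columns 2 and 4 are dropped

# With positions, column 4 is too far from column 1 to be linked to it
res2 <- prune_sketch(X, alpha = 0.9, pos = c(1, 2, 3, 100), d.max = 10)
res2$prune.in   # columns 1, 3 and 4 are retained
```

Because features are visited in column order, the retained member of each group is always the one that appears first, which matches the "first appearance" rule stated above.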

Value

Returns a list object that contains the elements:

  • prune.in: (vector) indices of selected (unconnected) features.

  • prune.out: (vector) indices of dropped out features.

Examples

  library(SFSI)
  data(wheatHTP)         # Provides the marker matrix M used below
  
  index = c(154:156,201:205,306:312,381:387,540:544)
  X = M[,index]          # Subset markers
  colnames(X) = 1:ncol(X)
  
  # See connected sets using R^2=0.8
  R2thr = 0.8
  R2 = cor(X)^2
  nw1 = net(R2, delta=R2thr) 
  plot(nw1, show.names=TRUE)

  # Get pruned features
  res = Prune(X, alpha=R2thr)

  # See selected (unconnected) features
  nw2 = net(R2[res$prune.in,res$prune.in], delta=R2thr) 
  nw2$xy = nw1$xy[res$prune.in,]
  plot(nw2, show.names=TRUE)


SFSI documentation built on Nov. 18, 2023, 9:06 a.m.