tgs_knn: Returns k highest values of each column
In tgstat: Amos Tanay's Group High Performance Statistical Utilities

Description

Returns the k highest values of each column.

Usage

tgs_knn(x, knn, diag = FALSE, threshold = 0)

Arguments

`x`	numeric matrix or data frame (see below)
`knn`	the number of highest values returned per column
`diag`	if 'FALSE', the diagonal values (row 'i', column 'j' with i == j) are skipped
`threshold`	filter out values lower than threshold

Details

'tgs_knn' returns the 'knn' highest values of each column of 'x' when 'x' is a matrix. 'x' can also be a sparse matrix given as a data frame in 'col', 'row', 'value' format.
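
The 'col', 'row', 'value' layout can be illustrated with base R alone. The sketch below builds such a data frame from a small dense matrix; the matrix 'm' and the name 'sparse_df' are hypothetical, and it assumes that 'col', 'row' and 'value' are the data frame's column names and that zero entries are the ones left out. It does not call 'tgs_knn'.

```r
# Hypothetical 3 x 2 dense matrix, used only for illustration
# (filled column by column: column 1 is 5, 0, 2 and column 2 is 0, 7, 1)
m <- matrix(c(5, 0, 2,
              0, 7, 1), nrow = 3, ncol = 2)

# Positions of the non-zero entries as (row, col) index pairs
idx <- which(m != 0, arr.ind = TRUE)

# One row per stored entry; treating 'col', 'row', 'value' as column
# names is an assumption based on the format named in the text
sparse_df <- data.frame(
    col   = idx[, "col"],
    row   = idx[, "row"],
    value = m[idx]
)

print(sparse_df)
```

Each row of 'sparse_df' describes one stored entry, so only the cells that are present need to be listed.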

'NA' and 'Inf' values are skipped, as are values below 'threshold'. If 'diag' is 'FALSE', values on the diagonal (row == col) are skipped too.
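
The skipping rules above can be sketched for a single column in base R. This is an illustration only, not the tgstat implementation: the helper 'top_k_column' and its argument 'j' (the column's own index) are hypothetical, and treating '-Inf' and 'NaN' like 'Inf', as well as the ordering of tied values, are assumptions.

```r
# Base-R sketch of the skipping rules for one column 'v' that sits at
# column index 'j' of a matrix. Hypothetical helper, not part of tgstat.
top_k_column <- function(v, j, k, diag = FALSE, threshold = 0) {
    # Drop NA, NaN, Inf and -Inf, and values lower than 'threshold'
    keep <- is.finite(v) & v >= threshold
    # With diag = FALSE, also drop the diagonal entry (row == col)
    if (!diag && j <= length(v)) {
        keep[j] <- FALSE
    }
    rows <- which(keep)
    # Order the surviving rows by value, highest first
    rows <- rows[order(v[rows], decreasing = TRUE)]
    # Return at most k row indices
    head(rows, k)
}

v <- c(4, NA, 9, Inf, -1, 7)
top_k_column(v, j = 3, k = 2)               # rows 6 and 1: row 3 is the diagonal
top_k_column(v, j = 3, k = 2, diag = TRUE)  # rows 3 and 6
```

In the first call the value 9 in row 3 is ignored because it lies on the diagonal, so the two highest remaining values are 7 and 4.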

Value

A sparse matrix in data frame format with 'col1', 'col2', 'val' and 'rank' columns. 'col1' and 'col2' hold the column and the row number of 'x', respectively.

Examples


# Note: all the available CPU cores might be used

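set.seed(seed = 1)
rows <- 100
cols <- 1000
# Random integer values for a 100 x 1000 matrix
vals <- sample(1:(rows * cols / 2), rows * cols, replace = TRUE)
m <- matrix(vals, nrow = rows, ncol = cols)
# Set 100 randomly chosen cells (0.1% of the matrix) to NA
m[sample(1:(rows * cols), rows * cols / 1000)] <- NA
# Keep the 3 highest values of each column
r <- tgs_knn(m, 3)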

tgstat documentation built on Sept. 30, 2024, 9:17 a.m.