get_problems: Summarize potential problems in a distance matrix



Description

For the individuals represented in a distance matrix, collect the self-self, best, and 2nd-best distances, and summarize the results in a data frame.

Usage

get_problems(
  d,
  dimension = c("row", "column"),
  get_min = TRUE,
  subset = c("problems", "all"),
  threshold = 0
)

Arguments

d

A distance or similarity matrix

dimension

Whether to determine the best distances within rows or columns

get_min

If TRUE, get the minimum (for a distance matrix); if FALSE, get the maximum (for a similarity matrix)

subset

Whether to return just the rows with potential problems, or all of the rows.

threshold

Used only if subset="problems": the threshold on the difference between the self and best distances for an individual to be reported as a potential problem.

Value

A data frame containing the individual ID, the distance to self, the best distance and the corresponding individual, and the 2nd-best distance and the corresponding individual.
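To make the shape of this summary concrete, here is a minimal base-R sketch that builds a similar table by hand for a toy distance matrix. It is not the package's implementation. The matrix `d`, the column names of `summary_df`, and the filtering rule (self minus best greater than the threshold, with the best taken over all columns including self) are assumptions made for illustration only.

```r
# Toy 3 x 3 distance matrix (hypothetical data); rows and columns
# refer to the same individuals, in the same order
d <- matrix(c(0.1, 0.5, 0.9,
              0.7, 0.6, 0.2,
              0.8, 0.3, 0.4),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

ids <- rownames(d)
rows <- seq_along(ids)

# Self-self distances: the diagonal, since rows and columns are aligned
self <- diag(d)

# For each row, column indices sorted by increasing distance
ord <- t(apply(d, 1, order))
best_idx <- ord[, 1]
second_idx <- ord[, 2]

# Column names here are illustrative, not necessarily those of get_problems()
summary_df <- data.frame(
  ind = ids,
  self = self,
  best = d[cbind(rows, best_idx)],
  which_best = colnames(d)[best_idx],
  second_best = d[cbind(rows, second_idx)],
  which_second_best = colnames(d)[second_idx],
  stringsAsFactors = FALSE
)

# Assumed rule for a distance matrix: flag an individual when its self
# distance exceeds its best distance by more than the threshold
threshold <- 0.3
problems <- summary_df[summary_df$self - summary_df$best > threshold, ]
```

For a similarity matrix, the same sketch would sort in decreasing order and flag rows where the best value exceeds the self value by more than the threshold.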

See Also

get_self(), get_best(), get_2ndbest(), which_best(), get_nonself()

Examples

library(lineup2)

# align rows in the provided dataset, lineup2ex
aligned <- align_matrix_rows(lineup2ex$gastroc, lineup2ex$islet)
# find correlated columns
selected_genes <- (corr_betw_matrices(aligned[[1]], aligned[[2]], "paired") > 0.75)
# calculate correlation between rows
similarity <- corr_betw_matrices(t(lineup2ex$gastroc[,selected_genes]),
                                 t(lineup2ex$islet[,selected_genes]), "all")
# pull out the problems, looking by row (where best > self + 0.3)
problems_byrow <- get_problems(similarity, get_min=FALSE, threshold=0.3)

# pull out the problems, looking by column (where best > self + 0.3)
problems_bycol <- get_problems(similarity, get_min=FALSE, threshold=0.3,
                               dimension="column")

lineup2 documentation built on June 15, 2021, 9:07 a.m.