find_similar_clms: Finds columns that have a high degree of correlation with...

View source: R/find_similar_clms.R

find_similar_clms    R Documentation

Finds columns that have a high degree of correlation with this_clm

Description

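Takes this_clm and compares it to the other columns of my_df using stats::cor.test. If the absolute value of the correlation estimate (rho for the default "spearman" method) for a column is greater than acceptable_similarity, that column is considered similar. Depending on return_boolean, either a boolean is returned as TRUE or the similar columns are returned.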

Usage

find_similar_clms(
  this_clm,
  my_df,
  test_clms = NULL,
  acceptable_similarity = 0.9,
  return_boolean = T,
  my_key = get_default_sample_key(),
  corr_method = "spearman"
)

Arguments

this_clm

string to specify column that will be compared to the others

my_df

data.frame to search

test_clms

character vector of columns to test. If left NULL, all columns except this_clm and my_key will be checked.

acceptable_similarity

number to indicate the maximum absolute rho value that will be allowed between this_clm and any other column.

return_boolean

boolean to specify whether a boolean is returned or the found columns (found_clms) are returned

my_key

string to specify key column of the data.frame

corr_method

string to specify the test used for the correlation. Passed to method arg of stats::cor.test
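
The behavior described above can be illustrated with a minimal sketch in base R. This is not the binfotron source: the helper name sketch_similar_clms, the demo data, and the "sample_id" key are all hypothetical, and the real function may handle missing values, non-numeric columns, or the no-match case differently. It only shows the idea of running stats::cor.test against each candidate column and keeping those whose absolute estimate exceeds the threshold.

```r
# Hypothetical re-creation of the documented behavior (an assumption, not the
# package's implementation). "sample_id" stands in for whatever
# get_default_sample_key() returns in binfotron.
sketch_similar_clms <- function(this_clm, my_df, test_clms = NULL,
                                acceptable_similarity = 0.9,
                                return_boolean = TRUE,
                                my_key = "sample_id",
                                corr_method = "spearman") {
  # Default: test every column except the reference column and the key.
  if (is.null(test_clms)) {
    test_clms <- setdiff(names(my_df), c(this_clm, my_key))
  }
  # Correlation estimate of this_clm against each candidate column.
  # exact = FALSE skips exact p-value computation, which is not needed here.
  estimates <- vapply(test_clms, function(clm) {
    unname(stats::cor.test(my_df[[this_clm]], my_df[[clm]],
                           method = corr_method, exact = FALSE)$estimate)
  }, numeric(1))
  # which() drops NA estimates instead of propagating them.
  found_clms <- test_clms[which(abs(estimates) > acceptable_similarity)]
  if (return_boolean) length(found_clms) > 0 else found_clms
}

# Made-up demo data: b increases monotonically with a, c does not.
demo_df <- data.frame(
  sample_id = paste0("s", 1:8),
  a = c(1, 2, 3, 4, 5, 6, 7, 8),
  b = c(2, 4, 6, 8, 10, 12, 14, 17),
  c = c(5, 1, 8, 2, 7, 3, 6, 4)
)

any_similar <- sketch_similar_clms("a", demo_df)                      # TRUE
which_similar <- sketch_similar_clms("a", demo_df,
                                     return_boolean = FALSE)          # "b"
```

With the default Spearman method only the rank order matters, so b is flagged even though it is not an exact linear function of a. Switching corr_method to "pearson" would compare linear association instead.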


Benjamin-Vincent-Lab/binfotron documentation built on Oct. 1, 2024, 8:33 p.m.