cor.parallell: Faster R cor() function using parallelization

Description

This function iterates over one row at a time and writes each correlation to a file. Memory use stays low because only one row is read and processed at a time, and each correlation is appended to the file instead of accumulating in memory.
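
The row-at-a-time idea can be illustrated with a minimal sequential sketch in base R. This is not the package's implementation (it does no parallel work); the toy matrix `mat`, the row names `gene1` to `gene4`, and the temporary output file are assumptions made up for illustration.

```r
# Hypothetical toy data: rows are variables, columns are samples.
set.seed(1)
mat <- matrix(rnorm(40), nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("s", 1:10)))

out <- tempfile(fileext = ".txt")  # stand-in for the `file` argument
target <- mat["gene1", ]           # stand-in for the `var` argument

# Compute one correlation per row and append it to the file right away,
# so results never accumulate in memory.
for (v in rownames(mat)) {
  r <- cor(target, mat[v, ], method = "pearson", use = "na.or.complete")
  cat(v, "\t", r, "\n", sep = "", file = out, append = TRUE)
}

# Read the tab-separated results back in for inspection.
res <- read.table(out, sep = "\t", col.names = c("variable", "cor"))
```

Each loop iteration holds only a single correlation value, which is the property the description relies on for low memory use.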

Usage

cor.parallell(df, var, var.list = NULL, file = "test.txt",
  correlation_type = "pearson", read.file = F, annotate = F,
  no_cores = "", use = "na.or.complete")

Arguments

df

a numeric data frame or matrix with rows and columns corresponding to variables and samples, respectively.

var

variable to do the correlation with. Must be part of df.

var.list

which variables to correlate to. Defaults to row.names.

file

txt file for storing results

correlation_type

correlation methods may be one of "pearson" (default), "kendall", "spearman".
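
As a rough illustration of how the three methods differ, the sketch below calls base R `cor()` directly on made-up vectors (`x` and `y` are assumptions, not package data). The rank-based methods treat a monotone but non-linear relationship as perfect, while Pearson does not.

```r
x <- c(1, 2, 3, 4, 5)
y <- x^2  # monotone in x, but not linear

pearson  <- cor(x, y, method = "pearson")   # linear association, below 1 here
spearman <- cor(x, y, method = "spearman")  # rank-based, equals 1 here
kendall  <- cor(x, y, method = "kendall")   # concordance-based, equals 1 here
```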

read.file

defaults to FALSE. If TRUE, then assigns object to global environment using the name specified in the file

no_cores

number of cores to use. May be specified manually; by default, all available cores minus one are used.
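
One plausible way to arrive at an "all available cores minus one" default is sketched below with the base `parallel` package. Whether the function computes it exactly this way is an assumption; the guard against `NA` and against single-core machines is added here for safety.

```r
library(parallel)

# detectCores() may return NA on some platforms, so fall back to 1.
n_available <- detectCores()
no_cores <- max(1L, n_available - 1L, na.rm = TRUE)
```

A value obtained this way could then be passed to `parallel::makeCluster()`, or supplied explicitly through the `no_cores` argument.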

use

an optional character string giving a method for computing covariances in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". See detailed description in ?cor
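
The effect of `use` can be seen with base R `cor()` on small made-up vectors (the data below are assumptions for illustration). With `"na.or.complete"`, incomplete pairs are dropped, and `NA` is returned instead of an error when no complete pairs remain.

```r
x <- c(1, 2, NA, 4, 5)
y <- c(2, 4, 6, NA, 10)

# Only the three complete pairs (1,2), (2,4), (5,10) are used.
r_complete <- cor(x, y, use = "na.or.complete")

# With "everything", any missing value propagates to the result.
r_everything <- cor(x, y, use = "everything")

# No complete pairs at all: "na.or.complete" yields NA rather than an error.
all_na <- cor(c(NA_real_, NA_real_), c(1, 2), use = "na.or.complete")
```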

Author(s)

Peter Utnes utnesp@gmail.com

Examples

cor.parallell(counts, "ENSG00000134323", file = "/path/to/file/MYCN.cor.txt")

# If annotating ensembl_gene_ids, make sure a default mart has been set
# (useMart() comes from the biomaRt package):
if (!exists("mart")) {
    mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL", dataset = "hsapiens_gene_ensembl")
}

utnesp/Faster-R-cor-function documentation built on May 3, 2019, 2:39 p.m.