big.t: Transpose function for big.matrix objects


View source: R/bigpca.R

Description

At the time of writing, there is no transpose method for big.matrix objects. This function returns a new filebacked big.matrix that is the transpose of the input big.matrix. The method is non-native: it does not use the raw C objects from the bigmemory package, only standard R accessors and operations, and it is designed to be efficient in both memory use and speed. A blank matrix is created on disk, and the data are transposed block by block and buffered into the new matrix. The max.gb argument allows periodic manual flushing of memory, in case the built-in memory management of R/bigmemory is not working as desired.
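The block-wise approach described above can be sketched in base R. This is only an illustration of the idea, not the code used by big.t: the helper name blockwise_transpose and its block_size argument are hypothetical, and an ordinary in-memory matrix stands in for the filebacked target so that the sketch runs without bigmemory.

```r
# Hypothetical helper (not part of bigpca): transpose `src` one block of
# columns at a time, using only standard `[` accessors.
blockwise_transpose <- function(src, block_size = 100) {
  n_row <- nrow(src)
  n_col <- ncol(src)
  # Pre-allocate the target with swapped dimensions.
  # (big.t creates this blank matrix on disk; here it is held in RAM.)
  out <- matrix(NA_real_, nrow = n_col, ncol = n_row)
  for (start in seq(1, n_col, by = block_size)) {
    end <- min(start + block_size - 1, n_col)
    # Read one block of columns, transpose it in memory,
    # and write it into the matching rows of the target.
    out[start:end, ] <- t(src[, start:end, drop = FALSE])
  }
  # Row names become column names and vice versa.
  dimnames(out) <- rev(dimnames(src))
  out
}

# Usage: a small named matrix, transposed in blocks of 7 columns.
m <- matrix(rnorm(20 * 50), nrow = 20,
            dimnames = list(paste0("r", 1:20), paste0("c", 1:50)))
tm <- blockwise_transpose(m, block_size = 7)
```

Only one block is held in working memory at a time, which is what keeps the memory footprint bounded when the source and target are filebacked rather than in-memory matrices.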

Usage

big.t(bigMat, dir = NULL, name = "t.bigMat", R.descr = NULL,
  max.gb = NA, verbose = F, tracker = NA, file.ok = T,
  delete.existing = getOption("deleteFileBacked"))

Arguments

bigMat

a big.matrix by default; if 'file.ok' is TRUE, this can also be a big.matrix descriptor or a file location

dir

the directory for the matrix backing file (preferably for both the original and the proposed transposed matrix). If this is left NULL and bigMat contains a path, this path (via dirname(bigMat)) will be used; if it doesn't contain a path the current working directory will be used

name

the basename of the new transposed matrix

R.descr

the name of a binary file that will store the big.matrix.descriptor for the transposed matrix. If "" then the descriptor won't be saved. If NULL, then it will be <name>.RData

max.gb

the maximum number of GB of data to process before flushing the big.matrix

verbose

whether to print messages about each stage of the process

tracker

whether to use a progress bar. NA means it will only be used if the matrix in question is larger than 1GB.

file.ok

whether to accept big.matrix descriptors or filenames as input for 'bigMat'; if TRUE, anything that works with get.big.matrix(bigMat, dir) is acceptable

delete.existing

logical, whether to automatically delete existing filebacked matrices before rewriting them. This is needed because, since an update on 20th October 2015, bigmemory no longer allows an existing filebacked matrix to be overwritten. To make this always TRUE or always FALSE, set the option once with options(deleteFileBacked = TRUE) or options(deleteFileBacked = FALSE)
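The default rules described for dir, R.descr and delete.existing can be sketched in base R. The helper below is hypothetical and is not part of bigpca; it only mirrors the wording of the argument descriptions above, and the real function may resolve these values differently.

```r
# Hypothetical helper (not part of bigpca): resolve defaults the way the
# argument descriptions above state them.
resolve_big_t_defaults <- function(bigMat, dir = NULL, name = "t.bigMat",
                                   R.descr = NULL,
                                   delete.existing = getOption("deleteFileBacked")) {
  if (is.null(dir)) {
    # Use the path embedded in a file location, else the working directory.
    has_path <- is.character(bigMat) && dirname(bigMat) != "."
    dir <- if (has_path) dirname(bigMat) else getwd()
  }
  # NULL means "<name>.RData"; an empty string means "do not save".
  if (is.null(R.descr)) R.descr <- paste0(name, ".RData")
  list(dir = dir,
       R.descr = R.descr,
       save.descriptor = nzchar(R.descr),
       delete.existing = delete.existing)
}

# Usage: set the session-wide option, then resolve defaults for a file path.
old_opts <- options(deleteFileBacked = TRUE)
res <- resolve_big_t_defaults("/tmp/data/test.dsc")
res_nosave <- resolve_big_t_defaults("test.dsc", R.descr = "")
options(old_opts)  # restore the previous setting
```

Setting the option once per session avoids passing delete.existing to every call.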

Value

A big.matrix that is the transpose (rows and columns switched) of the original matrix

Examples

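library(bigpca)  # also attaches bigmemory, which provides filebacked.big.matrix()
orig.dir <- getwd(); setwd(tempdir())  # work in a temporary directory
# remove leftovers from a previous run
if(file.exists("test.bck")) { unlink(c("test.bck","test.dsc")) }
# create a 200 x 500 filebacked matrix with row and column names
bM <- filebacked.big.matrix(200, 500,
       dimnames = list(paste("r",1:200,sep=""), paste("c",1:500,sep="")),
       backingfile = "test.bck",  backingpath = getwd(), descriptorfile = "test.dsc")
bM[1:200,] <- replicate(500,rnorm(200))  # fill with standard normal values
prv.big.matrix(bM)               # preview the original matrix
tbM <- big.t(bM,verbose=TRUE)    # transpose; writes the t.bigMat.* files
prv.big.matrix(tbM)              # preview the transposed matrix
rm(tbM)
rm(bM)
# delete the files created above and restore the working directory
unlink(c("t.bigMat.RData","t.bigMat.bck","t.bigMat.dsc","test.bck","test.dsc"))
setwd(orig.dir)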

Example output

Loading required package: reader
Loading required package: NCmisc

Attaching package: 'reader'

The following objects are masked from 'package:NCmisc':

    cat.path, get.ext, rmv.ext

Loading required package: bigmemory
Loading required package: biganalytics
Loading required package: foreach
Loading required package: biglm
Loading required package: DBI
Warning messages:
1: replacing previous import 'reader::cat.path' by 'NCmisc::cat.path' when loading 'bigpca' 
2: replacing previous import 'reader::get.ext' by 'NCmisc::get.ext' when loading 'bigpca' 
3: replacing previous import 'reader::rmv.ext' by 'NCmisc::rmv.ext' when loading 'bigpca' 
Big matrix with: 200 rows, 500 columns
 - data type: numeric 

              colnames 
Row# rownames       c1       c2  .....      c500 
   1       r1  -0.7424  -0.3242   ...    -0.7402 
   2       r2   0.5503   0.9438   ...     0.0672 
   3       r3  -0.2551  -2.2763   ...     0.1718 
  ..     ....      ...      ...   ...        ... 
 200     r200   0.1255  -0.3663   ...     1.3799 

 creating 500 x 200 target matrix, t.bigMat ...done

Adding names
 added colnames
 added rownames
 transposing 'bigMat' into new big.matrix object:
 combining complete, converting result to big matrix
 created big.matrix description file: t.bigMat.dsc 
 created big.matrix backing file: t.bigMat.bck 
 created big.matrix binary description file: t.bigMat.RData 
Warning message:
In big.t(bM, verbose = TRUE) :
  number of columns quite small, may cause issues
Big matrix with: 500 rows, 200 columns
 - data type: numeric 

              colnames 
Row# rownames       r1       r2  .....      r200 
   1       c1  -0.7424   0.5503   ...     0.1255 
   2       c2  -0.3242   0.9438   ...    -0.3663 
   3       c3  -0.3374  -1.5546   ...     -0.685 
  ..     ....      ...      ...   ...        ... 
 500     c500  -0.7402   0.0672   ...     1.3799 

bigpca documentation built on Nov. 22, 2017, 1:02 a.m.