In montilab/BS831: Boston University - Genomics Data Mining

knitr::opts_chunk$set(message=FALSE, warning=FALSE)

Here we show a simple example of the gains in efficiency that can be achieved by taking advantage of matrix operations rather than multiple calls to a function. For more on the topic, see this RCS module.

In this example, we compare three approaches to computing the row sums of a large matrix. In the first (inefficient) approach, we make use of a double for loop. In the second (still inefficient) approach, we make use of the function apply. In the third (most efficient) approach, we make use of the cartesian product operator %*%.

Use of `for` loop

set.seed(123) # for reproducible results
N <- 10000000 # 10^7
X <- matrix(rnorm(N),nrow=N/10,ncol=10) # a 10^6-by-10 matrix

## use of for loop
forSum <- function(X)
{
    rowS <- rep(0,nrow(X))
    for ( i in 1:nrow(X) )
        for ( j in 1:ncol(X) )
            rowS[i] <- rowS[i] + X[i,j]
    return( rowS )
}
Tfor <- system.time(tmp0 <- forSum(X))
print(Tfor)

Use of `apply`

## use of apply function
Tsum <- system.time(tmp1 <- apply(X,1,sum))
print(Tsum)

Use of inner product (`%*%`)

## use of matrix multiplication
In <- rep(1,10) # define a unit vector
Tprd <- system.time(tmp2 <- X %*% In)
print(Tprd)

## let's measure the speed-up
Tsum["sys.self"]/Tprd["sys.self"] # CPU
Tsum["user.self"]/Tprd["user.self"] # CPU + R/W
Tfor["user.self"]/Tprd["user.self"] # CPU + R/W

As you can see, the matrix-based sum achieves a r round(Tsum["user.self"]/Tprd["user.self"])-fold speed-up relative to the apply-based implementation, and a r round(Tfor["user.self"]/Tprd["user.self"])-fold speed-up relative to the for loop-based implementation.]

It should be noted that there are actually native R functions for the efficient sum of a matrix rows and columns (rowSums and colSums). However, the point of this basic demonstration holds.

The gain in efficiency is due to the fact that in the for loop formulation, all the operations are performed in R (which is an interpreted language, hence inefficient), and in the apply formulation there are 1M function calls (1M calls of the sum function), which is also inefficient. In the %*% formulation, there is a single R function call, and all the number crunching is performed in whatever (compiled) language the %*% is implemented, most likely Fortran.

In fact, notice that the gain in efficiency of the %*% over apply is largely lost when computing the column sum, since there are only 10 columns, hence only ten calls of the sum function. The gain in efficiency with respect to the for loop is still there.

## use of apply function
T3 <- system.time(tmp3 <- forSum(t(X)))
print(T3)

## use of apply function
T4 <- system.time(tmp4 <- apply(X,2,sum))
print(T4)

## use of matrix multiplication
In <- rep(1,1000000) # define a unit vector
T5 <- system.time(tmp5 <- t(X) %*% In)
print(T5)

Native row- (and col-)wide R functions

As noted, cognizant of this issue, R provides efficient functions to perform row and column sums.

T6 <- system.time(rowSums(X))
print(T6)

T7 <- system.time(colSums(X))
print(T7)

montilab/BS831 documentation built on June 9, 2025, 4:01 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

montilab/BS831
Boston University - Genomics Data Mining

In montilab/BS831: Boston University - Genomics Data Mining

Use of `for` loop

Use of `apply`

Use of inner product (`%*%`)

Native row- (and col-)wide R functions

R Package Documentation

Browse R Packages

We want your feedback!

montilab/BS831 Boston University - Genomics Data Mining

In montilab/BS831: Boston University - Genomics Data Mining

Use of for loop

Use of apply

Use of inner product (%*%)

Native row- (and col-)wide R functions

R Package Documentation

Browse R Packages

We want your feedback!

montilab/BS831
Boston University - Genomics Data Mining

Use of `for` loop

Use of `apply`

Use of inner product (`%*%`)