Maximal Sum Subarray

Description

Find a subarray with maximal positive sum.

Usage

1
2
3
maxsub(x, inds = TRUE, compiled = TRUE)

maxsub2d(A)

Arguments

x

numeric vector.

A

numeric matrix

inds

logical; shall the indices be returned?

compiled

logical; shall the compiled version be used?

Details

maxsub finds a contiguous subarray whose sum is maximally positive. This is sometimes called Kadane's algorithm.
maxsub will use a compiled and very fast version with a running time of O(n) where n is the length of the input vector x.

maxsub2d finds a (contiguous) submatrix whose sum of elements is maximally positive. The approach taken here is to apply the one-dimensional routine to summed arrays between all rows of A. This has a run-time of O(n^3), though a run-time of O(n^2 log n) seems possible see the reference below.
maxsub2d uses a Fortran workhorse and can solve a 1000-by-1000 matrix in a few seconds—but beware of biggere ones

Value

Either just a maximal sum, or a list this sum as component sum plus the start and end indices as a vector inds.

Note

In special cases, the matrix A may be sparse or (as in the example section) only have one nonzero element in each row and column. Expectation is that there may exists a more efficient (say O(n^2)) algorithm in this extreme case.

Author(s)

HwB <hwborchers@googlemail.com>

References

Bentley, Jon (1986). “Programming Pearls”, Column 7. Addison-Wesley Publ. Co., Reading, MA.

T. Takaoka (2002). Efficient Algorithms for the Maximum Subarray Problem by Distance Matrix Multiplication. The Australasian Theory Symposion, CATS 2002.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
##  Find a maximal sum subvector
set.seed(8237)
x <- rnorm(1e6)
system.time(res <- maxsub(x, inds = TRUE, compiled = FALSE))
res

##  Standard example: Find a maximal sum submatrix
A <- matrix(c(0,-2,-7,0, 9,2,-6,2, -4,1,-4,1, -1,8,0,2),
            nrow = 4, ncol = 4, byrow =TRUE)
maxsub2d(A)
# $sum:  15
# $inds: 2 4 1 2 , i.e., rows = 2..4, columns = 1..2

## Not run: 
##  Application to points in the unit square:
set.seed(723)
N <- 50; w <- rnorm(N)
x <- runif(N); y <- runif(N)
clr <- ifelse (w >= 0, "blue", "red")
plot(x, y, pch = 20, col = clr, xlim = c(0, 1), ylim = c(0, 1))

xs <- unique(sort(x)); ns <- length(xs)
X  <- c(0, ((xs[1:(ns-1)] + xs[2:ns])/2), 1)
ys <- unique(sort(y)); ms <- length(ys)
Y  <- c(0, ((ys[1:(ns-1)] + ys[2:ns])/2), 1)
abline(v = X, col = "gray")
abline(h = Y, col = "gray")

A <- matrix(0, N, N)
xi <- findInterval(x, X); yi <- findInterval(y, Y)
for (i in 1:N) A[yi[i], xi[i]] <- w[i]

msr <- maxsub2d(A)
rect(X[msr$inds[3]], Y[msr$inds[1]], X[msr$inds[4]+1], Y[msr$inds[2]+1])

## End(Not run)