combn_sub: Random subset of all possible combinations

Description

This function returns a random subset (of at most 5000) of the possible combinations defined by combn(n, m).

Usage

combn_sub(n, m, sub=NA)

Arguments

n

An integer supplied to combn. See Details.

m

Number of elements to choose.

sub

An integer specifying the length of the subset to return. Must not exceed choose(n, m), and must not exceed 5000. See Details.

Details

n defines the upper limit of a sequence of integers from which to select m items. See combn for more information. Note that supplying a vector of integers (which is supported by combn) is not yet supported here.

The number of returned combinations (sub) has been arbitrarily limited to 5000 items for performance reasons. That being said, 5000 sub-samples from a population is usually large enough for most purposes. Contact the author if you disagree. If sub is not supplied then it is set to choose(n,m), though this will result in an error if this number exceeds the 5000-item limit.
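
The rules for sub described above can be sketched as a small validation helper. This is only an illustration of the documented behaviour, not the package's actual code: the name check_sub and its max_sub argument are hypothetical.

```r
# Hypothetical helper (not part of dgmisc): mirrors the documented rules for `sub`.
check_sub <- function(n, m, sub = NA, max_sub = 5000) {
  total <- choose(n, m)                 # number of possible combinations
  if (is.na(sub)) sub <- total          # documented default when sub is not supplied
  if (sub > max_sub) stop("sub exceeds the ", max_sub, "-item limit")
  if (sub > total) stop("sub exceeds choose(n, m)")
  sub
}

check_sub(41, 2)        # 820, because choose(41, 2) == 820
# check_sub(41, 20)     # would stop: choose(41, 20) is far above 5000
```

Under these assumptions, omitting sub only works when choose(n, m) itself is within the limit.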

If the number of possible combinations is less than 500000 then all possible combinations will be enumerated using combn(n, m) and a random subset will be selected from this. Again, 500000 is an arbitrary limit based on performance considerations.
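
A minimal sketch of this enumerate-then-sample strategy, using a small n and m so it runs instantly. Sorting the sampled column indices is an assumption made here so that requesting every combination reproduces the column order of combn; the function itself may work differently.

```r
# Assumed sketch of the "enumerate everything, then sample columns" branch.
set.seed(1)
n <- 10; m <- 3; sub <- 5

all_combs <- combn(n, m)                     # m x choose(n, m) matrix (here 3 x 120)
keep <- sort(sample(ncol(all_combs), sub))   # random column indices, in combn order
out <- all_combs[, keep, drop = FALSE]       # drop = FALSE keeps a matrix when sub == 1

dim(out)                                     # 3 rows, 5 columns
```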

If the number of possible combinations is greater than 500000 (i.e. more than 100 times the maximum allowable number of returnable items) the function will construct a matrix with sub unique combinations of the integers 1 to n. This proceeds via a for-loop which could be slow for large values of sub (hence the limit of 5000).
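
One way such a loop could work is rejection sampling: draw a random combination, and keep it only if it has not been seen before. The sketch below is an assumption about the approach, not the package's implementation; sample_unique_combs is a hypothetical name, and a while loop is used here in place of the for-loop mentioned above.

```r
# Hypothetical helper (not part of dgmisc): draw `sub` unique combinations
# of m items from 1:n without enumerating all choose(n, m) of them.
sample_unique_combs <- function(n, m, sub) {
  out <- matrix(NA_integer_, nrow = m, ncol = sub)
  seen <- character(0)                       # keys of combinations already kept
  i <- 0
  while (i < sub) {
    cand <- sort(sample.int(n, m))           # one random combination, ascending
    key <- paste(cand, collapse = ",")
    if (!(key %in% seen)) {                  # reject duplicates, keep new ones
      i <- i + 1
      out[, i] <- cand
      seen <- c(seen, key)
    }
  }
  out
}

set.seed(42)
res <- sample_unique_combs(41, 36, 10)
dim(res)                                     # 36 rows, 10 columns
```

The loop only terminates if sub does not exceed choose(n, m). Duplicate draws are rare when the number of possible combinations is much larger than sub, which is the situation this branch is meant for.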

Value

A matrix with sub columns. Each column contains a unique combination, of length m, of the integers 1 to n. This is the same format returned by combn. If sub == choose(n, m) then the result should be identical to combn(n, m).

Author(s)

Daniel Pritchard

See Also

choose, combn.

Examples

# Simple (and quick) examples:
out_a <- combn_sub(41,2, 2)
out_b <- combn_sub(41,20, 2)

# The following will be quite quick because choose(41,37) < 500000.
# In this scenario, all possible combinations are generated 
#  and 5000 items are selected from the result. 
quite_fast <- combn_sub(41,37,5000)

# The following is very slow because choose(41,36) > 500000.
# In this scenario, a for loop is used to generate 5000 unique items.
quite_slow <- combn_sub(41,36,5000)
stopifnot(!any(duplicated(t(quite_slow))))

# Should return all 820 combinations:
out_820 <- combn_sub(41,2)
stopifnot(all(dim(out_820)==c(2,820)))

# Which should be equal to calling combn() directly:
stopifnot(all(out_820 == combn(41,2)))

# Should fail
## Not run: 
	
combn_sub(41,20, 5001)
combn_sub(41,20)

## End(Not run)

dpritchard/dgmisc documentation built on May 15, 2019, 1:50 p.m.