Method 1 creates an empty vector, and grows the object:
n = 100000 myvec = NULL; myvec = c() for (i in 1:n) myvec = c(myvec, i)
Create an object of the final length and then changes the values in the object by subscripting:
myvec = numeric(n) for (i in 1:n) myvec[i] = i
Directly creates the final object:
myvec = 1:n
$n$ | 1 | 2 | 3 ----|---|---|---- $10^5$ | 0.208 | 0.024 | 0.000 $10^6$ | 25.50 | 0.220 | 0.000 $10^7$ | 3827.0 | 2.212 | 0.000
local(source("code/01-vector_growth.R", local = TRUE))
Object growth can be quite insidious since it is easy to hide growing objects in your code. For example:
n = 2 hit = NULL for (i in 1:n) { if (runif(1) < 0.3) hit[i] = TRUE else hit[i] = FALSE }
Exercise Rewrite the above code to avoid object growth
A more common - and possibly more dangerous - problem is with rbind
:
df1 = data.frame(a = character(0), b = numeric(0)) for (i in 1:n) df1 = rbind(df1, data.frame(a = sample(letters, 1), b = runif(1)))
When writing code in R, you need to remember that you are using R and not C (or even F77!). For example,
x = runif(1000) + 1 logsum = 0 for (i in seq_along(x)) logsum = logsum + log(x[i])
This is a piece R code that has a strong, unhealthy influence from C.
Instead, we should write
logsum = sum(log(x))
x = runif(2)
Another common example is subsetting a vector. When writing in C, we would have something like:
x = rnorm(10) ans = NULL for (i in seq_along(x)) { if (x[i] < 0) ans = c(ans, x[i]) }
Exercise: Rewrite the above code in a vectorised format
It's also important to make full use of R functions that use vectors. For example, suppose we wish to estimate [ \int_0^1 x^2 dx ] using a basic Monte-Carlo method.
local(source("code/01-monte_carlo.R", local = TRUE))
hits = 0
i
in 1:N
hits = hits + 1
hits/N
.N = 500000 f = function(N) { hits = 0 for (i in 1:N) { u1 = runif(1); u2 = runif(1) if (u1^2 > u2) hits = hits + 1 } return(hits / N) }
Which in R takes a few seconds:
system.time(f(N))
N = 500000 f = function(N) { hits = 0 for (i in 1:N) { u1 = runif(1); u2 = runif(1) if (u1^2 > u2) hits = hits + 1 } return(hits / N) }
Exercise: Can you vectorise the above code?
Put any object creation outside the loop. For example
jitter = function(x, k) rnorm(1, x, k) parts = rnorm(10) post = numeric(length(parts))
for (i in seq_along(parts)) { k = 1.06 * sd(parts) / length(parts) post[i] = jitter(parts[i], k) }
Can be rewritten as
k = 1.06 * sd(parts) / length(parts) for (i in seq_along(parts)) post[i] = jitter(parts[i], k)
vignette("common", package = "efficientTutorial")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.