Method 1 creates an empty vector, and grows the object:
n = 100000
myvec = NULL  # c() would give the same empty starting point
for(i in 1:n)
  myvec = c(myvec, i)
Method 2 creates an object of the final length and then changes the values in the object by subscripting:
myvec = numeric(n)
for(i in 1:n)
  myvec[i] = i
Method 3 directly creates the final object:
myvec = 1:n
$n$ | Method 1 | Method 2 | Method 3
----|---|---|----
$10^5$ | 0.208 | 0.024 | 0.000
$10^6$ | 25.50 | 0.220 | 0.000
$10^7$ | 3827.0 | 2.212 | 0.000
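Timings of this kind can be approximated with system.time(). The following is a minimal sketch, not the code used to produce the table: the function names method1, method2 and method3 are hypothetical, and n is kept small so that the first method finishes quickly.

```r
# Hypothetical wrappers around the three methods described above
method1 = function(n) {
  myvec = NULL
  for(i in seq_len(n))
    myvec = c(myvec, i)   # grows the vector on every iteration
  myvec
}

method2 = function(n) {
  myvec = numeric(n)      # pre-allocate, then fill by subscripting
  for(i in seq_len(n))
    myvec[i] = i
  myvec
}

method3 = function(n) seq_len(n)  # create the final object directly

n = 1e4
system.time(method1(n))
system.time(method2(n))
system.time(method3(n))
```

Exact timings depend on the machine and the R version, so only the relative ordering of the three methods should be expected to carry over.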
Object growth can be quite insidious since it is easy to hide growing objects in your code. For example:
n = 2
hit = NULL
for(i in 1:n) {
  if(runif(1) < 0.3)
    hit[i] = TRUE
  else
    hit[i] = FALSE
}
Exercise: Rewrite the above code to avoid object growth.
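One possible way to approach the exercise, sketched under the assumption that n is known before the loop starts: pre-allocate hit with logical(n), or drop the loop altogether by drawing all n uniforms at once. The name hit_vec is hypothetical, and n = 10 is an arbitrary choice.

```r
n = 10

# Pre-allocate the result, then fill it by subscripting
hit = logical(n)
for(i in seq_len(n))
  hit[i] = runif(1) < 0.3

# Vectorised alternative (hypothetical name): no loop at all
hit_vec = runif(n) < 0.3
```

In both versions the length of the result is fixed up front, so R never has to copy a growing vector.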
A more common - and possibly more dangerous - problem is with rbind:
df1 = data.frame(a = character(0), b = numeric(0))
for(i in 1:n)
  df1 = rbind(df1, data.frame(a = sample(letters, 1), b = runif(1)))
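When the number of rows is known in advance, one way to avoid repeated calls to rbind is to generate each column as a vector and build the data frame once. This is a sketch under that assumption. The name df2 is hypothetical, n = 1000 is an arbitrary choice, and the random draws will not match df1 value for value because the numbers are generated in a different order.

```r
n = 1000

# Generate each column in one vectorised call, then build the data frame once
df2 = data.frame(a = sample(letters, n, replace = TRUE),
                 b = runif(n))
```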
When writing code in R, you need to remember that you are using R and not C (or even F77!). For example,
x = runif(1000) + 1
logsum = 0
for(i in 1:length(x))
  logsum = logsum + log(x[i])
This is a piece of R code that has a strong, unhealthy influence from C.
Instead, we should write
logsum = sum(log(x))
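As a quick check that the two styles agree, the sketch below computes the sum both ways and compares the results with all.equal(), which allows for floating point rounding. The names logsum_loop and logsum_vec are hypothetical, and the seed is only there to make the run repeatable.

```r
set.seed(1)
x = runif(1000) + 1

# C-style accumulation, one element at a time
logsum_loop = 0
for(i in seq_along(x))
  logsum_loop = logsum_loop + log(x[i])

# Vectorised version: log() and sum() both operate on whole vectors
logsum_vec = sum(log(x))

all.equal(logsum_loop, logsum_vec)
```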
Another common example is subsetting a vector. When writing in C, we would have something like:
x = rnorm(10)
ans = NULL
for(i in 1:length(x)) {
  if(x[i] < 0)
    ans = c(ans, x[i])
}
Exercise: Rewrite the above code in a vectorised format
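One possible answer, offered as a sketch, is to subset with a logical vector. The seed is an arbitrary choice used to make the example repeatable.

```r
set.seed(1)
x = rnorm(10)

# x < 0 is a logical vector; subsetting with it keeps only the negative values
ans = x[x < 0]
```

One small difference from the loop: if no element of x is negative, the loop leaves ans as NULL, whereas the logical subset returns a zero-length numeric vector.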
It's also important to make full use of R functions that use vectors. For example, suppose we wish to estimate
$$\int_0^1 x^2 \, dx$$
using a basic Monte-Carlo method.
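1. Initialise: hits = 0
2. for i in 1:N
   - Generate two random numbers, $U_1$ and $U_2$, between 0 and 1
   - If $U_2 < U_1^2$, then hits = hits + 1
3. end for
4. Area estimate = hits/N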
N = 500000
f = function(N){
  hits = 0
  for(i in 1:N) {
    u1 = runif(1); u2 = runif(1)
    if(u1^2 > u2)
      hits = hits + 1
  }
  return(hits/N)
}
In R, this takes a few seconds:
system.time(f(N))
Exercise: Can you vectorise the above code?
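One possible vectorised version is sketched below. The name f_vec is hypothetical. It draws all the uniforms in two calls to runif and uses the fact that the mean of a logical vector is the proportion of TRUE values. Because the random numbers are drawn in a different order than in f, the two functions will not return identical estimates for the same seed.

```r
# f_vec is a hypothetical name for a vectorised variant of f
f_vec = function(N) {
  u1 = runif(N)
  u2 = runif(N)
  mean(u1^2 > u2)  # proportion of points that fall under the curve
}

set.seed(1)
N = 500000
f_vec(N)
system.time(f_vec(N))
```

The estimate should be close to 1/3, the exact value of the integral.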
Put any object creation outside the loop. For example:
jitter = function(x, k) rnorm(1, x, k)
parts = rnorm(10)
post = numeric(length(parts))

for(i in 1:length(parts)){
  k = 1.06*sd(parts)/length(parts)
  post[i] = jitter(parts[i], k)
}
This can be rewritten as:
k = 1.06*sd(parts)/length(parts)
for(i in 1:length(parts))
  post[i] = jitter(parts[i], k)
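Going one step further, rnorm accepts a vector of means, so in this particular case the loop could arguably be dropped entirely. This is a sketch of that idea rather than part of the original example. The seed is arbitrary, and the draws will generally differ from those of the loop version unless the random number stream is in the same state.

```r
set.seed(1)
parts = rnorm(10)
k = 1.06*sd(parts)/length(parts)

# rnorm is vectorised over its mean argument: one draw per element of parts
post = rnorm(length(parts), parts, k)
```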
vignette("common", package = "efficientTutorial")