knitr::opts_chunk$set(
  comment=NA,
  fig.width=10,
  fig.height=6
)

An R Package of datasets to help predict the timeline preceding an Intelligence Explosion.

Installation

install.packages("https://github.com/AABoyles/AIPredict/archive/master.tar.gz", type = "source")

Data

This package contains the following datasets:

Moore's Law

Roughly stated, Moore's law predicts that the number of transistors that fit on a single chip grows exponentially over time [@Moore1998-ze]. It is a widely used and cited metric in predictions about the development of Artificial General Intelligence. Perhaps the best known of these are futurist Ray Kurzweil's projections in The Singularity is Near [-@Kurzweil2005-mh], which are based on simple log-linear extrapolations of Moore's law.

library(AIPredict)
library(dplyr)
library(ggplot2)

ai_moores_law %>%
  ggplot(aes(Year, Transistors)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method="lm")
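Because ggplot2 applies the log10 scale before computing the smoother, the line drawn above is a linear fit to log10(Transistors), and its slope can be converted into a doubling time. The following is a minimal sketch of that conversion in base R. It uses a synthetic data frame (`toy_moore`, invented here so that the example runs without the package) with the same `Year` and `Transistors` column names as the plot above; substituting `ai_moores_law` should work the same way, assuming those columns are numeric.

```r
# Synthetic stand-in for ai_moores_law (illustrative values only):
# counts that double every two years from a made-up 1971 baseline.
toy_moore <- data.frame(Year = seq(1971, 2011, by = 4))
toy_moore$Transistors <- 2300 * 2^((toy_moore$Year - 1971) / 2)

# Fit log10(Transistors) ~ Year, the same straight line that
# stat_smooth(method = "lm") draws on a log10-scaled y axis.
fit <- lm(log10(Transistors) ~ Year, data = toy_moore)
slope <- unname(coef(fit)["Year"])

# A slope of `slope` log10 units per year means a doubling every
# log10(2) / slope years. The toy data recover 2 years by construction.
doubling_time <- log10(2) / slope
doubling_time
```

On real data the estimate depends on which years and which chips are included, so it is best read as a rough summary rather than a precise constant.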

Koomey's Law

Koomey's law states that the electricity required to execute a fixed number of computations declines exponentially over time [@Koomey2011-fc]. While less well known than Moore's law, it offers another critical benchmark for comparison to the human brain, which computes a mind on a metabolic budget of approximately 10 watts.

ai_koomeys_law %>%
  ggplot(aes(Year, WattsPerMCPS)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method="lm")

Every decade, the energy cost of computing falls by approximately two orders of magnitude.
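As a rough check on that figure, the arithmetic can be worked through directly. The sketch below assumes a doubling time of 1.57 years for computations per unit of energy, which is the value reported in the cited Koomey paper rather than something estimated from this package's data.

```r
# Assumed doubling time (in years) of computations per joule,
# taken from Koomey et al. (2011), not fitted from ai_koomeys_law.
doubling_years <- 1.57

# Number of doublings in ten years, and the resulting efficiency gain.
doublings_per_decade <- 10 / doubling_years
factor_per_decade <- 2^doublings_per_decade

# Orders of magnitude (powers of ten) gained per decade: about 1.9.
orders_per_decade <- log10(factor_per_decade)
orders_per_decade
```

Under that assumption the gain is a little under two orders of magnitude per decade, which is consistent with the approximate statement above.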

Top500 Supercomputers

The Top500 list [@noauthor_undated-sy] ranks the world's most powerful supercomputers. It is another good source for demonstrating Moore's and Koomey's laws.

ai_top500 %>%
  filter(Rank == 1) %>%
  mutate(RMAX=ifelse(is.na(RMax), Rmax, RMax)) %>%
  filter(!is.na(RMAX)) %>%
  ggplot(aes(Year, RMAX)) +
  geom_jitter(width = 1, height = 0, alpha=.3, size=5) +
  scale_y_log10() +
  stat_smooth(method = "lm")

While generally accessible, these data currently require a good deal of cleaning, which I'll perform and document in due course.
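One of those cleaning steps is already visible in the plot code above: the benchmark score appears under two differently capitalised columns, `RMax` and `Rmax`, and has to be merged into one. Here is a small base R sketch of that merge on a toy data frame (`toy_top500`, with invented values) so that it runs without the package. Inside a dplyr pipeline, `coalesce(RMax, Rmax)` would be an equivalent way to express the same step.

```r
# Toy stand-in for ai_top500 (all values invented for illustration).
# Each row has its score in RMax, in Rmax, or in neither column.
toy_top500 <- data.frame(
  Year = c(1993, 2003, 2013, 2014),
  Rank = c(1, 1, 1, 1),
  RMax = c(1, NA, NA, NA),
  Rmax = c(NA, 10, 100, NA)
)

# Prefer RMax where it is present, otherwise fall back to Rmax.
toy_top500$RMAX <- ifelse(is.na(toy_top500$RMax),
                          toy_top500$Rmax,
                          toy_top500$RMax)

# Drop rows with no score at all and keep only the merged column.
toy_top500 <- toy_top500[!is.na(toy_top500$RMAX), c("Year", "Rank", "RMAX")]
toy_top500
```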

Graph500 Supercomputers

The Graph500 [@noauthor_undated-kj] measures a computer's performance in Traversed Edges per Second (TEPS).

ai_graph_500 %>%
  filter(Rank == 1, !is.na(GTEPS)) %>%
  ggplot(aes(GraphYear, GTEPS)) +
  geom_point(alpha=.3, size=5)

GreenGraph500

Like the Graph500, the GreenGraph500 measures TEPS, but it accounts for energy costs by ranking systems on the ratio of TEPS to watts (reported as MTEPS/W).

ai_green_graph_500 %>%
  filter(Big == 0, !is.na(`MTEPS/W`)) %>%
  ggplot(aes(List, `MTEPS/W`)) +
  geom_point(alpha=.3, size=5)

Bitcoin Hashrate

The hashrate of the Bitcoin network provides useful insight into the growth of financially motivated spending of computing resources on a single problem. I suspect that this will be a useful point of comparison as the network's exercised capacity approaches levels comparable to the human brain.

ai_bitcoin_hashrate %>%
  ggplot(aes(Date, GHPS)) +
  geom_line()

Animal Brains

As our computational capacity climbs through the ranks of the animal kingdom, a variety of metrics would be useful for comparison: number of neurons, cortical neurons (in mammals), synapses, brain size, brain-to-body mass ratio, encephalization quotient, and cranial capacity. Sadly, I've not yet found sources (let alone reliable ones) covering more than a few species. This dataset was scraped from Wikipedia's List of animals by number of neurons, whose figures were in turn assembled from a variety of sources, most prominently @Herculano-Houzel2007-qt. (That last source is on my list of sources to add.)

ai_animal_neurons %>%
  ggplot(aes(Neurons, Synapses)) +
  geom_point() +
  scale_y_log10() +
  scale_x_log10() +
  stat_smooth(method = "lm")

The roughly linear trend on log-log axes suggests a power-law relationship, but this is derived from a very small, very noisy sample.
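If a power-law relationship does hold, its exponent is the slope of the regression on log-log axes. The sketch below illustrates this in base R on synthetic data (`toy_brains`, with a made-up prefactor and exponent) that reuses the `Neurons` and `Synapses` column names from the plot above. Applying the same fit to `ai_animal_neurons` would first require dropping rows with missing or zero values, which is an assumption about the data rather than something checked here.

```r
# Synthetic stand-in for ai_animal_neurons: Synapses = 1000 * Neurons^1.1.
# Both the prefactor (1000) and the exponent (1.1) are invented.
toy_brains <- data.frame(Neurons = 10^(2:10))
toy_brains$Synapses <- 1000 * toy_brains$Neurons^1.1

# A power law y = a * x^k becomes log10(y) = log10(a) + k * log10(x),
# so the slope of the log-log regression estimates the exponent k.
fit <- lm(log10(Synapses) ~ log10(Neurons), data = toy_brains)
k <- unname(coef(fit)["log10(Neurons)"])
a <- 10^unname(coef(fit)["(Intercept)"])

# The toy data recover the exponent and prefactor they were built from.
c(exponent = k, prefactor = a)
```

With only a handful of noisy observations, the confidence interval on the exponent (available via `confint(fit)`) is likely to be wide, so the estimate should be treated with caution.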

Future of Life Institute Winning Grants

I don't know if there's anything interesting to be inferred from the Future of Life Institute Grant Recipients, but I collected this data when it was first published and this seems as appropriate a venue for its dissemination as any.

ai_fli_winners %>%
  group_by(Institution) %>%
  summarise(Total = sum(Amount)) %>% 
  ggplot(aes(Institution, Total)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle=90, hjust = 1, vjust = .5))

Data to be added

This is an obviously incomplete project. Besides the sources demonstrated above, I have the following known sources to be prepared for use in this package:

Desired Data

This package notably lacks many estimates of the computational capacities of animal nervous systems, especially brains. Other estimates of the growth of computing power (e.g. global computing power, the size of various commercial cloud infrastructure providers) are also needed. If you know of any reliable sources on these or related topics, please email me.

References



AABoyles/AIPredict documentation built on Nov. 15, 2019, 9:20 a.m.