```r
knitr::opts_chunk$set(comment = NA, fig.width = 10, fig.height = 6)
```
An R Package of datasets to help predict the timeline preceding an Intelligence Explosion.
```r
install.packages("https://github.com/AABoyles/AIPredict/archive/master.tar.gz", type = "source")
```
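Alternatively, the same repository can be installed directly from GitHub. This sketch assumes the `remotes` package, which is not a stated dependency of AIPredict:

```r
# Assumption: the remotes package is available on CRAN and the repo is public.
install.packages("remotes")
remotes::install_github("AABoyles/AIPredict")
```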
This package contains the following datasets:
`ai_moores_law`
- contains observations contributing to estimates of Moore's law.

`ai_koomeys_law`
- contains observations contributing to estimates of Koomey's law. Derived from the dataset constructed by Jonathan Koomey, which is archived in the `data-raw/` directory.

`ai_prediction`
- contains public estimates of Artificial Intelligence milestones. Derived from the dataset produced by AI Impacts, which is archived in the `data-raw/` directory.

`ai_bitcoin_hashrate`
- contains the instantaneous hashrate of the Bitcoin network, measured daily at 6:15:05pm UTC.

`ai_animal_neurons`
- contains the merged tables in the Wikipedia article, List of Animals by Number of Neurons.

`ai_fli_winners`
- contains the published data on the winners of the Future of Life Institute's 2015 RFP for grants on research for safe artificial intelligence.

Roughly stated, Moore's law predicts that the density of transistors in a single processor core grows exponentially [@Moore1998-ze]. It is a widely used and cited metric in predictions about the development of Artificial General Intelligence. Perhaps the best-known of these is futurist Ray Kurzweil's projections in The Singularity is Near [-@Kurzweil2005-mh], which are based on simple, linear extrapolations of Moore's law.
```r
library(AIPredict)
library(dplyr)
library(ggplot2)

ai_moores_law %>%
  ggplot(aes(Year, Transistors)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method = "lm")
```
Koomey's law states that the electricity required to execute some number of computations declines exponentially over time [@Koomey2011-fc]. While less well-known than Moore's law, it offers another critical benchmark for comparison to the human brain, which computes a mind on a metabolic budget of approximately 10 watts.
```r
ai_koomeys_law %>%
  ggplot(aes(Year, WattsPerMCPS)) +
  geom_point() +
  scale_y_log10() +
  stat_smooth(method = "lm")
```
Every decade, the energy cost of computing falls approximately two orders of magnitude.
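That decadal claim can be checked by converting the slope of a log10-linear fit into an improvement factor over ten years. The sketch below uses synthetic data in place of `ai_koomeys_law` (the column names and the slope of -0.2 log10 units per year are assumptions for illustration):

```r
# Sketch: turning a log10-linear fit into a decadal improvement factor.
# Synthetic data standing in for ai_koomeys_law; not real measurements.
set.seed(42)
years <- 1950:2010
watts_per_mcps <- 10^(5 - 0.2 * (years - 1950)) * exp(rnorm(length(years), sd = 0.1))

fit   <- lm(log10(watts_per_mcps) ~ years)
slope <- coef(fit)[["years"]]      # change in log10(watts) per year

decadal_factor <- 10^(-10 * slope) # factor of improvement over ten years
decadal_factor                     # roughly 100 with the synthetic slope above,
                                   # i.e. two orders of magnitude per decade
```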
The Top500 list of supercomputers [@noauthor_undated-sy] offers another good source for demonstrating Moore's and Koomey's laws.
```r
ai_top500 %>%
  filter(Rank == 1) %>%
  mutate(RMAX = ifelse(is.na(RMax), Rmax, RMax)) %>%
  filter(!is.na(RMAX)) %>%
  ggplot(aes(Year, RMAX)) +
  geom_jitter(width = 1, height = 0, alpha = .3, size = 5) +
  scale_y_log10() +
  stat_smooth(method = "lm")
```
While generally accessible, these data currently require a good deal of cleaning, which I'll perform and document in due course.
The Graph500 [@noauthor_undated-kj] measures a computer's performance in Traversed Edges per Second (TEPS).
```r
ai_graph_500 %>%
  filter(Rank == 1, !is.na(GTEPS)) %>%
  ggplot(aes(GraphYear, GTEPS)) +
  geom_point(alpha = .3, size = 5)
```
Like the Graph500, the GreenGraph500 measures TEPS, but it penalizes energy costs by ranking machines on the ratio of TEPS to watts (MTEPS/W).
```r
ai_green_graph_500 %>%
  filter(Big == 0, !is.na(`MTEPS/W`)) %>%
  ggplot(aes(List, `MTEPS/W`)) +
  geom_point(alpha = .3, size = 5)
```
The hashrate of the Bitcoin network provides useful insight into the growth of financially-motivated expenditure of computing resources on a single problem. I suspect this will be a useful point of comparison as the network's exercised capacity approaches levels comparable to the human brain.
```r
ai_bitcoin_hashrate %>%
  ggplot(aes(Date, GHPS)) +
  geom_line()
```
As our computational capacity climbs through the ranks of the animal kingdom, a variety of metrics would be useful for comparison: number of neurons, cortical neurons (in mammals), synapses, brain size, brain-to-body mass ratio, encephalization quotient, and cranial capacity might all serve this line of research. Sadly, I've not yet found any sources (let alone reliable ones) covering more than a few species. This dataset was scraped from Wikipedia's List of Animals by Number of Neurons. Those figures, in turn, were assembled from a variety of sources, most prominently @Herculano-Houzel2007-qt. (This last set is on my list of sources to add.)
```r
ai_animal_neurons %>%
  ggplot(aes(Neurons, Synapses)) +
  geom_point() +
  scale_y_log10() +
  scale_x_log10() +
  stat_smooth(method = "lm")
```
The roughly linear trend on log-log axes suggests a power-law relationship, but it is derived from a very small, very noisy sample.
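The implied exponent of such a power law can be estimated by regressing one log-transformed variable on the other. This sketch uses synthetic neuron and synapse counts, not the package's data; the true exponent of 1.1 is an assumption planted for illustration:

```r
# Sketch: estimating a power-law exponent (synapses ~ neurons^b)
# via log-log regression. Synthetic data only; not ai_animal_neurons.
set.seed(1)
neurons  <- 10^runif(20, 2, 11)
synapses <- 1e3 * neurons^1.1 * exp(rnorm(20, sd = 0.5))  # assumed b = 1.1

fit <- lm(log10(synapses) ~ log10(neurons))
coef(fit)[[2]]  # slope on log-log axes = estimated exponent, near 1.1
```

A slope near 1 would indicate that synapse counts scale roughly proportionally with neuron counts; a slope above 1 would indicate superlinear scaling.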
I don't know if there's anything interesting to be inferred from the Future of Life Institute Grant Recipients, but I collected this data when it was first published and this seems as appropriate a venue for its dissemination as any.
```r
ai_fli_winners %>%
  group_by(Institution) %>%
  summarise(Total = sum(Amount)) %>%
  ggplot(aes(Institution, Total)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = .5))
```
This is an obviously incomplete project. Besides the sources demonstrated above, the following known sources remain to be prepared for use in this package:
This notably misses many estimates of the computational capacities of animal nervous systems, especially brains. Other estimates of the growth of computing power (e.g. global computing power, the size of various commercial cloud infrastructure providers) are also needed. If you know of any reliable sources on these or related topics, please email me.