# acss: ACSS complexity In acss: Algorithmic Complexity for Short Strings

## Description

Functions to obtain the algorithmic complexity for short strings, an approximation of the Kolmogorov Complexity of a short string using the coding theorem method.

## Usage

 ```1 2 3 4 5 6 7 8 9``` ```acss(string, alphabet = 9) local_complexity(string, alphabet = 9, span = 5) likelihood_d(string, alphabet = 9) likelihood_ratio(string, alphabet = 9) prob_random(string, alphabet = 9, prior= 0.5) ```

## Arguments

 `string` `character` vector containing the to be analyzed strings (can contain multiple strings). `alphabet` `numeric`, the number of possible symbols (not necessarily actually appearing in str). Must be one of `c(2, 4, 5, 6, 9)` (can also be `NULL` or contain multiple values for `acss()`). Default is 9. `prior` `numeric`, the prior probability that the underlying process is random. `span` size of substrings to be created from `string`.

## Details

The algorithmic complexity is computed using the coding theorem method: For a given alphabet size (number of different symbols in a string), all possible or a large number of random samples of Turing machines (TM) with a given number of states (e.g., 5) and number of symbols corresponding to the alphabet size were simulated until they reached a halting state or failed to end. The outputs of the TMs at the halting states produces a distribution of strings known as the algorithmic probability of the strings. The algorithmic coding theorem (Levin, 1974) establishes the connection between the complexity of a string K(s) and its algorithmic probability D(s) as:

K(s) = log(D(s))

This package accesses a database containing data on 4.5 million strings from length 1 to 12 simulated on TMs with 2, 4, 5, 6, and 9 symbols.

For a more detailed discussion see Gauvrit, Singmann, Soler-Toscano, and Zenil (2014), http://complexitycalculator.com/methodology.html, or references below.

## Value

"acss"

A matrix in which the rows correspond to the strings entered and the columns to the algorithmic complexity K and the algorithmic probability D of the string (see http://complexitycalculator.com/methodology.html).

"local_complexity"

A list with elements corresponding to the strings. Each list containes a named vector of algorithmic complexities (K) of all substrings in each string with length span.

"likelihood_d"

A named vector with the likelihoods for `string` given a detreministic process.

"likelihood_ratio"

A named vector with the likelihood ratios (or Bayes factors) for `string` given a random rather than detreministic process.

"prob_random"

A named vector with the posterior probabilities that for a random process given the strings and the provided prior for being produced by a random process (default is 0.5, which correspond to a prior of 1 - 0.5 = 0.5 for a detereministic process).

## Note

The first time per session one of the functions described here is used, a relatively large dataset is loaded into memory which can take a considerable amount of time (> 10 seconds).

## References

Delahaye, J.-P., & Zenil, H. (2012). Numerical evaluation of algorithmic complexity for short strings: A glance into the innermost structure of randomness. Applied Mathematics and Computation, 219(1), 63-77. doi:10.1016/j.amc.2011.10.006

Gauvrit, N., Singmann, H., Soler-Toscano, F., & Zenil, H. (2014). Algorithmic complexity for psychology: A user-friendly implementation of the coding theorem method. arXiv:1409.4080 [cs, stat]. http://arxiv.org/abs/1409.4080.

Gauvrit, N., Zenil, H., Delahaye, J.-P., & Soler-Toscano, F. (2014). Algorithmic complexity for short binary strings applied to psychology: a primer. Behavior Research Methods, 46(3), 732-744. doi:10.3758/s13428-013-0416-0

Levin, L. A. (1974). Laws of information conservation (nongrowth) and aspects of the foundation of probability theory. Problemy Peredachi Informatsii, 10(3), 30-35.

Soler-Toscano, F., Zenil, H., Delahaye, J.-P., & Gauvrit, N. (2012). Calculating Kolmogorov Complexity from the Output Frequency Distributions of Small Turing Machines. PLoS ONE, 9(5): e96223.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66``` ```# WARNING: The first call to one of the functions # discussed on this page loads a large data set # and usually takes > 10 seconds. Stay patient. acss(c("HEHHEE", "GHHGGHGHH", "HSHSHHSHSS")) ## K.9 D.9 ## HEHHEE 23.38852 9.106564e-08 ## GHHGGHGHH 33.50168 8.222205e-11 ## HSHSHHSHSS 35.15241 2.618613e-11 acss(c("HEHHEE", "GHHGGHGHH", "HSHSHHSHSS"))[,"K.9"] ## [1] 23.38852 33.50168 35.15241 acss(c("HEHHEE", "GHHGGHGHH", "HSHSHHSHSS"), alphabet = 2) ## K.2 D.2 ## HEHHEE 14.96921 3.117581e-05 ## GHHGGHGHH 25.60208 1.963387e-08 ## HSHSHHSHSS 26.90906 7.935321e-09 acss(c("HEHHEE", "GHHGGHGHUE", "HSHSHHSHSS"), NULL) ## K.2 K.4 K.5 K.6 K.9 ## HEHHEE 14.96921 18.55227 19.70361 20.75762 23.38852 ## GHHGGHGHUE NA 31.75832 33.00795 34.27457 37.78935 ## HSHSHHSHSS 26.90906 29.37852 30.52566 31.76229 35.15241 ## D.2 D.4 D.5 D.6 ## HEHHEE 3.117581e-05 2.601421e-06 1.171176e-06 5.640722e-07 ## GHHGGHGHUE NA 2.752909e-10 1.157755e-10 4.812021e-11 ## HSHSHHSHSS 7.935321e-09 1.432793e-09 6.469341e-10 2.745360e-10 ## D.9 ## HEHHEE 9.106564e-08 ## GHHGGHGHUE 4.209915e-12 ## HSHSHHSHSS 2.618613e-11 ## Not run: likelihood_d(c("HTHTHTHT", "HTHHTHTT"), alphabet = 2) ## HTHTHTHT HTHHTHTT ## 0.010366951 0.003102718 likelihood_ratio(c("HTHTHTHT", "HTHHTHTT"), alphabet = 2) ## HTHTHTHT HTHHTHTT ## 0.3767983 1.2589769 prob_random(c("HTHTHTHT", "HTHHTHTT"), alphabet = 2) ## HTHTHTHT HTHHTHTT ## 0.2736772 0.5573217 ## End(Not run) local_complexity(c("01011010111" ,"GHHGGHGHUE"), alphabet = 5, span=5) ## \$`01011010111` ## 01011 10110 01101 11010 10101 01011 10111 ## 16.22469 16.24766 16.24766 16.22469 16.24322 16.22469 15.93927 ## ## \$GHHGGHGHUE ## GHHGG HHGGH HGGHG GGHGH GHGHU HGHUE ## 16.44639 16.44639 16.24766 16.22469 16.58986 16.86449 local_complexity(c("01011010111" ,"GHHGGHGHUE"), span=7) ## \$`01011010111` ## 0101101 1011010 0110101 1101011 1010111 ## 26.52068 26.52068 26.47782 26.62371 26.29186 ## ## \$GHHGGHGHUE ## GHHGGHG HHGGHGH HGGHGHU GGHGHUE ## 27.04623 26.86992 27.30871 27.84322 ```

acss documentation built on May 29, 2017, 10:40 a.m.