The goal of totervogel is to detect malicious Twitter accounts using Benford’s Law.
Please refer to Golbeck (2015) and Golbeck (2019) for more details about the analysis of first significant digits. For the analysis of last significant digits, please refer to Dlugosz & Müller-Funk (2009).
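To make the first significant digit analysis concrete, here is a minimal sketch in base R. It is not totervogel's actual implementation: the helpers `first_digit()` and `benford_summary()` are hypothetical names, the data are simulated, and the sketch assumes the reported correlation is a Pearson correlation between observed and expected digit proportions.

```r
# Expected first-digit proportions under Benford's Law: P(d) = log10(1 + 1/d)
benford_expected <- log10(1 + 1 / (1:9))

# Hypothetical helper: first significant digit of positive counts
first_digit <- function(x) {
  x <- x[!is.na(x) & x >= 1]
  as.integer(substr(sprintf("%.0f", as.numeric(x)), 1, 1))
}

# Hypothetical helper: compare observed first digits with Benford's Law
benford_summary <- function(counts) {
  observed <- tabulate(first_digit(counts), nbins = 9)
  expected <- sum(observed) * benford_expected
  list(
    # Assumption: Pearson correlation of observed vs. expected proportions
    correlation = cor(observed / sum(observed), benford_expected),
    # Pearson chi-squared statistic against the Benford expectation
    chisq = sum((observed - expected)^2 / expected)
  )
}

# Simulated counts (not real Twitter data): log-uniform values follow Benford's Law
set.seed(1)
simulated_counts <- floor(10^runif(1000, min = 0, max = 4))
benford_result <- benford_summary(simulated_counts)
benford_result
```

A high correlation and a small chi-squared statistic both indicate that the digits are close to the Benford distribution.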
You can install the development version of totervogel from GitHub with:
# install.packages("devtools")
devtools::install_github("chainsawriot/totervogel")
This is what the totervogel of an organic human account looks like. By default, it analyzes the account’s friends.
library(totervogel)
res <- create_totervogel("scott_althaus")
res
#>
#> ── scott_althaus ──
#>
#> ● Type:Friends
#> ● Total: 276
#> ── First significant digit ──
#> ── Friends
#> Correlation: 0.991 / Chi-sq: 3.513
#>
#> ── Statuses
#> Correlation: 0.981 / Chi-sq: 6.143
#>
#> ── Followers
#> Correlation: 0.949 / Chi-sq: 13.892
#>
#> ── Last significant digit ──
#>
#> ── Friends
#> Chi-sq: 3.638
#>
#> ── Statuses
#> Chi-sq: 10.594
#>
#> ── Followers
#> Chi-sq: 16.391
plot(res)
You can also analyze followers.
res_fol <- create_totervogel("scott_althaus", followers = TRUE)
res_fol
#>
#> ── scott_althaus ──
#>
#> ● Type:Followers
#> ● Total: 444
#> ── First significant digit ──
#> ── Friends
#> Correlation: 0.975 / Chi-sq: 12.101
#>
#> ── Statuses
#> Correlation: 0.989 / Chi-sq: 4.37
#>
#> ── Followers
#> Correlation: 0.983 / Chi-sq: 7.891
#>
#> ── Last significant digit ──
#>
#> ── Friends
#> Chi-sq: 2.667
#>
#> ── Statuses
#> Chi-sq: 5.82
#>
#> ── Followers
#> Chi-sq: 5.144
plot(res_fol)
A potentially malicious Twitter account’s totervogel results might look like this:
(Please don’t visit these accounts.)
malicious_res <- create_totervogel("badluck_jones")
malicious_res
#>
#> ── badluck_jones ──
#>
#> ● Type:Friends
#> ● Total: 5000
#> ── First significant digit ──
#> ── Friends
#> Correlation: 0.954 / Chi-sq: 249.836
#>
#> ── Statuses
#> Correlation: 1 / Chi-sq: 3.486
#>
#> ── Followers
#> Correlation: 0.996 / Chi-sq: 27.699
#>
#> ── Last significant digit ──
#>
#> ── Friends
#> Chi-sq: 13.116
#>
#> ── Statuses
#> Chi-sq: 8.4
#>
#> ── Followers
#> Chi-sq: 3.984
malicious_res2 <- create_totervogel("yoyo13148779", followers = TRUE)
malicious_res2
#>
#> ── yoyo13148779 ──
#>
#> ● Type:Followers
#> ● Total: 4998
#> ── First significant digit ──
#> ── Friends
#> Correlation: 0.994 / Chi-sq: 40.182
#>
#> ── Statuses
#> Correlation: 0.999 / Chi-sq: 66.582
#>
#> ── Followers
#> Correlation: 0.999 / Chi-sq: 56.646
#>
#> ── Last significant digit ──
#>
#> ── Friends
#> Chi-sq: 5.305
#>
#> ── Statuses
#> Chi-sq: 2744.141
#>
#> ── Followers
#> Chi-sq: 1607.346
plot(malicious_res2)
In Golbeck (2015), 89.7% of Twitter users had a correlation of over 0.9. Less than 1% had a correlation under 0.5. An account must be very suspicious to have such a low correlation.
Accounts with fewer friends are more likely to show lower Benfordness, because a smaller sample of first digits deviates more from the expected distribution by chance.
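The effect of the friends count can be illustrated with a small simulation. This is only a sketch on simulated data: `simulate_correlation()` is a hypothetical helper, the account sizes (20 and 2000) are arbitrary, and every simulated account follows Benford's Law exactly in expectation, so any drop in correlation comes from sample size alone.

```r
set.seed(42)
benford_expected <- log10(1 + 1 / (1:9))

# Hypothetical helper: Benford correlation for one simulated account with n friends
simulate_correlation <- function(n) {
  # Log-uniform counts follow Benford's Law in expectation
  counts <- floor(10^runif(n, min = 0, max = 4))
  digits <- as.integer(substr(sprintf("%.0f", counts), 1, 1))
  observed <- tabulate(digits, nbins = 9)
  cor(observed / n, benford_expected)
}

# Median correlation over 200 simulated accounts of each size
median_small <- median(replicate(200, simulate_correlation(20)))
median_large <- median(replicate(200, simulate_correlation(2000)))
c(small = median_small, large = median_large)
```

The smaller accounts should show a visibly lower median correlation, so a low value for an account with few friends is weaker evidence than the same value for an account with thousands.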
The last digit analysis is experimental. The results should only raise your eyebrows if more than one aspect (friends, statuses, followers) displays an unexpected distribution.
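The following sketch shows one way a last digit check could work. It rests on assumptions not confirmed by the package: that the expected distribution of last digits is uniform over 0 to 9, and that a 5% critical value with 9 degrees of freedom is a reasonable cut-off. `last_digit_chisq()` is a hypothetical helper; the three statistics are the last digit values printed above for yoyo13148779.

```r
# Hypothetical helper: chi-squared statistic of last digits against a uniform distribution
last_digit_chisq <- function(counts) {
  counts <- counts[!is.na(counts) & counts >= 0]
  digits <- as.integer(counts %% 10)
  # Shift digits 0-9 to bins 1-10 for tabulate()
  observed <- tabulate(digits + 1L, nbins = 10)
  expected <- sum(observed) / 10
  sum((observed - expected)^2 / expected)
}

# Assumed cut-off: 5% critical value of the chi-squared distribution, df = 10 - 1
critical_value <- qchisq(0.95, df = 9)

# Last digit statistics reported above for yoyo13148779
last_digit_stats <- c(friends = 5.305, statuses = 2744.141, followers = 1607.346)

# Raise eyebrows only if more than one aspect exceeds the cut-off
n_unexpected <- sum(last_digit_stats > critical_value)
n_unexpected > 1
```

Under these assumptions, two of the three aspects exceed the cut-off for this account, while none of the values reported for scott_althaus do.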
The logo of this package is a remix of the logo of rtweet by Kearney et al. The original logo is licensed under the MIT License.