is_automata: identify non-webcrawler automated traffic
In wikimedia-research/pageviews: Wikimedia Pageview definition-related utilities


View source: R/filter.R

Not all automated traffic is from a webcrawler - much is from people running HTTP libraries in a particularly stupid, selfish and lazy fashion (if you're reading this and you've ever had a service making requests with the user agent "Twisted PageGetter": this means you). is_automata identifies this class of traffic.

is_automata(user_agents)

user_agents

a vector of user agents, which can be retrieved with read_sampled_log.

a boolean vector indicating, for each index of the input vector, whether the user agent at that index matched that of an automated service.

read_sampled_log for retrieving user agents, and is_automata for identifying non-crawler automata.
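
As a rough illustration of the input and output shapes described above, the sketch below uses a hypothetical stand-in, sketch_is_automata, rather than the package function itself. This page does not show which patterns is_automata actually matches, so the pattern list here is an assumption made for illustration; only "Twisted PageGetter" is taken from the description above.

```r
# Hypothetical stand-in for is_automata(); NOT the package's implementation.
# The pattern list is assumed for illustration. Only "Twisted PageGetter"
# is named in the documentation above.
sketch_is_automata <- function(user_agents) {
  patterns <- c("Twisted PageGetter", "^Python-urllib", "^Java/", "^curl/")
  # One logical value per input element, in the same order as the input.
  grepl(paste(patterns, collapse = "|"), user_agents, perl = TRUE)
}

# In practice the user agents would come from read_sampled_log;
# here they are typed in by hand so the example is self-contained.
user_agents <- c(
  "Twisted PageGetter",
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0",
  "Python-urllib/2.7"
)

flags <- sketch_is_automata(user_agents)

# Keep only the requests that were not flagged as automated.
human_like <- user_agents[!flags]
```

With the real package loaded, the same subsetting idiom should apply to the result of is_automata(user_agents), since it returns one boolean per input element.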

wikimedia-research/pageviews documentation built on May 4, 2019, 5:24 a.m.