In hrbrmstr/urldiversity: Quantify 'URL' Diversity and Apply Popular Biodiversity Indices to a 'URL' Collection

urldiversity

Quantify 'URL' Diversity and Apply Popular Biodiversity Indices to a 'URL' Collection

Description

Methods are provided to compute the 'WSDL Diversity Index' (http://ws-dl.blogspot.com/2018/05/2018-05-04-exploration-of-url-diversity.html) and to apply selected biodiversity indices to a corpus (collection) of 'URLs'.
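To make the idea of applying a biodiversity index to URLs concrete, here is a minimal base R sketch. It treats each hostname as a "species" and computes two widely used indices, Shannon and Gini-Simpson, over the hostname counts. The helper names (`host_of`, `shannon`, `simpson`) and the sample URLs are made up for illustration; they are not part of the package, and the package may parse URLs and choose its indices differently.

```r
# Hypothetical helper: pull the hostname out of each URL with a base R regex.
# (Illustrative only; not the package's own URL parsing.)
host_of <- function(urls) {
  tolower(sub("^[a-zA-Z][a-zA-Z0-9+.-]*://([^/:?#]+).*$", "\\1", urls))
}

# Shannon index: H = -sum(p * log(p)) over category proportions p.
shannon <- function(counts) {
  p <- counts / sum(counts)
  -sum(p * log(p))
}

# Gini-Simpson index: 1 - sum(p^2); 0 means every item is in one category.
simpson <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}

# Made-up example corpus (assumption, not the package's bundled corpus)
urls <- c(
  "https://example.com/a",
  "https://example.com/b",
  "https://example.org/",
  "http://example.net/x?y=1"
)

# Count how many URLs fall under each hostname, then score the counts
counts <- as.vector(table(host_of(urls)))
shannon(counts)
simpson(counts)
```

Higher values mean the collection is spread more evenly across more hosts; a corpus drawn from a single host scores 0 on both indices.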

NOTE

All credit goes to Alexander Nwala for the algorithm research and original Python implementation.

TODO

[ ] Handle some edge cases
[ ] Tests
[ ] Better documentation
[ ] Vignette with many citations from the WSDL blog post

What's Inside The Tin

The following functions are implemented:

Core function:

uri_diversity: Quantify URL diversity
url_diversity: (an alias for uri_diversity b/c I regularly forget it's rightfully uri)

Processing Helpers:

clean_index_factors: Clean up diversity and evenness names

Scraping Helpers:

body_anchor_urls: Extract all body anchor hypertext references
body_img_urls: Extract all body image URLs
safeGET: Safer version of 'httr::GET()'
safePOST: Safer version of 'httr::POST()'
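What "safer" usually means for a request function is that a failure comes back as a value instead of stopping the script. The sketch below shows that pattern in base R with a hypothetical `make_safe()` wrapper that returns a list with `result` and `error` elements, similar in shape to what `purrr::safely()` produces. It is an assumption about the general idea, not a description of how `safeGET` or `safePOST` are actually implemented.

```r
# Hypothetical wrapper factory: returns a version of `f` that captures
# errors instead of throwing them.
make_safe <- function(f) {
  function(...) {
    tryCatch(
      list(result = f(...), error = NULL),
      error = function(e) list(result = NULL, error = e)
    )
  }
}

# Demonstration with a stand-in function that fails on bad input
risky <- function(x) {
  if (!is.numeric(x)) stop("need a number")
  x * 2
}
safe_risky <- make_safe(risky)

ok  <- safe_risky(21)      # result is filled in, error is NULL
bad <- safe_risky("oops")  # result is NULL, error holds the condition

# The same pattern could wrap an HTTP call if httr is installed:
# safe_get <- make_safe(httr::GET)
# res <- safe_get("https://example.com/")
# if (is.null(res$error)) httr::status_code(res$result)
```

When scraping many URLs in a loop, checking `is.null(res$error)` for each response lets one unreachable host be skipped without aborting the whole run.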

Installation

devtools::install_github("hrbrmstr/urldiversity")

options(width=120)

Usage

library(urldiversity)

# current version
packageVersion("urldiversity")

collection <- readLines(system.file("extdat", "corpus.txt", package = "urldiversity"))

print(collection)

x <- uri_diversity(collection)

dplyr::glimpse(x)

x

Code of Conduct

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

hrbrmstr/urldiversity documentation built on May 14, 2019, 4 a.m.
