Finds the URL to the ‘favicon’ for a website. This is useful if you want to display the ‘favicon’ in an HTML document or web application, especially if the website is behind a firewall.
library(faviconPlease)
faviconPlease("https://github.com/")
## [1] "https://github.githubassets.com/favicons/favicon.svg"
Also check out my blog post on faviconPlease for more background and examples.
Install latest release from CRAN:
install.packages("faviconPlease")
Install development version from GitHub:
install.packages("remotes")
remotes::install_github("jdblischak/faviconPlease")
Please note that the faviconPlease project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
By default, faviconPlease() uses the following strategy to find the URL to the favicon for a given website. It stops once it finds a URL and returns it.
1. Download the HTML file and search its <head> for any <link> elements with rel="icon" or rel="shortcut icon".
2. Download the HTML file at the root of the server (i.e. discard the path) and search its <head> for any <link> elements with rel="icon" or rel="shortcut icon".
3. Attempt to download a file called favicon.ico at the root of the server. This is the default location that a browser looks if the HTML file does not specify an alternative location in a <link> element. If the file favicon.ico is successfully downloaded, then this URL is returned.
4. If the above steps fail, as a fallback, use the favicon service provided by the search engine DuckDuckGo. This provides a nice default for websites that don’t have a favicon (or whose favicon can’t be easily found).
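The "try each step in order, stop at the first hit, otherwise fall back" logic can be sketched in plain R. This is only an illustration under stated assumptions, not the package's actual implementation: findFirstFavicon(), neverFinds(), alwaysFinds(), and genericFallback() are hypothetical names invented here, and the stand-in functions avoid any network access.

```r
# Hypothetical helper (not part of faviconPlease): call each favicon function
# in order and return the first non-empty result, otherwise use the fallback.
findFirstFavicon <- function(scheme, server, path, functions, fallback) {
  for (f in functions) {
    favicon <- f(scheme, server, path)
    if (nzchar(favicon)) {
      return(favicon)
    }
  }
  # The fallback may be a function of the server or a fixed character string
  if (is.function(fallback)) fallback(server) else fallback
}

# Stand-in functions so the sketch runs without network access
neverFinds <- function(scheme, server, path) ""
alwaysFinds <- function(scheme, server, path) {
  sprintf("%s://%s/favicon.ico", scheme, server)
}
genericFallback <- function(server) {
  sprintf("https://icons.duckduckgo.com/ip3/%s.ico", server)
}

# The second function succeeds, so the fallback is never consulted
found <- findFirstFavicon("https", "github.com", "/jdblischak/faviconPlease",
                          functions = list(neverFinds, alwaysFinds),
                          fallback = genericFallback)

# Nothing succeeds, so the fallback URL is returned
usedFallback <- findFirstFavicon("https", "github.com", "/jdblischak/faviconPlease",
                                 functions = list(neverFinds),
                                 fallback = genericFallback)
```

Because looping over NULL does nothing, passing functions = NULL to this sketch goes straight to the fallback.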
The default strategy above is designed to reliably get you a favicon URL for most websites. However, you can customize it as needed.
The default fallback function is faviconDuckDuckGo(). To instead use Google’s favicon service, you can set the argument fallback = faviconGoogle.
Note that neither DuckDuckGo nor Google has every favicon you might expect, and availability can change over time. You can see some examples in my blog post. Fortunately, both provide a generic favicon to insert when they don’t have the one you requested.
You can use your own custom fallback function instead. It must accept one argument, which is the server, e.g. "github.com". The easiest approach would be to copy-paste one of the existing fallback functions and modify it to use your alternative favicon service.
args(faviconDuckDuckGo)
## function (server)
## NULL
body(faviconDuckDuckGo)
## {
## iconService <- "https://icons.duckduckgo.com/ip3/%s.ico"
## favicon <- sprintf(iconService, server)
## return(favicon)
## }
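Following the copy-and-modify approach, a custom fallback might look like the sketch below. The function name faviconExample() and the service URL icons.example.com are placeholders invented for illustration; substitute the URL pattern of whichever favicon service you actually use.

```r
# Hypothetical fallback: icons.example.com is a placeholder, not a real service
faviconExample <- function(server) {
  iconService <- "https://icons.example.com/%s.ico"
  favicon <- sprintf(iconService, server)
  return(favicon)
}

# Build the fallback URL for a given server
exampleUrl <- faviconExample("github.com")
```

It could then be supplied with fallback = faviconExample.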
If you have a URL to a generic favicon file that you would like to use as a fallback, you can directly pass this as a character vector. It could also be a path to an image file on the server where your app is running.
The default strategy first checks the <head> for a link to the favicon file and then checks for the availability of the file favicon.ico. You can change this order, or only perform one of them, by changing the argument functions passed to faviconPlease(). It should be a list of functions.
# default
functions = list(faviconLink, faviconIco)
# Switch the order
functions = list(faviconIco, faviconLink)
# Only search <head>
functions = list(faviconLink)
# Only check for favicon.ico
functions = list(faviconIco)
# Skip the favicon functions entirely and just use the fallback
functions = NULL
You can also create your own custom favicon function to pass to faviconPlease(). By default it must accept 3 arguments. It will be passed the URL’s scheme (e.g. "https"), server (e.g. "github.com"), and path (e.g. "/jdblischak/faviconPlease"). Your function should return the URL to a favicon or an empty string, "", if it can’t find one.
# Favicon functions must accept at least 3 positional arguments
args(faviconLink)
## function (scheme, server, path)
## NULL
As a concrete example, here is a custom function for searching for favicon.ico on Ubuntu 20.04, which has increased security settings (see troubleshooting section below).
faviconIcoUbuntu20 <- function(scheme, server, path) {
  faviconIco(scheme, server, path, method = "wget",
             extra = c("--no-check-certificate",
                       "--ciphers=DEFAULT:@SECLEVEL=1"))
}
It calls faviconIco() with the specific settings needed by download.file() to work on Ubuntu 20.04. You could then use your custom function instead of the default faviconIco() by calling faviconPlease() with functions = list(faviconLink, faviconIcoUbuntu20).
Note that the example function faviconIcoUbuntu20() will likely fail on Windows, macOS, and Ubuntu versions prior to 20.04.
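A custom favicon function does not have to download anything. As a second, purely illustrative sketch, the function below consults a small lookup table of favicons you already know and returns "" otherwise, so that the next function in the list gets a turn. The names knownFavicons and faviconKnown(), and the intranet.example.com URLs, are hypothetical.

```r
# Hypothetical lookup table of favicons you already know (placeholder URLs)
knownFavicons <- c(
  "intranet.example.com" = "https://intranet.example.com/static/logo.png"
)

# Accepts the 3 required arguments; scheme and path are unused here
faviconKnown <- function(scheme, server, path) {
  if (server %in% names(knownFavicons)) {
    return(unname(knownFavicons[server]))
  }
  # Signal "not found" with an empty string
  return("")
}

hit <- faviconKnown("https", "intranet.example.com", "/")
miss <- faviconKnown("https", "github.com", "/jdblischak/faviconPlease")
```

Placing it first, e.g. functions = list(faviconKnown, faviconLink, faviconIco), would let known servers skip the downloads entirely.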
Unfortunately it’s not easy to make this foolproof for all operating systems and all websites. Here are some known issues:
download.file(), used by faviconIco(), is known to have cross-platform issues. Thus the official documentation in ?download.file recommends: "Setting the method should be left to the end user." Accordingly, faviconIco() exposes the arguments method, extra, and headers, which are passed directly to download.file(). Alternatively you can set the global options "download.file.method" or "download.file.extra".
Ubuntu 20.04 increased its default security settings for downloading files from the internet (details). Unfortunately many websites have not updated their SSL certificates to comply with the increased security restrictions. faviconLink() has a workaround for this situation, but not faviconIco(). As an example, here’s how you could detect the availability of favicon.ico for the Ensembl website on Ubuntu 20.04.
faviconIco("https", "www.ensembl.org", "",
           method = "wget", extra = c("--no-check-certificate",
                                      "--ciphers=DEFAULT:@SECLEVEL=1"))
Alternatively, if it’s an option for you, you could avoid this workaround by using the previous Ubuntu LTS release 18.04. Also note that the above command will fail on Ubuntu 18.04 because the default wget installed doesn’t have the argument --ciphers.
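Instead of passing method and extra on every call, the global options named above could be set once for the session. The sketch below reuses the Ubuntu 20.04 values from this README; whether they suit your system is an assumption you would need to check, and the variable names oldOptions and currentMethod are only illustrative.

```r
# Set the global options, saving the previous values so they can be restored
oldOptions <- options(
  download.file.method = "wget",
  download.file.extra = c("--no-check-certificate",
                          "--ciphers=DEFAULT:@SECLEVEL=1")
)

# Confirm what download.file() would now pick up by default
currentMethod <- getOption("download.file.method")
currentExtra <- getOption("download.file.extra")

# ... calls that rely on download.file() would go here ...

# Restore the previous settings when done
options(oldOptions)
```

Restoring the saved options at the end keeps the relaxed security settings from leaking into unrelated downloads later in the session.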