knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introducing PetFindr

'PetFindr' provides a fast and friendly way to extract data from Petfinder, the largest online pet adoption website, and facilitates data analysis and other tasks from the extracted data.

The package lets R users search for their favorite pets and browse pet photos without leaving their R session. This is possible thanks to the Petfinder API (Application Programming Interface), a set of instructions and standards for accessing structured information on the web. The Petfinder API (V2) provides access to a database of hundreds of thousands of pets ready for adoption and over ten thousand animal welfare organizations.

This vignette introduces the PetFindr package and walks through examples of using the Petfinder API (V2) to authenticate and extract data from the database.

Getting Started

First, install the development version from GitHub and load the package with:

```{r, eval = FALSE}
install.packages("devtools")
devtools::install_github("earl88/PetFindr")
library(PetFindr)
```

### Authentication


To use the Petfinder API (V2), each user must authenticate. This is done by registering for a developer account on [Petfinder](https://www.petfinder.com/user/register/), which provides each user with a unique 'key' and 'secret'. The `pf_setup` function guides the user through the registration process.

```r
pf_setup()
# If the picture doesn't load, display the message manually

cat("Welcome to PetFindr! Before you can search for sweet puppers and kitty cats \nin R, you'll need to register for the official PetFinder API (V2) at
https://www.petfinder.com/developers/. Would you like to do this now? (Selecting \n'Yes' will open browser.) \n\n1: Yeah\n2: No way\n3: Nope\n")
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/setup_message.png")
```

If the user responds "yes" (or any other affirmative option), `pf_setup` will open a browser to the [Petfinder developer portal](https://www.petfinder.com/developers/) and prompt the user to register for Petfinder's API (V2). After registering, the user will receive a Petfinder API (V2) key and secret.

The unique key and secret are used to request an access token for the Petfinder API.

```r
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/secretandkey.png")
pf_save_credentials(key, secret)
```

Once the user has created an account on [Petfinder](https://www.petfinder.com/) and received a unique `key` and `secret`, they can call `pf_save_credentials(key, secret)` to store the key and secret in their .Rprofile, so that they do not have to re-enter them each time they use the package.

Using `pf_save_credentials(key, secret)` makes it easier to access the API in the future, but this function is not required to access the database.

Before the user can make searches with the API, there is one final step: requesting an access token, which grants access to the API database.

```r
# Store the returned token; the examples below pass it as `token`
token <- pf_accesstoken(key, secret)
```

Each access token lasts an hour; after that time, a new access token must be generated. Once the user has an access token, they are ready to use the Petfinder API to search for pets and organizations.
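
Because a token expires after an hour, a long-running session needs some way to tell when to request a new one. The following is a minimal sketch, assuming the script records the time at which the token was requested; the helper `token_is_fresh()` and its arguments are hypothetical and are not part of PetFindr.

```r
# Hypothetical helper (not part of PetFindr): TRUE while a token issued at
# `issued_at` is still inside its one-hour lifetime.
token_is_fresh <- function(issued_at, now = Sys.time(), lifetime_secs = 3600) {
  as.numeric(difftime(now, issued_at, units = "secs")) < lifetime_secs
}

# Made-up issue time, used only to demonstrate the check
issued_at <- as.POSIXct("2019-04-25 12:00:00", tz = "UTC")
fresh_after_30_min <- token_is_fresh(issued_at, now = issued_at + 30 * 60)
fresh_after_61_min <- token_is_fresh(issued_at, now = issued_at + 61 * 60)
```

When such a check fails, requesting a new token with the key and secret restores access.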

Usage

The Petfinder API allows users to search and view pet listings based on pet characteristics, location, and status. In addition, they can search for and display animal shelters based on the organization's name, ID, and location.

The following section introduces the functions in the PetFindr package and shows how to interact with the Petfinder API, with some cute examples of extracting data from the database.

Functions

pf_find_pets

pf_find_pets retrieves a dataframe of information about pets listed on Petfinder.com. Searches may be filtered by animal characteristics, such as type, breed, and gender, and by organization information, such as IDs, location, and distance from the selected location. Users can also specify output parameters, such as the maximum number of pets returned and the sort order (e.g., most recent, closest to the location, or random). Prefixing the sort value with a dash (e.g., "-recent", "-distance") reverses the order. The full list of input arguments is given in the function's help page (?pf_find_pets).
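
The dash convention for sort values can be illustrated with a small sketch. The function `parse_sort()` is hypothetical and only demonstrates how a value such as "-distance" splits into a sort key and a reversal flag; it does not show how PetFindr or the API handles the value internally.

```r
# Hypothetical illustration of the dash convention (not part of PetFindr)
parse_sort <- function(sort) {
  list(
    field    = sub("^-", "", sort),   # sort key without the leading dash
    reversed = startsWith(sort, "-")  # TRUE when a leading dash is present
  )
}

recent_reversed  <- parse_sort("-recent")
distance_default <- parse_sort("distance")
```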

Let's say a user would like to search for female cats in Austin, Texas. Note that in the following example, only a selection of output columns is shown.

library(dplyr)
pf_find_pets(token, type = "cat", gender = "female", 
             location = "Austin, TX", page = 1) %>% 
  select(organization_id, type, breeds.primary, age, 
         gender, size, name, status) %>% 
  head(n = 5)
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/pf_find_pets.png")

List types and breeds of pets

The functions pf_list_types and pf_list_breeds return all available types and breeds of pets on the Petfinder API, respectively. A user may not know in advance which values can be used in a search, and filtering a full listing of type and breed information by hand would take a long time. Because the number of daily queries is limited, the developers provide the functions below to make the search process easier.

pf_list_types

pf_list_types returns a dataframe (a tibble) of all available animal types from Petfinder.com, along with each type's available coat, color, and gender options. The only required argument is the user's token.

pf_list_types(token = token)
pf_list_types(token = token)$names

Each time pf_list_types is called, it queries the API and lists the pet types based on the latest information. The package also includes a list of pet types called pf_types, but this is a static dataset created in April 2019 and may be out of date. For more details on such sample datasets, please refer to the next section.

data("pf_types", package = "PetFindr")

pf_list_breeds

pf_list_breeds takes an animal type found on Petfinder.com and returns a character vector containing all available breed names for that type. As with pf_list_types, the user's token is the first input; the animal type must also be specified.

# Error: the animal type must be specified
pf_list_breeds(token = token)
pf_list_breeds(token = token, type = "Scales, Fins & Other")
# List all available dog breeds whose name contains "Retriever"
dog_breeds <- pf_list_breeds(token = token, type = "dog")
dog_breeds[stringr::str_detect(dog_breeds, pattern = "Retriever")]
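
The same kind of filter can be tried offline, without a token. This sketch uses a small made-up vector, `dog_breeds_demo`, in place of the API result, and base R's `grepl()` instead of stringr:

```r
# Made-up stand-in for the character vector returned by pf_list_breeds
dog_breeds_demo <- c("Golden Retriever", "Labrador Retriever", "Pug", "Beagle")

# Keep only the breed names that contain the literal text "Retriever"
retrievers <- dog_breeds_demo[grepl("Retriever", dog_breeds_demo, fixed = TRUE)]
```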

As with the pf_types dataset, the package also includes a static sample dataset of pet breeds called pf_breeds; however, to list the most recent breeds in R, pf_list_breeds needs to be called.

pf_map_locations

pf_map_locations displays the locations of animal shelters on a Leaflet map. It requires the token and a dataframe of animal information from pf_find_pets as inputs. Given an animal dataframe, pf_map_locations collects the organization IDs of those pets and makes a GET request to retrieve the organizations' ZIP codes; this request is why the token is required. The organizations are then shown on a Leaflet map according to their ZIP codes.

The following code displays markers on a Leaflet map at the locations of the animal shelters of the first 10 puppies in LA_puppies, one of the sample datasets. LA_puppies and the other sample datasets are explained in the next section. A snapshot of the map is shown below.

Note that fewer markers than animals are shown if some of the animals come from the same shelter, and an organization is not marked at all if its location is not updated or provided by the Petfinder API.

data(LA_puppies, package = "PetFindr")
pf_map_locations(token, LA_puppies[1:10,])
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/snapshot_map_LA_puppies10.png")
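
The reason a map can show fewer markers than animals can be sketched offline. The data frames and column names below (`pets_demo`, `orgs_demo`, `postcode`) are made up for illustration and may not match the structure PetFindr uses internally:

```r
# Made-up pets: two of the four come from the same shelter
pets_demo <- data.frame(
  name            = c("Rex", "Bella", "Milo", "Luna"),
  organization_id = c("CA1", "CA1", "CA2", "CA3"),
  stringsAsFactors = FALSE
)

# Made-up organization lookup: one shelter has no ZIP code on record
orgs_demo <- data.frame(
  organization_id = c("CA1", "CA2", "CA3"),
  postcode        = c("90001", "90210", NA),
  stringsAsFactors = FALSE
)

# One lookup per distinct shelter, then drop shelters without a location
org_ids  <- unique(pets_demo$organization_id)
found    <- orgs_demo[orgs_demo$organization_id %in% org_ids, ]
mappable <- found[!is.na(found$postcode), ]
```

Here four pets reduce to three shelters, of which only two could be placed on a map.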

pf_find_organizations

pf_find_organizations retrieves a dataframe of information about organizations listed on Petfinder.com. Searches can be filtered by location, which can be specified in one of three ways: a postal code, "city-name, state-abbreviation", or latitude and longitude. Users can also restrict the search to organizations within a given distance of the location with the "distance" input, and to a given country with the "country" input. The sort order of the results can also be chosen: by the date listed on Petfinder.com, by distance from the specified location in ascending or descending order, or randomly. The default is "recent", which sorts by the date listed on Petfinder.com.

In the following examples, given a valid token, the first call returns information on 100 organizations in the US, sorted alphabetically by state, from the first page of results. The second call returns the first 20 organizations (the default) within 50 miles of Minneapolis, MN, sorted by the most recent listing date.

US_orgs <- pf_find_organizations(token, country = "US", limit = 100, sort = "state")
MN_orgs <- pf_find_organizations(token, location = "Minneapolis, MN", distance = 50)
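
The three ways of giving a location can be told apart by their shape. The sketch below is only an illustration: `classify_location()` is a hypothetical helper, and the patterns assume a five-digit US postal code and a "latitude,longitude" string, neither of which is confirmed by the package documentation.

```r
# Hypothetical helper (not part of PetFindr): guess which of the three
# location formats a string uses, based on simple patterns.
classify_location <- function(location) {
  if (grepl("^[0-9]{5}$", location)) {
    "postal code"
  } else if (grepl("^-?[0-9]+(\\.[0-9]+)?, *-?[0-9]+(\\.[0-9]+)?$", location)) {
    "latitude, longitude"
  } else if (grepl("^[A-Za-z .'-]+, *[A-Z]{2}$", location)) {
    "city, state"
  } else {
    "unknown"
  }
}

location_kinds <- vapply(
  c("55401", "Minneapolis, MN", "44.98,-93.27"),
  classify_location,
  character(1)
)
```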

pf_view_photos

pf_view_photos presents the photos of searched pets as a slideshow. It takes a dataframe of animal information from pf_find_pets and the size of the photos as inputs. Petfinder provides photos in the sizes "small", "medium", "large", and "full"; one or more of them can be specified, and all sizes are shown by default. If the animal dataframe contains no photos, the message "No photos found :(" is printed. The following example shows a slideshow of small photos of puppies in LA.

data(LA_puppies, package = "PetFindr")
pf_view_photos(LA_puppies[1:10,], size = "small")
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/pupphotos.gif")

Examples Using Sample Data

The developers have stored sample datasets extracted from Petfinder in the package, because example datasets help illustrate how the data are structured and how they can be used to explore your interests in pets. The datasets are immediately available when the package is loaded.

Sample data: Puppies in Los Angeles

The dataset summarises information on 500 puppies in the Greater Los Angeles area, extracted from Petfinder on April 25, 2019. The output below shows the first five observations from the dataset.

library(dplyr)
library(PetFindr)
LA_puppies %>% head(n = 5) %>%
  select(id, organization_id, type, species, name, age, gender, size, status)

Note that the output does not show every column in the sample dataset (otherwise the table would be very wide). Detailed information on the variables can be found in the dataset's documentation (e.g., ?LA_puppies). Using the data, a user can perform various kinds of exploratory data analysis. For example, a user may want to know which dog breeds are most common in Los Angeles animal shelters. The top 10 breeds in the Los Angeles area in April 2019 are:

names(sort(-table(LA_puppies$breeds.primary)))[1:10]
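
The ranking step can be checked on a tiny made-up vector. This sketch uses `sort(..., decreasing = TRUE)` rather than negating the table; `breeds_demo` is invented for illustration and contains no ties:

```r
# Made-up breed labels standing in for LA_puppies$breeds.primary
breeds_demo <- c("Pug", "Chihuahua", "Terrier", "Chihuahua", "Terrier", "Chihuahua")

# Count each breed, order from most to least frequent, keep the top two names
breed_counts <- sort(table(breeds_demo), decreasing = TRUE)
top2 <- names(breed_counts)[1:2]
```

Deriving the names this way also avoids typing the breed list by hand when filtering the data for a plot.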

The result can also be visualized as below.

library(ggplot2)
top10 <- LA_puppies %>% filter(breeds.primary %in% c("Chihuahua","Terrier","German Shepherd Dog","Labrador Retriever" ,"Pug" ,"Shepherd", "Maltese" ,"Dachshund" ,"Pit Bull Terrier","Husky" ))
## Reorder puppies by frequency
top10 <- within(top10, 
                   breeds.primary <- factor(breeds.primary, 
                                      levels=names(sort(table(breeds.primary), 
                                                        decreasing=TRUE))))
## Create a bar plot
top10 %>% ggplot(aes(x = breeds.primary, fill = gender)) + geom_bar() + coord_flip()
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/commonbreeds.png")

From the visualization, the user can see the ten most common puppy breeds found at LA animal shelters. A user can also conduct a time series analysis. Let's create a time series plot to see when the animals were posted on Petfinder.

## Create a date variable (ymd_hms needs the lubridate:: prefix unless lubridate is attached)
LA_puppies$published_at <- lubridate::date(lubridate::ymd_hms(LA_puppies$published_at))
LA_puppies %>%
  group_by(published_at) %>%
  summarise(n = n()) %>%
  ggplot(aes(x = published_at, y = n)) +
  geom_line() + geom_point() +
  ggtitle("Time Series of Animal Posting in LA") + xlab("Date") + ylab("Number of Postings")
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/nyorgstimeseries.png")
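
The counting step behind the time series can be reproduced in base R on made-up timestamps. The timestamp format below is an assumption for illustration and may differ from the one stored in LA_puppies:

```r
# Made-up publication timestamps (format assumed, not taken from the dataset)
published_at_demo <- c(
  "2019-04-23T10:15:00+0000",
  "2019-04-23T18:40:12+0000",
  "2019-04-24T09:05:33+0000"
)

# Keep the date part only, then count postings per day
posting_date <- as.Date(substr(published_at_demo, 1, 10))
daily_counts <- as.data.frame(table(posting_date),
                              responseName = "n",
                              stringsAsFactors = FALSE)
```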

Sample data: Shelters in New York State

The dataset lists animal shelters in New York State, extracted in April 2019. The output below shows the first five observations from the dataset.

NY_orgs %>% head(n = 5) %>% select(id, name, phone)

Note that the output does not show every column in the sample dataset (otherwise the table would be very wide). Detailed information on the variables can be found in the dataset's documentation (e.g., ?NY_orgs). Similar exploratory analysis can be conducted on this dataset as well.

top10_orgs <- NY_orgs %>%
  filter(address.city %in% c("New York", "Brooklyn", "Rochester", "Albany", "Staten Island", "Buffalo", "Flushing", "Binghamton", "Syracuse", "Bronx"))
top10_orgs <- within(top10_orgs, 
                   address.city <- factor(address.city, 
                                      levels=names(sort(table(address.city), 
                                                        decreasing=TRUE))))
top10_orgs %>% ggplot(aes(x = address.city)) + geom_bar() + coord_flip()
knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/sheltersinny.png")

The bar plot indicates that, among these ten cities, New York City has the most shelters.

Shiny

Shiny is a convenient way to explore data through an interactive user interface in the user’s default browser, straight from R. This section of the vignette presents a couple of examples showing how to explore Petfinder using R Shiny. Detailed information on R Shiny can be found in the Shiny documentation. A user also needs to authenticate to use the app, because it extracts real-time information from the API. The app includes multiple ways to extract and visualize data from Petfinder. All of the outputs in the Shiny app can also be created directly in R. To run the app, load the PetFindr package and run the pf_run_Shiny() function:

library(PetFindr)
library(shiny)
pf_run_Shiny()

The computer’s default browser will open up and display the following pages:

Authentication

knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/shiny_main.png")

To ensure that users are registered and have their own key and secret, the main page of the Shiny app shows a short instruction explaining that a key and secret are required to use the app and that the token has to be regenerated every hour. The Shiny app also detects an invalid key or secret.

Search by Animals

Once users have been granted access to the API, they can search for animals as shown below.

knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/shiny2.png")

The first tab is divided into 4 main sections. The left sidebar takes the inputs (location, search radius, types of animals, breeds of animals, adoption status, and number of pets in the output) used to query the API for pets that meet certain criteria. The location can be given in one of three ways: a postal code, "city-name, state-abbreviation", or latitude and longitude. The location must be specified before a distance from it can be selected. "Pets in your location" is a table that lists the pets returned by the search. The main visualization section shows a Leaflet map of the pets' locations. Finally, if a user clicks one of the rows in the "Pets in your location" table, the app displays a picture of the selected animal. If the photo is unavailable, it returns an error message.

Search by Organizations

The second tab allows users to find information on animal shelters and is also divided into 4 main sections. A user can search for organizations based on 3 criteria (location, search radius, and number of outputs). The main visualization display shows three types of information. First, a Leaflet map shows the locations of the animal shelters, and the table below it shows information on the shelter selected in the map. Finally, the bar graph summarizes the types and number of pets in the selected shelter.

knitr::include_graphics("https://raw.githubusercontent.com/earl88/PetFindr/master/inst/FinalPresentation/images/orginfo.png")

Further Reading



earl88/PetFindr documentation built on Jan. 18, 2020, 9:10 a.m.