library(dplyr)
library(httr)
Finding the cheapest flight from point A to point B can be a headache, especially with multiple additional constraints such as duration, layovers, and departure and arrival times. The goal of the flightscanner package is to provide a simple and straightforward interface for interacting with Rapid API -- Skyscanner through R. The Skyscanner API lets users search for flights and query flight prices from Skyscanner's database, as well as quotes from ticketing agencies. Besides these basic functions as a flight search tool, flightscanner also allows users to schedule searches and record results automatically. In addition, the package provides a Shiny app to visualize the trip on a map and to show the available ticket options according to customized constraints.
knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.align = "center" )
At the time of this writing, flightscanner has not been submitted to CRAN. For now, the package can be installed through the devtools package with the function install_github().
devtools::install_github("MinZhang95/flightscanner")
library(flightscanner)
apiSetKey("a01b3ec5e9msh2698ef80ca5232dp18fc92jsnddf5fad0cc7d")
The first step in using the flightscanner package is to initialize the API connection to Skyscanner.
If this is your first time loading the package, you will be asked to enter the API key received from Skyscanner in the console.
Two prompts guide users through a quick setup of the API:
API key is required!
Please follow the instructions to get the key:
1. Browse and login: https://rapidapi.com/skyscanner/api/skyscanner-flight-search
   Do you want to visit this website (1 for YES; 0 for NO)?
2. Copy the value of X-RapidAPI-Key in Header Parameters.
   Paste your key (without quote):
By selecting "1" for the first question, the users will be directed to the Rapid API Skyscanner webpage, where the API key can be found in the right panel (Figure 1) and be used for the second question.
knitr::include_graphics('APIwebpage.png')
A welcome message will show up with a valid API key:
Welcome to FlightScanner!
The valid API key will be stored in "APIkey.txt" under the current working directory, so that the key is not requested again every time the package is loaded.
However, with an invalid API key, a failure message will show up:
Check your key or network connection. And use function `apiSetKey` to set key later.
Alternatively, users can set (or reset) the API key manually with the function apiSetKey():
apiSetKey("YOUR KEY")
Note that apiSetKey() does not generate or rewrite "APIkey.txt" under the current working directory.
To obtain the global API key, use the function apiGetKey():
apiGetKey()
This function returns the API key only if it has been successfully set up; otherwise it returns NULL.
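Because apiSetKey() does not write "APIkey.txt", a user who wants to keep a manually set key between sessions has to store it separately. The following is a minimal sketch of reading a key back from a text file. The helper `read_key_file()` is hypothetical and not part of flightscanner, and it assumes the file holds the key on its first line.

```r
# Hypothetical helper (not part of flightscanner): read a key stored in a text file.
# Assumption: the key sits on the first line of the file.
read_key_file <- function(path = "APIkey.txt") {
  if (!file.exists(path)) return(NULL)                # no stored key yet
  key <- trimws(readLines(path, n = 1, warn = FALSE))
  if (length(key) == 0 || !nzchar(key)) return(NULL)  # empty file counts as "no key"
  key
}

# Usage: write a dummy key to a temporary file and read it back.
tmp <- tempfile(fileext = ".txt")
writeLines("DUMMY-KEY", tmp)
stored_key  <- read_key_file(tmp)
missing_key <- read_key_file(file.path(tempdir(), "no-such-file.txt"))
```

The value returned by such a helper could then be passed to apiSetKey() when it is not NULL.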
apiCreateSession()
apiCreateSession() allows users to enter their flight information (origin, destination and dates) and creates a session on the API server. The output contains a session ID. For example, to search for tickets from Des Moines to Detroit for one adult on 2019-06-01 (the departure date cannot be earlier than the current date):
dsm2dtw_session <- apiCreateSession(origin = "DSM", destination = "DTW", startDate = "2019-06-01", adults = 1)
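Since the departure date cannot be earlier than the current date, it can help to check the date locally before contacting the API. This is a small sketch in base R; `check_start_date()` is a hypothetical helper, not a flightscanner function, and the `today` argument exists only to make the example reproducible.

```r
# Hypothetical helper: fail early if the departure date is malformed or in the past.
check_start_date <- function(start_date, today = Sys.Date()) {
  d <- as.Date(start_date, format = "%Y-%m-%d")
  if (is.na(d)) stop("startDate must be formatted as YYYY-MM-DD")
  if (d < today) stop("startDate cannot be earlier than today")
  invisible(d)
}

# A future date passes and is returned as a Date object.
ok <- check_start_date("2019-06-01", today = as.Date("2019-05-01"))

# A past date raises an error; here the message is captured instead of stopping.
past <- tryCatch(
  check_start_date("2019-06-01", today = as.Date("2019-07-01")),
  error = function(e) conditionMessage(e)
)
```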
The output of apiCreateSession() is used as the input of apiPollSession().
apiPollSession()
apiPollSession() retrieves the flight data for the session created with apiCreateSession() and allows users to sort and filter the tickets by various criteria. The default value of every filter variable is NULL, meaning that nothing is filtered out before the data are returned. For example, to retrieve the previous search in ascending order of price:
dsm2dtw_res <- apiPollSession(response = dsm2dtw_session, sortType = "price", sortOrder = "asc")
Let's check the content of the output of apiPollSession():
dsm2dtw_res %>% content %>% names
The output of apiPollSession() is messy, because it contains several sub-lists, such as "itineraries", "legs", and "segments". The relationship between these terms is shown below.
$$
\text{searching result}
\begin{cases}
\text{itinerary}_1
  \begin{cases}
  \text{leg}_1 \begin{cases} \text{segment}_1 \\ \text{segment}_2 \\ \vdots \\ \text{segment}_S \end{cases} \\
  \text{leg}_2 \begin{cases} \text{segment}_1 \end{cases}
  \end{cases} \\
\text{itinerary}_2
  \begin{cases}
  \text{leg}_1 \begin{cases} \text{segment}_1 \\ \text{segment}_2 \end{cases} \\
  \text{leg}_2 \begin{cases} \text{segment}_1 \end{cases}
  \end{cases} \\
\vdots \\
\text{itinerary}_n
  \begin{cases}
  \text{leg}_1 \begin{cases} \text{segment}_1 \end{cases} \\
  \text{leg}_2 \begin{cases} \text{segment}_1 \end{cases}
  \end{cases}
\end{cases}
$$
One search request may return several itineraries. A one-way trip contains one leg, whereas a round trip contains two: an outbound leg and an inbound leg. A leg contains several segments if it is not a direct flight.
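The itinerary, leg, and segment hierarchy can be mimicked with a nested list. The sketch below uses made-up routes and is not the structure returned by the API; it only illustrates that the number of stops on a leg is the number of segments minus one.

```r
# Toy search result: itinerary -> legs -> segments (all routes are made up).
result <- list(
  itinerary_1 = list(
    outbound = list(segments = c("DSM-ORD", "ORD-DTW")),  # one stop
    inbound  = list(segments = c("DTW-DSM"))              # direct
  ),
  itinerary_2 = list(
    outbound = list(segments = c("DSM-DTW")),             # direct
    inbound  = list(segments = c("DTW-MSP", "MSP-DSM"))   # one stop
  )
)

# Stops per leg = number of segments - 1.
# The result is a matrix with legs as rows and itineraries as columns.
stops <- sapply(result, function(itinerary) {
  sapply(itinerary, function(leg) length(leg$segments) - 1)
})
```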
flightGet()
flightGet() takes the result of apiPollSession() as input, or reads from a database (explained later in the "Data Storage" section). The output is a list of seven dataframes, whose names are printed below:
dsm2dtw_df <- dsm2dtw_res %>% flightGet()
names(dsm2dtw_df)
The dataframe "price" provides information, such as the searching time and pricing options:
dsm2dtw_df$price %>% head(3) %>% print(width = 120)
Within the same itinerary, there might be several different prices due to different agents:
dsm2dtw_df$price$PricingOptions[[39]] %>% print(width = 120)
The dataframe "legs" provides information such as duration and number of stops:
dsm2dtw_df$legs %>% head(3) %>% print(width = 120)
We can also check the stop information and the layover in minutes for each leg with the "legs" dataframe:
dsm2dtw_df$legs$Stops %>% head(3) %>% print(width = 120)
Similarly, the detailed results about the segments are stored in the "segments" dataframe:
dsm2dtw_df$segments %>% head(2) %>% print(width = 120)
In the above outputs, the carriers and stops are represented with their IDs. To "translate" to their names, run:
dsm2dtw_df$carriers %>% head(1) %>% print(width = 120)
dsm2dtw_df$places %>% head(1) %>% print(width = 120)
flightFilter()
flightFilter() allows users to filter the results obtained from flightGet(). Continuing the previous example, suppose the user is looking for flights within a budget of $1,000, with no more than one stop, and departing after 8 AM:
flightFilter(dsm2dtw_df, max_price = 1000, max_stops = 1, out_departure = c("08:00","24:00")) %>% head(3)
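To make the three constraints concrete, here is a base R sketch that applies the same kind of conditions to a toy data frame. The column names and values are invented for illustration, and the code is not how flightFilter() is implemented.

```r
# Toy flight table (hypothetical columns, made-up values).
flights <- data.frame(
  price         = c(250, 1200, 480, 310),
  stops         = c(1, 0, 2, 0),
  out_departure = c("09:15", "10:00", "13:40", "06:30"),
  stringsAsFactors = FALSE
)

# Convert "HH:MM" strings to minutes after midnight so times can be compared.
to_minutes <- function(hhmm) {
  parts <- strsplit(hhmm, ":", fixed = TRUE)
  vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))
}

departure <- to_minutes(flights$out_departure)
window    <- to_minutes(c("08:00", "24:00"))

# Keep flights that satisfy all three constraints.
keep <- flights$price <= 1000 &
  flights$stops <= 1 &
  departure >= window[1] & departure <= window[2]
filtered <- flights[keep, ]
```

Only the first row survives: the second exceeds the budget, the third has two stops, and the fourth departs before 8 AM.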
Storing flight data in a database can be efficient for automatic searching.
dbCreateDB()
dbCreateDB() connects to a local database file, which is "flight.db" by default. This step must be completed before data can be saved to the database.
dbCreateDB(conn = RSQLite::SQLite(), dbname = "flight.db")
The flight.db includes seven tables:
con <- dbCreateDB(dbname = "flight.db")
dbListTables(con)
It will execute:
dbSaveData
dbSaveData() is a function to save data into the database file.
resp <- apiCreateSession(origin = "DSM", destination = "DTW", startDate = "2019-06-01")
resp <- apiPollSession(resp)
data <- flightGet(resp)
# Connect to SQLite database
con <- dbCreateDB(dbname = ":memory:")
dbSaveData(resp, con) # from response
dbSaveData(data, con) # from list
dbDisconnect(con)
It accepts two classes of input: response or list. A response is the request response returned by apiPollSession(); a list is the data returned by flightGet().
A feature that makes the flightscanner package unique, compared with existing flight search engines, is its ability to query flights automatically according to a schedule.
These functions currently work only on Unix/Linux/macOS, not on Windows. Windows support will be added in the future.
If you use macOS and encounter an "Operation not permitted" error, follow the instructions:
Cron jobs are created with the cron_create() function. Besides the regular flight information (such as origin, destination, and dates), an additional input, "frequency", is needed for the job schedule. It can be "minutely", "hourly", "daily", or any other frequency defined by cron's syntax (see link). Here is an example:
cron_create("DSM", "SEA", "2019-07-20", frequency = "hourly") # this is the example
cron_create("DSM", "PVG", "2019-06-01", frequency = "0 */2 * * *") # every 2 hours
cron_create() generates a log file and a database file. All scheduled search results are stored in this database file, e.g. "flight.db".
# connect to SQLite database
con <- dbCreateDB(dbname = "flight.db")
# read data from database
data <- flightGet(con)
# show the searching time
unique(data$price$SearchTime)
# disconnect database
dbDisconnect(con)
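Once several scheduled searches have accumulated, one natural use of the search time is to track the lowest price found at each run. The sketch below does this in base R on a made-up table. SearchTime follows the column shown above, while the Price column and all values are assumptions for illustration.

```r
# Toy price history: two searches, two quotes each (made-up values).
history <- data.frame(
  SearchTime = as.POSIXct(
    c("2019-05-01 08:00:00", "2019-05-01 08:00:00",
      "2019-05-01 09:00:00", "2019-05-01 09:00:00"),
    tz = "UTC"
  ),
  Price = c(320, 295, 310, 280)
)

# Lowest quoted price for each search time, ordered chronologically.
lowest <- aggregate(Price ~ SearchTime, data = history, FUN = min)
```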
To show the current searching jobs, run the function:
cron_ls()
The job will be executed automatically even if R is closed or the computer is restarted. To stop the job, run:
cron_clear(ask = FALSE)
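For readers unfamiliar with the cron strings accepted by the frequency argument, a cron expression has five whitespace-separated fields: minute, hour, day of month, month, and day of week. The sketch below shows one way keyword frequencies could be translated to such expressions. `frequency_to_cron()` is hypothetical, and the preset expansions are a common convention assumed here rather than taken from the package.

```r
# Hypothetical helper: map a keyword to a cron expression, or validate a raw one.
frequency_to_cron <- function(frequency) {
  # Assumed expansions for the keyword presets.
  presets <- c(
    minutely = "* * * * *",  # every minute
    hourly   = "0 * * * *",  # at minute 0 of every hour
    daily    = "0 0 * * *"   # at 00:00 every day
  )
  if (frequency %in% names(presets)) return(unname(presets[frequency]))

  # Otherwise expect a raw cron expression with exactly five fields.
  fields <- strsplit(trimws(frequency), "\\s+")[[1]]
  if (length(fields) != 5) stop("expected a preset or a five-field cron expression")
  frequency
}

hourly_expr <- frequency_to_cron("hourly")
custom_expr <- frequency_to_cron("0 */2 * * *")  # minute 0 of every second hour
```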
To open the Shiny App, run:
shiny::runApp(system.file(package = "flightscanner", "shiny"))
The Shiny app for flightscanner includes three tabs: Airport Map, Flights and IATA Code.
The Airport Map tab is the default welcome page. The map, drawn with leaflet, shows the locations of the target airports and gives a rough sense of how far the users need to travel.
Input values are needed at the top of the map when doing a flight search:
Click on the Go! button after providing the trip information.
knitr::include_graphics('shinyMap.png')
Click on the Flights tab after the search is complete. There are several filter options on the left panel.
knitr::include_graphics('shinyTickets.png')
A table containing the detailed information about the filtered flights will be given on the right main panel, including the ticket price, departure and arrival time for inbound and/or outbound flight, duration, and the number of stops for inbound and/or outbound. There are also hyperlinks to the ticketing agencies in the column of Link.
Under the IATA Code tab, users can look up the 3-character codes of the target airports by entering city or country names in the search box in the upper right corner. The data come from MUCflight.
knitr::include_graphics('shinyCode.png')