download_lighthouse: Download new Pagespeed report (Lighthouse) (Pagespeed API...

View source: R/download_lighthouse.R

Description

Download a PageSpeed Insights v5 report ("Lighthouse") for one or more URLs, with a variety of options. The result can be either the original nested list (output_type = "raw") containing all the data, or a parsed data frame (output_type = "simple") containing most of the data.

If you choose a data frame, be aware that it can have hundreds or even thousands of columns. The number of columns IS NOT STABLE, because it depends on the number of error occurrences and their types.
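Because the column set can change between runs, combining wide results collected at different times may need an alignment step. The following is a minimal base R sketch, not part of the package: the data frames, column names, and the align_columns() helper are all hypothetical stand-ins.

```r
# Two toy one-row results with different column sets (hypothetical names,
# not the package's real output).
run_a <- data.frame(url = "https://a.example", score = 0.9, error_x = 1,
                    stringsAsFactors = FALSE)
run_b <- data.frame(url = "https://b.example", score = 0.7,
                    stringsAsFactors = FALSE)

# Hypothetical helper: add any missing columns as NA, then reorder so that
# both frames share the same columns in the same order.
align_columns <- function(df, all_cols) {
  for (col in setdiff(all_cols, names(df))) {
    df[[col]] <- NA
  }
  df[all_cols]
}

all_cols <- union(names(run_a), names(run_b))
combined <- rbind(align_columns(run_a, all_cols),
                  align_columns(run_b, all_cols))

dim(combined)
```

Columns that exist in only one run are filled with NA in the other, so the combined frame keeps every column seen in either run.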

For this reason you can set long_result = TRUE to obtain a messier but "long-like" data frame instead.
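To illustrate the difference between the wide and long shapes, here is a minimal base R sketch on a made-up one-row data frame. The column names ("parameter", "value", and the metric names) are assumptions for illustration and may not match what the package returns.

```r
# Toy stand-in for a wide result: one row per URL, one column per metric.
# Column names are hypothetical.
wide <- data.frame(
  url = "https://www.example.com",
  score_performance = 0.91,
  first_contentful_paint = 1.2,
  stringsAsFactors = FALSE
)

# Reshape by hand into a long layout: one row per URL/metric pair.
metric_cols <- setdiff(names(wide), "url")
long <- data.frame(
  url = rep(wide$url, each = length(metric_cols)),
  parameter = rep(metric_cols, times = nrow(wide)),
  value = as.vector(t(as.matrix(wide[metric_cols]))),
  stringsAsFactors = FALSE
)

dim(wide)
dim(long)
```

In the long shape, a new metric adds a row rather than a column, so the set of columns stays fixed even when the set of reported metrics changes.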

Usage

download_lighthouse(url, key = Sys.getenv("PAGESPEED_API_KEY"),
  output_type = "simple", strategy = "desktop",
  categories = "performance", long_result = TRUE, interval = 0.5,
  locale = NULL, utm_campaign = NULL, utm_source = NULL)

Arguments

url

vector of character strings. The URLs to fetch and analyze MUST contain "http://" or "https://".

key

string. PageSpeed API key used to authenticate. Defaults to the "PAGESPEED_API_KEY" environment variable.

output_type

string. Choose how to parse the output. Options: "simple" or "raw". See more in the Details section.

strategy

string/character vector. The analysis strategy to use. Options: "desktop", "mobile", or c("desktop", "mobile") to return both results in one function call.

categories

string/character vector. The Lighthouse categories to run. Defaults to "performance". See more in the Details section.

long_result

logical. Should the resulting data frame be in long format? Defaults to TRUE.

interval

numeric. Number of seconds to wait between multiple queries. Defaults to 0.5 seconds.

locale

string. The locale used to localize formatted results

utm_campaign

string. Campaign name for analytics. Defaults to NULL

utm_source

string. Campaign source for analytics. Defaults to NULL
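Since the url argument requires an explicit "http://" or "https://" prefix, it can help to check a vector of URLs before calling the function. Below is a minimal sketch of such a pre-check; has_scheme() is a hypothetical helper, and how the package itself validates URLs is not shown here.

```r
# Hypothetical helper: TRUE for URLs that start with http:// or https://.
has_scheme <- function(url) {
  grepl("^https?://", url, ignore.case = TRUE)
}

urls <- c("https://www.google.com", "www.bing.com", "http://example.com")

# Flag each URL, then keep only the ones that are safe to pass on.
has_scheme(urls)
valid_urls <- urls[has_scheme(urls)]
valid_urls
```

Here "www.bing.com" is dropped because it lacks a scheme.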

Details

The output_type parameter controls how the output is parsed and stored. With "simple", the result is a formatted data frame that contains most of the data (scores, recommendations and error occurrences). With "raw", the result is an unformatted nested list that contains all the data returned by the API.

An api_version parameter does not appear in the Usage signature of this function, which downloads version 5 reports. For reference, legacy version 4 is the classic Pagespeed report, and the new version 5 returns Lighthouse reports.

The categories parameter applies only to API version 5. It controls which Lighthouse test categories are run. You can select more than one by passing a character vector. Options: "accessibility", "best-practices", "performance", "pwa", "seo".
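A misspelled category name is easy to miss, so one option is to check the vector against the list above before making any API calls. This is a minimal sketch; check_categories() is a hypothetical helper and not a function exported by the package.

```r
# The category names listed in the Details section.
lighthouse_categories <- c("accessibility", "best-practices",
                           "performance", "pwa", "seo")

# Hypothetical helper: stop with a clear message on any unknown name,
# otherwise return the input unchanged.
check_categories <- function(categories) {
  unknown <- setdiff(categories, lighthouse_categories)
  if (length(unknown) > 0) {
    stop("Unknown Lighthouse categories: ",
         paste(unknown, collapse = ", "))
  }
  categories
}

check_categories(c("performance", "accessibility"))
```

An explicit setdiff() check is used here instead of match.arg(), because match.arg() allows partial matching and, with several.ok = TRUE, can silently drop unmatched entries.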

Value

One of two options: a data frame (if output_type = "simple") or a nested list (if output_type = "raw").
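Code that handles both return types needs to test for a data frame first, because in R every data frame is also a list. A minimal sketch follows; describe_result() and the toy inputs are hypothetical and do not reflect the real structure of the returned objects.

```r
# Hypothetical helper: summarise a result regardless of output_type.
# is.data.frame() must be checked before is.list(), since a data frame
# is itself a list.
describe_result <- function(x) {
  if (is.data.frame(x)) {
    sprintf("data frame: %d rows x %d columns", nrow(x), ncol(x))
  } else if (is.list(x)) {
    sprintf("nested list with %d top-level elements", length(x))
  } else {
    stop("Unexpected result type: ", class(x)[1])
  }
}

# Toy stand-ins for the two kinds of result.
describe_result(data.frame(a = 1, b = 2))
describe_result(list(a = list(b = 1), c = 2))
```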

Examples

## Not run: 
# download a simple data frame with the "Performance" Lighthouse report for
# Google.com. That is a lot of columns, which can be unwieldy, but you can
# spread/gather them as you like

lh_df_1 <- download_lighthouse(
  url = "https://www.google.com",
  output_type = "simple",
  long_result = FALSE) # return the results in a wide data frame

class(lh_df_1)
# [1] "data.frame"
dim(lh_df_1)   # 1 row, 779 columns. The number of columns may differ wildly
# [1]   1 779  # because it also depends on the number and types of spotted errors



# this time let's download it and parse it into a messy long-like table:
lh_df_1_long <- download_lighthouse(
  url = "https://www.google.com",
  output_type = "simple", # return a parsed data frame
  long_result = TRUE) # reshape the data into an easier-to-digest long form

class(lh_df_1_long)
# [1] "data.frame"
dim(lh_df_1_long) # 780 rows in 3 columns
# [1] 780   3



# check "Performance" for Google.com & Bing.com for both desktop & mobile and
# return in a data frame with most important columns
lh_df_2 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 1, # wait 1 second between the calls to API
  categories = "performance", # which Lighthouse reports
  long_result = FALSE)       # are to be run?


class(lh_df_2)
# [1] "data.frame"
dim(lh_df_2)
# [1]    4 1231



# check "Performance" and "Accessibility" for Google.com & Bing.com for
# both desktop & mobile and return in a data frame with most important columns
lh_df_3 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 2,           # wait 2 seconds between the calls to API
  categories = c("performance", # run performance & accessibility
                 "accessibility"),
  long_result = FALSE)

class(lh_df_3)
# [1] "data.frame"
dim(lh_df_3)
# [1]    4 1637



# check "Performance" and "Accessibility" for Google.com & Bing.com for
# both desktop & mobile and return in a data frame with even more data,
# including error occurrences and the importance of each report result
lh_df_4 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return the results in a wide data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 2,           # wait 2 seconds between the calls to API
  categories = c("performance", # run performance & accessibility
                 "accessibility"),
  long_result = FALSE)



# another run for a messy long-like data frame
lh_df_4_long <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "simple", # return a parsed data frame
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 2,           # wait 2 seconds between the calls to API
  categories = c("performance", # run performance & accessibility
                 "accessibility"),
  long_result = TRUE) # reshape into 4 columns

class(lh_df_4_long)
# [1] "data.frame"
dim(lh_df_4_long)
# 1637 rows and 4 columns ("device" + "parameter" + page values x2)
# [1] 1637    4



# download nested list with "Performance" Lighthouse report for Google.com
lh_nl_1 <- download_lighthouse(
  url = "https://www.google.com",
  output_type = "raw") # return nested list with all possible data



# check "Performance" for Google.com & Bing.com for both desktop & mobile and
# return in a nested list with all possible data
lh_nl_2 <- download_lighthouse(
  url = c("https://www.google.com",
          "https://www.bing.com/"),
  output_type = "raw", # return nested list with all possible data
  strategy = c("desktop", # check both desktop and mobile, bind
               "mobile"),
  interval = 1,           # wait 1 second between the calls to API
  categories = "performance") # which Lighthouse reports
                              # are to be run?

## End(Not run)
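The wide multi-URL examples above return four rows for two URLs and two strategies, which suggests one result per URL/strategy combination. The following sketch enumerates those combinations and estimates the waiting time implied by interval. It does not reproduce the package's internal loop, and the pacing assumption (one pause between consecutive calls) is only a guess.

```r
urls <- c("https://www.google.com", "https://www.bing.com/")
strategies <- c("desktop", "mobile")
interval <- 1

# One row per URL/strategy combination.
combos <- expand.grid(url = urls, strategy = strategies,
                      stringsAsFactors = FALSE)
nrow(combos)

# Rough lower bound on time spent waiting, assuming one pause between
# consecutive calls; the package's actual pacing may differ.
min_wait_seconds <- (nrow(combos) - 1) * interval
min_wait_seconds
```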

Leszek-Sieminski/pagespeedParseR documentation built on May 12, 2021, 2:29 p.m.