knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
# loading libraries
library(TidyTuesdayAltText)
library(tidyverse)
source("data-raw/createDataDictionary.R")

TidyTuesdayAltText

Hex logo for the package. White with a thick black border. Inside, the TidyTuesday logo on the top half, which is the words TidyTuesday in white against a broad brush stroke of black paint. On the bottom half, the words alt = "text" in black against a white background and within angle brackets to simulate HTML code.

The goal of TidyTuesdayAltText is to provide insight into the alternative (alt) text accompanying the data visualizations shared on Twitter as part of the TidyTuesday social project^[rfordatascience/tidytuesday: Official repo for the #tidytuesday project].

Navigation

Installation

<!-- You can install the released version of TidyTuesdayAltText from CRAN with:

install.packages("TidyTuesdayAltText")

-->

You can install the development version of TidyTuesdayAltText from GitHub with:

# install.packages("devtools")
devtools::install_github("spcanelon/TidyTuesdayAltText")

Note about installing from a private repo

While this repo is private, you will first need to authenticate to GitHub using a Personal Access Token (PAT) as a credential. You can follow the steps in the credential-caching chapter of Happy Git with R (Chapter 10 Cache credentials for HTTPS | Happy Git and GitHub for the useR), summarized below; a consolidated code sketch follows the steps.

  1. Create a personal access token using a usethis package helper function which pre-selects recommended scopes/permissions: usethis::create_github_token()

  2. Then store your token somewhere safe and treat it like you would a password.

  3. Call an R function to store your credentials using the credentials package, which will prompt you to enter your token: credentials::set_github_pat()
    This populates the GITHUB_PAT environment variable, which install_github() uses by default as its authentication token (i.e., the auth_token argument).

  4. Finally, proceed to install the package as usual: devtools::install_github("spcanelon/TidyTuesdayAltText")
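
Putting the steps together, here is a minimal sketch of the workflow (the token itself is supplied interactively and never stored in the script):

# Step 1: open the browser to create a PAT with recommended scopes pre-selected
usethis::create_github_token()

# Step 3: store the PAT as a credential; you will be prompted to paste the token
credentials::set_github_pat()

# Step 4: install the package; install_github() picks up the GITHUB_PAT set above
devtools::install_github("spcanelon/TidyTuesdayAltText")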

About the data

The package contains 5 datasets:

library(TidyTuesdayAltText)
?ttTweets2018
?ttTweets2019
?ttTweets2020
?ttTweets2021
?AltTextSubset
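
Once the package is installed, each dataset is lazy-loaded and can be used directly. As a quick illustration (dplyr::glimpse() is used here only as an example of inspecting a dataset's columns):

library(TidyTuesdayAltText)
library(dplyr)

# Preview the columns and first few values of the alt-text subset
glimpse(AltTextSubset)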

Original data were collected and made available by Tom Mock (\@thomas_mock) using {rtweet}.

Tweets were processed and scraped for alternative text by Silvia Canelón (\@spcanelon):

  1. Data were filtered to remove tweets without attached media (e.g. images)
  2. Data were supplemented with reply tweets collected using {rtweet}. This was done to identify whether the original tweet or a reply tweet contained an external link (e.g. data source, repository with source code)
  3. Alternative (alt) text was scraped from tweet images using {RSelenium}. The first image attached to each tweet was considered the primary image, and only the primary image from each tweet was scraped for alternative text. The following attributes were used to build the scraper (a scraping sketch follows the screenshot below):
     - CSS selector: .css-1dbjc4n.r-1p0dtai.r-1mlwlqe.r-1d2f490.r-11wrixw
     - Element attribute: aria-label
knitr::include_graphics("man/figures/webInspection.png")
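
For illustration, a minimal sketch of how alt text could be scraped with {RSelenium} using the selector and attribute above. The tweet URL and browser setup are assumptions made for this example, not the package's exact scraper:

library(RSelenium)

# Start a Selenium server and browser session (assumes a local Firefox install)
driver <- rsDriver(browser = "firefox", verbose = FALSE)
remDr <- driver$client

# Navigate to a tweet (placeholder URL)
remDr$navigate("https://twitter.com/user/status/1234567890")

# Locate the primary image using the CSS selector noted above
img <- remDr$findElement(
  using = "css selector",
  value = ".css-1dbjc4n.r-1p0dtai.r-1mlwlqe.r-1d2f490.r-11wrixw"
)

# The alt text lives in the aria-label attribute
altText <- img$getElementAttribute("aria-label")

# Clean up the browser session and server
remDr$close()
driver$server$stop()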

To respect any author's decision to delete a tweet or make their account private after the data were originally collected, this data package does not include data that could directly identify the tweet author.^[Developer Policy – Twitter Developers | Twitter Developer]

To obtain the tweet text, author screen name, and many other tweet attributes, you can "rehydrate" the TweetIds (or "status" ids^[Tweet object | Twitter Developer]) using the {rtweet} package.^[Get tweets data for given statuses (status IDs). — lookup_tweets • rOpenSci: rtweet]
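
As a sketch, rehydration might look like the following; it assumes a configured Twitter API token, and the TweetId column name is used here for illustration:

library(rtweet)

# Look up the full tweet objects for the stored status IDs
# (the TweetId column name is an assumption for this example)
rehydrated <- lookup_tweets(AltTextSubset$TweetId)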

AltTextSubset

A dataset containing the alternative text for media shared between 2018 and 2021 as part of the TidyTuesday social project, and other attributes of 441 tweets. This is a subset of the 2018-2021 datasets, containing only tweets with alternative text that isn't "Image," the default alternative text added by the Twitter app in the absence of customized alternative text. More information can be found using ?AltTextSubset.

createDataDictionary(AltTextSubset)

dataDictionary %>% knitr::kable(escape = TRUE)
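
For illustration, such a subset could in principle be derived from the yearly datasets along these lines (the AltText column name is an assumption; the "Image" default value comes from the description above):

library(dplyr)

# Combine the yearly datasets and keep only tweets with customized alt text
# (AltText column name assumed for illustration)
altTextSubset <- bind_rows(ttTweets2018, ttTweets2019, ttTweets2020, ttTweets2021) %>%
  filter(!is.na(AltText), AltText != "Image")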

ttTweets2021

Link to the raw data: data-raw/ttTweets2021.csv

A dataset containing the alternative text for media shared in 2021 as part of the TidyTuesday social project, and other attributes. More information can be found using ?ttTweets2021.

createDataDictionary(ttTweets2021)

dataDictionary %>% knitr::kable()

ttTweets2020

Link to the raw data: data-raw/ttTweets2020.csv

A dataset containing the alternative text for media shared in 2020 as part of the TidyTuesday social project, and other attributes. More information can be found using ?ttTweets2020.

createDataDictionary(ttTweets2020)

dataDictionary %>% knitr::kable()

ttTweets2019

Link to the raw data: data-raw/ttTweets2019.csv

A dataset containing the alternative text for media shared in 2019 as part of the TidyTuesday social project, and other attributes. More information can be found using ?ttTweets2019.

createDataDictionary(ttTweets2019)

dataDictionary %>% knitr::kable()

ttTweets2018

Link to the raw data: data-raw/ttTweets2018.csv

A dataset containing the alternative text for media shared in 2018 as part of the TidyTuesday social project, and other attributes. More information can be found using ?ttTweets2018.

createDataDictionary(ttTweets2018)

dataDictionary %>% knitr::kable()

Examples

(placeholder)

License

(placeholder)

Citation

To cite the TidyTuesdayAltText package, please use:

citation("TidyTuesdayAltText")

References

Data and hex logo originally published in:

Many thanks to Liz Hare (\@DogGeneticsLLC) for testing the package in development and performing the analyses that went into our CSV Conf 2021 talk.

And thank you to the following resources for providing guidance and inspiration for how this package was organized and documented:

Additional resources


