Batch Processing & Quota Management

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)

Understanding YouTube API Quotas

The YouTube Data API v3 uses a quota system to manage usage. Every project is allocated 10,000 quota units per day by default.

Quota Costs by Operation

Different operations have different quota costs:

| Operation | Quota Cost | Example Function | |-----------|-----------|------------------| | Read (list, search) | 1 unit | get_video_details(), yt_search() | | Search | 100 units | yt_search() | | Write (insert) | 50 units | create_playlist(), upload_video() | | Update | 50 units | update_video_metadata() | | Delete | 50 units | delete_videos(), delete_comments() | | Upload video | 1,600 units | upload_video() |

Important: Search operations (yt_search(), yt_topic_search()) cost 100 units each, so you can only perform 100 searches per day with the default quota!

Tracking Quota Usage

tuber provides built-in quota tracking:

library(tuber)

# Check current quota usage
quota_status <- yt_get_quota_usage()
print(quota_status)

# View details
quota_status$quota_used      # Units used today
quota_status$quota_limit     # Your daily limit (default: 10,000)
quota_status$quota_remaining # Units remaining
quota_status$reset_time      # When quota resets (midnight Pacific Time)

Setting Custom Quota Limits

If you've requested a quota increase from Google, update your limit:

# Set a custom quota limit
yt_set_quota_limit(50000)

# Reset quota tracking (e.g., after midnight)
yt_reset_quota()

Authentication Strategies

tuber supports two authentication methods, each with different use cases.

API Key Authentication

Best for read-only public data: - Searching videos - Getting public video statistics - Fetching channel information - Reading public comments

# Set up API key (get one from Google Cloud Console)
yt_set_key("YOUR_API_KEY")

# Use API key for read operations
video_stats <- get_video_details(
  video_ids = "dQw4w9WgXcQ",
  auth = "key"
)

channel_stats <- get_channel_stats(
  channel_ids = "UCuAXFkgsw1L7xaCfnd5JJOw",
  auth = "key"
)

OAuth2 Authentication

Required for: - Accessing private/unlisted content - Writing data (creating playlists, uploading videos) - Deleting resources - Accessing user's own channel data

# Set up OAuth2 (opens browser for authentication)
yt_oauth("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")

# Write operations require OAuth
create_playlist(
  title = "My New Playlist",
  description = "A collection of favorites",
  status = "private"
)

# Access your own channel's unlisted videos
list_my_videos()

Which Authentication to Use?

| Task | API Key | OAuth2 | |------|---------|--------| | Search public videos | Yes | Yes | | Get public video stats | Yes | Yes | | Get public channel info | Yes | Yes | | Read public comments | Yes | Yes | | Access unlisted videos (own) | No | Yes | | Access private videos (own) | No | Yes | | Create/update playlists | No | Yes | | Upload videos | No | Yes | | Delete content | No | Yes | | Manage comments | No | Yes |

Batch Processing

When working with many videos or channels, batch processing is essential for efficiency.

Processing Multiple Videos

# Get details for multiple videos at once
video_ids <- c("dQw4w9WgXcQ", "M7FIvfx5J10", "kJQP7kiw5Fk")

# Single API call for multiple videos (more efficient)
videos <- get_video_details(
  video_ids = video_ids,
  part = c("snippet", "statistics"),
  auth = "key"
)

# Analyze results
head(videos)

Using Bulk Analysis Functions

tuber provides high-level functions for common analysis tasks:

# Comprehensive video analysis
analysis <- bulk_video_analysis(
  video_ids = video_ids,
  include_comments = FALSE,
  auth = "key"
)

# Access results
analysis$video_data    # Detailed video information
analysis$benchmarks    # Performance percentiles
analysis$summary       # Overall statistics

# Channel analysis
channel_analysis <- analyze_channel(
  channel_id = "UCuAXFkgsw1L7xaCfnd5JJOw",
  max_videos = 50,
  auth = "key"
)

# Compare multiple channels
comparison <- compare_channels(
  channel_ids = c("UC1", "UC2", "UC3"),
  metrics = c("subscriber_count", "video_count", "view_count"),
  auth = "key"
)

Pagination Handling

tuber automatically handles pagination for large result sets:

# Request more items than API allows per page (50)
# tuber automatically makes multiple API calls
playlist_items <- get_playlist_items(
  playlist_id = "PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf",
  max_results = 200,  # Will make 4 API calls
  auth = "key"
)

# Get many comments with automatic pagination
comments <- get_comment_threads(
  filter = c(video_id = "dQw4w9WgXcQ"),
  max_results = 500,  # Will paginate automatically
  auth = "key"
)

Extracting Data from Results

YouTube API responses contain nested JSON. Here's how to work with them.

Using simplify = TRUE (Default)

Most functions flatten nested data into data frames:

# Get simplified output
videos <- get_video_details(
  video_ids = "dQw4w9WgXcQ",
  simplify = TRUE,
  auth = "key"
)

# Access fields directly
videos$title
videos$viewCount
videos$likeCount
videos$channelTitle

Using simplify = FALSE for Complex Data

When you need the full nested structure:

# Get raw API response
videos_raw <- get_video_details(
  video_ids = "dQw4w9WgXcQ",
  simplify = FALSE,
  auth = "key"
)

# Navigate nested structure
video <- videos_raw$items[[1]]
video$snippet$title
video$snippet$thumbnails$high$url
video$statistics$viewCount
video$contentDetails$duration

Common Field Access Patterns

# Video details
videos$snippet.title          # Title
videos$snippet.description    # Description
videos$statistics.viewCount   # View count
videos$statistics.likeCount   # Like count
videos$contentDetails.duration # Duration (ISO 8601)

# Channel details
channels$snippet.title           # Channel name
channels$statistics.subscriberCount
channels$statistics.videoCount
channels$brandingSettings.channel.description

# Comment extraction
comments$snippet.topLevelComment.snippet.textDisplay
comments$snippet.topLevelComment.snippet.authorDisplayName
comments$snippet.topLevelComment.snippet.likeCount

Error Handling & Retries

Using with_retry for Transient Errors

# Automatic retry with exponential backoff
result <- with_retry(
  get_video_details(video_ids = "dQw4w9WgXcQ", auth = "key"),
  max_retries = 3,
  base_delay = 1
)

Handling Quota Exhaustion

# Check before making requests
quota <- yt_get_quota_usage()
if (quota$quota_remaining < 100) {
  warning("Low quota! Consider waiting until reset at: ", quota$reset_time)
}

# Wrap expensive operations
tryCatch({
  results <- yt_search(term = "R programming", max_results = 50)
}, error = function(e) {
  if (grepl("quota", e$message, ignore.case = TRUE)) {
    message("Quota exceeded. Try again after: ", yt_get_quota_usage()$reset_time)
  }
})

Rate Limiting Best Practices

# Add delays between requests
video_ids <- c("id1", "id2", "id3", "id4", "id5")

results <- lapply(video_ids, function(vid) {
  Sys.sleep(0.5)  # 500ms delay between requests
  get_video_details(video_ids = vid, auth = "key")
})

Caching for Performance

tuber includes built-in caching for frequently accessed data:

# Configure cache
tuber_cache_config(
  enabled = TRUE,
  max_size = 100,
  ttl = 3600  # 1 hour TTL
)

# Cached functions (no API call if recently fetched)
cats <- list_videocats_cached(auth = "key")
langs <- list_langs_cached(auth = "key")
regions <- list_regions_cached(auth = "key")
channel <- get_channel_info_cached(channel_id = "UCxyz", auth = "key")

# Check cache status
tuber_cache_info()

# Clear cache when needed
tuber_cache_clear()

Practical Examples

Example 1: Analyze a Channel's Performance

# Full channel analysis
analysis <- analyze_channel(
  channel_id = "UCuAXFkgsw1L7xaCfnd5JJOw",
  max_videos = 100,
  auth = "key"
)

# Summary statistics
cat("Channel:", analysis$channel_info$title, "\n")
cat("Subscribers:", analysis$channel_info$subscriberCount, "\n")
cat("Average views:", analysis$performance_metrics$avg_views_per_video, "\n")
cat("Engagement rate:", analysis$performance_metrics$engagement_rate, "\n")

Example 2: Trending Analysis

# Analyze trending topics
trends <- analyze_trends(
  search_terms = c("machine learning", "AI", "data science"),
  time_period = "month",
  max_results = 25,
  region_code = "US",
  auth = "key"
)

# View trend summary
print(trends$trend_summary)

# Most trending term
best_trend <- trends$trend_summary[1, ]
cat("Top trending:", best_trend$search_term, "\n")
cat("Total views:", best_trend$total_views, "\n")

Example 3: Efficient Video Processing

# Get all videos from a playlist
playlist_videos <- get_playlist_items(
  playlist_id = "PLrAXtmErZgOeiKm4sgNOknGvNjby9efdf",
  max_results = 100,
  auth = "key"
)

# Extract video IDs
video_ids <- sapply(playlist_videos$items, function(x) {
  x$contentDetails$videoId
})

# Get detailed stats for all videos in one call
video_stats <- get_video_details(
  video_ids = video_ids,
  part = c("statistics", "contentDetails"),
  auth = "key"
)

# Analyze performance
total_views <- sum(as.numeric(video_stats$viewCount), na.rm = TRUE)
avg_duration <- mean(as.numeric(video_stats$duration), na.rm = TRUE)

Troubleshooting

Common Issues

"quotaExceeded" error: - Check quota with yt_get_quota_usage() - Wait until reset_time or request quota increase from Google

"forbidden" error: - Ensure YouTube Data API is enabled in Google Cloud Console - Check API key/OAuth credentials are correct - Verify the resource isn't private

"videoNotFound" or empty results: - Video may be private, deleted, or region-restricted - Double-check the video/channel ID format

Rate limiting (429 errors): - Add delays with Sys.sleep() between requests - Use with_retry() for automatic backoff

Getting Help

# Check function documentation
?get_video_details
?yt_search
?with_retry

# View package vignettes
browseVignettes("tuber")


Try the tuber package in your browser

Any scripts or data that you put into this service are public.

tuber documentation built on March 25, 2026, 9:08 a.m.