get_reddit_comments | R Documentation |
Query the PushShift API to search Reddit comments. Performs minimal input validation and normalization for convenience. Requests are batched with a pause between them, since the API returns at most 100 results per request.
get_reddit_comments(
  q = NA,
  ids = NA,
  size = 25,
  fields = NA,
  sort = c("created_utc", "score", "num_comments"),
  aggs = NA,
  author = NA,
  subreddit = NA,
  after = NA,
  before = NA,
  frequency = NA,
  metadata = FALSE,
  batch_pause = 1,
  parse_utc = TRUE,
  verbose = TRUE
)
q |
Search term, given as a string. Use a double-quoted string for phrases. |
ids |
Get specific comments via their ids. Comma-delimited base36 ids. |
size |
Number of results to return. Default is 25; values > 100 handled through batching. |
fields |
Return specific fields, given either as a comma-delimited string or as a character vector. By default, all fields are returned. The creation date/time is always returned. |
sort |
Sort by a specific attribute. One of "created_utc", "score", or "num_comments". |
aggs |
Return aggregation summary. DISABLED BY PUSHSHIFT DUE TO SERVER LOAD |
author |
Restrict to a specific author. |
subreddit |
Restrict to a specific subreddit. |
after |
Return results after this date. Epoch value or Integer + "s,m,h,d" (e.g. 30d for 30 days) |
before |
Return results before this date. Epoch value or Integer + "s,m,h,d" (e.g. 30d for 30 days) |
frequency |
Used with the aggs parameter when set to created_utc. DISABLED BY PUSHSHIFT DUE TO SERVER LOAD |
metadata |
Display metadata about the query. Default is FALSE. |
batch_pause |
Pause between batches in seconds. Default is 1s. |
parse_utc |
Boolean flag: parse UTC timestamps into human-readable date-times? Default TRUE. |
verbose |
Debug boolean flag to enable/disable message logging to the console. |
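The after and before arguments accept either an epoch value or a relative offset such as 30d. The following is a minimal sketch of how such values might be prepared in base R. The helper names to_epoch and is_relative_offset are hypothetical and not part of this package; the package's own validation may differ.

```r
# Hypothetical helpers for illustration only; not part of the package.

# Convert a date (or date-time) string to epoch seconds, interpreted as UTC.
to_epoch <- function(x) {
  as.integer(as.numeric(as.POSIXct(x, tz = "UTC")))
}

# Check whether a string looks like a relative offset: an integer
# followed by one of s, m, h, d (e.g. "30d").
is_relative_offset <- function(x) {
  grepl("^[0-9]+[smhd]$", x)
}

to_epoch("2020-01-01")      # 1577836800
is_relative_offset("30d")   # TRUE
is_relative_offset("30w")   # FALSE
```

Under these assumptions, a value such as to_epoch("2020-01-01") could be passed as after, and a string such as "30d" could be passed unchanged.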
A data frame with class tbl_df, with one row for each comment returned by the API.
## Not run:
test <- get_reddit_comments(q = "coffee maker", size = 250)
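Since the API returns at most 100 results per request, a call with size = 250 has to be split into several requests. The sketch below shows one way such a split could be computed. The function batch_sizes and its max_per_request argument are hypothetical and are not the package's actual implementation.

```r
# Hypothetical illustration of splitting a requested size into batches
# of at most 100 results each; not the package's internal code.
batch_sizes <- function(size, max_per_request = 100L) {
  full <- size %/% max_per_request   # number of full batches
  rest <- size %% max_per_request    # size of the final partial batch
  as.integer(c(rep(max_per_request, full), if (rest > 0) rest))
}

batch_sizes(250)  # 100 100 50
batch_sizes(25)   # 25
```

With batch_pause = 1, a split like this would mean a one-second pause between each of the three requests.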