stream_tweets: Collect a live stream of Twitter data.

Description Usage Arguments Value See Also Examples

View source: R/stream.R

Description

Returns public statuses via one of the following four methods:

Stream with hardwired reconnection method to ensure timeout integrity.

Usage

1
2
3
4
stream_tweets(q = "", timeout = 30, parse = TRUE, token = NULL,
  file_name = NULL, verbose = TRUE, ...)

stream_tweets2(..., dir = NULL, append = FALSE)

Arguments

q

Query used to select and customize streaming collection method. There are four possible methods. (1) The default, q = "", returns a small random sample of all publicly available Twitter statuses. (2) To filter by keyword, provide a comma separated character string with the desired phrase(s) and keyword(s). (3) Track users by providing a comma separated list of user IDs or screen names. (4) Use four latitude/longitude bounding box points to stream by geo location. This must be provided via a vector of length 4, e.g., c(-125, 26, -65, 49).

timeout

Numeric scalar specifying amount of time, in seconds, to leave connection open while streaming/capturing tweets. By default, this is set to 30 seconds. To stream indefinitely, use timeout = FALSE to ensure JSON file is not deleted upon completion or timeout = Inf.

parse

Logical, indicating whether to return parsed data. By default, parse = TRUE, this function does the parsing for you. However, for larger streams, or for automated scripts designed to continuously collect data, this should be set to false as the parsing process can eat up processing resources and time. For other uses, setting parse to TRUE saves you from having to sort and parse the messy list structure returned by Twitter. (Note: if you set parse to false, you can use the parse_stream function to parse the JSON file at a later point in time.)

token

OAuth token. By default token = NULL fetches a non-exhausted token from an environment variable. Find instructions on how to create tokens and setup an environment variable in the tokens vignette (in r, send ?tokens to console).

file_name

Character with name of file. By default, a temporary file is created, tweets are parsed and returned to parent environment, and the temporary file is deleted.

verbose

Logical, indicating whether or not to include output processing/retrieval messages.

...

Insert magical parameters, spell, or potion here. Or filter for tweets by language, e.g., language = "en".

dir

Name of directory in which json files should be written. The default, NULL, will create a timestamped "stream" folder in the current working directory. If a dir name is provided that does not already exist, one will be created.

append

Logical indicating whether to append or overwrite file_name if the file already exists. Defaults to FALSE, meaning this function will overwrite the preexisting file_name (in other words, it will delete any old file with the same name as file_name) meaning the data will be added as new lines to file if pre-existing.

Value

Tweets data returned as data frame with users data as attribute.

Returns data as expected using original search_tweets function.

See Also

https://stream.twitter.com/1.1/statuses/filter.json

Other stream tweets: parse_stream

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
## Not run: 
## stream tweets mentioning "election" for 90 seconds
e <- stream_tweets("election", timeout = 90)

## data frame where each observation (row) is a different tweet
e

## users data also retrieved, access it via users_data()
users_data(e)

## plot tweet frequency
ts_plot(e, "secs")

## stream tweets mentioning Obama for 30 seconds
djt <- stream_tweets("realdonaldtrump", timeout = 30)

## preview tweets data
djt

## get user IDs of people who mentioned trump
usrs <- users_data(djt)

## lookup users data
usrdat <- lookup_users(unique(usrs$user_id))

## preview users data
usrdat

## store large amount of tweets in files using continuous streams
## by default, stream_tweets() returns a random sample of all tweets
## leave the query field blank for the random sample of all tweets.
stream_tweets(
  timeout = (60 * 10),
  parse = FALSE,
  file_name = "tweets1"
)
stream_tweets(
  timeout = (60 * 10),
  parse = FALSE,
  file_name = "tweets2"
)

## parse tweets at a later time using parse_stream function
tw1 <- parse_stream("tweets1.json")
tw1

tw2 <- parse_stream("tweets2.json")
tw2

## streaming tweets by specifying lat/long coordinates

## stream continental US tweets for 5 minutes
usa <- stream_tweets(
  c(-125, 26, -65, 49),
  timeout = 300
)

## use lookup_coords() for a shortcut verson of the above code
usa <- stream_tweets(
  lookup_coords("usa"),
  timeout = 300
)

## stream world tweets for 5 mins, save to JSON file
## shortcut coords note: lookup_coords("world")
world.old <- stream_tweets(
  c(-180, -90, 180, 90),
  timeout = (60 * 5),
  parse = FALSE,
  file_name = "world-tweets.json"
)

## read in JSON file
rtworld <- parse_stream("word-tweets.json")

## world data set with with lat lng coords variables
x <- lat_lng(rtworld)


## End(Not run)

rtweet documentation built on Nov. 17, 2017, 5:54 a.m.