ipo: Facebook, Google, and LinkedIn IPO filings

Description

On February 1, 2012, Facebook Inc. filed an S-1 form with the Securities and Exchange Commission as part of its initial public offering (IPO). This dataset includes the text of that document as well as the text of the IPO prospectuses of two competing companies: Google and LinkedIn.

Usage

data(ipo)

Format

The format is a list of three character vectors. The vectors contain the line-by-line text of the IPO prospectuses of Facebook, Google, and LinkedIn, respectively.
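A minimal sketch of how a list with this shape can be inspected in base R. The object `ipo_demo` and its contents are a hypothetical stand-in, not the real data; with the OIdata package installed, `data(ipo)` would supply the actual list.

```r
# Hypothetical stand-in with the documented shape: a list of three
# character vectors, one line of text per element.
ipo_demo <- list(
  c("Facebook, Inc.", "Class A Common Stock"),
  c("Google Inc.", "Class A Common Stock", "Prospectus"),
  c("LinkedIn Corporation")
)

# Number of lines in each document
line_counts <- lengths(ipo_demo)

# Confirm that every element is a character vector
all_character <- all(vapply(ipo_demo, is.character, logical(1)))

# First lines of the first document (Facebook, by the documented order)
head(ipo_demo[[1]], 2)
```

Positional indexing (`[[1]]`, `[[2]]`, `[[3]]`) is used because the documented order is Facebook, Google, LinkedIn; whether the real list carries element names is not stated here.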

Details

Each of the three prospectuses is encoded in UTF-8 and contains some non-word characters related to the layout of the original documents. For word-level analysis, process the data first with packages such as tm and stringr; see the example below.
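The kind of cleanup described above could look roughly like the following stringr sketch. The input lines and the helper `clean_lines` are hypothetical illustrations, and the choice to keep only the letters a to z is an assumption that would also discard accented characters.

```r
library(stringr)

# Hypothetical lines imitating layout residue in a prospectus
raw_lines <- c(
  "  PROSPECTUS   SUMMARY \u2022 ",
  "Our mission is to make the world more open,",
  "",
  "and connected.  12"
)

# Hypothetical helper: lower-case, strip non-letters, tidy whitespace
clean_lines <- function(x) {
  x <- str_to_lower(x)
  x <- str_replace_all(x, "[^a-z\\s]", " ")  # keep only a-z and whitespace
  x <- str_squish(x)                          # trim ends, collapse runs of spaces
  x[x != ""]                                  # drop lines left empty
}

cleaned <- clean_lines(raw_lines)

# Split into words and tabulate frequencies
words <- unlist(str_split(cleaned, " "))
head(sort(table(words), decreasing = TRUE))
```

The same helper could be applied to each element of the list with `lapply()` before building a corpus.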

Source

All IPO prospectuses are available from www.sec.gov: Facebook, Google, LinkedIn.

References

http://blogs.wsj.com/totalreturn/2012/02/06/mark-zuckerberg-ceo-for-life/

Credit to Qian Liu at the Wealthfront Blog for the data links and wordcloud example below.

Examples

data(ipo)
## Not run: 
# install.packages("tm")
# install.packages("wordcloud")
library(tm)
library(wordcloud)

# pre-process data
corp <- Corpus(VectorSource(ipo), readerControl = list(language = "en"))
corp <- tm_map(corp, removePunctuation)
corp <- tm_map(corp, content_transformer(tolower)) # wrap base functions (tm >= 0.6)
corp <- tm_map(corp, removeNumbers)
corp <- tm_map(corp, removeWords, stopwords("en"))
f    <- corp[1] # facebook
g    <- corp[2] # google
l    <- corp[3] # linkedin

tmat      <- TermDocumentMatrix(f)
m         <- as.matrix(tmat)
freq      <- rowSums(m)
words     <- rownames(m)
words.ord <- sort(freq, decreasing = TRUE)
barplot(words.ord[1:15], las = 2)

wordcloud(words, freq, min.freq = 100, colors = "blue")

tmat <- TermDocumentMatrix(c(f, g))
m    <- as.matrix(tmat)
comparison.cloud(m, max.words = 100)

## End(Not run)

OIdata documentation built on May 2, 2019, 2:14 p.m.