Counts for 2804 bigrams in 6175 restaurant reviews from the site www.we8there.com.
The short user-submitted reviews are accompanied by five-star ratings on four specific aspects of restaurant quality (food, service, value, and atmosphere) as well as on the overall experience. The reviews originally appeared in Maua and Cozman (2009), and the parsing details behind these specific counts are in Taddy (MNIR; 2013).
we8thereCounts |
A |
we8thereRatings |
A |
Matt Taddy, mataddy@gmail.com
Maua, D.D. and Cozman, F.G. (2009), Representing and classifying user reviews. In ENIA '09: VIII Encontro Nacional de Inteligencia Artificial, Brazil.
Taddy (2013, JASA), Multinomial Inverse Regression for Text Analysis.
Taddy (2013, AoAS), Distributed Multinomial Regression.
dmr, srproj
## some multinomial inverse regression
library(textir) # we8there data and srproj; dmr comes from distrom
## we'll regress counts onto 5-star overall rating
data(we8there)
## cl=NULL implies a serial run.
## To run in parallel on a fork cluster from the
## parallel package, uncomment the relevant lines below.
## Forking is unix only; use type="PSOCK" on windows
cl <- NULL
# library(parallel)
# cl <- makeCluster(detectCores(), type="FORK")
## small nlambda for a fast example
fits <- dmr(cl, we8thereRatings[,'Overall',drop=FALSE],
            we8thereCounts, bins=5, gamma=1, nlambda=10)
# stopCluster(cl)
## plot fits for a few individual terms
terms <- c("first date","chicken wing",
           "ate here", "good food",
           "food fabul","terribl servic")
par(mfrow=c(3,2))
for(j in terms)
{ plot(fits[[j]]); mtext(j,font=2,line=2) }
## extract coefficients
B <- coef(fits)
mean(B[2,]==0) # sparsity in loadings
## some big loadings in IR
B[2,order(B[2,])[1:10]]
B[2,order(-B[2,])[1:10]]
## do MNIR projection onto factors
z <- srproj(B,we8thereCounts)
## fit a fwd model to the factors
summary(fwd <- lm(we8thereRatings$Overall ~ z))
## truncate the fwd predictions to our known range
fwd$fitted[fwd$fitted<1] <- 1
fwd$fitted[fwd$fitted>5] <- 5
## plot the fitted rating by true rating
par(mfrow=c(1,1))
plot(fwd$fitted ~ factor(we8thereRatings$Overall),
     varwidth=TRUE, col="lightslategrey")
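
The last steps of the example (project counts onto the loadings, fit a forward regression, truncate the predictions) can be sketched in base R on simulated data. This is an illustration only: the counts are simulated, the loadings `phi` are made up rather than estimated by dmr, and the projection is written out by hand as loading-weighted counts divided by document length, which is assumed here to approximate what srproj computes.

```r
## Toy illustration only: simulated counts and made-up loadings,
## not the we8there data and not output from dmr() or srproj().
set.seed(1)
n <- 200; p <- 6
rating <- sample(1:5, n, replace = TRUE)

## hypothetical loadings: the first three terms are more common in
## high-rated reviews, the last three in low-rated reviews
phi <- c(0.4, 0.3, 0.2, -0.2, -0.3, -0.4)

## simulate Poisson term counts whose rates tilt with the rating
rate <- exp(outer(rating - 3, phi))   # n x p matrix of relative rates
X <- matrix(rpois(n * p, lambda = 2 * rate), n, p)

## projection: loading-weighted counts normalised by document length
## (assumed form; empty documents are given a projection of zero)
m <- rowSums(X)
z <- ifelse(m > 0, as.vector(X %*% phi) / m, 0)

## forward regression of the rating on the one-dimensional projection
fwd <- lm(rating ~ z)

## truncate the fitted values to the known 1 to 5 range
pred <- pmin(pmax(fitted(fwd), 1), 5)
```

Using pmin and pmax on the extracted fitted values leaves the lm object itself unmodified, which is one alternative to assigning into the fit as the example above does.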