ucla_textbooks_f18: Sample of UCLA course textbooks for Fall 2018


Description

A sample of courses was collected from UCLA for Fall 2018, and the corresponding textbook prices were collected from both the UCLA bookstore and Amazon.

An earlier data set was collected from UCLA courses in Spring 2010, and Amazon prices at that time were found to be almost uniformly lower than those of the UCLA bookstore. In 2018, the UCLA bookstore is about even with Amazon on the vast majority of titles, and the sample data show no statistically significant difference.

Usage

data("ucla_textbooks_f18")

Format

A data frame with 201 observations on the following 20 variables.

year

Year the course was offered.

term

Term the course was offered.

subject

Subject.

subject_abbr

Subject abbreviation, if any.

course

Course name.

course_num

Course number, complete.

course_numeric

Course number, numeric only.

seminar

Boolean indicating whether this is a seminar course.

ind_study

Boolean indicating whether this is some form of independent study.

apprenticeship

Boolean indicating whether this is an apprenticeship.

internship

Boolean indicating whether this is an internship.

honors_contracts

Boolean indicating whether this is an honors contracts course.

laboratory

Boolean indicating whether this is a lab.

special_topic

Boolean indicating whether this is any of the special types of courses listed above.

textbook_isbn

Textbook ISBN.

bookstore_new

New price at the UCLA bookstore.

bookstore_used

Used price at the UCLA bookstore.

amazon_new

New price sold by Amazon.

amazon_used

Used price sold by Amazon.

notes

Any relevant notes.
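
The description of special_topic suggests it summarizes the course-type flags. The sketch below shows one way such a column could be derived, assuming it is simply TRUE whenever at least one of the six flags is TRUE. That rule is an assumption rather than something this page states, and the rows are made up for illustration.

```r
# Hypothetical rows that mimic the course-type flags (not real data).
courses <- data.frame(
  seminar          = c(FALSE, TRUE,  FALSE),
  ind_study        = c(FALSE, FALSE, FALSE),
  apprenticeship   = c(FALSE, FALSE, FALSE),
  internship       = c(FALSE, FALSE, FALSE),
  honors_contracts = c(FALSE, FALSE, FALSE),
  laboratory       = c(FALSE, FALSE, TRUE)
)

flags <- c("seminar", "ind_study", "apprenticeship",
           "internship", "honors_contracts", "laboratory")

# Assumption: a course counts as a special topic if any flag is TRUE.
courses$special_topic <- rowSums(courses[, flags]) > 0

courses
```

With the real data, the derived column could be compared against the stored special_topic column to check whether the assumed rule actually holds.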

Details

When a course required more than one book, the most expensive required book was generally the one recorded.

We advocate using raw dollar differences instead of percent differences because a 20% savings on a $10 book is minor relative to a 20% savings on a $100 book. With percent differences, a small and largely insignificant price difference on low-priced books would numerically balance (but not in a practical sense) a moderate but important price difference on more expensive books. So while raw differences tend to be a bit less sensitive in detecting some effect, we believe the absolute difference compares prices in a more meaningful way.
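
The point about raw versus percent differences can be illustrated with a small sketch. The two books and their prices below are hypothetical and were chosen only to mirror the $10 and $100 figures in the paragraph above; they are not taken from the data set.

```r
# Hypothetical prices, for illustration only (not from ucla_textbooks_f18).
prices <- data.frame(
  title         = c("cheap book", "expensive book"),
  bookstore_new = c(10, 100),
  amazon_new    = c(12, 80)
)

# Raw difference in dollars: positive means the bookstore is more expensive.
prices$raw_diff <- prices$bookstore_new - prices$amazon_new

# Percent difference relative to the bookstore price.
prices$pct_diff <- 100 * prices$raw_diff / prices$bookstore_new

prices

# The -20% and +20% percent differences cancel each other out,
# while the -$2 and +$20 raw differences do not.
mean(prices$pct_diff)
mean(prices$raw_diff)
```

Here the mean percent difference is zero even though the bookstore costs $18 more in total across the two books, which is the imbalance the raw difference is meant to capture.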

Used prices include the shipping cost but not tax. The used-price comparison is more nuanced, since these are all third-party sellers. Amazon is now often more a marketplace than a retail site, and many people buy from third-party sellers on Amazon without realizing it. The relationship Amazon has with third-party sellers is also challenging. Given the frequently changing dynamics in this space, we don't think any analysis here will be very reliable for long-term insights, since products from these sellers change frequently in quantity and price. For this reason, we focus only on new books sold directly by Amazon in our comparison. In a future round of data collection, it may be interesting to explore whether the dynamics have changed in the used market.

Source

http://sa.ucla.edu/ro/public/soc

http://ucla.verbacompare.com

http://amazon.com

See Also

textbooks, ucla_f18

Examples

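d <- ucla_textbooks_f18

# Compare new-book prices; points on the line y = x are priced equally.
plot(d$bookstore_new, d$amazon_new, log = "")
a <- c(0.01, 10000)
lines(a, a)

# The following outliers were double checked for accuracy.
d$price_diff <- d$bookstore_new - d$amazon_new
# Keep rows with a non-missing difference of more than $20 in either direction.
these <- !is.na(d$price_diff) & abs(d$price_diff) > 20
d[these, ]

# Count missing differences, then inspect their distribution
# and test whether the mean difference is zero.
table(is.na(d$price_diff))
hist(d$price_diff)
qqnorm(d$price_diff)
t.test(d$price_diff)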
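
The one-sample t-test on price_diff is equivalent to a paired t-test on the two price columns. The sketch below checks this on a few hypothetical prices (made up for illustration, including one missing Amazon price), so it runs without the data set loaded.

```r
# Hypothetical new-book prices, including one missing Amazon price.
bookstore_new <- c(50.00, 120.50, 35.25, 89.99, 15.00, 210.00)
amazon_new    <- c(48.75, 121.00, 35.25, 92.49, NA,    205.50)

# One-sample test on the differences; the NA difference is dropped.
diff_test <- t.test(bookstore_new - amazon_new)

# Paired two-sample test on the same data; incomplete pairs are dropped.
paired_test <- t.test(bookstore_new, amazon_new, paired = TRUE)

# Both approaches should give the same test statistic and p-value.
all.equal(unname(diff_test$statistic), unname(paired_test$statistic))
all.equal(diff_test$p.value, paired_test$p.value)
```

Working with the difference column directly, as the example does, keeps the missing-value handling visible, since table(is.na(d$price_diff)) shows how many pairs are excluded.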

JECheadle/RSOC317L documentation built on May 15, 2019, 4:02 a.m.