Every NHL goal from fall 2002 through the 2014 cup finals.
The data comprise information about play configuration and the
players on ice (including goalies) for every goal from the
2002-03 to 2013-14 NHL seasons.
Collected using A. C. Thomas's nhlscrapr package.
See the Chicago hockey analytics project at github.com/mataddy/hockey.
goal 
Info about each goal scored, including 
player 
Sparse Matrix with entries for who was on the ice for each goal: +1 for a home team player, -1 for an away team player, zero otherwise. 
team 
Sparse Matrix with indicators for each team*season interaction: +1 for home team, -1 for away team. 
config 
Special teams info. For example,

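To make the +1/-1 encoding of the player matrix concrete, here is a minimal sketch on made-up data. The player names, the three toy goals, and the object `toy_player` are hypothetical and are not part of the `hockey` data; the sketch only assumes the Matrix package is installed.

```r
library(Matrix)

# Three hypothetical goals and four hypothetical skaters.
# Each triplet (i, j, x) says: on goal i, player j was on the ice
# for the home team (x = +1) or for the away team (x = -1).
players <- c("HOME_A", "HOME_B", "AWAY_C", "AWAY_D")
toy_player <- sparseMatrix(
  i = c(1, 1, 1, 2, 2, 3, 3),
  j = c(1, 2, 3, 1, 4, 2, 3),
  x = c(1, 1, -1, 1, -1, 1, -1),
  dims = c(3, 4),
  dimnames = list(NULL, players)
)

# Players who were not on the ice for a goal are stored implicitly as zero.
print(toy_player)

# Number of goals each player was on the ice for, regardless of side
colSums(abs(toy_player))
```

Taking absolute values before summing counts appearances on either side, which is the same idea the example uses to compute the total number of goals per player.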
Matt Taddy, mataddy@gmail.com
Gramacy, Jensen, and Taddy (2013): "Estimating Player Contribution in Hockey with Regularized Logistic Regression", the Journal of Quantitative Analysis in Sports.
Gramacy, Taddy, and Tian (2015): "Hockey Player Performance via Regularized Logistic Regression", the Handbook of statistical methods for design and analysis in sports.
gamlr
## design
library(gamlr)
data(hockey)
x <- cbind(config,team,player)
y <- goal$homegoal
## fit the plus-minus regression model
## (non-player effects are unpenalized)
fit <- gamlr(x, y,
  lambda.min.ratio=0.05, nlambda=40, ## just so it runs in under 5 sec
  free=1:(ncol(config)+ncol(team)),
  standardize=FALSE, family="binomial")
plot(fit)
## look at estimated player [career] effects
B <- coef(fit)[colnames(player),]
sum(B!=0) # number of measurable effects (AICc selection)
B[order(-B)[1:10]] # 10 biggest
## convert to 2013-2014 season partial plus-minus
now <- goal$season=="20132014"
pm <- colSums(player[now,names(B)]*c(-1,1)[y[now]+1]) # traditional plus-minus
ng <- colSums(abs(player[now,names(B)])) # total number of goals
# The individual effect on probability that a
# given goal is for vs against that player's team
p <- 1/(1+exp(-B))
# multiply ng*p - ng*(1-p) to get expected plus-minus
ppm <- ng*(2*p-1)
# organize the data together and print top 20
effect <- data.frame(b=round(B,3),ppm=round(ppm,3),pm=pm)
effect <- effect[order(-effect$ppm),]
print(effect[1:20,])

Loading required package: Matrix
[1] 620
PETER_FORSBERG   ONDREJ_PALAT  TYLER_TOFFOLI ZIGMUND_PALFFY  SIDNEY_CROSBY
     0.7506064      0.6035498      0.5999503      0.4229641      0.4087186
  JOE_THORNTON  PAVEL_DATSYUK  LOGAN_COUTURE      ERIC_FEHR   MATT_MOULSON
     0.3808053      0.3696573      0.3616907      0.3613557      0.3510730
                        b    ppm pm
ONDREJ_PALAT        0.604 37.496 38
SIDNEY_CROSBY       0.409 31.847 52
HENRIK_LUNDQVIST    0.162 26.746  9
JONATHAN_TOEWS      0.301 24.060 35
ANDREI_MARKOV       0.274 23.707 34
TYLER_TOFFOLI       0.600 21.847 31
JOE_THORNTON        0.381 21.824 34
ANZE_KOPITAR        0.241 21.700 39
RYAN_NUGENT-HOPKINS 0.282 18.768 18
GABRIEL_LANDESKOG   0.260 18.379 36
PAVEL_DATSYUK       0.370 18.092 13
LOGAN_COUTURE       0.362 17.353 29
ALEX_OVECHKIN       0.300 16.389 16
MARIAN_HOSSA        0.261 15.681 21
DAVID_PERRON        0.273 15.186  2
ALEXANDER_SEMIN     0.349 15.040  1
MATT_MOULSON        0.351 14.595 22
MIKKO_KOIVU         0.262 14.057 12
FRANS_NIELSEN       0.289 14.053  8
JONATHAN_BERNIER    0.128 13.317 22
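
The partial plus-minus step in the example reduces to ng*p - ng*(1-p) = ng*(2*p-1), where p is the logistic transform of a player's coefficient and ng is the number of goals the player was on the ice for. The sketch below works through that arithmetic with made-up numbers; `b_toy` and `ng_toy` are hypothetical values chosen for illustration, not estimates from the hockey data.

```r
# Hypothetical player effects (log-odds scale) and on-ice goal counts
b_toy  <- c(PLAYER_X = 0.5, PLAYER_Y = 0, PLAYER_Z = -0.25)
ng_toy <- c(PLAYER_X = 100, PLAYER_Y = 80, PLAYER_Z = 60)

# Probability that a given goal is for, rather than against, the
# player's team; plogis(b) is the same as 1/(1+exp(-b))
p_toy <- plogis(b_toy)

# Expected goals for minus expected goals against
ppm_toy <- ng_toy * p_toy - ng_toy * (1 - p_toy)

# The compact form used in the example gives the same numbers
stopifnot(isTRUE(all.equal(ppm_toy, ng_toy * (2 * p_toy - 1))))

print(round(ppm_toy, 3))
```

A coefficient of zero gives p = 0.5 and therefore a partial plus-minus of zero, so only players with nonzero estimated effects receive a nonzero value.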