
ezPM Compared with RAPM: Part II (Offense and Defense)


In a previous post, I showed the results for regressions of ezPM against 1-yr and 3-yr RAPM (regularized adjusted +/-). Now, let’s take a look at how the offensive and defensive components of ezPM correlate with their RAPM counterparts. If you are familiar with ezPM, then you know I typically calculate three separate components: O100, D100, and REB100. To enable comparison with RAPM data, I folded REB100 into O100 and D100 to give total offense and defense components (i.e., offense now includes offensive rebounding and defense includes defensive rebounding). Just as a quick refresher, I re-ran the regression for the overall metric comparison, this time weighting each player by number of possessions and focusing only on the 3-yr data set:


RAPM as a function of EZPM (3-YR).

Call:
lm(formula = RAPM ~ EZPM100, data = tot, weights = POSS)
Residuals:
    Min      1Q  Median      3Q     Max
-528.06  -84.51   -7.21   64.96  613.87
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.81273    0.09070    8.96   <2e-16 ***
EZPM100      0.60519    0.03686   16.42   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 142.3 on 381 degrees of freedom
Multiple R-squared: 0.4144, Adjusted R-squared: 0.4129
F-statistic: 269.6 on 1 and 381 DF,  p-value: < 2.2e-16
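For anyone who wants to reproduce this kind of fit outside R, here is a minimal Python sketch of a possession-weighted simple regression, the same model form as the lm call with weights = POSS. The function name and the toy numbers are illustrative assumptions, not the data set used here.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = a + b*x, minimizing sum(w * residual**2).

    Returns (intercept, slope, r_squared), with R^2 computed around the
    weighted mean of y.
    """
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum(wi * (yi - intercept - slope * xi) ** 2
                 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return intercept, slope, 1 - ss_res / ss_tot


# Toy example (made-up values): ezPM100, RAPM, and possessions for four players.
ezpm = [-2.0, 0.5, 1.5, 4.0]
rapm = [-1.0, 0.8, 1.2, 3.5]
poss = [1200, 5400, 8000, 15000]
intercept, slope, r2 = weighted_linear_fit(ezpm, rapm, poss)
print(intercept, slope, r2)
```

One thing to keep in mind when reading the R summary: for a weighted fit, R reports residuals on the sqrt(weight) scale, which is presumably why the residual standard error here is in the hundreds rather than around 1 to 2 points per 100 possessions.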

You can see that there is a slight increase in R^2, to 0.41 from 0.34 in the unweighted 3-yr fit from the previous post. Now, let’s look at the regression results for the offense:


RAPM vs. EZPM (3-YR Offense)

Call:
lm(formula = OFF_RAPM ~ O100, data = off, weights = POSS)
Residuals:
    Min      1Q  Median      3Q     Max
-399.88  -84.15  -19.91   45.38  564.10
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.23537    0.09328  -2.523   0.0120 *
O100         0.57146    0.04266  13.395   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 117.6 on 381 degrees of freedom
Multiple R-squared: 0.3201, Adjusted R-squared: 0.3184
F-statistic: 179.4 on 1 and 381 DF,  p-value: < 2.2e-16

The correlation between the individual offensive components of ezPM and RAPM is significant (p < 2.2e-16), with R^2 = 0.32. As I did last time, I want to give a table showing the best and worst players according to an average of the two metrics (note that Warriors guard Charlie Bell shows up in the Bottom 20):

Top 20 Offensive Players (> 5000 Possessions)

RANK NAME OFF_RAPM OFF_ezPM AVG
1 LeBron James 6.90 7.19 7.05
2 Steve Nash 7.80 5.56 6.68
3 Dwyane Wade 6.30 6.05 6.18
4 Chris Paul 5.10 7.07 6.09
5 Dwight Howard 3.60 6.37 4.99
6 Deron Williams 4.50 4.60 4.55
7 Chauncey Billups 3.90 4.72 4.31
8 Pau Gasol 2.90 5.67 4.29
9 Dirk Nowitzki 5.10 3.16 4.13
10 Manu Ginobili 4.20 3.74 3.97
11 Kobe Bryant 4.30 3.60 3.95
12 Brandon Roy 3.60 3.88 3.74
13 Chris Bosh 3.10 4.05 3.58
14 Kevin Martin 3.50 2.84 3.17
15 Joe Johnson 3.90 2.38 3.14
16 Nene Hilario 2.00 4.25 3.13
17 Amare Stoudemire 2.40 3.59 3.00
18 Ty Lawson 2.10 3.85 2.98
19 Carmelo Anthony 3.30 2.58 2.94
20 Kevin Love 0.90 4.94 2.92

Bottom 20 Offensive Players (> 5000 Possessions)

RANK NAME OFF_RAPM OFF_ezPM AVG
237 Donte Greene -1.10 -2.76 -1.93
236 Chris Kaman -2.80 -0.83 -1.82
235 Rasual Butler -2.10 -1.41 -1.76
234 Yi Jianlian -1.60 -1.87 -1.74
233 J.J. Hickson -3.30 -0.07 -1.69
232 Jonny Flynn -0.60 -2.55 -1.58
231 Corey Brewer -0.80 -2.21 -1.51
230 Brandon Rush -1.70 -1.31 -1.51
229 Dahntay Jones -2.90 0.08 -1.41
228 Darko Milicic -2.10 -0.45 -1.28
227 Jason Kapono -0.80 -1.74 -1.27
226 Tyrus Thomas -2.30 0.14 -1.08
225 Andray Blatche -1.70 -0.41 -1.06
224 Spencer Hawes -0.90 -1.08 -0.99
223 Joel Anthony -3.30 1.34 -0.98
222 Charlie Bell -0.90 -0.98 -0.94
221 Rafer Alston -0.80 -0.98 -0.89
220 Trevor Ariza -1.40 -0.32 -0.86
219 Tyreke Evans -2.10 0.46 -0.82
218 Kurt Thomas -1.80 0.18 -0.81
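The AVG column and the ranking in the tables above come from a simple average of the two metrics. A minimal Python sketch of that calculation follows, using four rows copied from the Top 20 table; the function name is a made-up helper, and the possession cutoff is assumed to have been applied before this step.

```python
def rank_by_average(players):
    """players: list of (name, off_rapm, off_ezpm) tuples.

    Returns a list of (rank, name, average), best average first.
    """
    rows = [(name, (rapm + ezpm) / 2) for name, rapm, ezpm in players]
    rows.sort(key=lambda row: row[1], reverse=True)
    return [(i + 1, name, avg) for i, (name, avg) in enumerate(rows)]


# Rows taken from the Top 20 offensive table, deliberately out of order.
sample = [
    ("Chris Paul", 5.10, 7.07),
    ("LeBron James", 6.90, 7.19),
    ("Steve Nash", 7.80, 5.56),
    ("Dwyane Wade", 6.30, 6.05),
]

for rank, name, avg in rank_by_average(sample):
    print(f"{rank} {name} {avg:.2f}")
```

Averaging the raw values treats a point of RAPM and a point of ezPM as equally informative, which is a simplification given that the regression slope between them is well below 1.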

Here are the results for the defense:


RAPM vs. ezPM (3-YR Defense)

Call:
lm(formula = DEF_RAPM ~ D100, data = def, weights = POSS)
Residuals:
    Min      1Q  Median      3Q     Max
-392.80  -51.15    6.09   61.18  372.19
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.02475    0.08468   12.10   <2e-16 ***
D100         0.54199    0.04460   12.15   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 102.1 on 381 degrees of freedom
Multiple R-squared: 0.2793, Adjusted R-squared: 0.2774
F-statistic: 147.7 on 1 and 381 DF,  p-value: < 2.2e-16

Once again the results are statistically significant (p < 2.2e-16) and, perhaps somewhat surprisingly, the R^2 value of 0.28 is only slightly lower than for the offense (0.32). This tells me that ezPM is capturing quite a bit of the same defensive contribution as RAPM. To wrap it up, here are tables for the top and bottom players by the average of the two metrics (unfortunately, you will notice that $15M-man David Lee shows up on the less preferred of the two lists):

Top 20 Defensive Players (> 5000 Possessions)

RANK NAME DEF_RAPM DEF_ezPM AVG
1 Kevin Garnett 6.2 2.49 4.35
2 Dwight Howard 3.0 3.87 3.44
3 LeBron James 3.8 2.6 3.20
4 Andrew Bogut 4.1 1.88 2.99
5 Tim Duncan 3.5 2.08 2.79
6 Josh Smith 3.8 1.48 2.64
7 Gerald Wallace 3.0 2.19 2.60
8 Marcus Camby 3.0 2.03 2.52
9 Andrei Kirilenko 2.6 1.88 2.24
10 Ron Artest 3.2 0.88 2.04
11 Ben Wallace 2.5 1.37 1.94
12 Lamar Odom 3.4 0.38 1.89
13 Thabo Sefolosha 2.5 0.98 1.74
14 Kurt Thomas 2.6 0.85 1.73
15 Luol Deng 2.8 0.61 1.71
16 Trevor Ariza 2.1 1.29 1.70
17 Manu Ginobili 1.6 1.53 1.57
18 Tyrus Thomas 1.6 1.43 1.52
19 Anderson Varejao 2.1 0.81 1.46
20 James Harden 2.0 0.9 1.45

Bottom 20 Defensive Players (> 5000 Possessions)

RANK NAME DEF_RAPM DEF_ezPM AVG
237 Andrea Bargnani -3.1 -3.07 -3.09
236 Aaron Brooks -1.8 -3.86 -2.83
235 Charlie Villanueva -2.8 -2.75 -2.78
234 Kevin Martin -3.8 -1.72 -2.76
233 Will Bynum -1.8 -3.47 -2.64
232 Jason Kapono -1.1 -3.98 -2.54
231 D.J. Augustin -0.8 -4.15 -2.48
230 JaVale McGee -1.7 -3.17 -2.44
229 Jason Maxiell -1.2 -3.61 -2.41
228 Spencer Hawes -1.6 -3.01 -2.31
227 Goran Dragic -1.1 -3.5 -2.30
226 Jose Calderon -1.7 -2.67 -2.19
225 Jeff Green -2.2 -2.14 -2.17
224 Jonny Flynn -1.6 -2.73 -2.17
223 Antoine Wright -0.9 -3.43 -2.17
222 J.J. Hickson -2.6 -1.63 -2.12
221 David Lee -1.9 -2.31 -2.11
220 Devin Harris -1.1 -3.03 -2.07
219 Ben Gordon -2.1 -1.99 -2.05
218 Maurice Evans -1.5 -2.52 -2.01
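As a quick consistency check on the three weighted regression summaries in this post, the adjusted R^2 and F-statistic can be recomputed from the reported R^2 and the degrees of freedom. This is a minimal Python sketch using the standard formulas for a one-predictor fit; the function names are made up for illustration, and small differences from the R output are expected because the inputs are already rounded.

```python
def adjusted_r2(r2, n, p=1):
    """Adjusted R^2 for a fit with p predictors and n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p=1):
    """Overall F statistic implied by R^2, on p and n - p - 1 degrees of freedom."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


# 381 residual degrees of freedom plus 2 fitted coefficients (intercept, slope).
N = 381 + 2

# (label, reported R^2, reported adjusted R^2, reported F) from the R summaries.
fits = [
    ("total", 0.4144, 0.4129, 269.6),
    ("offense", 0.3201, 0.3184, 179.4),
    ("defense", 0.2793, 0.2774, 147.7),
]

for label, r2, reported_adj, reported_f in fits:
    print(label,
          round(adjusted_r2(r2, N), 4), reported_adj,
          round(f_statistic(r2, N), 1), reported_f)
```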

 

ezPM Compared with RAPM: Part I


An APBR forum member (back2newbelf) has recently started publishing regularized adjusted +/- (RAPM) data. It’s like +/- data, but on steroids. If you want to learn more about how RAPM works in general, see the paper presented by Joe Sill at the MIT Sloan Sports Analytics Conference in March 2010. I thought it would be useful at this point in the development of the ezPM model to compare its 1-yr and 3-yr averages with back2newbelf’s RAPM data. What follows are the results of regressing total RAPM on total ezPM100 (both metrics are per 100 possessions), along with tables of the best and worst players by the average of the two metrics, to give some idea of the actual numbers. In a subsequent post, I will perform the same type of analysis on the offensive and defensive components of the metrics.

1 Yr RAPM

The 1yr RAPM data set can be found here. I used a 1000-possession minimum as my cutoff, which left 207 players to compare. Here are the results in graphical form, followed by regression data from R:

1 yr. RAPM as a function of ezPM100.
Call:
lm(formula = RAPM ~ EZPM, data = data.1yr)
Residuals:
    Min      1Q  Median      3Q     Max
-3.2659 -0.8885  0.0009  0.8501  4.3308
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.20883    0.09066   2.303   0.0223 *
EZPM         0.27176    0.03348   8.118 4.34e-14 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.284 on 205 degrees of freedom
Multiple R-squared: 0.2433, Adjusted R-squared: 0.2396
F-statistic: 65.91 on 1 and 205 DF,  p-value: 4.335e-14

A few things to note here: 1) the regression result is highly significant (p = 4.34e-14); 2) the slope of the regression is 0.27, which means RAPM values are compressed relative to ezPM100 (a one-point difference in ezPM100 corresponds to only about 0.27 points of RAPM), or put the other way, ezPM100 spreads players out more than RAPM does; and 3) ezPM100 explains about 24% of the variance in RAPM (R^2 = 0.24).
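To make the slope concrete, here is a small Python sketch that plugs ezPM100 values into the fitted 1-yr line (coefficients copied from the R output above) and compares the prediction with the listed RAPM. The function name is just an illustrative helper; the two players and their values come from the 1-yr tables in this post.

```python
# Coefficients from the 1-yr fit: RAPM ~ EZPM
INTERCEPT = 0.20883
SLOPE = 0.27176


def predict_rapm(ezpm100):
    """RAPM predicted by the fitted line for a given ezPM100."""
    return INTERCEPT + SLOPE * ezpm100


# (name, listed RAPM, listed ezPM100) from the 1-yr tables.
examples = [
    ("LeBron James", 3.3, 7.37),
    ("J.J. Hickson", -2.8, -4.99),
]

for name, rapm, ezpm in examples:
    predicted = predict_rapm(ezpm)
    residual = rapm - predicted
    print(f"{name}: predicted {predicted:.2f}, listed {rapm:.1f}, residual {residual:+.2f}")
```

Both players sit farther from zero in RAPM than the line predicts, which is a reminder that with R^2 around 0.24 the individual residuals (about 1.3 points of standard error) are large relative to the fitted effect.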

Let’s look at some of the best and worst players according to the 1-yr data by averaging the two metrics:

Top 20 Players by 1 Yr. Metric Average (ezPM100+RAPM)/2

RANK NAME RAPM EZPM AVG
1 LeBron James 3.3 7.37 5.34
2 Manu Ginobili 4.1 5.41 4.76
3 Chris Paul 2.7 6.63 4.67
4 Dwight Howard 1.7 7.48 4.59
5 Pau Gasol 2.9 6.04 4.47
6 Dwyane Wade 2.3 6.45 4.38
7 Paul Pierce 3.4 5.20 4.30
8 Steve Nash 2.5 5.67 4.09
9 Dirk Nowitzki 5.2 2.43 3.82
10 Kevin Garnett 4.0 3.45 3.73
11 Kobe Bryant 0.5 6.22 3.36
12 Tyson Chandler 3.4 3.25 3.33
13 Nene Hilario 2.4 4.12 3.26
14 George Hill 3.5 2.87 3.19
15 Ronnie Brewer 0.9 5.43 3.17
16 Kevin Durant 1.7 4.30 3.00
17 Brandon Bass 3.2 2.75 2.98
18 Lamar Odom 1.7 4.11 2.91
19 Rajon Rondo 2.2 3.45 2.83
20 Al Horford 1.1 4.51 2.81

Bottom 20 Players

RANK NAME RAPM EZPM AVG
207 J.J. Hickson -2.8 -4.99 -3.90
206 Andrea Bargnani -0.9 -5.89 -3.40
205 Goran Dragic -3.6 -3.13 -3.37
204 Eric Bledsoe -2.5 -3.57 -3.04
203 Sonny Weems -1.4 -4.47 -2.94
202 Jordan Hill -2.9 -2.93 -2.92
201 DeMarcus Cousins -2.1 -3.71 -2.91
200 John Wall -1.8 -4.00 -2.90
199 Travis Outlaw -2.8 -2.97 -2.89
198 Dante Cunningham -0.9 -4.80 -2.85
197 Spencer Hawes -0.8 -4.79 -2.80
196 Steve Blake 0.6 -6.03 -2.72
195 Richard Hamilton -2.7 -2.27 -2.49
194 Charlie Villanueva -0.1 -4.77 -2.44
193 Stephen Jackson -2.1 -2.33 -2.22
192 Linas Kleiza -0.1 -4.31 -2.21
191 Jeff Green -0.4 -3.99 -2.20
190 Antawn Jamison -1.7 -2.65 -2.18
189 Michael Beasley -0.8 -3.46 -2.13
188 Darko Milicic -1.0 -3.25 -2.13

Ok, both models are clearly wrong. Did you see that block J.J. Hickson made on Griffin last night?

3 Yr RAPM

The 3 yr. RAPM data can be found here. My data set runs from the 2008-2009 season through the first week of February. As far as I know, the RAPM data weights each year equally, so I did the same to make the comparison fair. As before, a plot followed by numbers:

 

3 yr. RAPM as a function of ezPM100.

Call:
lm(formula = RAPM ~ EZPM, data = avg.3yr)
Residuals:
    Min      1Q  Median      3Q     Max
-5.8064 -1.1009  0.0435  1.1527  5.9218
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.58592    0.10964   5.344 1.92e-07 ***
EZPM         0.43908    0.03725  11.786  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.703 on 273 degrees of freedom
Multiple R-squared: 0.3372, Adjusted R-squared: 0.3348
F-statistic: 138.9 on 1 and 273 DF,  p-value: < 2.2e-16

As expected, there is an improvement in both the slope (0.44) and R^2 (~0.34). Here are the player tables:

Top 20 Players by 3 Yr Metric Average

RANK NAME RAPM EZPM AVG
1 LeBron James 10.6 9.8 10.2
2 Dwight Howard 6.7 10.1 8.4
3 Chris Paul 6.7 7.7 7.2
4 Kevin Garnett 7.2 4.1 5.6
5 Manu Ginobili 5.8 5.4 5.6
6 Tim Duncan 5.0 5.7 5.4
7 Steve Nash 7.4 3.3 5.3
8 Dirk Nowitzki 7.5 2.3 4.9
9 Kobe Bryant 4.7 5.0 4.8
10 Pau Gasol 4.0 5.4 4.7
11 Yao Ming 4.0 5.2 4.6
12 Chris Bosh 5.0 3.4 4.2
13 Paul Pierce 4.7 3.6 4.2
14 Greg Oden 2.7 5.6 4.2
15 Andrew Bogut 4.5 3.8 4.1
16 Lamar Odom 5.0 3.2 4.1
17 Leon Powe 1.4 6.7 4.1
18 Amir Johnson 5.2 2.9 4.1
19 Marcus Camby 3.2 4.3 3.8
20 Nene Hilario 3.6 3.9 3.8

Bottom 20 Players by 3yr Average (Or the List You Really Don’t Want to Appear On)

RANK NAME RAPM EZPM AVG
458 Gerald Green -1.9 -10.2 -6.0
457 Josh Powell -5.5 -6.4 -5.9
456 Bobby Brown -3.6 -7.4 -5.5
455 Adam Morrison -2.6 -7.7 -5.2
454 Ricky Davis -4.9 -5.0 -5.0
453 Darnell Jackson -2.8 -6.8 -4.8
452 Stephon Marbury -2.6 -6.8 -4.7
451 Brian Skinner -4.5 -4.6 -4.6
450 Brian Scalabrine -1.1 -8.0 -4.5
449 Johan Petro -3.7 -5.2 -4.5
448 Jannero Pargo -0.7 -8.1 -4.4
447 Marcus Williams -3.1 -5.6 -4.3
446 Malik Allen -2.2 -6.4 -4.3
445 Trenton Hassell -4.5 -3.8 -4.1
444 Timofey Mozgov -1.8 -6.5 -4.1
443 Rob Kurz -2.2 -6.0 -4.1
442 DaJuan Summers -1.2 -7.0 -4.1
441 J.J. Hickson -5.9 -2.2 -4.1
440 Oleksiy Pecherov -0.8 -7.3 -4.0
439 Brian Cook -2.6 -5.3 -3.9
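The equal weighting of seasons used for the 3-yr ezPM100 averages can be sketched in Python as below, next to the possession-weighted alternative for contrast. The function names and the per-season numbers are hypothetical, chosen only to show how far the two averages can diverge for a player whose playing time changed a lot.

```python
def equal_weight_average(season_values):
    """Average per-season ezPM100 values, each season counting the same."""
    return sum(season_values) / len(season_values)


def possession_weighted_average(season_values, season_possessions):
    """Average per-season ezPM100 values, weighted by possessions played."""
    total = sum(season_possessions)
    return sum(v * p for v, p in zip(season_values, season_possessions)) / total


# Hypothetical player: improving over three seasons with a growing role.
ezpm_by_season = [2.0, 4.0, 6.0]
poss_by_season = [1000, 3000, 6000]

print(equal_weight_average(ezpm_by_season))
print(possession_weighted_average(ezpm_by_season, poss_by_season))
```

Equal weighting matches how the RAPM data is understood to be built, at the cost of letting a low-minute season count as much as a full one.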

Conclusions

This was definitely a worthwhile exercise. It’s good to see how the ezPM model compares to RAPM. Of course, it should not be expected that the two models line up perfectly. That would be great, but in practice, we should be using multiple models to evaluate players. Some players may look better in one metric or the other. We should have more confidence in players who are highly rated by both an APM model and a box score metric, such as ezPM. For example, what I didn’t show here are the players who were ranked in the top 20 by either metric alone. That would have shown that Derek Fisher is one of the best players in the league according to RAPM 1yr data (2.4), but not according to ezPM (-4.03). Kris Humphries looks great according to ezPM (4.52), but not RAPM (-1.4). (His new girlfriend always looks great!)
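One simple way to surface players like Fisher and Humphries is to sort by the gap between the two metrics instead of their average. Here is a minimal Python sketch of that idea, with a made-up function name and 1-yr values quoted in this post.

```python
def biggest_disagreements(players, top_n=3):
    """players: list of (name, rapm, ezpm) tuples.

    Returns up to top_n (name, rapm - ezpm) pairs, largest absolute gap first.
    A positive gap means RAPM likes the player more than ezPM does.
    """
    gaps = [(name, rapm - ezpm) for name, rapm, ezpm in players]
    gaps.sort(key=lambda row: abs(row[1]), reverse=True)
    return gaps[:top_n]


# 1-yr values quoted in the text and tables above.
sample = [
    ("Dirk Nowitzki", 5.2, 2.43),
    ("Derek Fisher", 2.4, -4.03),
    ("Tyson Chandler", 3.4, 3.25),
    ("Kris Humphries", -1.4, 4.52),
]

for name, gap in biggest_disagreements(sample):
    print(f"{name}: RAPM - ezPM = {gap:+.2f}")
```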

Anyway, this is a good stopping point, but also a good starting point. Going forward, we’ll see if there are adjustments that can be made to ezPM that will make it even more consistent with RAPM. For example, why is Dirk rated so much higher in RAPM? Does it have something to do with usage? His teammates? It’s also important to ask which model is a better predictor. If one or the other (or an average) is a better predictor, we probably want to know that, right? As always, to be continued…