Blazersedge: An SB Nation Community


Predicting Wins in 2009-2010, Consider "Regression to the Mean"


With the release of the 2009-2010 NBA schedule yesterday, we saw the first wave of predictions for the upcoming season. In making predictions for the Blazers or any other team, the first thing most people consider, whether explicitly or implicitly, is a team's record the previous season. The previous season serves as a baseline in our minds, and then we try to figure out if the team will improve or get worse. By and large, this is a sensible way of thinking.

However, an additional consideration that should go into our thought process--but that we often forget--is a phenomenon called regression to the mean. Regression to the mean is a technical term in probability and statistics that refers to the fact that, left to themselves, things tend to return to normal, whatever that is. The term was coined by Francis Galton, inventor of regression, when he noticed that the offspring of very tall parents tended to be shorter than their parents--at least that's the story they always tell in statistics classes. Regression to the mean in NBA wins would imply that teams that win a lot of games in one season are likely to win fewer games the following season, and conversely that teams that win few games in one season are likely to win more games the following season. In other words, a 60-win team in 2009 is more likely to win 55-59 games than 60-64 in 2010, while a 25-win team in 2009 is more likely to win 26-30 games than 20-24 games in 2010. Does this happen in the NBA? Indeed it does. In fact, regression to the mean is quite pronounced and holds controlling for the average age of teams. Details below.


Regression to the Mean in wins from Season to Season, the NBA from 1956 to 2009

 

It is actually fairly easy to assess the importance of regression to the mean in wins from one season to the next in the NBA. Team season records are readily available at basketball-reference.com or databasebasketball.com and the empirical question is straightforward: do teams tend to win fewer games the following season if they are above the mean number of wins (41)? There are a variety of ways to answer this question analytically, all of which point toward regression to the mean being quite robust. To demonstrate that this occurs, I have simply graphed the average change in the number of wins as a function of wins in the previous season for all teams from 1956 to 2009:

[Figure: average change in wins versus wins in the previous season, all teams, 1956 to 2009]

The graph shows that teams that won 40 or 41 games the previous season win about 40 to 41 games on average the next season, because the average change in the number of wins is about 0. As teams move away from average in either direction, however, regression to the mean occurs. Teams that win 46 games in a season tend to win about 2 fewer games the following season (44). In contrast, teams that win 37 games in one season win, on average, about 3 more games the next season (40). In addition, the farther a team moves away from the mean, the stronger the pull of the mean. 60-win teams win, on average, about 6 fewer games the following season (54), while 25-win teams tend to win about 6 more games the following season (31).
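For anyone who wants to reproduce a graph like this, here is a minimal Python sketch of the calculation: group team-seasons by wins and average the change in wins the following season. The function name and the (team, season, wins) record layout are my own assumptions, not the format of either data site, and the toy records are made up purely to show the input and output shape.

```python
from collections import defaultdict

def average_change_by_prior_wins(records):
    """records: iterable of (team, season, wins) tuples.

    Returns {prior_wins: mean change in wins the following season}.
    Assumes seasons are consecutive integers and team codes are stable.
    """
    wins = {(team, season): w for team, season, w in records}
    changes = defaultdict(list)
    for (team, season), w in wins.items():
        next_wins = wins.get((team, season + 1))
        if next_wins is not None:  # skip seasons with no follow-up season
            changes[w].append(next_wins - w)
    return {w: sum(c) / len(c) for w, c in sorted(changes.items())}

# Made-up toy records (not real NBA data), only to illustrate usage.
toy = [
    ("AAA", 2001, 60), ("AAA", 2002, 54),
    ("BBB", 2001, 25), ("BBB", 2002, 31),
    ("CCC", 2001, 60), ("CCC", 2002, 56),
]
print(average_change_by_prior_wins(toy))  # {25: 6.0, 60: -5.0}
```

With real data you would also want to handle shortened seasons and franchise moves, which this sketch ignores.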

In common language, this graph shows that bad teams tend to get better and elite teams tend to get worse. A fairly sensible implication of this pattern is that, as many have suggested, going from 54 to 60 wins is "harder" than going from 40 wins to 46 wins. Why is this true? There are a variety of possible reasons, but the most important one is probably luck. Teams that do well tend to avoid injuries, have favorable schedules, and win close games. Teams with bad luck (injuries, bad chemistry, or bad bounces) tend to see their fortunes brighten the following season simply because average luck is more likely than bad luck. I was fairly confident that I would see evidence of regression to the mean in the data, but the strength, regularity, and linearity of the pattern surprised me. I figured that the graph would be fairly flat around the middle, with teams with 35 to 47 wins not regressing to the mean much, but truly elite and terrible teams regressing to the mean quite strongly.
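To see how luck alone could produce this kind of pattern, here is a toy simulation in Python. Everything in it is an assumption chosen for illustration: each team gets a fixed "true" chance of winning any game (drawn from a normal distribution centered on .500 with a spread of 0.12, a number I picked rather than estimated), the roster never changes, and games are independent coin flips weighted by that chance. It is not a model of the real NBA.

```python
import random

def simulate(n_teams=3000, games=82, seed=0):
    """Return (season1_wins, season2_wins) for teams whose true strength is fixed."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_teams):
        # Assumed true strength: chance of winning any single game.
        p = min(max(rng.gauss(0.5, 0.12), 0.05), 0.95)
        season1 = sum(rng.random() < p for _ in range(games))
        season2 = sum(rng.random() < p for _ in range(games))
        pairs.append((season1, season2))
    return pairs

def mean_change(pairs, lo, hi):
    """Average change in wins for teams with lo <= season-1 wins <= hi."""
    diffs = [b - a for a, b in pairs if lo <= a <= hi]
    return sum(diffs) / len(diffs)

pairs = simulate()
print(mean_change(pairs, 55, 82))  # teams that won 55 or more in season 1
print(mean_change(pairs, 0, 27))   # teams that won 27 or fewer in season 1
```

Even though no team's underlying strength changes between the two seasons, the high-win group declines on average and the low-win group improves, because part of an extreme record is luck that does not repeat.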

Regression to the Mean in wins from Season to Season, the NBA from 1980 to 2009

To check on the robustness of this pattern, consider the graph below, which restricts the analysis to seasons from 1980 to 2009. Though the graphs look almost identical, they are run on different data, which is one indication of how regular this pattern is:

[Figure: average change in wins versus wins in the previous season, all teams, 1980 to 2009]

Is it all about Age?

One might wonder if this trend is simply a reflection of something that we've talked about before, age. That is, is the tendency for bad teams to get better and elite teams to get worse, simply a reflection of the fact that "bad" teams are really just younger teams and elite teams are full of veterans? The short answer is no. While it is true that older teams tend to get worse and younger teams tend to get better (with the break even point being an average age of 27 years), regression to the mean is still strong controlling for age. In other words, teams with an average age greater than 27 tend to have a worse record the following season, but the higher the number of wins the previous season, the worse their record.

To illustrate this, the graph below shows the average change in number of wins as a function of previous wins, controlling for a team's average age in the previous season. (For those that care, the y-axis is actually the residuals from a regression of change in wins on the age of a team in the previous season). That the slope of the line in this graph is less steep indicates that age was driving some--but not all--of the pattern in the previous graphs.

 

[Figure: average change in wins versus wins in the previous season, controlling for average team age, 1956 to 2009]
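For those that care about the mechanics, "controlling for age" here means taking residuals from a one-variable regression. Below is a minimal Python sketch of that step using ordinary least squares by hand. The function name and the toy numbers are mine and purely illustrative; they are not the data behind the graph.

```python
def ols_residuals(x, y):
    """Residuals from a least-squares fit of y on x (one predictor plus intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Made-up example: average team age and change in wins the next season.
ages = [24.5, 26.0, 27.0, 28.5, 30.0]
change_in_wins = [5, 2, 0, -3, -4]
residuals = ols_residuals(ages, change_in_wins)
print(residuals)
```

The residuals are the part of each team's change in wins that age does not account for; averaging them by previous-season wins gives an age-adjusted version of the earlier graphs.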

So what does this mean for the Blazers in 2009-2010?

 

Since the Blazers were both very young and very good in 2009, what should we expect in 2010? The youthfulness of the Blazers suggests that they should improve, but improving from 54 wins is very difficult. In particular, teams that have won 54 games have won an average of 51 games the following season. On the other hand, teams with an average age of 24 to 25 win about 5 additional games the following season. Quick and dirty regressions of wins on a set of dummy variables for wins the previous season and age the previous season yield a prediction of 54-56 wins for the Blazers in 2010, depending on some minor technical assumptions. While I do not believe that these are the only factors one should consider in projecting the Blazers' season in 2010, I also would not ignore them. If you think that the Blazers are going to win more than 54-56 games in 2010, it should be because you believe the addition of Andre Miller and the improvement of Oden and other players will make up for the normal regression to the mean that occurs in the NBA.

Lastly, for those of you that are not interested in averages, graphs, regressions, and whatnot, below is a list of the team records for all teams following a season of 54 wins (so, LAL won 54 games in 62, but 43 games in 63):

Team   Year   Wins   Age prior year
PH1    1960   48     26.43747
LAL    1963   43     26.49471
BOS    1968   48     30.05789
NYK    1969   60     26.63787
CHI    1974   47     29.21364
BOS    1976   44     29.32505
WA1    1979   39     28.55113
LAL    1981   57     27.44369
LAL    1984   62     28.10951
PHI    1986   45     28.51308
DEN    1988   44     28.99218
DET    1988   63     27.97651
PHO    1990   55     27.21177
UTA    1991   55     29.1215
CLE    1993   47     29.28346
CHA    1997   51     29.47461
DET    1997   37     28.57562
IND    1999   56     31.22052
ORL    1999   41     29.17436
MIA    1999   52     30.66639
DET    2004   54     27.88333
DET    2005   64     28.37027
PHO    2006   61     28.01781

 

As you can see, some 54-win teams improved, but more got worse. In addition, last year's Blazers, with an average age of 24.5, are far younger than all previous 54-win teams. Thus, there is no perfect historical analogy for the current team.
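If you want to check the tally yourself, here is a short Python sketch that summarizes the Wins column of the list above. The numbers are copied straight from that list; only the variable names are mine.

```python
# Following-season win totals, copied from the Wins column of the list above.
next_season_wins = [48, 43, 48, 60, 47, 44, 39, 57, 62, 45, 44, 63,
                    55, 55, 47, 51, 37, 56, 41, 52, 54, 64, 61]

improved = sum(w > 54 for w in next_season_wins)  # won more than 54
declined = sum(w < 54 for w in next_season_wins)  # won fewer than 54
same = sum(w == 54 for w in next_season_wins)     # won exactly 54 again
mean_wins = sum(next_season_wins) / len(next_season_wins)

print(len(next_season_wins), improved, declined, same, mean_wins)
# 23 9 13 1 51.0
```

So of the 23 teams listed, 9 improved, 13 declined, and 1 repeated at 54, with an average of 51 wins the following season.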

Nonetheless, seeing the strength of regression to the mean in the NBA probably has made me a bit more skeptical about the Blazers' chances of winning 60 games in 2010 (my original prediction), and it has had an even bigger impact on the way I will think about the rest of the league in 2010. Anyway, this is far from the final word on making projections for the coming season. It's just an interesting pattern that is easy to document that I thought the Blazersedge community might find interesting. Does this change your outlook on the Blazers in 2009-10? For other teams? Why or why not?

Any alternative explanations, comments, questions, or suggestions for additional analysis?

Comments

I applaud your willingness to

wade into meaningful stats. You’re well ahead of me there.

Another way of stating regression to the mean is that it is hard to become really good, and even harder to maintain that level.

"I'm a man, but I can change.....if I have to......I guess." - Red Green

by antediluvian on Aug 5, 2009 11:31 AM PDT reply actions   0 recs

So getting above par (improving) means return to par is more likely than not. What comes up must come down, but how far up is different for each team. If three teams get 54 wins, one is already on their way down and is passing 54 going the wrong direction. One is getting better and passing 54 on the way up. The third peaked the previous year at 54 and so will be part of the batch going back down the next year. 2 go down for 1 going up.

How can you analyze the zenith of each teams rise? Wins spent above .500? Peak wins and time to return to .500? Age relative to peak wins? Championships per team that include somebody on the roster named Brandon Roy?

I don’t know any statistics jokes, but I looked one up and included it here:

Statistics play an important role in genetics. For instance, statistics prove that numbers of offspring is an inherited trait. If your parent didn’t have any kids, odds are you won’t either.

The cowards never started
The weak died along the way
Only the strong survived
They were the Trailblazers

by lukeyhere on Aug 5, 2009 11:45 AM PDT reply actions   1 recs

A very interesting suggestion...

What you are suggesting is that there is some momentum to the trajectory of a franchise, with teams oscillating back and forth. You are right that this would produce a regression towards the mean type pattern in the data.

It’s not entirely clear how best to assess if this happens, but I did a quick check and I was surprised to see the opposite: the more "good" teams improved from year 1 to year 2, the more they regressed in year 3. Conversely, the more bad teams deteriorated from year 1 to year 2, the more they improved in year 3. For example, the Phoenix Suns won 62 games in 2004, 54 games in 2005, and then 61 games in 2006. In contrast, Orlando won 41 games in 1997, improved to 54 games in 1998, and then fell back to 41 wins in 1999.

by PoliSam on Aug 5, 2009 6:29 PM PDT up reply actions   0 recs

I don’t want to look at past teams stats i want a real prediction not some where between 54 and 56 wins. and no lakers stats this is a blazer blog. if you went to church would you talk about how the devil improved? no, i personally think 58 wins and a trip to the western conference finals, anything else will just be disappointing. (see short and concise)

by B-rizzoy on Aug 5, 2009 1:00 PM PDT reply actions   0 recs

This comment

was utterly silly.

The Michael Ruffin of BlazersEdge, cuz Amlmart said so.

by BlazersOrBust on Aug 5, 2009 2:41 PM PDT up reply actions   2 recs

Regression to the mean is not applicable here, though your graphs

might lead you to believe this. Specifically, when looking at all teams as a whole as compared to previous seasons, number of wins will be equally far from the mean from year to year. So if Cleveland’s wins go down, Portland’s might go up to maintain that “equally far from the mean” phenomenon.

Also, you cannot predict an individual team’s wins based on regression to the mean – if a team knows it’s not going to make the playoffs, they may not play as hard, skewing their win total for both them and their opponents.

Also, you should have a control sample when trying to establish regression to the mean, which is pretty much impossible in the NBA…

Patty Mills - PG of the future. Book it.

by Blazerholic on Aug 5, 2009 1:14 PM PDT reply actions   2 recs

well, i think you are using the term in a more technical way than I intended

what you described in the first paragraph is simply a reason that teams tend to revert toward the average record.

While it’s true that using a control sample is valuable for correcting for statistical regression to the mean, it’s not necessary here.

by PoliSam on Aug 5, 2009 6:34 PM PDT up reply actions   0 recs

Seems to me like several of the teams that did improove after 54 win seasons

went on to win championships, either that year or the following.

by pxilpooshr on Aug 5, 2009 1:14 PM PDT reply actions   0 recs

er improve.

by pxilpooshr on Aug 5, 2009 1:15 PM PDT up reply actions   0 recs

Normally you might be right

But barring any major injuries we should improve. For one the number of ridiculous comebacks should drop.

"Good evening Blazer fans, wherever you may be!"-Bill Schonely

by skywaker9 on Aug 5, 2009 1:21 PM PDT reply actions   0 recs

Player injury, Player movement, Draft talent, and Salary cap

would seem to be big reasons for teams getting better or worse.
 — How badly did Gilbert Arenas getting injured hurt the Wizards?
 — Boston traded to put together the Big Three and went from worst to first
 — Drafting Tim Duncan made San Antonio go from worst to first
 — Orlando would have liked to hold on to Hedo, but he went where the money was better

For these various factors, team from one year can have very little bearing to the team from previous year. For example, San Antonio won a lot last year, but they have upgraded with trades, and should win more. The team name “San Antonio Spurs” has stayed the same but the team make-up has not. So does regression to mean even apply when you’re not talking about the same teams?

by FromAfar on Aug 5, 2009 1:34 PM PDT reply actions   0 recs

For these various factors, team from one year can have very little bearing to the team from previous year. For example, San Antonio won a lot last year, but they have upgraded with trades, and should win more. The team name "San Antonio Spurs" has stayed the same but the team make-up has not. So does regression to mean even apply when you’re not talking about the same teams?

I am simply assuming the units are the same and letting the data tell me if there is regression toward the mean in those units. If the units were completely unrelated, you’d be less likely to see regression towards the mean.

by PoliSam on Aug 5, 2009 6:37 PM PDT up reply actions   0 recs

Isn't it the opposite?

I think that one of the biggest reasons why regression to mean happens is because of player turnover. If teams completely changed every year, there would be 100% regression to mean (every team could expect to win about 41 wins next year regardless of what happened in previous years). If teams are stable and are able to retain their key players, the regression effect should be reduced.

by trk on Aug 6, 2009 1:11 PM PDT up reply actions   0 recs

Is it possible

that you are mixing up parity and regression to the mean? I think what polisam said is the correct way to think about it with every team being a unit, while it seems that you are looking at the league as a unit.

Life is exhausting when you are this stupid.

I will talk about DeJuan Blair no more forever

by jonestr on Aug 7, 2009 9:51 AM PDT up reply actions   0 recs

Still question whether league wide regression around mean is applicable
If teams completely changed every year, there would be 100% regression to mean

A test of this statement's validity is to examine the inverse. That is, if teams stayed exactly the same, would they produce the exact same results from year to year? I’m not trying to suggest that we are trying to oversimplify the complex factors on performance. However, I question whether regression to the mean can be validly applied.

Regression to a mean suggests that point results from an experiment are not a reliable final indicator, since an occasional extreme result might be misconstrued as the typical answer. And rather, that results will “regress to a mean” after a series of experiments. In general you are keeping other factors constant. Applying regression to the mean across the entire league — there just seem to be too many variables, where there can be any regression to a mean.

The Clippers are “always” going to be the Clippers. So just because they had a 20 (?) loss season, does not mean that will get up to 62 win season someday to balance it out. Similarly, salary cap or not, some teams always seem to contend for championships. [Celtics, Spurs and L@kers]. And because they had a 60 (?) win season, does not mean that they will have a 22 loss season to balance it out. Maybe each team has its own mean that it regresses about.

Or for that matter for performance with constantly fluid set of conditions, bell curves are considerably more likely than a flat mean, where there a few really good, a few really bad, and the bulk falling in the middle. Could one apply some form of regression to a bell curve?

Lastly I would put forward that there is a “hysteresis effect”. Teams that are good remain good for several years, and they strive to “keep their window open”. Teams that are bad remain bad for several years, as they have “patience while rebuilding”. If a young, up-and-coming team won 49 games, one could certainly expect it to win more (and not regress to 41). The 90-91 Bulls won 61 games, and then won 67 games the following season. This hysteresis skews regression to the mean.

Maybe a given team that does not have much change might regress around some mean relevant for itself. The 72-win Bulls team won 69 the following season, and 62 the year after. They went lower each year, but the mean was still way over 54, and 54 wins probably would have been considered under-performance. So I can't see how league wide regression around 41 could reasonably apply to assess future performance of a given team.

by FromAfar on Aug 9, 2009 3:38 PM PDT up reply actions   0 recs

I believe we will win more games

As long as major injuries and/or major trades etc. don't happen.

S

The Princess of Blazersedge

Sports do not build character. They reveal it. - Casey Dillon Stengel

by BlazerFan1 on Aug 5, 2009 1:34 PM PDT reply actions   0 recs

Factors cause

regression to the mean shows an overall picture, but the things you mentioned should account for it. That’s what should guide us into predictions and not a line drawn down and showing a trend.

I'm a really really ridiculously good looking orange mocha frappaccino drinking manhammer sandwich

by hobobob on Aug 5, 2009 1:59 PM PDT up reply actions   0 recs

I like this post a lot.

You can measure skill and talent with your eyes, but productivity is shown through statistics.

by austinpwnz on Aug 5, 2009 2:04 PM PDT up reply actions   0 recs

I recced both you and PoliSam

his original post was an excellent springboard for exactly this kind of incisive counter-commentary. You made all the points I wish I had thought of during the ten seconds that I was considering extenuating factors.

The Michael Ruffin of BlazersEdge, cuz Amlmart said so.

by BlazersOrBust on Aug 5, 2009 2:44 PM PDT up reply actions   0 recs

I rec'd him, too

He identified reality. The cause of reality is open to discussion (and testing, if someone wants to do the work of testing it).

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 5, 2009 3:08 PM PDT up reply actions   0 recs

Injuries.

I could be way off, but I kinda figured we were lucky last year with regard to injuries. Maybe my view is skewed because I’m holding my breath anytime Roy has the ball and anything short of him blowing out a knee is a good thing.

Did we actually have more injuries last year than normal? I smell a graph. Of course it would need to be weighted by the value of that player, as a game lost by Roy weights heavier than a game lost by Shav. I smell a new stat. Player Injury Relative to Value or PIRV. I smell something else…

The cowards never started
The weak died along the way
Only the strong survived
They were the Trailblazers

by lukeyhere on Aug 5, 2009 2:46 PM PDT up reply actions   0 recs

Martell....

Greg for a short period
Roy for an even shorter period
LaMarcus for a short stint
Blake also out for some games.

Those are big names for this team, and for them to be sitting out games can only hurt our overall production.

For example: How we almost beat Cleveland in Cleveland without LaMarcus… that could have been a very different game if he were playing.

Big D from Blog-A-Bull - "Pritchard is such a genius that teams just give him players for free."

Greg Oden - The only other rookie with more than 500 points, 400 rebounds, and 65 blocks in under 1400 minutes played. Since 1946

by FiveOhThree-RipCity!! on Aug 5, 2009 3:39 PM PDT up reply actions   0 recs

some good points, but a couple of counterpoints

you are right that many things can cause the pattern in the data that I produced above. The draft is a really good one that I missed. Player movement as a result of the salary cap is another, though I would note that the pattern existed before the cap was instituted.

More importantly, I would make a pretty sharp distinction between things that are observable ex ante, like previous average age and the draft and things that are unpredictable or immeasurable ex ante, like injuries. While you can certainly measure injuries in the previous season, you cannot do so for the following season. The occurrence of injuries in 2010 is pretty much magical in my view.

I also think you are under appreciating the effect of luck on win total a bit. How many games are won because of a blown call by a ref? Because of a hot shooting night by a star? Intuition says that those things tend to balance out over the course of the season, but part of the intuition for regression to the mean comes from the fact that luck rarely balances out perfectly. Indeed, the probability that a fair coin will land heads 5,000,000 times in 10,000,000 tosses is pretty close to zero.
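To put a number on the coin example, here is a small Python sketch that computes the exact binomial probability of landing exactly half heads. It works with logarithms (via `math.lgamma`) so the huge factorials never overflow; the function name is mine.

```python
import math

def prob_exactly_half_heads(tosses):
    """Probability of exactly tosses/2 heads in an even number of fair coin tosses."""
    n = tosses // 2
    # log of C(tosses, n) / 2**tosses, using lgamma(k + 1) == log(k!)
    log_p = (math.lgamma(tosses + 1) - 2 * math.lgamma(n + 1)
             - tosses * math.log(2))
    return math.exp(log_p)

p = prob_exactly_half_heads(10_000_000)
print(p)  # roughly 0.00025, i.e. about 1 chance in 4,000
```

So a perfectly even split is the single most likely outcome and still almost never happens, which is the sense in which luck rarely balances out perfectly.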

Here’s a random factoid: there is a sort of regression to the mean effect within seasons. That is, NBA teams that win by more than expected in one game, tend to revert back to their underlying strength. There is almost no evidence of “streakiness” within seasons for teams and this is true even though there are injuries, which should drive the data towards “streakiness.”

by PoliSam on Aug 5, 2009 6:47 PM PDT up reply actions   0 recs

One way to control for luck, at least somewhat

would be to look at point differential rather than win/loss.

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 5, 2009 9:20 PM PDT up reply actions   0 recs

I would love to see the exact same analysis, but instead of focusing on W-L record focus on point differential

And see how much age plays in to that as well.

My one concern, how much statistical correlation is there between point differential and W-L? I suspect over the sample size we are talking about .99 or better, meaning that it will essentially show us the exact same thing.

But that is just a gut feeling and would actually need to be tested to be proved.

by diskord on Aug 5, 2009 11:03 PM PDT up reply actions   0 recs

There isn't going to be a .99 correlation between point differential and W-L, partially due to rounding

(i.e. you can’t have a fractional win, even though your pythagorean w-l projection suggests you should).

But there is a strong correlation. I took the W-L records for every team in the databasebasketball.com data set (data ends after the 2007-2008 season) and looked at the correlation between team winning % and pythagorean winning %. There is some question about the proper exponent to use in the equation, so I looked at both 14 and 16.5. Both had an R^2 value of .926. Or, in common talk, point differential explained 92.6% of team wins and losses, which seems to mean that about 6 games per season are up to the vagaries of chance.
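For anyone unfamiliar with the pythagorean projection, here is a minimal Python sketch of the usual formula, points scored raised to an exponent divided by the sum of points scored and points allowed each raised to that exponent. The function names and the sample scoring averages are hypothetical, chosen only to show how the two exponents mentioned above compare.

```python
def pythagorean_win_pct(points_for, points_against, exponent=14.0):
    """Pythagorean winning percentage from points scored and points allowed."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

def expected_wins(points_for, points_against, games=82, exponent=14.0):
    """Projected wins over a season of the given length."""
    return games * pythagorean_win_pct(points_for, points_against, exponent)

# Hypothetical per-game scoring averages, not any real team's numbers.
print(expected_wins(100.0, 100.0))                 # even point differential
print(expected_wins(104.0, 100.0, exponent=14.0))
print(expected_wins(104.0, 100.0, exponent=16.5))
```

The gap between a team's actual wins and this projection is one rough way to measure how much of its record came from things other than point differential.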

by tingeyga on Aug 6, 2009 8:15 AM PDT up reply actions   0 recs

I was thinking about this more

and I think that the draft is probably the biggest factor.

You will always have aberrations, the #1 pick who adds nothing to his team, the #25 pick who becomes an instant star, etc.

But over many seasons, if you look at the changes in wins based on draft position, I think you’ll see that is the biggest factor. For instance, it has been well documented that the team with the #1 pick, on average, gains 11 wins the following season.

Are all 11 of those wins attributable to that draft pick? Hard to say. But if they are, you have just wiped out most of what you are seeing here at the lower end of the scale.

It would be very difficult to measure. Sure, we added Greg, but we also made a significant roster move (dumping Zach) and Greg didn’t play. So our improvement two years ago was completely independent of Greg. Last year, we added Greg and he played, so we should see positive impact — but we also added Nic and Rudy, a 24 and 25 pick, and they made major contributions. So I don’t know how you really measure draft impact.

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 7, 2009 1:55 AM PDT up reply actions   0 recs

We also gained

12 wins when Greg had his rookie season (even though it wasn’t the year immediately after his draft). Weird

"I'm tired" -Me

by 92wastheyear on Aug 7, 2009 4:51 PM PDT up reply actions   0 recs

Baker's dozen

we knew what you meant.

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 8, 2009 3:02 AM PDT up reply actions   0 recs

The draft is definitely huge on the bottom side (for the losers)

but not nearly as important at the top (the winners). A very basic indicator for this claim is that there is more reversion towards the mean for teams with fewer wins than teams with a lot of wins. The draft is going to affect all teams, but would not explain why 65 win teams would regress more than 55 win teams.

by PoliSam on Aug 9, 2009 8:17 AM PDT up reply actions   0 recs

I suppose a 55 win team is marginally more likely

to draft a player who will help than a 65 win team would.

So we might expect to see some impact from the draft even at that level, but you are right that it wouldn’t (on average) be very much.

It might be more complicated than that, though.

The 20 win team loses all of their games to the 65 win team, but might win 1 of 4 against the 55 win team. If the draft turns them into a 30 win team, the odds are still pretty good that they only win 1 of four against the 55 win team, but their chances of winning one of their home games against the 65 win team are arguably much better than the year before.

In other words, strengthening the bottom teams impacts the good teams, but may have a greater impact on the overall record of the very best teams. To win a very high number of games, you probably need a lot of games against bottom feeders who aren’t a real threat to beat you.

So I’m guessing that the draft is non-negligible even on the top side — but probably not the main factor.

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 9, 2009 8:48 AM PDT up reply actions   0 recs

Well, as far as "luck" goes,

there is good and bad luck WITHIN an individual season, so that would reduce the likelihood that it would differentiate seasons. Regression to the mean, a model ultimately based on a large number or random factors, is a pretty rough predictor for more specific events. The more depth of understanding you have of the underlying factors, the better predictions you can make which may or may not show a “regression to the mean”, or tendency toward mediocrity.
Basically, I can agree that “regression to the mean” is a legitimate factor in how the Blazers (or any team) do next year. I just think other factors like how Oden plays and the addition of Miller are way more powerful variables that will mask the “regression to the mean” effect, plus numerous others that lots of fans could list. Just assuming that the negatives will counterbalance the positives toward the mean is ultimately a rather superficial and simplistic presumption, when we have quite a bit of information available on each player, understandings of their compatibility, etc…
And then we can go further into more philosophical issues, such as the ultimate adequacy of random events as the determinants of all that occurs, including human experience, life…. Interesting material, possible junk drawer OT sometime.

by Berkeley on Aug 8, 2009 12:13 AM PDT up reply actions   0 recs

I wonder how much luck could be quantified for a team in a given season

Record in close games is often cited as a lucky statistic, but what else could we use to determine if the Blazers were particularly lucky last season?

Team wide games missed to injury
Star players’ games missed to injury
Number of back to backs (and other schedule quirks)
Number of games against an opponent in the 2nd game of a back to back
Free throw percentage against

all compared to the mean, plus the aforementioned record in close games statistic.
There are probably some other ways to measure luck in basketball, but that’s a good place to start. It would be interesting to see a “luck factor”. I wonder if we were more lucky or unlucky last season.

"It’s a good ol’ fashioned Rip City beat down!"

by Magnum on Aug 5, 2009 1:58 PM PDT reply actions   0 recs

those are some really good suggestions

I think number of close games won, injuries, and opponent free throw percentage are all relatively good indicators of luck. If you really wanted to know how much bad luck affected a team as a result of injury, you would probably want to weight the number of games missed by something like win shares, adjusted plus-minus, or whatever number you think would give you a good estimate of the effect of the injury.

by PoliSam on Aug 5, 2009 6:50 PM PDT up reply actions   0 recs

I wish I remembered where I have seen this number

but the Blazers are something like .800 in more than 40 games decided by 3 points or less since Brandon Roy has joined the team. It’s definitely statistically significant. Would you chalk that up to Type I error?

The Michael Ruffin of BlazersEdge, cuz Amlmart said so.

by BlazersOrBust on Aug 6, 2009 7:53 AM PDT up reply actions   0 recs

One measure of luck I have seen

is comparing the Pythagorean wins vs actual wins

just throwing that out there.

Life is exhausting when you are this stupid.

I will talk about DeJuan Blair no more forever

by jonestr on Aug 5, 2009 7:03 PM PDT up reply actions   0 recs

and in that case

we finished 2 games lower than would be expected by Pythagorean wins, so we were generally unlucky.

"It’s a good ol’ fashioned Rip City beat down!"

by Magnum on Aug 6, 2009 1:20 AM PDT up reply actions   0 recs

It's a good cautionary stat

but it’s nothing to base predictions for an individual team on. I get why you put it up there, but I’m not sure whether it’s appropriate for this team.

I'm a really really ridiculously good looking orange mocha frappaccino drinking manhammer sandwich

by hobobob on Aug 5, 2009 2:02 PM PDT reply actions   0 recs

Nice research, and very well done

I still think the Blazers are gonna win more than 60 games this year. Jscot pretty much took my reasons: all the things that usually cause such a regression to the mean don’t apply to us very strongly.

You can measure skill and talent with your eyes, but productivity is shown through statistics.

by austinpwnz on Aug 5, 2009 2:05 PM PDT reply actions   0 recs

Nice, someone is breaking out the Minitab ;-)

"I'm addicted to polo y'all...respect my fresh" - Travis25Outlaw

by Norsktroll on Aug 5, 2009 2:43 PM PDT reply actions   0 recs

Comment Summary thus far:

“Wow! Those are some good stats. I think they make a lot of sense. I don’t like them though, because they indicate my team won’t win as many games next year. Since I want them to win more games, I will say that my team will defy those odds and win more games next season.”

:)

Yes! Yes! In the face!

by LeafHawk on Aug 5, 2009 3:29 PM PDT reply actions   1 recs

How DARE you question my motives, sir!!

ok …I didn’t comment…… but still

"I'm tired" -Me

by 92wastheyear on Aug 7, 2009 4:54 PM PDT up reply actions   0 recs

nice well laid out post

However, the graph I would have liked to see is the raw data: each team’s wins vs. the change in wins the following year. That is, plot the data, not just the mean of the data. Those two graphs tell very different stories. One is your story, which is true: on average, all teams will tend toward average (because it is a closed system; for every win there is one loss). However, by plotting all the raw data for each individual team you might see that 2/3 of losing teams improve while 1/3 get worse, that teams at 41-41 split half and half, and that 2/3 of winning teams get worse while 1/3 get better. So my point is that there is a lot of wiggle room around the trend, and that when you’re only looking at the mean of means you will always find the mean.

by NWfan on Aug 5, 2009 3:50 PM PDT reply actions   1 recs

Yeah...

There were three 54 win teams last year. The Spurs hit 54 on a decline from 56. The Nuggets peaked at 54 after rising from 50 and will decline again. That’s 2 teams out of 3 on the decline. The third must be rising past 54 or the universe will explode.

The cowards never started
The weak died along the way
Only the strong survived
They were the Trailblazers

by lukeyhere on Aug 5, 2009 3:56 PM PDT up reply actions   0 recs

Very cool

I haven’t read other comments, but in favor of improvement I list:

1.) Oden improvement=overall team improvement.
2.) Miller
3.) Hard workers and unwillingness to relax.
4.) End of season domination
5.) A summer of sitting out play-offs they shouldn’t have been sitting
6.) Roy. I think he’s probably one of the most underrated superstars to ever be in the league, and yes… I’m giving him superstar status. People consistently place him top 6 and don’t seem to really think about what that means. He’s been rookie of the year and voted 2x all star by coaches who passed up other amazing talent. He dominates and WILLS games to wins. He’s had a summer to simmer. He wants to win. He could show off like other stars, but chooses to play smartly and win. I’ve also undervalued him for three years in a row despite believing he’d be good.

Cool stats though. Here’s hoping i’m right :)

"Fernandez, to my eyes, is the Blazer who walks that walk most comfortably. A lot of Portland's fans (egged on, dare I say, by their local broadcasters) lament things like how Ron Artest or Yao Ming get to hit Brandon Roy's arms.

But I suspect Fernandez sees all that and thinks: We get to hit arms! Cool!"

http://myespn.go.com/blogs/truehoop/0-39-135/On-Playoff-Experience.html

by ratbastird on Aug 5, 2009 4:05 PM PDT reply actions   0 recs

Tenuous correlation to individual teams and seasons but interesting from a macro view

Although the point is hard to argue with as a reminder that momentum is often overstated season to season, and that historically the opposite effect is realized.

I would be interested to see similar analysis for other pro (and major college) sports. This tendency in the NBA is consistent with the relative parity found here (again, relative). I wonder what more dynasty-oriented sports such as college BB & FB, MLB, and Premier League would show.

You could argue that the ‘gravity’ exerted pulling teams towards .500 is strongest in the NBA, with measures such as the cap, luxury tax, and revenue sharing.

by stikit on Aug 5, 2009 4:22 PM PDT reply actions   0 recs

MLB vs NBA

Given that one of the graphs that showed the very even regression to the mean was from 1980 and beyond, I think it’s hard to call MLB more dynastic than the NBA. Of the 30 finals contested since 1980, 28 have been won by 6 teams.

The NBA has the least parity in recent history of any major American pro sport. I can’t speak to the Premier League, but I would agree that college football and basketball are very unlikely to show regression to the mean, at least among the elite programs.

by colinmarsh on Aug 6, 2009 4:18 PM PDT up reply actions   0 recs

Regression towards the mean is interesting

Blazerholic hit a lot of good points. I’ll just say a couple of other things as to why this isn’t as appropriate as the historical examples of regression to the mean (or even as appropriate as the “Sophomore Slump” example in Wikipedia about single-player performances after a rookie season).

True regression towards the mean occurs due to measurement error being symmetrically distributed around the center of the population. It requires a high amount of measurement error. What I mean by measurement error is the difference between an observable score or measurement and the concept of a true score. Now, how does this make sense for something like height, which we can clearly physically measure with a reasonable amount of precision? In the Galton story, our construct isn’t actual height; it is the influence of parents’ height on their children’s height. Due to the multitude of other factors causing noise in trying to predict heights between two generations, Galton found that the grand population mean of height is a better predictor of offspring height than the degree to which the parents’ heights diverge from the population mean. A single indicator of height (the parents’ measures) is more likely to reflect anomalies that are statistically significantly different from the average of the distribution, but it is just one of many factors that will actually produce the height of the offspring (recessive genetic factors, mutations, and environment being the other major predictors of height in addition to dominant genetic inheritance).

There are some analogies to sports. The team of the following year is going to have some relatedness to the team of the previous year, mostly. Certainly the Blazers will, as they have a solid core and very few major changes. But there are a million issues with assuming regression towards the mean is our biggest obstacle to improving next season and that we can use methods like Galton developed to predict that effect. Among them: teams’ win numbers are dependent upon each other. For us to win, other teams have to lose. No one has to be taller because I was born short. Laws of central tendency will tend to show this happening, but the heights of two unrelated people are essentially independent of each other, while every team influences every other team’s win totals in as direct a way as possible. Also, while there is certainly an amount of luck to winning some NBA games, there are a lot of games where luck didn’t really play into it. Again, for regression towards the mean to be a factor next season, we would have to assume a large amount of error in the number of wins we got, that this error over all teams and through all time would be symmetric around zero (teams winning lucky will balance with teams losing unlucky, and over time each team will show a balance of lucky wins and unlucky losses), and that it is not related to true ability (that winners and losers are more equal in close games and the final result is not based on who the better team was). I know there are some games that are decided by luck, but I don’t think it is very many.

Finally, the biggest obstacle to the Blazers increasing their win totals is much more related (imo) to how their improvements (through changing the team and through development) compare to the improvements of the teams we play most of our games against. Esp. a team like the Thunder.

"...the primary focus of all obstacles is to induce labor, so progression can be born." - LiL C

by idoltime on Aug 5, 2009 4:56 PM PDT reply actions   0 recs

Regression to Mean is not an obstacle

> But there are million issues with assuming regression towards the mean is our biggest obstacle to improving next season

I’m pretty sure that’s not what he’s saying; after all, regression to the mean is just an observation of effects, not a cause.

What I do see the post saying is that the myriad other influences on wins (injuries, other teams’ injuries, schedule, grumpiness, sweat, refs, fans, weather, etc.) combine to have a greater than expected influence.

We want player selection and development to dominate. As fans we’ll spend more time talking about that anyway. After all, who wants weather to influence an indoor sport? But it does.

Cheers, Alistair

by holder on Aug 5, 2009 5:27 PM PDT up reply actions   0 recs

Yeah, that was pretty much my message

I can see why introducing the term regression to the mean had a distracting effect… but I was basically pointing up all of the uncertainties that can influence the course of a season.

by PoliSam on Aug 6, 2009 9:28 AM PDT up reply actions   0 recs

And that those uncertainties tend

to have a negative influence on teams that were good last year and a positive influence on teams that were bad last year.

It’s a good message. But I think you are right that “regression to the mean” was distracting.

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 7, 2009 1:58 AM PDT up reply actions   0 recs

Yeah, If I had it to do over

I would have simply presented the pattern in the data and then said that the forces that produce statistical regression to the mean are likely to be among the factors producing that pattern.

by PoliSam on Aug 9, 2009 8:04 AM PDT up reply actions   0 recs

yes,

that the results above conflate a “pure” statistical regression to the mean with less pristine factors, such as the mutual dependence of team wins on each other. So, surely the graphs above are not a measurement of the strength of pure statistical regression to the mean. That’s one reason that I put in bold that age and regression to the mean were not the only factors that I would consider.

On the other hand, I disagree with most of the second half of your second paragraph. I think there is a large amount of fundamental error in the number of wins a team gets, at least enough to consider when making projections from one season to the next. Suppose each team’s true win total in year 1 is x +/- 4 wins, which I don’t think is absurd at all; then you’d definitely want to consider it in year two.

(I’d also note that regression to the mean does not have to be the result of measurement error; it can be produced by sampling variability and other types of random error).

by PoliSam on Aug 5, 2009 7:03 PM PDT up reply actions   0 recs

I think my definition of measurement error (which perhaps would be better called observation error) encompasses both of those other types of error.

And I’m willing to agree that +/- 4 is a completely reasonable estimate of the “random” wins, esp. in terms of predictions. I still don’t think that you will find the average number of wins for the entire NBA to be a significant predictor of a team’s following-year success.

"...the primary focus of all obstacles is to induce labor, so progression can be born." - LiL C

by idoltime on Aug 6, 2009 4:11 AM PDT up reply actions   0 recs

can check that with simulations

> And I’m willing to agree that +/- 4 is a completely reasonable estimate of the "random" wins.. esp. in terms of predictions. I still don’t think that you will find that the average number of wins for the entire NBA as being a significant predictor of a teams following year success.

I am not sure what you mean by the second sentence. The claim was that wins in year 1 are a significant predictor of the change in wins from year 1 to year 2. To test if +/- 4 is enough noise to make this true, I generated simulated data with teams having a “true” win total (or propensity) from 25 to 50 games and then added 4 games of noise, literally a normally distributed random variable with mean 0 and standard deviation of 4. I then created a simulated second season (reseeding the random number generator), subtracted the first season from the second and regressed the difference in wins on win total in year 1. Win total in year 1 was a significant predictor of the change in win total. (the coefficient was -.133).

On the other hand, if by +/- 4 wins, you had in mind a standard deviation of 2 games, so that +/-4 is the “margin of error” as reported by polling organizations, then you’d get a much weaker effect. Something you’d detect with enough data, but only worth a game for the elite teams. For what it’s worth, one would need about 6 games of noise (r.v. with mean=0, sd=6) to produce the results in the nba data.
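
For anyone who wants to try this at home, here is a minimal sketch of that kind of simulation in plain Python. It is not the original code: the uniform spread of "true" win totals between 25 and 50, the number of simulated teams, and the function names are all assumptions, so the slope it produces need not match the -.133 reported above. Under this setup the expected slope is roughly -noise_var / (true_var + noise_var), so a narrower spread of true talent or more noise makes it steeper.

```python
import random

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def simulate_slope(noise_sd=4.0, n_teams=5000, seed=0):
    """Regress the change in wins on year-1 wins when true quality never changes."""
    rng = random.Random(seed)
    # Assumption: true win propensities spread uniformly from 25 to 50.
    true_wins = [rng.uniform(25, 50) for _ in range(n_teams)]
    # Each season's observed wins = true level + independent normal noise.
    year1 = [t + rng.gauss(0, noise_sd) for t in true_wins]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_wins]
    change = [b - a for a, b in zip(year1, year2)]
    return ols_slope(year1, change)

for sd in (2.0, 4.0, 6.0):
    print(f"noise sd = {sd}: slope = {simulate_slope(noise_sd=sd):.3f}")
```

The slope is negative whenever there is any noise at all, and it gets steeper as the noise grows, which is the pattern described above.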

by PoliSam on Aug 6, 2009 9:07 AM PDT up reply actions   0 recs

only problem with your approach

is that two randomly produced win totals would have a lower correlation than a team’s win totals would, which would produce a greater regression to the mean effect. You should use a regression equation with a noise element to predict the second season win totals from the simulation on the first season.

Plus, you have created a restriction of range problem where teams cannot change beyond the limits you set on it.

but Bravo for breaking out the simulation solution

And all I meant by “I still don’t think that you will find that the average number of wins for the entire NBA as being a significant predictor of a teams following year success.” is that the stronger the regression to the mean artifact, the more likely the average number of wins for every NBA team is as good or a better predictor of one team’s year 2 performance than that same team’s previous year win totals.

"...the primary focus of all obstacles is to induce labor, so progression can be born." - LiL C

by idoltime on Aug 6, 2009 5:42 PM PDT up reply actions   0 recs

> is that two randomly produced win totals would have a lower correlation than a teams win total would…which would produce a great regression to the mean effect. You should use a regression equation with a noise element to predict the second season win totals from the simulation on the first season.

> Plus, you have created a restriction of range problem where teams cannot change beyond the limits you set on it.

The simulation is an example of pure statistical regression to the mean. It simply illustrates how much regression to the mean there would be if teams did not improve at all, but each season’s true quality was measured with error by the wins. It was a way to show that +/- 4 games of noise is enough noise to matter, not a claim about what happens in the NBA.

If I introduced the possibility that teams can also improve at random, that would actually increase the amount of observed regression to the mean.

by PoliSam on Aug 9, 2009 11:16 AM PDT up reply actions   0 recs

Regression to the mean

is all fine and dandy when talking about large samples, but we’re talking about a sample of ONE here. Despite all the nice graphs and examples, I don’t think you can really apply it with any accuracy on the “micro” level. Statistics is all about the “macro”.

Far more important than generalized tendencies over years and many different teams is the maturation of the Blazers’ young players and their ability to become a cohesive team (i.e. chemistry).

Interesting post, but I just don’t think it can be applied.

Duct tape makes you smart.

by TTRocks on Aug 5, 2009 5:21 PM PDT reply actions   0 recs

Dynamics Indicate the Opposite

I love stats and appreciate your analysis. And I agree that it is possible that the Blazers will win fewer games, not more. But I think it likely that they will win more, for the reasons so many other commenters have suggested.

I think that there is something in year to year dynamics that is not captured in your graphs. It would be improved if the “shape” of 5 year win streams were analyzed.

I’m going to play a bit of devil’s advocate here. If regression to the mean was fully in effect, all teams would get stuck at 41 wins. Looking at your middle graph, start with the Blazers 21 win season. The graph tells me that the next year they would win 8 more, for 29 wins. And the following year they would win 5 more, for 34 wins. Then they’d add 3 more, for 37 wins. They’d slowly creep up from there to 41 wins. And they’d get stuck. Meanwhile, all the good teams would head towards the center. All teams would end up with 41 wins.

So the fact that there is dispersion between 20 win teams and 60 win teams demonstrates that regression to the mean is not in effect! Something else is going on.

I think that teams sustain winning, at least over a 5 or so year period. I’ve done no statistical analysis, but here are a few facts and figures that support this concept. Some teams are consistently good. The Lakers have won 63% of their games in the 49 years since they moved to LA. The Spurs have won 60% since they came into the NBA. Boston has been in the league for 63 years, and has won 59%. At the other end of the spectrum, the Grizz have won 33%, the Bobcats 35%, the Clips 36%, and the Raptors 41% since coming into the league.

I think regression to the mean works for time series that are random, with one time period outcome unaffected by the next. Think rolls of the dice. Getting snake eyes on one throw has no effect on the next throw. But the NBA is different. There is “momentum” carried from year to year. It’s like if you threw snake eyes, that side of the dice become lighter and more likely to end up on top.

So I agree with your basic point, but believe that there is a dynamic analysis that analyzes time series with “momentum” that would be more appropriate.

by Blaz06Draft on Aug 6, 2009 5:22 PM PDT reply actions   1 recs

> I’m going to play a bit of devil’s advocate here. If regression to the mean was fully in effect, all teams would get stuck at 41 wins. Looking at your middle graph, start with the Blazers 21 win season. The graph tells me that the next year they would win 8 more, for 29 wins. And the following year they would win 5 more, for 34 wins. Then they’d add 3 more, for 37 wins. They’d slowly creep up from there to 41 wins. And they’d get stuck. Meanwhile, all the good teams would head towards the center. All teams would end up with 41 wins.

> So the fact that there is dispersion between 20 win teams and 60 win teams demonstrates that regression to the mean is not in effect! Something else is going on.

Hypothetically speaking, one would observe a regression to the mean if every team had a different “true” win total. Say the Lakers are a 55 win a year franchise, the Grizzlies a 25 win a year franchise and so on. If you took each teams true win total and added wins at random or noise wins, drawn from a normal distribution with a mean of 0 and a standard deviation of 4, you’d observe regression to the mean.
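
A small sketch of that hypothetical, with made-up parameters (true win totals spread uniformly from 25 to 55, 4 games of normal noise, a hypothetical function name, and no attempt to make league wins add up): each team's observed wins bounce around its own fixed true level every season, so extreme seasons tend to be followed by less extreme ones, yet the league-wide spread of wins does not shrink from year to year.

```python
import random
import statistics

def season_spreads(n_seasons=5, n_teams=3000, noise_sd=4.0, seed=1):
    """Standard deviation of observed wins in each simulated season."""
    rng = random.Random(seed)
    # Assumption: each franchise has a fixed true win total between 25 and 55.
    true_wins = [rng.uniform(25, 55) for _ in range(n_teams)]
    spreads = []
    for _ in range(n_seasons):
        # Fresh noise every season; the true level never moves.
        observed = [t + rng.gauss(0, noise_sd) for t in true_wins]
        spreads.append(statistics.pstdev(observed))
    return spreads

for year, spread in enumerate(season_spreads(), start=1):
    print(f"season {year}: spread of wins = {spread:.2f}")
```

The spread stays put because the noise that pulls last year's outliers back toward their true level also pushes a fresh set of teams out to the extremes, so nobody gets "stuck" at 41.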

Regarding larger trends over a larger number of years: there’s actually not that much evidence for them in the data. The number of wins two years prior is only barely a significant predictor of wins. Wins from three seasons ago are completely unrelated to wins in the current season, controlling for wins one year and two years ago.

by PoliSam on Aug 9, 2009 8:13 AM PDT reply actions   0 recs

54 wins, for most teams

would probably be their peak, hence the regression.

call me a homer, but I’d say we’re just getting started.

Yellow Mamba FTW!

by northwestj on Aug 9, 2009 11:04 AM PDT reply actions   0 recs

LOL

Yes, for most teams, the mean is 41. For us, it is 70. Expect further regression to the mean this year.

When I rule the world, everyone will know how to use Excel.

by jscot on Aug 9, 2009 1:12 PM PDT up reply actions   0 recs

Nerd Alert!

Get ready for your swirly, Sam. Hope you brought a comb.

(jkjkjkjkjkjkjkjkjk, don’t kickban me!!)

Life is hilarious.

by SolGoode on Aug 9, 2009 3:26 PM PDT reply actions   0 recs

Comments For This Post Are Closed

