
Here’s a fun exercise. Remember back in high school calculus when you would find the peak of a curve by setting its derivative to zero? That’s a basic example of mathematical optimization. Regression is another, more complicated form of optimization. In regression we want to fit a line or curve so that the difference between the predicted values and the observed values is minimized. Simply put, optimization is finding either the maximum or minimum of a function, subject to a list of constraints.
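As a reminder of how that high school technique works, here is a small worked example. The function is chosen purely for illustration and has nothing to do with football.

```latex
f(x) = -x^2 + 6x
\qquad\Rightarrow\qquad
f'(x) = -2x + 6 = 0
\qquad\Rightarrow\qquad
x = 3, \quad f(3) = 9
% f''(x) = -2 < 0, so x = 3 is a maximum rather than a minimum.
```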

Optimization is the hidden science behind much of the world around us. It’s how airlines know how to schedule routes, how politicians gerrymander districts, and how companies plan advertising campaigns. It’s also how sports leagues manage to build schedules to accommodate a dizzying myriad of requirements. Optimization is why we had to suffer through the Titans-Jaguars game Thursday night.

Let’s say we want to find the best-fitting set of team ratings that would explain the game outcomes so far in the season. What we can do is start out with an assumed set of team ratings. Anything is fine, but let’s start with a rating of zero points for every team. Then we can list each game and its net score result (home score – visitor score – 3). We subtract 3 because that’s the value, in points, of home field advantage.

Next to each game we’ll look up each opponent’s rating and compute a projected net score in the same way (home rating – visitor rating – 3). For each game there is an error: the difference between the projected net result and the actual result. We can add up all the games’ errors and get a total error.
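The bookkeeping described so far can be sketched in a few lines of Python. The teams and scores below are made up for illustration. The sketch also makes one assumption that is only one reading of the text: the 3-point home field adjustment is applied once, to the observed margin, so that it does not cancel out of the error.

```python
HOME_FIELD = 3.0  # points of home field advantage, as stated in the text

# Hypothetical games: (home team, visitor, home score, visitor score)
games = [
    ("AAA", "BBB", 27, 20),
    ("BBB", "CCC", 17, 24),
    ("CCC", "AAA", 21, 21),
]

# Starting assumption from the text: every team is rated zero.
ratings = {"AAA": 0.0, "BBB": 0.0, "CCC": 0.0}


def game_errors(games, ratings, home_field=HOME_FIELD):
    """Return actual minus projected net score for each game."""
    errors = []
    for home, visitor, home_pts, visitor_pts in games:
        # Observed margin with home field removed.
        actual = home_pts - visitor_pts - home_field
        # Assumption: the projection is the plain rating difference.
        projected = ratings[home] - ratings[visitor]
        errors.append(actual - projected)
    return errors


errors = game_errors(games, ratings)
total_error = sum(errors)  # signed errors simply added together
print(errors, total_error)
```

A solver would then adjust the values in `ratings` to shrink some summary of `errors`.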

Almost everyone has a relatively powerful optimization tool on their computer but probably doesn’t realize it. Excel’s Solver tool does a pretty good job for small optimization problems. We can tell Solver to select the combination of team ratings that minimizes the total error. This would tell us the team ratings that best explain the game outcomes we’ve seen to date.

There’s one problem. If we simply add up all the errors, we’re bound to get something close to zero no matter what the team ratings are. That’s because positive errors (the home team overperforms expectation) and negative errors (the home team underperforms expectation) will cancel out. One way around this problem is to square each error before adding them up. This is known as L2 Norm regression, or least squares regression. One big drawback with this approach is that it is very sensitive to outliers. In our example, it’s very sensitive to blowouts.
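To make the least squares idea concrete, here is a minimal sketch that uses plain gradient descent as a stand-in for Excel's Solver. The three-team schedule is hypothetical, the margins are assumed to already have home field removed, and the step size and iteration count are arbitrary choices that happen to work for this tiny example.

```python
def fit_least_squares(games, n_teams, steps=5000, lr=0.01):
    """Minimize the sum of squared errors by gradient descent.

    games: list of (home_index, visitor_index, adjusted_margin).
    """
    r = [0.0] * n_teams
    for _ in range(steps):
        grad = [0.0] * n_teams
        for h, v, margin in games:
            err = margin - (r[h] - r[v])
            grad[h] -= 2.0 * err  # d(err^2)/d(r[h])
            grad[v] += 2.0 * err  # d(err^2)/d(r[v])
        r = [ri - lr * g for ri, g in zip(r, grad)]
        # Ratings are only defined up to a constant, so keep the mean at zero.
        mean = sum(r) / n_teams
        r = [ri - mean for ri in r]
    return r


# Hypothetical margins generated from "true" ratings of +5, 0 and -5.
games = [(0, 1, 5.0), (1, 2, 5.0), (2, 0, -10.0), (0, 2, 10.0)]
fitted = fit_least_squares(games, n_teams=3)
print([round(x, 3) for x in fitted])
```

Because these made-up margins are perfectly consistent, the fit recovers the ratings that generated them. Real schedules are noisy, so some squared error always remains.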

Another option is to minimize the sum of the absolute values of the errors. This is known as L1 Norm regression, or least absolute deviations. This approach is not hyper-sensitive to blowouts but comes with its own difficulties.
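The difference in blowout sensitivity is easiest to see in the simplest possible case, where a single number has to summarize a list of margins. There the least squares answer is the mean and the least absolute error answer is the median. The margins below are invented for illustration.

```python
from statistics import mean, median

# Hypothetical adjusted margins for one team; the last game is a blowout.
with_blowout = [3, 4, 6, 7, 45]
# The same games, with the blowout replaced by an ordinary win.
without_blowout = [3, 4, 6, 7, 8]

# The mean minimizes the sum of squared errors (L2).
l2_with = mean(with_blowout)
l2_without = mean(without_blowout)

# The median minimizes the sum of absolute errors (L1).
l1_with = median(with_blowout)
l1_without = median(without_blowout)

print(l2_with, l2_without)  # the L2 answer moves a lot
print(l1_with, l1_without)  # the L1 answer does not move at all
```

One blowout drags the L2 summary from 5.6 to 13, while the L1 summary stays at 6.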

Solvers have a tough time with the absolute value function. If you plot it on the x-y plane, it’s a “V” shape, with its apex at the origin (the 0,0 point). The function itself is continuous, but its slope jumps abruptly from -1 to +1 at the apex, so absolute value is not differentiable there. Solvers like differentiable functions for the same reason they were so handy in high school for finding the minimum and maximum of a curve.
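The trouble at the apex of the "V" can be checked numerically. This small sketch compares the slope just to the left and just to the right of zero, for the absolute value function and for a smooth curve. The helper name and step size are illustrative choices.

```python
def one_sided_slopes(f, x, h=1e-6):
    """Approximate the slope of f just left and just right of x."""
    right = (f(x + h) - f(x)) / h
    left = (f(x) - f(x - h)) / h
    return left, right


# Absolute value: the two slopes disagree at the apex.
left, right = one_sided_slopes(abs, 0.0)

# A smooth curve (x squared): both slopes are essentially zero at its minimum.
smooth_left, smooth_right = one_sided_slopes(lambda x: x * x, 0.0)

print(left, right)
print(smooth_left, smooth_right)
```

With no single slope at the minimum, the "set the derivative to zero" recipe has nothing to work with.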

Fortunately, there’s a way to trick solvers into seeing problems like this as a completely linear problem. We can express each error as a positive component and a negative component, both constrained to be greater than or equal to zero. The error itself is the positive component minus the negative component, and the quantity to minimize is the sum of the two components, so the problem becomes purely linear. Purely linear problems can be solved with a technique called Linear Programming, using an algorithm known as the Simplex method. (This technique requires more variables and constraints than Excel’s Solver can handle, so I used another, more heavy-duty solver.)
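Here is a sketch of how the positive/negative split turns into a linear program. It only builds the objective and constraints in plain Python. The variable layout, the function name and the extra row pinning the average rating at zero are my own assumptions, not details from the original spreadsheet or solver.

```python
def build_l1_lp(games, n_teams):
    """Build min c.x subject to A_eq x = b_eq for the L1 ratings problem.

    games: list of (home_index, visitor_index, adjusted_margin).
    Variable order: team ratings, then positive parts, then negative parts.
    """
    n_games = len(games)
    n_vars = n_teams + 2 * n_games
    # Objective: ratings cost nothing, every error component costs 1.
    c = [0.0] * n_teams + [1.0] * (2 * n_games)
    A_eq, b_eq = [], []
    for g, (h, v, margin) in enumerate(games):
        # rating[h] - rating[v] + pos[g] - neg[g] = margin
        row = [0.0] * n_vars
        row[h] = 1.0
        row[v] = -1.0
        row[n_teams + g] = 1.0
        row[n_teams + n_games + g] = -1.0
        A_eq.append(row)
        b_eq.append(float(margin))
    # Assumption: pin the sum of the ratings at zero so the answer is unique
    # up to ties (ratings are otherwise free to shift by a constant).
    A_eq.append([1.0] * n_teams + [0.0] * (2 * n_games))
    b_eq.append(0.0)
    # Ratings are unbounded; both error components must be non-negative.
    bounds = [(None, None)] * n_teams + [(0.0, None)] * (2 * n_games)
    return c, A_eq, b_eq, bounds


# Hypothetical three-game schedule with adjusted margins 4, -10 and -3.
games = [(0, 1, 4.0), (1, 2, -10.0), (2, 0, -3.0)]
c, A_eq, b_eq, bounds = build_l1_lp(games, n_teams=3)

# A feasible (not optimal) point: all ratings zero, errors split by sign.
x = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 10.0, 3.0]
residuals = [sum(a * xi for a, xi in zip(row, x)) - b
             for row, b in zip(A_eq, b_eq)]
objective = sum(ci * xi for ci, xi in zip(c, x))
print(residuals, objective)
```

At this starting point the objective equals the sum of the absolute errors (4 + 10 + 3 = 17), which is the quantity an LP solver would then drive down by moving the ratings. The four returned values are laid out to match the `c`, `A_eq`, `b_eq` and `bounds` arguments of `scipy.optimize.linprog`, if SciPy is available. That is one possible solver, not necessarily the one used for the tables in this post.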

Here are team ratings for the 2014 season through week 15, based on a variety of approaches. The L2 Norm column is the least squares approach. The column labeled L1 Non-Lin. is the L1 approach using Excel’s non-linear solver. The column labeled L1 Evo is the L1 approach using Excel’s evolutionary algorithm, which uses a process I explained in my summer project from last year. The column labeled L1 Norm is the pure, direct solution using Simplex. The Average column doesn’t average all the columns, just the pure L1 and L2 approaches.

Rank Team L2 Norm L1 Non-Lin. L1 Evo L1 Norm Average
1 NE 13.2 19.0 18.0 18.7 15.9
2 DEN 8.8 12.0 10.2 13.7 11.2
3 SEA 6.6 8.7 8.1 8.7 7.7
4 KC 6.6 7.0 6.0 6.7 6.6
5 GB 6.1 5.0 4.2 6.7 6.4
6 IND 6.1 7.0 7.9 6.7 6.4
7 BLT 7.0 3.6 4.4 3.7 5.3
8 PHI 3.4 6.1 5.7 5.7 4.6
9 MIA 4.2 4.0 3.7 3.7 3.9
10 BUF 3.2 4.0 3.1 3.7 3.4
11 ARZ 3.0 4.0 3.9 3.7 3.3
12 SD 2.7 4.0 2.9 3.7 3.2
13 DAL 2.1 3.1 3.1 3.2 2.7
14 DET 2.1 1.9 1.8 1.7 1.9
15 HST 0.9 3.0 3.1 2.7 1.8
16 PIT 2.6 0.6 1.3 0.7 1.6
17 SL 1.2 0.1 -0.3 0.2 0.7
18 CIN 0.5 -0.9 -1.2 -1.3 -0.4
19 SF -1.3 -0.4 -0.2 -0.3 -0.8
20 MIN -0.6 -3.2 -2.9 -3.3 -1.9
21 NO -0.8 -3.4 -2.7 -3.3 -2.0
22 CLV -4.4 -4.4 -3.5 -4.3 -4.4
23 NYG -3.3 -6.4 -6.6 -6.3 -4.8
24 ATL -3.9 -6.4 -5.7 -6.3 -5.1
25 NYJ -5.8 -5.1 -5.3 -5.3 -5.5
26 WAS -7.0 -6.0 -5.9 -6.3 -6.6
27 CAR -5.6 -8.4 -7.4 -8.3 -7.0
28 CHI -5.6 -8.6 -6.7 -9.3 -7.5
29 TB -7.5 -9.3 -8.4 -9.3 -8.4
30 OAK -10.4 -8.2 -7.3 -7.3 -8.8
31 JAX -10.0 -10.9 -10.9 -11.3 -10.7
32 TEN -10.6 -11.1 -10.5 -11.3 -11.0

Notice that each approach yields slightly different ratings. Teams whose L2 ratings differ sharply from their L1 ratings likely had some big blowouts on their records. The overall order of the teams is relatively stable, though.

Also notice how the true L1 results tend to group teams into tiers. There is a group of teams rated at 6.7, at 3.7, and at -6.3 points. This is a natural tendency of the approach, and it might be a useful way to think of the teams as well.

The only real surprise to me in terms of actual results is that KC is ranked so high. I suspect they are strongly buoyed by their big win over NE early in the season.

If you want true estimates of team strength to project future game outcomes, you’d want to regress these ratings toward the mean to some degree. There is an awful lot of sampling error and other random noise included in every actual game outcome.
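Regressing toward the mean can be as simple as shrinking each rating by a fixed fraction. The sketch below is only illustrative: the post does not say how much to shrink, so the weight of 0.5 is a placeholder, and the function name is hypothetical.

```python
def regress_to_mean(rating, weight, league_mean=0.0):
    """Shrink a rating toward the league mean.

    weight = 1.0 keeps the rating as is; weight = 0.0 replaces it
    with the league mean.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return league_mean + weight * (rating - league_mean)


# Example: NE's L1 Norm rating of 18.7 with a placeholder weight of 0.5.
shrunk = regress_to_mean(18.7, weight=0.5)
print(round(shrunk, 2))
```

In practice the weight would be estimated from data, for example from how well ratings at this point in past seasons predicted later games.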

We can use the same technique on just about any team statistic. Here’s an example of L2 and L1 optimization for team Expected Points Added (EPA). EPA would give us similar results to actual points scored, except it treats special teams scores probabilistically.

Rank Team L2 Norm L1 Norm Average
1 NE 9.2 18.7 13.9
2 DEN 11.1 13.7 12.4
3 SEA 8.3 8.7 8.5
4 KC 5.8 6.7 6.3
5 GB 5.5 6.7 6.1
6 IND 3.1 6.7 4.9
7 BLT 5.7 3.7 4.7
8 ARZ 3.9 3.7 3.8
9 SD 2.8 3.7 3.3
10 MIA 2.8 3.7 3.2
11 DAL 2.9 3.2 3.0
12 DET 4.0 1.7 2.8
13 BUF 1.7 3.7 2.7
14 PHI -0.3 5.7 2.7
15 PIT 2.3 0.7 1.5
16 HST -0.5 2.7 1.1
17 SL 1.0 0.2 0.6
18 SF 1.1 -0.3 0.4
19 NO 0.8 -3.3 -1.3
20 MIN -0.4 -3.3 -1.8
21 CIN -2.4 -1.3 -1.9
22 CLV -4.5 -4.3 -4.4
23 NYG -3.6 -6.3 -5.0
24 NYJ -5.8 -5.3 -5.6
25 WAS -5.4 -6.3 -5.8
26 ATL -5.8 -6.3 -6.1
27 CAR -4.4 -8.3 -6.4
28 CHI -3.7 -9.3 -6.5
29 OAK -7.7 -7.3 -7.5
30 TB -7.8 -9.3 -8.6
31 JAX -9.7 -11.3 -10.5
32 TEN -9.9 -11.3 -10.6

We get mostly the same order with EPA as we did with net scores.

The overall lesson is that there is no one correct method. Different but equally valid approaches give different answers. There are a number of ranking systems like these, such as SRS and Jeff Sagarin's long-standing Pure Points ratings, and they all will give us a very good picture of which teams are best and worst.


Comments (7)

  • Guest - W. Vohs

    In a similar vein, could we fit a regression (logistic?) model to games to see which statistic has the largest effect on wins (turnovers, total yards, PPG over a season, combinations of these, etc.) using a stepwise model? Possibly rather than wins, plot these versus EPA? Theoretically, could it help a team devise a "formula" on what aspects of their game to emphasize?

  • Guest - Mike

    Absolute value functions aren't discontinuous.

  • Guest - Mitch

    You guys go far too deep into math. One does not need that to be successful picking games. I suppose if you can be successful with it there's some value.

    Using the right statistics and information is far more important. Any method which has the Patriots no. 1 is using the wrong stuff, period. All the mathematical formulas will not help you be successful.

    I do agree there are different approaches, and no single method can include all the variables needed. But some methods are very easy to pick apart and find the flaws; SRS has big flaws that are very easy to pick apart with a little knowledge.

    The model has flaws that can be picked apart, especially in the playoffs, as I did last season when I called Seattle far better than either the Saints or Broncos and the model had them only slightly better.

    DVOA has flaws, but each method contributes something as well. Having a very good understanding of what statistics and information are important goes a very long way.

  • Guest - James

    Some possible extensions to this would be to play with penalty functions that prioritize win/loss consistency over score difference accuracy.
