2016-2017 GRaNT Computer Rankings



TonyTheTiger20
03-07-2017, 08:33 AM
Remember a couple years ago when I tried my hand at building a computer ranking system?

http://board.uscho.com/archive/index.php/t-111945.html

It was fun to build, but it was pretty rudimentary and had a lot of flaws (for example, the system treated all losses the same regardless of how good the opponent was).

I started over from scratch and gave it another go, and I am thrilled with what I have. Everybody welcome back the brand new GRaNT (Grant's Reasonable and Not Terrible) Computer Rankings!

Think of it as essentially KRACH for margin of victory. But instead of margin of victory, what's used is the percentage of goals each team scores in each game (i.e. if you win 3-1, you scored 75% of the goals in the game). The ratings are set such that, for each team, the actual G% it put up adds up to the G% the ratings would predict against the opponents it played. It is calculated the same way KRACH is: you start with each team's rating set to 100, run through the calculations, get a new rating, and then run that new rating through the calculations, over and over until the ratings for each team stop changing between successive runs.
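
One plausible reading of that iteration can be sketched in a few lines of Python. This is a minimal sketch, not the actual GRaNT spreadsheet: it assumes each game contributes its goals as the "trials" (so a team's total goals scored must match the total the ratings predict), it rescales the ratings to average 100 after every pass, and the names `fit_ratings`, `expected_goal_share` and the sample scores are made up for illustration.

```python
from collections import defaultdict


def fit_ratings(games, max_passes=10000, tol=1e-9):
    """Fit KRACH-style ratings from goals instead of wins (hypothetical sketch).

    games: list of (team_a, team_b, goals_a, goals_b) tuples.
    Assumes every team has scored and allowed at least one goal and the
    schedule is connected; otherwise the ratings may not settle.
    """
    teams = sorted({team for game in games for team in game[:2]})
    ratings = {team: 100.0 for team in teams}  # everyone starts at 100

    goals_for = defaultdict(float)
    for a, b, goals_a, goals_b in games:
        goals_for[a] += goals_a
        goals_for[b] += goals_b

    for _ in range(max_passes):
        # For each team, add up (goals in the game) / (sum of the two ratings).
        weight = defaultdict(float)
        for a, b, goals_a, goals_b in games:
            share = (goals_a + goals_b) / (ratings[a] + ratings[b])
            weight[a] += share
            weight[b] += share

        new = {team: goals_for[team] / weight[team] for team in teams}
        scale = 100.0 * len(teams) / sum(new.values())  # keep the average at 100
        new = {team: rating * scale for team, rating in new.items()}

        biggest_change = max(abs(new[team] - ratings[team]) for team in teams)
        ratings = new
        if biggest_change < tol:  # ratings stopped moving between passes
            break
    return ratings


def expected_goal_share(rating_a, rating_b):
    """Share of the goals team A is expected to score against team B."""
    return rating_a / (rating_a + rating_b)


# Made-up results, purely for illustration.
sample_games = [
    ("A", "B", 3, 1),
    ("B", "C", 2, 2),
    ("C", "A", 1, 4),
    ("A", "B", 1, 0),
    ("B", "C", 3, 2),
]
sample_ratings = fit_ratings(sample_games)
```

Once the ratings settle, `expected_goal_share(sample_ratings["A"], sample_ratings["B"])` gives the share of goals A would be expected to score against B, which is the same R1/(R1+R2) form KRACH uses for wins.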

It avoids messy problems with shutouts (a 1-0 win and a 5-0 win both come out to 100% of the goals scored in that game) by dealing with the season as a whole instead of with each game individually. So, for example, if Team A wins its two games 1-0 and 3-1, its combined G% will be 4 out of 5 goals = 80%, and if Team B wins its two games 5-0 and 3-1, its combined G% will be 8 out of 9 goals = 89%.
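
The difference between pooling a season's goals and averaging game-by-game G% is easy to check with the two teams above. A small Python sketch (the function names are made up for illustration):

```python
def pooled_goal_pct(results):
    """G% over the whole set of games: total goals for / total goals in all games."""
    goals_for = sum(gf for gf, ga in results)
    goals_against = sum(ga for gf, ga in results)
    return goals_for / (goals_for + goals_against)


def per_game_goal_pct(results):
    """Plain average of each game's G% (0-0 games are skipped: G% is undefined)."""
    shares = [gf / (gf + ga) for gf, ga in results if gf + ga > 0]
    return sum(shares) / len(shares)


team_a = [(1, 0), (3, 1)]  # (goals for, goals against) per game
team_b = [(5, 0), (3, 1)]

print(pooled_goal_pct(team_a), pooled_goal_pct(team_b))      # 0.8 and about 0.889
print(per_game_goal_pct(team_a), per_game_goal_pct(team_b))  # 0.875 for both
```

Averaging per game cannot tell the two teams apart, because the 1-0 and 5-0 shutouts both count as 100%. Pooling the goals separates them.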

Some more details and a full ranking of teams for both men's and women's hockey can be found here: http://www.bcinterruption.com/boston-college-hockey/2017/3/7/14839436/2016-2017-ncaa-mens-and-womens-college-hockey-grant-computer-rankings

Here's a screenshot of the women's rankings after the conference tournaments below.

All feedback and criticism are welcome!

<a href="http://www.bcinterruption.com/boston-college-hockey/2017/3/7/14839436/2016-2017-ncaa-mens-and-womens-college-hockey-grant-computer-rankings">
<img src="http://i.imgur.com/mHKnvty.png" alt="Click here to go to the GRaNT Computer Rankings!" height="500">
</a>

robertearle
03-07-2017, 10:00 AM
It avoids messy problems with shutouts (i.e. even though a 1-0 win and a 5-0 win both have 100% of goals scored in each game) by dealing with the season as a whole instead of individually. So, for example, if Team A wins its two games 1-0 and 3-1, its combined G% will be 4 out of 5 goals = 80%, and if Team B wins its two games 5-0 and 3-1, its combined G% will be 8 out of 9 goals = 89%.


Is that then "amplified" (is maybe the word I want here) by the number of games the two teams played against one another? A "for instance" would be Wisconsin (of course) and their four games against Bemidji vs their lone game against Lindenwood. Total goals against Bemidji were 21-to-3 for .875 versus 5-1 for .833 against Lindenwood. Do they count more-or-less the same, or is there a 'reward' for four big wins versus a single win?

TonyTheTiger20
03-07-2017, 10:21 AM
Is that then "amplified" (is maybe the word I want here) by the number of games the two teams played against one another? A "for instance" would be Wisconsin (of course) and their four games against Bemidji vs their lone game against Lindenwood. Total goals against Bemidji were 21-to-3 for .875 versus 5-1 for .833 against Lindenwood. Do they count more-or-less the same, or is there a 'reward' for four big wins versus a single win?
Basically, yes, it's on a per game, not per team, basis.

It's not that games against one opponent are calculated together, it's that all games in total are calculated together. The end result is that each game, shutout or not, duplicate opponent or not, is included one time in the rankings.

For example for Wisconsin, they've scored 149 total goals this season and they've allowed 33. So the percentage of goals they've scored over the course of the season would be 149/(149+33) = 81.9%. If there were no shutouts, that would be mathematically the same as adding up each individual game's G% and dividing by the number of games played. But doing it on the season as a whole allows you to properly take those 1-0 vs. 5-0 shutout wins into account.

robertearle
03-07-2017, 10:35 AM
Basically, yes, it's on a per game, not per team, basis.

It's not that games against one opponent are calculated together, it's that all games in total are calculated together. The end result is that each game, shutout or not, duplicate opponent or not, is included one time in the rankings.

For example for Wisconsin, they've scored 149 total goals this season and they've allowed 33. So the percentage of goals they've scored over the course of the season would be 149/(149+33) = 81.9%. If there were no shutouts, that would be mathematically the same as adding up each individual game's G% and dividing by the number of games played. But doing it on the season as a whole allows you to properly take those 1-0 vs. 5-0 shutout wins into account.

Well, then, is your rating number meant to be similar to KRACH? Meaning "if Team A’s GRaNT rating is three times as large as Team B’s, Team A would be expected to amass a winning percentage of .750 and Team B a winning percentage of .250 if it played each other enough times"? Because if so, again using Wisconsin and Bemidji, KRACH says Wisconsin is going to beat Bemidji 17 times out of 18, while GRaNT says it is only 6 times out of 7. 17 out of 18 sounds more like it to me. That is, it seems like yours is damping things down at the outer edges.

D2D
03-07-2017, 10:48 AM
Grant, would it be fair to say that GRaNT rewards a very good defensive team more than a very good offensive team? For example, hypothetically, you have two teams, Team D and Team O, and over the course of the season each goes undefeated in their 35 games. Team D wins every game by a score of 2-1 while Team O wins every game 6-5, so each has the same one goal margin of victory. That would mean that over the course of the season Team D has scored 66.7% of the goals (70/105), substantially better than Team O's 54.5% (210/385). Of course if I'm a coach I'd rather be in charge of Team D, but then again my record would be no better than the coach of Team O.

Comments?

TonyTheTiger20
03-07-2017, 11:19 AM
Well, then, is your rating number meant to be similar to KRACH? Meaning "if Team A’s GRaNT rating is three times as large as Team B’s, Team A would be expected to amass a winning percentage of .750 and Team B a winning percentage of .250 if it played each other enough times"? Because if so, again using Wisconsin and Bemidji, KRACH says Wisconsin is going to beat Bemidji 17 times out of 18, while GRaNT says it is only 6 times out of 7. 17 out of 18 sounds more like it to me. That is, it seems like yours is damping things down at the outer edges.
It's set up the same way KRACH is in that everyone's rating is determined by everyone else's rating, but while you would use KRACH to determine what percentage of games Team A would win against Team B, you would use GRaNT to determine what percentage of goals Team A would score in a game against Team B.

So taking your UW/BSU example:

KRACH: UW = 1837, BSU = 108.7. So UW would win 1837/(1837+108.7) = 94.4% of games.

GRaNT: UW = 497.82, BSU = 82.25. So UW would score 497.82/(497.82+82.25) = 85.8% of goals in a season against Bemidji State.

Looking at real-life results, UW obviously won 100% of games against BSU vs. 94.4% expected by KRACH.

In four games against Bemidji, Wisconsin won by scores of 5-0, 6-0, 6-1, and 4-2. That comes out to 21 out of 24 goals scored, or 87.5% -- which is less than one Bemidji goal away from what GRaNT says to expect.

So you were on target for how GRaNT should be interpreted, but it's for % of goals scored against a team, not % of wins.
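
The two calculations use the same R1/(R1+R2) form and differ only in what the result means. A short Python sketch using the ratings and scores quoted above (the function and variable names are just for illustration):

```python
def expected_share(rating_a, rating_b):
    """R1 / (R1 + R2): share of wins (KRACH) or of goals (GRaNT) for team A."""
    return rating_a / (rating_a + rating_b)


krach_win_pct = expected_share(1837, 108.7)     # share of games UW wins
grant_goal_pct = expected_share(497.82, 82.25)  # share of goals UW scores

# Actual UW vs. BSU results: 5-0, 6-0, 6-1, 4-2
uw_goals = 5 + 6 + 6 + 4
bsu_goals = 0 + 0 + 1 + 2
total_goals = uw_goals + bsu_goals

expected_bsu_goals = total_goals * (1 - grant_goal_pct)

print(f"KRACH win %:  {krach_win_pct:.1%}")
print(f"GRaNT goal %: {grant_goal_pct:.1%} (actual {uw_goals / total_goals:.1%})")
print(f"Expected BSU goals in {total_goals}: {expected_bsu_goals:.1f} (actual {bsu_goals})")
```

GRaNT expects Bemidji to score about 3.4 of the 24 goals, against 3 in the actual games.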


Grant, would it be fair to say that GRaNT rewards a very good defensive team more than a very good offensive team? For example, hypothetically, you have two teams, Team D and Team O, and over the course of the season each goes undefeated in their 35 games. Team D wins every game by a score of 2-1 while Team O wins every game 6-5, so each has the same one goal margin of victory. That would mean that over the course of the season Team D has scored 66.7% of the goals (70/105), substantially better than Team O's 54.5% (210/385). Of course if I'm a coach I'd rather be in charge of Team D, but then again my record would be no better than the coach of Team O.

Comments?
That's a pretty fair question. I don't think I'd say it "favors" a defensive team in that I don't think it gives them a particular advantage in the methodology. I think it would be more accurate to say that a team that wins all its games 2-1 is just better than a team that wins its games 6-5. If a team routinely scores 2x as many goals as it allows, I would say that's a good argument for them to be better than a team that scores 1.2x as many goals as it allows.

Because think of it this way, a team that wins its games on average 6-5 is going to have much more variance in what games it wins and loses, and in practice, is statistically far more likely to be closer to .500 than undefeated.
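
One way to put rough numbers on that is to treat each team's goals in a game as independent Poisson counts. That is a modeling assumption brought in purely for illustration, not part of GRaNT, and the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the average is `mean` (Poisson assumption)."""
    return exp(-mean) * mean ** k / factorial(k)


def outcome_probs(mean_for, mean_against, max_goals=40):
    """Return (win, tie, loss) probabilities for one game.

    Sums over every scoreline up to max_goals per side; the probability
    mass beyond that is negligible for hockey-sized averages.
    """
    win = tie = loss = 0.0
    for goals_for in range(max_goals + 1):
        p_for = poisson_pmf(goals_for, mean_for)
        for goals_against in range(max_goals + 1):
            p = p_for * poisson_pmf(goals_against, mean_against)
            if goals_for > goals_against:
                win += p
            elif goals_for == goals_against:
                tie += p
            else:
                loss += p
    return win, tie, loss


team_d = outcome_probs(2, 1)  # averages a 2-1 score
team_o = outcome_probs(6, 5)  # averages a 6-5 score

for label, (win, tie, loss) in (("2-1 team", team_d), ("6-5 team", team_o)):
    print(f"{label}: win {win:.3f}, tie {tie:.3f}, loss {loss:.3f}")
```

Under that assumption the team averaging 2-1 wins a noticeably larger share of its games than the team averaging 6-5, even though both have a one-goal average margin.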

These are such good questions yessssssssssssss

robertearle
03-07-2017, 12:18 PM
but it's for % of goals scored against a team, not % of wins.



So then could it be turned into a sort-of game score predictor?

Here's what I'm thinking: I don't know if you follow college basketball at all; I do to the extent that I pay attention to Wisconsin, and then to the Big Ten. I need a "rooting interest" to stay interested. But in paying that much attention, I have come across a basketball rating system put together by a guy named Ken Pomeroy (who in turn bases some of what he does on Bill James "pythagorean theorem" of runs scored in ML baseball, which is actually somewhat similar to what you're doing here).

The essence of Pomeroy's rating is offensive and defensive efficiency. A "run and gun" team that averages 100 points a game, but needs 100 possessions during the game to get there is less efficient offensively than a slow, deliberate team that only scores 70 points a game, but manages to do so in only 60 possessions. If they play one another in an 80 possession game, the slow team would score 93 points while the run-and-gun team would only score 80 (assuming speeding the slow team up to 80 possessions doesn't affect their efficiency, etc.). Likewise for defensive per-possession efficiency.

Obviously, trying any sort of 'per possession' in hockey is out of the question. But is there some Pomeroy or James equivalent for goal percentage?

If we know that in Wisconsin vs Bemidji, UW is going to score 87.5% of the goals, and there are likely to be

(UW average goals scored + Bemidji average goals scored)/2

total goals scored, then the expected final score would be....

Or are there too few goals scored overall to make that worth the bother?

Ken Pomeroy's current page (with some amount of explanation... but less than there used to be!)
http://kenpom.com/

and the wiki page for Bill James' MLB Pythagorean win expectation.
https://en.wikipedia.org/wiki/Pythagorean_expectation

EDIT: and here's a page with more Pomeroy explanation

(where he explains some changes that he's made over the years; in particular, that he has moved away from something that he had been using that I thought might apply well here, his "log5" calculation. My recollection is that "log5" was the result of his having done a regression analysis on his modified 'Pythagorean formula' calculation on two teams' comparative efficiencies, and "log5" turned out to be a better predictor than the simple squaring that James uses. Or something like that. It makes more sense going forward than the backward direction we're going now, and it's been years since I first read it all, and it may not apply well to hockey, anyway, and ...)

http://kenpom.com/blog/ratings-methodology-update/

TonyTheTiger20
03-07-2017, 01:38 PM
So then could it be turned into a sort-of game score predictor?

If we know that in Wisconsin vs Bemidji, UW is going to score 87.5% of the goals, and there are likely to be

(UW average goals scored + Bemidji average goals scored)/2

total goals scored, then the expected final score would be....
I wrapped this up at 1:30am last night (otherwise I would have never stopped...), but that was definitely where I was thinking a "next step" would be for this.

Right now, GRaNT can easily be used to determine an expected final score based on the losing team scoring n goals with the following formula:

g=x/(x+n) -- this is the formula for percentage of goals scored (G%)
g(x+n)=x
gx+gn=x
gn=1x-gx
gn=(1-g)x
(gn)/(1-g)=x

where g = G% for the winning team
n = number of goals scored for the losing team
x = number of goals scored for the winning team

(g and n are constants).

Using this we find that in the case of UW and BSU,

UW's GRaNT is 497.82
BSU's GRaNT is 82.25

That gives a G% for UW of 497.82/(497.82+82.25) = 85.82% = .8582=g

So, if we want to see how many goals UW would score in a game where BSU scored n=1 goal, our formula gives:

(gn)/(1-g)=x
(.8582*1)/(1-.8582)=x
.8582/.1418=x
6.05=x

So, in a game where Bemidji State scored 1 goal, we would expect Wisconsin to score just about 6 goals, for a 6-1 final score.
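
The rearranged formula is a one-liner in code. A minimal Python sketch (the function name is made up for illustration):

```python
def projected_winner_goals(winner_rating, loser_rating, loser_goals):
    """Solve g = x / (x + n) for x, where g is the winner's expected G%."""
    g = winner_rating / (winner_rating + loser_rating)
    return g * loser_goals / (1 - g)


uw_goals = projected_winner_goals(497.82, 82.25, loser_goals=1)
print(round(uw_goals, 2))  # 6.05
```

Since g/(1-g) works out to the ratio of the two ratings, this is the same as multiplying the losing team's goals by (winner's rating / loser's rating): 497.82 / 82.25 is about 6.05.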

The next step for me with this is to figure out the best way to set the losing team's score (like you tried to do above). Something like an average of UW's GA/gm and BSU's GS/gm? Possibly. More likely, I'll need to work out the math on how to account for strength of schedule (which, not coincidentally, the GRaNT Ranking calculator does provide). The thing is, I think there's a correct answer here -- a corollary to the GRaNT formula that will spit out its projected score between two teams -- rather than something we can arbitrarily assign.

Once you do that, you have yourself a handy little final score predictor.

robertearle
03-07-2017, 01:43 PM
... or are we making this all more complicated than it really is?

The basic idea behind James' Pythagorean Expectancy could be stated as "your average MLB game is gonna have eight runs scored; how often are you gonna win 5-3, and how often are you gonna lose 5-3?"

Your average women's hockey game (at least, looking at this year's WCHA conference games) is gonna have five goals scored; how often will you win 3-2, how often 4-1, and how often are you gonna lose 4-1 or 3-2? Simple 'goals per game' does that pretty well.

EDIT: put another way, the reason Pomeroy efficiency numbers are useful is because you have run-and-gun teams AND slow grind-it-out teams resulting in a much wider variation in the numbers of points scored in a game, and wider variation in average points per game. There just may not be enough variation in goals-per-game to make all this worth the while.

TonyTheTiger20
03-07-2017, 02:06 PM
... or are we making this all more complicated than it really is?

The basic idea behind James' Pythagorean Expectancy could be stated as "your average MLB game is gonna have eight runs scored; how often are you gonna win 5-3, and how often are you gonna lose 5-3?"

Your average women's hockey game (at least, looking at this year's WCHA conference games) is gonna have five goals scored; how often will you win 3-2, how often 4-1, and how often are you gonna lose 4-1 or 3-2? Simple 'goals per game' does that pretty well.
Yes, but there's a better way to do it out there than just "your average hockey game" I think.

If you have a team that wins their games on average 2-1, and another team that wins their games on average 8-4, they're going to have the same rating, but they are going to have a much different expected final score. You want to find a way to be able to differentiate between teams while accounting for strength of schedule.

And like I said, GRaNT gives you strength of schedule, so I feel like there may be a way to do it using the inputs we already have.

robertearle
03-07-2017, 02:14 PM
If you have a team that wins their games on average 2-1, and another team that wins their games on average 8-4, they're going to have the same rating, but they are going to have a much different expected final score.

I'm asking "do we ever have one team that averages 2-1 and another team that averages 8-4?"

TonyTheTiger20
03-07-2017, 03:52 PM
I'm asking "do we ever have one team that averages 2-1 and another team that averages 8-4?"

I'm sure we don't quite to that extent, but there are certainly differences. Just compare teams with similar ratings --

Bemidji and Ohio State have almost identical GRaNT ratings at 82.25 and 82.06 respectively.

Bemidji's average game is 1.91 to 2.57
Ohio State's average game is 1.86 to 2.24

Same exact (almost) ratings, but Bemidji's games average an extra 0.38 goals per game.

Factoring in strength of schedule makes it interesting too -- BSU's SOS is 3rd in the country, and Ohio State's is 10th. So Bemidji is scoring more goals against tougher competition, and Ohio State is allowing fewer goals against weaker competition.
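
The comparison can be laid out in a few lines of Python. This sketch assumes the "1.91 to 2.57" figures read as goals scored to goals allowed per game; the dictionary layout and function name are made up for illustration.

```python
teams = {
    "Bemidji State": {"rating": 82.25, "scored": 1.91, "allowed": 2.57},
    "Ohio State": {"rating": 82.06, "scored": 1.86, "allowed": 2.24},
}


def game_profile(team):
    """Return (total goals per game, raw G% before any schedule adjustment)."""
    total = team["scored"] + team["allowed"]
    return total, team["scored"] / total


for name, team in teams.items():
    total, raw_pct = game_profile(team)
    print(f"{name}: rating {team['rating']}, {total:.2f} goals/game, raw G% {raw_pct:.1%}")

bsu_total, _ = game_profile(teams["Bemidji State"])
osu_total, _ = game_profile(teams["Ohio State"])
print(f"Extra goals per Bemidji game: {bsu_total - osu_total:.2f}")
```

Ohio State's raw G% comes out higher than Bemidji's even though the ratings are nearly identical, which is the kind of gap the strength-of-schedule adjustment is accounting for.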

ARM
03-07-2017, 06:35 PM
I think it would be more accurate to say that a team that wins all its games 2-1 is just better than a team that wins its games 6-5. If a team routinely scores 2x as many goals as it allows, I would say that's a good argument for them to be better than a team that scores 1.2x as many goals as it allows.
IMO, you've swung too far to the stats side and lost important aspects of the hockey side. Margin of victory matters more than % of goals. A team that wins 7-2 had better control of a game than a team that won 4-1. Also, a 5-3 winner was less likely to finish tied than a team that ended up 2-1, but in each case, your model suggests the opposite. You give more weight to the variance that comes with playing higher-scoring games than you do to the variance that comes with playing 2-1 games where a team is more at the mercy of a bounce.

From a math perspective, you make the statement that a team that wins 2-1 is better than a team that wins 6-5. That may be true, but you haven't offered anything in the way of proof. Perhaps the 2-1 team is sunk if the opponent scores first, while the 6-5 team is never out of a game; we don't know.

TonyTheTiger20
03-07-2017, 10:23 PM
IMO, you've swung too far to the stats side and lost important aspects of the hockey side. Margin of victory matters more than % of goals. A team that wins 7-2 had better control of a game than a team that won 4-1. Also, a 5-3 winner was less likely to finish tied than a team that ended up 2-1, but in each case, your model suggests the opposite. You give more weight to the variance that comes with playing higher-scoring games than you do to the variance that comes with playing 2-1 games where a team is more at the mercy of a bounce.

From a math perspective, you make the statement that a team that wins 2-1 is better than a team that wins 6-5. That may be true, but you haven't offered anything in the way of proof. Perhaps the 2-1 team is sunk if the opponent scores first, while the 6-5 team is never out of a game; we don't know.
Sorry for the delayed response, I've been working on the score projection feature all afternoon (and I think I have it!)

Anywho -- I see your "you've lost important aspects of the hockey side" and raise you a "the point of any mathematical system is to remove subjectivity." What I think of whether a 7-2 win is better than a 4-1 win doesn't matter. As constructed, the system considers 4-1 (slightly) better. Whether the 7-2 winner "controlled the game better" is really subjective.

I would definitely, definitely argue that % of goals matters more than margin of victory, though, and that a 2-1 win is better than a 6-5 win. You said I didn't offer any proof on that, but I don't know what I could say that's better than the sentence you actually quoted: If, over the long run, a team scores 2x as many goals as its opponent, it will win a higher percentage of games than a team that scores 1.2x as many goals as its opponent.

But honestly, I think getting hung up on an individual game result of 2-1 vs. 6-5 is sort of looking at it from the wrong angle. You want to consider the season results as a whole. If one team scores 200 goals in a season and allows 100, their "average result" is a 2-1 score. But if another team scores 200 goals in a season and allows 167 goals, their "average result" is a 6-5 score. Surely you must agree that it is far more likely that the 200 scored/100 allowed team has a better record.
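
The Bill James Pythagorean expectation linked earlier in the thread gives one rough way to turn those season totals into an expected record. A minimal Python sketch; the exponent of 2 is the classic baseball value and is only an assumption here, since nothing in this thread calibrates it for hockey:

```python
def pythagorean_win_pct(goals_for, goals_against, exponent=2.0):
    """Bill James-style expected winning percentage from season totals.

    exponent=2.0 is the original baseball value, used here as an assumption.
    """
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


doubling_up = pythagorean_win_pct(200, 100)  # the "2-1 on average" team
six_to_five = pythagorean_win_pct(200, 167)  # the "6-5 on average" team

print(f"200 scored / 100 allowed: {doubling_up:.3f}")
print(f"200 scored / 167 allowed: {six_to_five:.3f}")
```

With that exponent the first team projects to about an .800 record and the second to a bit under .600.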

Winning on average by a score of 2-1 is a pretty sizable advantage. Most of the projected score ratios are much closer than that -- for example, UMW vs. BC would be projected at a 2.5 to 2.3 final score. That doesn't mean much in a one game setting, but over the long run it does have meaning. It's like saying the average household has 2.5 kids -- you aren't going to walk into any houses and find half a kid! (I hope...)

As far as game flow fluctuations are concerned -- of course a team that's winning 2-1 is more likely to have one bounce affect the result than a team that's up 5-3. But the difference is that over the long run the team that is doubling up its opponent (and it's not limited to a 2-1 score, it can be scores of 4-2, 6-3...) is going to do better than a team that is scoring less than that.

I love all this -- keep 'em coming!

TonyTheTiger20
03-07-2017, 10:45 PM
So then could it be turned into a sort-of game score predictor?
I've done it!

I've teased out the data necessary to convert percentage of goals scored to an actual score projection for any matchup.

It took a whole lot of math-ing it out, but generally the way it works is this: you figure that if Team A played a team with an identical rating to itself, it would score on average as many goals as it allows. Exactly how many is easy to figure out -- just take the midpoint between its goals scored and goals allowed. If a team with a positive goal differential keeps playing better and better teams, you would expect its goals scored to decrease and its goals allowed to increase, meeting in the middle.

Using that figure, I'm able to calculate what a team's "expected" goals scored and goals allowed would be. I'm skipping a lot of steps for how I determined this, but basically you take the projected goal percentage for Team A and multiply it by a factor involving Team A's GS and Team B's GA, and then you take the projected goal percentage for Team B and multiply it by a factor involving Team B's GS and Team A's GA.

Anyway, I tested it with a few checks on what you would expect (for example, the projected goal percentage should equal the R1/(R1+R2) GRaNT Rating calculation of the two teams, and if you calculate the linear regression based on the data points, a team should expect to score 0 goals against a "perfect" team and allow 0 goals to a "perfectly bad" team).

I'm tired and this probably doesn't make any sense but it works and I'm very pleased. I added a score projection tab to the spreadsheet online, and here's a screenshot of it. So, if you find the UW x BSU cell, you'll see that Wisconsin would be expected to beat Bemidji State on average by approximately a score of 4.0 to 0.7 (rounded to one decimal).

(Click here if it's too small) (http://i.imgur.com/ig5wYic.png)

<img src="http://i.imgur.com/ig5wYic.png">
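
The first of those checks can be repeated on the published UW x BSU projection. Because the 4.0 to 0.7 score is rounded to one decimal, the Python sketch below only checks that the rating-based goal share falls inside the range the rounding allows; the function name is made up for illustration.

```python
def share_bounds(rounded_for, rounded_against, step=0.1):
    """Range of goal shares consistent with a score rounded to `step`."""
    half = step / 2
    low = (rounded_for - half) / ((rounded_for - half) + (rounded_against + half))
    high = (rounded_for + half) / ((rounded_for + half) + (rounded_against - half))
    return low, high


rating_share = 497.82 / (497.82 + 82.25)  # R1 / (R1 + R2) for UW vs. BSU
low, high = share_bounds(4.0, 0.7)
consistent = low <= rating_share <= high

print(f"rating share {rating_share:.4f}, allowed range {low:.4f} to {high:.4f}")
print("consistent" if consistent else "inconsistent")
```

The rating share of about 85.8% sits inside the range implied by a 4.0 to 0.7 score, so the projection and the ratings agree to within rounding.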

D2D
03-07-2017, 11:01 PM
Rounding off:
Wisconsin 4, Robert Morris 1 (I don't think it will be this close)
Boston College 2, St. Lawrence 2 (to be decided in overtime?)
Clarkson 3, Cornell 2 (not taking into account the Big Red's 2nd leading scorer is out)
Minnesota-Duluth 3, Minnesota 2 (OK, 2.5 to 2.3 - we've got a shot!)

TonyTheTiger20
03-07-2017, 11:22 PM
Boston College 2, St. Lawrence 2 (to be decided in overtime?)
Oh God nooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo

TonyTheTiger20
03-07-2017, 11:51 PM
Oh, one other thing -- we can debate whether a 4-3 win is better than a 6-4 win or whatever you want, but the theory behind using this methodology is to reward a team that wins 4-0 or 5-1 or by some other big margin, and to rate that team above another team that might have the same win/loss result but wins close ones like 2-1 or 3-2.

I'm not saying this is better than KRACH or CHODR or any other methodology out there (except for PWR and RPI, those are stupid), just that it's another methodology to go alongside them, depending on what level of the results (i.e. wins/losses, goals, or even shots) you want to analyze. I'm still the world's biggest KRACH addict -- which is why I'm so excited about my system, because the theory behind the calculation is very similar, it's just using a different level of game results.

TonyTheTiger20
03-08-2017, 08:28 AM
Bump

ARM
03-08-2017, 08:33 AM
Bump
I did send an email requesting Spam removal. In the meantime, thanks for moving actual topics to the front.