John t whelan ranking simulator

  • Re: John t whelan ranking simulator

    Originally posted by FlagDUDE08 View Post
    As for Jim's and my differences, we've already discussed a disagreement in Quality Wins Bonus.
    If the only differences are in teams with dropped wins, this is probably the difference. Our analysis of how to calculate Minnesota a page or two back was otherwise identical.

    I will also ask Jim if he is taking weighting into account on RatingsPI
    I'm not sure what this means.

    One other factor that could be making a difference is how OOWP is calculated, specifically whether games involving the team in question should be counted. Some sources say yes, others say no.
    I'm assuming this is unchanged from the past -- OOWP is simply the average of the OWPs for each opponent (so it does not include games against each opponent).

    One other thing that could cause an issue is specifically how OWP and OOWP are calculated. Do you take a cumulative record, or do you take the average of each team's records?
    I'm again assuming unchanged from the past -- average of records.
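A minimal sketch of the "average of records" reading of OWP and OOWP discussed above. It assumes each game is scored a straight 1.0/0.5/0.0 with no home/away weighting, and it assumes an opponent is counted once per meeting, which the thread does not settle. The function names and the `(team_a, team_b, score_a)` tuple layout are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_results(games):
    """Map each team to a list of (opponent, score) pairs.

    games: iterable of (team_a, team_b, score_a), where score_a is
    1.0 for a win by team_a, 0.5 for a tie, 0.0 for a loss.
    """
    results = defaultdict(list)
    for team_a, team_b, score_a in games:
        results[team_a].append((team_b, score_a))
        results[team_b].append((team_a, 1.0 - score_a))
    return results


def win_pct(results, team, exclude=None):
    """Winning percentage of `team`, skipping games against `exclude`."""
    scores = [s for opp, s in results[team] if opp != exclude]
    return sum(scores) / len(scores) if scores else None


def owp(results, team):
    """Average of each opponent's record, leaving out games against `team`."""
    vals = [win_pct(results, opp, exclude=team) for opp, _ in results[team]]
    vals = [v for v in vals if v is not None]
    return sum(vals) / len(vals) if vals else None


def oowp(results, team):
    """Average of the OWP of each of `team`'s opponents."""
    vals = [owp(results, opp) for opp, _ in results[team]]
    vals = [v for v in vals if v is not None]
    return sum(vals) / len(vals) if vals else None
```

Switching to a cumulative record instead would mean summing wins and games across all opponents before dividing, which generally gives a different number once opponents have played unequal schedules.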



    • Re: John t whelan ranking simulator

      Originally posted by JimDahl View Post
      If the only differences are in teams with dropped wins, this is probably the difference. Our analysis of how to calculate Minnesota a page or two back was otherwise identical.


      I'm not sure what this means.


      I'm assuming this is unchanged from the past -- OOWP is simply the average of the OWPs for each opponent (so it does not include games against each opponent).


      I'm again assuming unchanged from the past -- average of records.
      Sorry for being unclear with the second point. I meant in terms of OWP and OOWP. The sources I have say not to take weighting into account, and to do a straight 1.0/0.0/0.5 for each game.

      One thing I did notice with calculations, at least between RHamilton and myself, is that we had different games to remove for various teams. I wonder if this is the case for us.



      • Re: John t whelan ranking simulator

        Originally posted by FlagDUDE08 View Post
        Sorry for being unclear with the second point. I meant in terms of OWP and OOWP. The sources I have say not to take weighting into account, and to do a straight 1.0/0.0/0.5 for each game.
        I agree, I interpreted OWP and OOWP as being straight (not with the home/away weightings). So, my OWP and OOWP calculations are essentially unchanged from previous years.

        One thing I did notice with calculations, at least between RHamilton and myself, is that we had different games to remove for various teams. I wonder if this is the case for us.
        A point of disagreement in the past has been whether this process is recursive. I've always been pretty convinced that they calculate RPI once, then drop all the games that make it go up if you drop them. Others have wondered if you then need to make another pass to see if the new, higher RPI has pushed any new games into "adverse" territory (repeating until you don't find any). That matters a lot more this time of year than later, so I'm not sure we've ever had a conclusive test come tournament time.
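To make the single-pass versus repeated-pass question concrete, here is a hedged sketch. It assumes a simplified model in which a team's RPI is the plain average of per-game values. That model is only an illustration and is not the committee's stated procedure. `drop_adverse_wins` and its arguments are hypothetical names.

```python
def drop_adverse_wins(values, is_win, recursive=False):
    """Drop wins whose per-game value is below the team's current RPI.

    values:    per-game RPI contributions (simplified model, see lead-in)
    is_win:    parallel list of booleans, True where the game was a win
    recursive: if False, make one pass; if True, repeat until no win is adverse
    Returns (rpi, kept_indices).
    """
    kept = list(range(len(values)))
    while True:
        rpi = sum(values[i] for i in kept) / len(kept)
        adverse = {i for i in kept if is_win[i] and values[i] < rpi}
        if not adverse:
            return rpi, kept
        kept = [i for i in kept if i not in adverse]
        if not recursive:
            # Single pass: recompute once with the dropped games and stop.
            return sum(values[i] for i in kept) / len(kept), kept


# Made-up numbers where the second pass changes the answer.
values = [0.70, 0.44, 0.40, 0.20]
is_win = [True, True, True, False]
single = drop_adverse_wins(values, is_win, recursive=False)
repeated = drop_adverse_wins(values, is_win, recursive=True)
```

In this made-up case the first pass drops only the 0.40 win. The higher average then makes the 0.44 win adverse as well, so the repeated version keeps two games where the single pass keeps three. At least one game always sits at or above the average, so the kept list can never become empty.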
        Last edited by JimDahl; 11-19-2013, 08:29 AM.



        • Re: John t whelan ranking simulator

          Originally posted by JimDahl View Post
          I agree, I interpreted OWP and OOWP as being straight (not with the home/away weightings). So, my OWP and OOWP calculations are essentially unchanged from previous years.


          A point of disagreement in the past has been whether this process is recursive. I've always been pretty convinced that they calculate RPI once, then drop all the games that make it go up if you drop them. Others have wondered if you then need to make another pass to see if the new, higher RPI has pushed any new games into "adverse" territory (repeating until you don't find any). That matters a lot more this time of year than later, so I'm not sure we've ever had a conclusive test come tournament time.
          I would think what we've seen to date would imply it isn't... should be simple enough to test on previous data... if anything upsets the seeding apple cart then it'll show that it isn't recursive.
          BS UML '04, PhD UConn '09

          Jerseys I would like to have:
          Skating Friar Jersey
          AIC Yellowjacket Jersey w/ Yellowjacket logo on front
          UAF Jersey w/ Polar Bear on Front
          Army Black Knight logo jersey


          NCAA Men's Division 1 Simulation Primer



          • Re: John t whelan ranking simulator

            Originally posted by Patman View Post
            I still say the end goal is a simulator

            Edit: I am not using a modular executable language... I don't know the differences in the -oriented but what I do is use a thing that primarily uses C as a platform. I suppose it's possible to treat it as a script but not without installing software.

            For me KRACH is deadly simple once you purée the data into a win matrix and game matrix. I've posted the code for that before.

            I'll say the big thing is if we can adopt a data input standard that will go a long way.
            Absolutely. When I wrote my KRACH script (is there any hockey fan who hasn't at least tried this?) in MATLAB, it was <100 lines of code, and the majority of that just had to do with reading the input file and stuffing the information into the win matrix, as you say. The actual "calculation" itself is like 10 lines of code - that simplicity is one of the aesthetic beauties of KRACH (in addition to its functional beauty).

            A standard input format would be great, but you'd probably need all of the major sites (USCHO, CHN, etc) to come together to agree on it, and I'm not sure they'd be motivated enough to bother.
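As a rough illustration of what a shared input format and the "stuffing into the win matrix" step could look like, here is a minimal sketch. The CSV column names (`date`, `home`, `away`, `home_goals`, `away_goals`) are an assumption made up for this example and are not a format any site publishes. Counting a tie as half a win for each side is also an assumption.

```python
import csv
import io


def load_matrices(text):
    """Parse game rows and return (teams, wins, games).

    wins[i][j]  = wins by team i over team j (a tie adds 0.5 to each side)
    games[i][j] = number of games played between teams i and j
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    teams = sorted({r["home"] for r in rows} | {r["away"] for r in rows})
    index = {name: i for i, name in enumerate(teams)}
    n = len(teams)
    wins = [[0.0] * n for _ in range(n)]
    games = [[0] * n for _ in range(n)]
    for r in rows:
        h, a = index[r["home"]], index[r["away"]]
        home_goals, away_goals = int(r["home_goals"]), int(r["away_goals"])
        games[h][a] += 1
        games[a][h] += 1
        if home_goals > away_goals:
            wins[h][a] += 1.0
        elif away_goals > home_goals:
            wins[a][h] += 1.0
        else:
            wins[h][a] += 0.5
            wins[a][h] += 0.5
    return teams, wins, games


# Made-up sample data in the hypothetical format.
SAMPLE = """\
date,home,away,home_goals,away_goals
2013-11-15,A,B,3,1
2013-11-16,B,A,2,2
2013-11-16,C,A,0,4
"""
teams, wins, games = load_matrices(SAMPLE)
```

Keeping home and away as separate columns costs nothing now and leaves room for a home/road adjustment later.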
            If you don't change the world today, how can it be any better tomorrow?



            • Originally posted by LynahFan View Post
              Absolutely. When I wrote my KRACH script (is there any hockey fan who hasn't at least tried this?) in MATLAB, it was <100 lines of code, and the majority of that just had to do with reading the input file and stuffing the information into the win matrix, as you say. The actual "calculation" itself is like 10 lines of code - that simplicity is one of the aesthetic beauties of KRACH (in addition to its functional beauty).

              A standard input format would be great, but you'd probably need all of the major sites (USCHO, CHN, etc) to come together to agree on it, and I'm not sure they'd be motivated enough to bother.
              If we are talking about a major website, unlikely. I mostly meant amongst ourselves. In theory, if one wanted to grab direct data then web scraping might be the best option... though painful.

              One alternative would be to ask collegehockeystats to do a dump file for us with the most relevant summary (game data) info. But I don't know under whose auspices they produce game information.


              • Re: John t whelan ranking simulator

                Originally posted by LynahFan View Post
                Absolutely. When I wrote my KRACH script (is there any hockey fan who hasn't at least tried this?) in MATLAB, it was <100 lines of code, and the majority of that just had to do with reading the input file and stuffing the information into the win matrix, as you say. The actual "calculation" itself is like 10 lines of code - that simplicity is one of the aesthetic beauties of KRACH (in addition to its functional beauty).

                A standard input format would be great, but you'd probably need all of the major sites (USCHO, CHN, etc) to come together to agree on it, and I'm not sure they'd be motivated enough to bother.
                KRACH, I would assume, is much easier to do than PWR. My SLOC is a few thousand, but the majority of this code is for display purposes; I would say there are only a couple hundred SLOC that actually involve math. I have not tried KRACH, mostly because I do not know what the formula is.

                Standard input would be nice, but I agree that not many would sign on. I know my input is entirely based upon the Google Docs spreadsheet that I put together over the summer that has the entire country's schedule.



                • Re: John t whelan ranking simulator

                  Originally posted by Patman View Post
                  If we are talking about a major website, unlikely. I mostly meant amongst ourselves. In theory, if one wanted to grab direct data then web scraping might be the best option... though painful.

                  One alternative would be to ask collegehockeystats to do a dump file for us with the most relevant summary (game data) info. But I don't know under whose auspices they produce game information.
                  I think use of collegehockeystats is the key. I currently scrape a combination of sites, which gets the data sooner (USCHO and CHN often post earlier) and helps me catch errors (they do sometimes post bad scores), but does require some manual poking to fix things now and then. I don't think any of you would want to rely on that data (nor would I on yours) because you'd occasionally be waiting for me to notice, care about, and fix such a problem.

                  If you wanted it to be a truly automated, trusted source, I think a high quality scraper/translator for collegehockeystats into a machine-readable input file is the way to go.



                    • Re: John t whelan ranking simulator

                      Originally posted by Patman View Post
                      I still say the end goal is a simulator
                      My previous code was in fact a simulator. It simulates a season (including in-season and post-season tournaments) and calculates the NCAA field in under 1/7th of a second per simulation. The one thing it didn't do was implement the tie-breaking rules for conference playoffs. It just randomly seeded teams with tied conference records. Anyway, when I get around to it, I'll have to implement the new pairwise. By the time I do that, all the in-season tournaments will be over, so the simulation should be really fast.

                      Basic KRACH code is dead-simple, but implementing a home-road differential and a tie probability requires a maximum likelihood routine. Still not difficult, but a lot of the elegance goes away.
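A minimal sketch of the core step of a season simulator. It assumes each unplayed game is decided by KRACH-style ratings, with P(i beats j) = K(i) / (K(i) + K(j)). That is one common choice and not necessarily what the code described above does. It ignores ties, the home/road differential, and tournament brackets, and `simulate_remaining` is a hypothetical name.

```python
import random


def simulate_remaining(ratings, schedule, rng):
    """Simulate unplayed games once and return wins added per team.

    ratings:  dict mapping team name to a positive KRACH-style rating
    schedule: list of (home, away) pairs still to be played
    rng:      a random.Random instance, passed in so runs are repeatable
    """
    added = {team: 0 for team in ratings}
    for home, away in schedule:
        p_home = ratings[home] / (ratings[home] + ratings[away])
        winner = home if rng.random() < p_home else away
        added[winner] += 1
    return added


# Made-up ratings and schedule; repeat the call many times for a Monte Carlo run.
ratings = {"A": 100.0, "B": 100.0, "C": 50.0}
schedule = [("A", "B"), ("B", "C"), ("C", "A")]
one_run = simulate_remaining(ratings, schedule, random.Random(1))
```

Modelling ties or home ice would replace the single `p_home` line with a three-outcome draw, which is where the extra fitting mentioned above comes in.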


                      Originally posted by JimDahl View Post
                      A point of disagreement in the past has been whether this process is recursive. I've always been pretty convinced that they calculate RPI once, then drop all the games that make it go up if you drop them. Others have wondered if you then need to make another pass to see if the new, higher RPI has pushed any new games into "adverse" territory (repeating until you don't find any). That matters a lot more this time of year than later, so I'm not sure we've ever had a conclusive test come tournament time.
                      Interesting. I had thought about this and made my code recursive, though it was quite rare in simulations after the season was up that you needed more than one pass. (As I recall, it was something like one season in 30.) I guess I don't understand how it could not be recursive. If you don't make it recursive, can't a team protest that there was a game left in the games that counted to calculate RPI that lowered its rating?



                      • Originally posted by goblue78 View Post
                        My previous code was in fact a simulator. It simulates a season (including in-season and post-season tournaments) and calculates the NCAA field in under 1/7th of a second per simulation. The one thing it didn't do was implement the tie-breaking rules for conference playoffs. It just randomly seeded teams with tied conference records. Anyway, when I get around to it, I'll have to implement the new pairwise. By the time I do that, all the in-season tournaments will be over, so the simulation should be really fast.

                        Basic KRACH code is dead-simple, but implementing a home-road differential and a tie probability requires a maximum likelihood routine. Still not difficult, but a lot of the elegance goes away.




                        Interesting. I had thought about this and made my code recursive, though it was quite rare in simulations after the season was up that you needed more than one pass. (As I recall, it was something like one season in 30.) I guess I don't understand how it could not be recursive. If you don't make it recursive, can't a team protest that there was a game left in the games that counted to calculate RPI that lowered its rating?
                        I wish we were using the same code base... In-season tournaments and tie-breakers have been what has stopped me. But I will admit it's the in-season tournaments, because I don't know how I want to weave them in with the rest.


                        • Re: John t whelan ranking simulator

                          Originally posted by Patman View Post
                          I wish we were using the same code base... In-season tournaments and tie-breakers have been what has stopped me. But I will admit it's the in-season tournaments, because I don't know how I want to weave them in with the rest.
                          I am the same where it's impractical for me to do such a predictor. Mostly because I'd then have to change my entire format to include what league, whether a game was a league game or not, and so on. I'm sure I could do it, but I fear everything would just get larger and larger.



                          • Re: John t whelan ranking simulator

                            Originally posted by FlagDUDE08 View Post
                            I have not tried KRACH, mostly because I do not know what the formula is.
                            From Whelan's explanation on the CHN site:

                            K(i) = V(i) / [ ∑(j) N(ij) / (K(i) + K(j)) ]

                            K(i) = KRACH rating of team i
                            V(i) = number of victories by team i, regardless of who they faced
                            N(ij) = number of games team i played against opponent j

                            Obviously, since K(i) is on both sides, this is recursive. Plug in 100 for everyone on the right sides, recalculate all 59 Ratings. Plug those in again.... Continue until nothing changes anymore (usually 20 iterations gets close enough)
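The iteration just described can be sketched in a few lines. This is a minimal version under stated assumptions: every team has at least one game and at least one win, and any ties have already been folded into V(i) however you prefer. The function name `krach` and its parameters are hypothetical.

```python
def krach(wins, games, iterations=20, start=100.0):
    """Fixed-point iteration for K(i) = V(i) / sum_j [ N(ij) / (K(i) + K(j)) ].

    wins[i]     = V(i), total victories for team i
    games[i][j] = N(ij), games played between teams i and j
    Every rating starts at `start`, and all ratings are recomputed from the
    previous round's values on each pass.
    Not guarded: a team with no games, or two winless teams that met,
    would cause a division by zero.
    """
    n = len(wins)
    k = [start] * n
    for _ in range(iterations):
        k = [
            wins[i] / sum(games[i][j] / (k[i] + k[j])
                          for j in range(n) if games[i][j])
            for i in range(n)
        ]
    return k


# Made-up three-team round robin: each pair met three times.
wins = [4, 3, 2]
games = [[0, 3, 3],
         [3, 0, 3],
         [3, 3, 0]]
ratings = krach(wins, games)
```

Only the ratios between ratings matter, so a handy convergence check is that each team's expected wins, summing N(ij) * K(i) / (K(i) + K(j)) over its opponents, matches its actual V(i).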



                            • Originally posted by FlagDUDE08 View Post
                              I am the same where it's impractical for me to do such a predictor. Mostly because I'd then have to change my entire format to include what league, whether a game was a league game or not, and so on. I'm sure I could do it, but I fear everything would just get larger and larger.
                              For me it's just more confusing... I don't have the ability to bear down on things anymore. The way I would do it would be to define the type of tournament and types of results, etc.

                              The more things get formalized the closer we will get.

                              Honestly, if I could just leave myself to model development... Still too many steps ahead I'm afraid


                              • Originally posted by Numbers View Post
                                From Whelan's explanation on the CHN site:

                                K(i) = V(i) / [ ∑(j) N(ij) / (K(i) + K(j)) ]

                                K(i) = KRACH rating of team i
                                V(i) = number of victories by team i, regardless of who they faced
                                N(ij) = number of games team i played against opponent j

                                Obviously, since K(i) is on both sides, this is recursive. Plug in 100 for everyone on the right sides, recalculate all 59 Ratings. Plug those in again.... Continue until nothing changes anymore (usually 20 iterations gets close enough)
                                Yeah, I only run 100 because it's so **** fast... In reality, 10 does it well enough.