CFB Stats Part II

In College Football Statistics, I created my Team and Game objects, filled the Teams collection, and wrote a FindByName property for Teams. Now the fun part: filling the Game objects. This is a long one, and I’m not going to sit here and tell you it doesn’t need some refactoring. But it works for now. I added line numbers so I can identify the lines as I explain the logic.

The main block of the procedure loops through every team in the CTeams collection class. Within that loop, there are two main blocks, offense and defense. Offense starts on line 20 and defense on line 520. For this code to work, I need references to Microsoft XML, v6.0 and Microsoft HTML Object Library. Generally, I go get a web page, loop through a certain table, fill a CGame class, and add it to CGames and to the games collections of the two teams involved.

The web page fetching starts on line 40. I open a request passing a URL, send the request, then create an HTML document based on the response.
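The fetch step described above can be sketched roughly like this. It is not the procedure from this post, just a minimal illustration that assumes the two references mentioned earlier are set; the function name FetchHtmlDocument and the status check are my own additions.

```vba
Option Explicit

' Sketch only: requires references to Microsoft XML, v6.0 and
' Microsoft HTML Object Library. FetchHtmlDocument is a hypothetical name.
Public Function FetchHtmlDocument(ByVal sUrl As String) As MSHTML.HTMLDocument

    Dim xHttp As MSXML2.XMLHTTP60
    Dim hDoc As MSHTML.HTMLDocument

    ' Open a synchronous GET request for the URL and send it
    Set xHttp = New MSXML2.XMLHTTP60
    xHttp.Open "GET", sUrl, False
    xHttp.send

    ' Build an HTML document from the response text.
    ' If the request did not succeed, the function returns Nothing.
    If xHttp.Status = 200 Then
        Set hDoc = New MSHTML.HTMLDocument
        hDoc.body.innerHTML = xHttp.responseText
        Set FetchHtmlDocument = hDoc
    End If

End Function
```

The caller should test the result with Is Nothing before looping through its tables.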

Starting on line 70, I loop through every table in the HTML document and look for the one that has a class of “game-log”. Then I loop through each row in that table (90). I skip the first row (100), any rows that don’t have a date in the first cell (110), and any row where the team name isn’t a hyperlink (120). Teams without hyperlinks are D1AA teams and I don’t care about those games.
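Here is a hedged sketch of that table and row filtering, separate from the numbered procedure. It assumes the date is in cell index 0 and the team name in cell index 1, and that the table's class attribute is exactly "game-log"; the sub name ListRelevantRows is made up for the example.

```vba
Option Explicit

' Sketch only: requires a reference to Microsoft HTML Object Library.
' ListRelevantRows is a hypothetical name; cell positions are assumptions.
Public Sub ListRelevantRows(ByVal hDoc As MSHTML.HTMLDocument)

    Dim hTable As MSHTML.HTMLTable
    Dim hRow As MSHTML.HTMLTableRow
    Dim i As Long

    For Each hTable In hDoc.getElementsByTagName("table")
        If hTable.className = "game-log" Then
            ' Start at 1 to skip the first (header) row
            For i = 1 To hTable.Rows.Length - 1
                Set hRow = hTable.Rows(i)
                If hRow.Cells.Length >= 2 Then
                    ' Keep rows with a date in the first cell...
                    If IsDate(hRow.Cells(0).innerText) Then
                        ' ...and a hyperlinked team name in the second cell
                        If hRow.Cells(1).getElementsByTagName("a").Length > 0 Then
                            Debug.Print CDate(hRow.Cells(0).innerText), _
                                hRow.Cells(1).innerText
                        End If
                    End If
                End If
            Next i
        End If
    Next hTable

End Sub
```

This version only prints the rows that survive the three checks, which is a handy way to confirm the filters before filling any objects.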

Once I find a row that’s relevant, I grab the game date (130) and a CTeam object called clsOpponent based on the name in the second cell (140). Now I have clsTeam and clsOpponent representing the two teams in the game and a date. With those three pieces of information, I can determine if the game already exists (150). That property simply loops through all the games and checks the teams and the date. If the game exists, I don’t want to create a new one. For instance, if I’ve already processed Auburn, then the Auburn v. Clemson game already exists and Auburn’s stats are recorded in there. When I get around to processing Clemson, I don’t want to create a new game, just fill in Clemson’s stats.
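The existence check boils down to matching the date and the two teams in either order. The sketch below shows that idea on its own. To stay self-contained it uses a user-defined type and an array instead of the CGame and CGames classes, so GameKey, GameExists, and the sample date are all stand-ins rather than the real property.

```vba
Option Explicit

' Hypothetical stand-in for the parts of a game that identify it
Public Type GameKey
    GameDate As Date
    HomeName As String
    AwayName As String
End Type

' Returns True if a game on dtGame between the two teams is already
' stored, no matter which of the two is listed as the home team
Public Function GameExists(ByRef uGames() As GameKey, ByVal lCount As Long, _
    ByVal dtGame As Date, ByVal sTeamA As String, ByVal sTeamB As String) As Boolean

    Dim i As Long

    For i = 0 To lCount - 1
        If uGames(i).GameDate = dtGame Then
            If (uGames(i).HomeName = sTeamA And uGames(i).AwayName = sTeamB) Or _
               (uGames(i).HomeName = sTeamB And uGames(i).AwayName = sTeamA) Then
                GameExists = True
                Exit Function
            End If
        End If
    Next i

End Function

Public Sub DemoGameExists()

    Dim uGames(0 To 0) As GameKey

    ' Illustrative data only; the date is a sample value
    uGames(0).GameDate = DateSerial(2010, 9, 18)
    uGames(0).HomeName = "Auburn"
    uGames(0).AwayName = "Clemson"

    ' Processing Clemson later finds the game Auburn already created
    Debug.Print GameExists(uGames, 1, DateSerial(2010, 9, 18), "Clemson", "Auburn")

End Sub
```

Checking both orderings is the important part, because the second team to be processed sees the matchup from the other side.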

To differentiate the teams, I identify the home team and the away team. I need to determine if I’m processing the home or away team on this pass (160). This got a little funky for neutral site games because the web page doesn’t identify who’s home and who’s away. I have this in MUtilities.

If there’s an @ sign, it’s an away game. For neutral site games, identified with a “+”, I call the loser the away team regardless of who actually was. Ugh, that code needs to be refactored to bReturn = Left$(hRow.Cells(3).innerText, 1) = "L". I’ll get on that.
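As a rough sketch of that rule, here is a standalone function with the refactored comparison folded in. It is not the MUtilities code. The name IsAwayGame is hypothetical, and it takes the location marker and the result text as plain strings because which cells hold them depends on the page layout.

```vba
Option Explicit

' Sketch only: IsAwayGame is a hypothetical name.
' sLocation is the text that may start with "@" or "+".
' sResult is the result text, assumed to start with "W" or "L".
Public Function IsAwayGame(ByVal sLocation As String, _
    ByVal sResult As String) As Boolean

    Select Case Left$(Trim$(sLocation), 1)
        Case "@"
            ' Explicit away game
            IsAwayGame = True
        Case "+"
            ' Neutral site: treat the loser as the away team
            IsAwayGame = (Left$(Trim$(sResult), 1) = "L")
        Case Else
            ' No marker means a home game
            IsAwayGame = False
    End Select

End Function
```

Because both teams apply the same rule to a neutral site game, the winner's pass and the loser's pass agree on who the away team is.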

Back to my fill code. Starting on 170, I create a new game if it doesn’t already exist and set the date, score, and the home and away teams. I add the game to the CGames collection class, but also to each team’s games collection (310 and 320). Starting on 340, I fill in the home or away stats.

In the second major block of code, I pretty much do the same thing for the defensive stats. There’s not much different here and that screams for a rewrite. I should have these two major blocks in another procedure and just pass the differences in. Like I said, it’s a work in progress. I like to get the code working and then refactor where it makes sense. Apparently not before I post to the blog though.

At this point I have a collection of CTeam objects and a collection of CGame objects. My CTeam objects also have their own collections of games. Let’s see if it works.

I check a few and it all looks good. Some of the new properties I wrote for this test

It wasn’t all peaches and beach balls though. I ran into a few problems and had to create this little sub to check out one team at a time.

That way I could check each game of a particular team for accuracy.

I feel like I have all the information I need. Now I need to actually do something with it. How will I exclude games? I prefer to do it automatically, but I’ll need a pretty fancy algorithm. CBS ranks all 120 D1A teams, so I could use that ranking system and exclude teams below a certain number. Another thought I had is to weight the stats against the opponent based on the opponent’s ranking. For instance, I could divide the 120 teams into thirds. The top third would be weighted 100%, the middle third 50%, and the bottom third 0%. I’ll probably need to do it a few different ways to see what I like.
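The thirds idea could look something like the sketch below. Nothing here is settled. OpponentWeight is a hypothetical name, the rank is assumed to run from 1 (best) to the team count, and the 100/50/0 split is just the example from the paragraph above.

```vba
Option Explicit

' Sketch only: returns the weight to apply to stats earned against an
' opponent with the given rank (1 = best). Names and cutoffs are assumptions.
Public Function OpponentWeight(ByVal lRank As Long, _
    Optional ByVal lTeamCount As Long = 120) As Double

    Dim lThird As Long

    lThird = lTeamCount \ 3   ' 40 teams per third when there are 120

    Select Case lRank
        Case Is <= lThird
            OpponentWeight = 1      ' top third counts in full
        Case Is <= lThird * 2
            OpponentWeight = 0.5    ' middle third counts half
        Case Else
            OpponentWeight = 0      ' bottom third is ignored
    End Select

End Function
```

With 120 teams, ranks 1 through 40 return 1, ranks 41 through 80 return 0.5, and ranks 81 through 120 return 0. Keeping the weights in one function makes it easy to swap in a different scheme later.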

One thought on “CFB Stats Part II”

  1. Thanks Dick. Because of this, I will start following college football.

    Just kidding! There’s a serious bug in that code, but I’m not going to tell you what it is. I hope you don’t have any big money riding on this.

