The team you want to assess just finished a game against some opponent. You have no information about the game other than who the two teams are. You are asked to guess who won, but before you do, you are allowed to look at one player's stat line. Which player do you pick? The player you would pick would seem to be the most important player, since their stats best indicate whether or not the team won.
That is a relatively imprecise notion of importance, so I wanted to try my hand at creating a rudimentary metric for it. The discussion that started all this was about basketball, so I began with that sport. Here is my first crack at it.
Importance Index: the correlation coefficient between the point differential in a game and the number of points scored by an individual in that game.
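For anyone who wants to try this on their own team, here is a minimal sketch of how the index could be computed, assuming "correlation coefficient" means the ordinary Pearson correlation. The function name `importance_index` and the sample numbers are made up for illustration; they are not real Kansas State results.

```python
from math import sqrt


def importance_index(point_diffs, player_points):
    """Pearson correlation between each game's point differential
    and the points one player scored in that game."""
    if len(point_diffs) != len(player_points):
        raise ValueError("need one player point total per game")
    n = len(point_diffs)
    if n < 2:
        raise ValueError("need at least two games")

    mean_d = sum(point_diffs) / n
    mean_p = sum(player_points) / n

    # Co-movement of the two series, and the spread of each one.
    cov = sum((d - mean_d) * (p - mean_p)
              for d, p in zip(point_diffs, player_points))
    var_d = sum((d - mean_d) ** 2 for d in point_diffs)
    var_p = sum((p - mean_p) ** 2 for p in player_points)

    if var_d == 0 or var_p == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_d * var_p)


# Hypothetical data: team margin and one player's points in five games.
margins = [12, -5, 3, -26, 8]
points = [18, 9, 14, 6, 15]
print(round(importance_index(margins, points), 3))
```

A value near 1 would mean the player tends to score more in games the team wins big, a value near -1 the opposite, and a value near 0 means the player's scoring tells you little about the result.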
Before assessing the merits of it, let's look at the stat in action. We will look at the men's basketball team for my beloved Kansas State Wildcats. Before running the numbers, my first thought was that I would pick either Shane Southwell's or Thomas Gipson's stat line in order to predict the result. Let us see what the numbers said.
| M. Foster | T. Gipson | S. Southwell | N. Williams | DJ Johnson |
| O. Lawrence | W. Spradling | W. Iwundu | N. Johnson | J. Thomas |
Ok. So, that's a lot of numbers after the decimal. I like them, but I know they are obnoxious. Here is the list of the top three most important players according to this Importance Index:
- Shane Southwell (0.491)
- DJ Johnson (0.243)
- Omari Lawrence (0.102)
My intuition about Southwell seems to be correct, but not so much the one about Gipson. Jevon Thomas' numbers are skewed badly, because he has only played in 6 games so far and his highest-scoring game (9 points) was at Allen Fieldhouse, where the Cats lost by 26. I was a bit surprised that Foster's and Gipson's numbers came up negative. It might just be noise, or it might be that their importance to Wildcat victories is shown more by their other numbers, namely Gipson's rebounds. (I ran the index with just his rebounds; it came out to 0.299.)
It's a relatively simple statistic, which has its benefits and drawbacks. I think it captures the intuition behind that idea of picking one player's stat line in order to predict how a team did. What you are looking for is whose stat line best correlates with the team winning. This could easily be done with rebounds, assists, steals, or any other statistic under the sun. My thought is that there needs to be some way to combine all of a player's contributions without simply adding up points, rebounds, and assists. These might need to be weighted to some extent. I'm still working through this, but I think there is something here. I'd be interested in any input.
This whole discussion began in a conversation with Amar, my future co-host of the Sneaky Fast Sports Show. Look for us on Wildcat 91.9 through the spring semester. This theme will certainly come up on the show.
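To make the weighting idea a little more concrete, here is one rough sketch of what a combined version could look like: build a single weighted number out of each game's stat line, then correlate that with the point differential. Everything here is an assumption for illustration, including the helper names `pearson` and `composite_index`, the weights, and the stat lines. I have not settled on how the weights should actually be chosen.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs)
    sy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sx * sy)


def composite_index(point_diffs, game_lines, weights):
    """Correlate point differential with a weighted sum of a player's stats.

    game_lines is one dict per game, e.g. {"points": 18, "rebounds": 7}.
    weights maps each stat name to how much it should count.
    """
    composite = [sum(weights[stat] * line[stat] for stat in weights)
                 for line in game_lines]
    return pearson(point_diffs, composite)


# Hypothetical margins, stat lines, and weights; not real game data.
margins = [12, -5, 3, -26, 8]
lines = [
    {"points": 18, "rebounds": 7, "assists": 3},
    {"points": 9, "rebounds": 4, "assists": 1},
    {"points": 14, "rebounds": 9, "assists": 2},
    {"points": 6, "rebounds": 3, "assists": 0},
    {"points": 15, "rebounds": 6, "assists": 4},
]
weights = {"points": 1.0, "rebounds": 1.2, "assists": 1.5}
print(round(composite_index(margins, lines, weights), 3))
```

Setting every weight but one to zero gets you back the single-stat version, so the points-only index and the rebounds-only index are just special cases of this.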