Welcome back to the Making RMNB Last essay series. This time our prompt comes from Joe K., who I think did a wonderful job articulating, so I’ll just hand it over to him.
“You all put together wonderful statistical analyses which tell us way more than the standard NHL-provided goals/assists/+/-, etc and really help us look at players more insightfully than ever before (tho without CapGeek, maybe we’re all back to square 2). The only issue I have is how the analyses are typically constrained by even strength/5v5 TOI, and more importantly, that this is a not insignificant portion of overall TOI for a lot of players, or in some cases a very significant portion. Seems it could also be presumed that the more non-5v5 TOI a player has a game, the less likely their even strength stats tell the story of what their value might be to the team.
I’d like to see something that assessed, what, if any, stats are out there which might enhance the lens thru which we look at these players’ advanced stats and help flag which players’ 5v5 SA%/G% #’s might be more/less meaningful.
Finally, I realize the above might be the subject of a Doctoral dissertation and know that can’t happen, but even weaving the issue into these discussions more is something I see as a potential avenue to drive analyses in that direction and appropriately couch bigger picture judgment on players. Don’t worry about writing a specific article on this, just would find it interesting to see something alluding to this concept and informing the discussion at some point along the way. Thanks for the ear and opportunity to offer the thoughts. By all means if I just haven’t been reading you all enough and this path has been beaten, by all means accept my apologies and offer a link. Keep up the great work; you all do amazing content, and, whether or not anyone will ever recognize it, contribute so much to the Caps and NHL generally by offering anyone who follows the game so many different ways to look at/unpack what is the most exciting sport out there.”
Thank you, Joe. That’s a wonderful question, and you framed it like a thousand times better than Ian, who tries to troll me with this topic every few months.
Why do so many uses of statistics exclude everything but 5v5 even strength, and is that a flaw? Well, it is and it isn’t. A lot of our goof-ups regarding statistics occur when we ask them to do things they weren’t built for or when we fail to consider the context that informs objective measurements. To paraphrase Rob Vollman, stats should be the beginning of the conversation, not the end.
Plus/minus isn’t a bad stat. I know, I know, but follow me on this. Plus/minus isn’t a bad stat if what you’re looking for is the net number of goals scored while a player was on the ice during certain situations. The trouble occurs when we treat that stat as equivalent to a player’s impact or talent. In those cases, we’re using the wrong stat to describe the thing we actually care about, i.e. the quality of a player. The same thing happens with Corsi and other possession stats.
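To make "net goals while on the ice during certain situations" concrete, here is a minimal sketch. The `Goal` record and the `plus_minus` function are names I made up for illustration, the goal list is invented, and the rule that power-play goals don't count is my understanding of the NHL's definition rather than something lifted from an official data feed.

```python
from dataclasses import dataclass

@dataclass
class Goal:
    for_team: bool   # True if the player's own team scored
    strength: str    # scoring team's situation: "EV", "PP", or "SH"
    on_ice: bool     # True if our player was on the ice for the goal

def plus_minus(goals):
    """Net on-ice goals, skipping power-play goals (assumed rule)."""
    total = 0
    for g in goals:
        if not g.on_ice or g.strength == "PP":
            continue  # not on the ice, or a situation that doesn't count
        total += 1 if g.for_team else -1
    return total

# Invented events for a hypothetical player
goals = [
    Goal(for_team=True,  strength="EV", on_ice=True),   # counts as +1
    Goal(for_team=True,  strength="PP", on_ice=True),   # skipped: power-play goal
    Goal(for_team=False, strength="EV", on_ice=True),   # counts as -1
    Goal(for_team=False, strength="PP", on_ice=True),   # skipped: opponent's power-play goal
    Goal(for_team=True,  strength="SH", on_ice=True),   # counts as +1
    Goal(for_team=True,  strength="EV", on_ice=False),  # skipped: player on the bench
]

print(plus_minus(goals))  # 1
```

Nothing in that function knows who the player's linemates were, who the goalie was, or how lucky the bounces were. It counts exactly what it says it counts, which is the point.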
I’ll come back to the 5v5 thing, but let’s start by identifying what we want to measure and why, and how one stat might be good or bad at that task.
If you want the best possible statistic to understand the totality of goals a player has ever scored, you can’t do better than summing up all the goals that player has ever scored. There’s no better way to understand the volume of goals than to simply express that volume with a number. That’s obvious.
Except, wait. We should exclude shootout goals because they’re not from actual game time. And playoff goals and preseason goals and whatever the player scored while playing Juice Boy after practice.
We probably did that because those goals might “pollute” the number; we’d have a less pure understanding of the player if we counted preseason goals among his total goals.
Another way of thinking of it: the thing we wanted to measure wasn’t as simple as “all goals ever,” but rather “all goals within this sample that we find meaningful.” Some situations are more meaningful to certain questions we ask. We want our stats to be precise to the purpose of the question.
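Here is a small sketch of that idea: the same pile of goals gives different totals depending on which sample we decide is meaningful. The records, the field names (`game_type`, `shootout`), and the `count_goals` helper are all invented for illustration and don't reflect any real data source.

```python
# Invented goal records for a hypothetical player
goals = [
    {"game_type": "regular",   "shootout": False},
    {"game_type": "regular",   "shootout": True},
    {"game_type": "preseason", "shootout": False},
    {"game_type": "playoff",   "shootout": False},
    {"game_type": "regular",   "shootout": False},
]

def count_goals(goals, game_types=("regular",), include_shootout=False):
    """Count only the goals that fall inside the sample we care about."""
    return sum(
        1
        for g in goals
        if g["game_type"] in game_types
        and (include_shootout or not g["shootout"])
    )

print(len(goals))                                            # 5: "all goals ever"
print(count_goals(goals))                                    # 2: regular season, no shootouts
print(count_goals(goals, game_types=("regular", "playoff")))  # 3: add the playoffs back in
```

None of those three numbers is wrong. They answer three different questions.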
That’s kind of the same principle we use when we limit some samples to 5v5. 5v5 is “pure” hockey in a sense. It doesn’t have the amped-up goal rates of a power play or the constant opponent assault we see during the PK. If a penalty-killing specialist’s time were mingled with his 5v5 stats, he might look like a far worse player because he spends more time playing “harder” hockey.
So limiting stats (or at least some stats) to 5v5 allows us to mitigate that distortion. It’s not perfect though. If a certain player is protecting a lead for a disproportionate amount of his 5v5 ice time in a smaller sample, he’d likely look less awesome, taking fewer offensive risks and often sitting back.
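To put toy numbers on the penalty-killer distortion, here is a rough sketch. Every count below is invented for a hypothetical PK specialist, and `shot_attempt_pct` is just a name I picked for the usual "attempts for divided by all attempts" calculation.

```python
def shot_attempt_pct(attempts_for, attempts_against):
    """Share of all on-ice shot attempts that belonged to the player's team."""
    total = attempts_for + attempts_against
    if total == 0:
        return None  # no attempts either way, so there is no percentage
    return 100.0 * attempts_for / total

# Invented on-ice shot attempts for a hypothetical penalty-killing specialist
five_v_five = {"for": 520, "against": 480}
penalty_kill = {"for": 20, "against": 180}

pct_5v5 = shot_attempt_pct(five_v_five["for"], five_v_five["against"])
pct_mixed = shot_attempt_pct(
    five_v_five["for"] + penalty_kill["for"],
    five_v_five["against"] + penalty_kill["against"],
)

print(f"5v5 only:      {pct_5v5:.1f}%")    # 52.0%
print(f"5v5 + PK time: {pct_mixed:.1f}%")  # 45.0%
```

Same player, same games. Mixing in the shorthanded minutes drags him from above water to well below it, and that tells us about his deployment, not his ability.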
That doesn’t mean we should ignore non-5v5 ice time, although often we (read: I) do just out of convenience or laziness. It means we should consider the limitations and distortions of the sample as well as the construction and purpose of the stat itself.
When we limit stats to 5v5 we might get the benefit of less distortion inside our sample, but we’re also not seeing the totality of a player’s contributions. The best example is obvious: Alex Ovechkin drops from first in goals scored in all situations over the last five years to just sixth when you limit it to 5v5. He’s better than that.
So while we get a better and deeper understanding of what’s going on in those portions of games by considering them in isolation, we will also lack a broad understanding of the player until we look at every portion.
There’s more to it than this: it’s plausible that team effects are stronger during special teams than they are during 5v5, so it might be true that 5v5 gives us a better (but never perfect) indication of a player’s talent, contribution, or outcomes than all-situation stats would give us. It also could make sense that a player’s and team’s 5v5 stats will be more reliable in projecting future all-situation playoff success given the relatively small sample of special teams in the postseason.
But I’d like to end where we began: what stats are we using and why. Sometimes it feels to me like the metrics we use get interpreted as being absolute or perfectly descriptive when they’re not intended to be anything of the sort. My snapshot series is filled with comments by people who reject the idea of shot-attempt percentages while a player is on the ice during 5v5 in close games being a definitive representation of a player’s skill. To which I would say, yeah, me too.
Some stats have flaws in their construction (plus/minus). Some have sample biases (shot-attempt percentage). Some have widespread interpretation problems (hits). Some have epistemological shortcomings (any WAR-type stat). Some we use primarily because of the data sets that are available (shot attempts vs zone time). Some we use because they make more intuitive sense than others (percentages vs 60-minute rates). I think it’s healthy to countenance all that stuff when sifting through the numbers, though I admit it can be tedious and I don’t often do it correctly.
There’s a movement in hockey stats to express players with a single stat. It’s a fascinating and worthwhile arm of research, but if I somehow had control over hockey stats pedagogy, I’d also encourage the opposite. The more numbers we have to understand a player in different situations, the more our minds get comfortable with having lots of nuance and context and mitigations in our analysis.
Put another way: numbers may be absolute, but our interpretation of them needn’t be.
Thanks for the donation and thanks for writing.
Russian Machine Never Breaks is not associated with the Washington Capitals; Monumental Sports, the NHL, or its properties. Not even a little bit.
All original content on russianmachineneverbreaks.com is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)– unless otherwise stated or superseded by another license. You are free to share, copy, and remix this content so long as it is attributed, done for noncommercial purposes, and done so under a license similar to this one.