RIP.
It’s the twilight of the nerds. Along with Cam Charron, Darryl Metcalf (creator of Extra Skater) has been hired to join the nascent analytics department in Toronto. That’s terrific news for them personally and for Leafs fans in general, but it means a massive brain drain for NHL fans.
I expect ExtraSkater.com to be gone for good. Metcalf likely won’t have time to devote to it, and it might represent a conflict of interest to work on it. With that hiring, the Maple Leafs have taken away the single best resource for hockey fans and geeks and writers and coaches. Plus the other 29 GMs.
We can’t allow that. We need to replace Extra Skater.
If you’re not aware, ExtraSkater.com launched last fall as an all-purpose statistical resource for the NHL. The site scraped the NHL’s official data and presented it in a much more useful and convenient way. I was a huge fan and used it in nearly every article I’ve written on RMNB in the last year.
Point of order: the shutdown of Extra Skater, it turns out, seems to have had nothing to do with the NHL updating its websites’ terms of service last week. Here’s what the TOS say now:
For example, you may not:
[…]Engage in unauthorized spidering, scraping, or harvesting of content or information, or use any other unauthorized automated means to compile information;
As Jen LC said (btw, follow her on Twitter; she’s brilliant), the new NHL.com TOS are kind of vague and seem to be targeted more towards pirated video feeds and user-data harvesting rather than stat geeks.
Plus, Elliotte Friedman’s take on the Extra Skater shutdown explicitly said the TOS change wasn’t a factor:
https://twitter.com/FriedgeHNIC/status/500350935085092865
And Greg Wyshynski at Puck Daddy seemed to echo Jen’s impression in speaking with league representatives:
The League claims that’s not the case [that they’re opposing stat sites]. They know sites are harvesting data from NHL.com with software; they say the provision was “not added to counter anything that we know of at this moment.”
So it’s clear that a stat site like Extra Skater is not naughty. I’d also propose that it’s essential. It’s Metcalf’s choice if he wants to remove his site, but hockey’s intellectual commons need a resource like it. It creates deeper understanding of the game, it helps writers fill column inches, and it’s vital to comprehensive arguments for staffing changes.
I think a replacement for Extra Skater is needed, possible, and imperative. So let’s do it. Let’s start the conversation right here.
Is this a good idea?
Yes.
Although there may be other entrants into the space, I think we can create something novel and explicitly public. I think the technology (discussed below) will make the job easier. I think Darryl’s site (still available on the Wayback Machine!) is a great model to build on. I think the appetite and passion to create it are there.
It’s a good project that can be done well.
What’s the pitch?
We combine a ton of existing technology to create the Son of Extra Skater (please don’t suggest names in the comments– that should come WAY later).
We’ve already got a lot of advantages:
- The lessons of Extra Skater– both UI and data
- Platforms for interaction and visualization like Twitter Bootstrap and Google Charts API
- Existing scripts for reading NHL play-by-play and time-on-ice data
- Either volunteer labor or crowdfunding
- A centralized source control solution like GitHub
- Cheap cloud hosting like AWS or Azure
Finally, and this is the important part: this would be open source and under a Creative Commons license. The source code for this new site would be hosted on a site like GitHub so that the community won’t ever lose the accumulated knowledge of the project.
The eventual site would likely sell ads to cover hosting and maintenance.
In addition to supplying similar functionality to ExtraSkater.com (RIP), the new site might also link to torrents of the actual data tables for use in applications like Tableau and R. We could increase the level of peer review in the industry by standardizing and becoming more transparent.
Who’s gonna do this?
Me, for starters. I’ve got 15 years of experience building complex systems for thousands of users. I can write requirements, specifications, and project plans. I know the technology and I can make the prototypes. I can organize this thing, but I’m not gonna write the final code.
I’ve already heard from a dozen developers today who have offered to help out, but we’d need a team. We’d need a DBA for the back-end and the data scraping. We’d need a front-end UI developer– maybe two. We’d need an IT person to get the server up. More on this below.
How should we fund it?
It depends on how we build it. I can imagine three options:
- Crowdfund it. Then we’d write specs, make a prototype, build a development team (likely offshore), and make the thing.
- Volunteer only. In this case, the start-up costs would probably be smaller.
- Hybrid. Raise funds for core functionality and then allow volunteer developers to add to the code base via GitHub.
How much will it cost?
Depends on how we build it. If we were to hire offshore developers, I’d estimate less than $10,000.
If we were to pay everyone for their work, maybe as high as $22,000.
If someone has a big piece of the puzzle already built (e.g. the back end), then around $15k.
If everyone works for free, it’ll just cost our operating fees.
Those are wild guesses. We’ll find out when we pick a development plan and lock down requirements.
What technology should we use?
I’m open to suggestions, but I think the best bet would be as follows:
- PHP server-side programming language
- MySQL database
- Twitter Bootstrap for user interface
- Google Charts API for data visualization
- GitHub for source control and project management
- Amazon Web Services or Azure for hosting
Plus a whole lot more. I know Darryl used most of that tech for ExtraSkater.com. I’m familiar with all of it, and based on what I’ve seen most of the volunteers are too.
And it’d be all open?
Yes. The source code would be free and open source, available on GitHub for anyone to download, use, or copy. It would still be around even if someone dies or gets hired by Edmonton (same thing). I would not own it, and RMNB would not own it. It’s for the people.
(The site would be ad-supported– that’d help us pay for hosting and maintenance.)
Of course, anyone could download the source code and spin off their own site whenever they feel like it.
I’m in. What’s next?
Well, first we gotta build consensus. Does anyone think I’m full of it? Is there something I’m missing? Are we okay with building something and then giving it away? Do we wanna build it ourselves or just raise money and then I’ll hire some developers to do it? Does anyone have any existing technology we can build off? Is anyone planning a similar project? Should we join forces with them? Does Darryl just wanna put his source code up on Git right now and we’ll forget this ever happened?
Once we’ve made those decisions, here’s how I’d see us proceeding:
- Write requirements. They’d be public and transparent, perhaps authored by me, with as many collaborators and commenters as are interested.
- Develop a prototype. Make a clickable model of the interface to validate our requirements and provide a resource for developers.
- Develop back-end functionality. This is the stuff users don’t see. We’d use existing scripts to get and organize NHL data, then store it in a useful schema, and then optimize it to be used by the UI. This is maybe the hardest part.
- Develop the front-end functionality. This is the actual interface that everyone uses. Like ExtraSkater.com, it’d rely heavily on functionality provided by the Bootstrap platform.
- Iteratively develop additional functionality. Bells and whistles.
- Test and launch! Kegger at my house afterwards.
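To make the back-end step a little more concrete, here’s a rough sketch of the "store it in a useful schema, then optimize it for the UI" idea. It’s only an illustration: it uses Python and SQLite (standing in for the PHP and MySQL stack proposed above, so it runs anywhere without a server), and the table name, column names, and sample events are all made up for this example. None of it comes from Extra Skater or from NHL.com’s actual data format.

```python
import sqlite3

# Hypothetical sample rows standing in for already-parsed play-by-play events.
# Columns: (game_id, period, seconds_elapsed, team, event_type)
SAMPLE_EVENTS = [
    (1, 1, 35, "WSH", "SHOT"),
    (1, 1, 80, "TOR", "SHOT"),
    (1, 1, 140, "WSH", "GOAL"),
    (1, 2, 15, "WSH", "SHOT"),
    (1, 2, 300, "TOR", "GOAL"),
]


def build_database(events):
    """Store raw events in one table and add an index for UI lookups."""
    conn = sqlite3.connect(":memory:")  # in-memory DB, nothing written to disk
    conn.execute(
        """
        CREATE TABLE events (
            game_id         INTEGER NOT NULL,
            period          INTEGER NOT NULL,
            seconds_elapsed INTEGER NOT NULL,
            team            TEXT    NOT NULL,
            event_type      TEXT    NOT NULL
        )
        """
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", events)
    # The "optimize" part: index the columns the front end would filter on.
    conn.execute("CREATE INDEX idx_events_game_team ON events (game_id, team)")
    conn.commit()
    return conn


def team_summary(conn, game_id):
    """Return {team: {event_type: count}} for a single game."""
    rows = conn.execute(
        "SELECT team, event_type, COUNT(*) FROM events "
        "WHERE game_id = ? GROUP BY team, event_type",
        (game_id,),
    )
    summary = {}
    for team, event_type, count in rows:
        summary.setdefault(team, {})[event_type] = count
    return summary


if __name__ == "__main__":
    conn = build_database(SAMPLE_EVENTS)
    print(team_summary(conn, 1))
```

The real schema would need a lot more (players, shifts, game state, and so on), and that’s exactly the kind of thing the public requirements should settle.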
For now, I want to open up the conversation. Let’s use the comments below to exchange ideas.
If you’re interested in volunteering labor, tell us who you are and what you know. I’ll build a spreadsheet of everyone’s contact info and expertise.
If you’re interested in donating, let us know. It’ll help us gauge interest.
We can do this.
