In preparation for the Scottish independence referendum, Josh Boswell and I were tasked with preparing the usual visualisations: a “balance of power” graphic for the front page, and the customary results map. We had worked together on a similar project during the local elections earlier this year, and we tried to build upon what we’d learned.
Before I get to what we’ve learned from the process, note that we’ve open-sourced the Node.js server code used to grab XML documents from the Press Association FTP server. Get it at github.com/times/election-server.
Lesson 1 — Websockets are tricky
We had a beautiful bit of functionality provided via Socket.io for updating readers when new data became available. A happy little message would pop up saying “Updating results”, CSS animations happened, and it was gorgeous stuff all around.
Alas, 2000+ clients connecting to the server simultaneously made it fall over pretty badly: I’d run into “EMFILE” (too many open files) errors within seconds of the server starting, even on an m3.large instance. This resulted in me disabling websockets at 4:00 a.m. on election night. Thankfully, the only downside to this (beyond having to refactor highly visible live code at four in the morning) was that users had to refresh to get the updated map; see the bit below about S3.
Next time around, I’ll probably have a separate, load-balanced Elastic Beanstalk environment specifically for this purpose, with the main server running by itself and proxying notifications to clients via the Elastic Beanstalk instances. Websockets are bananas; I now know why people use services like Pusher.
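For what it’s worth, a low-tech alternative to websockets is to have the page poll the static results file and only react when its contents change. The snippet below is a rough sketch of that idea, not something that ran on election night: the function names, the URL in the usage comment and the 30-second interval are all made up for illustration.

```javascript
// Returns a function that reports whether the value it is given differs
// from the value it was given last time. The first call always returns
// false, because the initial page load is not an "update".
function createChangeDetector() {
  let lastSeen = null;
  return function hasChanged(value) {
    const changed = lastSeen !== null && value !== lastSeen;
    lastSeen = value;
    return changed;
  };
}

// Polls `url` every `intervalMs` milliseconds and calls `onUpdate` with the
// new body whenever it changes. `fetchImpl` defaults to the global fetch.
// Returns a function that stops the polling.
function pollForUpdates(url, intervalMs, onUpdate, fetchImpl = fetch) {
  const hasChanged = createChangeDetector();
  const timer = setInterval(async () => {
    try {
      const res = await fetchImpl(url, { cache: "no-store" });
      if (!res.ok) return;
      const body = await res.text();
      if (hasChanged(body)) onUpdate(body);
    } catch (err) {
      // Network hiccup: ignore it and try again on the next tick.
    }
  }, intervalMs);
  return () => clearInterval(timer);
}

// Hypothetical usage in the browser:
// const stop = pollForUpdates("https://example.com/results.json", 30000,
//   (body) => redrawMap(JSON.parse(body)));
```

Polling is less snappy than a pushed notification, but every request goes to static storage rather than to a process that has to hold one open socket per reader.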
Lesson 2 — PhantomJS is rad
Another cool thing that was awesome in development but a slight headache in production was my bit of code that screencapped our map and pushed a PNG of it to S3, to be consumed on the front page. This made our map thumbnail update dynamically over the course of the evening.
However, in trying to fix the websocket issue on election night, I also ended up doing a quick refactor so that the screencap process would run independently on a Heroku instance instead of bogging down the main server, because spawning PhantomJS isn’t exactly the lightest thing ever.
That said, it was really fun working with it and I plan to do some more stuff with it eventually. Scaling issues aside, I think the auto map screencap script is one of the parts of the server code I’m most proud of.
Lesson 3 — S3 is the best
We heart S3 at The Times because it will stay up no matter how much traffic is thrown at it. Because of this, the election server pushes JSON and XML files from its temporary local FTP folder to S3, where the frontend code grabs them. This totally saved us when the server was melting from too much traffic. S3 is your friend if you’re going to deal with a lot of traffic.
(An aside: Do I have to put “melting” in floating quotes if it’s an instance in the cloud? I’m sure whichever server cluster hosted my virtual machine was, uh, running slightly warmer than normal…)
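To make the push-to-S3 step a bit more concrete, here is a minimal sketch of how a folder of JSON and XML files could be shipped off to a bucket. It is not lifted from the election-server repo: planUploads, pushFolder, the key prefix and the putObject callback are hypothetical names, and putObject stands in for whichever S3 client you happen to use.

```javascript
const fs = require("fs");
const path = require("path");

// Content types for the two kinds of file the server pushes.
const CONTENT_TYPES = {
  ".json": "application/json",
  ".xml": "application/xml",
};

// Lists the JSON and XML files in `dir`, along with the key and content
// type each one would get in the bucket. Other files are skipped.
function planUploads(dir, prefix) {
  return fs
    .readdirSync(dir)
    .filter((name) => CONTENT_TYPES[path.extname(name).toLowerCase()])
    .sort()
    .map((name) => ({
      file: path.join(dir, name),
      key: prefix + name,
      contentType: CONTENT_TYPES[path.extname(name).toLowerCase()],
    }));
}

// Pushes every planned file through `putObject`, a caller-supplied async
// function (an assumption of this sketch) that wraps the real S3 client.
// Resolves to the list of keys that were uploaded.
async function pushFolder(dir, prefix, putObject) {
  const uploads = planUploads(dir, prefix);
  for (const upload of uploads) {
    await putObject({
      Key: upload.key,
      Body: fs.readFileSync(upload.file),
      ContentType: upload.contentType,
    });
  }
  return uploads.map((upload) => upload.key);
}
```

Keeping the planning step separate from the upload itself means the file selection can be checked without touching the network.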
Lesson 4 — Eastings/Northings are stupid
Seriously, who still uses these? Apparently, the cartographers in the Scottish census bureau who made the country’s shapefiles. It took me and Josh an entire evening to convert to GeoJSON, change the projection and then simplify the >200 MB file. I’m working on a bit of Node.js to make this a two-minute process, but in the meantime, perliedman/reproject might just possibly be the best thing ever for this task.
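For the curious, the “change the projection” step boils down to walking every coordinate in the GeoJSON and running it through a transform. The sketch below shows only that walk. It is not how reproject is implemented, the function names are invented, and the toKm transform is a placeholder that just rescales metres to kilometres; a real easting/northing to longitude/latitude conversion would come from a projection library.

```javascript
// Applies `transform` to every [x, y] position in a GeoJSON coordinates
// array, however deeply it is nested (Point, LineString, Polygon, Multi*).
function mapCoords(coords, transform) {
  if (typeof coords[0] === "number") return transform(coords);
  return coords.map((inner) => mapCoords(inner, transform));
}

// Returns a copy of `geometry` with every position transformed.
function reprojectGeometry(geometry, transform) {
  if (!geometry) return geometry; // features may have a null geometry
  if (geometry.type === "GeometryCollection") {
    return {
      ...geometry,
      geometries: geometry.geometries.map((g) => reprojectGeometry(g, transform)),
    };
  }
  return { ...geometry, coordinates: mapCoords(geometry.coordinates, transform) };
}

// Returns a copy of a FeatureCollection with every feature reprojected.
function reprojectFeatureCollection(collection, transform) {
  return {
    ...collection,
    features: collection.features.map((feature) => ({
      ...feature,
      geometry: reprojectGeometry(feature.geometry, transform),
    })),
  };
}

// Placeholder transform, NOT a real projection: metres to kilometres.
const toKm = ([easting, northing]) => [easting / 1000, northing / 1000];
```

The input is left untouched and a new object comes back, which makes it easy to compare the before and after while fiddling with projections.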