Capacity Planning for LAMP, or more Flickr innards

From the “I wish I attended this talk” department. When I attended Dathan’s talk about Federation at Flickr, he mentioned that the next day there would be a really interesting talk by John Allspaw, who is an Engineering Manager at Flickr. John has posted his slides, which, I might add, are quite an interesting read (look at the speaker notes; they give you a sense of what you missed).

It’s a pity no one posted notes on this talk on Planet MySQL, so if you attended and took notes, please do put them online! This talk isn’t so much about teaching you capacity planning; it’s really more about the pointers you can take away on monitoring, graphing statistics, the fun of deployment, and of course some fun Flickr statistics (on slides 6 and 18).

Pictures are fun! Capacity is not the same as speed (and it doesn’t mean performance). He mentions that you probably don’t want to read up on queuing theory (and should probably forget about benchmarks), because it’s mostly irrelevant to the real world. Testing in production is good, so don’t be afraid (I guess this is why Flickr has built-in notifications now).

Tools of interest: Ganglia (for pretty graphs), rrdtool, memcached, and GraphClick (OS X only, and I wonder why they only get MRTG information as graphs, with no raw data, at Yahoo!?). For deployment they use SystemImager (oh, I remember using this; it gets “fun” when you try to image dual-boot machines, but that’s another story), and Subcon (interesting, I wonder how this compares to Slack).

One Comment

  1. Norby says:

    Re: MRTG raw data, it’s all managed and hosted by a completely separate team (network operations), with servers on other networks, etc. etc. Very easy to get at the graphs, not as straightforward to get at the raw data. Possible, but the MRTG graphs (let alone the RAW data) aren’t typically consumed by most groups @ Y! unlike Flickr, so it hasn’t had the same level of exposure/visibility.

    Plus with graphclick you can do quick one-offs that don’t involve hunting down and parsing mrtg or rrdtool output. Probably the right way is to make that data more easily exposed and processable.