Archive for April 2007

Capacity Planning For LAMP or more Flickr-innards

From the “I wish I attended this talk” department. When I attended Dathan’s talk about Federation at Flickr, he mentioned that the next day there was going to be a really interesting talk by John Allspaw, who is an Engineering Manager at Flickr. John has posted his slides, which, I might add, are quite an interesting read (look at the speaker notes; they give you a sense of what you missed).

It’s a pity no one posted notes on this to Planet MySQL, so if you attended the talk and took notes, please do place them online! This talk isn’t so much about teaching you capacity planning; it’s really more about the pointers you can take away on monitoring, graphing statistics, the fun of deployment, and of course some fun Flickr statistics (on slides 6 and 18).

Pictures are fun! Capacity is not the same as speed (and it doesn’t mean performance). He mentions that you probably don’t want to read up about queuing theory (and should probably forget about benchmarks), because it’s mostly irrelevant to the real world. Testing in production is good, so don’t be afraid (I guess this is why Flickr has built-in notifications now).

Tools of interest: Ganglia, for pretty graphs, rrdtool, memcached, GraphClick (OS X only, and I wonder why they only get MRTG information in graphs, with no raw data at Yahoo!?). For deployment they use SystemImager (oh, I remember using this – it gets “fun” when you try to image dual-boot machines, but that’s another story), and Subcon (interesting, I wonder how this compares to Slack).


Digg.com scales; Japanese Character Set; Data Warehousing

I missed a couple of talks that I’d really have liked to attend, for various reasons (mostly the fact that at the MySQL conferences, staff also have a tonne of meetings and customers/people to meet). Thanks to the great bloggers, I don’t feel so bad for missing such talks. And for the ones with no blogs and notes, well, I’ll just hope that sometime in the future there will be video-recorded sessions, available on the same day on the Internet, in OGG (as was done at linux.conf.au 2007).

Eric Lai has an interesting article in ComputerWorld titled: How Digg.com uses the LAMP stack to scale upward. And Mike Kruckenberg had some of his notes, which I think are also useful. Considering Digg.com doesn’t really publish information about their technology, or anything for that matter, it’s interesting to see how they’re scaling and running “the modern Slashdot”. Eli White also has the slides he used, at his website for his Technology @ Digg talk (yay! Only in OpenOffice.org, so go download it). He also has a blog that’s worth following.

Sheeri Kritzer has some great information about the Japanese Character Set. Concise notes, it’s like I didn’t miss the talk. Will prove useful, as I plan on going to the MySQL Conf in Tokyo, right before our developer meeting this year.

Brian gives great tips and tricks on data warehousing, and I was a little upset I didn’t attend his talk. Again, Sheeri comes to the rescue: Data Warehousing Tips & Tricks.

Spoke to Eric Bergen a little while ago after drinks, and he was wondering how come I didn’t post about his tutorial with Jeremy Cole, on MySQL Scaling & High Availability Architectures. I do have a tonne of notes, that I’ve got to actually type up (yay, I have work to do on the plane ride home), but it’ll probably be useful to actually get the tutorial notes. My own notes will come later…


MySQL at Google

MySQL: The Real Grid Database, Mark Callaghan, Chip Turner

A tremendous amount of work was also done by Wei Li and Gene Pang.

Google has a large MySQL deployment, and they enhance it as needed.

MySQL@Google: too many queries, transactions, data, and rapid growth. Real workload with OLTP and reporting. Workload at Google is *critical*.

The well known solution is to deploy a “grid database”:

  • use many replicas to scale read performance
  • shard your data over many masters to scale write performance (horizontal partitioning of data)
  • sharding is easy, resharding is hard
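
To make the replicas-plus-shards idea concrete, here is a minimal sketch of request routing. It is not Google's code: the host names, the three-shard topology and the choice of a simple hash-modulo scheme are all assumptions made for illustration. The last function hints at why resharding is the hard part: with plain modulo hashing, changing the shard count relocates most keys.

```python
import hashlib

# Hypothetical topology: shard id -> one master and its read replicas.
SHARDS = {
    0: {"master": "db-m0", "replicas": ["db-r0a", "db-r0b"]},
    1: {"master": "db-m1", "replicas": ["db-r1a", "db-r1b"]},
    2: {"master": "db-m2", "replicas": ["db-r2a", "db-r2b"]},
}


def shard_for(user_id: int, shard_count: int = len(SHARDS)) -> int:
    """Map a key to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.md5(str(user_id).encode("ascii")).hexdigest()
    return int(digest, 16) % shard_count


def pick_server(user_id: int, is_write: bool, attempt: int = 0) -> str:
    """Writes go to the shard's master; reads rotate over its replicas."""
    shard = SHARDS[shard_for(user_id)]
    if is_write:
        return shard["master"]
    replicas = shard["replicas"]
    return replicas[attempt % len(replicas)]


def moved_keys(keys, old_count: int, new_count: int) -> list:
    """Keys whose shard changes when the shard count changes (resharding)."""
    return [k for k in keys if shard_for(k, old_count) != shard_for(k, new_count)]
```

Calling moved_keys(range(1000), 3, 4) shows how many of a thousand keys would have to be copied to a different master just to add one shard under this naive scheme.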

Large number of small servers, not much capacity lost when a server fails, support as many servers as possible with a few DBAs.

Manageability is important at Google – make all tasks scriptable. Gives you time to solve more interesting problems, and also support hundreds of servers with one DBA.

Google prefers under-utilizing servers – better to have 3 servers at 50%, rather than 2 servers at 75%. Less maintenance, less tuning, load spikes tolerated better.
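
The arithmetic behind that preference is worth spelling out. The sketch below assumes identical servers and a total workload that is spread evenly over whatever servers survive; both are simplifications of mine, not something stated in the talk.

```python
def load_after_failure(servers: int, utilization: float, failed: int = 1) -> float:
    """Per-server utilization once `failed` servers drop out and the
    survivors absorb the same total work evenly."""
    survivors = servers - failed
    if survivors <= 0:
        raise ValueError("no servers left to carry the load")
    total_work = servers * utilization
    return total_work / survivors


print(load_after_failure(3, 0.50))  # 0.75
print(load_after_failure(2, 0.75))  # 1.5
```

Both setups carry the same total work, but after losing one machine the three-server setup runs at 75% while the two-server setup would need 150% of a single server.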

Monitor everything you can: vmstat, iostat, MySQL error logs, /var/log/messages, SHOW STATUS & SHOW PROCESSLIST output, etc. Archive it for as long as you can. And automate all this as much as possible. Make it possible to query and visualize data from the archive. There are tools out there, and Google has some internal ones that they use.

They tend to not store the logs in a database. It’s more efficient to bzip it or something (I’m thinking the ARCHIVE storage engine might be appropriate for them, possibly).
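
A minimal sketch of the "collect it, bzip it, keep it" approach, using only the Python standard library. The file naming, the "###" record separator and the directory layout are my own assumptions; Google's internal tooling was not shown.

```python
import bz2
import subprocess
import time
from pathlib import Path


def archive_sample(name: str, text: str, archive_dir: Path) -> Path:
    """Append one timestamped sample to a bzip2-compressed, per-day log."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    now = time.gmtime()
    path = archive_dir / f"{name}-{time.strftime('%Y%m%d', now)}.log.bz2"
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", now)
    # Appending in "at" mode adds a new bzip2 stream to the file; the bz2
    # module reads multi-stream files back as one continuous text.
    with bz2.open(path, "at", encoding="utf-8") as fh:
        fh.write(f"### {stamp}\n{text.rstrip()}\n")
    return path


def capture(command: list[str]) -> str:
    """Run a monitoring command once and return its stdout."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

From cron, something like archive_sample("vmstat", capture(["vmstat", "1", "2"]), Path("/var/tmp/monitoring")) would do (the path is a placeholder), and bz2.open(path, "rt") reads the archive back for querying or graphing.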

Many fast queries can be as much of a problem as one slow query.
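
A quick back-of-the-envelope check of that claim, with made-up numbers: what matters is the total server time a query pattern consumes, which is call count times time per call.

```python
def total_busy_seconds(calls: int, seconds_per_call: float) -> float:
    """Server time consumed by one query pattern over a sampling window."""
    return calls * seconds_per_call


# Hypothetical workload observed over the same window.
workload = {
    "slow report": total_busy_seconds(1, 30.0),         # one 30 s query
    "fast lookup": total_busy_seconds(200_000, 0.002),  # many 2 ms queries
}
worst = max(workload, key=workload.get)
print(worst)  # fast lookup
```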

mypgrep is an open source tool that Google has released.

Changed MySQL to count activity per account, table, and index:

  • SHOW USER_STATISTICS – Displays per account activity
  • SHOW TABLE_STATISTICS – each table, number of rows fetched/changed
  • SHOW INDEX_STATISTICS – rows fetched per index, find indexes that were never used
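
As an illustration of the "find indexes that were never used" idea, here is a small sketch. The data shapes are hypothetical: I am assuming you have the list of defined indexes from the schema and a mapping of index name to rows fetched built from the SHOW INDEX_STATISTICS output, whose exact columns I did not note down.

```python
def unused_indexes(all_indexes, index_stats) -> list:
    """Indexes that are defined but never fetched a row.

    all_indexes: iterable of "table.index" names taken from the schema.
    index_stats: mapping of "table.index" -> rows fetched (hypothetical shape).
    An index missing from the statistics counts as unused.
    """
    return sorted(name for name in all_indexes if index_stats.get(name, 0) == 0)


# Made-up example data.
defined = ["users.PRIMARY", "users.idx_email", "photos.PRIMARY", "photos.idx_old"]
stats = {"users.PRIMARY": 120_000, "users.idx_email": 4_500, "photos.PRIMARY": 98_000}
print(unused_indexes(defined, stats))  # ['photos.idx_old']
```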

MySQL High Availability has an even brighter future – with DRBD, for instance. Cluster and Replication rock; however, they need features right now. Committed to InnoDB and MySQL Replication.

Zero transaction loss on failures of a master. Minimal downtime on failures of a master (if they can get downtime within 1-2 minutes, they’ll be happy). Reasonable cost in performance (added latency for a single workload), and dollars.

Readers and writers don’t lock each other.

Failure happens everywhere: OS (kernel OOM or panic), still running a lot of 32-bit servers, mysqld crashes due to code they themselves write, corrupted write (so InnoDB checksums rock), file system becomes inconsistent after an unplanned reboot (they use ext2), bad RAM, people (! rebooting by mistake :) ).

Features they want to see in MySQL: synchronous replication as an option, product that watches a master and initiates a failover, archives of the master’s binlog stored elsewhere (in case the master becomes unreachable), state stored in the filesystem to be consistent after a crash (a more modern FS besides ext2 will make this better).

And everyone’s been pretty much talking about how Google is contributing to MySQL, with some interesting twists on why they’re contributing back and open sourcing their code: google-mysql-tools.


Extreme Makeover: Database or MySQL@YouTube

Arguably one of the most interesting keynotes (and technical to boot!), Paul Tuckfield not only entertained us in his 40 minute keynote, he also did so outside when the keynotes ended.

Just the DBA at PayPal, just the DBA at YouTube. Only 3 DBAs at YouTube that make it all happen. Only a MySQLer for ~8 months (Oracle for ~15 years). So I guess PayPal is an Oracle shop.

MySQL is one (important) piece of the scalability picture.

Technologies: Python, Memcache, MySQL replication. Praises Python, a lot (it’s much quicker than C++ for implementing goodness).

Click tracking on a separate MyISAM site, but read/write on InnoDB, using replication. Far more reads than writes at YouTube.

4x 2 GHz Opteron cores, 16 GB RAM, 12x 10k RPM SCSI – constantly crashing, replication saved them.

5.0 “mystery cache hits” – when you export and import (mysqldump and load back into 5.0), you get a bigger performance boost than if you upgrade in place, because of the compact row format. They moved from 4.1 -> 5.0.

Cache is king. Writes should be cached by the RAID controller rather than the OS. Only the DB should cache reads (not the RAID controller, not the Linux buffer cache).

Software striping atop hardware array.

The oracle caching algorithm – in academia. Not something I’ve heard much about, and definitely need to look into it further.
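
My best guess, and it is only a guess, is that "oracle" here means the clairvoyant replacement policy from the caching literature (often called Belady's MIN or OPT): on a miss, evict the page whose next use lies furthest in the future. It needs knowledge of future requests, so it is normally just a yardstick for real policies. The sketch below compares it with LRU on a made-up request sequence; nothing in it comes from the keynote itself.

```python
from collections import OrderedDict


def clairvoyant_misses(requests, capacity: int) -> int:
    """Misses when eviction uses full knowledge of future requests."""
    cache = set()
    misses = 0
    for i, page in enumerate(requests):
        if page in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(p):
                for j in range(i + 1, len(requests)):
                    if requests[j] == p:
                        return j
                return float("inf")  # never requested again: ideal victim
            cache.remove(max(cache, key=next_use))
        cache.add(page)
    return misses


def lru_misses(requests, capacity: int) -> int:
    """Misses for least-recently-used eviction, for comparison."""
    cache = OrderedDict()
    misses = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
            continue
        misses += 1
        if len(cache) >= capacity:
            cache.popitem(last=False)  # drop the least recently used page
        cache[page] = True
    return misses


sequence = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(clairvoyant_misses(sequence, 3), lru_misses(sequence, 3))  # 7 10
```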

The talk was too long for a keynote, but it would make a most interesting read, and would work better as an actual presentation. I hope his presentation makes it online sometime soon.

Note-to-entrepreneurs: If you’re building a web business and you want to be acquired by Google, it’s quite possible that their due diligence includes “python” compatibility. Most of their released tools are Python-based or Python-related. Oh, and make sure you use commodity hardware (in fact, do that if you want to get VC funded, even).

Update: A little note on the oracle algorithm. If anyone has papers, and more credible links, please do drop me a line.


Lightning talks with Community Contributors

I think this was a really interesting talk (because of all the contributors talking), and my only minor complaint was that it was up against some really good talks, so we didn’t get more people showing up to a talk that was largely about the great Architecture of Participation. It is also interesting because it goes to show that blogging can get you good rewards – almost everyone listed below is a somewhat active blogger.

Martin Friebe – bug reports, patches
Why? It’s just cool to contribute. Improves your knowledge. MySQL rewards you (named on the website, Enterprise, etc.).
How? Write code. Look for limitations. Just use MySQL.

Peter Zaitsev
Hates submitting bugs, but he needs a bug free MySQL for himself and customers. Therefore, report them, and scream loud!
Be an early adopter.
Regular hardware, for storage engine benchmarks. Patches, and other cool bits for MySQL.

Sheeri Kritzer – blogger, user group meetings, podcasts
Bugs, but contributing is not only technical. “Just do it” (in terms of user groups)
You set your own deadlines, and you look like a hero when you’re a community member, as opposed to it being your job.
Don’t overcommit: back out earlier, rather than later
“chronic volunteer”

Paul McCullagh – PBXT storage engine
While testing PBXT, he found a few bugs, and that’s how he became a Quality Contributor. He didn’t get such a status by writing PBXT. I do think that’s wrong, and maybe MySQL needs to drop Quality, and just have it as the Contributor program?

Baron Schwartz – innotop, blogger
MySQL is not perfect, and he misses a lot of Microsoft SQL Server’s tools. His motto is “don’t complain, do something about it.” And the opportunity is obvious.
innotop started as an InnoDB transaction monitor, sort of like mytop for InnoDB.
Next, MySQL Toolkit.

Beat Vontobel – blogger
User since 3.23, active since 5.0-alpha – lots of new features to blog about and a lot of bugs to post about. Surprised that most of his bugs got fixed very quickly. Blogging as a means of sharing knowledge.
Advises becoming a customer: bug reports are free, but if a bug hits the internal bug database, you’re set for it getting fixed quicker.

Yoshiaki Tajika – NEC Japan, MySQL Customer Support
3 years ago, NEC began to support MySQL.
He likes the bug reporting system, as compared to the Microsoft SQL system – bug reports can be posted anytime, without any cost, and you can talk with the developers directly.

Mike Kruckenberg
Find things that are interesting, write about it, report it, change it.

Jeremy Cole – bug reports, patches, blogging
SHOW PROFILE in 5.0.37! DorsalSource. Builds of MySQL with patches and other interesting stuff. Go to website, upload patch, and you get builds on many different architectures.

Bill Karwin – prolific forum poster!
SHA2() patch – he commented that the federal government wanted SHA2 support in all government applications. Then he felt bad, so he wrote the feature! It passes the tests, and this is how a feature got enabled.

Works at Zend Technologies, doing Zend Framework supporting MySQL. Writes articles on Forge.

Ask Bjorn Hansen
Used MySQL since around ’96-1997. Started with mSQL first! Everything that’s paid his bills for ~10 years has been related to MySQL!
Read the *excellent* documentation (a few times)!
“File a bug a week” goal – this is way too easy. Install a new Linux distribution, read the documentation and file away!
3 underrated MySQL features: Standards!
1. Timezone support (save all your date/datetime columns in UTC)
2. Unicode support (he wants an application that he can place his name in the right way!). Tmp table memory requirements go up, but it’s OK…
3. Use strict mode! (STRICT_TRANS_TABLES and if brave use STRICT_ALL_TABLES)
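
For the first tip (keep date/datetime columns in UTC), here is a minimal application-side sketch. The fixed +10:00 offset stands in for a viewer's time zone and ignores daylight saving; the function names are mine, not from the talk.

```python
from datetime import datetime, timedelta, timezone


def to_utc_for_storage(local_dt: datetime) -> datetime:
    """Normalize an offset-aware datetime to UTC before writing it to a column."""
    if local_dt.tzinfo is None:
        raise ValueError("refusing to guess the time zone of a naive datetime")
    return local_dt.astimezone(timezone.utc)


def for_display(stored_utc: datetime, viewer_tz: timezone) -> datetime:
    """Convert a stored UTC value back to the viewer's zone at render time."""
    return stored_utc.astimezone(viewer_tz)


viewer_tz = timezone(timedelta(hours=10))  # hypothetical fixed offset, no DST
entered = datetime(2007, 4, 26, 9, 30, tzinfo=viewer_tz)
stored = to_utc_for_storage(entered)
print(stored.isoformat())  # 2007-04-25T23:30:00+00:00
```

Storing one canonical zone means comparisons and sorting stay correct no matter where the users (or the servers) are; conversion happens only at the edges.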


A Storage Engine for Amazon S3

A Storage Engine for Amazon S3, Mark Atwood

It looks mighty interesting, as transfers to Amazon S3 are free. I think it’ll work well in America and places where bandwidth rocks, but I don’t see this working too well in Australia. Oh how I wish the Internets will improve.

Mark has got all his stuff online at A MySQL Storage Engine for AWS S3. He was also kind enough to upload most of the notes, which made my reporting easier, and don’t forget to view the presentation.

Traditional storage engines use the local disk.

Networked engines: Federated, ODBC, HTTP, MemCacheD and S3 storage engine.

What is S3?
An object can hold anywhere from 1 byte to 5GB. Amazon has several petabytes of storage for you to use :-)

Owning your own disks kind of sucks. You pay for storage, even before you use it.

“An empty disk costs the same as a full one” – pay a lot of money to put disks in data centers. RAID isn’t “excellent”, then what about disaster recovery?

Can’t move an existing database over, and the S3 storage engine isn’t ready for a full schema yet. There are hacks that allow this, but maybe it will be available next year.

Over a billion items in a bucket, and they all come back in good time.

A bucket is fully virtually hosted, you get a SQL CMS in the MySQL server. Save your EC2 work.

S3 is very distributed (geographically) and asynchronous. Writes are replicated, so your data may be reordered (and delayed). So there are no temporal guarantees.

Use the WHERE clause – otherwise it will do a full table scan, and you’ll be paying Amazon lots of money :-)
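
To see why a missing WHERE clause hurts, here is a rough cost model. It assumes one S3 GET request per row fetched (a simplification of mine about how rows map to S3 items) and takes the request price as a parameter, since I am not quoting Amazon's actual pricing; the number used below is a placeholder.

```python
def requests_for_query(row_count: int, keyed_lookup: bool) -> int:
    """Hypothetical model: one GET per row fetched.
    A lookup by key touches one item; a full scan touches every item."""
    return 1 if keyed_lookup else row_count


def request_cost(requests: int, price_per_1000_requests: float) -> float:
    """Request charges only; data transfer is left out of this sketch."""
    return requests / 1000 * price_per_1000_requests


rows = 1_000_000
placeholder_price = 0.01  # NOT a real price, just something to multiply by
scan = request_cost(requests_for_query(rows, keyed_lookup=False), placeholder_price)
lookup = request_cost(requests_for_query(rows, keyed_lookup=True), placeholder_price)
print(scan > lookup)  # True
```

Whatever the real price is, the scan costs row_count times as much as the keyed lookup under this model, every time the query runs.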

The talk ended with about 20 minutes to spare, and I do certainly hope he hacks on it more for the next year. He’s also soliciting feedback, so try it out if you can. And now, to run to the remainder of the talk on Highly Available MySQL Cluster on Amazon EC2! Two Amazon talks, with emerging technology goodness, at the same time? Pfft.


