Horizontal Scaling with HiveDB

At the MySQL Conference & Expo 2008, Britt Crawford and Justin McCarthy, both from Cafepress.com, gave a very interesting talk on scaling with HiveDB. I took a few notes (pasted below), their slides are online (warning: 6.1MB PDF), and if you’re after their abstract, it’s available as well.

I also took a video of them (refer to Slide 12 for the IRC conversation):

The quick notes:

  • OLTP optimised (as it serves cafepress.com)
  • Cannot lock tables or take the database offline
  • Constant response time is more important than low latency (a slightly slower query is OK, as long as it does not get exponentially slower)
  • Queries may return result sets of wildly varying sizes.
  • There can be growth and usage hotspots. You cannot predict this at all.
  • Partition by key (the set of all partition keys is the partition dimension)
  • Hibernate Shards is partitioned Hibernate from Google. HiveDB is now integrated with Hibernate Shards.
  • Thought about MySQL Proxy to support high availability components, but it was dismissed
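
To make the "partition by key" note a little more concrete, here is a minimal sketch of the idea in Python. It is not HiveDB's API: the PartitionDimension class, the node names, and the "fewest keys" placement rule are all assumptions made up for illustration. The only point it shows is that every partition key is looked up in a directory to find the node holding its rows, and that a key can be re-pointed when a hotspot turns up.

```python
from collections import Counter


class PartitionDimension:
    """Hypothetical directory mapping each partition key to one data node.

    Illustrative only; this is not how HiveDB itself is implemented.
    """

    def __init__(self, name, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self.name = name
        self.nodes = list(nodes)
        self.directory = {}  # partition key -> node name

    def node_for(self, key):
        """Return the node holding `key`, assigning one on first sight."""
        if key not in self.directory:
            load = Counter(self.directory.values())
            # Assumed placement rule: put new keys on the node with the fewest keys.
            self.directory[key] = min(self.nodes, key=lambda n: load[n])
        return self.directory[key]

    def move(self, key, node):
        """Re-point a key at another node (the data copy itself is not shown)."""
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self.directory[key] = node


# Hypothetical usage: partition by store id across two nodes.
stores = PartitionDimension("store_id", ["node_a", "node_b"])
first = stores.node_for(1001)
second = stores.node_for(1002)
```

Since growth and usage hotspots cannot be predicted, a lookup directory like this (rather than a fixed hash of the key) leaves room to move an individual key later without touching the others.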