Sharding for the masses: Introducing the SPIDER storage engine (OpenSQLCamp @ FrOSCon)

Posted on 27/8/2009, 10:45 pm, by Colin Charles, under MySQL.

These are notes from the talk "Sharding for the masses: Introducing the SPIDER storage engine" by Giuseppe Maxia, given at OpenSQLCamp at FrOSCon in August 2009. They are somewhat live notes, and the slides are available too.

Why sharding? Scaling, of course. The MySQL way to solve this is replication (even Yahoo! and Google use this).

When the master doesn’t have enough resources to cope with your workload (e.g. large data sets), replication chokes.

You can use proxies for sharding. The options available these days include MySQL Proxy (programmable in a scripting language, Lua), HSCALE (built on top of MySQL Proxy), and SpockProxy (a fork of MySQL Proxy without Lua scripting, specialised for sharding). The proxy, however, is a single point of failure: everything has to pass through one proxy.

Enter SPIDER: a MySQL storage engine built on top of the partitioning engine. It associates each partition with a remote server and is transparent to the user. It is developed by Kentoku Shiba.
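To make the partition-to-server idea concrete, here is a minimal Python sketch of range-based routing. It is not SPIDER code: the partition bounds, the backend names and the backend_for helper are all made up for illustration, and it only models the lookup that SPIDER performs transparently for the user.

```python
from bisect import bisect_right

# Hypothetical partition layout (not from the talk): each partition covers
# keys below an upper bound and lives on its own backend server.
PARTITIONS = [
    (100_000, "backend1:3306"),
    (200_000, "backend2:3306"),
    (float("inf"), "backend3:3306"),  # catch-all partition for everything else
]
UPPER_BOUNDS = [bound for bound, _ in PARTITIONS]


def backend_for(key: int) -> str:
    """Return the backend that holds the partition containing `key`."""
    # bisect_right picks the first partition whose upper bound is strictly
    # greater than the key, i.e. "values less than bound" semantics.
    index = bisect_right(UPPER_BOUNDS, key)
    return PARTITIONS[index][1]
```

With this layout, backend_for(150_000) returns "backend2:3306". The application only ever asks for a key; which server answers is decided by the partition definition.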

Installation: get the MySQL 5.1.37 sources, then the source code for Spider 1.0, and then the patch for condition pushdown.

Why the condition pushdown patch? The remote server does less work when it receives the condition. The SPIDER engine is still fast without the patch, but it can be more than 10x faster with condition pushdown.
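A rough way to picture what pushdown changes is to count the rows that leave the remote server. The Python sketch below is a toy model under stated assumptions: REMOTE_ROWS, the column names and both fetch functions are hypothetical, and it says nothing about how SPIDER actually ships the condition.

```python
# Hypothetical rows stored on one remote server (made-up data).
REMOTE_ROWS = [{"emp_no": n, "dept": "d%03d" % (n % 5)} for n in range(1000)]


def fetch_without_pushdown(predicate):
    """The remote server returns every row; filtering happens locally."""
    transferred = list(REMOTE_ROWS)  # the whole table crosses the network
    result = [row for row in transferred if predicate(row)]
    return result, len(transferred)


def fetch_with_pushdown(predicate):
    """The remote server receives the condition and returns only matches."""
    transferred = [row for row in REMOTE_ROWS if predicate(row)]
    return transferred, len(transferred)
```

Both functions return the same result set for a given predicate; only the second element of the returned tuple, the number of rows transferred, differs.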

See http://dev.mysql.com/doc/refman/5.1/en/condition-pushdown-optimization.html (works with NDBCLUSTER) and http://dev.mysql.com/doc/refman/5.4/en/condition-pushdown-optimization.html (works with MyISAM). The patch by Kentoku adds cond_push and cond_pop to ha_partition, so every storage engine that uses table partitioning can now get condition pushdown through ha_partition.
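The delegation idea behind the patch can be sketched in a few lines of Python. This is a simplified model, not the real C++ handler API: the class names are invented, conditions are plain strings, and forwarding to every child partition is my assumption about how the patch behaves.

```python
class ChildHandler:
    """Stand-in for the storage engine handler of one partition (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.conditions = []  # stack of conditions pushed to this partition

    def cond_push(self, cond):
        self.conditions.append(cond)

    def cond_pop(self):
        if self.conditions:
            self.conditions.pop()


class PartitionHandler:
    """Stand-in for ha_partition: forwards the calls to each child handler."""

    def __init__(self, children):
        self.children = children

    def cond_push(self, cond):
        # Assumption: every partition's engine gets to see the condition.
        for child in self.children:
            child.cond_push(cond)

    def cond_pop(self):
        for child in self.children:
            child.cond_pop()
```

In this model the partition layer does no filtering of its own. It only passes the condition down, which is why any engine sitting under it can benefit.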

You need to set up the engine first: http://datacharmer.org/downloads/spider_setup.sql (the SQL is also available in the DOCS).

spider_remote_employees.sql – use this in conjunction with http://launchpad.net/test-db/ – a good example of how to use the SPIDER storage engine.

Tags: FrOSCon, MySQL, OpenSQLCamp, sharding, spider, storage engine
