A Storage Engine for Amazon S3

A Storage Engine for Amazon S3, Mark Atwood

It looks mighty interesting, as transfers to Amazon S3 are free. I think it’ll work well in America and places where bandwidth rocks, but I don’t see this working too well in Australia. Oh how I wish the Internets would improve.

Mark has got all his stuff online at A MySQL Storage Engine for AWS S3. He was also kind enough to upload most of the notes, which made my reporting easier, and don’t forget to view the presentation.

Traditional storage engines use the local disk.

Networked engines: Federated, ODBC, HTTP, MemCacheD and S3 storage engine.

What is S3?
Each item can hold anywhere from 1 byte to 5GB. Amazon has several petabytes of storage for you to use :-)

Owning your own disks kind of sucks. You pay for storage even before you use it.

“An empty disk costs the same as a full one”: you pay a lot of money to put disks in data centers. RAID isn’t “excellent”, and then what about disaster recovery?

You can’t move an existing database over, and the S3 storage engine isn’t ready for a full schema yet. There are hacks that allow this, but maybe it will be properly available next year.

You can have over a billion items in a bucket, and they all come back in good time.

A bucket is fully virtually hosted, so you get a SQL CMS in the MySQL server. It is also a place to save your EC2 work.

S3 is very distributed (geographically) and asynchronous. Writes are replicated, so your data may be reordered (and delayed). So there are no temporal guarantees.

Use the WHERE clause – otherwise it will do a full table scan, and you’ll be paying Amazon lots of money :-)
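To see why the WHERE clause matters, here is a rough sketch in Python of how the request count differs between a key lookup and a full scan. It is a toy model, not Mark's engine: `CountingBucket`, `select_by_key` and `select_full_scan` are made-up names, the bucket is just an in-memory dict, and I am assuming one row maps to one item fetched with one GET. The 1000-keys-per-listing page size matches how S3 pages its key listings.

```python
class CountingBucket:
    """Toy in-memory stand-in for an S3 bucket that counts requests."""

    def __init__(self, items):
        self._items = dict(items)
        self.get_requests = 0
        self.list_requests = 0

    def get(self, key):
        # One GET request per item fetched.
        self.get_requests += 1
        return self._items.get(key)

    def list_keys(self, page_size=1000):
        # One LIST request per page of keys.
        keys = sorted(self._items)
        for start in range(0, len(keys), page_size):
            self.list_requests += 1
            yield from keys[start:start + page_size]


def select_by_key(bucket, key):
    """Roughly: SELECT * FROM t WHERE id = key (a single GET)."""
    value = bucket.get(key)
    return [] if value is None else [(key, value)]


def select_full_scan(bucket, predicate):
    """Roughly: SELECT * FROM t with no usable key (list, then GET everything)."""
    rows = []
    for key in bucket.list_keys():
        value = bucket.get(key)
        if predicate(key, value):
            rows.append((key, value))
    return rows


if __name__ == "__main__":
    data = {f"row{i:05d}": i for i in range(2500)}

    keyed = CountingBucket(data)
    select_by_key(keyed, "row00042")
    print("keyed lookup:", keyed.get_requests, "GET,", keyed.list_requests, "LIST")

    scanned = CountingBucket(data)
    select_full_scan(scanned, lambda k, v: v == 42)
    print("full scan:", scanned.get_requests, "GET,", scanned.list_requests, "LIST")
```

Both queries return the same single row, but the scan touches every item in the bucket to find it. Multiply that by a billion items and whatever Amazon charges per request, and the advice makes sense.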

The talk ended with about 20 minutes to spare, and I do certainly hope he hacks on it more for the next year. He’s also soliciting feedback, so try it out if you can. And now, to run to the remainder of the talk on Highly Available MySQL Cluster on Amazon EC2! Two Amazon talks, with emerging technology goodness, at the same time? Pfft.
