Archive for April 14th, 2008

Memcached and MySQL tutorial

Monday, April 14th, 2008

Memcached by Brian Aker, Alan Kasindorf (dormando). Here are some quick, somewhat sparse notes. Follow the slides, it will help.

Slides: http://download.tangent.org/talks/Memcached%20Study.pdf

Memcached was originally created for LiveJournal, and it has evolved a bit over time. LiveJournal went from chaos to user-based clustering, and then Brad implemented memcached. LiveJournal has about 30GB of cache available across 8-12 machines. DB reads dropped by about 10x the moment they started using memcached (it’s much better now).

It’s not only for simple objects (not just a single row) - you can use it for complex queries, and the result can be stored in memcached. Patrick Lenz of eins.de is also the freshmeat.net guy. He put memcached on the same machine as the MySQL database server (he has 32-bit machines, and MySQL can only use a certain amount of RAM, so the rest was for memcached). This is definitely not the recommended way. Have separate memcached servers.

PatG comes up to talk about Grazr, which is more of a write-through cache. Refer to Page 8 of the slides. Now, the thought is that maybe Pat should’ve used gearman, rather than writing their own software. Memcached has allowed them to do it asynchronously. They’re using bulk inserts now as well.

DownUnder GeoSolutions uses lustre, which is a clustering filesystem. They’re not a web-based solution. They extract data off lustre, store it in memcached. Processing happens on the memcached RRU.

memcached by itself does very little. There’s a simple daemon, and it responds to gets/sets/add/replace. It sits on top of a very simple slab allocator. In the early days, every call ran malloc() and then free() when it was done. Now, it uses a slab allocator with separate slab classes for different object sizes.
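A rough sketch of the slab idea: chunk sizes grow by a fixed factor, and an item lands in the smallest class that fits it. The 96-byte starting size and the 1.25 growth factor are assumptions for illustration only, and the function names are made up; the 1 megabyte ceiling is the limit mentioned later in these notes.

```python
def slab_classes(smallest=96, factor=1.25, largest=1024 * 1024):
    """Return chunk sizes growing by `factor`, capped at `largest` bytes."""
    sizes = []
    size = float(smallest)
    while size < largest:
        sizes.append(int(size))
        size *= factor
    sizes.append(largest)  # the final class is exactly the cap
    return sizes


def class_for(item_size, sizes):
    """Pick the smallest slab class whose chunk can hold the item."""
    for index, chunk in enumerate(sizes):
        if item_size <= chunk:
            return index, chunk
    raise ValueError("item is larger than the largest slab class")
```

The trade-off is some wasted space per item (the gap between the item and its chunk) in exchange for never calling malloc() and free() per request.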

memcached is event based. libevent is a generic wrapper around epoll/kqueue, and it’s very scalable for network connections. 10,000 connections to a memcached is OK - it only cares how many of them are “active”.

The protocol is very simple. Everyone hates it, but everyone uses it. You can even fire up telnet to talk to memcached. It’s very easy to read protocol.txt and write something that talks to it.

memcached? A big stupid hash table. In a grid, it’s a distributed hash table. memcached is 2 hash tables - one in the client, and one in the server. 30 memcached instances don’t need to know about each other - they’re blind to each other. There is no cross traffic. You just add more servers to scale up.
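To show how little there is to the text protocol, here is a minimal sketch of what goes over the wire for a set and a get. The helper names are hypothetical; the line layout (`set <key> <flags> <exptime> <bytes>`, then the data block) follows protocol.txt. It only handles a plain single-key `get` reply.

```python
def build_set(key, value, flags=0, exptime=0):
    """Format a set command: header line, then the raw data block."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    """Format a single-key get command."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_reply(reply):
    """Return the value from a single-key get reply, or None on a miss."""
    if reply == b"END\r\n":
        return None  # a miss is just the END terminator
    header, rest = reply.split(b"\r\n", 1)
    _value, _key, _flags, length = header.split(b" ")
    return rest[: int(length)]
```

These are the same bytes you would type into a telnet session, which is why debugging by hand works so well.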

Clients hash keys to the server list. Take a single key (250 bytes max), the client hashes it. You have a value, you want to access it, here’s a key. There is multiple hashing going on, as some clients do things like compressing data.
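The client-side half of the hashing can be sketched in a few lines. This assumes a plain CRC32 plus modulo scheme; real clients differ in the details, and `pick_server` is a made-up name.

```python
import zlib


def pick_server(key, servers):
    """Map a key to one server using CRC32 modulo the server count."""
    raw = key.encode("utf-8")
    if len(raw) > 250:
        raise ValueError("memcached keys are limited to 250 bytes")
    return servers[zlib.crc32(raw) % len(servers)]
```

Every client with the same server list picks the same server for a given key, which is why the servers never need to talk to each other. The catch is that changing the length of the list remaps most keys.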

How do I dump data? You don’t. It’s a cache.

How is it redundant? It’s not. The server itself doesn’t know about other servers around it! PECL and the next version of libmemcached will understand replication. The redundancy happens in clients.

How does it handle failover? It doesn’t. If it dies, it dies. A client can, of course, handle it.

How does it authenticate? It doesn’t at all. Don’t stick one of these, wide open, on the Internet - when you connect to it, you have full access to any commands in the server and all contents in the server. You don’t want folk just typing flush in the server ;)

A very simple service, very simple server.

Details on the server? Page 14 has pretty much all the commands you can use in memcached. You can even run these from telnet.
- set operation throws data inside memcached (it doesn’t care if there’s other data in it)
- add is lightly atomic - it won’t add data that is already there
- stats can give you particular pieces of information, or give you a full dump. Hit ratio, cache efficiency, and lots more, can come out of this

All the drivers you are seeing basically just expose these commands. cas (compare and swap, atomic!) is pretty limited today.

memcached can even run on FreeBSD 4. Most people run memcached on Linux. No one in the audience has deployed memcached on OS X.

There’s MySQL integration. Most users grab an object from the database, then store the object in memcached. The UDF memcached functions are probably the most successful UDF in MySQL’s history :)
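The usual "grab from the database, store in memcached" flow looks roughly like this. It is a sketch only: a plain dict stands in for the memcached client, and `load_from_db` is a hypothetical callable that runs the real query.

```python
def fetch_user(user_id, cache, load_from_db):
    """Return the user from the cache, falling back to the database on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit, no database work
    row = load_from_db(user_id)
    if row is not None:
        cache[key] = row  # populate the cache for the next caller
    return row
```

With a real client the dict assignment becomes a set call, usually with an expiry time.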

There’s pgmemcache() for PostgreSQL, but not much is known about it.

Apache - mod_memcached, has CAS operations exposed. Different to the lighttpd implementation.

There are limitations (page 23). If you want to change them, you can recompile memcached, but you might not want to do that. The largest slab class in the system is 1 megabyte, so an item must be under 1 megabyte. Beware if you’re running on a 32-bit system (go over 4GB and you will segfault). A 64-bit system should be fine, in general.

memcached supports threads, thanks largely to Facebook. You probably don’t need this, unless you are Facebook. Memcached’s CPU footprint is tiny.

If you gave memcached 16GB, you will not get your memory back, even if you run flush. The memory is permanently allocated from the OS (much like how Vista does things?). There is mlockall() support, so you can guarantee there will be no paging. Or just disable swap.

jallspaw: memcached1: 22:02:00 up 992 days, 11:57, 0 users, load average: 0.35, 0.37, 0.37

(posted on IRC at #mysqlconf). memcached hardly ever crashes.

You can disable the LRU if you want (there’s a command line option for this).

Hashing comes in 2 flavours - normal and consistent hashing. All drivers support CRC today.

A consistent hash means that, instead of doing a modular divide, you interlace the servers around a ring. When you have 100 servers running and add one more to the network, you don’t want to lose the entire cache at once.
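Here is a minimal sketch of such a ring, to make the difference from modular hashing concrete. It assumes MD5-derived points and 100 points per server; both are illustrative choices, not what any particular driver does, and `HashRing` is a made-up class.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: each server owns many points on the ring."""

    def __init__(self, servers, points_per_server=100):
        self._ring = []  # sorted list of (point, server) pairs
        for server in servers:
            self.add(server, points_per_server)

    @staticmethod
    def _hash(value):
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    def add(self, server, points_per_server=100):
        """Insert a server's points, keeping the ring sorted."""
        for i in range(points_per_server):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def lookup(self, key):
        """Return the server owning the first point at or after the key's hash."""
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring
```

Adding a server only takes over the keys that fall just before its new points, so most of the cache stays where it was.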

libmemcached can do replicas, so it can take data from servers, and apply it to the ring. So if a server is taken out of the network, it can be found elsewhere on the ring. You can keep these networks up and running, and easily growing, with new servers, without losing cache coherency.

Don’t only look at the return value; zero may actually be a credible value. An actual value of zero is very different from “we didn’t find anything”.
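In code, the trap is a truthiness test on the result. A small sketch, with a dict standing in for the client and a hypothetical `read_counter` helper:

```python
_MISS = object()  # sentinel that can never be a stored value


def read_counter(cache, key):
    """Return the stored value (possibly 0), or None when the key is absent."""
    value = cache.get(key, _MISS)
    if value is _MISS:
        return None  # genuinely not found
    return value  # may legitimately be 0, which a bare truthiness test would treat as a miss
```

Callers should then compare against None explicitly instead of relying on truthiness.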

Slide 35 shows the ghetto locking implementation for memcache-client. It creates a pseudo-lock around a piece of work. To be the only process working on this area, you add a lock key, and you make sure to test for nil, not zero (you’re testing for the existence of the lock). If your process dies, someone else will try in 30 seconds (lock expire). Add will only work if there’s no key existing at that point (remember, an add is not a set).

PHP is probably the best supported language for memcached. The PECL memcached library is C backed, standard, and works fine. libmemcached will probably take over most of its features eventually, but it’s not there yet.

By default, if you call increment on a key, it bumps by one. You can also step by more than 1, say 500 or something. Refer to slide 41. Just like you can increment a key, you can decrement it too.
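The same locking idea, sketched in Python rather than copied from the slide. `FakeCache` is an in-memory stand-in written for this example (not a real client), and `with_lock` is a hypothetical helper; only the add-fails-if-present behaviour and the 30 second expiry come from the talk.

```python
import time


class FakeCache:
    """In-memory stand-in for a memcached client, for illustration only."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def add(self, key, value, expire):
        """Store only if the key is absent or expired; report whether it was stored."""
        entry = self._data.get(key)
        if entry is not None and entry[1] > self._clock():
            return False
        self._data[key] = (value, self._clock() + expire)
        return True

    def delete(self, key):
        self._data.pop(key, None)


def with_lock(cache, name, work, expire=30):
    """Run work() only if the add succeeds; return (acquired, result)."""
    lock_key = f"lock:{name}"
    if not cache.add(lock_key, "1", expire):
        return False, None  # someone else holds the lock
    try:
        return True, work()
    finally:
        cache.delete(lock_key)  # release early; expiry covers a crashed holder
```

Returning the acquired flag separately keeps a result of 0 or None from being mistaken for "did not get the lock".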

C/libmemcached. C driver, there’s a C++ wrapper. Sync and async cached keys. It supports replication through the network. Has read through cache support.

You can not only store a value, but you can also store flags. Flags can keep track of generations, or keep track of MIME type internally (so not only store object type, but MIME type). This is unique to libmemcached. Most other drivers use this flags value to see if it’s compressed or not (the flag = 1 for compression, 0 for no).
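A sketch of how a driver might use the flags value for compression. The flag value of 1 is from the notes above; the 100-byte threshold and the function names are assumptions for the example.

```python
import zlib

FLAG_COMPRESSED = 1


def encode(value, min_size=100):
    """Return (payload, flags), compressing values of at least min_size bytes."""
    if len(value) >= min_size:
        return zlib.compress(value), FLAG_COMPRESSED
    return value, 0


def decode(payload, flags):
    """Undo encode() using the flags stored alongside the value."""
    if flags & FLAG_COMPRESSED:
        return zlib.decompress(payload)
    return payload
```

The server never interprets the flags; it just stores them and hands them back, so the meaning is entirely a client convention.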

Multiget is 7-9x faster than just a get. Look at Page 48 for an example.

Memcached for MySQL? Uses the UDF API. You can now incorporate most of the memcached stuff, in the SQL server, so you can do deletions and get operations easily.

http://tangent.org/586/Memcached_Functions_for_MySQL.html

What do you think about persistent connections? Use them. libevent supports them.

Spaces to watch: MogileFS. HyperTable. HBase. People have stopped talking about POSIX filesystems, and are talking more about object filesystems. It’s what all the cool kids are doing.


Keeping track of all the web resources at the MySQL Conference

Monday, April 14th, 2008

During this week, expect there to be a flood of posts on Planet MySQL. If you’re using an offline RSS reader, you might not even get all the RSS feeds (might I suggest something like Google Reader?). If you want to be informed as and when something hits Planet MySQL (say, on your mobile phone or via IM), there’s a Twitter feed available - Planet MySQL on Twitter!

Also, on the wiki, I’m trying to keep track of all blog posts (notes) from the conference. Please do help if something slips through the cracks. It’s organised by day, and by talk topic. All slides will obviously make it to the conference page after the conference is over, so there will be further linking a little later…

If you’re hip and have a Facebook account, do become a member of the MySQL Conference and Expo 2008 group. It might be a new way to network. Very similar to the LinkedIn group for MySQL Speakers and Presenters that Ronald recently talked about. Just a much lower barrier of entry…


Chris Blizzard on Mozilla

Monday, April 14th, 2008

Chris Blizzard, now working at Mozilla on Linux integration, gave a most interesting talk about Mozilla and their new mobile initiatives. We managed to speak (but not nearly enough) about the mobile strategy afterwards (i.e. I think limiting it to the n810 or tablet-like devices alone seems myopic; phones are where it’s at), and I hope the conversation continues. Now for some quick notes.

- mozilla.org, is where products create motion. Been around for just over 10 years now
- Mozilla targets human beings (not developers)
- Focus on protecting open standards
- “Creating Joy!” for users
- Avoid feature creep (this is the secret of add-ons) - control the product, and just say, go build an extension. It isn’t just about customising your experience, it’s about keeping the core experience joyous and uncluttered.
- Fix real problems on the web (i.e. pop-up blocking)
- 500 contributors to Firefox 3, 75 Localization teams, 200 people, 11,000 patches, 165+ Million users, added +45 million users in the last 6 months, and doubled in the last year - these are impressive statistics (I for one, am impressed by their developer community)
- Who are we targeting? Read Seth Godin’s blog entry “Why downloading Firefox is like getting into college”. Also, Stephen O’Grady’s blog post “Ode to the Common Man”.
- Bring the full web to mobile. FF3 is where great technology for mobile exists.
- Apple has reset the idea of what the Internet on a mobile should be, thanks to the iPhone. They’ve definitely opened up the market for mobile based browsers. Note, no reason to redesign your website for mobiles in the future…
- Fennec - mobile browser experience
- Performance numbers on the n810 - faster than MicroB and WebKit. Not even optimised for ARM (i.e. no atomic locking), but already at a headstart
- Fennec will support add-ons. Touch and keypad versions are coming soon… Keep in mind all this is just getting started
- Android includes WebKit as part of the base platform. Mozilla on Android? Not quite yet, since Google wants only Java based applications. No mention of native applications yet from Google.
- Not really considered Series 60 (it would be nice), no talk of PalmOS, there is some form of Windows Mobile version, but it’s not released
- Gecko is hard to embed, in comparison to WebKit. The technology needs to improve, so that the gap that WebKit has, doesn’t widen further


My favourite bugfix in MySQL 5.1.24-rc

Monday, April 14th, 2008

I’ve been using MySQL 5.1 a lot more of late. Also, as of about a month ago, I’m now a Mac OS X user, so tend to use MySQL on OS X Leopard 10.5 a lot more for testing. I’ve found a rather annoying bug (in 5.1.23-rc) that is fixed in the current source tree, and will be in 5.1.24-rc…

What’s annoying me? The fact that Control+R (which saves typing, because it lets you search through the history in ~/.mysql_history) segfaults the MySQL client. At first I thought something was wrong with my install when I saw the infamous “Segmentation fault” error. Turns out, it’s just mysql#33288.

I always hit ctrl+r without even thinking… It’s just a shortcut ingrained in my fingers, because I predominantly use a shell. So, Mac OS X users can rejoice soon, as 5.1.24-rc is surely around the corner. In fact, there are numerous improvements, just read the changelog.


Compiling MySQL UDFs on Mac OS X

Monday, April 14th, 2008

Compiling and installing a User Defined Function for MySQL on Mac OS X seems tricky. There are installation notes, but they seem to be sparse on OS X (the comments are clues, though).

I was looking through the tutorial materials for Roland’s talk, and came up with what I think is the most foolproof way to ensure your UDFs get compiled…

gcc -Wall -dynamiclib -o udf_lightspeed.dylib -lstdc++ udf_lightspeed.c

The above will compile just fine, but MySQL will give you an interesting error saying “no suitable image found”. It’s the infamous Error 1126.

Upon further poking, it seemed like the following should work:
gcc -Wall -dynamiclib -o udf_lightspeed.dylib -lstdc++ -lc -I`/usr/local/mysql/bin/mysql_config --cflags` udf_lightspeed.c

And it does. MySQL loads the UDF just fine. But sometimes, you get linker errors, which can be annoying.

So, the foolproof solution to compiling UDFs on OS X, Leopard 10.5? A two-step process:
gcc -c `/usr/local/mysql/bin/mysql_config --cflags` udf_lightspeed.c
libtool -dynamic -o udf_lightspeed.so udf_lightspeed.o -lc

Feel free to add -O2 as an option to GCC. One day I might talk about the amazing mysql_config (use it! Read the documentation, in the meantime.)

Remember, once you’ve built your .dylib, to move it to your plugin_dir (find it by running SHOW VARIABLES LIKE 'plugin_dir';). Also note that on OS X, it doesn’t matter if the extension is .dylib or .so - either one will do.

Happy UDF usage!

Update: As a reader points out in the comments, probably the easiest way, in a one-step process, is to use the bundle_loader. gcc -Wall -bundle -bundle_loader /usr/local/mysql/bin/mysqld -o udf_return_values.so `/usr/local/mysql/bin/mysql_config --cflags` udf_return_values.c works a charm.


MySQL Community Dinner a great success

Monday, April 14th, 2008

The Sunday before the MySQL Conference and Expo usually means the party at Marten’s house. This year, some of us arrived there rather early, and also made a beeline for the exit, rather quickly. This despite all the good alcohol, amazing food being served, and the great company of colleagues we hardly ever see. Why?

Some of us wanted to crash (er, attend) the MySQL Conf 2008 Community Dinner. And we did! The entire MySQL Community Team was there, as were Brian Aker, Mark Atwood, Monty Widenius, Timour Katchaounov, and Eric Herman. We also naturally brought Barton George along (he gave us a ride! Thanks Barton). Oh, and did I mention, Rich Green and Jonathan Schwartz were there too? Jonathan picked up the tab, naturally :)

MySQL Community Dinner
Monty and Brian chat, while Jonathan is in the background

A big thank you to Jonathan and Rich. All the conversation, all the people I met (old and new), it was just simply fabulous. Now of course, comes the great fun of following up, and getting things done!

MySQL Community Dinner
Jonathan Schwartz with the MySQL/Sun plush dolphins

Did I mention that Santa Giuseppe also brought along little plush dolphins? Great party favours!

MySQL Community Dinner
Giuseppe with some Sun/MySQL dolphins… Monty egging him on

All in all, a great start to the conference. Pictures (that I took anyway - all Creative Commons licensed by Attribution, non-commercial, share-alike) are at my MySQL Conference and Expo 2008 set. They’ve all been added to the 2008 MySQL Users Conference and Expo group on Flickr as well. I know Ronald Bradford was running around with a camera (a 40D, like I have at home :P), and Barton George took some snapshots, as did Lenz Grimmer. Watch their photo feeds soon…

Update: Giuseppe has blogged it too at Community dinner with Jonathan Schwartz and Rich Green as has Kaj, at Jonathan Schwartz and Rich Green at the MySQL Community Pre-Conference Dinner party.
