Posts Tagged ‘MariaDB’

MariaDB at Percona Live Santa Clara

I for one can say that I’m truly excited that MariaDB will be part of Percona Live Santa Clara. The MariaDB session list includes:

  • A tutorial: Improving MySQL/MariaDB query performance through optimizer tuning by Timour Katchaounov and Sergey Petrunia. You can benefit from this even as a stock MySQL user naturally.
  • MySQL Plugins – why should I bother? by Sergei Golubchik is again something that isn’t only MariaDB specific since MySQL also has a plugin architecture. I don’t know if Sergei will talk about authentication plugins as well, but I can imagine this is on the mind of many people.
  • MariaDB: The 2012 edition by Colin Charles – well, this is me. You might wonder what we’ve been doing since mid-2009 having only made our first release in early-2010. We’ve achieved quite a lot in the project, so come find out why you might consider MariaDB to your alternative to MySQL. We may even have a surprise in the way of some new announcements here.
  • MySQL Optimizer Standoff MySQL 5.6 and MariaDB 5.3 by Peter Zaitsev is focused on the optimizer features between MySQL and MariaDB. MariaDB 5.3 as you might know features a lot of optimizer improvements.

The BoFs schedule has just been released, and on Wednesday evening from 9pm onwards, you can come visit Ballroom A to discuss MariaDB, help with the future roadmap as well as be filled with lots of salmiakkikossu.

We have already confirmed our acceptance of having a DotOrg booth and can’t wait till that information makes it up on the Internet.

All in all, I’m happy to be attending and speaking in Santa Clara yet again in the April timeframe. This has become somewhat of a yearly ritual. Make sure you register, and don’t forget to do a little searching to find some promo codes!

Replication features of 2011 by Sergey Petrunia

Sergey Petrunia of the MariaDB project & Monty Program.

MySQL 5.5 GA at the end of 2010. MariaDB 5.3 RC towards the end of 2011 (beta in June 2011).

MySQL 5.5 is merged to Percona Server 5.5 which included semi-sync replication, slave fsync options, atuomatic relay log recovery, RBR slave type conversions (question if this is useful or not), individual log flushing (very useful, but not many using), replication heartbeat, SHOW RELAYLOG EVENTS. About 2/3rds of the audience use MySQL 5.5 in production, with only 2 people using semi-sync replication.

MariaDB 5.3 brings replication features brings group commit in the binary log, which is merged into Percona Server 5.5. Checksums for binlog events which is merged from MySQL 5.6. Sergey goes in-depth about the group commit for the binary log. To find out a little more about MariaDB replication changes, see Replication in the Knowledgebase.

There are several implementations of group commit. Facebook started it, followed by MariaDB & Oracle. Percona 5.5 is GA so the feature is there, its not in MySQL 5.6 (yet?), and MariaDB 5.3 is where its at. Seems like the MariaDB implementation is the best so far – refer to the Facebook benchmark performed by Mark Callaghan.

Annotated RBR poses a compatibility problem. MariaDB 5.3 has annotate_rows, while MySQL 5.6 has rows_query event. They are different events. So you cannot have a MariaDB 5.3 master and a MySQL 5.6 slave at this moment. So MySQL 5.6 will have a flag to mark “ignorable” binlog events which will be merged into MariaDB and this will make binary logs compatible again.

There is now also optimized RBR for tables with no primary key.

MySQL 5.6 also has crash-safe slave (replication information stored in tables). Crash-safe master (binary log recovery if the server starts & sees the binary log is corrupted). Parallel event execution is something that is new in MySQL 5.6 which is the most important feature for Sergey.

Pre-heating: There is mk-slave-prefetch (famous quote: “Please don’t use mk-slave-prefetch on #MySQL unless you are Facebook.”). There is replication booster by Yoshinori Matsunobu. There is a Python version of mk-slave-prefetch that Facebook uses.

MariaDB 5.3 query optimizer by Sergey Petrunia

Sergey Petrunia of the MariaDB project.

What exactly is not working in MySQL? MySQL is poor at decision support/analytics. With large datasets you need special disk access strategies. Complex queries like insufficient subquery support and big joins are common int he MySQL world.

DBT-3 is used, scale=30, with a 75GB database and run a query “average price of item between a range of dates”. Query time took some 45 minutes to execute. Why? Run iostat -x to see what is going on. See that the CPU is mostly idle, so its an IO-bound load. Next you run SHOW ENGINE INNODB STATUS and you’ll see how many reads per second is happening. Possible solution is to get more RAM or get an SSD (good to speedup OLTP workloads, but analytics over data is probably not viable since SSDs are small and not cheap).

The MySQL/MariaDB solution to the above problem is improved disk access strategies: multi-range read (MRR) and batched key access (BKA). In MariaDB, MRR/BKA need to be enabled (they are not turned on by default). The query time only took 3 minutes 48 seconds, which is some 11.8x faster than the previous 45 minutes.If you look at EXPLAIN output, its almost as same as before, expect the Extra filed. iostat -x will now show some CPU load, svctm down as well (so its not random disk seeks anymore — some 8ms seek time on a regular 7,200rpm disk), SHOW ENGINE INNODB STATUS will show some 10,000 reads per second rather than the previous 200.

If you are on Fedora, check out the Systemtap feature to look at I/O patterns. stap deviceseeks.stp -c “sleep 60”.

Subqueries handling in MariaDB 5.3: check out the Subquery Optimizations Map. Only about 10% of the audience use optimizer hints in MySQL.

Sphinx user stories by Stéphane Varoqui

Stephane Varoqui, Field Services SkySQL, Vlad Fedorkov, Director of PS, Sphinx Inc, Christophe Gesche, LAMP Expert, Delcampe, Herve Seignole, Web Architect, Groupe Pierre & Vacances Center Parcs – this is a big talk!

Pros: Filtering takes place on attributes in separate tables. Rely on the optimizer choice. HASH JOIN can help (MariaDB 5.3). Table elimination can help (MariaDB 5.2). ICP Index Condition Pushdown can help (MariaDB 5.3/MySQL 5.6). Max 80M documents at Pixmania, all queries come in less than 1s using 128GB of RAM (MariaDB 5.2). At PAP.fr, there is 16GB RAM with MariaDB 5.2.

Cons: CPU intensive (replication with many slaves). Need covering indexes to cover various !filter !order. Join & sorting cost on lazy filtering.

The more indexes you have in the system, the more you need to increase the main memory of the server. Keep the Btree’s in memory.

What about denormalized schemas? Not really CPU intensive, just IO. Can go to disk, full partition scan with filtering taking place on record order using covering index. Can shard but not that easy. Use the spider storage engine or shard-query. Can use memory engine for isolation. There are cons like duplicate data, duplicate indexes, missing material views, merge index cost, impact on write performance, and can consume a lot of memory with many indexes.

MySQL can push hardware, so read less/do less/read serialized/map reduce to get better latency. Chose data type wisely, replace string with numeric, vertical & horizontal sharing, snowflake compression (combination of attributes, build a table of the combination and replace it with an ID). If you are lazy, just use Sphinx!

Sphinx is just another daemon that can serve queries. Its easy to setup, easy to scale, storage engine makes it accessible to current MySQL users, API in core MariaDB (SphinxSE), SphinxQL, SphinxSE is transparent to the application layer of MySQL protocol.

Demo done using the Employees DB.

Pierre & Vacances – Centerparcs. Free text search, they use MariaDB using Levenshtein UDF implementation. Went live 01/2011. First implementation of Sphinx (12 indexes). Its grown, they use PHP API. The new goal is to never send an empty result. 1 index per website/market, with a total of 15 million docs. Index built on standalone server. Using internal gearman job schedule to generate index before cache generated. Current monitoring is via Nagios & perl, but the next step is to use Monyog & MariaDB INFORMATION_SCHEMA plugin.

Delcampe is an auction website with 45M ‘active’ items. Its dedicated for collectors. 3 string fields, and 15 attributes. 40-120K new items daily. Started with mysql fulltext in 2007, moved to Sphinx in 2008. There was a need to have more filters. Now they have 5 sphinx servers + 1 MySQL server. HAproxy to load balance.

MariaDB/MySQL users in Paris & Brussels

I’m about to head to Paris to present at the February meetup of the MySQL User Group in Paris, France. It happens 1st February from 6-8pm at the Patricks Irish Pub. Its free to attend, and I understand that SkySQL keeps this event afloat.

I’m also heading to my first FOSDEM right afterwards and will definitely hang out at the MySQL & Friends Devroom. There is an amazing lineup of speakers, with all talks being about 25-30 minutes, it looks like it is going to be a lot of fun. To boot, Michael “Monty” Widenius will also be there, so expect lots of Salmiakkikossu.

If you want to keep track of where Monty Program folk are going to be to talk about MariaDB, make sure you’re subscribed to our news page, which also includes important release information. Pretty much every conference that we plan to attend (and have attended) is at the conference page.

I am looking forward to meeting & learning from many MariaDB/MySQL users!

SCALE 10x – there’s lots of MySQL there!

I’m just about to get on a plane to head to my inaugural SCALE event. It’s their tenth year running!

In a world filled with NoSQL related media, its kind of nice to see that on Friday January 20 2012, we have a MySQL room right next to the PostgreSQL room (schedule). It is awesome to see that the track will have participation from Oracle, Monty Program Ab, and SkySQL Ab.

On Saturday for the main tracks, I’ve got a talk about the growing MySQL diaspora (just got larger this year in case you haven’t paid attention to the packaged up Galera product!). This one is a constant work in progress and I’m hoping to complete research closer towards March ’12.

Monty Program and SkySQL are also sharing a booth in the expo hall, so come by booth #65 for some interesting schwag (t-shirts, poppers, etc.). Looking at the schedule lineup, I’m surprised I’ve never ever been to a SCALE before – looks totally awesome. See you in LAX (well, we’re so close-by the Los Angeles Airport :P)


i