Optimizing your InnoDB buffer pool usage by Steve Hardy

Steve Hardy of Zarafa.

Work that has been done to make Zarafa better. Why do you optimise your buffer pool? To decrease your I/O load. How can you do it? Buy more RAM, page compression, less (smaller) data, rearrange data.

MariaDB or Percona Server allows you to inspect your buffer pool (unsure if this is now available in MySQL 5.6). Giuseppe in the audience says this is available in MySQL 5.6, but Steve used this on MariaDB 5.2.

Strategies to fix it: Make records smaller. Remove indexes if you can use others almost as efficiently. Make records that are accessed around the same time have a higher chance of being on the same page. Use page compression. Buy more RAM. Try Batched Key Access (BKA) in MariaDB 5.3+.

Best to view the presentation since there are specific examples that speak about how Zarafa solves their problems like a user trying to sort their email, etc.

Practical MySQL Indexing guidelines by Stéphane Combaudon

Stéphane Combaudon of Dailymotion.

Index: separate data structure to speed up SELECTs. Think of index in a book. In MySQL, key=index. Consider that indexes are trees.

InnoDB’s clustered index – data is stored with the Primary Key (PK) so PK lookups are fast. Secondary keys hold the PK values. Designing InnoDB PK’s with care is critical for performance.

An index can filter and/or sort values. An index can contain all the fields needed for the query you don’t need to go to the table (a covering index).

MySQL only uses 1 index per table per query (not 100% true – OR clauses), so think of a composite index when you can. Can’t index TEXT fields (use a prefix). Same for BLOBs and long VARCHARs.

Indexes: speed up queries, increases the size of your dataset, slows down writes. How big is the write slowdown? Simple test by Stephane, for in-memory workloads he says adding 2 keys makes performance 2x worse; for on-disk workloads he says its 40x worse. Never neglect the slowdown of your writes when you have an index. There is a graph in the slidedeck.

What is a bad index? Unused indexes. Redundant indexes. Duplicate indexes.

Indexing is not an exact science, but guessing is probably not the best way to design indexes. Always check your assumptions – EXPLAIN does not tell you everything, time your queries with different index combinations, SHOW PROFILES is often valuable. Slow query log is a good place to start.

Many slides with examples, so I hope Stephane posts the deck soon. If possible, try to sort & filter (an index is not always the best for sorting).

InnoDB’s clustered index is always covering. SELECT by PK is the fastest access with InnoDB.

An index can give you 3 benefits: filtering, sorting, covering.

See Userstats v2 – you need Percona Server or MariaDB 5.2+. See also pt-duplicate-key-checker to find redundant indexes easily. See also pt-index-usage to help answer questions not covered by userstats.

MySQL synchronous replication in practice with Galera by Oli Sennhauser

Oli Sennhauser of FromDual.

Synchronous multi-master replication with the Galera plugin. Your application connects to the load balancer and it redirects read/write traffic to the various MySQL Galera nodes. Tested a setup with 17 SQL nodes and you can have even more. Scaling reads and also a little bit for scaling writes is what Galera is good for.

If one node fails, the other two nodes still communicates with each other and the load balancer is aware of the failed node.

Why Galera? There is master-slave replication but its not multi-master, and its asynchronous and you can get inconsistencies. There is master-master replication but its asynchronous and can have inconsistencies and conflicts if you write on both nodes. MHA/MMM/Tungsten are not providing new technology but are based on the MySQL replication technology. MySQL Cluster is another solution but its not InnoDB storage & your need new know-how for Cluster. Also Cluster has problems with fast JOINs. Active/Passive failover clustering, but too often you have resources idling. Schooner being closed & expensive is hard to know much about what they’re doing.

Galera is synchronous & based on InnoDB (others should in theory be possible). Active-active real multi-master topology. True parallel replication on row level. Cluster speaks with each other. There is no slave lag. Won’t lose transactions. Read/write scalability, write throughput can be improved but can’t scale in the way like MySQL Cluster.

Disadvantages? Its not native MySQL binaries/sources but a patch. Codership provides binaries. Higher probability of deadlocks. When you do a full sync (like when a node comes back after downtime), one node is blocked. This is why the minimum you need a 3-node cluster. Also if you do a full sync with a database larger than 50GB, the recommended method is to use mysqldump (which can be very slow). You can use rsync. Percona is working on xtrabackup to do a full sync between nodes.

Setup: 3 nodes are recommended. Or just 2 nodes and one for garbd (Galera Arbitrator Daemon). 2 nodes works but pay attention to a split brain scenario. Go to the Codership website, download their binaries and wsrep (the Galera plug-in). Create your own user on all nodes (don’t use the default root user). You then need to configure my.cnf (there have been discussions for a galera.conf, but Oli just uses my.cnf). Galera works only with InnoDB, so in my.cnf make the default storage engine InnoDB (don’t for example, by accident have MyISAM tables).

The demo has a strange Galera start script, but its not been easy to work. Just start MySQL usually like you would do.

SST is Snapshot State Transfer (SST). Its the initial full sync between the 1st and the other node. SST blocks the donor node (hence why you need 3 nodes). With Galera v2.0, there is also incremental state transfer. It should be GA in February 2012. You can get deltas as opposed to the full sync. You can configure which will be the donor node.

Currently there are 27 variables about Galera in v1.1. You can do it just by doing SHOW GLOBAL VARIABLES LIKE ‘wsrep%’;. The plugin itself, wsrep_provider_options has plenty of options & plenty of room for tuning. SHOW GLOBAL STATUS LIKE ‘wsrep%’; currently has 38 status information fields in Galera v1.1.

For load balancing, you can do it in your application (on your own). You can also use Connector/J which provides load balancing. There is also a PHP MySQLnd that works.

Optimising SQL applications by using client side tools by Mark Riddoch

Mark Riddoch of SkySQL.

This was a talk about the future in general. What people would like. Etc. Not about something that exists yet, hence the sparse notes.

Trace statements are good for the “why”. You move on to the debugger, but what is a useful SQL debugger? Profilers addresses the “when”.

SQL developer tools: manually run queries (traditional route for development, effective way to test SQL statements, some indication of performance), server logs (alerts developers of serious issues like the slow query log – identify poorly written queries, requires server access, not effective in a cloud environment (?)), external monitoring (network sniffing of connection packets, Ethereal dissectors – allow individual connections to be traced, no server access, privileged network access, complex to interpret, possibility of packet loss), intrusive tools (insert “proxy” between client & server to intercept all traffic, imposes delay & requires setup modification).

Client-side tools – client modification or hook via connectors (query logging, profiling). The Java connector has profiling. Should there be a connector slow query log? So there’s no requirement for server access and its per client rather than per server.

Plans: roll out connector query logs for Java, C, and scripting languages built on Connector/C. SQL Parser for report generation/fuzzy matching. Create post-processing tools. Query log comparisons – detect fluctuation in execution time, plan flip alerting, replay log.

MySQL Cluster by Ralf Gebhardt

Ralf Gebhardt of SkySQL.

Cluster: shared nothing architecture (no single point of failure), synchronous replication between nodes, ACID transactions, row level locking. In-memory storage (some data can be stored on disk, but indexes must be in-memory). Checkpointing to disk for durability. It supports two types of indexes – ordered T-trees, unique hash indexes. Online operations like adding node groups, software upgrades, table alterations. Quick standard architecture diagram displayed about MySQL Cluster.

Network partitioning protocol is designed to avoid a split brain scenario. Is there at least one node from each node group? If not then this part cannot continue – graceful shutdown. Are all nodes present from any node group? If so, then this is the only viable cluster – continue to operate. Ask the arbitrator – the arbitrator which parts will continue if no arbitrator is available the cluster shuts down.

Durability – in order for a node to recover fast some data is stored locally. The REDO log is synchronized by global checkpoints (GCP). The DataMemory is synchronized by local checkpoints (LCP).

I agree with Ralf — almost impossible to talk about NDB in 25 minutes. Its very deep, you’d need at least three hours to grasp it well.

Building simple & complex replication clusters with Tungsten Replicator by Giuseppe Maxia

Giuseppe Maxia of Continuent.

MySQL replication is single threaded. Multi-master replication is complex with MySQL. Circular replication works but is very fragile. Once you’ve achieved the feat, how do you avoid conflicts? The lack of global transaction ID today also means you may have slaves that may not be synchronized fully. Finally, some people like to replicate to PostgreSQL, Oracle and MongoDB.

This is where Tungsten Replicator comes into place. Opensource. 100% GPLv2. You can do easy failover (no need to synchronize the slaves manually when a master dies), have multiple masters, multiple sources to a single slave, conflict prevention, parallel replication, and replicate to Oracle/PostgreSQL database (heterogeneous replication – Oracle->MySQL is not opensource, but MySQL->Oracle is).

Parallel replication: ability to replicate with multiple thread at once. Sharded by database (all big sites already use this naturally). Good choice for slave lag problems. Bad choice for single database projects. In their tests to measure slave catch up time between standard MySQL & Tungsten slave with 1hr of sysbench and some 130GB of data, MySQL replication takes 4.5h to catchup, and Tungsten takes less than 1 hour to catchup. This comparison was not made with MySQL 5.6 (the same parallel replication feature at this moment in 5.6 is quite buggy). No need to install Tungsten on the master (Tungsten needs to be on slave). Replication can revert to native salve with 2 commands, but the failover is an issue — so install Tungsten on both!

Conflict prevention. When you have multiple masters you are tempted to do many things that you shouldn’t. Tungsten provides the multi-master topology. Tungsten can help you avoid conflicts. Decide you want to shard your information by database – Tungsten can enforce such rules. Define the rules, applied to either the master or the slave and you can then state to make replication fail or drop silently or drop with a warning.

You can create clusters very quickly with Tungsten. You can install from a centralized point to many servers. You need Java, Ruby, SSH access and a MySQL user with all privileges (used for replication).


i