Posts Tagged ‘group replication’

FOSDEM 2016 notes

While being on the committee for the FOSDEM MySQL & friends devroom, I didn’t speak at that devroom (instead I spoke at the distributions devroom). But when I had time to pop in, I did take some notes on sessions that were interesting to me, so here are the notes. I really did enjoy Yoshinori Matsunobu’s session (out of the devroom) on RocksDB and MyRocks and I highly recommend you to watch the video as the notes can’t be very complete without the great explanation available in the slide deck. Anyway there are videos from the MySQL and friends devroom.

MySQL & Friends Devroom

MySQL Group Replication or how good theory gets into better practice – Tiago Jorge

  • Multi-master update everywhere with built-in automatic distributed recovery, conflict detection and group membership
  • Group replication added 3 PERFORMANCE_SCHEMA tables
  • If a server leaves the group, the others will be automatically informed (either via a crash or if you execute STOP GROUP REPLICATION)
  • Cloud friendly, and it is self-healing. Integrated with server core via a well-defined API. GTIDs, row-based replication, PERFORMANCE_SCHEMA. Works with MySQL Router as well.
  • Multi-master update everywhere. Conflicts will be detected and dealt with, via the first committer wins rule. Any 2 transactions on different servers can write to the same tuple.
  • /
  • Q: When a node leaves a group, will it still accept writes? A: If you leave voluntarily, it can still accept writes as a regular MySQL server (this needs to be checked)
  • Online DDL is not supported
  • Checkout the video

ANALYZE for statements – Sergei Petrunia

  • a lot like EXPLAIN ANALYZE (in PostgreSQL) or PLAN_STATISTICS (in Oracle)
  • Looks like explain output with execution statistics
  • slides and video

Preparse Query Rewrite Plugins – Sveta Smirnova / Martin Hansson

  • Query rewwriting with a proxy might be too complex, so they thought of doing it inside the server. There is a pre-parse (string-to-string) and a post-parse (parse tree) API. Pre-parse: low overhead, but no structure. Post-parse: retains structure, but requires re-parsing (no destructive editing), need to traverse parse tree and will only work on select statements
  • Query rewrite API builds on top of teh Audit API, and then you’ve got the pre-parse/post-parse APIs on the top that call out to the plugins
  • video

Fedora by the Numbers – Remy DeCausemaker

MyRocks: RocksDB Storage Engine for MySQL (LSM Databases at Facebook) – Yoshinori Matsunobu

  • SSD/Flash is getting affordable but MLC Flash is still expensive. HDD has large capacity but limited IOPS (reducing rw IOPS is very important and reducing write is harder). SSD/Flash has great read iops but limited space and write endurance (reducing space here is higher priority)
  • Punch hole compression in 5.7, it is aligned to the sector size of your device. Flash device is basically 4KB. Not 512 bytes. So you’re basically wasting a lot of space and the compression is inefficient
  • LSM tends to have a read penalty compared to B-Tree, like InnoDB. So a good way to reduce the read penalty is to use a Bloom Filter (check key may exist or not without reading data, and skipping read i/o if it definitely does not exist)
  • Another penalty is for delete. It puts them into tombstones. So there is the workaround called SingleDelete.
  • LSMs are ideal for write heavy applications
  • Similar features as InnoDB, transactions: atomicity, MVCC/non-locking consistent read, read committed repeatable read (PostgreSQL-style), Crash safe slave and master. It also has online backup (logical backup by mysqldump and binary backup by myrocks_hotbackup).
  • Much smaller space and write amplification compared to InnoDB
  • Reverse order index (Reverse Column Family). SingleDelete. Prefix bloom filter. Mem-comparable keys when using case sensitive collations. Optimizer statistics for diving into pages.
  • RocksDB is great for scanning forward but ORDER BY DESC queries are slow, hence they use reverse column families to make descending scan a lot faster
  • watch the video