Posts Tagged ‘mysqlconf’

groonga – fulltext search library for cloud & web

This is an incomplete fragment from 2011. I figure it’s worth publishing this now, considering MariaDB is likely to get groonga in the near future. The groonga team have released MariaDB 10.0.6 binaries as well. This is all part of the mroonga project.

These were my quick notes from the groonga talk at the O’Reilly MySQL Conference & Expo 2011. I haven’t tried it yet (and don’t know if it really is faster than Sphinx), but it’s something I definitely want to play with. Maybe even get a MariaDB tree going.

groonga is a fulltext search library for cloud & web.

groonga is easy to embed & is scalable. It is written in C.

Highly precise search for any language. Fast searching and indexing in realtime.

PostgreSQL bindings are also available. Can be used with Spider storage engine. CPU scalable. There is also a Ruby binding.

“100x faster than Sphinx in practical use cases”

groonga components:

  • groonga core – embedded search engine
  • groonga column store – data store for strings, numeric values and geographic values. None of the existing engines were good enough for typical search engine queries. Typical queries hit a large number of records, filter them by multiple conditions (like range queries), then group by specific conditions, order by a dynamic condition, and sometimes output a limited number of records.
  • groonga storage engine – pluggable storage engine for MySQL

Spider can be used for data sharding on top of it. It is not a component of the groonga product, but works well with it to make it a distributed search engine.

Works for unsegmented languages (like CJK), where there is no whitespace between words.

groonga supports a full inverted index (for unsegmented languages). The index is highly compressed (no stop words are needed). They use a Patricia trie lexicon (partial string match on the lexicon). The inverted index is designed to reduce disk I/O.
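To make the idea of a full inverted index over unsegmented text concrete, here is a minimal sketch. It assumes bigram (two-character) tokenisation, which is one common approach for text without word boundaries; the talk notes do not say which tokeniser or on-disk format groonga uses, so the class and method names below are hypothetical and this is not groonga's implementation.

```python
from collections import defaultdict


def bigrams(text):
    """Split text into overlapping two-character tokens (no whitespace needed)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


class BigramIndex:
    """Toy full inverted index: token -> {doc_id: [positions]}. Hypothetical names."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.docs = {}

    def add(self, doc_id, text):
        # Indexing is incremental, so a document is searchable right after add().
        self.docs[doc_id] = text
        for pos, token in enumerate(bigrams(text)):
            self.postings[token].setdefault(doc_id, []).append(pos)

    def search(self, phrase):
        """Return sorted ids of documents containing phrase as a contiguous string."""
        if len(phrase) < 2:
            # Too short for a bigram lookup; fall back to a plain scan.
            return sorted(d for d, t in self.docs.items() if phrase and phrase in t)
        tokens = bigrams(phrase)
        hits = []
        for doc_id, starts in self.postings.get(tokens[0], {}).items():
            for p in starts:
                # The k-th token of the phrase must appear at position p + k.
                if all(p + k in self.postings.get(tok, {}).get(doc_id, ())
                       for k, tok in enumerate(tokens)):
                    hits.append(doc_id)
                    break
        return sorted(hits)


idx = BigramIndex()
idx.add(1, "東京都に住む")
idx.add(2, "京都に行く")
print(idx.search("京都"))    # both documents contain this character pair
print(idx.search("都に行"))  # only document 2 contains the full phrase
```

Because positions are stored alongside document ids, phrase search only needs the postings lists, and no stop-word list or word segmenter is involved.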

Web is growing and searching & indexing must be performed simultaneously.

Tritonn – patched MySQL, MyISAM and groonga

http://www.twistimage.com/

Problems with it?

  1. MyISAM based – table lock (when updating table, read accesses are blocked)
  2. Patch based – patch maintenance and building patched MySQL is messy

New solution? The groonga storage engine. It uses the new column store instead of MyISAM. And it’s no longer a patch: it’s a pluggable storage engine.

https://github.com/mroonga/mroonga

Advantages?

  • table lock free – column store is lock free
  • only accesses columns required – not row-based
  • easy to build now
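As a rough illustration of "only accesses columns required", the sketch below stores each column in its own list and counts how many values are read per column. It is a toy model under that one assumption, with hypothetical names; it does not model groonga's lock-free behaviour or its actual storage format.

```python
class ColumnStore:
    """Toy column-oriented table: one Python list per column (hypothetical design)."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}
        self.reads = {name: 0 for name in columns}  # values touched per column

    def insert(self, row):
        for name, values in self.columns.items():
            values.append(row[name])

    def _read(self, name, row_id):
        self.reads[name] += 1
        return self.columns[name][row_id]

    def query(self, where_col, low, high, order_col, limit):
        """Range filter on one column, order by another (descending), limit the output."""
        n = len(self.columns[where_col])
        matching = [r for r in range(n) if low <= self._read(where_col, r) <= high]
        matching.sort(key=lambda r: self._read(order_col, r), reverse=True)
        return matching[:limit]


store = ColumnStore(["title", "body", "price", "score"])
store.insert({"title": "a", "body": "long text", "price": 5, "score": 0.9})
store.insert({"title": "b", "body": "long text", "price": 20, "score": 0.4})
store.insert({"title": "c", "body": "long text", "price": 30, "score": 0.8})
store.insert({"title": "d", "body": "long text", "price": 45, "score": 0.6})

print(store.query("price", 10, 50, "score", 2))  # row ids of the two best-scoring matches
print(store.reads)  # the "title" and "body" columns were never touched
```

A row-based engine would have to read whole rows, including the large body column, just to evaluate the filter and the sort key.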

Includes some optimisations:

  • count(*) optimisation for queries like SELECT COUNT(*) FROM table WHERE MATCH(col) AGAINST ('query');
  • Also works with ORDER BY score and LIMIT optimisation
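The notes do not say how these optimisations are implemented, so the following is only a sketch of one plausible mechanism: if the full-text index already maps a term to its matching rows and scores, a count can be answered from the index alone, and ORDER BY score with LIMIT can keep just the top few entries instead of sorting every match. The index contents and function names are made up for illustration.

```python
import heapq

# Hypothetical full-text index: term -> {row_id: relevance score}.
index = {
    "mysql": {1: 0.2, 2: 0.9, 3: 0.5, 7: 0.1},
    "groonga": {2: 0.7},
}


def count_matches(index, term):
    """COUNT(*) answered from the postings size, without reading any rows."""
    return len(index.get(term, {}))


def top_matches(index, term, limit):
    """ORDER BY score DESC LIMIT n using a bounded heap rather than a full sort."""
    postings = index.get(term, {})
    return heapq.nlargest(limit, postings.items(), key=lambda item: item[1])


print(count_matches(index, "mysql"))   # 4
print(top_matches(index, "mysql", 2))  # [(2, 0.9), (3, 0.5)]
```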

The groonga storage engine has fast phrase search and fast (realtime) index updates, and inserting records doesn’t block reading records.

Spider is a storage engine for transparent database sharding.

Benefits of Spider + Groonga:

  • optimisation of FTS with sorting by score
  • optimisation for sorting by the range partition key column
  • optimisation of FTS with filtering by the partition key column
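A minimal sketch of why sharding by a range partition key helps these queries: a filter on the key lets the router skip shards entirely, and a score-sorted search only needs each shard's local top results before merging. This is a toy model with hypothetical names and a made-up score (term frequency); it is not Spider's actual protocol.

```python
import bisect
import heapq


class RangeShardedIndex:
    """Toy range-partitioned search: shard i holds keys below upper_bounds[i]."""

    def __init__(self, upper_bounds):
        # Exclusive upper bound per shard; one extra shard takes everything above.
        self.bounds = sorted(upper_bounds)
        self.shards = [[] for _ in range(len(self.bounds) + 1)]

    def shard_for(self, key):
        return bisect.bisect_right(self.bounds, key)

    def insert(self, key, text):
        self.shards[self.shard_for(key)].append((key, text))

    def search(self, term, limit, key_range=None):
        """Return up to `limit` (score, key) pairs, best score first."""
        if key_range is None:
            targets = range(len(self.shards))
            in_range = lambda key: True
        else:
            low, high = key_range
            # Partition pruning: only shards overlapping [low, high] are queried.
            targets = range(self.shard_for(low), self.shard_for(high) + 1)
            in_range = lambda key: low <= key <= high
        partials = []
        for s in targets:
            hits = [(text.count(term), key) for key, text in self.shards[s]
                    if term in text and in_range(key)]
            # Each shard returns only its own top `limit` results.
            partials.append(heapq.nlargest(limit, hits))
        return heapq.nlargest(limit, (h for part in partials for h in part))


idx = RangeShardedIndex([100, 200])
idx.insert(10, "mysql mysql groonga")
idx.insert(150, "mysql spider")
idx.insert(250, "mysql mysql mysql")
print(idx.search("mysql", 2))                        # [(3, 250), (2, 10)]
print(idx.search("mysql", 5, key_range=(100, 199)))  # [(1, 150)]
```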

groonga.org – they are all based on MySQL 5.5 (packages available)

Contact Team Groonga: bit.ly/fSs5vx

 

The SkySQL Reference Architecture

I have a bunch of notes from the O’Reilly MySQL Conference & Expo 2011, and I figure it’s about time I started blogging them. These are notes from the panel on the SkySQL Reference Architecture, led by Kaj Arno and Ivan Zoratti. The notes are raw (read their FAQ for more), and I talk a little bit about the SkySQL Configurator at the end (a tool I immediately used, and submitted some bugs/improvements for – 7 at last count, which I hear got fixed in the 0.02 release, which got pushed last night!).

There were 7 panelists. The MySQL world needs:

  • technical support
  • monitoring & administration tools
  • simplified interfaces
  • development & user tools
  • consulting & training

Services & consulting generally are difficult to scale.

The most comprehensive architecture around MySQL: scalable, adaptable and cloud ready.

Implementation:

  • select and test specific components
  • integrate components
  • provision the components in a simple interface
  • simplify monitoring & administration
  • technical services & support
  • validate solutions
  • improvements and new releases can be done
  • knowledge sharing related to the reference architecture

Technologies selected from Webyog, Sphinx, Drizzle, Monty Program, Calpont, Tokutek, ScaleDB, Schooner, Linbit, Zimory, Canonical.

SkySQL Provisioning tools:

  • SkySQL Manager – control and administer the SkySQL/MySQL environment
  • SkySQL Configurator – configure and update SkySQL reference architecture modules
  • SkySQL Tuner – analyse the configuration and prepare the packages

I did a test, and it seemed like I got binaries built in under 5 minutes. Custom configurations with a stock build. You get a 70MB binary. Hosted at http://www.enovance.com/. A lot of people never configure their my.cnf, so I think having a GUI on the web might be a good idea to help people have sensible defaults.

lovegood:skysql byte$ ls
total 143352
drwxr-xr-x    3 byte  staff       102 14 Apr 06:13 ./
drwx------@ 598 byte  staff     20332 14 Apr 06:13 ../
-rw-r--r--@   1 byte  staff  73395132 14 Apr 06:12 SkySQL-mariadb-poboffcfrm5bi054559q8iea74.tar.gz

lovegood:skysql byte$ tar -zxvpf SkySQL-mariadb-poboffcfrm5bi054559q8iea74.tar.gz
x etc/
x etc/my.cnf
x install
x packages/
x packages/xtrabackup-1.4-74.rhel5.x86_64.rpm
x packages/MySQL-client-5.5.10-1.rhel5.x86_64.rpm
x packages/MySQL-server-5.5.10-1.rhel5.x86_64.rpm

SkySQL is also going to have a customer advisory board, and they are starting it this week. (I don’t know any further details about this as of yet.)

The SkySQL Configurator can only get better. I expect it will do custom packages including things like Sphinx/SphinxSE, Drizzle, and other things in due time.

MySQL Conference Early Bird ends 31/03/2011

If you’ve been busy and haven’t registered yet, remember that early-bird pricing ends on 31/03/2011. From April 1-10, you’ll have to pay USD$100 more. A discount code for use (I think you save 20-25%): mys11fsd.

We’re full up in terms of the schedule. People are still asking for an opportunity to speak, and there are still opportunities in the Products & Services track. Please contact Yvonne Romaine at yromaine@oreilly.com for more information on this.

Might I also suggest that if you want to speak and there’s no longer an opportunity, you submit a five-minute talk for the Ignite MySQL event. Even though submissions are now closed, contact Brian Aker — he’ll try and help make some magic happen for you.

Don’t forget you can also lead a Birds of a Feather (BoF) session. While it is not a talk, you can still gather like-minded folk and talk about things over pizza & beer (which has always been a popular combination in previous years).

If you’re looking for a new job, don’t forget the Career Zone. There are some great companies participating, so that’s another good reason to come.

Conferences are all about networking. While not enabled by default, I suggest you manually go and turn on access to the Attendee Directory, so you can write messages to people you want to meet, have chats with, and so on.

Some keynote updates about The O’Reilly MySQL Conference & Expo 2011

A quick update on a few keynotes that the O’Reilly MySQL Conference & Expo 2011 managed to recently close:

  • The opening keynote, The State of the Dolphin, given by none other than Tomas Ulin, who is currently the VP of the MySQL Engineering team at Oracle. I am told that this is not just a “what’s new” and “what’s coming up”, as there will also be a Q&A session with an analyst, customer, and Tomas. You must not miss this on Tuesday morning at 9am, 12th April 2011.
  • On Thursday at 9.30am, we have The Next Decade in Data Management, a keynote given by Mike Olson, CEO of Cloudera. More and more I see people using Hadoop/HBase alongside their MySQL installs, so I think this talk is a must-see.

Early bird registration ends March 15 2011. What are you waiting for? Procrastination will cost you!

Don’t forget to follow the conference via social media: Facebook, Twitter.

O’Reilly MySQL Conference Awards 2010

The O’Reilly MySQL Conference & Expo 2010 is over. I hope all of you had a good time. I have plenty of blog posts and thoughts lined up about this, but first, I’d like to point out something that has become a tradition, that was continued in 2010: the O’Reilly MySQL Conference Community Award Winners.

Conference award winners

Tim O’Reilly was kind enough to hand out the awards this year. In case people were wondering, the awards were pewter wine goblets from Royal Selangor.

Selection of the award winners happened via voting from the alumni of winners, and was all done in a rather short period of time. Kudos to the entire team that voted. Now for the winners…

O’Reilly MySQL Community Member of the Year 2010

  1. Mark Callaghan is known for his work in leading a MySQL engineering team first at Google, and now at Facebook. In addition, the panel appreciated his insightful and always tasteful blogging, ranging from insightful benchmark reports to open source community advocacy.
  2. Kai ‘Oswald’ Seidler is a developer of XAMPP, a multi-platform LAMP stack, especially popular amongst Microsoft Windows users. Many users get their first contact with the AMP (Apache-MySQL-PHP) platform using XAMPP!
  3. Daniel Nichter created the Hack MySQL Kit, hacks on Maatkit and heaps of other software. He’s also a fabulous MySQL DBA.

O’Reilly MySQL Application of the Year 2010
Twitter was unanimously voted to be the application of the year in 2010.

Panellist Marc Delisle described his use of Twitter recently:

“Seven weeks ago I was in Niamey, Niger during the coup d’état. While borders and the airport were closed and a tank was patrolling on my street, I took refuge at the Canadian embassy where Twitter users updated me on the situation, almost minute by minute.”

O’Reilly MySQL Corporate Sponsor of the year 2010

  1. Rackspace received the award for hiring many of the core Drizzle developers, enabling them to work full-time on the MySQL fork. Rackspace also contributes to open source projects like MariaDB, Drizzle and more, providing hosting.
  2. Percona has over the last years hired many valuable MySQL contributors, and has a lot of consultants and developers extending MySQL and tools around it. Percona’s team blog on MySQL performance is also highly regarded within the community.

Another picture from the excellent James Duncan Davidson:


annual MySQL awards

O’Reilly MySQL Conference & Expo 2010

It is my pleasure to be your Program Chair for the O’Reilly MySQL Conference & Expo 2010, to be held April 12-15 2010, in Santa Clara, California.

It is, of course, not something I embark on alone. I have a program committee comprising some amazing folk: Brian Aker, Kaj Arno, Roland Bouman, Sheeri K. Cabral, Robin Schumacher, Baron Schwartz, and Jeff Wiss.

I highly encourage you to submit a proposal. You have till January 27, 2010, which basically means less than a month, so get cracking! I also highly recommend that you register as an attendee.

I’ll talk more about the processes, et al, in a later blog post, but I want to ensure that in 2010, we are going to be completely open and transparent in our decision making process. And I want you, the MySQL community, to participate. Watch this space for more details.

And again, it’s a great honour being your Program Chair for the conference in 2010. I expect it to be a blast.

