Upcoming MariaDB 10.0.7 will have more engines – mroonga, OQGRAPH

Recently, MariaDB 10 has been gaining many new storage engines. We’ve seen TokuDB, CONNECT, SEQUENCE, SPIDER, and CassandraSE for various use cases. MariaDB shipped OQGRAPH for a long time, but it was disabled in MariaDB 5.5. It will make a comeback, as OQGRAPH v3 has been actively worked on by Andrew McDonnell. Keep track of this via MDEV-5319.

Another engine being worked on by Kentoku Shiba & team is the mroonga engine, which allows you to do full text search. It is optimised for CJK languages, and is supposedly very fast. To track this, follow MDEV-5222.

What this means is that, since the start of the MariaDB project, the only engine that we have disabled and no longer include from 5.5 onwards is PBXT. That’s a pretty good record of shipping many storage engines that have largely come from the community.

MariaDB-related links in November 2013

Another month has come to an end. If you’re looking to be updated on MariaDB content on a regular basis, don’t forget to be on Twitter (@mariadb), Facebook (MariaDB.dbms), or Google Plus (+mariadb).

There was a question on Quora – Is Facebook considering ditching MySQL in favor of MariaDB like Google did? The best answer really comes from Harrison Fisk, so I’ll leave you to read it. An older link, Wikipedia_$ mv MySQL MariaDB, also made its way around social media.

MariaDB 10.0 went into beta (with the 10.0.5 release). We made a 10.0.6 release shortly afterwards to fix some bugs. One cool thing to note: the blog post from Ian Gulliver at Google about how Google is making use of MariaDB today.

The MariaDB Audit plugin is now GA – yes, you have to register to download it, but it’s worth it. There is also a webinar on this on Dec 5, which may be worth attending.

There is a new book out by Daniel Bartholomew: Getting Started with MariaDB. I fully intend to read & review it soon (you can also get this from O’Reilly’s Safari Bookshelf).

Navicat has announced Navicat for MariaDB for all your GUI needs on Windows, Mac or Linux. There is a free trial, and it is sold at various prices in non-commercial, standard and enterprise editions.

The MariaDB Enterprise Beta program has started. I signed up for the beta myself to give it a spin. From what I gather, most people who signed up qualified to give it a go. It is likely to go GA in mid-December. It is open source software. Look at the getting started guide for more.

And in case you didn’t already notice, the Knowledge Base has had a redesign. There are currently 3,165 articles in English, licensed under the CC-BY-SA and GNU GFDL.

The rise of VC

Via: Inside the mind of Marc Andreessen – Fortune Management: “I never heard the term ‘venture capital’ until I got to California. I got a job and landed in Silicon Valley, and I found out about this venture capital thing. And I was dumbstruck. ‘You mean there are people who will give you money to invent new things and start a company? Really? Seriously? It’s like wow! That’s really cool!’  And of course we got lucky.”

That was 20 years ago. There was a lot less media coverage of VC, deals, angels, etc.

An interesting exercise would be to see when VC firms in Asia started. Is it all pushed by the democratization of media?

Today, media is cheaper: everyone and his uncle has started a publication of some sort. In tech, it seems that most of the media will only cover VC-related (i.e. money-driven) stories. They’ve forgotten real tech.

MAVCAP, the largest VC fund in Malaysia, only started in 2001. A mere 12 years ago! Singapore’s first firm started in 1984 – Seavi Advent Private Equity (29 years ago) – though I’m not sure if they deal with tech much.

So is this the rise of VC/angels/incubators/etc. or the rise of media?

groonga – fulltext search library for cloud & web

This is an incomplete fragment from 2011. I figure it’s worth publishing now, considering MariaDB is likely to get groonga in the near future. The groonga team have released MariaDB 10.0.6 binaries as well. This is all part of the mroonga project.

These were my quick notes from the groonga talk at the O’Reilly MySQL Conference & Expo 2011. I haven’t tried it yet (and don’t know if it really is faster than Sphinx), but it’s something I definitely want to play with. Maybe even get a MariaDB tree going.

groonga is a fulltext search library for cloud & web.

groonga is easy to embed & is scalable. It is written in C.

Highly precise search for any language. Fast searching and indexing in realtime.

PostgreSQL bindings are also available. Can be used with Spider storage engine. CPU scalable. There is also a Ruby binding.

“100x faster than Sphinx in practical use cases”

groonga components:

  • groonga core – embedded search engine
  • groonga column store – data store for strings, numeric values and geographic values. None of the existing engines were good enough for typical search engine queries. Typical queries hit a large number of records, filter by multiple conditions (like range queries), then group by specific conditions, order by a dynamic condition, and sometimes output a limited number of records.
  • groonga storage engine – pluggable storage engine for MySQL

Spider can be used for data sharding on top of it. It is not a component of the groonga product, but works well with it to make it a distributed search engine.

Works for unsegmented languages (like CJK). No whitespaces in CJK.

groonga supports full inverted index (for unsegmented languages). Highly compressed index (no stop words are needed). They use Patricia TRIE lexicon (partial string match on lexicon). Inverted index is designed to reduce disk I/O.
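To make the unsegmented-language point concrete for myself, here is a minimal sketch of a positional inverted index built on character bigrams. This is only my illustration of the general technique, not groonga’s actual data structures (no Patricia trie, no compression), and the names `ngrams`, `build_index` and `phrase_search` are made up for this example.

```python
from collections import defaultdict


def ngrams(text, n=2):
    """Split text into overlapping character n-grams; no whitespace is needed."""
    if len(text) < n:
        return [text] if text else []
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_index(docs, n=2):
    """Map each n-gram to {doc_id: [positions]}, so phrase matching is possible."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, gram in enumerate(ngrams(text, n)):
            index[gram][doc_id].append(pos)
    return index


def phrase_search(index, query, n=2):
    """Return sorted doc ids where the query's n-grams appear at consecutive positions."""
    grams = ngrams(query, n)
    if not grams:
        return []
    hits = []
    for doc_id, starts in index.get(grams[0], {}).items():
        for start in starts:
            # Every following n-gram must sit exactly k positions after the first.
            if all(start + k in index.get(g, {}).get(doc_id, ())
                   for k, g in enumerate(grams)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Hypothetical sample documents, purely for illustration.
docs = {1: "東京都に住む", 2: "京都に行く", 3: "東京タワー"}
index = build_index(docs)
```

With these sample documents, `phrase_search(index, "東京")` finds documents 1 and 3, and `phrase_search(index, "都に行")` finds only document 2. Note that `phrase_search(index, "京都")` also matches document 1 (東京都), which is the usual trade-off of plain n-gram matching. A query shorter than `n` characters finds nothing in this sketch.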

Web is growing and searching & indexing must be performed simultaneously.

Tritonn – patched MySQL, MyISAM and groonga

http://www.twistimage.com/

Problems with it?

  1. MyISAM based – table lock (when updating table, read accesses are blocked)
  2. Patch based – patch maintenance and building patched MySQL is messy

New solution? The groonga storage engine. It uses the new column store instead of MyISAM. And it’s no longer a patch: it’s a pluggable storage engine.

https://github.com/mroonga/mroonga

Advantages?

  • table lock free – column store is lock free
  • only accesses columns required – not row-based
  • easy to build now
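A toy way to picture the “only accesses columns required” advantage: if each column is stored as its own array, a filter-then-group query only needs to walk the two columns it mentions. The sketch below is my own illustration with made-up data and a hypothetical `count_by` helper; it says nothing about how groonga actually lays out its column store.

```python
# Each column is its own list; row i is the i-th element of every list.
# The table and its contents are hypothetical sample data.
table = {
    "id": [1, 2, 3, 4],
    "lang": ["ja", "en", "ja", "zh"],
    "views": [120, 40, 300, 75],
    "body": ["...", "...", "...", "..."],  # never read by the query below
}


def count_by(table, group_col, filter_col, lo, hi):
    """Count rows per group where lo <= filter_col <= hi, reading only two columns."""
    counts = {}
    for group, value in zip(table[group_col], table[filter_col]):
        if lo <= value <= hi:
            counts[group] = counts.get(group, 0) + 1
    return counts


result = count_by(table, "lang", "views", 50, 500)
```

Here `result` counts two matching rows for "ja" and one for "zh", and the wide `body` column is never read.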

Includes some optimisations:

  • count(*) optimisation for queries like SELECT COUNT(*) FROM table WHERE MATCH(col) AGAINST ('query');
  • Works also with ORDER BY score and LIMIT optimisation
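My rough mental model of why these two optimisations can be cheap, as a sketch: if the full-text index already holds the matching document ids (and a score for each), a count is just the size of that set, and ORDER BY score with a LIMIT is a top-k selection rather than a full sort. This is an assumption about the general idea, not mroonga’s implementation, and the `postings` layout and function names are hypothetical.

```python
import heapq

# Hypothetical postings: term -> {doc_id: relevance score}.
postings = {
    "mariadb": {1: 0.4, 2: 0.9, 3: 0.7, 4: 0.1},
}


def match_count(postings, term):
    """COUNT(*) for a match: the size of the posting list, no rows fetched."""
    return len(postings.get(term, {}))


def top_k(postings, term, k):
    """ORDER BY score DESC LIMIT k: pick the k best hits without sorting them all."""
    scored = postings.get(term, {})
    return heapq.nlargest(k, scored.items(), key=lambda item: item[1])
```

For the sample data, `match_count(postings, "mariadb")` is 4 and `top_k(postings, "mariadb", 2)` returns documents 2 and 3 with their scores.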

The groonga storage engine has fast phrase search and fast (realtime) index updates, and inserting records doesn’t block reading records.

Spider is a storage engine for transparent database sharding.

Benefits of Spider + Groonga:

  • optimisation of fts with sorting by score
  • optimisation for the sorting by range partition key column
  • optimisation of fts with filtering by partition key column

groonga.org – packages are available, all based on MySQL 5.5

Contact Team Groonga: bit.ly/fSs5vx

 

Online business models around content

We’ve taken a scarce resource and made it infinite, an idea by Adam Curry. In the print world, you had n-number of ads. In the online world, you can place any number of banners on your site, and there is a multitude of sites serving such banners. This is why it’s hard to generate revenue.

Robert Scoble brings up a great point on TWiT#423 – tech journalism is everywhere. It’s hard to make good quality tech content because it costs money. A good article on how an iPhone is made requires a trip to China to visit the factory floor, and can easily cost $10,000. However, when you write a (blog) post, what can you make on viewership in terms of CPM? $5? $15? It is a tiny amount, which is why many tech journalists/bloggers end up repackaging press releases.
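To put rough numbers on that, here is a back-of-the-envelope calculation. It assumes CPM means dollars earned per 1,000 page views and that a single ad is shown per view; both are my simplifications, and `views_to_break_even` is just a name I made up.

```python
def views_to_break_even(cost_usd, cpm_usd):
    """Page views needed for ad revenue at a given CPM to cover a cost."""
    return cost_usd / cpm_usd * 1000


for cpm in (5, 15):
    views = views_to_break_even(10_000, cpm)
    print(f"${cpm} CPM: {views:,.0f} views to cover a $10,000 story")
```

Under those assumptions, the $10,000 factory-floor story needs 2,000,000 views at a $5 CPM, or roughly 667,000 views at $15, before it pays for itself.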

Worlds that haven’t been touched negatively seem to be fashion & cars. But tech is clearly affected.

Evan Williams gives us tools to express ourselves – Blogger, Twitter, Medium. He’s made tonnes of money as people have been willing to create content for free. How do journalists make a living? Leo Laporte suggests that the Internet has arrived – you figure out how to make a living.

The problem is people doing it for free. It devalues the work of people doing it for bucks.

What about the CPM for location based ads? Today you get so many apps that give away the location, with users opting in. 

Attitude matters

I recently read a plea by a fledgling entrepreneur trying to build a global company, who’s been through a bunch of startup competitions, “But struggling on getting grants or investments from local VCs/Angel.”

I recently saw a customer get annoyed with a service provider she had been using for a long time, and then try to rally people around a “hate page”. However, it never garnered much attention, as there was an odd flair to the way she wrote.

In business or in any inter-personal relationship, attitude matters.

You may have the best private security system out there, but if you have a shoddy attitude, you will get no users. You may have the best cause out there, but if you have a shoddy attitude, you will not get followers. You may have the best piece of software out there since sliced bread, but if you take on an aggressive attitude, you may not get as many users as you had hoped for.

Think about how you portray yourself to others, and if need be, improve your attitude. People are a forgiving and forgetful lot.

