Archive for the ‘Ruby’ Category

Rails… Fails… (sticker)

Tuesday, September 23rd, 2008

I first saw this interesting Rails logo in a talk by Terry Chay at OSCON, a few months ago.


Jay Pipes

Now, my esteemed colleague Jay Pipes has it on his laptop. It seems they’re making stickers, even.

Otherwise, my next task is to revamp our Ruby content. Currently, it looks a little sad. It has to at least be as good as Using MySQL With Ruby, no?

Selenium at the MyOSS meetup

Friday, April 4th, 2008

I was at yesterday’s MyOSS meetup, where the topic was the Selenium Web Testing Framework, presented by Yuen-Chi Lian. Here are some notes; hopefully the slides and code make it up to the website soon.

- Java guy, who is a MyJUG guy
- Employed by CustomWare Asia Pacific, and is experienced in JIRA, Confluence, and Mule (are they an Atlassian reseller?)
- He started web development using PHP, and didn’t write unit tests then. When he returned to web development 2 years ago, he found that the JIRA guys used Selenium to test their web UI. He started using Selenium last week :)
- A common web development flow: analysis, design, development, then testing… Unit tests, integration tests, and acceptance tests using Selenium
- For web UI testing, Selenium lets you invoke a JavaScript method directly, rather than clicking a button to trigger it… These tests can be recorded and scripted. You can run them in a simulated browser environment, or use a real user agent
- He hasn’t tested Sahi yet, but he thinks it’s better than Selenium, based on the blogs that he’s read
- Selenium can be integrated with Continuous Integration (CI)
- Lots of examples using Java and Ruby
- Imagine doing FOSS development, on the Windows platform. It actually looks scary…

Overall, a rather basic talk, with a highly motivated/dedicated speaker. This is the second talk I’ve attended on Selenium (the first was last year at the Ruby Conf), and now it’s pretty cemented in my head that I’ve got to make use of it the next time I write a web front-end. Oh, it has great Firefox integration, with the Web Developer Tools plugin…

After that, it was off to Pelita for dinner. This ended up becoming supper, and there was lots of chatter… Drive home was eventful - on the way to Puchong, to send KageSenshi back, got pulled over twice - once for a license inspection, once to find out where we were headed. Odd. This has never happened before.


Hitchhacker’s Guide to the MySQL Conference - Ruby/Ruby on Rails Edition

Monday, March 31st, 2008

The Hitchhacker’s Guide to the MySQL Conference
Ruby/Ruby on Rails Developer’s Edition
Following on from the excellent Hitchhacker’s Guide to the MySQL Conference - PHP Edition by Jay Pipes, I’m doing one for the Ruby and Ruby on Rails developer in you. If you haven’t registered yet, remember that I can provide you a 20% discount code, so make sure you get it by emailing me. Tutorials are selling out fast.

The Keynotes
A Head in the Cloud - The Power of Infrastructure as a Service is a keynote delivered by Werner Vogels, CTO of Amazon.com. Why would the typical Rubyist be interested in this? Because many sites end up using Amazon Web Services to scale. One example is Twitter, which stores a lot of data on AWS, and we all know it’s written using Rails.

While the other keynotes aren’t Rails specific, they are a great start to the day, and interesting announcements tend to be made before or after a keynote, so wake up early! Yes, even if you were at a BoF till 2am…

Tutorial Day (Monday, 14 April 2008) - schedule
My recommendations would include going to the MySQL Replication Tutorial by Lars Thalmann and Mats Kindahl in the morning, as this is a great way to ensure your Rails application scales. This can be followed by either Memcached and MySQL: Everything You Need to Know by Brian Aker and Alan Kasindorf (dormando, if you’re on IRC) or Real World Web: Performance & Scalability MySQL Edition by Ask Hansen, but I’m told that it’s completely sold out! If you’re interested in something to spend the entire day in, I can highly recommend either Giuseppe Maxia and Jan Kneschke’s talk on MySQL Proxy: The Complete Tutorial, or Stewart Smith’s MySQL Cluster Tutorial. Keep in mind that Stewart’s tutorial is going to be quite hands on.

Sessions (Tuesday, 15 April 2008) - schedule
After the keynotes (remember to be there!), I would personally recommend that you attend Lessons Learned in Building a Highly Scalable MySQL Database, presented by the folk behind The Hive. Alternatively, visit Mike Zinner’s talk about MySQL Workbench: The Ultimate Database Design Tool for Developers - the ease of creating and designing databases makes it feel very “Rails-like”.

At 11:55am, your choices are wide and varied, and choosing one talk is going to be hard. Consider paying a visit to Practical MySQL for Web Applications by Domas Mituzas. Not only is he a support engineer at MySQL, he also engineers the back-end for Wikipedia; sure, it’s not a Rails-based application, but I can assure you, the commonality between web applications will ensure this is an interesting talk. If your application makes use of full-text search, consider going to Full-Text Search with MySQL 5.1: New Features & How To, by Alexander Rubin. If you’re writing a new application, it probably makes sense to start using 5.1. Replication for Dummies by Pat Galbraith should also be an interesting talk - if you missed the tutorial the day before, definitely go for this; if you attended the tutorial, you’ll be reinforcing your knowledge, and might make use of good Q&A time as well.

After the hearty lunch, a must-see would be Big Bird (Scaling Twitter) by Blaine Cook. Twitter is a popular micro-blogging tool used by many, and they’ve worked on improving how the application performs and on scaling it to heights much beyond simple ActiveRecord usage. Be there.

At 3:05pm, Monty Taylor’s High Availability Landscape of MySQL should be the one you’re at. Monty is very conversant with many a language as well, as part of his work on the NDB connectors, but here, you’ll be learning about replication, DRBD, and possibly Cluster/NDB. Anyone know of a Rails application using MySQL Cluster/NDB?

At 4:25pm, definitely go to the talk by Jeremy McAnally, titled: Talk = Ruby + MySql.new(bie): An Introduction to Using MySQL with Ruby. Jeremy’s the author of books and articles on Ruby, Rails, and MySQL, and granted the talk is targeted at beginners, no matter what level you’re at, you will probably learn something new.

By 5:15pm, your energy levels might slowly be sapped, but persevere, and you’ll be rewarded. While none of the talks are Ruby/Rails specific, I’d recommend attending either Backup and Recovery Basics by Kai Voigt (trainer at MySQL, will be an interesting talk as you will have to backup, and recover your database at some stage) or Services Oriented Architecture with PHP and MySQL by Joe Stump (learn from Digg, and apply your knowledge to Ruby/Rails).

Sessions (Wednesday, 16 April 2008) - schedule
Begin the morning having a hearty breakfast, and moving on to the keynotes. After which, if you’re into GIS and spatial extensions, the talk you must be at is the one offered by Seth Fitzsimmons, from Yahoo!, titled Using MySQL’s Spatial Extensions with Rails. If however you aren’t a GIS junkie, consider either Giuseppe Maxia’s talk on MySQL Sandbox, or Roland Bouman’s talk on Information Schema and its applications.

At 11:55am, I can recommend either Falcon from the Beginning by Jim Starkey & Ann Harrison, or Tom Daly’s talk on Web Workloads for Comparing, Testing and Tuning MySQL Performance, SPECjAppServer2004, EAStress and Faban. It’s definitely a tough sell, with Applied Partitioning and Scaling Your Database System by Phil Hildebrand on at the same time, and if you’re after contributing to MySQL and being part of our community, don’t hesitate to come see me talk about Paying It Forward: Harnessing the MySQL Contributory Resources.

After lunch, at 2:00pm, if you’re into database normalization, consider How to be Normal, a Guide for Developers by Mike Hillyer. However, the two talks I’d personally be at (if only I could split myself) would be Monty’s talk (the father of MySQL!), titled Architecture of Maria: A New Storage Engine with a Transactional Design, and Jay & Tobias’s talk, titled MySQL Performance Under a Microscope: The Tobias and Jay Show. Did I mention that nowadays it only makes sense to write code and web applications with great character set support? Domas Mituzas will be giving you an interesting session on Practical Character Sets. With so many conflicting talks at 2pm, all I can hope is that there are some excellent blog entries, and folk end up taking videos! (And no, I checked - no way to split myself 4 ways, unlike today’s modern processors).

At 3:05pm, take a break, and enjoy yourself at Astronomy, Petabytes and MySQL by Kian-Tat Lim. If you’re still fuelled, go take a gander at Markus & Dups’ talk, titled Integration of Frameworks for Rapid Web Development (sure, it might be PHP-centric, but you can learn). One thing Rails developers will want to know, though, is how to make sure you’re using the query cache properly - Baron Schwartz will tell you in The MySQL Query Cache.

At 4:25pm, I can highly recommend going to Grazr: Lessons Learned Building a Web 2.0 Application Using MySQL by Patrick Galbraith and Michael Kowalchik, because even if you don’t find much Rails-centric content, the lessons learned and what worked/didn’t, will be useful to all. However, if you’re interested in storage engines, where Falcon is really the “Web 2.0 enhanced” storage engine for MySQL, consider Falcon for InnoDB users by Kevin Lewis and Ann Harrison.

At 5:15pm, it’s a no-brainer that you should be at ActiveRecord Under the Microscope by Jeremy McAnally. There are times to use it, and there are times to avoid it - Jeremy will tell you when, in general.

Sessions (Thursday, 17 April 2008) - schedule
After the fun from last night (quiz show, et al), make sure you’re at the keynotes yet again…

At 10:50am, I’m sure the realisation that this is the last day of this great conference is going to hit. Never fear, you better savour it. Assuming you’re deploying your application on software made by our new corporate overlords, consider visiting Frank Mash’s talk, titled: Optimizing MySQL and InnoDB on Solaris 10 for World’s Largest Photo Blogging Community. That’s Fotolog for what it’s worth. There are a whole bunch of scaling related talks that are sure to be interesting, but let’s try to focus!

At 11:55am, DTrace and MySQL by Ben Rockwood from Joyent might be interesting to ensure you get a performant database. However, seeing that the hooks themselves aren’t mainstream (yet), consider Sheeri Kritzer Cabral’s talk titled: Database Security Using White-Hat Google Hacking or, if you’re wondering what Google does, visit Mark Callaghan, and listen to him speak about Helping InnoDB Scale on Servers with Many CPU Cores and Disks.

At 2:00pm, Ronald Bradford will share the Top 20 Database Design Tips Every Architect Needs to Know, and it would be my definite pick.

At 2:50pm, the last session of this great conference before the closing keynote, I’ll recommend High Availability MySQL with DRBD and Heartbeat: MTV Japan Mobile Services by Patrick Bolduan (whom I met in Japan last year, great guy, and the talk should be really interesting). Alternatively, consider Deadly Sins Using MySQL & PHP by Arjen Lentz and Jonathon Coombes. While centred around PHP, it should impart general knowledge of what counts as a deadly sin, which is useful if you aren’t using ActiveRecord and are querying things the wrong way from your Rails application.

And that my friends, is me signing off, and telling you what a jam-packed week this is going to be.

Register Now!
There are a ton of options for the discerning Ruby or Ruby on Rails developer. If you’ve not registered yet, please do so now! If you’re averse to paying full price, I (and any other speaker) have a 20% discount code, so drop me an email at colin AT mysql DOT com.


Ruby Gems, Mono System.Windows.Forms on Ubuntu

Friday, August 10th, 2007

I’ve recently started doing more development locally on my Ubuntu (Feisty Fawn) laptop (as opposed to being logged in via ssh to various machines, generally running Fedora), and have noticed some quick snags.

Ruby Gems
They’re currently installed in /var/lib/gems/1.8, and the executables they ship land in /var/lib/gems/1.8/bin, which is not in your PATH. So if, for example, you use cheat, your shell is not going to find it. Fix it by adding /var/lib/gems/1.8/bin to your PATH (my .bashrc has it looking like: PATH=$PATH:$HOME/bin:/var/lib/gems/1.8/bin)
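
To confirm the change took effect, PATH can be checked from Ruby itself. This is only a minimal sketch: on_path? is a hypothetical helper name, and the directory is the one from the .bashrc line above.

```ruby
# Hypothetical helper: true if dir appears as an entry in a PATH-style string.
# Trailing slashes are ignored so "/a/bin/" and "/a/bin" compare equal.
def on_path?(dir, path = ENV.fetch("PATH", ""))
  wanted = dir.chomp("/")
  path.split(File::PATH_SEPARATOR).any? { |entry| entry.chomp("/") == wanted }
end

gem_bin = "/var/lib/gems/1.8/bin" # the directory named in this post
if on_path?(gem_bin)
  puts "#{gem_bin} is on PATH"
else
  puts "#{gem_bin} is not on PATH; gem executables such as cheat will not be found"
end
```

Run it in a fresh shell after editing .bashrc, since an already-open shell keeps its old PATH.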

Mono, and System.Windows.Forms
I have no problems with Mono and .NET related applications, normally. I run Tomboy (which I like, a lot), I can fire up f-spot, and when I need it, Beagle runs fine too. But of late, I’ve had to run an application that required System.Windows.Forms, aka WinForms. Little did I know I’d need to install the winforms stuff, so a sudo apt-get install libmono-winforms* fixed this for me.

This still hasn’t made my required application run properly, but I’m now a step closer to finding out compatibility with Windows-based .NET applications and Mono. All thanks to the useful Mono Migration Analyzer (MoMA). Hat tip to Ditesh for pointing me to MoMA.


Pimping my friends: an ODF e-Note and haze.net

Thursday, August 2nd, 2007

A couple of my good friends have had some recent achievements that I clearly should help them blow their trumpets for.

First up, we have Ditesh, an active proponent of ODF, who has had a little e-Note published on Electronic Document Standards. I got to read it back when it was in an ODF document (*grin*), and not much has changed since all the comments were pushed. Do read it, and consider giving it to upper management to read as well. It’s a very well thought out document, and should be making its rounds on the Internet soon enough. Ditesh welcomes comments via email or his blog entry.

Incidentally, this is also one of the first notes that the UNDP/APDIP have published that carry a disclaimer - “The views expressed in this APDIP e-Note are those of the author and do not necessarily represent those of the United Nations, including UNDP, or their Member States.” I thought that was a little soft-cock, but this is the power of lobbying I guess.

Next up, we have Aizat creating haze.net.my, aka the Malaysian Air Pollution Index. Yes, do laugh out loud - Malaysia is very well polluted, the API readings are usually pretty high, and the government of the day always insists it’s still safe. Aizat built it using Ruby on Rails, and there’s some active scraping of data (via hpricot), which then all mashes up with Google Maps. The site’s well designed (i.e. it’s simple), there’s an RSS feed if you’re so inclined to read details that way, and if you’re just interested in a certain area (say, Kuala Lumpur), you can dig deeper, and look at the graphs (via Gruff Graphs) of when it started becoming unhealthy and so on. Exporting it to CSV works too, in case you were using it for a project/paper on the haze.

All in all, a good side-project, very informative for those living in Malaysia or visiting Malaysia. Don’t see a good income stream (ads? pfft.), but definitely very informative. Maybe sell it to a ministry :-)


Scaling Twitter: “Is Twitter UDP or TCP? It’s definitely UDP.”

Monday, April 23rd, 2007

Presented by Blaine Cook, a developer from Odeo, now probably CTO of Twitter (Obvious Corp spawned, I think). There’s a video and slides (yes, you need evil Flash so I haven’t viewed it myself). Then there are my notes… possibly with some thoughts attached to them. No, they’re not organized, I’m too busy and tired…

Rails scales, but not out of the box. Left untuned, it would cause Twitter to stop working very quickly.

600 requests/second, 180 rails instances (mongrel), 1 DB server (MySQL) + 1 slave (read only slave, for statistics purposes), 30-odd processes for misc. jobs, 8 Sun X4100s.

Uncached requests complete in less than 200ms most of the time.

steps:
1. realize your site is slow
2. optimize the database
3. “Cache the hell out of everything”
4. scale messaging
5. deal with abuse
6. profit.

Have stats (something Twitter didn’t have before): munin, nagios, awstats/google analytics (the latter obviously doesn’t work if your site itself doesn’t load), exception notifier/logger (exception logger is what they use at Twitter, so you don’t get lots of email :P). You need reporting to track problems.

Benchmarks - they don’t do profiling, they just rely on their users! What torture for the poor users…

“The next application I build is going to be easily partitionable.” - Stewart Butterfield
Dealing with abusers…
Inverse spamming - The Italians - receiving SMS gives you free call credits!
9,000 friends in 24 hours doesn’t scale!
Just be ruthless, delete said users. This is where you thank the reporting tools, to allow you to detect abusers.

They’ve looked at Magic Multi Connections, it looks great, but it wouldn’t work for Twitter.

Main bottleneck is really in DRb and template generation. The template optimizer that Stefan Kaes wrote doesn’t work for them.

Twitter: built by 2 people first. And now, they’re just 3 developers.

When mongrels hit swap, they become useless. So turn swap off.

Twitter themselves don’t seem to want to give out details of how many users, etc. they have. Shifty, beyond the fact that they claim it’s “a lot of users”.

Twitter is not built for partitioning. Social applications should be designed to be easily partitionable. Wordpress, and anything 37signals builds, tends to be partitionable. Things start becoming hairy when you have 3,000+ friends!

Index everything - Rails won’t do it for you, so you need to add an index for every column that appears in a WHERE clause.

Denormalize a lot - heresy according to the Rails book? He hopes not. This is what single-handedly saved Twitter.

They use InnoDB. Don’t do status.count() when there’s millions of rows… it’ll stop working. MyISAM will be faster, but still, don’t.

email like “$#!$” - search. Twitter has disabled search right now… This makes their database enjoy life.

Average DB time is 50ms (to at most 100ms)

They’re not hurting on the DB. The master DB machine is at a quarter CPU usage. So they don’t see the need to partition at this point.

Twitter does a lot of caching, they use MemCache. If you really need status.count() use memcache.
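
To make the "use memcache for the count" advice concrete, here is a minimal cache-aside sketch. It assumes nothing about Twitter's actual code: FakeCache is a Hash-backed stand-in for a memcached client, StatusTable stands in for the statuses table, and cached_status_count and the key name are hypothetical.

```ruby
# Hash-backed stand-in for a memcached client (hypothetical, illustration only).
class FakeCache
  def initialize
    @store = {}
  end

  def get(key)
    @store[key]
  end

  def set(key, value)
    @store[key] = value
  end
end

# Stand-in for the statuses table; records how often the expensive count runs.
class StatusTable
  attr_reader :count_queries

  def initialize(rows)
    @rows = rows
    @count_queries = 0
  end

  def count
    @count_queries += 1
    @rows.size
  end
end

# Cache-aside: serve the cached count if present, otherwise count once and store it.
def cached_status_count(cache, table, key = "status:count")
  cached = cache.get(key)
  return cached unless cached.nil?

  fresh = table.count
  cache.set(key, fresh)
  fresh
end

cache = FakeCache.new
table = StatusTable.new(%w[first second third])
2.times { cached_status_count(cache, table) }
puts "count queries issued: #{table.count_queries}"
```

The cached number goes stale as new statuses arrive, so a real setup would also need an expiry or an invalidation step.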

The query for your friends’ statuses on your Twitter homepage is a complicated one using a lot of JOINs. They use ActiveRecord, they store the statuses in memory, and they don’t touch the DB. They plan to use memcache in the future for the statuses too.

ActiveRecord objects are huge (which is why they’re not stuck in memcache yet). They’re looking at implementing ActiveRecord nano or something similar - smaller, storing only critical attributes in the cache, and using method_missing to load the rest if you don’t find what you’re looking for.
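
The "nano" idea can be sketched in plain Ruby. This is a guess at the shape, not the design Blaine described in detail: SlimUser, its attribute lists, and the loader lambda are all hypothetical, and a Hash stands in for the database.

```ruby
# Stand-in for the database: full rows keyed by id (hypothetical data).
FULL_RECORDS = {
  1 => { id: 1, screen_name: "alice", bio: "a long biography", location: "KL" }
}.freeze

# Keeps only small, critical attributes in memory; anything else is loaded
# lazily (once) through method_missing.
class SlimUser
  CACHED = [:id, :screen_name].freeze
  LAZY = [:bio, :location].freeze

  attr_reader :loads

  def initialize(attrs, loader)
    @attrs = attrs.select { |key, _| CACHED.include?(key) }
    @loader = loader
    @full = nil
    @loads = 0
  end

  def method_missing(name, *args)
    return @attrs[name] if CACHED.include?(name)
    return full_record[name] if LAZY.include?(name)

    super
  end

  def respond_to_missing?(name, include_private = false)
    CACHED.include?(name) || LAZY.include?(name) || super
  end

  private

  # Fetch the heavyweight record on first use and remember it.
  def full_record
    @full ||= begin
      @loads += 1
      @loader.call(@attrs[:id])
    end
  end
end

user = SlimUser.new(FULL_RECORDS[1], ->(id) { FULL_RECORDS.fetch(id) })
puts user.screen_name # served from the slim copy
puts user.bio         # triggers one load of the full record
```

Only the slim attribute hash would go into the cache; the loader is where a real application would hit the database.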

90% of Twitter’s requests are API requests. So cache them. No fragment or page caching on the front-end, but for API requests, lots of caching.

Producer(s) -> Message Queue -> Consumer(s)
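
The shape of that pipeline can be sketched with Ruby's built-in Queue. This is an in-process toy, not DRb, Rinda, or Starling: run_pipeline and the :done sentinel are hypothetical names, and the point is only that producers and consumers never talk to each other directly.

```ruby
require "thread"

# Hypothetical pipeline: the caller produces, worker threads consume.
def run_pipeline(messages, consumer_count)
  queue = Queue.new
  delivered = []
  lock = Mutex.new

  consumers = Array.new(consumer_count) do
    Thread.new do
      # :done is a sentinel telling this consumer to stop.
      while (message = queue.pop) != :done
        lock.synchronize { delivered << "delivered: #{message}" }
      end
    end
  end

  messages.each { |message| queue << message }  # producer side
  consumer_count.times { queue << :done }       # one sentinel per consumer

  consumers.each(&:join)
  delivered
end

puts run_pipeline(%w[hello world], 2).sort
```

Because the producer only ever touches the queue, consumers can be added or restarted without changing the producing code.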

DRb: zero redundancy, tightly coupled.

They use ejabberd for Jabber server.

When the Jabber client went down, everything went down. So they moved to using Rinda. It’s O(N) for take() so if the queue has 70,000 messages, you just shut it down, restart it, and lose those 70,000 messages. Sigh.

“Someone asked if Twitter is UDP or TCP? Its definitely UDP.” — Blaine Cook

LiveJournal has a horizontally scaled MySQL that is just MySQL + Lightweight Locking. RabbitMQ (Erlang) is something they’re quite clearly looking at, but it looks ugly, and they would rather not implement it.

Starling was written. It’s in Ruby, and will be ported to something faster. Does 4000 transactional messages/second, will have multiple queues (like a cache invalidation one), speaks MemCache (set, get), writes it all to disk. First pass was written in 4 hours, and it’s been working fine for the last few days (i.e. since Wednesday). Twitter died on Tuesday at the Web 2.0 conference! Starling will probably be open source.

Use messages to invalidate your cache.
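
A minimal sketch of that idea, with hypothetical names throughout: a writer puts the affected cache key on a queue, and a worker drains the queue and evicts those keys. TimelineCache is a Hash-backed stand-in for memcached, and apply_invalidations is not anything from Twitter or Starling.

```ruby
require "thread"

# Hash-backed stand-in for memcached (hypothetical, illustration only).
class TimelineCache
  def initialize
    @store = {}
  end

  # Return the cached value, or compute it with the block and remember it.
  def fetch(key)
    @store.fetch(key) { @store[key] = yield }
  end

  def delete(key)
    @store.delete(key)
  end
end

# Drain pending invalidation messages and evict the keys they name.
def apply_invalidations(queue, cache)
  evicted = []
  until queue.empty?
    key = queue.pop
    cache.delete(key)
    evicted << key
  end
  evicted
end

cache = TimelineCache.new
invalidations = Queue.new

cache.fetch("timeline:1") { "old timeline" }
invalidations << "timeline:1"             # a writer announces the change
apply_invalidations(invalidations, cache) # a worker evicts the stale entry
puts cache.fetch("timeline:1") { "fresh timeline" }
```

The writer never needs to know which caches hold the key; it only publishes the message.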


svrc2007 links

Sunday, April 22nd, 2007

Mostly so that I will remember: the 2nd Annual Silicon Valley Ruby Conference 2007 has some interesting web links.

  • sv ruby conf on Tumblr - looks like Twitter for groups? They sell it as “blogs with less fuss”, but I’m definitely not signing up for yet another blogging platform.
  • Conference Meetup site for svrc2007 - this is pretty cool. People logs, make connections, upload presentations, plan gatherings. This is something conferences should definitely look at having running. It’s just cool stuff.

I’m sure if you search on Flickr or Technorati for svrc2007 or something similar, that tag will be used…


ActiveRecord

Sunday, April 22nd, 2007

ActiveRecord, by Rabble.

  • Rails ActiveRecord is mostly database agnostic.
  • Good subset of the SQL standard is supported, so you can migrate very easily (this is what OS X Leopard will do - develop using sqlite on your workstation, then migrate to mysql on the server).
  • Integer primary keys, and classname_id foreign keys. Single table inheritance is what really works well.
  • What ActiveRecord doesn’t like:
  1. views
  2. stored procedures
  3. FK constraints
  4. cascading commits
  5. split/clustered DBs
  6. enums
  • Complexity is best located in the code, not in the database, according to the Rails developers.
  • Avoid SQL injection with find. Use an array or let it do the quoting for you. By itself, it allows you to do stupid things - so avoid cross site scripting, etc.
  • Joins are very rarely used, you use ActiveRecord relationships.
  • Some associations:
  1. has_one - when the FK is in the OTHER table
  2. belongs_to - when the FK is in THIS table
  3. has_many
  4. has_and_belongs_to_many
  • DBAs like FK constraints, but the Rails way is to really just use validations.
  • ActiveRecord returns enumerable objects, which look like arrays
  • ActiveRecord also allows for output formats - like to_yaml, to_xml, to_json. So your database records can become XML pretty easily.
  • find_by_sql - use it when you find ActiveRecord isn’t suitable for you, i.e. you’re doing something that is not very standard.
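
On the SQL injection point: the array form passes the SQL template and the values separately, so the library does the quoting. The sketch below shows roughly what that quoting buys you. The helpers quote_value and bind are hypothetical and are not ActiveRecord's code; the escaping is MySQL-flavoured and deliberately simplistic, so in a real application let the library bind values for you.

```ruby
# Hypothetical helper, not ActiveRecord's implementation.
def quote_value(value)
  case value
  when Integer then value.to_s
  when nil then "NULL"
  else
    # Double single quotes and backslashes (MySQL-style) so the value
    # cannot terminate its own string literal.
    escaped = value.to_s.gsub(/['\\]/) { |char| char * 2 }
    "'#{escaped}'"
  end
end

# Replace each "?" in the template with the next quoted value.
def bind(template, *values)
  remaining = values.dup
  sql = template.gsub("?") do
    raise ArgumentError, "too few values" if remaining.empty?

    quote_value(remaining.shift)
  end
  raise ArgumentError, "too many values" unless remaining.empty?

  sql
end

hostile = "x' OR '1'='1"
puts "name = '#{hostile}'"       # interpolation: the input rewrites the condition
puts bind("name = ?", hostile)   # binding: the input stays a single string literal
```

Comparing the two printed conditions shows the difference: with interpolation the quote characters in the input close the literal early, while the bound version keeps them inside it.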

Rabble also gave us an anecdote from his Odeo days. They thought that grabbing feeds should be done in parallel, which is when they used RubyThreads. Suddenly, they were getting some horrendous database problems, duplicate records and so on. Moral of the story: RubyThreads and ActiveRecord don’t mix. A member of the audience mentioned that using find_by_sql consumed about 1MB of memory in his application; using ActiveRecord to do the same thing, however, was costing about 20MB per thread.
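
Rabble didn’t say exactly how the duplicates arose, so treat this as one plausible illustration rather than a reconstruction: when several threads each check for a record and then insert it, two of them can both see "missing" and both insert. The sketch below, with hypothetical names and an array standing in for a table, makes check-then-insert a single critical section.

```ruby
require "thread"

# In-memory stand-in for a feeds table with no unique index (hypothetical).
class FeedTable
  attr_reader :rows

  def initialize
    @rows = []
    @lock = Mutex.new
  end

  # Check and insert under one lock, so two threads cannot both decide
  # the row is missing and both insert it.
  def find_or_create(url)
    @lock.synchronize do
      @rows << url unless @rows.include?(url)
    end
  end
end

# Several threads try to register each feed at the same time.
def fetch_in_parallel(table, urls, threads_per_url)
  threads = urls.flat_map do |url|
    Array.new(threads_per_url) { Thread.new { table.find_or_create(url) } }
  end
  threads.each(&:join)
  table.rows
end

rows = fetch_in_parallel(FeedTable.new, %w[feed-a feed-b], 8)
puts rows.sort.inspect
```

An in-process lock only helps within one Ruby process; with several processes writing, a unique index in the database is the usual backstop.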
