Archive for June 2008

Migrating Firefox/Thunderbird from Linux to OS X

Today, I completed the migration of my personal machine to one that runs OS X. For those not following Twitter, I picked up a MacBook Air last week, and have slowly been moving my stuff off the Dell. The Dell can now serve as a full development machine, and I can start running “unstable” Linuxes on it (“unstable” like Rawhide).

But I digress. This is about how I moved Thunderbird and Firefox over to my new box.

Thunderbird:
Copy ~/.thunderbird over, and place it in ~/Library/Thunderbird on OS X. The only problem I found was with the Lightning plugin, which managed to grab itself an update, and then all was dandy.

Firefox:
Copy ~/.mozilla/firefox over, and place it in ~/Library/Application Support/Firefox. All the plugins I had ran just fine.
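If you would rather script the two copies, here is a minimal Python sketch. The directory mapping is the one from this post; the function name `migrate_profiles` and the idea that the old Linux home directory is reachable at some path (a mounted disk or an rsync'd copy) are my own assumptions, so treat it as a starting point rather than a tested migration tool.

```python
import shutil
from pathlib import Path

# Mapping from the post: Linux profile directory -> OS X profile directory,
# each relative to the respective home directory.
PROFILE_MAP = {
    ".thunderbird": "Library/Thunderbird",
    ".mozilla/firefox": "Library/Application Support/Firefox",
}


def migrate_profiles(linux_home, mac_home):
    """Copy each profile directory that exists; return the destinations written.

    Hypothetical helper: linux_home is wherever the old home directory
    can be read from, mac_home is the new home directory.
    """
    copied = []
    for src_rel, dest_rel in PROFILE_MAP.items():
        src = Path(linux_home) / src_rel
        dest = Path(mac_home) / dest_rel
        if not src.is_dir():
            continue  # nothing to migrate for this application
        if dest.exists():
            # Refuse to overwrite a profile the Mac already has.
            raise FileExistsError(f"{dest} already exists; move it aside first")
        shutil.copytree(src, dest)  # also creates missing parent directories
        copied.append(dest)
    return copied
```

Quit Thunderbird and Firefox on both machines before copying, and call it with something like `migrate_profiles("/Volumes/dell/home/me", Path.home())`, where the first path is purely illustrative.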

Only snag? I couldn’t find a copy of Firefox 2 online. Good thing I had a copy on another Mac… Why did I need Firefox 2? Google Browser Sync. Though I suspect that in the very near future, I’ll move over to Mozilla Weave, and get all my systems up to speed with Firefox 3.

Next up, let’s see how long I run OS X on the Air… or do I replace it with Linux if it annoys me enough?

How Facebook serves pictures

I caught Facebook – Needle in a Haystack: Efficient Storage of Billions of Photos on Flowgram. First up, I’m not a big fan of Flowgrams – the format (slides and voice) is sensible, even excellent, but delivery in a web browser isn’t optimal… make downloadable videos!

The talk, however, was excellent. Do watch it, and learn a bit more about Facebook’s infrastructure. Anyway, some notes I took from the talk:

  • “We’re one of the largest MySQL installations in the world”
  • Use memcache – “We have memcache because databases aren’t fast” (later on in the questions)
  • Separate team focusing on APE (Apache, PHP and Extensions that they work on)
  • 6.5 billion total images, 4-5 sizes stored for each, so 30 billion files, of about 540TB total… During peak? 475,000 images served per second, and growing by 100 million uploads per week
  • Images are usually pulled from a Content Delivery Network (CDN), so it reduces the request rate on their servers
  • They use NetApp storage; their upload servers speak NFS to write to it.
  • Cachr (evhttp based) and File Handle Cache use memcache as a backing store… FHC is based on lighttpd!
  • Makes use of a “haystack” – a user-level abstraction that stores a separate index file with more efficient metadata (to reduce disk seeks – 1 disk seek or less for any workload). The talk goes pretty deep into the haystack server architecture, which is also evhttp-based
  • MySQL use? Very few transactions, very few joins
  • Video is a very different beast, and the design is a little different
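To make the haystack note a bit more concrete, here is a toy Python sketch of the general idea as I understood it: pack many photos into one large store, and keep a small separate index of where each one lives, so a read costs a single seek. This is not Facebook's actual format or code; the class name `TinyHaystack`, the in-memory dict index and the bare (offset, length) entries are all my own simplifications.

```python
import io


class TinyHaystack:
    """Toy append-only photo store with a separate in-memory index."""

    def __init__(self, store=None):
        # Any seekable binary file object works; BytesIO keeps the demo in memory.
        self.store = store if store is not None else io.BytesIO()
        self.index = {}  # photo key -> (offset, length)

    def put(self, key, data):
        """Append the photo bytes and record where they landed."""
        self.store.seek(0, io.SEEK_END)
        offset = self.store.tell()
        self.store.write(data)
        self.index[key] = (offset, len(data))

    def get(self, key):
        """One index lookup, then one seek and one read on the big file."""
        offset, length = self.index[key]
        self.store.seek(offset)
        return self.store.read(length)


haystack = TinyHaystack()
haystack.put("photo-1-small", b"\x89fake-image-bytes")
haystack.put("photo-1-large", b"\x89much-bigger-fake-image-bytes")
assert haystack.get("photo-1-small") == b"\x89fake-image-bytes"
```

Compare that with one file per image on a regular filesystem, where finding the file can itself cost extra disk reads for directory and inode metadata before you get anywhere near the image data.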

If you’re into information about photo storage sites, don’t hesitate to also read my previous notes on Flickr.

One Buck Short on Channel V AMP!

Tonight, One Buck Short will be on Channel V AMP. It’s a 30 minute show, on the 24th of June 2008, at 8pm. ASTRO carries Channel V (and if you’re not in Malaysia, it’s on various networks). Watch them!

I’m excited about One Buck Short. It’s a band I tend to track, and photograph when the opportunity arises. I have sets from 2005, when they were at Celcom/8TV Homegrown, and from 2007, when they were at JamAsia’s re-opening. I have photos from their launch that still require processing, so expect them to arrive online soon enough.

Snow Leopard to have ZFS

The next release of Mac OS X, Snow Leopard, will have ZFS enabled by default. There’s a good article for the masses at ZDNet, on ZFS in Snow Leopard – do read it.

We all know running any form of server using HFS+ tends to be a bit of a joke. So, Snow Leopard Server will be where ZFS makes its debut. It won’t be long before regular users will want it in their Mac Pros and so on…

OS X as a deployment platform for production MySQL servers? This is not far off, I’m sure.

Spacewalk, and what we can learn about naming

Red Hat has released Spacewalk. It is described as: “the upstream community project from which the Red Hat Network Satellite product is derived”. Congratulations to all who have worked on it, especially my friends who toiled endlessly over it in the past.

Red Hat is sticking true to its promise of open sourcing everything they make. Best of all, they recognise Fedora (they always did, since say, Fedora Core 2 or 3), CentOS (a direct “competitor”/rebuild of RHEL), and Scientific Linux (I know of a certain university’s sysadmin who will be blessing Spacewalk, as her life will now be a lot easier).

There have been a few blogs about it… Matt Asay asks about a community (Red Hat traditionally wasn’t good at this, but with Fedora, I believe they’ve learned, and I’m happy to say I think I helped in the education process). No one, however, focused on the technical aspects around Spacewalk/RHN.

Case in point: Oracle is at the heart of it. RHN was designed almost seven years ago, and I’ve heard amazing stories from Gafton, Greg, and Peter: how Gafton found hidden “secrets” inside Oracle to boost performance, and a whole bunch of interesting things best talked about over a beer (the irony? When I first met these folk, I couldn’t even legally drink a beer in the US…).

Read the Developer Documentation, and note that they use Perl, Python and Java in the current code base (but only Perl and Java are the way forward). There’s a DB Schema available… and I wonder when someone will port this to MySQL?

The Spacewalk FAQ mentions that they lacked the resources in the past to add an open source database, but would want to do so soon. There’s even help on getting Oracle XE running. The glimmer of hope that there will be an open source database behind Spacewalk is what tells me that the MySQL community members who would benefit from such a tool (say you’re a DBA and a sysadmin at a fairly large installation) should port this to run on MySQL.

What else can we take away from Spacewalk? The excellent positioning. A community project from which the RHN product is derived. This is similar to how Fedora is positioned: “Another striking difference of Fedora is our goal to empower others to pursue their vision of what a free operating system should be like. Fedora now forms the basis for derivative distributions such as Red Hat Enterprise Linux, the One Laptop Per Child XO and Creative Commons’ Live Content DVDs.”

Distinctive naming. It helps avoid confusion (at the price of a ubiquitous name? Sure, you just have two ubiquitous names now). MySQL Enterprise vs. MySQL Community. They’re both MySQL (don’t even get started on the odd/even numbering scheme…). I dream of the day when we have MySQL Enterprise and Sakila (formerly known as MySQL Community).

O’Reilly to offer DRM-free ebooks…

I, for one, welcome the news that O’Reilly will soon be offering books for the Kindle, and as DRM-free digital bundles.

There’s a reason why I have a Safari subscription: I like reading technical books on my laptop (because I can search through them), and more importantly, I recently came into possession of an eBook reader (from Sony).

There is no mention of whether Safari subscribers get access to the eBook bundles (I am already on the most expensive Safari subscription you can get). My current method is downloading chapter by chapter, but I’m only given 5 download tokens a month, for free.

James Webster posts a comment on the blog, asking if this will be available for Safari users… Andrew Savikas replies saying they’re working on it.

Today, each PDF chapter I get off Safari is DRM-free (it’s just a PDF, no need to input a password, etc.). It’d be nice if I could just download the entire book (with one download token), and read it on my eBook reader, as is…

O’Reilly, if you’re reading, it’d be the smartest thing you could do. Once again, you would lead the pack, ahead of Apress or Packt.

