Posts Tagged ‘data recovery’

Thumb drive data recovery

I haven’t done any data recovery or data rescue work in some time. The last time was on Linux, with a combination of dd, ddrescue, and some throwaway code to parse JPGs, to save a Compact Flash card. This time, all I had was macOS and a 16GB thumb drive holding someone’s life’s work: not just JPGs but also AI (Adobe Illustrator), DOC, XLS, PDF, TTF, and other files.

So via Homebrew, I installed ddrescue again. A command like ddrescue -v -n -c 4096 /dev/disk2 helena.dmg helena.log seemed to work. On macOS, fdisk couldn’t give me anything useful, and diskutil list showed only the bare disk with no partition entries:

/dev/disk2 (external, physical):
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:                                                   *15.5 GB    disk2

For good measure I also wanted to make an image via dd: dd if=/dev/disk2 conv=sync,noerror bs=4096 of=helena.img. It was clearly throwing many errors, for example:

13399375872 bytes transferred in 1263.864380 secs (10601910 bytes/sec)
dd: /dev/disk2: Input/output error
dd: /dev/disk2: Input/output error
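To make the conv=sync,noerror behaviour concrete, here is a rough Python sketch of the same idea: read the device block by block, and when a block fails with an I/O error, write zeros in its place so that offsets in the image still line up with offsets on the device. The function name, the read_block callback, and the fake flaky_reader are all made up for illustration, and this only approximates what dd does (dd, for instance, pads the final block to the full block size).

```python
import io
from typing import BinaryIO, Callable


def image_with_zero_fill(
    read_block: Callable[[int, int], bytes],
    total_size: int,
    out: BinaryIO,
    block_size: int = 4096,
) -> int:
    """Copy total_size bytes to out, zero-filling blocks that fail to read.

    Returns the number of blocks that raised an I/O error.
    """
    bad_blocks = 0
    for offset in range(0, total_size, block_size):
        want = min(block_size, total_size - offset)
        try:
            data = read_block(offset, want)
        except OSError:
            # Unreadable block: carry on instead of aborting (like noerror).
            data = b""
            bad_blocks += 1
        # Pad failed or short reads with NULs so the image stays aligned
        # with the device (like sync).
        out.write(data.ljust(want, b"\x00"))
    return bad_blocks


def flaky_reader(offset: int, size: int) -> bytes:
    """Hypothetical stand-in for a failing device: the second block is bad."""
    if offset == 4096:
        raise OSError(5, "Input/output error")
    return b"A" * size


buf = io.BytesIO()
bad = image_with_zero_fill(flaky_reader, 10000, buf)
print(bad, len(buf.getvalue()))
```

Against a real device, read_block could be something like lambda off, n: os.pread(fd, n, off) with fd opened read-only on /dev/disk2; the device size would have to be supplied separately, which I have left as an assumption here.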

The real problem was mounting either the DMG or the IMG. On Linux you can loopback mount a file; macOS has nothing quite equivalent. There is hdiutil, but frankly, it doesn’t work if there is no partition record. I tried hdiutil attach -noverify -nomount helena.img so that I could then run diskutil mountDisk, but that didn’t work.
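One quick way to see whether an image has any partition record left is to look at its first two sectors. The sketch below is a minimal check, assuming 512-byte sectors: a GPT header starts with the ASCII signature "EFI PART" at the beginning of the second sector, and an MBR (or a bare filesystem boot sector) ends its first sector with the bytes 0x55 0xAA. The function name and return strings are mine, not from any tool.

```python
def describe_first_sectors(data: bytes, sector_size: int = 512) -> str:
    """Report which partition-table signature, if any, the image starts with."""
    if len(data) < 2 * sector_size:
        raise ValueError("need at least the first two sectors")
    # GPT: the header lives in LBA 1 and begins with "EFI PART".
    if data[sector_size:sector_size + 8] == b"EFI PART":
        return "GPT header found"
    # MBR or a filesystem boot sector: bytes 510-511 are 0x55 0xAA.
    if data[510:512] == b"\x55\xaa":
        return "boot signature found (MBR or bare filesystem boot sector)"
    return "no partition table signature found"


# Synthetic examples; with a real image you would pass the first 1024 bytes,
# e.g. open("helena.img", "rb").read(1024).
blank = bytes(1024)
mbr = bytes(510) + b"\x55\xaa" + bytes(512)
gpt = bytes(510) + b"\x55\xaa" + b"EFI PART" + bytes(504)

for sample in (blank, mbr, gpt):
    print(describe_first_sectors(sample))
```

A "no signature" result only says that the first sectors are unrecognisable, which is consistent with tools refusing to mount the image; it says nothing about how much file data survives further in.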

Then… I found a tool: PhotoRec. I wouldn’t have to write something to parse the magic numbers and extract files. PhotoRec just works. It parsed the IMG file, and spat out plenty of files to look at. Recovery was close to complete.
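For a sense of what the throwaway approach looks like, here is a naive sketch of carving JPEGs out of a raw image by magic numbers: look for the start-of-image bytes FF D8 FF, then take everything up to the next end-of-image marker FF D9. This is not how PhotoRec works internally (it understands many formats and is far more careful); the function name is made up, and a carver this simple will cut files short when a JPEG embeds a thumbnail or is fragmented on disk.

```python
from typing import List

SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker plus the next marker prefix
EOI = b"\xff\xd9"      # JPEG end-of-image marker


def carve_jpegs(data: bytes) -> List[bytes]:
    """Return byte runs that start with SOI and end at the next EOI."""
    found = []
    pos = 0
    while True:
        start = data.find(SOI, pos)
        if start == -1:
            break
        end = data.find(EOI, start + len(SOI))
        if end == -1:
            break  # start marker with no end marker: give up on the tail
        found.append(data[start:end + len(EOI)])
        pos = end + len(EOI)
    return found


# Synthetic "disk image": junk, one fake JPEG, padding, another fake JPEG.
image = (
    b"junk"
    + b"\xff\xd8\xff\xe0" + b"pixels" + b"\xff\xd9"
    + bytes(8)
    + b"\xff\xd8\xff\xe1" + b"more" + b"\xff\xd9"
)
carved = carve_jpegs(image)
print(len(carved))
```

For a 16GB image you would want to read in chunks or use mmap rather than load everything into memory; the sketch skips that to keep the idea visible.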

For reference, on Linux, there are some good resources: Mounting a raw partition file made with dd or dd_rescue in Linux, and Guide to Using DDRescue to Recover Data. From a forensic standpoint, Disk-Arbitrator looks like a good tool as well.

On Ma.gnolia, and data recovery

There’s a good podcast from Chris Messina and Larry Halff, about what really happened at Ma.gnolia. If you’re at all interested in what happened (i.e. how did they lose all their bookmark data), don’t hesitate to watch the video. I took some quick notes:

  • half a terabyte database file got corrupted
  • a mysql 5 database
  • everything was running even though there was corruption, and eventually, the site went down
  • backup system also failed, as it didn’t back up the data from mysql
  • backup was just backing up corrupted data (file sync over a firewire network was the backup mechanism)
  • a Rails application; he now recommends clouds over running your own infrastructure for startups
  • a couple of xserves (for database, etc.) and four intel mac minis as front end web servers
  • the site didn’t actually make any money

So I don’t know if Baron can rescue Ma.gnolia, per se, but I think the problem was largely:

Doing a file sync over the Firewire network, as the backup mechanism

You can’t safely back up MySQL that way. I don’t know what mechanism was used, but it sounds like rsync, and as much as I love rsync, I wouldn’t use it to back up a live, running MySQL database.

With two servers, there should have been MySQL replication.
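Besides replication, a logical dump is a common way to get a consistent backup without copying live data files. The sketch below just builds and (optionally) runs a mysqldump command from Python. It assumes the tables are InnoDB, since --single-transaction only gives a consistent snapshot for transactional engines, and I don’t know which engine Ma.gnolia used. The function names and the backup.sql filename are hypothetical, and credentials are assumed to come from the usual MySQL option files.

```python
import subprocess
from typing import List


def mysqldump_command(result_file: str) -> List[str]:
    """Build a mysqldump invocation for a consistent dump of all databases."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot, InnoDB tables only
        "--all-databases",
        f"--result-file={result_file}",
    ]


def run_backup(result_file: str) -> None:
    """Run the dump; raises CalledProcessError if mysqldump fails."""
    subprocess.run(mysqldump_command(result_file), check=True)


cmd = mysqldump_command("backup.sql")
print(" ".join(cmd))
```

run_backup is deliberately not called here, since it needs a host with mysqldump installed and a reachable server. With MyISAM tables you would need locking (for example --lock-all-tables) instead, which blocks writes for the duration of the dump.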

I’m curious whether the data recovery Baron talks about involves the utility ddrescue. After all, ddrescue gets the raw data off the block device, without even trying to mount it. After that, you can attempt to recover the MySQL data off disk. In fact, I was surprised that the Ubuntu folk have a very nice Data Recovery page; there is no information about extracting MySQL databases, but it’s nothing a little hackery won’t get you.

I tried to ping Larry on Twitter, to ask what engine they were using… No response, per se. Good luck, and I hope the users get their data back, in time!

