– We shut it down last night as usual, and when we came back to the office in the morning I turned it on but could not reach it over the network. I took it to my local computer shop, but they said there was nothing they could see, and I even spoke to Iomega support, who could do nothing to help. We have important files on it that were due yesterday, and no backups. Can you please save us and recover the data?

Well, being very optimistic, I assume it is the usual SATA NAS with some kind of Linux and software RAID.

– Sure, bring it on. It will be an hour of work before I can tell you if it is recoverable.

And the thing arrives by mail from Rome to Milan. I remove the cover and see 3 IDE disks. WTF? Why not 4? And where am I going to find an IDE controller? Stupid me didn’t check the year of production and the specs. But the good news is that it has VGA, USB and PS/2 connectors.

Great! Hooking up a monitor and a USB keyboard. Booting.

Apparently another awful assumption. FreeBSD. Ouch, this one is out of my league. University guys, all their cleverness and stuff. Not my style, and the last one I had was back in 2001 🙂 Anyway, there is apparently a “broken” array that blocks the init process, so I want to tinker with it a bit. The message says “press enter”, so I do. Another WTF. USB disabled. The keyboard doesn’t light up. This is easy. Reboot, BIOS, enable USB. Booting. Enter. WTF? OK, the USB driver is not there. Need a PS/2 keyboard. The only one available in the closest electronics mall is the one on the left.

Laughing like hell on the way back to the office. Hook up the pink wonder, boot. No joy, the atkbd driver is not there either 🙂 At this point the only remaining option seems to be booting another OS, so I google FreeBSD PXE, set it all up and boot FreeBSD 8.2. And here comes the big one. The FreeBSD livefs knows nothing about this software RAID.

At least I should be able to mount the root fs; this can be nothing but level 1 RAID. And yes it is, so I mount one of the disks’ partitions and start grepping around. Not a word about raidctl or configuration. The Iomega guys removed all of the configs from the final image.

Hours of research to understand that there was a period of 2 years when something called “RAIDFrame”, ported from NetBSD, was in use, and later it was abandoned and forgotten in favor of another software RAID. Well, then I need a CD. The only old computer with IDE on board that I found has one IDE CD-ROM (40x!), so I try to hook it up and put some BSD Live CD in. It doesn’t open, and when I open it with a paper clip and load the CD, it doesn’t boot. FML. Remaining option: hook up another HDD (lucky me, I still have 1x80GB IDE at home) and install FreeBSD on it. Using the other computer to power the disk, and a 33MHz cable, the disk is attached and a PXE install of 5.2.1 is started.

As the author of the RAIDFrame port says (2001!), 5.0-stable contains the driver code, and mailing lists from 2004 complain that it stopped working in 5.3-release, so I again assume that it will eventually work. So far so good, installed and booted from my disk. Loading raidframe.ko and... silence. It doesn’t see the arrays. This is bad. No idea how to actually “discover” them. Now, coming from Linux you are used to certain terminology, like assemble, initialize or maybe reboot. Here it all seems different. And did I mention the lack of information? Google brings up some links, most of them 404, and the rest are mailing lists where people assume I should read either the code or their thesis. Whichever can be found first. A guide? Sure, you are sent to google “NetBSD RAIDFrame”, which actually brings up a page with some explanations of how to set up a RAID. Even a step-by-step guide for RAID1. Not a word about recovering from my situation.

Another couple of hours of reading, googling and stuff. I come to the conclusion that I need a config file to feed to the raidctl program:

# raidctl -c raid2.conf raid2

This should do the trick as long as I have a good config. So I try to guess and put in some reasonable values:

START array
# numRow numCol numSpare
1 3 0

START disks
/dev/ad2s3e
/dev/ad0s3e
/dev/ad1s3e
START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5
START queue
fifo 100

Running raidctl on it and boom! Kernel panic. Kewl. Trying it again, and it still locks up.

Well, not being in the mood to debug the BSD kernel, I decide to lower the version (third day of hacking, BTW). Reinstalling everything and logging into 4.6.2-RELEASE. Did I say that running `strings` on the original kernel finds it is a:

@(#)Unicorn 1.2-RELEASE #85: Fri Nov  1 01:38:15 CST 2002
[email protected]:/usr/src/sys/compile/raid
Unicorn
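For anyone without `strings` at hand, the same trick can be sketched in a few lines of Python. This is only a rough approximation of what `strings` does, and the sample bytes below are a hypothetical stand-in for the kernel image, not the real file. The function names are made up for illustration; the only thing taken from the output above is the “@(#)” marker that prefixes the version line.

```python
import re


def printable_strings(data: bytes, min_len: int = 4):
    """Yield runs of printable ASCII at least min_len long, roughly like strings(1)."""
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.group().decode("ascii")


def version_strings(data: bytes):
    """Keep only the strings that start with the '@(#)' marker."""
    return [s for s in printable_strings(data) if s.startswith("@(#)")]


# Hypothetical sample standing in for the kernel image read from disk.
sample = (
    b"\x00\x01junk\x00"
    b"@(#)Unicorn 1.2-RELEASE #85: Fri Nov  1 01:38:15 CST 2002\n"
    b"\x00\x7fELF"
)

for line in version_strings(sample):
    print(line)
```

On a real image you would read the file with `open(path, "rb").read()` and pass the bytes in; the path is whatever your mounted root fs gives you.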

Feeling like Alice, I get my hands on 4.6.2 this morning. This one doesn’t have raidframe.ko, so I follow the instructions on the author’s site to apply the patches for 4-STABLE. Some more googling on how ports and /usr/src work on FreeBSD, and finally kldload raidframe.ko. OMG. I see dead people, er, raid arrays:

Component on: : 483278
Row: 0 Column: 0 Num Rows: 1 Num Columns: 2
Version: 2 Serial Number: 113211 Mod Counter: 4420
Clean: No Status: 0
sectPerSU: 64 SUsPerPU: 1 SUsPerRU: 1
RAID Level: 1  blocksize: 512 numBlocks: 483200
Autoconfig: Yes
Contains root partition: Yes
Last configured as: raid0
Component on: : 417690
Row: 0 Column: 0 Num Rows: 1 Num Columns: 2
Version: 2 Serial Number: 543211 Mod Counter: 17145
Clean: No Status: 0
sectPerSU: 64 SUsPerPU: 1 SUsPerRU: 1
RAID Level: 1  blocksize: 512 numBlocks: 417600
Autoconfig: No
Contains root partition: No
Last configured as: raid1
Component on: : 78648230
Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
Version: 2 Serial Number: 101940 Mod Counter: 1237288795
Clean: Yes Status: 0
sectPerSU: 64 SUsPerPU: 1 SUsPerRU: 1
RAID Level: 5  blocksize: 512 numBlocks: 78648128
Autoconfig: No
Contains root partition: No
Last configured as: raid2

And it even starts the root fs one! Great. Very excited at this point, I google again for how to start an array which is not configured automatically. People say it again: raidctl -c config.conf raidX should be enough. So I create the config (wow, all the params are available from the labels printed straight to dmesg!):

START array
# numRow numCol numSpare
1 3 0
START disks
/dev/ad0s3e
/dev/ad1s3e
/dev/ad2s3e
START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_5
64 1 1 5
START queue
fifo 100

Which seems more than correct to me. Very logical. But raidctl refuses to create the array:

raid1: Component /dev/ad2s3e being configured at row: 0 col: 2
Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
raid1: Component /dev/ad1s3e being configured at row: 0 col: 1
Row: 0 Column: 2 Num Rows: 1 Num Columns: 3
Column out of alignment for: /dev/ad1s3e
raid1: Component /dev/ad0s3e being configured at row: 0 col: 0
Row: 0 Column: 1 Num Rows: 1 Num Columns: 3
Column out of alignment for: /dev/ad0s3e

And so on. OK, it seems the order of the array members is important. I can understand that. But why the hell doesn’t it construct it in the order from the label??? Whatever. Various sources suggest that the order of disks in the config file is not important. Then how? Nobody knows 🙂 Great. But wait, it is always worth trying, isn’t it? And yeah, using some scientific guessing I change the disks section to:

/dev/ad2s3e
/dev/ad0s3e
/dev/ad1s3e

and OMG, OMG, it works!!!!

Apr  1 11:23:05 iomega /kernel: Waiting for DAG engine to start
Apr  1 11:23:05 iomega /kernel: Warning: p_fd fields not set
Apr  1 11:23:05 iomega /kernel: raid1: Summary of serial numbers:
Apr  1 11:23:05 iomega /kernel: 101940 3
Apr  1 11:23:05 iomega /kernel: raid1: Summary of mod counters:
Apr  1 11:23:05 iomega /kernel: 1237288795 3
Apr  1 11:23:05 iomega /kernel: raid1: Component /dev/ad2s3e being configured at row: 0 col: 0
Apr  1 11:23:05 iomega /kernel: Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
Apr  1 11:23:05 iomega /kernel: Version: 2 Serial Number: 101940 Mod Counter: 1237288795
Apr  1 11:23:05 iomega /kernel: Clean: Yes Status: 0
Apr  1 11:23:05 iomega /kernel: raid1: Component /dev/ad0s3e being configured at row: 0 col: 1
Apr  1 11:23:05 iomega /kernel: Row: 0 Column: 1 Num Rows: 1 Num Columns: 3
Apr  1 11:23:05 iomega /kernel: Version: 2 Serial Number: 101940 Mod Counter: 1237288795
Apr  1 11:23:05 iomega /kernel: Clean: Yes Status: 0
Apr  1 11:23:05 iomega /kernel: raid1: Component /dev/ad1s3e being configured at row: 0 col: 2
Apr  1 11:23:05 iomega /kernel: Row: 0 Column: 2 Num Rows: 1 Num Columns: 3
Apr  1 11:23:05 iomega /kernel: Version: 2 Serial Number: 101940 Mod Counter: 1237288795
Apr  1 11:23:05 iomega /kernel: Clean: Yes Status: 0
Apr  1 11:23:05 iomega /kernel: RAIDFRAME: Configure (RAID Level 5): total number of sectors is 157296256 (76804 MB)
Apr  1 11:23:05 iomega /kernel: RAIDFRAME(RAID Level 5): Using 20 floating recon bufs with head sep limit 10
Apr  1 11:23:17 iomega /kernel: Opening raid device number: 1 partition: 2
Apr  1 11:23:17 iomega /kernel: Building a default label...

This was a moment of real happiness. A small but very pleasant victory. The array was fsck’ed and mounted without problems, and I am copying the 50GB of important data to my office server with hardware RAID.
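A quick sanity check on the numbers the kernel printed, as a minimal sketch. It assumes that numBlocks in the component label (78648128) is the usable size of each component in 512-byte blocks, and that RAID 5 gives up one component’s worth of space to parity. The function name is just for illustration.

```python
def raid5_usable_sectors(num_columns: int, blocks_per_component: int) -> int:
    """RAID 5 spends the equivalent of one component on parity."""
    if num_columns < 3:
        raise ValueError("RAID 5 needs at least 3 components")
    return (num_columns - 1) * blocks_per_component


# Values taken from the raid2 component label: 3 columns, 78648128 blocks each.
sectors = raid5_usable_sectors(3, 78648128)

# 512-byte blocks, converted to whole megabytes (1 MB = 1024 * 1024 bytes).
megabytes = sectors * 512 // (1024 * 1024)

print(sectors, megabytes)  # 157296256 76804
```

Both values match the “total number of sectors is 157296256 (76804 MB)” line in the log, which is a reassuring sign that the array really was assembled with all 3 columns.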

Conclusion:

Reverse engineering at any level is a lot of fun. But only when you have time to do it. And I don’t. And neither do many good hackers I know.

So the first lesson is probably for the owner of the NAS. If you bought some device to store important data on, then you’d better replace it every 3 years. I would recommend replacing it as soon as the warranty expires. Sell it on eBay and buy a new one.

Lesson for me. Stop f***ing assuming. No good at all.

The only thing which is still unclear in the story is the original cause of the failure. I have no idea. Most likely the disk reordering was caused by the “local shop” guys taking the disks and the motherboard out. So maybe this mess could have been avoided. Don’t know.
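On the disk reordering: in hindsight, the refused raidctl attempt may already have contained the answer. This is only my reading of the log, not something I found documented: I assume that the “Row: ... Column: ...” line printed after each “Component ... being configured” line is the position stored in that disk’s own label. If so, sorting the devices by that column gives the order for the config file. A minimal sketch, with the refused log pasted in as a string and a made-up function name:

```python
import re

# The output of the refused attempt, copied from above.
REFUSED_LOG = """\
raid1: Component /dev/ad2s3e being configured at row: 0 col: 2
Row: 0 Column: 0 Num Rows: 1 Num Columns: 3
raid1: Component /dev/ad1s3e being configured at row: 0 col: 1
Row: 0 Column: 2 Num Rows: 1 Num Columns: 3
Column out of alignment for: /dev/ad1s3e
raid1: Component /dev/ad0s3e being configured at row: 0 col: 0
Row: 0 Column: 1 Num Rows: 1 Num Columns: 3
Column out of alignment for: /dev/ad0s3e
"""


def order_from_labels(log: str) -> list:
    """Return the devices sorted by the column recorded in their labels.

    Assumption: the 'Row/Column' line right after a 'Component' line
    is the label read from that component.
    """
    component = re.compile(r"Component (\S+) being configured")
    label = re.compile(r"^Row: \d+ Column: (\d+)")
    columns = {}
    current = None
    for line in log.splitlines():
        found = component.search(line)
        if found:
            current = found.group(1)
            continue
        found = label.match(line)
        if found and current is not None:
            columns[current] = int(found.group(1))
            current = None
    return sorted(columns, key=columns.get)


print("\n".join(order_from_labels(REFUSED_LOG)))
```

For the log above this prints /dev/ad2s3e, /dev/ad0s3e, /dev/ad1s3e, which is exactly the order that finally worked. One data point is not proof, of course, but it would have saved me the guessing.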

Update 2014-03-04:
Gregory did a great job saving his files from a similar but worse disaster. He actually had a failed disk that wouldn’t start, which blocked the array from assembling properly. Despite his lack of practical experience with BSD systems, this gentleman rolled up his sleeves and recovered his data from the beast. Here is the full story.
