I am paranoid about backups. Until this week I was very fond of Windows Home Server which is Microsoft’s backup solution for the home. You simply put it on your network and every night it backs up your household computers to a server which contains redundant disks. It is a real server and you can have shares on it which offer duplication.
The great feature is that the disks do not all have to be the same size; you can mix and match. I bought each one as I needed it. They are combined by a cool bit of technology called “Drive Extender”. This article will show that although this is a defining feature of WHS and really cool, it is also a bad idea: with the best will in the world, no software solution is ever going to give your data the protection that hardware RAID will.
I have three 500 GB and one 1 TB drive. At least I did.
One of the three 500 GB drives failed and the machine would not boot – this would not happen with RAID.
- Hardware RAID is fault tolerant and tested by millions of installations.
I could not identify which drive had broken – this would not happen with RAID.
- Usually a red light comes on. If it does not, you can still connect to the NAS over HTTP and find out which drive is sick.
The instruction manual says “if you have a disk fault use the software to remove the disk from the set”. I could not do this because I could not boot. Failure to do this was to have dire results later on – this would not happen with RAID.
- You take out the drive with the red light flashing, put in a new one, and the RAID set rebuilds.
I took all the disks out and then put the system disk back in. It rebooted! I then pressed the magic “rebuild my server button” on the front of my HP Mediasmart 475 and it rebuilt.
Unfortunately, because I did not rebuild it with the other (good) disks in it, all the indexes to my redundant data were lost.
User error resulted in 50% data loss, and because by default WHS does not store backup sets redundantly, all backups were lost – this would not happen with RAID.
However, because I am totally paranoid, I do have a third backup system: I use Second Copy to copy data from the server shares to another disk. This will allow me to recover 95% of my data. I will still lose some data because I have discovered that recently WHS has been corrupting files (perhaps due to the failing disk) and I have backed up these corrupt copies.
It would seem that Microsoft agree with my analysis that Drive Extender is a great technology but too difficult. They have cut it from WHS version 2 and are advising manufacturers to supply RAID systems.
After about 3 days' work I will have managed to restore my files. I should name check
I am now going to use KeepVault to automatically sync an “Offsite” directory with their offsite storage and BDBB to make WHS store redundant copies of my PC backup databases.
I am going to continue to use WHS because if I don’t I will have a £800 room heater, but if you are considering a backup solution I would recommend either waiting for the next version of WHS (currently in beta) deployed on a server with hardware RAID, or a good quality NAS from someone like Buffalo or QNAP.
Before you purchase make sure that
- it has RAID
- if a disk fails it has a red light next to it on the chassis
- you can replace a failed disk and the set will automatically rebuild without any user interaction
- optionally, as a nice-to-have feature, it has built-in replication or sync to offsite storage
–update 2011-03-11 –
Before rebuilding my Windows Home Server I copied off all the data “just in case”. I think I have just found my missing 50%. Although the indexes are all broken, I will try to fix them with File Conflict Resolver and then copy the data back into the server. So the pure WHS solution should result in zero data loss. However, if I had not had my extra backup I would have died of a heart attack. This does not change my opinion that WHS Drive Extender is not suitable for mass market use because the disk recovery use cases are too complex.
–update 2011-03-15 –
Well, I am still at it, but I think I have a working solution. The problem is that my files are scattered across 3 disks and corrupt files are mixed in. The result is that WHS permanently shows an error warning and I don’t know where to find the good copy of each file.
The fix is to delete all the content in the old WHS shares. This is the only way to remove file conflicts when you have thousands of them.
Before doing this, copy the good files out into a new WHS share. I will use Photos1. This will rebuild the index. This is tricky: most tools, such as SyncToy (free), will fail after a few bad reads, and I have thousands of corrupt files.
RDP to the server and open a command window
Use xcopy /c /d /v /s "d:\shares\Photos" "\\server1\Photos1" (we will build a new WHS index on this share)
- /c continue copying even if errors occur
- /d copy only files that are newer than the copy already at the destination (allows restarting from where you were if necessary)
- /v verify each file as it is written
- /s include subdirectories (except empty ones)
There are other switches that you could consider but these are the most important.
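If you would rather script the salvage copy than rely on xcopy, the same idea can be sketched in Python. This is only a rough equivalent of the /c /d /v /s combination, not a replacement for it: the function name salvage_tree and its return values are my own invention, and the code assumes that a damaged file shows up as an ordinary read error (OSError).

```python
import filecmp
import os
import shutil


def salvage_tree(src, dst):
    """Copy src into dst, skipping up-to-date files and carrying on past errors.

    Hypothetical helper: returns three lists (copied, skipped, failed).
    """
    copied, skipped, failed = [], [], []
    # Like /s: walk every subdirectory (unlike /s, empty folders are created too).
    for folder, _dirs, files in os.walk(src):
        rel = os.path.relpath(folder, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(folder, name)
            d = os.path.join(target_dir, name)
            try:
                # Like /d: skip files whose destination copy is at least as new,
                # so an interrupted run can simply be started again.
                if os.path.exists(d) and os.path.getmtime(d) >= os.path.getmtime(s):
                    skipped.append(s)
                    continue
                try:
                    shutil.copy2(s, d)  # copy2 keeps the modified time
                    # Like /v: compare source and destination byte for byte.
                    if not filecmp.cmp(s, d, shallow=False):
                        raise OSError("verification failed")
                except OSError:
                    # Do not leave a partial or bad copy behind.
                    if os.path.exists(d):
                        os.remove(d)
                    raise
                copied.append(s)
            except OSError as exc:
                # Like /c: record the failure and move on to the next file.
                failed.append((s, str(exc)))
    return copied, skipped, failed
```

Calling salvage_tree(r"d:\shares\Photos", r"\\server1\Photos1") would mirror the xcopy line above, and the failed list gives you a record of which files could not be read.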
Having used xcopy to copy all the files from all your disks into the single WHS target share (duplication off) you can use SyncToy with preview to verify that you have what you think you should have.
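If you want a second opinion before deleting anything, a few lines of Python can list what did not make it across. This is a rough stand-in for a preview, not what SyncToy does internally: the names relative_files and compare_trees are made up for illustration, and the check only compares file names and sizes, not contents.

```python
import os


def relative_files(root):
    """Return {relative path: size in bytes} for every file under root."""
    found = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            found[os.path.relpath(full, root)] = os.path.getsize(full)
    return found


def compare_trees(src, dst):
    """Report files missing from dst and files whose sizes differ."""
    src_files = relative_files(src)
    dst_files = relative_files(dst)
    missing = sorted(p for p in src_files if p not in dst_files)
    size_mismatch = sorted(
        p for p in src_files
        if p in dst_files and src_files[p] != dst_files[p]
    )
    return missing, size_mismatch
```

Files that show up as missing are the ones to chase on the other disks; after a salvage copy of a damaged share, many of them will simply be the corrupt files that could not be read.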
Now delete all the files in the old share (e.g. Photos).
Now a clever bit.
Use RDP to share “d:\shares”
In Windows Explorer or at the command line
move "\\server1\shares\photos1" to "\\server1\shares\photos"
Because the server knows that these are on the same volume (accessed through shares), this will only take a few seconds as the index is updated.
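The reason this is quick is that, on an ordinary filesystem, a move within one volume is a rename: only the directory entry changes and the file data stays where it is. I am assuming WHS behaves the same way underneath. A minimal Python sketch of the step, with a made-up function name and a safety check that the old folder really is empty:

```python
import os


def replace_share_folder(new_dir, old_dir):
    """Rename new_dir to old_dir; old_dir must be absent or already empty.

    Hypothetical helper for illustration only.
    """
    if os.path.isdir(old_dir):
        # rmdir refuses to delete a folder that still has files in it,
        # so nothing is lost if the old share was not emptied first.
        os.rmdir(old_dir)
    # On a single volume this is a rename, not a copy of the data.
    os.rename(new_dir, old_dir)
```

Here that would be replace_share_folder(r"\\server1\shares\photos1", r"\\server1\shares\photos").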
You now have a nice clean indexed version of Photos. It will have every good file from the original set of data.
I have now totally rebuilt my WHS shares using this method. I am pretty sure that 5% of the files were missing. It is possible that WHS had not duplicated them and therefore when the disk failed they were not available on another drive. If so then this is a WHS fault or perhaps a user fault if it told me this but I ignored it.
–update 2011-03-15 –
I now have permissions hell. The files are in the user shares but the users cannot access them. I suspect this is because of the security descriptors on the files from before the restore. A system restore deletes users and you have to recreate them.
Solution: copy every file to the “Public” share to strip its permissions, and then copy it back to the user share.