Why does Windows feel the need to modify the partition table at boot...

...or how I almost lost 1.5 TB of data today.

If you use software/fake RAID, this might happen to you too.

See, this morning I thought I would upgrade my HTPC's BIOS just for the heck of it. This is the PC that has a 2TB RAID5 array based on nVIDIA's MediaShield.
Now, despite what I had selected during the BIOS update, the BIOS settings and DMI data were reset after reboot, which means that the HDDs were back to individual IDE emulated drives, rather than members of the RAID array.

Normally, this wouldn't be a big deal, except that, before I cancelled the Windows boot, it was apparently able to look at the disks (using the MediaShield driver), find out that the capacity of the disk it was booting from (now a single 1TB IDE/AHCI HDD) was less than the capacity reported in the partition table, and re-write the partition table of HDD1 to reduce the dimensions of the last partition.

Of course, rewriting a partition table without anybody asking you to is the shortest way to screw up a disk or RAID array, and screw up it did: as soon as I restored the RAID settings in the BIOS and booted Windows, my 1.5 TB data partition was identified as unformatted and gone! Talk about massive data loss...

No respectable O/S should ever modify a partition table without asking the user first. It's just common sense: the O/S is never, and I have to stress that part, NOT EVER, smarter than its user (no matter what the O/S developers might think, or how smart they think they are themselves). You do not modify a partition table without asking, EVER, it's really as simple as that!

Now, after much cursing, and some accidental good luck, I found that if the first drive was disconnected from the RAID5 array (which happened accidentally as I was trying to swap HDD#2 and HDD#3, since it originally looked like the BIOS upgrade had modified the SATA IDs), the rest of the array booted fine, albeit in degraded mode, and saw the old 1.5 TB data partition alright. That is consistent with Windows having modified only the partition table of the boot drive while the HDDs were in IDE mode.
But of course, as soon as you remove one drive from your RAID5 array, and boot in degraded mode, the array will flag that drive as failed on next reboot.

From there on, the solution is to re-add the drive to the array and let it resync. It takes a while, but if you trust your other disks not to fail during the super-lengthy resync, it's probably the safest solution.
Otherwise, it's probably a good idea to have a copy of the Master Boot Record (i.e. the first 512 bytes) of every single drive from your array, and restore it using a decent O/S like Linux. Plus, as experience will show you time and again, it's also always good practice to keep a copy of the MBRs of all your disks that contain important data, so that you can try to address any kind of partition formatting catastrophe.
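
For what it's worth, here is a minimal Python sketch of that MBR backup, assuming you run it from an O/S that exposes the raw disk as a file (e.g. as root on Linux). The function names and the paths in the usage note are made up for illustration, and this only saves the 512-byte sector; it is not a full recovery tool.

```python
SECTOR_SIZE = 512  # the MBR is the first 512 bytes of the disk


def backup_mbr(device_path, backup_path):
    """Copy the first 512 bytes of device_path into backup_path."""
    with open(device_path, "rb") as dev:
        mbr = dev.read(SECTOR_SIZE)
    if len(mbr) != SECTOR_SIZE:
        raise ValueError("could not read a full 512-byte sector")
    with open(backup_path, "wb") as out:
        out.write(mbr)
    return mbr


def has_boot_signature(mbr):
    """A valid MBR ends with the two bytes 0x55 0xAA."""
    return mbr[510:512] == b"\x55\xaa"
```

Something like backup_mbr("/dev/sda", "sda-mbr.bin") would then save the sector (both names are hypothetical, use your own device). Restoring is the same copy in the other direction, and deserves a lot more care since it overwrites the partition table.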


Displaying International Domain Names in the browser's address bar

  • In IE (>7): Tools -> Internet Options -> Advanced -> International -> Always show encoded addresses -> Uncheck
    A typical "All or nothing" Microsoft solution to a problem that demands a smart approach. Well done Redmond!
  • In Firefox (from Marc Blanchet's post): about:config and filter using keyword "IDN". There you have two possibilities:
    1. Create a "network.IDN.whitelist.com" boolean key and set it to true => will display all the .com IDNs using their Unicode characters, a.k.a. "the Microsoft way"
    2. Create a new network.IDN.whitelist boolean entry for your full domain (eg: "network.IDN.whitelist.xn--abc.com") and set it to true.
Of course, none of this is really satisfying, as domains that do not present any risk of homograph attack (eg: sequence of glyphs that are not used in any language, or one glyph characters that cannot be mistaken for any other), are still not displayed as Unicode by default in those browsers, which really sucks.
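
To see what the "encoded" form of a domain looks like (the xn-- string you need for the Firefox key above), here is a small Python sketch. It relies on Python's built-in "idna" codec, which follows the older IDNA 2003 rules, so treat the result as an approximation of what a given browser does; the helper names are my own.

```python
def to_ascii(domain):
    """Unicode domain -> encoded (xn--) form."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """Encoded (xn--) domain -> Unicode form."""
    return domain.encode("ascii").decode("idna")


print(to_ascii("bücher.com"))           # xn--bcher-kva.com
print(to_unicode("xn--bcher-kva.com"))  # bücher.com
```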

The way to address this issue would be for unicode.org to maintain a list of lookalike characters (i.e. characters considered as dangerous to use in an IDN because they can easily be mistaken for another character). The browser could then use this list (along with a list of all the valid Unicode characters at the time the list was created, as you want to take provision against new spoofable Unicode characters) to sort this issue.
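
As a rough sketch of how a browser could use such a list: the tiny table and the set of "reviewed" characters below are entirely hypothetical stand-ins for the real thing, and the rule (refuse anything that is on the lookalike list or that was not known when the list was created) is just my reading of the idea above.

```python
# Hypothetical lookalike list: characters easily mistaken for an ASCII letter.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
}

# Hypothetical set of every character that existed when the list was created.
REVIEWED = (
    set("abcdefghijklmnopqrstuvwxyz0123456789-")
    | set(LOOKALIKES)
    | {"\u00fc", "\u4f8b"}
)


def safe_to_display(label):
    """True if the label can be shown as Unicode, False to keep it encoded."""
    for ch in label:
        if ch in LOOKALIKES:
            return False  # known lookalike: spoofing risk
        if ch not in REVIEWED:
            return False  # newer than the list: nobody has vetted it yet
    return True
```

With this, safe_to_display("bücher") passes, while a "paypal" spelled with a Cyrillic "а" is rejected and stays in its encoded form.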

Of course, Unicode is likely to have cold feet about producing such a list, as they could end up being sued if they miss a lookalike, but really, as the authority, it should be part of their role...


Unicode charts as a single image

Well, one guy was crazy enough to create a single image containing all the Unicode charts from unicode.org and put it online, and he was even nice enough to provide both an interface to dynamically visualize the rendered image and a way to download the whole thing (which, even if slightly outdated, I find super useful).

Props to Ian then!

Update 2009.08.24 - You might also be interested in this


Windows Command Prompt in UTF-8

  1. Change the default font to Lucida Console
  2. Issue the command chcp 65001