2009-08-26

Search for low pointer values in IDA Pro

Use the following regexp:
DPTR, #0x[0-9A-F]{3}$

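This matches only DPTR assignments whose immediate value has exactly three hex digits. Since IDA does not print leading zeros, that means values from 0x100 to 0xFFF; use {1,3} instead of {3} to also catch values below 0x100.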
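
If you want to try the expression outside of IDA, here is a minimal sketch using Python's re module as a stand-in for IDA's regex search. The sample lines and the find_low_dptr name are made up for illustration, and it assumes the operand sits at the very end of the line (no trailing comment), since the pattern is anchored with $.

```python
import re

# Same expression as above: exactly three hex digits, then end of line.
LOW_DPTR = re.compile(r"DPTR, #0x[0-9A-F]{3}$")

# Hypothetical disassembly lines, written the way IDA prints immediates.
SAMPLE = [
    "mov DPTR, #0xC46A",
    "mov DPTR, #0x7F2",
    "mov DPTR, #0xB03E",
    "mov DPTR, #0x1F",
]

def find_low_dptr(lines):
    """Return the lines that load a three hex digit immediate into DPTR."""
    return [line for line in lines if LOW_DPTR.search(line)]

for hit in find_low_dptr(SAMPLE):
    print(hit)
```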

Switch statements, 8032/8051 style

Now this next trick is a little bit more clever.

If, when disassembling 8032 / 8051 code, you ever see the start of an lcall'ed routine that looks like this
ROM:0000B5BA                 pop     DPH             ; Data Pointer, High Byte
ROM:0000B5BC pop DPL ; Data Pointer, Low Byte
and then proceeds to use DPTR for movc instructions, the effective DPTR address used in that routine will be the address right after the lcall (because lcall pushed its return address, i.e. the PC of the next instruction, on the stack).

This is effectively used by, hum, some code, to implement switch statements that might be a bit tricky to detect during reverse engineering, as the disassembler obviously expects the lcall to return, and will start disassembling the switch table as if it were code.

Example code:
ROM:00006CA7 ROM_6CA7:                               ; CODE XREF: ROM_6C60+15j
ROM:00006CA7 ; ROM_6C60+1Bj ...
ROM:00006CA7 mov DPTR, #0xC46A
ROM:00006CAA movx A, @DPTR
ROM:00006CAB xrl A, #0x60
ROM:00006CAD jnz ROM_6CF4
ROM:00006CAF mov DPTR, #0xB03E
ROM:00006CB2 movx A, @DPTR
ROM:00006CB3 lcall case_switch_byte
ROM:00006CB3 ; ---------------------------------------------------------------------------
ROM:00006CB6 .word ROM_6CE4
ROM:00006CB8 .byte 0
ROM:00006CB9 .word ROM_6CE4
ROM:00006CBB .byte 0x20
ROM:00006CBC .word ROM_6CE0
ROM:00006CBE .byte 0x2A
ROM:00006CBF .word ROM_6CDC
ROM:00006CC1 .byte 0x2B
ROM:00006CC2 .word ROM_6CDE
ROM:00006CC4 .byte 0x2D
ROM:00006CC5 .word ROM_6CE2
ROM:00006CC7 .byte 0x2F
ROM:00006CC8 .word ROM_6CDA
ROM:00006CCA .byte 0x39
ROM:00006CCB .word ROM_6CD2
ROM:00006CCD .byte 0x5A
ROM:00006CCE .word 0
ROM:00006CD0 .word 0x6CEC
ROM:00006CD2 ; ---------------------------------------------------------------------------
ROM:00006CD2
ROM:00006CD2 ROM_6CD2: ; DATA XREF: ROM_6C60+6Bo
ROM:00006CD2 mov DPTR, #0xB03E
ROM:00006CD5 mov A, #0x30 ; '0'
ROM:00006CD7 movx @DPTR, A
ROM:00006CD8 sjmp ROM_6D54
ROM:00006CDA ; ---------------------------------------------------------------------------


with case_switch_byte being:

ROM:B5BA
ROM:B5BA ; =============== S U B R O U T I N E =======================================
ROM:B5BA
ROM:B5BA ; Input: A = matching case value (byte)
ROM:B5BA
ROM:B5BA case_switch_byte: ; CODE XREF: ROM:4046p
ROM:B5BA ; ROM_4552+178p ...
ROM:B5BA pop DPH ; Data Pointer, High Byte
ROM:B5BC pop DPL ; DPTR = lcall return address
ROM:B5BE mov R0, A
ROM:B5BF
ROM:B5BF loop_through_cases: ; CODE XREF: case_switch_byte+24j
ROM:B5BF clr A
ROM:B5C0 movc A, @A+DPTR ; NB: @ is confusing. It's just A+DPTR
ROM:B5C1 jnz valid_dest ; make sure dest @ != 0
ROM:B5C3 mov A, #1
ROM:B5C5 movc A, @A+DPTR
ROM:B5C6 jnz valid_dest
ROM:B5C8 inc DPTR
ROM:B5C9 inc DPTR ; if null dest, just use the next
ROM:B5C9 ; word as dest address
ROM:B5CA
ROM:B5CA dest_match: ; CODE XREF: case_switch_byte+1Fj
ROM:B5CA movc A, @A+DPTR
ROM:B5CB mov R0, A
ROM:B5CC mov A, #1
ROM:B5CE movc A, @A+DPTR
ROM:B5CF mov DPL, A ; Data Pointer, Low Byte
ROM:B5D1 mov DPH, R0 ; dest into DPTR
ROM:B5D3 clr A
ROM:B5D4 jmp @A+DPTR
ROM:B5D5 ; ---------------------------------------------------------------------------
ROM:B5D5
ROM:B5D5 valid_dest: ; CODE XREF: case_switch_byte+7j
ROM:B5D5 ; case_switch_byte+Cj
ROM:B5D5 mov A, #2
ROM:B5D7 movc A, @A+DPTR
ROM:B5D8 xrl A, R0 ; cmp val with parameter
ROM:B5D9 jz dest_match
ROM:B5DB inc DPTR
ROM:B5DC inc DPTR
ROM:B5DD inc DPTR ; skip 3 bytes to next switch table entry
ROM:B5DE sjmp loop_through_cases
ROM:B5DE ; End of function case_switch_byte
ROM:B5DE
ROM:B5E0


The same kind of routine also exists for a word parameter instead of a byte.
Most of the time, the case values will follow some kind of logical order, so if you see a bunch of sequential bytes or words, interlaced with what look like offsets, and preceded by an lcall, you might want to check what's on the other end of that lcall.

OR, the preferred way once you have made your initial pass at identifying code, look for a function that starts by popping DPH and DPL, and seek all the lcalls that cross reference to it to identify the switch tables.
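
Once you have located such a table, decoding it by hand gets old fast. Below is a minimal Python sketch of a decoder, assuming the layout that case_switch_byte above walks through: 3-byte entries (destination high byte, destination low byte, case value), terminated by a null destination word that is followed by the default destination. The function names are mine, and the TABLE bytes are simply the example table from the listing above, retyped as hex.

```python
def decode_switch_table(rom, offset):
    """Decode a byte-keyed switch table starting at rom[offset].

    Returns (cases, default), where cases is a list of
    (case_value, destination) tuples in table order.
    """
    cases = []
    pos = offset
    while True:
        dest = (rom[pos] << 8) | rom[pos + 1]      # big-endian destination
        if dest == 0:
            # Null destination: the next word is the default target.
            default = (rom[pos + 2] << 8) | rom[pos + 3]
            return cases, default
        cases.append((rom[pos + 2], dest))
        pos += 3                                   # next 3-byte entry

def lookup(cases, default, value):
    """Mimic the routine: first entry whose case value matches wins."""
    for case_value, dest in cases:
        if case_value == value:
            return dest
    return default

# The table that follows the lcall at ROM:6CB3 in the example above.
TABLE = bytes.fromhex(
    "6CE400" "6CE420" "6CE02A" "6CDC2B"
    "6CDE2D" "6CE22F" "6CDA39" "6CD25A"
    "0000" "6CEC"
)

cases, default = decode_switch_table(TABLE, 0)
for case_value, dest in cases:
    print("case 0x%02X -> ROM_%04X" % (case_value, dest))
print("default   -> ROM_%04X" % default)
```

With a full ROM dump loaded as bytes, the offset would be the address right after each lcall that cross references the routine.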

Oh, and for those who might wonder, of course, as soon as you pop the return address that's been pushed on the stack, the lcall never returns, and becomes exactly like a jump.

Coming next: How the hell are these bloody strings and other data sitting in standalone data sections referenced, where there does not appear to be any obvious address referencing to them anywhere in the disassembly...

2009-08-24

Now I remember why I hate the 8032...

...and pretty much EVERY CPU designed by intel, ever.

It's not like they could have turned a 16 bit CPU into a 24 or 32 bit addressing one, because that's way harder to do than extending the 32 bit x86 to 64 bit for instance (or the original 16 bit x86 to 32 bit)...

"Look Ma! It's a flock of randomly chosen flying search engine terms to direct a certain kind of people here!": Mediatek, MT8226, 8226, LCD, Samsung, 8032, 8051, MT13x9, bank, page, 0xFFFF, reverse engineering, firmware.

The usual trick, apparently originally devised by Philips, is to use one of the 8032 ports (usually P1) as an extension of the 16 bit address bus, thus acting as a ROM bank selector => if you use 8 bits of that port, you now have 24 bit addressing.

Of course, this doesn't mean that your PC will suddenly become 24 bit aware, and since it still remains 16 bit, you need to have the same switching code overlapping at the same address in each of your 64 KB banks ([0000-FFFF]) so that, as far as the CPU is concerned, execution simply carries on when the bank changes underneath it.

This usually results in LOADS AND LOADS of ROM space wasted with the following kind of crap:

ROM:8301 ; --------------------------------------------------------------------------
ROM:8301 mov DPTR, #0x7E2D
ROM:8304 ljmp bank_0E
ROM:8307 ; ---------------------------------------------------------------------------
ROM:8307 mov DPTR, #0xE4D1
ROM:830A ljmp bank_0B
ROM:830D ; ---------------------------------------------------------------------------
ROM:830D mov DPTR, #0xF51E
ROM:8310 ljmp bank_0F
ROM:8313 ; ---------------------------------------------------------------------------
ROM:8313 mov DPTR, #0xFA0D
ROM:8316 ljmp bank_0F
ROM:8319 ; ---------------------------------------------------------------------------
ROM:8319 mov DPTR, #0xF924
ROM:831C ljmp bank_0F
ROM:831F ; ---------------------------------------------------------------------------
ROM:831F mov DPTR, #0xFDA7
ROM:8322 ljmp bank_0D


with a bank switching function along these lines:

ROM:8288 ; ---------------------------------------------------------------------------
ROM:8288 bank_0E: ; CODE XREF: ROM:84F6j
ROM:8288 ; ROM:852Cj ...
ROM:8288 mov A, P1 ; Port 1
ROM:828A anl A, #0xF ; 4 bits for high addressing => 16 banks
ROM:828C cjne A, #0xE, ROM_8291 ; bank 14 (0x0E)
ROM:828F clr A
ROM:8290 jmp @A+DPTR
ROM:8291 ; ---------------------------------------------------------------------------
ROM:8291 ROM_8291: ; CODE XREF: ROM:828Cj
ROM:8291 swap A
ROM:8292 rr A
ROM:8293 push ACC ; Accumulator
ROM:8295 mov A, #0xBF ; '+'
ROM:8297 push ACC ; Accumulator
ROM:8299 push DPL ; Data Pointer, Low Byte
ROM:829B push DPH ; Data Pointer, High Byte
ROM:829D ljmp ROM_BF70
ROM:82A0 ; ---------------------------------------------------------------------------
(...)
ROM:BF70 ; ---------------------------------------------------------------------------
ROM:BF70 ROM_BF70: ; CODE XREF: ROM:829Dj
ROM:BF70 anl P1, #0xF0 ; Go through bank 0
ROM:BF73 orl P1, #0xE ; Set the high address lines to bank 0x0E on Port 1
ROM:BF76 ret
ROM:BF76 ; ---------------------------------------------------------------------------


As indicated above, each bank has to DUPLICATE this whole bank switching business at the exact same address. And there will be a short period where code is actually run in Bank 0 (right after the "anl P1, #0xF0") before switching to the actual destination bank (here Bank 0Eh).
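
To make the addressing concrete, here is a small Python sketch of the arithmetic involved. It mirrors the listing above in that the bank number lives in the low nibble of P1; the assumption that the 64 KB banks sit back to back in a flat ROM dump is mine and may not hold for every firmware image, and all the names are made up for illustration.

```python
BANK_SIZE = 0x10000   # each bank covers the full 16 bit address range
BANK_MASK = 0x0F      # low nibble of P1 selects the bank (as in the listing)

def current_bank(p1):
    """Bank currently selected, given the value of port P1."""
    return p1 & BANK_MASK

def select_bank(p1, bank):
    """New P1 value, following the anl/orl pair at ROM_BF70."""
    if not 0 <= bank <= BANK_MASK:
        raise ValueError("bank must be in 0..15")
    p1 &= 0xF0          # transient state: this is bank 0
    return p1 | bank    # then the destination bank

def rom_offset(bank, addr):
    """Offset in a flat dump, ASSUMING banks are stored contiguously."""
    if not 0 <= bank <= BANK_MASK:
        raise ValueError("bank must be in 0..15")
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("addr must fit in 16 bits")
    return bank * BANK_SIZE + addr

# First stub above: mov DPTR, #0x7E2D / ljmp bank_0E
print(hex(rom_offset(0x0E, 0x7E2D)))
```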

Only thing I can say is, seeing how crappy a CPU it is (don't get me started about how poorly optimized all the 8032 code I see seems to be), licensing for the 8032 must come really REALLY cheap!

But then again, having SoC makers like MediaTek design their systems on the cheap might have its advantages...

2009-07-23

Why does Windows find the need to modify the partition table at boot...

...or how I almost lost 1.5 TB of data today.

If you use software/fake RAID, this might happen to you too.

See, this morning I thought I would upgrade my HTPC's BIOS just for the heck of it. This is the PC that has a 2TB RAID5 array based on nVIDIA's MediaShield.
Now, despite what I had selected during the BIOS update, the BIOS settings and DMI data were reset after reboot, which means that the HDDs were back to individual IDE emulated drives, rather than members of the RAID array.

Normally, this wouldn't be a big deal, except that, before I cancelled the Windows boot, it was apparently able to look at the disks (using the MediaShield driver), find out that the capacity of the disk it was booting from (now a single 1TB IDE/AHCI HDD) was less than the capacity reported in the partition table, and re-write the partition table of HDD1 to reduce the dimensions of the last partition.

Of course, rewriting a partition table without anybody asking you to is the shortest way to screw up a disk or RAID array, and screw up it did: As soon as I restored the RAID settings in the BIOS and booted Windows, my 1.5 TB data partition was now identified as unformatted and gone! Talk about massive data loss...

No respectable O/S should ever modify a partition table without asking the user first. It's just common sense: The O/S is never, and I have to stress that part, NOT EVER, smarter than its user (no matter what the O/S developers might think, or how smart they think they are themselves). You do not modify a partition table without asking, EVER, it's really as simple as that!

Now, after much cursing, and some accidental good luck, I found that if the first drive was disconnected from the RAID5 array (which happened accidentally as I was trying to invert HDD#2 and HDD#3, since it originally looked like the BIOS upgrade had modified the SATA IDs), the rest of the array booted fine, albeit in degraded mode, and saw the old 1.5 TB data partition alright. That definitely makes sense, given that Windows would of course only have modified the partition table of the boot drive while the HDDs were in IDE mode.
But of course, as soon as you remove one drive from your RAID5 array, and boot in degraded mode, the array will flag that drive as failed on next reboot.

From there on, the solution is to re-add the drive to the array to resync. Takes a while, but if you trust your other disks not to fail during the super-lengthy re-sync, probably the safest solution.
Otherwise, it's probably a good idea to have a copy of the Master Boot Record (i.e. the first 512 bytes) of every single drive from your array, and restore it using a decent O/S like Linux. Plus, as experience will show you time and again, it's also always good practice to keep a copy of the MBRs of all your disks that contain important data, so that you can try to address any kind of partition formatting catastrophe.
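
For the backup part, a minimal Python sketch could look like the following. It only reads: it copies the first 512 bytes of a device to a file and lists the primary partition entries so you can compare backups later. The function names are mine, the device path you pass in (something like /dev/sda, read as root) is up to you, and restoring is deliberately left out, as writing to the wrong disk is exactly the kind of catastrophe we are trying to avoid.

```python
import struct

SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"      # last two bytes of a valid MBR
PARTITION_TABLE_OFFSET = 446      # four 16-byte entries start here

def read_mbr(path):
    """Read and sanity check the first sector of a disk or disk image."""
    with open(path, "rb") as dev:
        mbr = dev.read(SECTOR_SIZE)
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != BOOT_SIGNATURE:
        raise ValueError("%s does not start with a valid MBR" % path)
    return mbr

def backup_mbr(device_path, backup_path):
    """Save the MBR of device_path into backup_path and return it."""
    mbr = read_mbr(device_path)
    with open(backup_path, "wb") as out:
        out.write(mbr)
    return mbr

def partitions(mbr):
    """Return (index, type, start_lba, sector_count) for used entries."""
    entries = []
    for i in range(4):
        start = PARTITION_TABLE_OFFSET + 16 * i
        raw = mbr[start:start + 16]
        ptype = raw[4]
        start_lba, sectors = struct.unpack_from("<II", raw, 8)
        if ptype != 0:
            entries.append((i + 1, ptype, start_lba, sectors))
    return entries
```

Comparing the partitions() output of a saved copy against the live disk is a quick way to spot a partition table that was resized behind your back.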

2009-07-20

Displaying International Domain Names in the browser's address bar

  • In IE (>7): Tools -> Internet Options -> Advanced -> International -> Always show encoded addresses -> Uncheck
    A typical "All or nothing" Microsoft solution to a problem that demands a smart approach. Well done Redmond!
  • In Firefox (from Marc Blanchet's post): about:config and filter using keyword "IDN". There you have two possibilities:
    1. Create a "network.IDN.whitelist.com" boolean key and set it to true => will display all the .com IDNs using their Unicode characters, a.k.a. "the Microsoft way"
    2. Create a new network.IDN.whitelist boolean entry for your full domain (eg: "network.IDN.whitelist.xn--abc.com") and set it to true.
Of course, none of this is really satisfying, as domains that do not present any risk of homograph attack (eg: sequence of glyphs that are not used in any language, or one glyph characters that cannot be mistaken for any other), are still not displayed as Unicode by default in those browsers, which really sucks.
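
As a side note, the encoded "xn--" form that the browsers fall back to can be computed with Python's built-in idna codec. This is only a sketch: the to_encoded name is mine, and the domain used is just a well-known example, not one from the whitelist discussion above.

```python
def to_encoded(domain):
    """Return the ASCII (xn--) form of an internationalized domain name."""
    return domain.encode("idna").decode("ascii")

# "bücher.com", written with an escape to keep this file plain ASCII.
print(to_encoded("b\u00fccher.com"))
```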

The way to address this issue would be for unicode.org to maintain a list of lookalike characters (i.e. characters considered as dangerous to use in an IDN because they can easily be mistaken for another character). The browser could then use this list (along with a list of all the valid Unicode characters at the time the list was created, as you want to take provision against new spoofable Unicode characters) to sort this issue.

Of course, Unicode is likely to have cold feet about producing such a list, as they could end up being sued if they miss a lookalike, but really, as the authority, it should be part of their role...

2009-07-16

Unicode charts as a single image

Well, one guy was crazy enough to create a single image containing all the Unicode charts from unicode.org and put it online, and he was even nice enough to provide both an interface to dynamically visualize the rendered image, as well as allow people to download the whole thing (which, even if slightly outdated, I find super useful).

Props to Ian then!

Update 2009.08.24 - You might also be interested in this

2009-07-15

Windows Command Prompt in UTF-8

  1. Change the default font to Lucida Console
  2. issue the command chcp 65001

2009-06-15

Convert a grayscale or RGB layer to an alpha channel in Photoshop CS 4

Goddammit! I don't know if 3D graphics designers use Photoshop or not, and if they do, how they manage to put up with it all day, but talk about a product not being intuitive... On the other hand, it's true that I know a thing or two about how Adobe doesn't care that much about making any of the software they design either customer friendly or intuitive...

So, today, in our new "What we thought we'd do in about 30 seconds in Photoshop CS4 but ended up doing in hours" saga, we'll see how to convert a shadow effect layer into an alpha channel. The purpose here is to display a nice transparent text overlay texture, with a blending shadow against a non uniform background in a 3D animation (eg. DirectX, OpenGL, ...). 

And because I'm in a generous mood today, I'm even gonna drop some illustrations here and there. Let's say that you are creating an animated "motivational poster", like the one on the left, and your goal is to do some nice animation of the motto, like a zoom out for instance, while making sure the motto has a shadow for added impact.

You will of course break down your animation between a static background and a text layer, loaded as a texture, and animate the latter.

Now, if you merely save the text layer with a shadow effect in Photoshop and use that as your texture, it will not blend nicely at all, as the shadow will be a solid colour, and eat part of the nice aperture green logo. Therefore you want to add an alpha component to your texture, which you'll use for shadow blending.

At this stage, I'll suppose that you have your nice text layer with its shadow effect. The good part about all this ordeal is that you won't have to create a separate layer out of the effect, or play with adding/subtracting selections, because a black text display with its shadow is pretty much exactly what you need to use for the alpha channel. The only thing is, if you use coloured or texturized text, you'll want to convert it to solid black before the next operations, through a duplicate layer or something.

Now, if you're a Photoshop novice and you've done your research, you should be aware that the alpha component of an image is dealt with as a channel in Photoshop, and that there is a tab for that (again non-intuitive as hell: where is my "right click -> new channel" option, Adobe? And how am I supposed to know what the small icons at the bottom represent when there is no bleeping contextual help in your product?!?). So you'd think that, with all the graphics people dealing with channels, depth masks and alpha all day long, there'd be a bloody "Convert Layer to Channel" function readily available by default in the interface. Well, if it was straightforward to work with alpha channels in Photoshop, that would make the life of graphics designers too easy, and obviously this is not something that you would ever want.

So, after cursing for about 10 minutes about those non-intuitive menus, we have to fall back to looking on the internet for some helpful souls (and I got to give it to CreativeCow there, for providing all the actual solution steps):

  1. Search into your Photoshop installation directory for .atn files (AcTioNs) and locate the one labelled "Video Actions.atn". Mine was in "D:\Program Files\Adobe\Adobe Photoshop CS4 (64 Bit)\Presets\Actions"

  2. In Photoshop, go to Window -> Actions. You will see a set of "Default Actions", but of course, not all the Actions that come with Photoshop have been loaded by default in one of the more obscure menus of the product, where no one would mind if they had been.

  3. Then click on what has to be the most useful button for advanced users, and therefore the one that's been made as easy to overlook as possible, namely that bleeping contextual dropdown menu. And I really have to wonder what it is with this late redesign of UIs by Adobe and Microsoft (eg. Office 2007), whose only purpose seems to be to make the power users' life a living hell by hiding the most powerful features in a place where the sun don't shine no more...

  4. In that menu, you will find a "Load Action" option, which you should use to navigate to the folder where you spotted the "Video Actions.atn" file, and open that file. Alternatively, you can skip step 1 and directly select "Video Actions" in that menu to make the Video Actions appear. Why "Video Actions" is in that menu but not in the main action window by default is really beyond me here...

  5. OK, now we're talking: In the "Video Actions" section, the first 2 actions are: "Alpha Channel from Visible Layers" and "Alpha Channel from Visible Layers (inverted)". The Ark of the Covenant at last! And all we had to do was get our face and soul melted for it - a bargain! Merrily we go then by setting only our "black text with shadow" layer visible (make sure you don't select the background) and then clicking the little "Play" button in the Actions menu with "Alpha Channel from Visible Layers" selected. You should end up with a new channel that looks like the one on the left. In this image, white is for pixels with no alpha (completely opaque) whereas black is for pixels with 100% alpha (completely transparent).

  6. Now, before we save our image with its newfound alpha channel, we want to lay it over a solid black background for the shadow, so just create one and save only the text layer (with or without the shadow effect - it shouldn't matter) over the black background as your image with alpha. You might want to enable the alpha channel (set the eye) after creating the black background to have a better idea of what exactly you're saving.

  7. Now, because there does not appear to be a clear convention as to what a zero alpha value is meant to represent (this can vary from application to application), you might find that the alpha mask you get in your application has the opposite effect of what you wanted. If that's the case, rather than redo the steps using the "inverted" action for layer to alpha we've seen in the menu, you can simply double click on your Alpha channel in Photoshop and toggle between "Color Indicates Masked Area" / "Color Indicates Selected Area", as this will invert your Alpha channel.

And that, sir, is how you do it.

Of course, if you look at the actions, you can probably figure out how to achieve the same from the various menus (haven't really figured that one just yet, especially where a on/off selection is converted to a grayscale range). I also haven't figured out how you can insert one of those actions into the standard Photoshop menus (wouldn't it sit nicely in the layer menu?). Oh and finally, if like me you've cursed Photoshop's stupid and uninformative handling of undo (Why is it that a 10 year old program like Paint Shop Pro 5 does a much better job at undo than Photoshop?), note that you can access the undo history from the Window menu as well, where you will actually get a description of the actions you have taken.
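
For what it's worth, what the 3D application ends up doing with that alpha channel boils down to a per-pixel weighted average, and the inversion from step 7 is a one-liner. Here is a minimal Python sketch of both, assuming 8 bit channels and the "255 = fully opaque" convention; the function names are mine, and an actual DirectX/OpenGL pipeline does this on the GPU rather than pixel by pixel in software.

```python
def blend(src, dst, alpha):
    """Lay src over dst. Channels are 0..255, alpha 255 means opaque src."""
    return tuple(
        (s * alpha + d * (255 - alpha) + 127) // 255   # rounded average
        for s, d in zip(src, dst)
    )

def invert_alpha(alpha):
    """Flip the convention, for applications where 0 means opaque."""
    return 255 - alpha

# A half transparent black shadow pixel over a green background pixel.
shadow = (0, 0, 0)
background = (0, 200, 0)
print(blend(shadow, background, 128))
```

With a solid colour texture instead (alpha stuck at 255), the shadow pixel simply replaces the background, which is the ugly result we were trying to avoid in the first place.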

2009-04-20

More vi search and replace

To replace ^M characters that did not quite make it into carriage return when displayed in vi:
:%s/<Ctrl-V><Ctrl-M>/<Ctrl-V><Enter>/g
Note that <Ctrl-V> is a magic sequence that tells vi not to interpret the next special key, and to insert its control code instead.

2009-03-12

redhat/centos services and mdadm quicknotes

Always seem to forget stuff there. The command I'm looking for to manage services/change runlevel on redhat/centos is chkconfig (chkconfig --list).

Also, to re-add a partition to an existing (failed) RAID array:
mdadm /dev/md0 --manage --add /dev/sdb1

2009-03-09

MySQL logging

One day or another, you'll want to check what exactly your MySQL server is doing (to check for SQL injections for instance). Therefore:
  • To log all SQL queries, add the following to your my.cnf (in the [mysqld] section)
    log=/var/log/mysqld.log
  • To log queries that actually modify the database:
    log-bin
    If you don't modify the default log-bin parameters, this would create a <servername>-bin.###### & index in your /var/lib/mysql, which you can then analyze with mysqlbinlog

Getting some bloody files from Gigabyte

There's not ONE motherboard manufacturer to save the others. NOT ONE!
Listen, we all know that you're hardware developers, and that providing files is an afterthought, but that is no reason to:
- force customers to use God-awful, slow, and completely useless interfaces on your website when all they want is to access BIOSes or drivers
- host your bloody files on a server that must still be powered by a DX2 66MHz CPU with 8 MB RAM. Yeah, it really looks that slow from our end!!!
- suddenly decide that you're gonna change all the file locations, because it was a bit too convenient for customers to google and access the data directly, so what better way to say "f*ck you!" than changing the interface and means of accessing the files every 6 months or so.

Asus, Gigabyte, MSI, Supermicro, whatever: YOUR SUPPORT & DOWNLOADS WEBSITES ALL SUCK, BIG TIME!!!

It's not like we're asking for much. We just want to access the BIOS, drivers and utilities without having to jump in slow motion (or should I say bullet-time, considering how each page refresh seems to make time crawl to a standstill) through hoops. And I won't hold much of a grudge against you for designing websites that suck, because that's not your main area of expertise, but having them run on underperforming hardware is hardly the best way to give your customers confidence.

All this rant to tell you that, if you actually want to download a file from GigaByte, don't bother going to gigabyte.com.tw. Just use ftp://download.gigabyte.ru/

2009-02-26

isw: Could not find disk /dev/sd# in the metadata error with dmraid

Reinstalled my old rig yesterday, to switch from a RAID0 to a RAID1 conf, now that the bulk of my data is stored in the RAID5 array from my HTPC. As usual, using the motherboard's fakeraid driver. This time it's Intel's ICH8R Matrix Storage RAID driver (and boy is it slow on rebuilding RAID1 arrays, which I had to do since my brand new Vista64 only stayed stable for four hours before a complete freeze that required a hard reset!).

As usual, went through a Linux reinstall, only to find out that dmraid 1.0.0.rc15 would return the following:
root@dusk:/usr/src/dmraid/1.0.0.rc15# dmraid -ay
ERROR: isw: Could not find disk /dev/sdc in the metadata
ERROR: isw: Could not find disk /dev/sdb in the metadata
no raid disks
Well, once again, it was Debian patches to the rescue. Just make sure you apply the latest dmraid_######.diff patches to your dmraid install (just like we did for grub, see below) and voila!:
root@dusk:/usr/src/dmraid/1.0.0.rc15# dmraid -ay
RAID set "isw_dadaadaecb_Ark" was activated
RAID set "isw_dadaadaecb_Ark1" was activated
RAID set "isw_dadaadaecb_Ark2" was activated

2009-02-24

Booting with a failed primary HDD on a Linux RAID1 array

Thought it would be straightforward, and that the system would automagically take care of it by itself? Think again:
  1. GRUB is dumb, until we tell it exactly what to do, so, assuming your /boot partition is located on sda1/sdb1, you need to make sure that you manually set up both disks, using:
    [root@whatever ~]# grub

    GNU GRUB version 0.95 (640K lower / 3072K upper memory)

    [ Minimal BASH-like line editing is supported. For the first word, TAB
    lists possible command completions. Anywhere else TAB lists the possible
    completions of a device/filename.]

    grub> root (hd0,0)
    Filesystem type is ext2fs, partition type 0xfd

    grub> setup (hd0)
    Checking if "/boot/grub/stage1" exists... yes
    Checking if "/boot/grub/stage2" exists... yes
    Checking if "/boot/grub/e2fs_stage1_5" exists... yes
    Running "embed /boot/grub/e2fs_stage1_5 (hd0)"... 15 sectors are embedded.
    succeeded
    Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2
    /boot/grub/grub.conf"... succeeded
    Done.

    grub> root (hd1,0)
    Filesystem type is ext2fs, partition type 0xfd

    grub> setup (hd1)
    Checking if "/boot/grub/stage1" exists... yes
    Checking if "/boot/grub/stage2" exists... yes
    Checking if "/boot/grub/e2fs_stage1_5" exists... yes
    Running "embed /boot/grub/e2fs_stage1_5 (hd1)"... 15 sectors are embedded.
    succeeded
    Running "install /boot/grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/boot/grub/stage2
    /boot/grub/grub.conf"... succeeded
    Done.

    grub> quit

  2. If you have additional disks that you fsck (last column of your fstab), be mindful that libata will shuffle their names around once a disk is missing, so if you set your /dev/sdc1 to be checked at boot time, you might end up getting stuck with fsck not finding the disk, and not being able to boot at all!

2009-02-10

Booting fakeraid RAID5 Linux, the less half assed way

This will be the final episode of our ongoing installation of Linux on a nVidia MediaShield fakeraid RAID5 array.

By now, you should have a kernel and an initrd.gz image that allow you to boot onto your RAID5 partition, the only remaining problem being that you need an external non RAID5 disk to do so. That disk could probably be put to better use, so let's remove it from the equation. And no, we're not gonna use a USB stick, CD-ROM or even a floppy, because what we need is a solution always at our fingertips, that we can rely on 24/7, without having to look for a flimsy little device/disc, that we're sure to lose or scratch anyway. Instead, we're going to use something MUCH better.

Re-introducing, the best invention since sliced bread: the fully hacked WRT54G, with supporting cast: PXE!

If by now you don't have a hacked WRT54G in your house, you're an idiot, and if you didn't get a motherboard + ethernet controller that supports PXE, you also need to rethink your life.

You'll find plenty of (seriously confusing) guides on how to use PXE, so I'll just cut to the chase. What we need here is a PXE executable that can boot our vmlinuz + initrd.gz, as well as relinquish the boot operation to the default disk device, so that we can choose between Vista and Linux.
If you do the groundwork, you'll probably find out that GRUB can be compiled to be run from PXE. The only issue we have here is that the nForce forcedeth network driver is not embedded with GRUB 0.97, and while you can pick one up from gPXE (which is where all the network drivers from GRUB come from anyway), if you want to do things cleanly, you'll have to recreate the configure/Makefile settings to add forcedeth, which is a major pain in the ass (or you can simply hijack one of the existing driver files, and replace it with the forcedeth.c code, but this lazy man's option of choice is not really gratifying).

Instead, we're going to use PXELINUX, a spinoff of the Syslinux project.
Since our WRT router is using DNSMasq, we'll practically follow the instructions from the link above, so if you need more than what I present below, feel free to refer to the official documentation.
  1. Create a tftpboot directory on your multipurpose router. I am creating mine in /share/tftpboot, as the /share directory is already shared using Samba, which makes it very convenient to set up our PXE server files
  2. Copy your vmlinuz and initrd.gz images in this directory
  3. Download and extract the latest syslinux.zip from kernel.org.
  4. Copy the pxelinux.0 from the syslinux.zip core/ directory into your tftpboot dir
  5. Copy menu.c32 from /com32/menu into your tftpboot dir
  6. It's also a nice feature to be able to boot DOS floppy images using PXE as well (eg: Win98 bootdisk, Memtest86+, etc.). To do that, copy memdisk from the memdisk/ directory to your tftpboot dir. Note that memdisk can also handle gzip compressed images, which can help reduce transfer time
  7. Optional: If you want to confirm that memdisk works, or if you simply want to have the Memtest86+ option, you can also download the compressed Memtest86+ v2.10 image that I am providing here. Note that this is the same memtestp.bin that you can download from http://www.memtest.org/ except padded to reach the standard 1.44 MB floppy size. Despite multiple attempts, I haven't yet found how to get memdisk to auto-pad floppy images...
    Oh, and the reason I use Memtest86+ rather than Memtest86 is that the latter didn't seem to handle my 4 GB RAM properly
  8. Create a pxelinux.cfg/ directory in your tftpboot dir
  9. Create a graphics.conf text file there, with the following content:
    menu color tabmsg 37;40      #80ffffff #00000000
    menu color hotsel 30;47 #40000000 #20ffffff
    menu color sel 30;47 #40000000 #20ffffff
    menu color scrollbar 30;47 #40000000 #20ffffff
    menu master passwd yourpassword
    menu width 80
    menu margin 22
    menu passwordmargin 26
    menu rows 6
    menu tabmsgrow 15
    menu cmdlinerow 15
    menu endrow 24
    menu passwordrow 12
    menu timeoutrow 13
    menu vshift 6
    menu passprompt enter password:
    noescape 1
    allowoptions 1
  10. Create a default text file in the pxelinux.cfg directory, and adapt the following content to your needs (NB: By default, this menu will apply to all PXE devices. If you want to target specific MACs, just change the "default" name to the MAC address of your interface, in lowercase, dash separated and prefixed with "01-" for Ethernet, eg: "01-88-99-aa-bb-cc-dd"):
    default menu.c32
    prompt 0

    menu title PXE Boot Menu
    menu include pxelinux.cfg/graphics.conf
    menu autoboot Starting Local System in # seconds

    label bootlocal
    menu label ^Local System
    menu default
    localboot 0
    timeout 35

    label linux
    menu label Slackware 12.2 (RAID5)
    kernel vmlinuz
    append initrd=initrd.gz vga=791 vt.default_utf8=1

    label dos
    menu label DOS (Win98)
    kernel memdisk
    append floppy initrd=win98.gz

    label memtest
    menu label Memtest86+
    kernel memdisk
    append floppy initrd=memtestp.gz
  11. Now, you need to tell your DHCP server that it should handle PXE, and which files it should provide. If you are using DNSMasq, this is as trivial as adding the following lines:
    # PXE Booting
    enable-tftp
    tftp-root=/share/tftpboot
    # This is the file that will be tftp'd across
    dhcp-boot=pxelinux.0
    and then restarting the daemon (/etc/init.d/dnsmasq restart)

  12. Finally, you can enable PXE in your PC BIOS, and watch in awe as you can now conveniently boot whatever you want from the network. If it doesn't work, I'd suggest you run a test with memdisk and a bootdisk (eg: memtest), as it shouldn't require much of any configuration to run properly.
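
Two of the steps above lend themselves to small helpers, sketched here in Python under a few assumptions: the padding from step 7 simply appends zero bytes up to the 1.44 MB floppy size (1474560 bytes) before gzipping, and the per-MAC file name from step 10 follows the "01-" plus dash separated lowercase MAC form shown in the example. The function names and file paths are made up for illustration.

```python
import gzip

FLOPPY_SIZE = 80 * 2 * 18 * 512   # 1474560 bytes, a 1.44 MB floppy

def pad_floppy_image(data, size=FLOPPY_SIZE):
    """Pad a raw boot image with zero bytes up to the floppy size."""
    if len(data) > size:
        raise ValueError("image is larger than a 1.44 MB floppy")
    return data + b"\x00" * (size - len(data))

def write_gzipped_image(data, out_path):
    """Pad the image and write it gzip compressed, ready for memdisk."""
    with gzip.open(out_path, "wb") as out:
        out.write(pad_floppy_image(data))

def pxelinux_mac_config_name(mac):
    """Config file name for one interface, from a aa:bb:cc:dd:ee:ff MAC."""
    return "01-" + mac.lower().replace(":", "-")

print(pxelinux_mac_config_name("88:99:AA:BB:CC:DD"))
```

Something like write_gzipped_image(open("memtestp.bin", "rb").read(), "memtestp.gz") would then produce the kind of image referenced in the menu above.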

RE: Installing XBox controller on Vista 64

Well, I already gave you some tips, but each time I do it, it's a complete pain finding the right driver and getting it to work. I'll tell you: there are WAY TOO MANY versions of the xbcd driver floating around on the internet.

The one you want to use then is the one listed from this thread (0.2.6 at the time of this article - I also put a copy of the installer here if you need). Don't bother installing anything else.
Then the nice thing is that this latest version is already self signed, and the installer will also prompt you to install the driver's root CA, so if you're in test mode, you're golden.

2 things to know though:
- To access the setup utility, you need to have disabled UAC (use msconfig for that). You can re-enable it afterwards
- If you already installed an xbcd driver, or if it simply doesn't appear to work, just select "update driver" on the device, and point it to your Windows\Inf directory. The update should get you sorted.

2009-02-06

Booting fakeraid RAID5 Linux, the half assed way

OK, by now you have a whole Linux system sitting idly on one of your RAID5 device mapper partitions. First thing you want to do then is edit the etc/fstab there to have the root filesystem point to the right device (in our case /dev/mapper/nvidia_aeejfdbep2). Not that it's actually necessary considering how we are going to boot, but it can't hurt.

If you've done your groundwork, you found out by now that using GRUB or LILO as is won't be of much help, as neither of them is able to handle a RAID5 device mapper array. We have no choice here but to run dmraid -ay before we try to access the disks, and of course, that means we need to build an initrd image.
As a side note, RAID1 or RAID0 shouldn't be an issue with GRUB, so you can probably follow this tutorial and complete the bootloader installation on your main RAID5 array, and have it work.

Now, our solution here will still require an external partition with a /boot directory that the bootloader can refer to at boot time. But for the sake of this exercise, and also for our final solution, which will no longer require an external non RAID5 partition (more about that in the next post), we're gonna set up our RAID5 Linux system as close to standalone as possible. What we ultimately want is the ability to run the bootloader from our RAID5 system, so that the day we have a bootloader that properly handles RAID5, we can just install it on the RAID5 MBR rather than the external disk, and ditch the latter in a heartbeat.
Unfortunately, this means that we'll need to use GRUB rather than LILO, so we'll break things in 2 parts: First we'll configure initrd and set up LILO from the non RAID Linux to boot our RAID, and once we're there we'll set up GRUB to boot our RAID from the RAIDed Linux (but still using the non RAIDed boot). If you're confused, just hang on.

Part 1: Setting up initrd to boot the RAID5 Linux

Well, time and time again, I find more compelling reasons to use Slackware, the last one being /boot/README.initrd, which is installed by default, and in which Patrick Volkerding tells you anything you want to know about how to create initrd. Not that there is much you need to know in the end as:
root@stella# cd /boot
root@stella# mkinitrd -c
is all you need for now. mkinitrd will have created a brand new initrd-tree and initrd.gz for you - isn't that nice?

Now, obviously, we need to add our dmraid executable to the initrd-tree and recreate initrd.gz. But if you're thinking "fine, we'll just pick up the exec", think again!
The dmraid executable we built was the dynamically linked version, so if you strace the files used, you'll see that we're gonna have to copy a whole bunch of libraries as well:
root@stella# strace -e trace=file dmraid -ay 2>&1 | more
execve("/sbin/dmraid", ["dmraid", "-ay"], [/* 35 vars */]) = 0
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
open("/lib/libdevmapper.so.1.02", O_RDONLY) = 3
open("/lib/libc.so.6", O_RDONLY) = 3
open("/proc/mounts", O_RDONLY) = 3
(...)
RAID set "nvidia_aeejfdbe" already active
RAID set "nvidia_aeejfdbep1" already active
RAID set "nvidia_aeejfdbep2" already active
RAID set "nvidia_aeejfdbep3" already active
NB: The removed output above has to do with /proc, /dev or /sys, which won't be an issue.
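
If you did want to stick with the dynamically linked binary, you would need the list of libraries to copy into the initrd-tree. A minimal sketch of that, assuming ldd is available and using a helper name (lib_paths) that I made up for the occasion:

```sh
#!/bin/sh
# lib_paths: read ldd-style output on stdin and print each absolute
# library path once (anything that is not a path, like "=>" or the
# load addresses in parentheses, is ignored)
lib_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i }' | sort -u
}

# Possible usage (not run here, paths as used in this post):
# ldd /sbin/dmraid | lib_paths | while read -r lib; do
#     mkdir -p "/boot/initrd-tree$(dirname "$lib")"
#     cp -L "$lib" "/boot/initrd-tree$lib"
# done
```

That's a lot of babysitting for one executable though, which is why we won't do it that way.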

The smart way (or lazy way, which is even better), is to compile dmraid as a static binary, so that we don't have to care about those pesky libraries. Therefore:
root@stella# cd /usr/src/dmraid/1.0.0.rc15/
root@stella# make clean
root@stella# ./configure --enable-static_link
root@stella# make
root@stella# ls -alF tools/dmraid /sbin/dmraid
-rwxr-xr-x 1 root root 204658 2009-02-05 23:01 /sbin/dmraid*
-rwxr-xr-x 1 root root 816664 2009-02-06 16:47 tools/dmraid*
root@stella# cp tools/dmraid /boot/initrd-tree/sbin/
That's 600 KB more to our initrd right there, but at least we know that we have everything we need.
Now, all that's left is editing the init script in /boot/initrd-tree to call our command.
Depending on your distro, the name & content of the init script could be very different, so you might have to be creative. In the case of slackware, the init script is called "init" (which makes sense, because then you don't have to specify it as a kernel parameter), and in the
if [ "$RESCUE" = "" ]; then
section, which already contains some disk detection routines, we're gonna add:
# Initialize DMRAID:
if [ -x /sbin/dmraid ]; then
  /sbin/dmraid -ay
fi
Now, we shall rebuild our initrd:
mkinitrd -r /dev/mapper/nvidia_aeejfdbep2

The stage is now set to see if we can boot our RAID5 system using LILO, by adding the following section in /etc/lilo.conf:
image = /boot/vmlinuz
initrd = /boot/initrd.gz
root = /dev/mapper/nvidia_aeejfdbep2
label = RAID5_Linux
read-only
Now, reinstall LILO:
root@stella# lilo
Warning: '/proc/partitions' does not match '/dev' directory structure.
Name change: '/dev/dm-0' -> '/dev/disk/by-name/nvidia_aeejfdbe'
Warning: Name change: '/dev/dm-1' -> '/dev/disk/by-label/Vista64'
Warning: Name change: '/dev/dm-2' -> '/dev/disk/by-name/nvidia_aeejfdbep2'
Warning: Name change: '/dev/dm-3' -> '/dev/disk/by-label/Media'
Added RAID5_Linux *
Added Slack_New
Added Slack_Old
Added Rescue_1
4 warnings were issued.

Don't worry too much about the warnings. Just reboot and, yay, it works!... Err, well... kinda, because while you should have seen the dmraid disks being mapped, and you do end up with everything running as it should from the RAID5 partition, you might end up with a boot where the rc.d/init scripts from Slackware are not being displayed on the console at boot time as they should.
If you look in /var/log/messages, you will see that all the scripts do indeed run, but you're being left with a very silent screen right before you end up with the prompt.

The problem is actually due to udev screwing up the /dev directory after the root filesystem is mounted. The solution to that? Keep the /dev used by the kernel even after root is mounted by compiling your kernel with:
Device Drivers ---> Generic Driver Options ---> "Create a kernel maintained /dev tmpfs (EXPERIMENTAL)" and "Automount devtmpfs at /dev" both selected.
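
Once you have rebooted on the new kernel, you can check that /dev really is the kernel maintained one (the two menu entries above correspond to CONFIG_DEVTMPFS and CONFIG_DEVTMPFS_MOUNT). A minimal sketch, where the dev_is_devtmpfs name is my own and the mounts file can be overridden for testing:

```sh
#!/bin/sh
# dev_is_devtmpfs [MOUNTS_FILE]: succeed if /dev is mounted as devtmpfs
# Defaults to /proc/mounts, where each line reads:
#   <device> <mount point> <fs type> <options> ...
dev_is_devtmpfs() {
    awk '$2 == "/dev" && $3 == "devtmpfs" { found = 1 } END { exit !found }' \
        "${1:-/proc/mounts}"
}

if dev_is_devtmpfs; then
    echo "/dev is devtmpfs"
else
    echo "/dev is NOT devtmpfs"
fi
```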

Part 2: Setting up GRUB on the RAID5 Linux partition

There are 2 (well 3) reasons why we want to replace LILO with GRUB here:
1. GRUB is more likely to be patched for full RAID5 support compared to LILO, so when that happens, we want to be ready
2. If you're using RAID0/RAID1 instead of RAID5, GRUB should actually be able to install the bootloader on your RAID array
3. Having GRUB handle RAID takes some trickery which you probably want to read about.

Once more, we'll be using grub 0.97, however, if you use the vanilla version, no matter what you do or where your /boot partition might be located (even on a standard non-RAID disk), you might end up with the infamous:
grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... no
Checking if "/grub/stage1" exists... no

Error 2: Bad file or directory type
AFAIK, this could be due to GRUB 0.97 being unable to access ext3 256 bytes inodes, OR it could happen if you have more than 2 GB RAM, or it could have to do with the 2.6 kernel new geometry. Well, all I know is that one of the grub 0.97 patches from Debian fixes the problem. Thus:
4) Reinstall grub on RAID:
root@stella# wget ftp://alpha.gnu.org/gnu/grub/grub-0.97.tar.gz
root@stella# tar -xvzf grub-0.97.tar.gz
root@stella# cd grub-0.97
root@stella# wget http://ftp.de.debian.org/debian/pool/main/g/grub/grub_0.97-47lenny2.diff.gz
root@stella# gunzip grub_0.97-47lenny2.diff.gz
root@stella# patch -p1 < grub_0.97-47lenny2.diff
root@stella# cat debian/patches/00list
OK, that last line gives us the order in which we should do the installation of the patches, so from there on you just need to run a bunch of:
patch -p1 < whatever.patch
In the order provided from the 00list file (i.e. starting with cvs-sync.patch and ending with use_grub-probe_in_grub-install.diff). Once you're there, just compile and install grub so that we can move to the final phase. Now, we'll still tell GRUB to use our /boot directory on /dev/sdc1, because we don't really have a choice here (if you don't believe me, you can try installing on RAID5 and see your 'setup (hd#)' command fail miserably), but we will also tell it how to "see" our RAID5 array, and to be able to do that, we will need to know our disk geometry, which we can get from fdisk. What we want are the C(ylinders), H(eads) and S(ectors) values:
root@stella# fdisk /dev/mapper/nvidia_aeejfdbe
Command (m for help): p

Disk /dev/mapper/nvidia_aeejfdbe: 2000.4 GB, 2000409722880 bytes
255 heads, 63 sectors/track, 243202 cylinders
These days, most disks have 255 heads and 63 sectors per track anyway (which are the greatest values you can set), so what you really need is the number of cylinders. We will use this to provide the geometry of our RAID5 "disk" to GRUB, in C H S order, because it is unable to figure it out by itself. And also, since we are doing a GRUB installation from scratch, we have to copy the stage1 & stage2 files to the /boot/grub directory (which is information that the clueless people using ready made packages are apparently unable to provide - damn you Ubuntu!):
mount /dev/sdc1 /mnt/hd
mkdir /mnt/hd/boot/grub
cp /usr/local/lib/grub/i386-pc/* /mnt/hd/boot/grub/
grub --device-map=/dev/null
grub> device (hd0) /dev/sdc

grub> device (hd1) /dev/mapper/nvidia_aeejfdbe

grub> geometry (hd1) 243202 255 63
drive 0x81: C/H/S = 243202/255/63, The number of sectors = -387927166, /dev/mapper/nvidia_aeejfdbe
Partition num: 0, Filesystem type unknown, partition type 0x7
Partition num: 1, Filesystem type is ext2fs, partition type 0x83
Partition num: 2, Filesystem type unknown, partition type 0x7

grub> find /boot/grub/stage1
(hd0,0)

grub> root (hd0,0)
Filesystem type is ext2fs, partition type 0x83

grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/e2fs_stage1_5" exists... yes
Running "embed /boot/grub/e2fs_stage1_5 (hd0)"... 16 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd0) (hd0)1+16 p (hd0,0)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
Done.
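
As a side note, the C/H/S numbers fed to GRUB above can be cross-checked with a bit of shell arithmetic. The sketch below (function names are mine, and it assumes your shell does 64 bit arithmetic, as bash, dash and most busybox builds do) recomputes the cylinder count from the disk size reported by fdisk. It also shows that the negative "number of sectors" GRUB prints is consistent with the real total simply being displayed as a signed 32 bit integer:

```sh
#!/bin/sh
# cylinders BYTES [HEADS] [SECTORS]: whole cylinders for a disk size in bytes
cylinders() {
    echo $(( $1 / (${2:-255} * ${3:-63} * 512) ))
}

# sectors32 C H S: total sector count, wrapped to a signed 32 bit value
sectors32() {
    total=$(( ($1 * $2 * $3) % 4294967296 ))
    if [ "$total" -ge 2147483648 ]; then
        total=$(( total - 4294967296 ))
    fi
    echo "$total"
}

cylinders 2000409722880      # prints 243202
sectors32 243202 255 63      # prints -387927166
```

So nothing to worry about in that GRUB output: 243202 x 255 x 63 = 3907040130 sectors, which just doesn't fit in 31 bits.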
Good, now we're ready to create our boot menu:
vi /mnt/hd/boot/grub/menu.lst

default 0
timeout 3

title Vista (64 bit)
rootnoverify (hd1,0)
chainloader +1

title Slackware 12.2 (RAID5)
root (hd1,1)
kernel (hd0,0)/boot/vmlinuz root=/dev/mapper/nvidia_aeejfdbep2
initrd (hd0,0)/boot/initrd.gz
This allows us to boot both Vista and Slackware on the RAID5 array using the /boot partition on the non RAID disk. Note that at this stage, it is probably a good idea to duplicate the /boot directory from the non RAID to the RAID partition, and recreate a small boot partition from scratch on the non RAID.

Of course, as mentioned before, because we still need a non RAID HDD, this is a half assed solution. In the next post, we'll see a better assed solution, where we do away with that extra HDD, and where we'll explore some new interesting stuff...

2009-02-05

And here we go again - RAID5 fakeraid + Linux

It's that time of the decade where I get a new machine, this time an nVidia MCP79 (a.k.a. GeForce 9400) based motherboard from Gigabyte. Oh, and yes, the MCP79 is the same all-inclusive nForce chipset nVidia uses in the new MacBook Pros, as well as their Atom "can of whoop-ass" for set top HDTV boxes.

Like most people, I'm pleasantly surprised by the features of the chipset. My goal was to build an HTPC, and with one good durable mobo (with both HDMI and optical SPDIF), the addition of a 45 nm quad core and 4 GB of dirt cheap DDR2 RAM, I should have a solution that will last for some time. Oh, and of course, since we're hopefully going to store a lot of media data there, we can't forget the 3x 1 TB Samsung SpinPoint HDDs in a RAID5 array, which is what brings us here today.

Yep, as you guessed from the title, it's ye olde getting Linux to work with fakeraid. And the first thing I gotta say is I'm sick and tired of all the posts you see on the net where people asking how to make Linux work with their fakeraid are met with a "why don't you use Linux's md instead".
Well, I'm sorry, but there are still people out there who want to dual boot with Vista and trust nVidia RAID5 driver (MediaShield) enough not to want to make a complete mess of their Windows installation. Hell, I'd rather spend time figuring out how to make my RAID5 work in Linux than have to spend time doing the same thing in Windows. Plus, I like the convenience of having my BIOS handle things like RAID.

Here we are then, with Vista installed and painfully updated at last (see below) and a blank partition on the nVidia RAID5 MediaShield array that's just dying to get Linux installed.

OK, it's not that I mind a challenge here, but I've got a blank HDD lying around, so this time, what we're going to do is install Linux on that standalone drive, conf it so that it sees the RAID5 partition, and then move over our system rather than have to create our own custom RAID5 installation CD.

As usual, I'm gonna use the latest Slackware (12.2) so ymmv.

1. Slackware install with ALL the options you really want in the end (we'll be using that partition as the source) and LILO booting the single HDD = piece of cake. Just make sure that you don't install LILO on anything else BUT the HDD that's not part of the array
2. First bad surprise - if you keep your 2.6 kernel as is, no matter what you try, dmraid will return:
ERROR: device-mapper target type "raid45" not in kernel
Don't waste your time changing the raid45 to raid456 identifiers in dmraid either - you'll still get the same error. The kernel actually needs to be patched for dmraid to be happy.

[UPDATED] One year later, and still a mess. As of 2.6.35, the dm-raid45 module still hadn't been integrated into the kernel, plus the evil people at Red Hat who were supposed to maintain that module have completely dropped the ball, the Gentoo people also have dropped the ball (after 2.6.31), and the last set of dm-raid45 patches I could find only goes up to 2.6.33 (and doesn't work against 2.6.35). So I guess we're stuck with 2.6.33 then. This last set of dm-raid45 patches can be found from Mandriva. Can't help but find it strange that, while the dm-raid45 module is commonly used in Ubuntu (except that dmraid itself is useless with large disks as it doesn't support GPT), nobody seems to care about trying to merge it with the official kernel tree. Oh well...
root@stella# cd /usr/src/linux-2.6.33/
root@stella# wget http://tmb.mine.nu/Mandriva/Cooker/dm-raid/dm-raid45_2.6.33-rc1-20091126.patch
root@stella# patch -p1 < dm-raid45_2.6.33-rc1-20091126.patch
root@stella# wget http://tmb.mine.nu/Mandriva/Cooker/dm-raid/dm-raid45-buildfix-for-2.6.33.patch
root@stella# patch -p1 < dm-raid45-buildfix-for-2.6.33.patch
OK, so you just go through your usual kernel recompile. Now, a couple of things I wanna point out:
- You do want to reduce the size of your kernel if you're planning to remove the standalone HD and still boot your RAID5 array Linux partition with a decent solution
- Of course you want to add Device mapper support (Device drivers -> Multiple devices driver support (RAID and LVM)) and the new experimental "RAID 4/5 target" that was added by the kernel patch
- Despite being a Gb adapter, the nForce Ethernet driver (forcedeth) is to be found in: Ethernet (10 or 100Mbit) -> EISA, VLB, PCI and on board controllers -> nForce Ethernet support
- If you're using ext4 for root, don't forget to include ext2 support as well, as your root partition cannot be mounted without it.
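
Before you spend an hour compiling, it doesn't hurt to check that the options from the list above actually made it into your .config. A minimal sketch follows, with a helper name (missing_options) of my own. BLK_DEV_DM, FORCEDETH and EXT2_FS are the standard kernel symbols, but DM_RAID45 is my assumption for the symbol added by the patch, so check the Kconfig it installs if it doesn't match:

```sh
#!/bin/sh
# missing_options CONFIG_FILE OPTION...: print each option that is neither
# built in (=y) nor built as a module (=m) in the given kernel config
missing_options() {
    cfg=$1
    shift
    for opt in "$@"; do
        grep -Eq "^CONFIG_${opt}=[ym]\$" "$cfg" || echo "$opt"
    done
}

# Possible usage (not run here; DM_RAID45 is an assumed symbol name):
# missing_options /usr/src/linux-2.6.33/.config BLK_DEV_DM DM_RAID45 FORCEDETH EXT2_FS
```

No output means everything you asked for is enabled.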

Once you have a kernel that boots, you can move on to the dmraid install. This time, no need for devmapper. Just download the current dmraid source from http://people.redhat.com/~heinzm/sw/dmraid/src/:
root@stella# cd /usr/src
root@stella# wget http://people.redhat.com/~heinzm/sw/dmraid/src/dmraid-current.tar.bz2
root@stella# tar -xjvf dmraid-current.tar.bz2
root@stella# cd dmraid/1.0.0.rc15
root@stella# ./configure; make; make install
root@stella# dmraid -ay
RAID set "nvidia_aeejfdbe" was activated
RAID set "nvidia_aeejfdbep1" was activated
RAID set "nvidia_aeejfdbep2" was activated
RAID set "nvidia_aeejfdbep3" was activated
Yay! it works!

If you only see the RAW disk (nvidia_aeejfdbe) but not the partitions, that's probably because you're using a GPT disk. In that case, you need to also run kpartx as follows:
root@stella# kpartx -a -v /dev/mapper/nvidia_aeejfdbe
add map nvidia_aeejfdbe1 (252:1): 0 262144 linear /dev/mapper/nvidia_aeejfdbe 34
add map nvidia_aeejfdbe2 (252:2): 0 209715200 linear /dev/mapper/nvidia_aeejfdbe 264192
add map nvidia_aeejfdbe3 (252:3): 0 186478592 linear /dev/mapper/nvidia_aeejfdbe 209979392
add map nvidia_aeejfdbe4 (252:4): 0 11324626944 linear /dev/mapper/nvidia_aeejfdbe 396457984
kpartx itself can be obtained from the multipath-tools.
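
If you want to automate the "run kpartx only when needed" part, for instance in a boot script, something along these lines should do. This is only a sketch: has_partition_maps is a name I made up, and it simply looks for mapper names made of the RAID set name followed by an optional "p" and a partition number, which covers both the dmraid and the kpartx naming seen above:

```sh
#!/bin/sh
# has_partition_maps SET_NAME: read device mapper names on stdin and
# succeed if at least one of them is a partition of the given RAID set
has_partition_maps() {
    grep -Eq "^$1p?[0-9]+\$"
}

# Possible usage (not run here):
# ls /dev/mapper | has_partition_maps nvidia_aeejfdbe \
#     || kpartx -a -v /dev/mapper/nvidia_aeejfdbe
```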

OK, so now we can get going. Be sure you format the RELEVANT partition. In my case, I'll be using nvidia_aeejfdbep2, the second partition on the RAID5 array for the GNU/Linux system.
root@stella# mkfs.ext4 /dev/mapper/nvidia_aeejfdbep2
root@stella# mount /dev/mapper/nvidia_aeejfdbep2 /mnt/hd
root@stella# mkdir /mnt/hd/proc
root@stella# mkdir /mnt/hd/sys
root@stella# cp -ax / /mnt/hd
This could take a while...
Note that the -x option ensures that we stay on a single file system

OK, in the next post we'll see the grub way to (kinda) boot this whole mess

2009-02-04

These are not the ads you're looking for

Here's a link to the hosts file I use on my Windows systems:
http://www.mvps.org/winhelp2002/hosts.htm

2009-02-02

Windows update error 0x8024402c

Well, like everybody else, this error has been driving me nuts. Plus it manifested itself in such a weird way I couldn't believe.

I already had that issue last week on a brand new Vista 64 install, and somehow, it appeared that resetting the LAN parameters to default did the trick, and after a short hour of struggle, I was able to apply the updates.

I was not so lucky with that old laptop of mine running XP, which was badly in need of critical updates, but that couldn't seem to download them. Well, actually, at first, it started to download the updates alright (got through about 8-10 of them) but then it failed! Tried every hint from the link above. None worked...

This is a job for... Is it a bird? Is it a plane? No, it's Wireshark!
OK, so Wireshark reveals that, despite everything Microsoft wants to make you believe, the issue is really with au.download.windowsupdate.com not being resolved. Of course, one has to wonder why an Irish based Windows has to pick up its updates from Australia, but hey, that's Microsoft for you.
Indeed, a few tests from the commandline show that au.download.windowsupdate.com is not resolvable. OK, let's see what the DSL router (which can also do basic diagnostic checks) has to say about it... WTF? The DSL router sees no issue there:
Resolving au.download.windowsupdate.com ... 65.54.84.196
Reply from 65.54.84.196
Reply from 65.54.84.196
Reply from 65.54.84.196
Ping Host Successful
And this is where it gets really weird, as it appears that my Linux machines also cannot resolve the host... while the router can?!?! No way!

OK, by now I need to apply these updates badly, and therefore, yes, you got it, we're gonna edit the system32/drivers/etc/hosts file of course and get done with it.
So here goes:
65.54.84.196 au.download.windowsupdate.com
and try downloading the updates again (with Wireshark still running in the background).
What the hell?!? No difference - We still get DNS requests for au.download.windowsupdate.com that don't seem to resolve. But how on earth is that possible with a triple checked hosts file?

The answer of course is that Microsoft deliberately screwed up their DNS resolution so that their update servers can NOT be overridden by the hosts file. Yeah, I can understand the logic of it, but how about relying on https and certificates instead, to check that the server is legit, rather than breaking down DNS. Talk about another half assed approach to security...

Well, that will teach me to try to work around a problem rather than solve it, because with no DNS override, I'm back to square one. And I'm not really willing to install my own DNS server just to work around this problem either.

Now, further use of nslookup with various Irish DNS servers seems to indicate that they ALL have the same issue??? Is someone out to get me or what, because none of this makes any sense!

At this stage, instead of banging your head against the wall, you gotta try to apply logic. If all of the Irish DNSs were failing to resolve the Windows Update servers, you'd probably get people armed with pitchforks on the streets within minutes, so I must be doing something wrong.
A quick change of NS lookup tool actually confirms it (DON'T USE NSLOOKUP ANY MORE STUPID!). dig quickly shows that it's only the DNS servers I was using (Eircom's 159.134.237.6 & 159.134.248.17 - never trust the incumbent!) that can't resolve the bloody server, while my own ISP's DNS can.

A quick change of DNS servers on the DSL router, a full reboot (ipconfig /release + /renew would probably have done the trick, but with M$, you can never be too sure), and Windows Update is happy again.

Good thing I didn't have anything better to do this morning. And just to sum up the lessons for today:
- 0x8024402c happens if Windows cannot resolve the IP of the download servers
- Windows update CAN switch download servers on the fly while downloading (!)
- Use Wireshark to find out which download servers Windows Update is trying to use
- Windows name resolution is screwed so that it will always try to use DNS and not the hosts file for the download servers
- Don't trust what your DSL router tells you about Name Resolution - it might not be using the configured DNS servers!
- DON'T USE NSLOOKUP ANY MORE! Use dig instead.
- Screw Eircom!
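
For next time, here is the kind of loop that would have saved me the morning: ask each DNS server directly with dig, and see which ones can resolve the host. It's a minimal sketch (the resolve and check_resolvers names are mine), and it treats any answer at all as a success:

```sh
#!/bin/sh
# resolve NAMESERVER HOSTNAME: print the answers returned by that server
# (dig reports its own errors on lines starting with ';', so drop those)
resolve() {
    dig +short "@$1" "$2" A | grep -v '^;'
}

# check_resolvers HOSTNAME NAMESERVER...: one OK/FAILED line per server
check_resolvers() {
    host=$1
    shift
    for ns in "$@"; do
        if [ -n "$(resolve "$ns" "$host")" ]; then
            echo "$ns: OK"
        else
            echo "$ns: FAILED"
        fi
    done
}

# Possible usage (needs network access, so not run here):
# check_resolvers au.download.windowsupdate.com 159.134.237.6 159.134.248.17
```

Add your own ISP's DNS servers to the list and the odd one out should be obvious.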

2009-01-15

This script will save your life!

Or at least your data.

I have been unofficially sysadm'ing a bunch of HP DL380 running Linux over the past few years, and over the Christmas period, one of those, which had a complex mix of ReiserFS, SWAP and XFS partitions on a RAID5 Smart Array device (/dev/cciss/c0d0), found nothing better to do than to start overwriting the MBR!

What this meant of course was, bye bye partition table, and, as you might guess, I didn't find the need to backup the partition data, thinking: "Heck it's RAID5 - what's the worst that can happen?"

ADVICE #1: If you're running a server in PROD, ALWAYS save a copy of the output of fdisk someplace safe!
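
On top of the fdisk output, a raw copy of the first sector (which holds the primary partition table) is cheap insurance. A minimal sketch, where backup_mbr is just a name I picked and the output file is whatever you want, as long as it ends up on ANOTHER machine:

```sh
#!/bin/sh
# backup_mbr DEVICE OUTFILE: save the first 512 byte sector of a disk,
# i.e. the boot code plus the primary partition table
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

# Possible usage (not run here):
# backup_mbr /dev/cciss/c0d0 /root/c0d0-mbr.bin
# sfdisk -d /dev/cciss/c0d0 > /root/c0d0-partitions.txt   # text dump as well
```

Restoring is the same dd command with if= and of= swapped, but triple check the device name before you press enter.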

Thus, there I was, with a non bootable server and a blank partition table, but with data I very much wanted to get back to. And the only tool I had at my disposal (through a remote console connection, because of course, the server had to be remote) was just a Slackware boot CD (because it detects cciss drives) running busybox.

ADVICE #2: If you have physical access to your server, and it's not using a nonstandard disk device, you can probably find a Linux rescue CD with gpart (but better use the latest Debian version, which is more up to date) that supports more partition types, and will do a much better job. The script below is really if you are in a hurry or you have limited resources.

How difficult can it be to detect partitions with only a shell script then? Well not that difficult, as the script below will prove. But first let me be clear about what the script is meant to achieve:
  1. This script is meant to detect the POTENTIAL beginning of a partition only. It will tell you the first cylinder, but it will not tell you the size, so unless it's the last one on disk, you'll have to figure out where the next partition begins as well
  2. This script will detect primary partitions only. If you have extended partitions, you're on your own
  3. The script only detects EXT2/EXT3, ReiserFS, Swap (version 2) and XFS. It does not detect FAT or NTFS (because I had no use for it). However, if you know the Magic string and location of other partition types, you should be able to easily modify the script to add them
  4. Not counting the comments, I kept the script short and basic, because you're most likely to have to type it in by hand, so shorter is better
  5. Be mindful that there are likely to be false positives
  6. And of course, while I successfully tested this script and all the partition types on various systems, I am not responsible for any damage occurring from using it.
OK, so below is the part you are interested in:
#!/bin/sh
#
# gpart.sh v1.1 - Linux partitions detection script
# Copyright (C) 2009 >NIL:
# Based in part on gpart (C) 1999-2001 Michail Brzitwa et al.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#

#
## Set your disk parameters below
#
device="/dev/sda"
cyl_start=1
# To find your max cyl_end, use fdisk or cfdisk
cyl_end=35697

#
## FS Magic length, string and offset
## Uncomment only one of the sections below
#

## XFS
#bs=4
#magic="XFSB"
#magic_offset=0x0

## ReiserFS
#bs=4
#magic="ReIs"
#magic_offset=0x10034

## Swap space (v2 only!)
#bs=4
#magic="ACE2"
#magic_offset=0x0FFC

## Ext2 / Ext3
bs=2
## OK, here I have to curse the EXT FileSystem devs for
## not choosing a *PROPER* ASCII Magic like everyone else,
## but using 0x53 0xEF instead as this contains i-umlaut
## Thank God for the -e option of echo, which translates
## a "\0###" sequence into the relevant octal character
magic=`echo -ne 'S\0357'`
magic_offset=0x0438

# on almost any recent disk, a cylinder is
# 255(heads)*63(sectors/track)*512(bytes/sector) = 8225280
# If you're not sure, check what fdisk tells you
cyl_bytes=8225280
mbr_bytes=32256

#
## You shouldn't have to modify anything below this
#

# dd skips in multiples of bs, so we need to compute
# the cylinder size and magic_offset in bs blocks
cyl_blocks=$(($cyl_bytes / $bs))
mbr_blocks=$(($mbr_bytes / $bs))
magic_blocks=$(($magic_offset / $bs))

for i in $(seq $cyl_start $cyl_end); do
  # -eq is the portable numeric test (== is not guaranteed in [ ])
  if [ "$i" -eq 1 ]; then
    # For the first cylinder, we need to skip the MBR as well
    skip=$(($magic_blocks + $mbr_blocks))
  else
    skip=$(($(($i-1)) * $cyl_blocks + $magic_blocks))
  fi
  # Look for the magic block
  header=`dd if=$device bs=$bs count=1 skip=$skip 2>/dev/null`
  if [ "$header" = "$magic" ]; then
    echo "MATCH: cylinder $i ($header)"
  fi
done
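
If you'd rather convince yourself that the offset arithmetic is right before pointing the script at a real disk, you can try it on a throwaway file. The sketch below is not part of gpart.sh: it uses the same constants, a magic_skip helper that I made up, and a temporary file in which only the ext2 magic is planted at the place where the first partition's superblock would have it:

```sh
#!/bin/sh
# Same constants as gpart.sh above
cyl_bytes=8225280
mbr_bytes=32256

# magic_skip CYLINDER MAGIC_OFFSET BS: dd "skip" value, in blocks of BS bytes
magic_skip() {
    if [ "$1" -eq 1 ]; then
        echo $(( ($2 + mbr_bytes) / $3 ))
    else
        echo $(( (($1 - 1) * cyl_bytes + $2) / $3 ))
    fi
}

# Throwaway test image: an empty file with just the 2 magic bytes in it
img=$(mktemp)
skip=$(magic_skip 1 0x0438 2)
printf 'S\357' | dd of="$img" bs=2 seek="$skip" conv=notrunc 2>/dev/null

# Read the 2 bytes back the same way gpart.sh does, and show them in hex
found=$(dd if="$img" bs=2 skip="$skip" count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "cylinder 1: skip=$skip magic=$found"    # prints skip=16668 magic=53ef
rm -f "$img"
```

The hex dump is only there because the ext2 magic is not printable. The script itself compares the raw bytes.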

2008-12-17

Incoming packet was garbled on decryption (putty)

For some reason, I decided to install openssh on openWRT (Kamikaze), and I have been getting "Incoming packet was garbled on decryption" in putty. This seems to be an ongoing issue linked to the encryption ciphers used by openSSH on low resources systems.

The solution: In the Connection -> SSH menu of putty, make sure 3DES is at the top, and the problem will go away

2008-08-12

mmc driver for openwrt, part deux

Went over the MMC configuration for my openWRT router yesterday, as I wanted to relinquish the use of the SES button by the driver (GPIO4) to use it instead as a toggle button for the Wifi, and use GPIO5 for DO.
As I did just that, I also "fixed" the MMC driver to stop the Power LED from blinking every time the SD card was accessed.
The way I did that was to use the following config.h:
#define SD_CS_POWER 0x82
And then modify spi.c to have:
static inline void mmc_spi_cs_high(void) {
port_state |= SD_CS_POWER;
(...)
Also, as I am now using GPIOs #2,3,5,7, I had to add a line
echo "0xac" > /proc/diag/gpiomask
in one of the init scripts.
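
In case you use different GPIOs and wonder where the 0xac comes from: it is simply one bit set per GPIO number. A minimal sketch of the computation (gpio_mask is my own helper name, nothing openWRT provides):

```sh
#!/bin/sh
# gpio_mask GPIO...: print the hex bitmask with one bit set per GPIO number
gpio_mask() {
    mask=0
    for pin in "$@"; do
        mask=$(( mask | (1 << pin) ))
    done
    printf '0x%x\n' "$mask"
}

gpio_mask 2 3 5 7    # prints 0xac
```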

For /etc/hotplug.d/button/01-radio-toggle, I am currently using the following, but I might change that to just turn the radio on/off:
if [ "$BUTTON" = "ses" ] ; then
  if [ "$ACTION" = "pressed" ] ; then
    WIFI_RADIOSTATUS=$(wlc radio)
    # touch /tmp/$WIFI_RADIOSTATUS
    case "$WIFI_RADIOSTATUS" in
    0|"")
      uci set wireless.wl0.disabled=0
      uci commit wireless
      wifi
      wlc radio 1
      echo f > /proc/diag/led/wlan
      ;;
    1)
      uci set wireless.wl0.disabled=1
      uci commit wireless
      wifi
      wlc radio 0
      echo 0 > /proc/diag/led/wlan
      ;;
    esac
  fi
fi

Despite the not-quite-thought-through use of the GPIOs in the default MMC driver, I'm beginning to think openWRT is the best invention since sliced bread!

2008-08-10

Installing unsigned drivers on Vista 64

OK, if you ended up here, it's probably because you've been trying to install an unsigned driver (eg. XBCD Xbox Gamepad, PSPLinkUSB), and found out about the requirement for all drivers to be signed in Vista 64.

Now, you shouldn't rush with the first article you find on the web that tells you how to disable signed drivers in Vista altogether. The MUCH smarter way is to run Vista 64 in test mode instead, and self sign your drivers. And to be clear, NO, this does NOT require you to recompile the drivers! You can just pick up the drivers you got from someone and sign them away. Of course, one could comment on yet another of Microsoft's stupid "we don't trust our users" decisions of having to enable the test mode to have users install their self signed drivers. A MUCH SMARTER way would have been to allow that outside of the test mode as well. After all, if a user went as far as installing their own root certificate, it probably should be trusted.

Anyway, the procedure is as follows (and it is described in much more detail here):

1. Get Vista to boot in test mode always with the command:
bcdedit.exe /store C:\Boot\BCD /set testsigning yes
(And there again, I have to curse Microsoft for NOT indicating with bcdedit /? that you can use the /store option to specify your store, and having to spend HOURS trying to figure out why I was getting the following error which is apparently expected, if you boot multiple OSes and don't let Microsoft take over your boot record:
The boot configuration data store could not be opened.
The system cannot find the file specified.
)

After you enter that command, you MUST reboot Vista.

Note: Once Test Mode is enabled, you will get the Windows Version as well as "Test Mode" displayed over the background image. If you're bothered by this, what on earth are you doing with your computer? Staring at the background?

2. Download the necessary DDK SelfSign files, which I am CONVENIENTLY providing to you HERE, as Microsoft is also an ass there - People shouldn't have to download 2.7 GB to gain access to 700 KB worth of files!
Extract them to the directory where you have your driver

3. Let's say you want to install the PSPLinkusb driver. First you want to generate your own root certificate for that driver with:
makecert -$ individual -r -pe -ss "Self Signed Drivers" -n CN="Self Signed Drivers" selfsign.cer

4. Then you install the certificate you just created into the local machine's trusted root certificate store:
certmgr /add selfsign.cer /s /r localMachine root
(NB: if you have UAC on, you will need to run this command in a "run as administrator" command prompt)

5. Finally, you sign EACH .sys file using the certificate:
signtool sign /v /s "Self Signed Drivers" /n "Self Signed Drivers" libusb0.sys
signtool sign /v /s "Self Signed Drivers" /n "Self Signed Drivers" libusb0_x64.sys
Voila! Now you can install these drivers and get on with your life.
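
If a driver package ships more than a couple of .sys files, step 5 gets repetitive. As a minimal sketch, assuming Python is available and signtool is on your PATH, the following builds one signtool command per .sys file in a directory, using only the flags shown above. The function names signtool_commands and sign_all are made up for this illustration.

```python
import subprocess
from pathlib import Path

# Store and subject name used in steps 3 to 5 above.
CERT_NAME = "Self Signed Drivers"


def signtool_commands(driver_dir, cert_name=CERT_NAME):
    """Return one signtool argument list per .sys file found in driver_dir."""
    sys_files = sorted(
        p for p in Path(driver_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".sys"
    )
    return [
        ["signtool", "sign", "/v", "/s", cert_name, "/n", cert_name, p.name]
        for p in sys_files
    ]


def sign_all(driver_dir, cert_name=CERT_NAME):
    """Run signtool for every .sys file, stopping at the first failure.

    Assumption: signtool is reachable through PATH and the certificate
    from steps 3 and 4 has already been created and installed.
    """
    for cmd in signtool_commands(driver_dir, cert_name):
        subprocess.run(cmd, cwd=driver_dir, check=True)
```

Calling sign_all(".") from the driver directory should be equivalent to typing the signtool line once per .sys file. Printing the result of signtool_commands(".") first is a cheap way to check what would be run.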

For completeness, here is the output of a successful signing run for the libusb drivers:
E:\Program Files (x86)\OpenOCD\0.2.0\drivers\ft2232>makecert -$ individual -r -pe -ss "Self Signed Drivers" -n CN="Self Signed Drivers" selfsign.cer
Succeeded

E:\Program Files (x86)\OpenOCD\0.2.0\drivers\ft2232>certmgr /add selfsign.cer /s /r localMachine root
CertMgr Succeeded

E:\Program Files (x86)\OpenOCD\0.2.0\drivers\ft2232>signtool sign /v /s "Self Signed Drivers" /n "Self Signed Drivers" libusb0.sys
The following certificate was selected:
Issued to: Self Signed Drivers
Issued by: Self Signed Drivers
Expires: 2040.01.01 00:59:59
SHA1 hash: E0CEAD6474EFD1BF0F6D47501FF3F069C20FD7C7

Done Adding Additional Store

Attempting to sign: libusb0.sys
Successfully signed: libusb0.sys

Number of files successfully Signed: 1
Number of warnings: 0
Number of errors: 0

E:\Program Files (x86)\OpenOCD\0.2.0\drivers\ft2232>signtool sign /v /s "Self Signed Drivers" /n "Self Signed Drivers" libusb0_x64.sys
The following certificate was selected:
Issued to: Self Signed Drivers
Issued by: Self Signed Drivers
Expires: 2040.01.01 00:59:59
SHA1 hash: E0CEAD6474EFD1BF0F6D47501FF3F069C20FD7C7

Done Adding Additional Store

Attempting to sign: libusb0_x64.sys
Successfully signed: libusb0_x64.sys

Number of files successfully Signed: 1
Number of warnings: 0
Number of errors: 0

E:\Program Files (x86)\OpenOCD\0.2.0\drivers\ft2232>

2008-08-04

No more "Access Denied" for your files on Vista

This should really be in techno rant, because Vista's "security" is done completely backwards, but here we go. If you try to move/copy/delete/rename the file or directory C:\access_denied and get "Access Denied", this is what you need to do to recover access:

Open an admin command prompt and issue the following:
takeown /f C:\access_denied
icacls C:\access_denied /grant <username>:F
For what it's worth, this is how you manage to delete the files from the infamous C:\Windows\System32\DriverStore\FileRepository, to stop Windows from being a BLEEPING BLEEP with the BLEEPING drivers. I'm gonna have the final say about WHICH driver I want to get installed dammit!
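
If you end up doing this often, the two commands can be wrapped in a small helper. This is only a sketch, assuming Python is installed and the script is started from an admin command prompt. The names recover_access_commands and recover_access are hypothetical, and the default username is simply whoever is running the script.

```python
import getpass
import subprocess


def recover_access_commands(path, username=None):
    """Build the takeown and icacls commands shown above for one path."""
    # Assumption: when no username is given, use the current user.
    user = username or getpass.getuser()
    return [
        ["takeown", "/f", path],
        ["icacls", path, "/grant", f"{user}:F"],
    ]


def recover_access(path, username=None):
    """Run both commands in order; must be run as administrator."""
    for cmd in recover_access_commands(path, username):
        subprocess.run(cmd, check=True)
```

For example, recover_access(r"C:\access_denied") does exactly what the two lines above do for the current user, and nothing more.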