2010-08-09

That darn iBFT iSCSI Windows installation error

If you ended up here, it's probably because you too tried to install Windows 7 or Vista on an iSCSI bootable disk using gPXE and, even though Windows setup could see your disk alright, you got one of the following errors:

"Windows cannot be installed to this disk. Setup does not support configuration of or installation to disks connected through a USB or IEEE 1394 port" (Vista)

"Windows cannot be installed to Disk <#> Partition <#>. (Show details)" -> "Windows cannot be installed to this disk. iSCSI deployment is disabled since no NICs referenced in the iBFT can be resolved to actual NT-visible devices. Windows cannot be installed to this disk. This computer's hardware may not support booting to this disk. Even if you're probably smart enough to know what you're doing, and we could definitely let you install to this disk to sort booting later, we're going to be asses about it and prevent you from overriding the idiotic setup decisions we made (by the way, did we mention the Windows recovery partition yet?). Why? Because we're Microsoft and screw you, that's why." (Windows 7)

OK. Let's forget about Microsoft's stupid decisions for a while, and attempt to work within them, to figure out how we can address the issue.

First of all, if you see the message above, then I can guarantee that, no matter how properly you think you set up your iSCSI PXE boot, you screwed something up and your iSCSI boot sequence is wrong.
And yes, getting an iSCSI boot error on a blank disk is to be expected, but NO, not all the errors you see during gPXE iSCSI boot can be safely ignored (if you manage to see them at all, but we'll come to that). There are good errors and there are bad errors.

The first thing I'll point out, if you're like me and thought you could get your dhcp/tftp server to:
  1. Supply the iSCSI disk boot parameter (that dhcp-option=net:gpxe,17,"iscsi:192.150.23.3::3260:2:sheeva:disk1" line or similar, along with the keep-san option)
  2. Attempt to boot from it and
  3. If that fails, fall back to executing WinPE/pxelinux to launch a WinPE installation image
is that such a scheme just won't work. Unless you're fiddling with the gPXE scripting options (and even then), you can only have either
  • Boot from iSCSI, then fail and hand things back over to the BIOS, or,
  • If a boot image is specified, ignore the iSCSI options provided by the dhcp server altogether and just boot from that image.
You can't just have the dhcp/tftp server alone tell gPXE: "try to boot to iSCSI and if that fails, boot from something else while keeping the iSCSI boot options", to boot WinPE for installation onto a blank iSCSI disk for instance. No siree. If you tried that, well, that was your first mistake. That's not to say that this can't be achieved at all (we'll see how to do just that from the command line below, and, in a next post, I'll try to show you how to do it automatically as well), but it can only be achieved outside of the dhcp/tftp options.

In short, if you're using dnsmasq with something like:
dhcp-match=gpxe,175   # tags the request with net:gpxe if gPXE was supplied
dhcp-option=175,8:1:1 # turn on the keep-san option (allows installation)
dhcp-option=net:gpxe,17,"iscsi:192.150.23.3::3260:2:sheeva:disk1"
dhcp-boot=net:#gpxe,pxelinux.0 # if NOT (#) gPXE, use pxelinux.0
dhcp-boot=net:gpxe,Boot/startrom.n12 # if gPXE, use WinPE
Then, when WinPE boots, it will not have any of the options that you think gPXE should have fed it with regard to the iSCSI boot disk. In particular, the "dhcp-option=net:gpxe,17," option will be completely ignored. Yeah, that makes as much sense to me as to anybody else, but that's how gPXE works for now.

And that's also the reason why, in most of the guides you see, they'll tell you to first try to boot from an unbootable iSCSI disk with gPXE, let it fail and then use BIOS fallback to boot from an installation CD or DVD. Again, simply chaining WinPE in there from PXE does not work without additional effort that none of these guides provide.

Also, and this is the most important part if you want Windows install to accept your iSCSI disk as bootable, as long as you do not see the following lines during boot:
Booting from root path "<your iSCSI path>"

Registered as BIOS drive 0x80
Booting from BIOS drive 0x80
Boot failed
Preserving connection to SAN disk
Then it's game over, plain and simple.

Granted, those lines may be hard to spot during boot, when gPXE hands things back over to the BIOS on failure (which it should do, if you followed what I said above), as those darn BIOS makers forgot that the Pause key we have on our keyboards could be put to some good use. But if you try a few times and you don't see any mention of a BIOS drive 0x80, Windows will simply not see your iSCSI drive as bootable, simple as that.

For your reference, here's a screenshot from a VMware diskless machine that illustrates what you should see when gPXE executes:



As long as you see the lines I highlighted above, whatever error is thrown after the iSCSI boot attempt comes from the iSCSI disk itself rather than from your boot process, so you can ignore it. But if you don't see the "Registered as BIOS drive" line from gPXE, you should pay very close attention to the iSCSI error you get.
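If you can capture the gPXE console output as text (through a serial console or a VM log, say, which is an assumption on my part since everything here relies on eyeballing the screen), the check can be sketched in a few lines of Python. The helper name and the "bad" sample are made up for illustration; the "good" sample simply reuses the lines quoted earlier.

```python
# Hypothetical helper: checks captured gPXE console text for the two
# messages that show the iSCSI disk was registered and kept for setup.
REQUIRED_MARKERS = (
    "Registered as BIOS drive",
    "Preserving connection to SAN disk",
)


def missing_markers(console_text):
    """Return the required gPXE messages that do NOT appear in console_text."""
    return [marker for marker in REQUIRED_MARKERS if marker not in console_text]


# The expected sequence, as quoted in this post.
good_boot = """Booting from root path "iscsi:192.150.23.3::3260:2:sheeva:disk1"
Registered as BIOS drive 0x80
Booting from BIOS drive 0x80
Boot failed
Preserving connection to SAN disk
"""

# Artificial capture with the key lines missing (not real gPXE output).
bad_boot = "Boot failed\n"

print(missing_markers(good_boot))  # nothing missing: installation can proceed
print(missing_markers(bad_boot))   # both markers missing: game over
```

An empty list means both lines were seen, so whatever error followed can be ignored.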

So, of course, now your question is: "I'm not seeing these lines (or they're too fast for me to see). How then can I validate that my iSCSI target is good, and that it can be used for installation with gPXE?"

Well, duh, through the gPXE command line of course, which you can enter with Ctrl-B at boot time. Gotta wish proprietary PXE was as easy to troubleshoot for power users as gPXE is. But you're in a hurry and don't want to learn about the whole gPXE/DHCP/TFTP internals, so I'll cut to the chase. The sequence of commands you are after is:
dhcp net0
set keep-san 1
sanboot iscsi:<iscsi server ip>::<iscsi port>:<iscsi lun>:<iscsi target id>
# and if the above line works and you want to boot to WinPE for instance, you could
chain tftp://<server ip>/Boot/startrom.n12
  • dhcp net0 initializes DHCP and allows you to communicate with the server (for tftp, etc)
  • then the keep-san option is to ensure that Windows can see the iSCSI disk as bootable, which of course is the feature you're after
  • finally the sanboot line is the one that will tell you if something is wrong with your iSCSI access.
But first, let's see an example of what happens when everything works as expected (for an uninitialized disk):



Here, we have the "Registered as BIOS drive" and the "Preserving connection" lines, so we're good. You might also want to note that I am explicitly specifying that I'm using port 3260 (the default for iSCSI) and that my device is on LUN 2 (very much not the default).
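To make the layout of that sanboot argument easier to read, here is a rough Python sketch that splits a root path into its fields. It assumes the iscsi:<server>:<protocol>:<port>:<LUN>:<target> ordering used throughout this post, with the protocol field left empty as in my examples. The class and function names are invented for the illustration, and the LUN is kept as a string so that no particular number format is assumed.

```python
from typing import NamedTuple


class IscsiRootPath(NamedTuple):
    server: str
    protocol: str
    port: int
    lun: str
    target: str


def parse_root_path(root_path):
    """Split an 'iscsi:' root path into its fields, filling in defaults."""
    # Only the first five colons are separators: the target name
    # (e.g. "sheeva:disk1") may contain colons of its own.
    parts = root_path.split(":", 5)
    if len(parts) != 6 or parts[0] != "iscsi":
        raise ValueError("not an iSCSI root path: %r" % root_path)
    _, server, protocol, port, lun, target = parts
    return IscsiRootPath(
        server=server,
        protocol=protocol,                 # left empty in this post's examples
        port=int(port) if port else 3260,  # 3260 is the iSCSI default port
        lun=lun or "0",                    # an empty LUN field means LUN 0
        target=target,
    )


print(parse_root_path("iscsi:192.150.23.3::3260:2:sheeva:disk1"))
print(parse_root_path("iscsi:192.150.23.3::::sheeva:disk1"))
```

The second call shows what you silently get when every optional field is left empty: port 3260 and LUN 0.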

Now, let's see some common errors:



0x2c0d603b is usually an indication that your iSCSI path is wrong. In the case above, I used the non-existing disk0 instead of disk1 for the target part.



Ah, 0x1d704039 (and now, aren't you glad you found this page)...
Yes, this is an error you should not get, even with a non-bootable iSCSI disk. And yes, I agree that an I/O error is precisely what you'd expect from a non-bootable disk, but that I/O error is actually unrelated to whether the disk is bootable. It has very much to do with trying to use an iSCSI device that cannot be used as a disk. If you are using Linux tgtd/tgtadm, that is exactly what you get when you leave the sequence of four colons as is (::::), because that means LUN 0 will be used, and LUN 0 is reserved by tgtd for the virtual controller.
In short, if you're using tgtd, your actual device LUNs start at 1, and if you're keeping the options part as '::::', then the default of LUN 0 will be used, so you're not actually accessing your disk!
This is why I'm using iscsi:192.150.23.3::3260:2:sheeva:disk1 in the line that works, because I'm trying to access the second disk I created on that specific target. Even if that was the very first disk I created with tgtadm, I would have to add 1 for the LUN in the line above, because the default of 0 is not a disk.
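The "add 1" rule can be written down as a small Python sketch. It assumes the disks on the tgtd target were given consecutive LUNs starting at 1, which is how my setup looks but not something tgtd forces on you; the function name is hypothetical.

```python
def tgtd_root_path(server, target, disk_index, port=3260):
    """Build the iSCSI root path for the disk_index-th disk (1-based) of a tgtd target."""
    if disk_index < 1:
        raise ValueError("disk_index is 1-based: LUN 0 is tgtd's controller, not a disk")
    # Assumption: disks were created on consecutive LUNs starting at 1,
    # so the n-th disk sits on LUN n (LUN 0 being the virtual controller).
    # The LUN is written in decimal, which is fine for the single-digit
    # LUNs used in this post.
    lun = disk_index
    return "iscsi:%s::%d:%d:%s" % (server, port, lun, target)


# Second disk on the target, as in the line that works:
print(tgtd_root_path("192.150.23.3", "sheeva:disk1", 2))
# → iscsi:192.150.23.3::3260:2:sheeva:disk1
```

Asking for disk 0 raises an error on purpose, since that is precisely the mistake that produces 0x1d704039.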

Then, other errors you might get are 0x0b8080a0 (Operation cancelled) or 0x2e852001 (Exec Format Error), but these should occur after you get the "Registered as BIOS drive" line, so you should be able to safely ignore them. For other errors, google is your friend.
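As a quick reference, the four codes discussed in this post can be put in a lookup table. This Python sketch only restates what is said above, it is not an exhaustive list of gPXE errors, and the names are invented for the example. The boolean says whether the error can be ignored provided it shows up after the "Registered as BIOS drive" line.

```python
# Codes and meanings as described in this post (not a complete gPXE list).
# Each entry: code -> (description, ok_after_registration)
KNOWN_ERRORS = {
    0x2C0D603B: ("iSCSI path is probably wrong (e.g. nonexistent target)", False),
    0x1D704039: ("I/O error: LUN is not a usable disk (LUN 0 on tgtd?)", False),
    0x0B8080A0: ("Operation cancelled", True),
    0x2E852001: ("Exec format error", True),
}


def describe_error(code):
    """Return (description, ok_after_registration) for a gPXE error code."""
    return KNOWN_ERRORS.get(code, ("unknown error, search for the code", False))


for code in (0x1D704039, 0x2E852001, 0x12345678):
    description, ignorable = describe_error(code)
    print("0x%08x: %s (ignorable after registration: %s)" % (code, description, ignorable))
```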

So, to summarize, if Windows doesn't like your iSCSI boot device, it's probably because, despite what you think, gPXE didn't find anything it could use as a bootable disk, and to find out why, you should try to boot from it using the gPXE low level commands.

In the next instalment, we'll see how we can create a nice iSCSI-aware WinPE image that we can launch from PXE for all of our installation needs, how we can automate the WinPE fallback from a non-bootable iSCSI disk, and how we can use pxelinux to boot from multiple iSCSI disks.

3 comments:

  1. Nil,

    Can you help me with an XP install with the help of WinPE?
    I can install the install files on the target and start the winnt prog, but then I get a 0x07b .....
    Can it be done this way?
    Please help

    h.zutt@telfort.nl

  2. Thank you very much for fixing , 0x1d704039 :)
