More 8032 / 8051 trickery

Don't worry, the (serial) printf handler is coming. In the meantime, some more hard to spot Mediatek 8032/8051 (Geez, couldn't at least agree on the base architecture name, and use declinations that make sense?) compiler trickery with lcall:

ROM:C32C                 lcall   ROM_B613
ROM:C32F lcall copy_4_bytes_to_xRAM ; input: DPTR = dest xRAM address
ROM:C32F ; PC = src bytes address
ROM:C32F ; ---------------------------------------------------------------------------
ROM:C332 .byte 0x00,0xC,0x00,0x00 ; 0
ROM:C336 ; ---------------------------------------------------------------------------
ROM:C336 mov DPTR, #0xFFFC
ROM:C339 lcall ROM_B613
ROM:C33C lcall copy_4_bytes_to_xRAM ; input: DPTR = dest xRAM address
ROM:C33C ; PC = src bytes address
ROM:C33C ; ---------------------------------------------------------------------------
ROM:C33F .byte 0,0xC,0,0 ; 0
ROM:C343 ; ---------------------------------------------------------------------------
ROM:C343 mov R6, #0x12
ROM:C345 mov R7, #0

When people extensively start/have to use instructions in other ways than they were designed for (shouldn't all lcalls' return flow of execution pcik up right after the instruction itself?), you know that you have a poorly designed CPU, no matter what people manage to achieve with it.

Yeah, it's another lcall'ed function that doesn't return where it's supposed to and breaks our nice disassembly flow. Thanks for making us having to manually edit our subroutine ends in IDA Pro, morons!

So, how does it work this time? Similarly to the previous trick actually:

ROM:B498 ; =============== S U B R O U T I N E =======================================
ROM:B498 ; input: DPTR = dest xRAM address
ROM:B498 ; PC = src bytes address
ROM:B498 copy_4_bytes_to_xRAM: ; CODE XREF: ROM:2A49p
ROM:B498 ; ROM:2A77p ...
ROM:B498 mov R0, DPL
ROM:B49A mov B, DPH ; DPTRd [DPHd DPLd] -> B R0 = Dest Pointer
ROM:B49D pop DPH
ROM:B49F pop DPL ; DPTRs [DPHs DPLs] = Src Pointer
ROM:B4A1 lcall copy_byte
ROM:B4A4 lcall copy_byte
ROM:B4A7 lcall copy_byte
ROM:B4AA lcall copy_byte
ROM:B4AD clr A
ROM:B4AE jmp @A+DPTR ; DPTRs (= original return PC) has
ROM:B4AE ; End of function copy_4_bytes_to_xRAM ; been incremented 4 times at this stage
ROM:B4AF ; =============== S U B R O U T I N E =======================================
ROM:B4AF copy_byte: ; CODE XREF: copy_4_bytes_to_xRAM+9p
ROM:B4AF ; copy_4_bytes_to_xRAM+Cp ...
ROM:B4AF clr A
ROM:B4B0 movc A, @A+DPTR ; read [DPTRs]
ROM:B4B1 inc DPTR ; DPTRs++ (as well as [DPTRs] -> A)
ROM:B4B2 xch A, DPH
ROM:B4B4 xch A, B ; DPTRs <-> DPTRd (B R0)
ROM:B4B6 xch A, DPH
ROM:B4B8 xch A, R0
ROM:B4B9 xch A, DPL
ROM:B4BB xch A, R0 ; A is still [DPTRs] at this stage
ROM:B4BC movx @DPTR, A ; [DPTRs] -> [DPTRd]
ROM:B4BE xch A, DPH ; DPTRs <-> DPTRd
ROM:B4C0 xch A, B
ROM:B4C2 xch A, DPH
ROM:B4C4 xch A, R0
ROM:B4C5 xch A, DPL
ROM:B4C7 xch A, R0
ROM:B4C8 ret
ROM:B4C8 ; End of function copy_byte

Gotta love these scores of xch instructions. Kind of the 3 cups & one red ball street magician classic. The only thing you need to know though, is that all a sequence like the following 3 lines does:
xch A, register1
xch A, register2
xch A, register1

is simply exchanging the content of register 1 & register 2, and keeping A unchanged.

I won't say this isn't a nice trick to access a small amount of data right within your program, but I'll take my general purpose register generous 68000 any time over your crappy 8051. It's a wonder really: How comes there are so few consumer devices using embedded 68000, when it would be so much more efficient for embedded applications? If you're gonna play with these kind of intel x32/x51 hurdles, it won't be that much different to go RISC all the way!


Search for low pointer values in IDA Pro

Use the following regexp:
DPTR, #0x[0-9A-F]{3}$

This will lock only on DPTR assignment values in 0x0000 0x0FFF

Switch statements, 8032/8051 style

Now this next trick is a little bit more clever.

If, when disassembling 8032 / 8051 code, you ever see the start of an lcall'ed routine that looks like this
ROM:0000B5BA                 pop     DPH             ; Data Pointer, High Byte
ROM:0000B5BC pop DPL ; Data Pointer, Low Byte
and then proceeds to use DPTR for movc instructions, the effective DPTR address used in that routine will be the next address after the LCALL (because lcall did push the current PC on the stack)

This is effectively used by, hum, some code, to implement switch statements that might be a bit tricky to detect during reverse engineering, as the disassembler obviously expect lcall to return, and will start disassembly the switch table.

Example code:
ROM:00006CA7 ROM_6CA7:                               ; CODE XREF: ROM_6C60+15j
ROM:00006CA7 ; ROM_6C60+1Bj ...
ROM:00006CA7 mov DPTR, #0xC46A
ROM:00006CAA movx A, @DPTR
ROM:00006CAB xrl A, #0x60
ROM:00006CAD jnz ROM_6CF4
ROM:00006CAF mov DPTR, #0xB03E
ROM:00006CB2 movx A, @DPTR
ROM:00006CB3 lcall case_switch_byte
ROM:00006CB3 ; ---------------------------------------------------------------------------
ROM:00006CB6 .word ROM_6CE4
ROM:00006CB8 .byte 0
ROM:00006CB9 .word ROM_6CE4
ROM:00006CBB .byte 0x20
ROM:00006CBC .word ROM_6CE0
ROM:00006CBE .byte 0x2A
ROM:00006CBF .word ROM_6CDC
ROM:00006CC1 .byte 0x2B
ROM:00006CC2 .word ROM_6CDE
ROM:00006CC4 .byte 0x2D
ROM:00006CC5 .word ROM_6CE2
ROM:00006CC7 .byte 0x2F
ROM:00006CC8 .word ROM_6CDA
ROM:00006CCA .byte 0x39
ROM:00006CCB .word ROM_6CD2
ROM:00006CCD .byte 0x5A
ROM:00006CCE .word 0
ROM:00006CD0 .word 0x6CEC
ROM:00006CD2 ; ---------------------------------------------------------------------------
ROM:00006CD2 ROM_6CD2: ; DATA XREF: ROM_6C60+6Bo
ROM:00006CD2 mov DPTR, #0xB03E
ROM:00006CD5 mov A, #0x30 ; '0'
ROM:00006CD7 movx @DPTR, A
ROM:00006CD8 sjmp ROM_6D54
ROM:00006CDA ; ---------------------------------------------------------------------------

with case_switch_byte being:

ROM:B5BA ; =============== S U B R O U T I N E =======================================
ROM:B5BA ; Iput: A = matching case value (byte)
ROM:B5BA case_switch_byte: ; CODE XREF: ROM:4046p
ROM:B5BA ; ROM_4552+178p ...
ROM:B5BA pop DPH ; Data Pointer, High Byte
ROM:B5BC pop DPL ; DPTR = lcall return address
ROM:B5BE mov R0, A
ROM:B5BF loop_through_cases: ; CODE XREF: case_switch_byte+24j
ROM:B5BF clr A
ROM:B5C0 movc A, @A+DPTR ; NB: @ is confusing. It's just A+DPTR
ROM:B5C1 jnz valid_dest ; make sure dest @ != 0
ROM:B5C3 mov A, #1
ROM:B5C5 movc A, @A+DPTR
ROM:B5C6 jnz valid_dest
ROM:B5C9 inc DPTR ; if null dest, just use the next
ROM:B5C9 ; word as dest address
ROM:B5CA dest_match: ; CODE XREF: case_switch_byte+1Fj
ROM:B5CA movc A, @A+DPTR
ROM:B5CB mov R0, A
ROM:B5CC mov A, #1
ROM:B5CE movc A, @A+DPTR
ROM:B5CF mov DPL, A ; Data Pointer, Low Byte
ROM:B5D1 mov DPH, R0 ; dest into DPTR
ROM:B5D3 clr A
ROM:B5D4 jmp @A+DPTR
ROM:B5D5 ; ---------------------------------------------------------------------------
ROM:B5D5 valid_dest: ; CODE XREF: case_switch_byte+7j
ROM:B5D5 ; case_switch_byte+Cj
ROM:B5D5 mov A, #2
ROM:B5D7 movc A, @A+DPTR
ROM:B5D8 xrl A, R0 ; cmp val with parameter
ROM:B5D9 jz dest_match
ROM:B5DD inc DPTR ; skip 3 bytes to next switch table entry
ROM:B5DE sjmp loop_through_cases
ROM:B5DE ; End of function case_switch_byte

The same kind of routine also exists for a word parameter instead of a byte.
Most of the time, the case values will follow some kind of logical order, so if you see a bunch of sequencing bytes of word, interlaced with what look like offsets, and preceded by an lcall, you might want to chack what's on the other end of that lcall.

OR, the preferred way once you have made your initial pass at identifying code, look for a function that starts by popping DPH and DPL, and seek all the lcalls that cross reference to it to identify the switch tables.

Oh, and for those who might wonder, of course, as soon as you pop the PC address that's been enqueued on the stack, the lcall never returns, and becomes exactly like a jump.

Coming next: How the hell are these bloody strings and other data sitting in standalone data sections referenced, where there does not appear to be any obvious address referencing to them anywhere in the disassembly...


Now I remember why I hate the 8032...

...and pretty much EVERY CPU designed by intel, ever.

It's not like they could have turned a 16 bit CPU into a 24 or 32 bit addressing one, because that's way harder to do than extending the 32 bit x86 to 64 bit for instance (or the original 16 bit x86 to 32 bit)...

"Look Ma! It's a flock of randomly chosen flying search engine terms to direct a certain kind of people here!": Mediatek, MT8226, 8226, LCD, Samsung, 8032, 8051, MT13x9, bank, page, 0xFFFF, reverse engineering, firmware.

The usual trick, apparently originally devised by Philips, is to use one of the 8032 ports (usually P1) as an extention of the 16 bit address bus, thus acting as a ROM bank selector => if you use 8 bits of that port, you now have 24 bit addressing.

Of course, this doesn't mean that your PC will suddenly become 24 bit aware, and since it still remains 16 bit, you need to have overlap code, for each of your 64 KB banks ([0000-FFFF]) so that, as far as the CPU is concerned, the same bank is still being used.

This usually results in LOADS AND LOADS of ROM space wasted with the following kind of crap:

ROM:8301 ; --------------------------------------------------------------------------
ROM:8301 mov DPTR, #0x7E2D
ROM:8304 ljmp bank_0E
ROM:8307 ; ---------------------------------------------------------------------------
ROM:8307 mov DPTR, #0xE4D1
ROM:830A ljmp bank_0B
ROM:830D ; ---------------------------------------------------------------------------
ROM:830D mov DPTR, #0xF51E
ROM:8310 ljmp bank_0F
ROM:8313 ; ---------------------------------------------------------------------------
ROM:8313 mov DPTR, #0xFA0D
ROM:8316 ljmp bank_0F
ROM:8319 ; ---------------------------------------------------------------------------
ROM:8319 mov DPTR, #0xF924
ROM:831C ljmp bank_0F
ROM:831F ; ---------------------------------------------------------------------------
ROM:831F mov DPTR, #0xFDA7
ROM:8322 ljmp bank_0D

with the bank switching function of the like:

ROM:8288 ; ---------------------------------------------------------------------------
ROM:8288 bank_0E: ; CODE XREF: ROM:84F6j
ROM:8288 ; ROM:852Cj ...
ROM:8288 mov A, P1 ; Port 1
ROM:828A anl A, #0xF ; 8 bits for high addressing => 16 banks
ROM:828C cjne A, #0xE, ROM_8291 ; bank 15 (0x0E)
ROM:828F clr A
ROM:8290 jmp @A+DPTR
ROM:8291 ; ---------------------------------------------------------------------------
ROM:8291 ROM_8291: ; CODE XREF: ROM:828Cj
ROM:8291 swap A
ROM:8292 rr A
ROM:8293 push ACC ; Accumulator
ROM:8295 mov A, #0xBF ; '+'
ROM:8297 push ACC ; Accumulator
ROM:8299 push DPL ; Data Pointer, Low Byte
ROM:829B push DPH ; Data Pointer, High Byte
ROM:829D ljmp ROM_BF70
ROM:82A0 ; ---------------------------------------------------------------------------
ROM:BF70 ; ---------------------------------------------------------------------------
ROM:BF70 anl P1, #0xF0 ; Go trough bank 0
ROM:BF73 orl P1, #0xE ; Set the high address lines to bank 0x0E on Port 1
ROM:BF76 ret
ROM:BF76 ; ---------------------------------------------------------------------------

As indicated above, each bank has to DUPLICATE this whole bank switching business at the exact same address. And there will be a short period where code is actually ran in Bank 0 (right after the "anl P1, #0xF0") before switching to the actual destination bank (here Bank 0Eh)

Only thing I can say is, seeing how crappy a CPU it is (don't get me started about how poorly optimized all the 8032 code I see seems to be), licensing for the 8032 must come really REALLY cheap!

But then again, having SoC makers like MediaTek design their systems on the cheap might have its advantages...