ML Tracer
Thomas G. Gordon
Attempting to debug a machine language program can sometimes be a trying experience, especially when the program always seems to exit into the twilight zone. And trying to study a program in ROM can be just as frustrating, even with a disassembler (where do branch instructions go?). Here's an excellent programming utility: a single-stepper for Atari, Apple, and all Commodore computers.
Anyone who has ever worked with machine language knows how helpful it can be to single-step through a program. "ML Tracer" allows you to step through a machine language routine one instruction at a time and print out the contents of all of the microprocessor registers after each instruction. It also allows you to follow all branches, jumps, and returns. The program will display the address, opcode, mnemonic, and operand of each instruction.
Three versions are included. Program 1 runs on all Commodore computers (for the VIC, 8K or more expansion memory is required). Program 2, for the Apple II, is only slightly different from the Commodore version. The Atari version, Program 3, has more substantial changes, but its structure is still quite similar. Since all the versions have the same line numbers, references in this article apply to all versions unless otherwise stated.
When Tracer is run, there will be a ten-second delay while the DATA statements are read. You'll then be asked for the hex address of the ML program you wish to examine. Before each instruction is executed, you can change the contents of any register. Press a for the accumulator, x for the x register, y for the y register, s for the stack pointer, p for the processor status, or i for the instruction pointer (program counter). On the Atari, press RETURN after the letter. When you're through loading registers, press RETURN once more to execute the next instruction.
Hexadecimal numbers are used for all input and output. If you enter an address as a one-, two-, or three-digit hexadecimal number, zeros will be added on the left to make a four-digit number. If too many digits are entered, the rightmost four digits will be used. The same rules apply when you change the value in a register: the number that you enter is converted to a two-digit hexadecimal number.
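The padding and truncation rules above can be sketched in a few lines of Python. This is only an illustration, not the BASIC code from the listings; the function name `normalize_hex` and the rejection of non-hex characters are assumptions, since the article does not say how Tracer handles invalid input.

```python
def normalize_hex(text, digits=4):
    """Pad with zeros on the left, or keep only the rightmost `digits` characters."""
    cleaned = text.strip().upper()
    # Assumption: anything that is not a hex digit is rejected outright.
    if not cleaned or any(ch not in "0123456789ABCDEF" for ch in cleaned):
        raise ValueError("not a hexadecimal number: " + repr(text))
    # rjust pads short input; the slice trims input that is too long.
    return cleaned.rjust(digits, "0")[-digits:]


print(normalize_hex("3c0"))            # 03C0 (address, padded on the left)
print(normalize_hex("1203C0"))         # 03C0 (rightmost four digits kept)
print(normalize_hex("5", digits=2))    # 05   (register value)
print(normalize_hex("1FF", digits=2))  # FF   (rightmost two digits kept)
```

Passing `digits=2` gives the register behavior and the default of four gives the address behavior.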
The Execution Subroutine
The program is written mostly in BASIC, but contains two machine language subroutines. The first, the initialization subroutine, copies the lowest three pages (768 bytes) of RAM, which are used by BASIC, to a location above the BASIC program. The other, the execution subroutine, exchanges the two three-page blocks of data and loads all the registers with their saved values, then executes one instruction (which has been POKEd in from BASIC). When the instruction has been executed, the registers are saved and BASIC's original lower three pages of memory are restored.
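The copy-then-exchange scheme can be modeled roughly as follows. This Python sketch treats memory as a flat 64K byte array; the save address `SAVE_BASE` and the function names are hypothetical, and the real routines are written in 6502 machine language.

```python
BLOCK = 3 * 256          # the lowest three pages: 768 bytes
SAVE_BASE = 0x6000       # hypothetical location above the BASIC program


def save_low_pages(memory, save_base):
    """Initialization: copy pages 0-2 to the save area."""
    memory[save_base:save_base + BLOCK] = memory[0:BLOCK]


def exchange_low_pages(memory, save_base):
    """Swap pages 0-2 with the save area, byte by byte."""
    for i in range(BLOCK):
        memory[i], memory[save_base + i] = memory[save_base + i], memory[i]


memory = bytearray(65536)
memory[0x10] = 0xAA                      # a value BASIC depends on
save_low_pages(memory, SAVE_BASE)

exchange_low_pages(memory, SAVE_BASE)    # traced program's copy is now live
memory[0x10] = 0x55                      # the traced instruction overwrites it
exchange_low_pages(memory, SAVE_BASE)    # BASIC's copy is live again

print(hex(memory[0x10]))                 # 0xaa: BASIC's value survived
print(hex(memory[SAVE_BASE + 0x10]))     # 0x55: the traced program's value is kept
```

Because the blocks are exchanged rather than copied, changes the traced program makes to zero page or the stack persist from one step to the next without disturbing BASIC.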
The same technique was used to identify addressing modes as in my disassembler ("A 6502 Disassembler," COMPUTE!, January 1981, p. 81). Lines 10000-10031 contain four-character extended mnemonics for the 6502's instruction set. The fourth character is a tag code identifying the addressing mode of the instruction. In lines 110-120, the mode is identified and the proper subroutine is called.
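As a rough sketch of the tag-code idea, the Python fragment below splits an extended mnemonic into its three-character mnemonic and one-character tag, then dispatches on the tag. Only the entry for opcode $A9 ("LDAB", where B means immediate addressing) comes from this article; the table layout, the names, and the use of a dictionary in place of BASIC's ON GOSUB are assumptions.

```python
# Hypothetical fragment of the extended-mnemonic table, keyed by opcode.
# Only $A9 -> "LDAB" is taken from the article.
EXTENDED = {0xA9: "LDAB"}


def immediate_mode(operand):
    """Format a one-byte immediate operand."""
    return "#$" + format(operand, "02X")


# Tag code -> formatting routine; this stands in for ON GOSUB.
# Tags other than "B" are not listed in the article and are omitted here.
MODE_ROUTINES = {"B": immediate_mode}


def disassemble(opcode, operand):
    entry = EXTENDED.get(opcode, "")
    if entry.strip() == "":
        # A blank mnemonic marks an undefined opcode.
        raise ValueError("undefined opcode $" + format(opcode, "02X"))
    mnemonic, tag = entry[:3], entry[3]
    return mnemonic + " " + MODE_ROUTINES[tag](operand)


print(disassemble(0xA9, 0x20))   # LDA #$20
```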
There are several instructions which cannot be allowed to actually execute in the machine language subroutine. If any control transfer instructions (JMP, JSR, RTS, RTI, or a conditional branch) were executed, control would not be returned properly to the BASIC program. These instructions are simulated in BASIC instead, so that they appear to execute successfully. The SEI and CLI instructions are ignored, since interrupts are always disabled during the execution subroutine.
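To give a feel for what simulating these instructions involves, here is a minimal Python sketch of a conditional branch, JSR, and RTS following standard 6502 behavior. It is not the article's BASIC code: the function names are hypothetical, and memory is modeled as a flat byte array with the stack in page 1.

```python
def branch_target(pc, offset_byte, taken):
    """pc is the address of the branch opcode; branches are two bytes long."""
    next_pc = (pc + 2) & 0xFFFF
    if not taken:
        return next_pc
    # The operand is a signed 8-bit offset from the following instruction.
    offset = offset_byte - 256 if offset_byte >= 0x80 else offset_byte
    return (next_pc + offset) & 0xFFFF


def simulate_jsr(memory, pc, sp):
    """Push the address of the JSR's last byte (high byte first), then jump."""
    target = memory[pc + 1] | (memory[pc + 2] << 8)
    ret = (pc + 2) & 0xFFFF
    memory[0x0100 + sp] = ret >> 8
    sp = (sp - 1) & 0xFF
    memory[0x0100 + sp] = ret & 0xFF
    sp = (sp - 1) & 0xFF
    return target, sp


def simulate_rts(memory, sp):
    """Pull the return address (low byte first) and add one."""
    sp = (sp + 1) & 0xFF
    lo = memory[0x0100 + sp]
    sp = (sp + 1) & 0xFF
    hi = memory[0x0100 + sp]
    return (((hi << 8) | lo) + 1) & 0xFFFF, sp


memory = bytearray(65536)
memory[0x03C0:0x03C3] = bytes([0x20, 0x34, 0x12])    # JSR $1234
pc, sp = simulate_jsr(memory, 0x03C0, 0xFF)
print(format(pc, "04X"), format(sp, "02X"))           # 1234 FD
pc, sp = simulate_rts(memory, sp)
print(format(pc, "04X"), format(sp, "02X"))           # 03C3 FF
print(format(branch_target(0x03C0, 0xFE, True), "04X"))  # 03C0 (branch to itself)
```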
How Does It Work?
The simplest way to see how the program works is to trace through an example. Suppose the instruction LDA #$20 resides at addresses $03C0-$03C1. For this instruction, the extended mnemonic is LDAB, where LDA stands for LoaD Accumulator, and B is the tag code for immediate addressing. The hexadecimal representation for LDA immediate is $A9, which is equivalent to decimal 169.
Line 50, the top of the main loop, calls the keyboard pause routine at line 7000, which also handles changing registers. In line 55, the variable C is loaded with 169 by PEEKing the memory addressed by B, the instruction pointer. The value of B, 960 in this example, is then converted to hexadecimal characters in line 2000 and PRINTed.
In line 60, NOP instructions are POKEd into the execution routine to take up space after one- or two-byte instructions. The hexadecimal value of the opcode is printed next, and then the mnemonic is retrieved from the array R$(). (In the Atari version, mnemonics are stored in the string R$.) If the mnemonic is a blank, this instruction is undefined and an error message is displayed. Otherwise, the standard (three-character) mnemonic is PRINTed, the opcode is POKEd into the execution routine at OP, and the program counter is incremented to 961.
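The NOP padding can be pictured with a short Python sketch. It assumes the execution routine reserves three bytes starting at OP, which is the longest a 6502 instruction can be; the function name `build_slot` is hypothetical.

```python
NOP = 0xEA   # 6502 opcode for NOP


def build_slot(opcode, operands):
    """Return the three bytes placed at OP, OP + 1, and OP + 2."""
    if len(operands) > 2:
        raise ValueError("a 6502 instruction has at most two operand bytes")
    slot = [NOP, NOP, NOP]            # fill with NOPs first
    slot[0] = opcode                  # opcode goes at OP
    for i, value in enumerate(operands):
        slot[1 + i] = value           # operand bytes follow the opcode
    return slot


# LDA #$20 is a two-byte instruction, so one NOP remains as padding.
print([format(b, "02X") for b in build_slot(0xA9, [0x20])])   # ['A9', '20', 'EA']
```

Any unused bytes execute harmlessly as NOPs, so the same fixed-size slot serves one-, two-, and three-byte instructions.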
The ASCII code for B is 66, so the ON GOSUB in line 120 transfers control to line 400. Here, the symbol for the addressing mode, #$, is printed. The one-byte operand routine, at line 3000, PEEKs location 961, pointed to by the program counter. This number is POKEd into OP + 1, then converted to hexadecimal and PRINTed. After incrementing the program counter to point to the start of the next instruction, a RETURN is executed at line 3000.
At line 5000, the execution routine is SYSed, CALLed, or USRed depending on which computer you have. The contents of the registers are displayed, and control passes back to line 120. Here, a GOTO 50 takes us back to the top of the loop, where the instruction at $03C2 will be executed.
Tracing Is Educational Too
You will find that this program is most useful for testing small ML programs, such as those called as subroutines from BASIC. It's also good for examining sections of larger programs when you're not sure how a particular routine works. If you're learning machine language, you'll find that the register display is an enormous help in understanding the effects and side effects of each instruction, especially the bits (flags) of the processor status register.
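For readers new to the status register, the Python sketch below decodes a status byte into its individual flags using the standard 6502 bit layout (N, V, unused, B, D, I, Z, C from bit 7 down to bit 0). The display format is an assumption and is not necessarily how Tracer prints the register.

```python
FLAG_NAMES = "NV-BDIZC"   # bit 7 on the left, bit 0 on the right


def describe_status(p):
    """Show the letter for each set bit and a dot for each clear bit."""
    return "".join(
        name if p & (0x80 >> i) else "."
        for i, name in enumerate(FLAG_NAMES)
    )


def flags_after_load(p, value):
    """Update N and Z the way LDA, LDX, and LDY do; other flags are untouched."""
    p &= ~0x82 & 0xFF          # clear N (bit 7) and Z (bit 1)
    if value == 0:
        p |= 0x02              # Z is set when the loaded value is zero
    if value & 0x80:
        p |= 0x80              # N copies bit 7 of the loaded value
    return p


print(describe_status(0x81))                          # N......C
print(describe_status(flags_after_load(0x30, 0x00)))  # ..-B..Z.
print(describe_status(flags_after_load(0x30, 0x20)))  # ..-B....
```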
Do be careful, though. Any program is vulnerable when dealing with something as powerful as machine language, and this one is no exception. There are more ways to kill a BASIC program from ML than anyone can name in one sitting, so always be conscientious about saving your programs. After you type this one in, SAVE it before you even think about running it. One typographical error could cause the program to erase itself, or at least lock up the computer.
There are also some ML programs that this tracer can't follow, such as those which disconnect the keyboard or video display (whether intentionally or accidentally). If everything is saved on disk or tape (for real security, take the diskette or cassette out of the drive), you can experiment as much as you want. If disaster strikes, all you have to do is turn the computer off and reload the program.