Apple Assembly Line
Volume 6 -- Issue 6 March 1986

In This Issue...

ES-CAPE

Whatever happened to the Extended S-C Applesoft Program Editor? That's a question we've heard more than a few times in the last year or two, and we finally have some kind of answer.

We got bogged down in producing Version 2.0 of the program. The new printer control, Park and Join, and Applesoft and DOS command features are great. The 40-column, STB-80, and //e versions came out just fine, but the Videx and Viewmaster versions stumped us. The planned Renumber and Merge features never made it, and we couldn't settle on a mechanism for adding other utility programs.

Anyway, we've got a deal for you! How's this for a package: ES-CAPE 1.0 Source and Object Code and manual, along with ES-CAPE 2.0 Source and Object Code and a manual supplement on disk. That's all the source and object code for both versions of the program, for a total of only $50.00. Registered owners of ES-CAPE 1.0 can purchase this new package for only $30.00.

New 65816 Book

There's another book coming along on programming the 658xx processors. This one is called "65816/65802 Assembly Language Programming", by Michael Fischer, published by Osborne/McGraw-Hill as an addition to their Assembly Language Programming series, most of which was written by Lance Leventhal. Fischer's book is scheduled for May delivery, so we have ordered some copies and are beginning to accept orders. Our price will be $18.00 + shipping.


Modifying ProDOS for Non-Standard ROMs Bob Sander-Cederlof

We have several times published ways to defeat the ROM Checksummer that is executed during a ProDOS boot, so that owners of Franklin clones (or even real Apples with modified monitor ROMs) could use ProDOS-based software. See AALs of March and June, 1984.

Both of these previous articles are out of date now, because they apply to older versions of ProDOS than are current. What follows applies to Version 1.1.1 of ProDOS.

There are two problems with getting ProDOS to boot on a non-standard machine. The first is the ROM Checksummer. This subroutine starts at $267C in Version 1.1.1, and is only called from $25EE. The code is purposely weird, designed to look like it is NOT checking the ROMs. It also has apparently purposeful side effects. Here is a listing of the subroutine:

  1000 *SAVE CHECKSUMMER
  1010 *--------------------------------
  1020        .OR $267C    POSITION IN PRODOS SYSTEM FILE
  1030 *--------------------------------
  1040 CHECKSUMMER
  1050        CLC
  1060        LDY $2674    (GETS A VALUE 0)
  1070 .1     LDA ($0A),Y  GETS (FB09...FB10)
  1080        AND #$DF     STRIP OFF LOWER CASE BIT
  1090        ADC $2674    ACCUMULATE SHIFTED SUM
  1100        STA $2674
  1110        ROL $2674    SHIFT RESULT, CARRY INTO BIT 0
  1120        INY
  1130        CPY $2677    DO IT 8 TIMES
  1140        BNE .1
  1150        TYA          A = Y = 8
  1160        ASL          FORM $80 BY SHIFTING
  1170        ASL
  1180        ASL
  1190        ASL
  1200        TAY          $80 TO Y FOR LATER TRICK
  1210        EOR $2674    MERGE WITH PREVIOUS "SUM"
  1220        ADC #11      FORM $00 FOR VALID ROMS
  1230        BNE .2       ...NOT A VALID ROM
  1240        LDA $0C      GET MACHINE ID BYTE
  1250        RTS
  1260 .2     LDA #0       SIGNAL INVALIDITY
  1270        RTS

The pointer at $0A,0B was set up to point to $FB09 using very sneaky code at $248A. Location $2674 initially contains a 0, and $2677 contains an 8. Only the bytes from $FB09 through $FB10 are checksummed. Truthfully, "checksummed" is not the correct word.

The wizards who put ProDOS together figured out a fancy function which changes the 64 bits from $FB09 through $FB10 into the value $75. Their function does this whether your ROMs are the original monitor ROM from 1977-78, the Autostart ROM, the original //e ROM, or any other standard Apple ROM. The values in $FB09-FB10 are not the same in all cases, but the function result is always $75. However, a Franklin ROM does not produce $75. A BASIS probably gives a different result as well, as do other clones. Once $75 is obtained, further slippery code changes the value to $00.

The original Apple II ROM has executable code at $FB09, and in hex it is this: B0 A2 20 4A FF 38 B0 9E. All other Apple monitor ROMs have an ASCII string at $FB09. The string is either "APPLE ][" or "Apple ][". Notice that the "AND #$DF" in the checksummer strips out the upper/lower case bit, making both ASCII strings the same.

I wrote a test program to print out all the intermediate values during the "Checksummer's" operation. Here are the results, for both kinds of ROMs.

     Original ROM           Later ROMs
     LDA AND ADC STA ROL    LDA   AND ADC STA ROL
     B0  90  00  90  20     C1    C1  00  C1  82
     A2  82  20  A2  44     D0/F0 D0  82  52  A5
     20  00  44  44  88     D0/F0 D0  A5  75  EB
     4A  4A  88  D2  A4     CC/EC CC  EB  B7  6F
     FF  DF  A4  83  07     C5/E5 C5  6F  34  69
     38  18  07  1F  3E     A0    80  69  E9  D2
     B0  90  3E  CE  9C     DD    DD  D2  AF  5F
     9E  9E  9C  3A  75     DB    DB  5F  3A  75

I don't understand why this code gives the same result, but I see it does. Now, dear readers, tell me how anyone ever figured out what sequence of operations would produce the same result using these two different sets of eight bytes, and yet produce a different result for clones! If you understand it, please explain it to me!

By the way, here is a listing of my test program:

  1000  .LIF
  1010 *SAVE TEST.CKSUMMER
  1020 *--------------------------------
  1030 *   SIMULATE PRODOS $FB09-FB10 CHECK-SUMMER
  1040 *      (AT $267C IN PRODOS 1.1.1)
  1050 *--------------------------------
  1060 T
  1070        LDA #S1
  1080        STA $0A
  1090        LDA /S1
  1100        STA $0B
  1110        JSR CS
  1120        LDA #S2
  1130        STA $0A
  1140        LDA /S2
  1150        STA $0B
  1160 CS
  1170        JSR PT
  1180        CLC
  1190        LDY #0
  1200        STY X
  1210 .1     LDA ($0A),Y
  1220        JSR B
  1230        AND #$DF
  1240        JSR B
  1250        LDA X
  1260        JSR B
  1270        LDA ($0A),Y
  1280        AND #$DF
  1290        ADC X
  1300        STA X
  1310        JSR B
  1320        ROL X
  1330        LDA X
  1340        JSR B
  1350        JSR $FD8E
  1360        INY
  1370        CPY #8
  1380        BCC .1
  1390        TYA
  1400        ASL
  1410        ASL
  1420        ASL
  1430        ASL
  1440        ORA X
  1450        JSR B
  1460        ADC #$0B
  1470 *--------------------------------
  1480 B      PHA
  1490        PHP
  1500        JSR $FDDA
  1510        LDA #" "
  1520        JSR $FDED
  1530        JSR $FDED
  1540        PLP
  1550        PLA
  1560        RTS
  1570 *--------------------------------
  1580 X      .BS 1
  1590 *--------------------------------
  1600 S1     .AS -/APPLE ][/
  1610 S2     .HS B0.A2.20.4A.FF.38.B0.9E
  1620 *--------------------------------
  1630 TITLE  .HS 8D8D
  1640        .AS -/LDA AND ADC STA ROL/
  1650        .HS 8D00
  1660 *--------------------------------
  1670 PT
  1680        LDY #0
  1690 .1     LDA TITLE,Y
  1700        BEQ .2
  1710        JSR $FDED
  1720        INY
  1730        BNE .1
  1740 .2     RTS
  1750 *--------------------------------
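
For readers who would rather check the arithmetic away from the Apple, the loop can be modeled in a few lines of Python. This is only a sketch of the listing at $267C, not ProDOS code: the name rom_check and the three byte constants are made up for illustration, and the model assumes (as the listing suggests) that the carry is clear going into every ADC, because CLC or the CPY leaves it that way.

```python
# Eight bytes at $FB09-$FB10, copied from the article text.
ORIGINAL_ROM = bytes.fromhex("B0A2204AFF38B09E")
UPPER_ROM = bytes(ord(c) | 0x80 for c in "APPLE ][")
LOWER_ROM = bytes(ord(c) | 0x80 for c in "Apple ][")


def rom_check(rom_bytes):
    """Return (sum, final A) as the loop at $267C would compute them."""
    total = 0                                # $2674 starts out as 0
    for value in rom_bytes:
        value &= 0xDF                        # AND #$DF
        added = value + total                # ADC $2674, carry clear on entry
        carry = added >> 8                   # carry out of the ADC
        # STA $2674, then ROL $2674: the ADC carry enters bit 0
        total = (((added & 0xFF) << 1) | carry) & 0xFF
    # TYA and four ASLs leave $80 in A with carry clear
    final = ((0x80 ^ total) + 11) & 0xFF     # EOR $2674, ADC #11
    return total, final


for name, rom in (("original", ORIGINAL_ROM),
                  ("APPLE ][", UPPER_ROM),
                  ("Apple ][", LOWER_ROM)):
    total, final = rom_check(rom)
    print(f"{name:9s} sum=${total:02X} final=${final:02X}")
```

All three byte sets come out with a sum of $75 and a final value of $00, matching the bottom row of the table above.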

The checksummer can be defeated. The best way, preserving the various side effects, is to change the byte at $269F from $03 to $00. This changes the BNE to an effective no-operation, because it will branch to the next instruction regardless of the status. Another way to get the same result is to store $EA at both $269E and $269F. Still another way is to change the "LDA #0" at $26A3,4 to "LDA $0C" (A5 0C), so that either case gives the same result.

If it thinks it is in a valid Apple computer, the checksummer returns a value in the A-register which is non-zero, obtained from location $0C. The value at $0C has been previously set by looking at other locations in the ROM, trying to tell which version is there. Part of this code is at $2402 and following, and part is at $2047 and following. The byte at $0C will eventually become the Machine ID byte at $BF98 in the System Global Page, so it also gets some bits telling how much RAM is available, and whether an 80-column card and a clock card are found.

If you have a non-standard Apple or a clone, the bytes which are checked to determine which kind of ROM you have may give an illegal result. The following table shows the bytes checked, and the resulting values for $0C. The values in parentheses are never checked, but I included them for completeness. The value in $0C will be further modified to indicate the amount of RAM found and the presence of a clock card.

   Version                FBB3  FB1E  FBC0  FBBF  $0C

   Original Apple II       38   (AD)  (60)  (2F)   00
   Autostart, II Plus      EA    AD   (EA)  (EA)   40
   Original //e            06   (AD)   EA   (C1)   80
   Enhanced //e            06   (AD)   E0   (00)   80
   DEBUG //e               06   (AD)   E1   (00)   80
   Original //c            06   (4C)   00    FF    88
   //c Unidisk 3.5         06   (4C)   00    00    88
   /// Emulating II        EA    8A   (??)  (??)   C0

By the way, ProDOS 1.1.1 will not allow booting by an Apple /// emulating a II Plus, possibly because the standard emulator only emulates a 48K machine.

I have no idea what a clone would have in those four locations, but chances are it would be different. You should probably try to fool ProDOS into thinking you are in a II Plus, because most clones are II Plus clones. This means you should somehow change the ID procedures so that the result in $0C is a value of $40. One way to do this is to change the code at $2402 and following like this:

           Standard                     Change to

     2402- A9 00    LDA #0        2402- A9 40    LDA #$40
     2404- 85 0C    STA $0C       2404- 4C 2E 24 JMP $242E
     2406- AE B3 FB LDX $FBB3

If your clone or modified ROM is a //e, change $2402 to LDA #$80 instead.

You may also need to modify the code at $2047 and following. If you are trying to fool ProDOS into thinking you are an Apple II Plus or //e, and have already made the change described above, change $2047-9 like this:

           Standard                    Change to

     2047- AE B3 FB LDX $FBB3    2047- 4C 6D 20 JMP $206D
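
If you would rather patch a copy of the PRODOS file on another machine than poke the bytes in by hand, something like the following Python sketch could apply the II Plus changes described above. It is hedged in several ways: the helper apply_patches and the PATCHES table are hypothetical names, the sketch assumes the file image is loaded at $2000 (so that file offset = address - $2000), and it refuses to touch an image whose standard bytes do not match those shown for Version 1.1.1.

```python
LOAD_ADDRESS = 0x2000   # assumption: the PRODOS system file loads at $2000

# (address, standard bytes expected there, replacement bytes)
PATCHES = [
    # BNE .2 in the checksummer becomes a branch to the next instruction
    (0x269F, bytes.fromhex("03"), bytes.fromhex("00")),
    # LDA #0 / STA $0C becomes LDA #$40 / JMP $242E
    # (the JMP also overwrites the first byte of the LDX that follows)
    (0x2402, bytes.fromhex("A900850C"), bytes.fromhex("A9404C2E24")),
    # LDX $FBB3 becomes JMP $206D
    (0x2047, bytes.fromhex("AEB3FB"), bytes.fromhex("4C6D20")),
]


def apply_patches(image, patches=PATCHES, load_address=LOAD_ADDRESS):
    """Return a patched copy of image, checking the old bytes first."""
    patched = bytearray(image)
    for address, old, new in patches:
        offset = address - load_address
        found = bytes(patched[offset:offset + len(old)])
        if found != old:
            raise ValueError(
                f"${address:04X}: expected {old.hex()}, found {found.hex()}")
        patched[offset:offset + len(new)] = new
    return bytes(patched)
```

For a //e clone the only difference would be the second replacement, which would start with A9 80 instead of A9 40.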

No doubt future versions of ProDOS will make provision for clones and modified ROMs even more difficult. And there are always the further problems encountered by usage of the ROMs from BASIC.SYSTEM and the ProDOS Kernel and whatever application program is running.

I am intrigued about seeing what the minimum amount of code is that can distinguish between the four legal varieties of ROM for ProDOS. I notice from the table above that I can identify the four types and weed out the ///emulator by the following simple code at $2402:

            LDA $FBB3
            ORA $FB1E
            LDX #3
       .1   CMP TABLE.1,X
            BEQ .2
            DEX
            BPL .1
            SEC
            RTS
     *
     TABLE.1  .HS BD.EF.AF.4E
     TABLE.2  .HS 00.40.80.88
     *
       .2   LDA TABLE.2,X
            JMP $242E

With this code installed, none of the code from $2047 through $206C is needed, and a JMP $206D should be installed at $2047. The new code at $2402 fits in the existing space with room to spare. Can you do it with even shorter code?
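
The table values in that fragment are easy to check against the ROM table given earlier. The following Python sketch (the names SIGNATURES and machine_id are invented for this illustration) ORs the two bytes together exactly as the LDA/ORA pair does and looks the result up. A return of None stands for the SEC/RTS failure exit.

```python
# TABLE.1 value -> TABLE.2 value, as in the assembly fragment above
SIGNATURES = {0xBD: 0x00, 0xEF: 0x40, 0xAF: 0x80, 0x4E: 0x88}


def machine_id(fbb3, fb1e):
    """Return the $0C value for the bytes at $FBB3 and $FB1E, or None."""
    return SIGNATURES.get(fbb3 | fb1e)


# (name, $FBB3, $FB1E) rows taken from the ROM table
for name, fbb3, fb1e in (("Original Apple II", 0x38, 0xAD),
                         ("Autostart, II Plus", 0xEA, 0xAD),
                         ("//e", 0x06, 0xAD),
                         ("//c", 0x06, 0x4C),
                         ("/// Emulating II", 0xEA, 0x8A)):
    result = machine_id(fbb3, fb1e)
    shown = "rejected" if result is None else f"${result:02X}"
    print(f"{name:20s} {shown}")
```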


Even Faster 65802 16x16 Multiply Bob Sander-Cederlof

Bob Boughner, faithful reader from Yorktown, Virginia, decided that the challenge at the end of my article in the January 1986 AAL could not be ignored. He was able to slightly increase the speed of my 16x16 multiply subroutine for the 65802. After studying his code, I made a few more little changes and squeezed out even more cycles.

To see just how much faster the new subroutine is, I carefully counted the cycles, and then went back and did the same to January's subroutine. For some reason I got a new answer for January's program, slightly slower than published. Here are the results:

               Minimum  Maximum  Average
       January   333      693      513
       New One   321      633      477

The times include 6 cycles for a JSR to call the subroutine, and 6 cycles for the RTS to return. By putting the code in-line, even these 12 cycles could be eliminated. The so-called average time is merely the arithmetic average of the minimum and maximum times. The "real" average for random factors will be faster, because one or both of the INC instructions at lines 1350 and 1430 would be skipped. In fact, almost always at least one would be skipped, saving 48 cycles. Note also that if the factor in CAND is zero, the total time is only 45 cycles.

In counting cycles I assumed that the D-register, which tells the 65802 where the direct page is, has a low byte = 0. If it is non-zero, all of the references to CAND, PLIER, and PROD would require one more cycle.

The new subroutine is only 4 bytes longer than the January one. The new one uses the Y-register, while the old one did not. There are three tricks in the new code which save time. The first one is holding the multiplicand in the Y-register, so that TYA instructions can be used at lines 1310 and 1390. This saves 2 cycles each time, or a total of 32 cycles in the maximum case. The cost is the LDY CAND in line 1200, 4 cycles.

The second trick eliminates the CLC instruction before the multiplicand is added in lines 1370-1430. The savings is 16 cycles maximum, and the cost is 8 cycles to set it up in lines 1120-1140 by inverting the high byte of the multiplier. This doesn't affect the average time any, but it does lower the maximum time.

The third trick is at lines 1280 and 1290. I saved 24 cycles by eliminating January's AND ##$0080 instruction here. The LDA PLIER-1 instruction picks up the low byte of the multiplier in the high byte of the A-register, allowing me to see what bit 7 of the multiplier is without any masking or shifting.

  1000        .OP 65802
  1010 *SAVE BOUGHNERS.MULT
  1020 *--------------------------------
  1030 *   CONTRIBUTED BY BOB BOUGHNER
  1040 *   MODIFIED A LITTLE MORE BY BOB S-C
  1050 *--------------------------------
  1060 CAND   .EQ 0,1
  1070 PLIER  .EQ 2,3
  1080 PROD   .EQ 4,5,6,7
  1090 *--------------------------------
  1100 MUL.FASTER.YET.16X16.65802
  1110        LDX #8       WILL LOOP 8 TIMES
  1120        LDA PLIER+1  INVERT HIGH BYTE
  1130        EOR #$FF     TO SAVE "CLC" IN LOOP
  1140        STA PLIER+1
  1150        CLC
  1160        XCE          ENTER "NATIVE" MODE
  1170        REP #$30     16-BITS BOTH X & M
  1180        STZ PROD     CLEAR PRODUCT
  1190        STZ PROD+2
  1200        LDY CAND     MULTIPLICAND IN Y-REG
  1210        BNE .2       ...NON-ZERO, START LOOP
  1220        XCE          ...ZERO, EXIT NOW
  1230        RTS
  1240 *--------------------------------
  1250 .1     ASL PROD     DOUBLE THE PRODUCT
  1260        ROL PROD+2
  1270 *--------------------------------
  1280 .2     LDA PLIER-1  GET LOW BYTE IN A(15-8)
  1290        BPL .3       ...ORIG. BIT=0, DON'T ADD
  1300        CLC
  1310        TYA          ...ORIG. BIT=1, ADD 'CAND
  1320        ADC PROD
  1330        STA PROD
  1340        BCC .3
  1350        INC PROD+2   ADD CARRY TO HI-16
  1360 *--------------------------------
  1370 .3     ASL PLIER    SHIFT MULTIPLIER, GET HI-BIT
  1380        BCS .4       ...ORIG. BIT=0, DON'T ADD
  1390        TYA          ...ORIG. BIT=1, ADD 'CAND
  1400        ADC PROD+1   ADD TO MIDDLE OF PRODUCT
  1410        STA PROD+1
  1420        BCC .4
  1430        INC PROD+3   (NEVER BOTHERS PROD+4)
  1440 *--------------------------------
  1450 .4     DEX
  1460        BNE .1
  1470        SEC
  1480        XCE
  1490        RTS
  1500 *--------------------------------
  1510   .LIF
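
The logic of the loop, as opposed to its timing, can be checked with a small Python model. This is a sketch only: the function name mul16x16 is made up, Python integers stand in for the PROD bytes, and the comments point at the listing lines each step is meant to mirror. It handles two multiplier bits per pass, one from each byte, and uses the inverted high byte so that a clear carry means "add".

```python
def mul16x16(cand, plier):
    """Model of the 8-pass, two-bits-per-pass multiply in the listing."""
    cand &= 0xFFFF
    plier &= 0xFFFF
    if cand == 0:                        # lines 1200-1230: early exit
        return 0
    plier ^= 0xFF00                      # lines 1120-1140: invert high byte
    prod = 0                             # lines 1180-1190
    for loop in range(8):
        if loop:                         # lines 1250-1260: double the product
            prod = (prod << 1) & 0xFFFFFFFF
        if plier & 0x0080:               # lines 1280-1290: bit 7 of low byte
            prod += cand                 # lines 1300-1350
        carry = plier >> 15              # line 1370: ASL PLIER
        plier = (plier << 1) & 0xFFFF
        if carry == 0:                   # line 1380: clear carry = original 1
            prod += cand << 8            # lines 1390-1430: add to the middle
    return prod


print(hex(mul16x16(0x1234, 0x5678)))
```

Like the assembly version, the model destroys its copy of the multiplier as it goes.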

Some More Rumors Bob Sander-Cederlof

Electronics magazine printed a brief news item about a second source for 65816 chips. Western Design has signed up a lot of licensees to make these chips, but none of them are in production as of this month. Electronics says VLSI Technology Inc., of San Jose, California, is projecting prices in the $10 range for volume purchases. When? Target is to start selling sample quantities next summer. Meanwhile, volume prices are in the $35 range from Western Design Center. The single-unit price is still about $100.

The parts we are selling are the 65C802 from Western Design Center. Our price to you is $50 each. These are normally spec'd at 2MHz, but sometimes we get 4MHz parts at the same price, when they are out of the slower ones. Either speed works equally well in an Apple motherboard, but you need the 4MHz chip to use in a Transwarp accelerator card.

Rumors continue to ricochet around the club newsletter circuit about the possible configuration of the new Apple II (usually called the //x). Most rumor sources agree now that the //x will use a 65C816. We sure HOPE so! One source said he looks for an 8MHz clock. We doubt that, because current projections are for 8MHz chips becoming available about 1st quarter 1987. And the RAM for 8MHz operation would be far too expensive. My guess is we will see either 2MHz or 3.58MHz.

Most are now including a SCSI port in their list of features, since the Macintosh Plus has one. Some are talking about a smaller set of normal slots, supplemented by some new super-slots having more signals available. There are reportedly a number of different versions of the //x already in existence, seeded around. If that is true, it could be that no one (even inside Apple) yet knows what the REAL //x will be.


Add Smarts to 65816 Dis-Assembler Jim Popenoe

I found fascinating the article by Bob Sander-Cederlof in the March 1985 AAL, entitled "A Disassembler for the 65816". I purchased AAL Quarterly Disk 18 and tried it out for myself, watching 65802 instructions zip before my eyes.

But, whoa! Bob was correct in warning that his disassembler would not know whether immediate-mode instructions are two or three bytes long. Bob explained "only by executing the programming, and tracing it line-by-line, can we tell." A fully accurate disassembler for the 65816 would have to execute the equivalent of STEP and TRACE, following the logic flow of the program.

I wanted an easier, quick-and-dirty way to spiff up the output, one that would at least recognize simple, straightforward changes in the processor status. I reasoned that:

1) Interpretation of immediate-mode instructions depends on the state of E, M, and X bits in the status register.

2) E and C bits are exchangeable.

3) The disassembler must keep track of all four bits (C, E, X, and M) in order to disassemble immediate mode opcodes correctly.

4) The disassembler should also keep track of when the processor status is pushed onto or pulled off the stack.

My implementation assigns a memory location for the E-bit, and a small "stack" of 8 memory locations for the status register. One more memory location serves as the stack pointer. Here is the initialization code for these memory locations, replacing lines 1450-1480 in Bob's March 1985 listing:

  1450        LDX #$FF          START WITH E=1
  1454        STX E.BIT
  1458        STX STATUS.STACK  EMPTY THE STATUS STACK
  1462        INX               X=0
  1466        STX STATUS.PNTR
  1470        RTS
  1474 *--------------------------------
  1478 E.BIT         .BS 1
  1482 STATUS.PNTR   .BS 1
  1486 STATUS.STACK  .BS 8

I added a JSR TEST.OP.CODES line at 5865, to call some new code which looks for CLC, SEC, REP, SEP, PHP, PLP, and XCE instructions. It adjusts the flags appropriately in response to these instructions. If the current opcode is none of the above, TEST.OP.CODES checks the status bits and the opcode to set up the correct immediate-mode length. If the opcode is an immediate mode operation on the A-register, and if E=0 and M=0, then 16-bit immediate will be disassembled. If the opcode is an immediate mode operation on the X- or Y-register, and if E=0 and X=0, then 16-bit immediate will be disassembled. Otherwise, any kind of immediate mode instruction will be disassembled with an 8-bit operand.

I tried the program on all the sample 65802 code I could find, and it was all disassembled correctly. Of course it is certainly possible to fool my program. The C-bit, and hence possibly the E-bit, can be changed in many other ways than by using the CLC and SEC instructions. The program flow is not followed, so it is possible that my emulation of the carry status and the XCE will not agree with what happens in some code. If you adhere to the "nice" standard of always using explicit SEC or CLC opcodes before an XCE opcode, the disassembler should stay in step perfectly.

When you type 800G to link in the disassembler (refer to Bob's article to know what I mean here) the status is initialized to E=C=M=X=1. This means normal 6502 mode. If you disassemble some code with XCE's in it, the status I keep will probably be left in some other mode. If you then try to disassemble some plain vanilla 6502 code, the immediate instructions may be disassembled with 16-bit operands. Just type 800G again to get back to normal.

By the way, in working with Bob's disassembler I discovered a typing error in his code. Line 3980 was originally >OXA TAY, and it should have been >OXA DEY. The hex listing in Bob's article showed $AF stored in $963; it really should be $89. Without this change, the DEY opcode disassembles as TAY!

The listing that follows has been extensively modified by Bob, based on the code I sent him last September. The lines are numbered to follow after the last line of the program on the quarterly disk.

  7060 *--------------------------------
  7070 TEST.OP.CODES
  7080        PHA          SAVE OPCODE
  7090        LSR IMM.SIZE      ASSUME 8-BIT IMMEDIATE
  7100        LDX STATUS.PNTR
  7110        CMP #$18     CLC?
  7120        BEQ CLC.OP
  7130        CMP #$38     SEC?
  7140        BEQ SEC.OP
  7150        INY
  7160        CMP #$C2     REP?
  7170        BEQ REP.OP
  7180        CMP #$E2     SEP?
  7190        BEQ SEP.OP
  7200        DEY
  7210        CMP #$08     PHP?
  7220        BEQ PHP.OP
  7230        CMP #$28     PLP?
  7240        BEQ PLP.OP
  7250        CMP #$FB     XCE?
  7260        BEQ XCE.OP
  7270 *--------------------------------
  7280        AND #$1F     ORA, AND, EOR, ADC, BIT, LDA, CMP, SBC?
  7290        CMP #$09
  7300        PHP          SAVE ANSWER
  7310        LDA #$20     ASSUME M-BIT
  7320        PLP          GET PREVIOUS ANSWER
  7330        BEQ .1       IT IS M-BIT
  7340        LSR    (LDA #$10)    USE X-BIT INSTEAD
  7350 .1     AND STATUS.STACK,X
  7360        BNE .2       ...USE 8-BIT IMMEDIATE
  7370        LDA E.BIT
  7380        LSR
  7390        BCS .2       E=1, USE 8-BIT IMMEDIATE
  7400        LDA #$FF     ...USE 16-BIT IMMEDIATE
  7410        STA IMM.SIZE
  7420 .2     PLA          GET OPCODE AGAIN
  7430        RTS
  7440 *--------------------------------
  7450 CLC.OP LDA STATUS.STACK,X
  7460        AND #$FE
  7470 UPDATE.STATUS
  7480        STA STATUS.STACK,X
  7490        PLA
  7500        RTS
  7510 *--------------------------------
  7520 SEC.OP LDA STATUS.STACK,X
  7530        ORA #$01
  7540        BNE UPDATE.STATUS   ...ALWAYS
  7550 *--------------------------------
  7560 REP.OP LDA (PCL),Y     LOOK AT OPERAND
  7570        EOR #$FF
  7580        AND STATUS.STACK,X
  7590        JMP UPDATE.STATUS
  7600 *--------------------------------
  7610 SEP.OP LDA (PCL),Y
  7620        ORA STATUS.STACK,X
  7630        JMP UPDATE.STATUS
  7640 *--------------------------------
  7650 PHP.OP LDA STATUS.STACK,X
  7660        INX
  7670        CPX #8
  7680        BCC PHP.PLP
  7690        LDX #0
  7700 PHP.PLP
  7710        STX STATUS.PNTR
  7720        JMP UPDATE.STATUS
  7730 *--------------------------------
  7740 PLP.OP DEX
  7750        BPL .1
  7760        LDX #7       WRAP AROUND TO TOP OF STACK
  7770 .1     LDA STATUS.STACK,X   (A-REG STILL HELD THE OPCODE)
  7775        JMP PHP.PLP
  7780 *--------------------------------
  7790 XCE.OP LSR E.BIT    GET E-BIT INTO CARRY
  7800        PHP          SAVE IT
  7810        LDA STATUS.STACK,X
  7820        STA E.BIT    NEW E-BIT
  7830        LSR          C-BIT INTO CARRY
  7840        BCC .1       ...NEW E-BIT = 0
  7850        ORA #$18     ...NEW E-BIT=1, SO SET M=X=1
  7860 .1     PLP          GET NEW C-BIT (OLD E-BIT)
  7870        ROL          PUT IT INTO STATUS BYTE
  7880        JMP UPDATE.STATUS
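
The bookkeeping is perhaps easier to follow in a higher-level restatement. The Python class below is a sketch of what TEST.OP.CODES is meant to do, not a translation of it byte for byte: the class and method names are invented, and PLP is modeled as simply stepping the pointer back to the previously saved status. Note that it mirrors the disassembler's simple rules, which are not necessarily what the processor itself does in every case.

```python
class StatusTracker:
    """Tracks E and an 8-deep stack of status bytes (C, X and M bits)."""

    M_BIT, X_BIT = 0x20, 0x10

    def __init__(self):
        self.e = 1                       # 800G leaves E=C=M=X=1
        self.stack = [0xFF] + [0] * 7
        self.ptr = 0

    def step(self, opcode, operand=0):
        """Apply one opcode; return 8 or 16, the immediate size to assume."""
        p = self.stack[self.ptr]
        if opcode == 0x18:                               # CLC
            self.stack[self.ptr] = p & 0xFE
        elif opcode == 0x38:                             # SEC
            self.stack[self.ptr] = p | 0x01
        elif opcode == 0xC2:                             # REP #operand
            self.stack[self.ptr] = p & ~operand & 0xFF
        elif opcode == 0xE2:                             # SEP #operand
            self.stack[self.ptr] = (p | operand) & 0xFF
        elif opcode == 0x08:                             # PHP: copy upward
            self.ptr = (self.ptr + 1) % 8
            self.stack[self.ptr] = p
        elif opcode == 0x28:                             # PLP: step back
            self.ptr = (self.ptr - 1) % 8
        elif opcode == 0xFB:                             # XCE: swap E and C
            new_e = p & 0x01
            if new_e:
                p |= 0x30                                # E=1 forces M=X=1
            self.stack[self.ptr] = (p & 0xFE) | self.e
            self.e = new_e
        else:
            # $09,$29,...,$E9 work on the A-register; others use the X-bit
            bit = self.M_BIT if (opcode & 0x1F) == 0x09 else self.X_BIT
            if not (p & bit) and not self.e:
                return 16
        return 8


tracker = StatusTracker()
for opcode, operand in ((0xA9, 0), (0x18, 0), (0xFB, 0),
                        (0xC2, 0x30), (0xA9, 0), (0xA2, 0)):
    print(f"${opcode:02X} -> {tracker.step(opcode, operand)}-bit")
```

The short run at the end feeds it LDA #, CLC, XCE, REP #$30, LDA #, LDX #; only the last two immediates come out as 16-bit.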

Further notes by Bob Sander-Cederlof:

Thanks, Jim! Your ideas were a big help! In looking back over my work, I noticed some more improvements.

R. F. O'Brien wrote us just this week with the news that he had found two bugs in the disassembler. One was the typing error at line 3980 which Jim noted above. But Robert found a second typo, at line 4960. ">OXB LDX" should be changed to ">OXB CPX". This changes the byte shown in the original article at $9BF from $19 to $0F.
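
The corrected byte values can be confirmed from the >OXA and >OXB macros and the order of the names in the two tables. Here is a hedged Python sketch of that calculation. The function names oxa and oxb are invented, the name lists are copied from OPNAMES.A and OPNAMES.B in the full listing, and the sketch assumes each name occupies two bytes and that the assembler evaluates the macro expressions strictly left to right.

```python
OPNAMES_A = ("ASLA BRK CLC CLD CLI CLV COP DECA DEX DEY INCA INX INY LSRA "
             "NOP PHA PHB PHD PHK PHP PHX PHY PLA PLB PLD PLP PLX PLY ROLA "
             "RORA RTI RTL RTS SEC SED SEI STP TAX TAY TCD TCS TDC TSC TSX "
             "TXA TXS TXY TYA TYX WAI WDM XBA XCE").split()

OPNAMES_B = ("ADC AND ASL BCC BCS BEQ BIT BMI BNE BPL BRA BRL BVC BVS CMP "
             "CPX CPY DEC EOR INC JML JMP JSL JSR LDA LDX LDY LSR MVN MVP "
             "ORA PEA PEI PER REP ROL ROR SBC SEP STA STX STY STZ TRB "
             "TSB").split()


def oxa(name):
    """Byte generated by >OXA: (name - OPNAMES.A) / 2 + 128."""
    return OPNAMES_A.index(name) + 128


def oxb(name):
    """Byte generated by >OXB: (name - OPNAMES.B) / 2."""
    return OPNAMES_B.index(name)


print(f"DEY ${oxa('DEY'):02X}  LDX ${oxb('LDX'):02X}  CPX ${oxb('CPX'):02X}")
```

This gives $89 for DEY, $19 for LDX, and $0F for CPX, in agreement with the corrections.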

I found a way to simplify the >ON macro, which speeds up assembly and shortens the listing. Replace lines 1220-1290 with the following:

  1220        .MA ON
  1280 ]1]2]3]4 .DA ']1-64*32+']2-64*32+']3-64*2
  1290        .EM
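
To see what the one-line .DA expression produces, here is a Python sketch of the same arithmetic. It rests on two assumptions: that the assembler evaluates the expression strictly left to right, and that an apostrophe constant such as 'A has the plain ASCII value with the high bit clear. The names pack_name and unpack_name are made up for the illustration. Only the first three letters go into the word; the optional fourth parameter appears only in the label.

```python
def pack_name(first, second, third):
    """Pack three letters, 5 bits each (A=1 ... Z=26), then double."""
    value = 0
    for letter in (first, second, third):
        value = value * 32 + (ord(letter) - 64)
    return (value * 2) & 0xFFFF


def unpack_name(word):
    """Recover the three letters from a packed word."""
    value = word >> 1
    letters = []
    for _ in range(3):
        letters.append(chr((value & 0x1F) + 64))
        value >>= 5
    return "".join(reversed(letters))


word = pack_name("B", "R", "K")
print(f"${word:04X} {unpack_name(word)}")
```

Under those assumptions BRK packs to $1496, and the round trip gives back the same three letters.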

I also discovered that one kind of Apple monitor ROM did not have the RELADR subroutine, so I re-coded lines 6760-6950. Replace those lines with the following:

  6760 *---8- OR 16-BIT RELATIVE--------
  6770        LDA (PCL),Y  8=OFFSET, 16=OFFSETHI
  6780        DEY          TEST LENGTH
  6790        STY FORMATH  =0 IF 8-BIT
  6800        BEQ .10      ...8-BIT
  6810        STA FORMATH  ...16-BIT
  6820        LDA (PCL),Y  LOW BYTE OF 16-BIT OFFSET
  6830 .10    STA FORMATL
  6840        JSR PCADJ
  6850        CLC
  6860        ADC FORMATL
  6870        TAX
  6880        TYA
  6890        ADC FORMATH
  6900        JMP PRNTAX

One last item. I wrote a test routine to call the disassembler for every possible opcode from 00 to FF. Here it is:

  7890 *--------------------------------
  7900 TT     LDY #0
  7910        LDA #$C0
  7920        STA PCL
  7930        LDA #2       $2C0...$3C2
  7940        STA PCH
  7950 .1     TYA
  7960        STA $2C0,Y
  7970        INY
  7980        BNE .1
  7990        STY $3C0
  8000        INY
  8010        STY $3C1
  8020        INY
  8030        STY $3C2
  8040 .2     JSR INSTDSP
  8050        LDY #0
  8060        LDA (PCL),Y
  8070        CMP #$FF
  8080        BEQ .3
  8090 .4     LDA $C000
  8100        BPL .4
  8110        STA $C010
  8120        INC PCL
  8130        BNE .2
  8140        INC PCH
  8150        BNE .2       ...ALWAYS
  8160 .3     RTS
  8170 *--------------------------------

Here is the entire program with my changes included:

  1      .LIST OFF
  800        .TI 76,65816 DISASSEMBLER.................FEBRUARY 14, 1985...........
  810 *SAVE S.816.DSM.NEW
  820 *--------------------------------
  830        .OR $300
  840        .OP 65816
  850        LDA #3
  860        SEC
  870        LDA #3
  880        CLC
  890        LDA #3
  900        REP #$30
  910        LDA #3
  920        CLC
  930        XCE
  940        LDA ##$EA34
  950        REP #$20
  960        LDA ##$EA34
  970        LDX ##$EA34
  980        REP #$10
  990        LDA ##$EA34
  1000        LDX ##$EA34
  1010        SEP #$30
  1020        .OR $800
  1030 *--------------------------------
  1040 IMM.SIZE .EQ $00
  1050 LMNEM    .EQ $2C
  1060 RMNEM    .EQ $2D
  1070 FORMATL  .EQ $2E
  1080 LENGTH   .EQ $2F
  1090 FORMATH  .EQ $30
  1100 PCL      .EQ $3A
  1110 PCH      .EQ $3B
  1120 *--------------------------------
  1130 SCRN2  .EQ $F879
  1140 PRNTAX .EQ $F941
  1150 PRBLNK .EQ $F948
  1160 PRBL2  .EQ $F94A
  1170 PCADJ  .EQ $F953
  1180 CROUT  .EQ $FD8E
  1190 PRBYTE .EQ $FDDA
  1200 COUT   .EQ $FDED
  1210 *--------------------------------
  1211   .LIST ON
  1220        .MA ON
  1280 ]1]2]3]4 .DA ']1-64*32+']2-64*32+']3-64*2
  1290        .EM
  1291   .LIF
  1300 *--------------------------------
  1310        .MA OXA
  1320        .DA #]1-OPNAMES.A/2+128
  1330        .EM
  1340 *--------------------------------
  1350        .MA OXB
  1360        .DA #]1-OPNAMES.B/2
  1370        .EM
  1380 *--------------------------------
  1390 T      LDA $C083    WRITE-ENABLE RAM COPY OF MONITOR
  1400        LDA $C083
  1410        LDA #INSTDSP      PATCH L-COMMAND TO USE MY
  1420        STA $FE65         DIS-ASSEMBLER
  1430        LDA /INSTDSP
  1440        STA $FE66
  1444   .LIST ON
  1450        LDX #$FF          START WITH E=1
  1454        STX E.BIT
  1458        STX STATUS.STACK  EMPTY THE STATUS STACK
  1462        INX               X=0
  1466        STX STATUS.PNTR
  1470        RTS
  1474 *--------------------------------
  1478 E.BIT         .BS 1
  1482 STATUS.PNTR   .BS 1
  1486 STATUS.STACK  .BS 8
  1488   .LIST OFF
  1490 *--------------------------------
  1500 OPNAMES.A
  1510        >ON A,S,L,A
  1520        >ON B,R,K
  1530        >ON C,L,C
  1540        >ON C,L,D
  1550        >ON C,L,I
  1560        >ON C,L,V
  1570        >ON C,O,P
  1580        >ON D,E,C,A
  1590        >ON D,E,X
  1600        >ON D,E,Y
  1610        >ON I,N,C,A
  1620        >ON I,N,X
  1630        >ON I,N,Y
  1640        >ON L,S,R,A
  1650        >ON N,O,P
  1660        >ON P,H,A
  1670        >ON P,H,B
  1680        >ON P,H,D
  1690        >ON P,H,K
  1700        >ON P,H,P
  1710        >ON P,H,X
  1720        >ON P,H,Y
  1730        >ON P,L,A
  1740        >ON P,L,B
  1750        >ON P,L,D
  1760        >ON P,L,P
  1770        >ON P,L,X
  1780        >ON P,L,Y
  1790        >ON R,O,L,A
  1800        >ON R,O,R,A
  1810        >ON R,T,I
  1820        >ON R,T,L
  1830        >ON R,T,S
  1840        >ON S,E,C
  1850        >ON S,E,D
  1860        >ON S,E,I
  1870        >ON S,T,P
  1880        >ON T,A,X
  1890        >ON T,A,Y
  1900        >ON T,C,D
  1910        >ON T,C,S
  1920        >ON T,D,C
  1930        >ON T,S,C
  1940        >ON T,S,X
  1950        >ON T,X,A
  1960        >ON T,X,S
  1970        >ON T,X,Y
  1980        >ON T,Y,A
  1990        >ON T,Y,X
  2000        >ON W,A,I
  2010        >ON W,D,M
  2020        >ON X,B,A
  2030        >ON X,C,E
  2040 *--------------------------------
  2050 OPNAMES.B
  2060        >ON A,D,C
  2070        >ON A,N,D
  2080        >ON A,S,L
  2090        >ON B,C,C
  2100        >ON B,C,S
  2110        >ON B,E,Q
  2120        >ON B,I,T
  2130        >ON B,M,I
  2140        >ON B,N,E
  2150        >ON B,P,L
  2160        >ON B,R,A
  2170        >ON B,R,L
  2180        >ON B,V,C
  2190        >ON B,V,S
  2200        >ON C,M,P
  2210        >ON C,P,X
  2220        >ON C,P,Y
  2230        >ON D,E,C
  2240        >ON E,O,R
  2250        >ON I,N,C
  2260        >ON J,M,L
  2270        >ON J,M,P
  2280        >ON J,S,L
  2290        >ON J,S,R
  2300        >ON L,D,A
  2310        >ON L,D,X
  2320        >ON L,D,Y
  2330        >ON L,S,R
  2340        >ON M,V,N
  2350        >ON M,V,P
  2360        >ON O,R,A
  2370        >ON P,E,A
  2380        >ON P,E,I
  2390        >ON P,E,R
  2400        >ON R,E,P
  2410        >ON R,O,L
  2420        >ON R,O,R
  2430        >ON S,B,C
  2440        >ON S,E,P
  2450        >ON S,T,A
  2460        >ON S,T,X
  2470        >ON S,T,Y
  2480        >ON S,T,Z
  2490        >ON T,R,B
  2500        >ON T,S,B
  2510 *--------------------------------
  2520 OPINDEX
  2530 *---0X---------------------------
  2540        >OXA BRK
  2550        >OXB ORA
  2560        >OXA COP
  2570        >OXB ORA
  2580        >OXB TSB
  2590        >OXB ORA
  2600        >OXB ASL
  2610        >OXB ORA
  2620        >OXA PHP
  2630        >OXB ORA
  2640        >OXA ASLA
  2650        >OXA PHD
  2660        >OXB TSB
  2670        >OXB ORA
  2680        >OXB ASL
  2690        >OXB ORA
  2700 *---1X---------------------------
  2710        >OXB BPL
  2720        >OXB ORA
  2730        >OXB ORA
  2740        >OXB ORA
  2750        >OXB TRB
  2760        >OXB ORA
  2770        >OXB ASL
  2780        >OXB ORA
  2790        >OXA CLC
  2800        >OXB ORA
  2810        >OXA INCA
  2820        >OXA TCS
  2830        >OXB TRB
  2840        >OXB ORA
  2850        >OXB ASL
  2860        >OXB ORA
  2870 *---2X---------------------------
  2880        >OXB JSR
  2890        >OXB AND
  2900        >OXB JSL
  2910        >OXB AND
  2920        >OXB BIT
  2930        >OXB AND
  2940        >OXB ROL
  2950        >OXB AND
  2960        >OXA PLP
  2970        >OXB AND
  2980        >OXA ROLA
  2990        >OXA PLD
  3000        >OXB BIT
  3010        >OXB AND
  3020        >OXB ROL
  3030        >OXB AND
  3040 *---3X---------------------------
  3050        >OXB BMI
  3060        >OXB AND
  3070        >OXB AND
  3080        >OXB AND
  3090        >OXB BIT
  3100        >OXB AND
  3110        >OXB ROL
  3120        >OXB AND
  3130        >OXA SEC
  3140        >OXB AND
  3150        >OXA DECA
  3160        >OXA TSC
  3170        >OXB BIT
  3180        >OXB AND
  3190        >OXB ROL
  3200        >OXB AND
  3210 *---4X---------------------------
  3220        >OXA RTI
  3230        >OXB EOR
  3240        >OXA WDM
  3250        >OXB EOR
  3260        >OXB MVP
  3270        >OXB EOR
  3280        >OXB LSR
  3290        >OXB EOR
  3300        >OXA PHA
  3310        >OXB EOR
  3320        >OXA LSRA
  3330        >OXA PHK
  3340        >OXB JMP
  3350        >OXB EOR
  3360        >OXB LSR
  3370        >OXB EOR
  3380 *---5X---------------------------
  3390        >OXB BVC
  3400        >OXB EOR
  3410        >OXB EOR
  3420        >OXB EOR
  3430        >OXB MVN
  3440        >OXB EOR
  3450        >OXB LSR
  3460        >OXB EOR
  3470        >OXA CLI
  3480        >OXB EOR
  3490        >OXA PHY
  3500        >OXA TCD
  3510        >OXB JMP
  3520        >OXB EOR
  3530        >OXB LSR
  3540        >OXB EOR
  3550 *---6X---------------------------
  3560        >OXA RTS
  3570        >OXB ADC
  3580        >OXB PER
  3590        >OXB ADC
  3600        >OXB STZ
  3610        >OXB ADC
  3620        >OXB ROR
  3630        >OXB ADC
  3640        >OXA PLA
  3650        >OXB ADC
  3660        >OXA RORA
  3670        >OXA RTL
  3680        >OXB JMP
  3690        >OXB ADC
  3700        >OXB ROR
  3710        >OXB ADC
  3720 *---7X---------------------------
  3730        >OXB BVS
  3740        >OXB ADC
  3750        >OXB ADC
  3760        >OXB ADC
  3770        >OXB STZ
  3780        >OXB ADC
  3790        >OXB ROR
  3800        >OXB ADC
  3810        >OXA SEI
  3820        >OXB ADC
  3830        >OXA PLY
  3840        >OXA TDC
  3850        >OXB JMP
  3860        >OXB ADC
  3870        >OXB ROR
  3880        >OXB ADC
  3890 *---8X---------------------------
  3900        >OXB BRA
  3910        >OXB STA
  3920        >OXB BRL
  3930        >OXB STA
  3940        >OXB STY
  3950        >OXB STA
  3960        >OXB STX
  3970        >OXB STA
  3976 *>>>CHANGED FROM "TAY"
  3977  .LIST ON
  3980        >OXA DEY
  3987  .LIST OFF
  3990        >OXB BIT
  4000        >OXA TXA
  4010        >OXA PHB
  4020        >OXB STY
  4030        >OXB STA
  4040        >OXB STX
  4050        >OXB STA
  4060 *---9X---------------------------
  4070        >OXB BCC
  4080        >OXB STA
  4090        >OXB STA
  4100        >OXB STA
  4110        >OXB STY
  4120        >OXB STA
  4130        >OXB STX
  4140        >OXB STA
  4150        >OXA TYA
  4160        >OXB STA
  4170        >OXA TXS
  4180        >OXA TXY
  4190        >OXB STZ
  4200        >OXB STA
  4210        >OXB STZ
  4220        >OXB STA
  4230 *---AX---------------------------
  4240        >OXB LDY
  4250        >OXB LDA
  4260        >OXB LDX
  4270        >OXB LDA
  4280        >OXB LDY
  4290        >OXB LDA
  4300        >OXB LDX
  4310        >OXB LDA
  4320        >OXA TAY
  4330        >OXB LDA
  4340        >OXA TAX
  4350        >OXA PLB
  4360        >OXB LDY
  4370        >OXB LDA
  4380        >OXB LDX
  4390        >OXB LDA
  4400 *---BX---------------------------
  4410        >OXB BCS
  4420        >OXB LDA
  4430        >OXB LDA
  4440        >OXB LDA
  4450        >OXB LDY
  4460        >OXB LDA
  4470        >OXB LDX
  4480        >OXB LDA
  4490        >OXA CLV
  4500        >OXB LDA
  4510        >OXA TSX
  4520        >OXA TYX
  4530        >OXB LDY
  4540        >OXB LDA
  4550        >OXB LDX
  4560        >OXB LDA
  4570 *---CX---------------------------
  4580        >OXB CPY
  4590        >OXB CMP
  4600        >OXB REP
  4610        >OXB CMP
  4620        >OXB CPY
  4630        >OXB CMP
  4640        >OXB DEC
  4650        >OXB CMP
  4660        >OXA INY
  4670        >OXB CMP
  4680        >OXA DEX
  4690        >OXA WAI
  4700        >OXB CPY
  4710        >OXB CMP
  4720        >OXB DEC
  4730        >OXB CMP
  4740 *---DX---------------------------
  4750        >OXB BNE
  4760        >OXB CMP
  4770        >OXB CMP
  4780        >OXB CMP
  4790        >OXB PEI
  4800        >OXB CMP
  4810        >OXB DEC
  4820        >OXB CMP
  4830        >OXA CLD
  4840        >OXB CMP
  4850        >OXA PHX
  4860        >OXA STP
  4870        >OXB JML
  4880        >OXB CMP
  4890        >OXB DEC
  4900        >OXB CMP
  4910 *---EX---------------------------
  4920        >OXB CPX
  4930        >OXB SBC
  4940        >OXB SEP
  4950        >OXB SBC
  4960        >OXB CPX
  4970        >OXB SBC
  4980        >OXB INC
  4990        >OXB SBC
  5000        >OXA INX
  5010        >OXB SBC
  5020        >OXA NOP
  5030        >OXA XBA
  5040        >OXB CPX
  5050        >OXB SBC
  5060        >OXB INC
  5070        >OXB SBC
  5080 *---FX---------------------------
  5090        >OXB BEQ
  5100        >OXB SBC
  5110        >OXB SBC
  5120        >OXB SBC
  5130        >OXB PEA
  5140        >OXB SBC
  5150        >OXB INC
  5160        >OXB SBC
  5170        >OXA SED
  5180        >OXB SBC
  5190        >OXA PLX
  5200        >OXA XCE
  5210        >OXB JSR
  5220        >OXB SBC
  5230        >OXB INC
  5240        >OXB SBC
  5250 *--------------------------------
  5260 OPFORMAT
  5270 F.0    .HS 00.14.00.1C.02.02.02.20.00.00.00.00.04.04.04.06
  5280 F.1    .HS 26.16.12.1E.02.08.08.22.00.10.00.00.04.0A.0A.0C
  5290 F.2    .HS 04.14.06.1C.02.02.02.20.00.00.00.00.04.04.04.06
  5300 F.3    .HS 26.16.12.1E.08.08.08.22.00.10.00.00.0A.0A.0A.0C
  5310 F.4    .HS 00.14.00.1C.24.02.02.20.00.00.00.00.04.04.04.06
  5320 F.5    .HS 26.16.12.1E.24.08.08.22.00.10.00.00.06.0A.0A.0C
  5330 F.6    .HS 00.14.28.1C.02.02.02.20.00.00.00.00.18.04.04.06
  5340 F.7    .HS 26.16.12.1E.08.08.08.22.00.10.00.00.1A.0A.0A.0C
  5350 F.8    .HS 26.14.28.1C.02.02.02.20.00.00.00.00.04.04.04.06
  5360 F.9    .HS 26.16.12.1E.08.08.0E.22.00.10.00.00.04.0A.0A.0C
  5370 F.A    .HS 00.14.00.1C.02.02.02.20.00.00.00.00.04.04.04.06
  5380 F.B    .HS 26.16.12.1E.08.08.0E.22.00.10.00.00.0A.0A.10.0C
  5390 F.C    .HS 00.14.00.1C.02.02.02.20.00.00.00.00.04.04.04.06
  5400 F.D    .HS 26.16.12.1E.02.08.08.22.00.10.00.00.18.0A.0A.0C
  5410 F.E    .HS 00.14.00.1C.02.02.02.20.00.00.00.00.04.04.04.06
  5420 F.F    .HS 26.16.12.1E.04.08.08.22.00.10.00.00.1A.0A.0A.0C
  5430 *--------------------------------
  5440 FMTBL
  5450 *-----# > ( $ , X S ) , Y $ - - - LL
  5460  .DA %1.0.0.1.0.0.0.0.0.0.0.0.0.0.01 -- IMMEDIATE    00
  5470  .DA %0.0.0.1.0.0.0.0.0.0.0.0.0.0.01 -- DIRECT       02
  5480  .DA %0.0.0.1.0.0.0.0.0.0.0.0.0.0.10 -- ABS          04
  5490  .DA %0.0.0.1.0.0.0.0.0.0.0.0.0.0.11 -- LONG         06
  5500 *-----# > ( $ , X S ) , Y $ - - - LL
  5510  .DA %0.0.0.1.1.1.0.0.0.0.0.0.0.0.01 -- DIRECT,X     08
  5520  .DA %0.0.0.1.1.1.0.0.0.0.0.0.0.0.10 -- ABS,X        0A
  5530  .DA %0.0.0.1.1.1.0.0.0.0.0.0.0.0.11 -- LONG,X       0C
  5540 *-----# > ( $ , X S ) , Y $ - - - LL
  5550  .DA %0.0.0.1.1.0.0.0.0.1.0.0.0.0.01 -- DIRECT,Y     0E
  5560  .DA %0.0.0.1.1.0.0.0.0.1.0.0.0.0.10 -- ABS,Y        10
  5570 *-----# > ( $ , X S ) , Y $ - - - LL
  5580  .DA %0.0.1.1.0.0.0.1.0.0.0.0.0.0.01 -- IND          12
  5590  .DA %0.0.1.1.1.1.0.1.0.0.0.0.0.0.01 -- INDX         14
  5600  .DA %0.0.1.1.0.0.0.1.1.1.0.0.0.0.01 -- INDY         16
  5610 *-----# > ( $ , X S ) , Y $ - - - LL
  5620  .DA %0.0.1.1.0.0.0.1.0.0.0.0.0.0.10 -- INDABS       18
  5630  .DA %0.0.1.1.1.1.0.1.0.0.0.0.0.0.10 -- INDABSX      1A
  5640 *-----# > ( $ , X S ) , Y $ - - - LL
  5650  .DA %0.0.0.1.1.0.1.0.0.0.0.0.0.0.01 -- STK          1C
  5660  .DA %0.0.1.1.1.0.1.1.1.1.0.0.0.0.01 -- STKY         1E
  5670 *-----# > ( $ , X S ) , Y $ - - - LL
  5680  .DA %0.1.1.1.0.0.0.1.0.0.0.0.0.0.01 -- INDLONG      20
  5690  .DA %0.1.1.1.0.0.0.1.1.1.0.0.0.0.01 -- INDLONGY     22
  5700  .DA %0.0.0.1.0.0.0.0.1.0.1.0.0.0.10 -- MVN & MVP    24
  5710  .DA %0.0.0.0.0.0.0.0.0.0.1.0.0.0.01 -- RELATIVE     26
  5720  .DA %0.0.0.0.0.0.0.0.0.0.1.0.0.0.10 -- LONG RELA.   28
  5730 *--------------------------------
  5740 FMTSTR .AS -/$Y,)SX,$(>#/
  5750 *--------------------------------
  5760 INSDS1 JSR CROUT
  5770        LDA PCH
  5780        JSR PRBYTE
  5790        LDA PCL
  5800        JSR PRBYTE
  5810        LDA #"-"
  5820        JSR COUT
  5830        LDA #" "
  5840        JSR COUT
  5850        LDY #0
  5860        LDA (PCL),Y  GET OPCODE
  5863  .LIST ON
  5864 *>>>INSERT LINE HERE
  5865        JSR TEST.OP.CODES         <<<>>>
  5866   .LIF
  5870 INSDS2 TAY          SAVE IN Y-REG
  5880        LDA OPINDEX,Y
  5890        ASL
  5900        TAX
  5910        BCC .1       ...NOT SINGLE BYTE OPCODE
  5920        LDA OPNAMES.A,X
  5930        STA RMNEM
  5940        LDA OPNAMES.A+1,X
  5950        STA LMNEM
  5960        LDA #0
  5970        STA LENGTH
  5980        RTS
  5990 *--------------------------------
  6000 .1     LDA OPNAMES.B,X
  6010        STA RMNEM
  6020        LDA OPNAMES.B+1,X
  6030        STA LMNEM
  6040        LDX OPFORMAT,Y
  6050        LDA FMTBL+1,X
  6060        STA FORMATH
  6070        LDA FMTBL,X
  6080        STA FORMATL
  6090        AND #3
  6100        STA LENGTH
  6110        TXA          CHECK IF IMMEDIATE
  6120        BNE .2       ...NO
  6130        BIT IMM.SIZE CHECK IF 16-BIT MODE
  6140        BPL .2       ...NO
  6150        INC LENGTH   ...YES
  6160 .2     RTS
  6170 *--------------------------------
  6180 INSTDSP
  6190        JSR INSDS1
  6200        LDY #0       PRINT BYTES OF OPCODE & OPERAND
  6210 .1     LDA (PCL),Y
  6220        JSR PRBYTE
  6230        LDX #1       PRINT 1 BLANK
  6240 .2     JSR PRBL2
  6250        CPY LENGTH
  6260        INY
  6270        BCC .1
  6280        LDX #3
  6290        CPY #4
  6300        BCC .2
  6310 *---PRINT MNEMONIC---------------
  6320        LDY #3       THREE LETTERS
  6330 .3     LDA #6       SHIFT OUT ONE LETTER, TOP BITS 11
  6340 .4     ASL RMNEM
  6350        ROL LMNEM
  6360        ROL
  6370        BPL .4       ...NOT ENUF BITS YET
  6380        JSR COUT     PRINT THE LETTER
  6390        DEY
  6400        BNE .3       ...MORE LETTERS
  6410        LDY LENGTH
  6420        BEQ .8       ...SINGLE BYTE OPCODE
  6430        LDA FORMATL
  6440        AND #$20     SEE IF SPECIAL
  6450        BNE .9       ...YES, MOVES OR RELATIVES
  6460 *---PRINT NORMAL OPERANDS--------
  6470        LDA #" "
  6480        JSR COUT
  6490        LDX #10      11 FORMAT BITS
  6500 .5     ASL FORMATL
  6510        ROL FORMATH
  6520        BCC .7
  6530        LDA FMTSTR,X
  6540        JSR COUT
  6550        CMP #"#"
  6560        BNE .55
  6570        BIT IMM.SIZE
  6580        BPL .7
  6590        JSR COUT
  6600 .55    CMP #"$"
  6610        BNE .7
  6620 .6     LDA (PCL),Y
  6630        JSR PRBYTE
  6640        DEY
  6650        BNE .6
  6660 .7     DEX
  6670        BPL .5
  6680 .8     RTS
  6690 *---SPECIAL CASES----------------
  6700 .9     LDA #" "
  6710        JSR COUT
  6720        LDA #"$"
  6730        JSR COUT
  6740        LDA FORMATL
  6750        BMI .11      MVN & MVP
  6755  .LIST ON
  6760 *---8- OR 16-BIT RELATIVE--------
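  6770        LDA (PCL),Y  8=OFFSET, 16=OFFSETHI
  6780        DEY          TEST LENGTH
  6790        STY FORMATH  =0 IF 8-BIT
  6800        BNE .95      ...16-BIT
  6802        TAX          8-BIT: TEST SIGN OF OFFSET
  6804        BPL .10      ...FORWARD, HIGH BYTE STAYS 0
  6806        DEC FORMATH  ...BACKWARD, HIGH BYTE = $FF
  6808        BMI .10      ...ALWAYS
  6810 .95    STA FORMATH  ...16-BIT
  6820        LDA (PCL),Y  LOW BYTE OF 16-BIT OFFSET
  6830 .10    STA FORMATL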
  6840        JSR PCADJ
  6850        CLC
  6860        ADC FORMATL
  6870        TAX
  6880        TYA
  6890        ADC FORMATH
  6900        JMP PRNTAX
  6905   .LIST OFF
  6960 *---MVN & MVP--------------------
  6970 .11    LDA (PCL),Y
  6980        JSR PRBYTE
  6990        LDA #","
  7000        JSR COUT
  7010        LDA #"$"
  7020        JSR COUT
  7030        DEY
  7040        LDA (PCL),Y
  7050        JMP PRBYTE
  7055   .LIST ON
  7060 *--------------------------------
  7070 TEST.OP.CODES
  7080        PHA          SAVE OPCODE
  7090        LSR IMM.SIZE      ASSUME 8-BIT IMMEDIATE
  7100        LDX STATUS.PNTR
  7110        CMP #$18     CLC?
  7120        BEQ CLC.OP
  7130        CMP #$38     SEC?
  7140        BEQ SEC.OP
  7150        INY
  7160        CMP #$C2     REP?
  7170        BEQ REP.OP
  7180        CMP #$E2     SEP?
  7190        BEQ SEP.OP
  7200        DEY
  7210        CMP #$08     PHP?
  7220        BEQ PHP.OP
  7230        CMP #$28     PLP?
  7240        BEQ PLP.OP
  7250        CMP #$FB     XCE?
  7260        BEQ XCE.OP
  7270 *--------------------------------
  7280        AND #$1F     ORA, AND, EOR, ADC, BIT, LDA, CMP, SBC?
  7290        CMP #$09
  7300        PHP          SAVE ANSWER
  7310        LDA #$20     ASSUME M-BIT
  7320        PLP          GET PREVIOUS ANSWER
  7330        BEQ .1       IT IS M-BIT
  7340        LSR    (LDA #$10)    USE X-BIT INSTEAD
  7350 .1     AND STATUS.STACK,X
  7360        BNE .2       ...USE 8-BIT IMMEDIATE
  7370        LDA E.BIT
  7380        LSR
  7390        BCS .2       E=1, USE 8-BIT IMMEDIATE
  7400        LDA #$FF     ...USE 16-BIT IMMEDIATE
  7410        STA IMM.SIZE
  7420 .2     PLA          GET OPCODE AGAIN
  7430        RTS
  7440 *--------------------------------
  7450 CLC.OP LDA STATUS.STACK,X
  7460        AND #$FE
  7470 UPDATE.STATUS
  7480        STA STATUS.STACK,X
  7490        PLA
  7500        RTS
  7510 *--------------------------------
  7520 SEC.OP LDA STATUS.STACK,X
  7530        ORA #$01
  7540        BNE UPDATE.STATUS   ...ALWAYS
  7550 *--------------------------------
  7560 REP.OP LDA (PCL),Y     LOOK AT OPERAND
  7570        EOR #$FF
  7580        AND STATUS.STACK,X
  7590        JMP UPDATE.STATUS
  7600 *--------------------------------
  7610 SEP.OP LDA (PCL),Y
  7620        ORA STATUS.STACK,X
  7630        JMP UPDATE.STATUS
  7640 *--------------------------------
  7650 PHP.OP LDA STATUS.STACK,X
  7660        INX
  7670        CPX #8
  7680        BCC PHP.PLP
  7690        LDX #0
  7700 PHP.PLP
  7710        STX STATUS.PNTR
  7720        JMP UPDATE.STATUS
  7730 *--------------------------------
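  7740 PLP.OP DEX          BACK UP TO PREVIOUS ENTRY
  7750        BPL .1
  7760        LDX #7       ...WRAP AROUND
  7770 .1     STX STATUS.PNTR   SAVED STATUS STAYS AS IS
  7774        PLA          GET OPCODE AGAIN
  7778        RTS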
  7780 *--------------------------------
  7790 XCE.OP LSR E.BIT    GET E-BIT INTO CARRY
  7800        PHP          SAVE IT
  7810        LDA STATUS.STACK,X
  7820        STA E.BIT    NEW E-BIT
  7830        LSR          C-BIT INTO CARRY
  7840        BCC .1       ...NEW E-BIT = 0
  7850        ORA #$18     ...NEW E-BIT=1, SO SET M=X=1
  7860 .1     PLP          GET NEW C-BIT (OLD E-BIT)
  7870        ROL          PUT IT INTO STATUS BYTE
  7880        JMP UPDATE.STATUS
  7890 *--------------------------------
  7900 TT     LDY #0
  7910        LDA #$C0
  7920        STA PCL
  7930        LDA #2       $2C0...$3C2
  7940        STA PCH
  7950 .1     TYA
  7960        STA $2C0,Y
  7970        INY
  7980        BNE .1
  7990        STY $3C0
  8000        INY
  8010        STY $3C1
  8020        INY
  8030        STY $3C2
  8040 .2     JSR INSTDSP
  8050        LDY #0
  8060        LDA (PCL),Y
  8070        CMP #$FF
  8080        BEQ .3
  8090 .4     LDA $C000
  8100        BPL .4
  8110        STA $C010
  8120        INC PCL
  8130        BNE .2
  8140        INC PCH
  8150        BNE .2       ...ALWAYS
  8160 .3     RTS
  8170 *--------------------------------
  8180    .LIF
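
The bookkeeping in TEST.OP.CODES may be easier to follow in a higher-level form. What follows is only a Python sketch of the CLC, SEC, REP, SEP and XCE handling and of the 8-bit versus 16-bit immediate decision. It is not part of the disassembler: the class and method names are invented for illustration, and the PHP/PLP status stack is deliberately left out.

```python
M_BIT = 0x20  # accumulator/memory width flag in the status byte
X_BIT = 0x10  # index register width flag
C_BIT = 0x01  # carry flag


class ModeTracker:
    """Hypothetical model of the status tracking in TEST.OP.CODES."""

    def __init__(self, status=0x30, e_bit=1):
        self.status = status  # tracked copy of the 65816 status register
        self.e_bit = e_bit    # tracked emulation bit (1 = 6502 emulation)

    def step(self, opcode, operand=0):
        """Update the tracked state for one opcode.

        Returns True if an immediate operand of this opcode would be
        16 bits wide, False otherwise.
        """
        if opcode == 0x18:                 # CLC
            self.status &= ~C_BIT & 0xFF
        elif opcode == 0x38:               # SEC
            self.status |= C_BIT
        elif opcode == 0xC2:               # REP clears the named bits
            self.status &= ~operand & 0xFF
        elif opcode == 0xE2:               # SEP sets the named bits
            self.status |= operand
        elif opcode in (0x08, 0x28):       # PHP/PLP: not modelled here
            pass
        elif opcode == 0xFB:               # XCE swaps carry and E
            old_e = self.e_bit
            self.e_bit = self.status & C_BIT
            if self.e_bit:                 # entering emulation forces M=X=1
                self.status |= M_BIT | X_BIT
            self.status = (self.status & ~C_BIT & 0xFF) | old_e
        else:
            # Opcodes ending in %01001 are the accumulator immediates
            # (ORA, AND, EOR, ADC, BIT, LDA, CMP, SBC); anything else is
            # judged by the X bit, as in the listing.
            flag = M_BIT if (opcode & 0x1F) == 0x09 else X_BIT
            return self.e_bit == 0 and (self.status & flag) == 0
        return False
```

Feeding it CLC, XCE, REP #$20 and then LDA # (opcodes $18, $FB, $C2 with operand $20, $A9) makes the last call report a 16-bit immediate, while LDX # ($A2) still reports 8 bits because the X bit has not been cleared.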

Fastest 6502 Multiplication Yet Charles Putney
Shankill, Dublin, Ireland

Here is an 8x8 multiply routine that will blow your socks off! The maximum time, including both a calling JSR and a returning RTS, is only 66 cycles! The minimum is 60 cycles, and most factors will multiply in 63 cycles. Recall that the fastest time in Bob S-C's January 1986 AAL article for a 6502 was 132 cycles. My new one is twice as fast!

As with most fast routines, there is a trade-off in memory space: my program uses 1024 bytes of lookup tables. That isn't so bad if you really need or want a 2:1 speed advantage.

My routine is based on the fact that:

       4 * X * Y = (X+Y)^2 - (X-Y)^2

I got this idea from an article in EDN Magazine by Arch D. Robison (October 13, 1983, pages 263-4). His routine used the fact that:

       2 * X * Y = X^2 + Y^2 - (X-Y)^2

Robison's method requires three dips into the lookup tables. Recast to use the same parameter-passing convention as my routine, it takes either 74 or 77 cycles, counting the calling JSR and the returning RTS. Here is my rendition of his method:

  1000 *SAVE ROBISONS.8X8
  1010 *--------------------------------
  1020 *   MODIFIED FROM ORIGINAL PROGRAM
  1030 *   BY ARCH D. ROBISON, BURROUGHS CORP.
  1040 *      EDN, OCTOBER 13, 1983.
  1050 *--------------------------------
  1060 *   ENTER WITH (A)=MULTIPLIER #1
  1070 *              (X)=MULTIPLIER #2
  1080 *   EXIT WITH (A)=PRODUCT HI BYTE
  1090 *             (X)=PRODUCT LO BYTE
  1100 *--------------------------------
  1110 PROD   .EQ $06      PRODUCT TEMP OF M1*M2 (LOW BYTE)
  1120 M2     .EQ $07      TEMP FOR M2 SAVE
  1130 *--------------------------------
  1140 MULT8  TAY          SAVE M1 IN Y
  1150        STX M2       SAVE M2
  1160        AND M2       CHECK IF BOTH FACTORS ARE ODD
  1170        LSR          SET CARRY <--> BOTH ODD
  1180        LDA SQL,X    ADD (X*X)/2 AND (Y*Y)/2
  1190        ADC SQL,Y
  1200        STA PROD     SAVE LO BYTE OF PRODUCT
  1210        LDA SQH,X
  1220        ADC SQH,Y
  1230        TAX          SAVE HI BYTE OF PRODUCT
  1240        TYA          GET M1 BACK
  1250        SEC
  1260        SBC M2       FIND M1 - M2
  1270        BCS .1       M1 >= M2, CONTINUE
  1280        SBC #0       M1 < M2, FORM 2'S COMPLEMENT
  1290        EOR #$FF
  1300 .1     TAY          USE ABS(M1-M2) AS INDEX
  1310        LDA PROD        TO FIND SQUARE/2 IN TABLE
  1320        SBC SQL,Y    NOW SUBTRACT (X-Y)*(X-Y)
  1330        STA PROD     SAVE LO BYTE OF RESULT
  1340        TXA          HI BYTE FROM PREVIOUS SUM
  1350        SBC SQH,Y
  1360        LDX PROD     LO BYTE OF FINAL PRODUCT
  1370        RTS
  1380 *--------------------------------
  1390        .OR $900     PAGE BOUNDARY TO SAVE MAX 6 CYCLES
  1400 *--------------------------------
  1410 SQL
  1420  .DA #0,#0,#2,#4,#8,#12,#18,#24
  1430  .DA #32,#40,#50,#60,#72,#84,#98,#112
  1440  .DA #128,#144,#162,#180,#200,#220,#242,#264
  1450  .DA #288,#312,#338,#364,#392,#420,#450,#480
  1460  .DA #512,#544,#578,#612,#648,#684,#722,#760
  1470  .DA #800,#840,#882,#924,#968,#1012,#1058,#1104
  1480  .DA #1152,#1200,#1250,#1300,#1352,#1404,#1458,#1512
  1490  .DA #1568,#1624,#1682,#1740,#1800,#1860,#1922,#1984
  1500  .DA #2048,#2112,#2178,#2244,#2312,#2380,#2450,#2520
  1510  .DA #2592,#2664,#2738,#2812,#2888,#2964,#3042,#3120
  1520  .DA #3200,#3280,#3362,#3444,#3528,#3612,#3698,#3784
  1530  .DA #3872,#3960,#4050,#4140,#4232,#4324,#4418,#4512
  1540  .DA #4608,#4704,#4802,#4900,#5000,#5100,#5202,#5304
  1550  .DA #5408,#5512,#5618,#5724,#5832,#5940,#6050,#6160
  1560  .DA #6272,#6384,#6498,#6612,#6728,#6844,#6962,#7080
  1570  .DA #7200,#7320,#7442,#7564,#7688,#7812,#7938,#8064
  1580  .DA #8192,#8320,#8450,#8580,#8712,#8844,#8978,#9112
  1590  .DA #9248,#9384,#9522,#9660,#9800,#9940,#10082,#10224
  1600  .DA #10368,#10512,#10658,#10804,#10952,#11100,#11250,#11400
  1610  .DA #11552,#11704,#11858,#12012,#12168,#12324,#12482,#12640
  1620  .DA #12800,#12960,#13122,#13284,#13448,#13612,#13778,#13944
  1630  .DA #14112,#14280,#14450,#14620,#14792,#14964,#15138,#15312
  1640  .DA #15488,#15664,#15842,#16020,#16200,#16380,#16562,#16744
  1650  .DA #16928,#17112,#17298,#17484,#17672,#17860,#18050,#18240
  1660  .DA #18432,#18624,#18818,#19012,#19208,#19404,#19602,#19800
  1670  .DA #20000,#20200,#20402,#20604,#20808,#21012,#21218,#21424
  1680  .DA #21632,#21840,#22050,#22260,#22472,#22684,#22898,#23112
  1690  .DA #23328,#23544,#23762,#23980,#24200,#24420,#24642,#24864
  1700  .DA #25088,#25312,#25538,#25764,#25992,#26220,#26450,#26680
  1710  .DA #26912,#27144,#27378,#27612,#27848,#28084,#28322,#28560
  1720  .DA #28800,#29040,#29282,#29524,#29768,#30012,#30258,#30504
  1730  .DA #30752,#31000,#31250,#31500,#31752,#32004,#32258,#32512
  1740 SQH
  1750  .DA /0,/0,/2,/4,/8,/12,/18,/24
  1760  .DA /32,/40,/50,/60,/72,/84,/98,/112
  1770  .DA /128,/144,/162,/180,/200,/220,/242,/264
  1780  .DA /288,/312,/338,/364,/392,/420,/450,/480
  1790  .DA /512,/544,/578,/612,/648,/684,/722,/760
  1800  .DA /800,/840,/882,/924,/968,/1012,/1058,/1104
  1810  .DA /1152,/1200,/1250,/1300,/1352,/1404,/1458,/1512
  1820  .DA /1568,/1624,/1682,/1740,/1800,/1860,/1922,/1984
  1830  .DA /2048,/2112,/2178,/2244,/2312,/2380,/2450,/2520
  1840  .DA /2592,/2664,/2738,/2812,/2888,/2964,/3042,/3120
  1850  .DA /3200,/3280,/3362,/3444,/3528,/3612,/3698,/3784
  1860  .DA /3872,/3960,/4050,/4140,/4232,/4324,/4418,/4512
  1870  .DA /4608,/4704,/4802,/4900,/5000,/5100,/5202,/5304
  1880  .DA /5408,/5512,/5618,/5724,/5832,/5940,/6050,/6160
  1890  .DA /6272,/6384,/6498,/6612,/6728,/6844,/6962,/7080
  1900  .DA /7200,/7320,/7442,/7564,/7688,/7812,/7938,/8064
  1910  .DA /8192,/8320,/8450,/8580,/8712,/8844,/8978,/9112
  1920  .DA /9248,/9384,/9522,/9660,/9800,/9940,/10082,/10224
  1930  .DA /10368,/10512,/10658,/10804,/10952,/11100,/11250,/11400
  1940  .DA /11552,/11704,/11858,/12012,/12168,/12324,/12482,/12640
  1950  .DA /12800,/12960,/13122,/13284,/13448,/13612,/13778,/13944
  1960  .DA /14112,/14280,/14450,/14620,/14792,/14964,/15138,/15312
  1970  .DA /15488,/15664,/15842,/16020,/16200,/16380,/16562,/16744
  1980  .DA /16928,/17112,/17298,/17484,/17672,/17860,/18050,/18240
  1990  .DA /18432,/18624,/18818,/19012,/19208,/19404,/19602,/19800
  2000  .DA /20000,/20200,/20402,/20604,/20808,/21012,/21218,/21424
  2010  .DA /21632,/21840,/22050,/22260,/22472,/22684,/22898,/23112
  2020  .DA /23328,/23544,/23762,/23980,/24200,/24420,/24642,/24864
  2030  .DA /25088,/25312,/25538,/25764,/25992,/26220,/26450,/26680
  2040  .DA /26912,/27144,/27378,/27612,/27848,/28084,/28322,/28560
  2050  .DA /28800,/29040,/29282,/29524,/29768,/30012,/30258,/30504
  2060  .DA /30752,/31000,/31250,/31500,/31752,/32004,/32258,/32512

The entries in the two tables (SQL and SQH) are the squares of the numbers from 0 to 255, divided by two. The low bytes are in the SQL table, and the high bytes are in SQH. Dividing by two throws away an important bit for odd factors, but lines 1160-1170 compensate for the loss.
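
To see why adding one when both factors are odd restores the bit lost by halving, here is a small Python check. It is a sketch of the arithmetic only, not a translation of the 6502 code, and the names HALF_SQ and robison_mult8 are made up for the illustration.

```python
# Half-square table, like SQL/SQH combined: entry n holds n*n // 2.
HALF_SQ = [n * n // 2 for n in range(256)]


def robison_mult8(m1, m2):
    """Multiply via 2*X*Y = X^2 + Y^2 - (X-Y)^2 with truncated halves."""
    # AND M2 / LSR leaves the carry set only when both factors are odd.
    both_odd = m1 & m2 & 1
    # Two odd squares each lose 1/2 in the table, so one whole unit is
    # missing from the sum; the carry puts it back.
    total = HALF_SQ[m1] + HALF_SQ[m2] + both_odd
    # If exactly one factor is odd, the difference is odd too, and its
    # lost 1/2 cancels the one lost by the odd square.
    return total - HALF_SQ[abs(m1 - m2)]
```

Without the correction, 3 times 5 would come out as 4 + 12 - 2 = 14; with it, the result is 15.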

I looked for a way to add fewer table entries together and came upon the sum^2 - diff^2 identity shown above. Since the sum can be as large as 255+255=510, I need twice as much table space. Lest you despair of typing in such a large table, let me offer an Applesoft program which will write a text file of the source code for the table:

     100 D$ =  CHR$ (4)
     110  PRINT D$"OPEN TEMPFILE"
     120  PRINT D$"WRITE TEMPFILE"
     1000  REM  CREATE SQUARE/4 TABLE
     1010  PRINT "1000 SQL":A$ = "#":L = 1010
     1020  FOR I = 0 TO 510 STEP 8: GOSUB 2000
     1030  NEXT I
     1100  PRINT "2000 SQH":A$ = "/":L = 2010
     1110  FOR I = 0 TO 510 STEP 8: GOSUB 2000
     1120  NEXT I
     1130  PRINT D$"CLOSE": END 
     2000  REM  GENERATE 8 ITEMS
     2010 N =  INT (I * I / 4): PRINT L"  .DA "A$;N;
     2020  FOR J = I + 1 TO I + 7
     2030 N =  INT (J * J / 4): PRINT ","A$;N;
     2040  NEXT J:L = L + 10
     2050  PRINT : RETURN 

My tables contain the squares divided by four. I can hear you saying, "Wait a minute! You can't just divide by four and truncate!" Well, even squares are all multiples of four, and odd squares are all one more than a multiple of four. The sum of two numbers and the difference of the same two numbers are either both even or both odd. So either neither table entry loses anything, or both lose the same 1/4, and the two losses cancel in the subtraction.
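
The claim is easy to test exhaustively. The following Python sketch (the names QUARTER_SQ and quarter_square_mult are invented here, and this is a check of the arithmetic rather than 6502 code) builds the truncated table and compares every 8-bit product against ordinary multiplication.

```python
# Truncated quarter squares for every possible sum, 0 through 511.
QUARTER_SQ = [n * n // 4 for n in range(512)]


def quarter_square_mult(x, y):
    """Multiply via 4*X*Y = (X+Y)^2 - (X-Y)^2 with truncated quarters."""
    return QUARTER_SQ[x + y] - QUARTER_SQ[abs(x - y)]


def check_all_pairs():
    """Return True if every 8-bit by 8-bit product comes out exact."""
    return all(
        quarter_square_mult(x, y) == x * y
        for x in range(256)
        for y in range(256)
    )
```

For example, 7 times 4 uses the odd entries 11 and 3: 121/4 truncates to 30 and 9/4 truncates to 2, and 30 - 2 = 28.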

The number of cycles my MULT8 takes depends on the values of the two factors. You call MULT8 with one factor in the A-register and the other in the X-register. If (A) is less than (X), it takes an extra 3 cycles to perform a complement operation. If the sum of the factors is less than 256, add another three cycles: the BCC at line 1400 is taken, and the SEC at line 1480 has to set the carry that a sum over 255 leaves set for free. To summarize, including the calling JSR and the returning RTS:

                |  A>=X  |  A<X
       ---------------------------
       sum<256  |   63   |   66
       sum>255  |   60   |   63
       ---------------------------
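
For readers who want to trace the data flow before reading the listing, here is a byte-by-byte model of MULT8 in Python. It is a sketch under my own naming (mult8, base, borrow are not from the listing), and it imitates only what the registers hold, not the timing.

```python
# Quarter-square tables split into low and high bytes, 512 entries each.
SQL = [(n * n // 4) & 0xFF for n in range(512)]
SQH = [(n * n // 4) >> 8 for n in range(512)]


def mult8(a, x):
    """Return (hi, lo) of a*x, imitating MULT8's 8-bit steps."""
    diff = (a - x) & 0xFF              # SEC / SBC M2
    if a < x:                          # borrow: EOR #$FF / ADC #$01
        diff = ((diff ^ 0xFF) + 1) & 0xFF
    total = a + x                      # CLC / ADC M2
    y = total & 0xFF                   # Y keeps only the low 8 bits
    # The carry from the add selects the second half of each table,
    # which the listing reaches as SQL+256,Y and SQH+256,Y.
    base = 256 if total > 255 else 0
    lo = SQL[base + y] - SQL[diff]     # low-byte subtract
    borrow = 1 if lo < 0 else 0
    hi = (SQH[base + y] - SQH[diff] - borrow) & 0xFF
    return hi, lo & 0xFF
```

Calling mult8(200, 100) takes the second-half path, since 200+100 does not fit in a byte, and the two returned bytes combine to 20000.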

Just for fun, I also wrote a program to generate the square/4 tables. This takes less time than loading the tables from disk, so it could mean faster booting for some hi-resolution game program that needs super-fast multiplications. It is in lines 1560-2100 below.
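
The generator never multiplies. It keeps a running square/4 with eight fraction bits and adds a growing increment, because successive squares/4 differ by (2n-1)/4. A Python sketch of that idea follows; the function name and the list-based tables are my own stand-ins for the memory the 6502 routine fills, and the variable names sq and delta mirror SQ and DELTA in the listing.

```python
def build_quarter_squares(count=512):
    """Build low and high byte tables of n*n // 4 by repeated addition."""
    sql = [0] * count
    sqh = [0] * count
    sq = 0          # running n*n/4 scaled by 256, like the 3-byte SQ
    delta = 0x40    # (2n-1)/4 for n=1, same scaling, like DELTA
    for n in range(1, count):
        sq += delta
        sql[n] = (sq >> 8) & 0xFF    # middle byte is the table low byte
        sqh[n] = (sq >> 16) & 0xFF   # top byte is the table high byte
        delta += 0x80                # next odd number divided by four
    return sql, sqh
```

After n additions the running total is 64*n*n, which is n*n/4 times 256, so dropping the bottom byte truncates exactly as the tables require.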

The origin I used in my program is meant just to allow me to test it. I wrote an Applesoft program to call TEST at $6000 (CALL 24576). The program POKEd two factors at $FA and $FB, called TEST, and then checked the result at the same two locations. If you want to use MULT8, you should just assemble it along with the rest of your program, without any special origin. You should make sure that the tables start on a page boundary, or it will cost you up to 4 cycles extra (one for each of the four indexed table references) for indexing across a page boundary.

  1000 *SAVE PUTNEYS.8X8
  1010 *--------------------------------
  1020 *      ULTRA-FAST 8 X 8 MULTIPLY
  1030 *--------------------------------
  1040 *   ENTER WITH (A)=MULTIPLIER #1
  1050 *              (X)=MULTIPLIER #2
  1060 *   EXIT WITH (A)=PRODUCT HI BYTE
  1070 *             (X)=PRODUCT LO BYTE
  1080 *--------------------------------
  1090 *   TIMING DATA (INCLUDES RTS, NOT THE CALLING JSR)
  1100 *      MINIMUM TIME = 54 CYCLES
  1110 *      MAXIMUM TIME = 60 CYCLES
  1120 *      AVERAGE TIME = 57 CYCLES
  1130 *--------------------------------
  1140 PROD   .EQ $06      PRODUCT TEMP OF M1*M2 (LOW BYTE)
  1150 M2     .EQ $07      TEMP FOR M2 SAVE
  1160 *--------------------------------
  1170        .OR $6000    SAFE PLACE
  1180 *--------------------------------
  1190 *   TEST FOR APPLESOFT DRIVER
  1200 *--------------------------------
  1210 TEST   LDA $FA      LOAD ACC AND X SO BASIC CAN TEST
  1220        LDX $FB
  1230        JSR MULT8
  1240        STX $FA      NOW BASIC CAN CHECK ACC AND X
  1250        STA $FB
  1260        RTS
  1270 *--------------------------------
  1280 MULT8  TAY          SAVE M1 IN Y
  1290        STX M2       SAVE M2
  1300        SEC          SET CARRY FOR SUBTRACT
  1310        SBC M2       FIND DIFFERENCE
  1320        BCS .1       WAS M1 >= M2 ?
  1330        EOR #$FF     INVERT IT
  1340        ADC #$01     AND ADD 1
  1350 .1     TAX          USE ABS(M1-M2) AS INDEX
  1360        CLC
  1370        TYA          GET M1 BACK
  1380        ADC M2       FIND M1 + M2
  1390        TAY          USE M1+M2 AS INDEX
  1400        BCC .2       M1+M2 < 256 ?
  1410        LDA SQL+256,Y     FIND SUM SQUARED LOW IF > 255
  1420        SBC SQL,X         SUBTRACT DIFF SQUARED
  1430        STA PROD          SAVE IN PRODUCT
  1440        LDA SQH+256,Y     HI BYTE
  1450        SBC SQH,X
  1460        LDX PROD     GET PROD LOW IN X
  1470        RTS          DONE
  1480 .2     SEC          SET CARRY FOR SUBTRACT
  1490        LDA SQL,Y    FIND SUM SQUARED LOW IF < 256
  1500        SBC SQL,X    SUBTRACT DIFF SQUARED
  1510        STA PROD     SAVE IN PRODUCT
  1520        LDA SQH,Y    HI BYTE
  1530        SBC SQH,X
  1540        LDX PROD     GET PROD LOW IN X
  1550        RTS
  1560 *--------------------------------
  1570 *   PROGRAM TO CREATE A TABLE OF SQUARES/4
  1580 *--------------------------------
  1590 LOTP   .EQ 0,1
  1600 HITP   .EQ 2,3
  1610 *--------------------------------
  1620 SQUARE LDY #0
  1630        STY LOTP
  1640        STY HITP
  1650        STY SQ
  1660        STY SQ+1
  1670        STY SQ+2
  1680        STY DELTA+1
  1690        STY DELTA+2
  1700        STY $6800
  1710        STY $6A00
  1720        INY
  1730        LDA #$40
  1740        STA DELTA
  1750        LDA /$6800
  1760        STA LOTP+1
  1770        LDA /$6A00
  1780        STA HITP+1
  1790        LDX #1
  1800 *--------------------------------
  1810 .1     CLC
  1820        LDA DELTA
  1830        ADC SQ
  1840        STA SQ
  1850        LDA DELTA+1
  1860        ADC SQ+1
  1870        STA SQ+1
  1880        STA (LOTP),Y
  1890        LDA DELTA+2
  1900        ADC SQ+2
  1910        STA SQ+2
  1920        STA (HITP),Y
  1930 *--------------------------------
  1940        LDA DELTA
  1950        ADC #$80
  1960        STA DELTA
  1970        BCC .2
  1980        INC DELTA+1
  1990        BNE .2
  2000        INC DELTA+2
  2010 .2     INY
  2020        BNE .1
  2030        INC LOTP+1
  2040        INC HITP+1
  2050        DEX
  2060        BPL .1
  2070        RTS
  2080 *--------------------------------
  2090 DELTA  .BS 3
  2100 SQ     .BS 3
  2110 *--------------------------------
  2120 *   TABLE OF SQUARES/4 FROM 0 TO 511
  2130 *--------------------------------
  2140        .BS *+$FF/$100*$100-*  KEEP TABLES ALIGNED ON PAGE BOUNDARY
  2150 *--------------------------------
  2160 SQL    .DA #0,#0,#1,#2,#4,#6,#9,#12
  2170        .DA #16,#20,#25,#30,#36,#42,#49,#56
  2180        .DA #64,#72,#81,#90,#100,#110,#121,#132
  2190        .DA #144,#156,#169,#182,#196,#210,#225,#240
  2200   .LIF
  2210        .DA #256,#272,#289,#306,#324,#342,#361,#380
  2220        .DA #400,#420,#441,#462,#484,#506,#529,#552
  2230        .DA #576,#600,#625,#650,#676,#702,#729,#756
  2240        .DA #784,#812,#841,#870,#900,#930,#961,#992
  2250        .DA #1024,#1056,#1089,#1122,#1156,#1190,#1225,#1260
  2260        .DA #1296,#1332,#1369,#1406,#1444,#1482,#1521,#1560
  2270        .DA #1600,#1640,#1681,#1722,#1764,#1806,#1849,#1892
  2280        .DA #1936,#1980,#2025,#2070,#2116,#2162,#2209,#2256
  2290        .DA #2304,#2352,#2401,#2450,#2500,#2550,#2601,#2652
  2300        .DA #2704,#2756,#2809,#2862,#2916,#2970,#3025,#3080
  2310        .DA #3136,#3192,#3249,#3306,#3364,#3422,#3481,#3540
  2320        .DA #3600,#3660,#3721,#3782,#3844,#3906,#3969,#4032
  2330        .DA #4096,#4160,#4225,#4290,#4356,#4422,#4489,#4556
  2340        .DA #4624,#4692,#4761,#4830,#4900,#4970,#5041,#5112
  2350        .DA #5184,#5256,#5329,#5402,#5476,#5550,#5625,#5700
  2360        .DA #5776,#5852,#5929,#6006,#6084,#6162,#6241,#6320
  2370        .DA #6400,#6480,#6561,#6642,#6724,#6806,#6889,#6972
  2380        .DA #7056,#7140,#7225,#7310,#7396,#7482,#7569,#7656
  2390        .DA #7744,#7832,#7921,#8010,#8100,#8190,#8281,#8372 
  2400        .DA #8464,#8556,#8649,#8742,#8836,#8930,#9025,#9120
  2410        .DA #9216,#9312,#9409,#9506,#9604,#9702,#9801,#9900
  2420        .DA #10000,#10100,#10201,#10302,#10404,#10506,#10609,#10712
  2430        .DA #10816,#10920,#11025,#11130,#11236,#11342,#11449,#11556
  2440        .DA #11664,#11772,#11881,#11990,#12100,#12210,#12321,#12432
  2450        .DA #12544,#12656,#12769,#12882,#12996,#13110,#13225,#13340
  2460        .DA #13456,#13572,#13689,#13806,#13924,#14042,#14161,#14280
  2470        .DA #14400,#14520,#14641,#14762,#14884,#15006,#15129,#15252
  2480        .DA #15376,#15500,#15625,#15750,#15876,#16002,#16129,#16256
  2490        .DA #16384,#16512,#16641,#16770,#16900,#17030,#17161,#17292
  2500        .DA #17424,#17556,#17689,#17822,#17956,#18090,#18225,#18360
  2510        .DA #18496,#18632,#18769,#18906,#19044,#19182,#19321,#19460
  2520        .DA #19600,#19740,#19881,#20022,#20164,#20306,#20449,#20592
  2530        .DA #20736,#20880,#21025,#21170,#21316,#21462,#21609,#21756
  2540        .DA #21904,#22052,#22201,#22350,#22500,#22650,#22801,#22952
  2550        .DA #23104,#23256,#23409,#23562,#23716,#23870,#24025,#24180
  2560        .DA #24336,#24492,#24649,#24806,#24964,#25122,#25281,#25440
  2570       .DA #25600,#25760,#25921,#26082,#26244,#26406,#26569,#26732
  2580       .DA #26896,#27060,#27225,#27390,#27556,#27722,#27889,#28056
  2590       .DA #28224,#28392,#28561,#28730,#28900,#29070,#29241,#29412
  2600       .DA #29584,#29756,#29929,#30102,#30276,#30450,#30625,#30800
  2610       .DA #30976,#31152,#31329,#31506,#31684,#31862,#32041,#32220
  2620       .DA #32400,#32580,#32761,#32942,#33124,#33306,#33489,#33672
  2630       .DA #33856,#34040,#34225,#34410,#34596,#34782,#34969,#35156
  2640       .DA #35344,#35532,#35721,#35910,#36100,#36290,#36481,#36672
  2650       .DA #36864,#37056,#37249,#37442,#37636,#37830,#38025,#38220
  2660       .DA #38416,#38612,#38809,#39006,#39204,#39402,#39601,#39800
  2670       .DA #40000,#40200,#40401,#40602,#40804,#41006,#41209,#41412
  2680       .DA #41616,#41820,#42025,#42230,#42436,#42642,#42849,#43056
  2690       .DA #43264,#43472,#43681,#43890,#44100,#44310,#44521,#44732
  2700       .DA #44944,#45156,#45369,#45582,#45796,#46010,#46225,#46440
  2710       .DA #46656,#46872,#47089,#47306,#47524,#47742,#47961,#48180
  2720       .DA #48400,#48620,#48841,#49062,#49284,#49506,#49729,#49952
  2730       .DA #50176,#50400,#50625,#50850,#51076,#51302,#51529,#51756
  2740       .DA #51984,#52212,#52441,#52670,#52900,#53130,#53361,#53592
  2750       .DA #53824,#54056,#54289,#54522,#54756,#54990,#55225,#55460
  2760       .DA #55696,#55932,#56169,#56406,#56644,#56882,#57121,#57360
  2770       .DA #57600,#57840,#58081,#58322,#58564,#58806,#59049,#59292
  2780       .DA #59536,#59780,#60025,#60270,#60516,#60762,#61009,#61256
  2790       .DA #61504,#61752,#62001,#62250,#62500,#62750,#63001,#63252
  2800       .DA #63504,#63756,#64009,#64262,#64516,#64770,#65025,#65280
  2810 *--------------------------------
  2820    .LIST ON
  2830 SQH    .DA /0,/0,/1,/2,/4,/6,/9,/12
  2840        .DA /16,/20,/25,/30,/36,/42,/49,/56
  2850        .DA /64,/72,/81,/90,/100,/110,/121,/132
  2860   .LIST OFF
  2870        .DA /144,/156,/169,/182,/196,/210,/225,/240
  2880        .DA /256,/272,/289,/306,/324,/342,/361,/380
  2890        .DA /400,/420,/441,/462,/484,/506,/529,/552
  2900        .DA /576,/600,/625,/650,/676,/702,/729,/756
  2910        .DA /784,/812,/841,/870,/900,/930,/961,/992
  2920        .DA /1024,/1056,/1089,/1122,/1156,/1190,/1225,/1260
  2930        .DA /1296,/1332,/1369,/1406,/1444,/1482,/1521,/1560
  2940        .DA /1600,/1640,/1681,/1722,/1764,/1806,/1849,/1892
  2950        .DA /1936,/1980,/2025,/2070,/2116,/2162,/2209,/2256
  2960        .DA /2304,/2352,/2401,/2450,/2500,/2550,/2601,/2652
  2970        .DA /2704,/2756,/2809,/2862,/2916,/2970,/3025,/3080
  2980        .DA /3136,/3192,/3249,/3306,/3364,/3422,/3481,/3540
  2990        .DA /3600,/3660,/3721,/3782,/3844,/3906,/3969,/4032
  3000        .DA /4096,/4160,/4225,/4290,/4356,/4422,/4489,/4556
  3010        .DA /4624,/4692,/4761,/4830,/4900,/4970,/5041,/5112
  3020        .DA /5184,/5256,/5329,/5402,/5476,/5550,/5625,/5700
  3030        .DA /5776,/5852,/5929,/6006,/6084,/6162,/6241,/6320
  3040        .DA /6400,/6480,/6561,/6642,/6724,/6806,/6889,/6972
  3050        .DA /7056,/7140,/7225,/7310,/7396,/7482,/7569,/7656
  3060        .DA /7744,/7832,/7921,/8010,/8100,/8190,/8281,/8372 
  3070        .DA /8464,/8556,/8649,/8742,/8836,/8930,/9025,/9120
  3080        .DA /9216,/9312,/9409,/9506,/9604,/9702,/9801,/9900
  3090        .DA /10000,/10100,/10201,/10302,/10404,/10506,/10609,/10712
  3100        .DA /10816,/10920,/11025,/11130,/11236,/11342,/11449,/11556
  3110        .DA /11664,/11772,/11881,/11990,/12100,/12210,/12321,/12432
  3120        .DA /12544,/12656,/12769,/12882,/12996,/13110,/13225,/13340
  3130        .DA /13456,/13572,/13689,/13806,/13924,/14042,/14161,/14280
  3140        .DA /14400,/14520,/14641,/14762,/14884,/15006,/15129,/15252
  3150        .DA /15376,/15500,/15625,/15750,/15876,/16002,/16129,/16256
  3160        .DA /16384,/16512,/16641,/16770,/16900,/17030,/17161,/17292
  3170        .DA /17424,/17556,/17689,/17822,/17956,/18090,/18225,/18360
  3180        .DA /18496,/18632,/18769,/18906,/19044,/19182,/19321,/19460
  3190        .DA /19600,/19740,/19881,/20022,/20164,/20306,/20449,/20592
  3200        .DA /20736,/20880,/21025,/21170,/21316,/21462,/21609,/21756
  3210        .DA /21904,/22052,/22201,/22350,/22500,/22650,/22801,/22952
  3220        .DA /23104,/23256,/23409,/23562,/23716,/23870,/24025,/24180
  3230        .DA /24336,/24492,/24649,/24806,/24964,/25122,/25281,/25440
  3240       .DA /25600,/25760,/25921,/26082,/26244,/26406,/26569,/26732
  3250       .DA /26896,/27060,/27225,/27390,/27556,/27722,/27889,/28056
  3260       .DA /28224,/28392,/28561,/28730,/28900,/29070,/29241,/29412
  3270       .DA /29584,/29756,/29929,/30102,/30276,/30450,/30625,/30800
  3280       .DA /30976,/31152,/31329,/31506,/31684,/31862,/32041,/32220
  3290       .DA /32400,/32580,/32761,/32942,/33124,/33306,/33489,/33672
  3300       .DA /33856,/34040,/34225,/34410,/34596,/34782,/34969,/35156
  3310       .DA /35344,/35532,/35721,/35910,/36100,/36290,/36481,/36672
  3320       .DA /36864,/37056,/37249,/37442,/37636,/37830,/38025,/38220
  3330       .DA /38416,/38612,/38809,/39006,/39204,/39402,/39601,/39800
  3340       .DA /40000,/40200,/40401,/40602,/40804,/41006,/41209,/41412
  3350       .DA /41616,/41820,/42025,/42230,/42436,/42642,/42849,/43056
  3360       .DA /43264,/43472,/43681,/43890,/44100,/44310,/44521,/44732
  3370       .DA /44944,/45156,/45369,/45582,/45796,/46010,/46225,/46440
  3380       .DA /46656,/46872,/47089,/47306,/47524,/47742,/47961,/48180
  3390       .DA /48400,/48620,/48841,/49062,/49284,/49506,/49729,/49952
  3400       .DA /50176,/50400,/50625,/50850,/51076,/51302,/51529,/51756
  3410       .DA /51984,/52212,/52441,/52670,/52900,/53130,/53361,/53592
  3420       .DA /53824,/54056,/54289,/54522,/54756,/54990,/55225,/55460
  3430       .DA /55696,/55932,/56169,/56406,/56644,/56882,/57121,/57360
  3440       .DA /57600,/57840,/58081,/58322,/58564,/58806,/59049,/59292
  3450       .DA /59536,/59780,/60025,/60270,/60516,/60762,/61009,/61256
  3460   .LIST ON
  3470       .DA /61504,/61752,/62001,/62250,/62500,/62750,/63001,/63252
  3480       .DA /63504,/63756,/64009,/64262,/64516,/64770,/65025,/65280
  3490 *--------------------------------
  3500   .LIF

New Hardware for Programming PALs Bob Sander-Cederlof

PALs (programmable array logic chips) are to logic circuitry as ROMs are to memory. Most of the new cards coming out these days contain one or more PALs. Engineers write logic equations, feed them into a PAL Assembler, and run the output to a PAL burner. The programmed PAL is then ready to use in a circuit. Until now, you had to buy a PAL development system, either stand-alone or perhaps interfaced to an IBM-alike.

But now, Dynatek Electronics has introduced a new board that slips nicely into an Apple slot for programming 20- and 24-pin PALs. The PALP-701A, for $245, programs 20-pin PALs. The PALP-702A handles both 20- and 24-pin chips, and can also blow the security fuse when you are ready for it. Both of them come with the PAL Assembler software.

Dynatek's PAL Assembler is compatible with Monolithic Memories PALASM. It creates a fuse plot from a PAL source file of Boolean equations. The fuse plot is then used by the PAL Programmer card via on-board firmware to program the PAL. The firmware on the Programmer card can also read un-protected PALs, and verify them. There is also a screen editor for creating, examining, and modifying a fuse plot.

Almost any Apple II system will do. You need at least 16K RAM to use the card, at least 48K and a disk drive to use the PAL Assembler. And who, these days, does not have at LEAST 48K?

If you design and build circuits, you ought to investigate this card. Call Jerry Wang at (312) 255-3469, or write to Dynatek Electronics, Inc., P. O. Box 1567, Arlington Heights, IL 60006. Tell him we sent you!


Review of Applied Engineering Transwarp Bob Sander-Cederlof

We reviewed the M-c-T SpeedDemon accelerator card in AAL of July 1985. At the time the price was $295 from the manufacturer or $199 through Call APPLE. We recently received a promotion sent to software publishers offering wholesale prices if we would advertise the SpeedDemon in conjunction with our software. The suggested price is now $249. (We notice that at least one game publisher took them up on the offer.)

Now Applied Engineering has released their new accelerator card, the Transwarp. Their price is $279 with a 65C02 installed; an optional upgrade to a fast 65802 costs an additional $89. The higher price is probably well justified by the features.

Transwarp includes 256K of high-speed RAM on the card. This compares to 64K on the Titan Accelerator, and a 4K cache on the SpeedDemon. Transwarp will run with the SWYFT card installed, while the others apparently will not.

Transwarp's 256K RAM is effectively divided into four 64K banks. When you power-up your Apple with Transwarp installed, all of the ROM from $D000 through $FFFF is copied into one of the high-speed RAM banks. The rest of this bank is not used. A second bank is used in place of the motherboard RAM. The third and fourth banks are used in place of the first and second banks of AUXMEM, if you have a RAM card such as RAMWORKS installed in the AUX slot. If you have a large RAMWORKS in the auxiliary slot of a //e, any additional banks beyond two will still be usable but at "only" 1 MHz.

When you write data to one of the screen areas (any address $400-$BFF or $2000-$5FFF), the data is "written through" to the motherboard RAM. (The video hardware in the Apple requires that the screen data be in motherboard RAM.) When you read from any of these addresses, the data will be read from the fast Transwarp RAM.
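
The write-through rule can be pictured with a toy model. This is only a sketch in Python, not Transwarp firmware or hardware logic: the class and function names are made up for illustration, only one 64K bank is modeled, and the I/O space, ROM, and AUXMEM banks are ignored. The two address ranges are the ones given above.

```python
# Screen areas named in the article: writes here also go to the motherboard.
SCREEN_RANGES = ((0x0400, 0x0BFF), (0x2000, 0x5FFF))

def is_screen(addr):
    """True if addr lies in one of the write-through screen areas."""
    return any(lo <= addr <= hi for lo, hi in SCREEN_RANGES)

class WriteThroughRAM:
    """Hypothetical model: one fast 64K bank shadowing motherboard RAM."""

    def __init__(self):
        self.fast = bytearray(0x10000)         # high-speed RAM on the card
        self.motherboard = bytearray(0x10000)  # slow RAM the video hardware reads

    def write(self, addr, value):
        self.fast[addr] = value
        if is_screen(addr):
            # The video hardware only sees motherboard RAM, so copy it through.
            self.motherboard[addr] = value

    def read(self, addr):
        # Reads always come from the fast copy, even in the screen areas.
        return self.fast[addr]

ram = WriteThroughRAM()
ram.write(0x0400, 0xC1)   # text page 1: lands in both memories
ram.write(0x0C00, 0x55)   # ordinary RAM: fast copy only
print(ram.motherboard[0x0400] == 0xC1, ram.motherboard[0x0C00] == 0x55)
```

In this model the cost of a screen write is the extra slow store, while screen reads stay fast. That fits the timing-loop measurements reported later in this review.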

Transwarp keeps track of the state of all the AUXMEM soft switches, as well as the RAMWORKS bank register. All reads from any memory that is supported in the Transwarp RAM will be done at full speed. Reads from and writes to any address in the range $C000-$CFFF will slow down to 1 MHz for one cycle.

There are 16 dip switches on the card, allowing you to configure for most environments. Seven switches indicate which slots must execute code at 1 MHz. Slots designated by switches will slow down the processor for about 1/2 second after any access to either the slot ROM or the slot registers. An Apple disk Controller must run at the slow speed, while most other slots can run faster. Some I/O cards, especially serial cards, must run at slow speed due to internal software-controlled timing. The Transwarp's switches are much more flexible than the SpeedDemon's system of always slowing down for slot 6 and using jumpers to allow a slowdown for slots 4 and 5.

Another seven switches let you indicate which slots (1-7) have RAM cards installed. The two remaining switches let you select the initial speed of the Transwarp card. You can select a default speed of 3.58 MHz, 1.7 MHz, or 1 MHz. This is the speed the card runs at when you power up. You might like the 1.7 MHz speed for making your game software just a LITTLE faster.

Once the Transwarp has taken over, you can switch back and forth between the default speed and 1 MHz by storing either 0 (default speed) or 1 (1 MHz) into $C074. In BASIC, you would POKE either 0 or 1 into location 49268 (or, equivalently, -16268).

If you POKE a value of 3 to $C074, Transwarp will be shut down completely; the motherboard processor will take over when you hit CTRL-RESET. In order to turn Transwarp back on, you have to turn the computer off and back on again with the power switch. You also have the option of disabling Transwarp during the power-on cycle, by typing the ESCAPE key within a couple of seconds after turning on the computer.
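
The two decimal addresses are the same location: the negative form is the 16-bit address minus 65536. Here is a small sketch of that arithmetic in Python (not Apple code; the function name is made up for illustration):

```python
def poke_addresses(addr):
    """Return the (positive, negative) decimal forms of a 16-bit address."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    return addr, addr - 0x10000

TRANSWARP_SPEED = 0xC074   # speed register named in the article

positive, negative = poke_addresses(TRANSWARP_SPEED)
print(positive, negative)   # 49268 -16268
```

The same conversion works for any soft switch in the $C000 page if you prefer the negative form.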

Transwarp has a 4K EPROM on-board with startup and self-test firmware. Naturally, I disassembled the code to see how it all works. The self-test is initiated by typing a "0" or "9" during the first two seconds. The test checks for the type of processor installed (65C02 or 65802), measures the speed, tests bank switching, and tests RAM. If you are in a //e, you can hold down the Open-Apple key to keep it looping through the speed test.

Transwarp measures its own speed by counting how many cycles it takes for the Vertical Blanking Signal to pass by. This signal is not available on the II or II Plus, so the speed is not measured on the older machines.

We tested Transwarp doing various jobs such as assembling, word processing, and spreadsheet-ing. Everything worked, no glitches, and a lot faster. The speedup factor depends on the amount of disk I/O, screen I/O, and so on. Nothing runs with a full 3.5 or 3.6 speed increase, not even a short timing loop. The very highest factor I could coax out of my board was about 3.3, on a timing loop running at $C00. This loop included a large number of STA instructions, on purpose. When I moved the program to $800, so that the STA instructions were storing into the range slowed down to 1MHz (between $400 and $BFF), the loop only ran 2.0 times faster under Transwarp than under a normal 1 MHz processor.

Why do the advertisements for accelerators claim a 3.6 or larger speedup factor? I think they are rounding up the clock speed of 3.579... to 3.6, and likewise rounding down the Apple's clock speed from 1.023 to 1. That is not the way the IRS likes you to do math.... The actual ratio of the two clock speeds is exactly 3.5, but the mist does not entirely clear yet.

Remember that the Apple stretches one cycle out of every 65 by an amount equal to one cycle of the 7MHz signal. See chapter 3 of Jim Sather's "Understanding the Apple //e" for details. This means the normal Apple runs a hair slower than the clock rate. But also remember that dynamic RAM needs refreshing from time to time. The refresh of the 256K RAM on the Transwarp card occurs once out of every 16 Apple phase 0 (1MHz) clock cycles. During each 16th 1MHz cycle, the Transwarp slows down to 1MHz. This means that in the time a normal Apple would execute 16 clock cycles, the full-speed Transwarp will execute 53 clock cycles. If not for the long refresh cycle, Transwarp would execute 56 cycles during 16 phase 0 cycles. Now 53 divided by 16 is 3.3125, showing that the maximum speedup factor for Transwarp is 3.3125. I don't know for certain, but the Titan Accelerator II probably has the same characteristic. If so, they both run at a full 3.5 times faster for 15 microseconds, slow down for one microsecond, and then take off again.
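
The arithmetic in the last two paragraphs can be checked with a few lines of Python. This is only a sketch built from the figures quoted above (3.5, 16, 53, and one stretched cycle in 65); the names are made up for illustration. The one added assumption is that a "7MHz" cycle is one seventh of a normal Apple cycle.

```python
from fractions import Fraction

CLOCK_RATIO = Fraction(7, 2)   # 3.5 fast cycles per Apple cycle (article's figure)
REFRESH_WINDOW = 16            # Apple phase 0 cycles per refresh period
CYCLES_WITH_REFRESH = 53       # fast cycles per window, as quoted in the article

def speedup(fast_cycles, apple_cycles=REFRESH_WINDOW):
    """Fast cycles executed per Apple cycle over one refresh window."""
    return Fraction(fast_cycles, apple_cycles)

ideal = speedup(CLOCK_RATIO * REFRESH_WINDOW)   # 56/16, no refresh penalty
actual = speedup(CYCLES_WITH_REFRESH)           # 53/16, with the slow refresh cycle

# Assumption: the stretched cycle is longer by 1/7 of a normal cycle,
# so 65 Apple cycles take 65 + 1/7 nominal cycle times.
apple_effective = Fraction(65) / (65 + Fraction(1, 7))

print(float(ideal))      # 3.5
print(float(actual))     # 3.3125
print(apple_effective)   # 455/456 of the nominal Apple clock rate
```

Under these assumptions the refresh costs three fast cycles in every 56, while the Apple's own stretched cycle costs it only about a quarter of one percent.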

The SpeedDemon, on the other hand, can run at a full 3.5 times faster for somewhat longer bursts. If every byte needed is in the SpeedDemon cache memory (static RAM, needing no refresh), execution should proceed at 3.5 times normal Apple speed. Normal programs, however, which are long enough to make us worry about speed, will never be entirely inside the cache. In all comparison tests of real software, Transwarp is faster than either SpeedDemon or Titan. SpeedDemon loses due to its cache, and Titan loses because it does not speed up any accesses to AUXMEM.

The S-C Word Processor increased its speed by about 3.2 for compute-bound operations like searching. Interestingly, an operation that is limited by screen output, like inserting characters from the yank buffer, showed almost no increase in speed. In THE Spreadsheet (MagiCalc) the acceleration factor was about 3.1-3.3, running in a II+ with a Viewmaster 80-column card. Our mailing label system, written mostly in Applesoft, showed a pretty consistent 3.3 speedup. Programs which involve disk I/O will not speed up as much, because the disk still spins at the same 300 rpm.

All in all, we think the Transwarp is a good investment: you get a quality product at a reasonable price which significantly enhances the performance of your computer.


New Book by Tom Weishaar reviewed by Bob Sander-Cederlof

A little over a year ago, just before he started the "Open-Apple" newsletter, Tom wrote a book. Info Books has just released it, called "Your Best Interest: A Money Book for the Computer Age." It's not about Apple assembly language, but I cannot resist telling you about it anyway!

The book is about interest rates -- how to understand them, how to calculate them, how they affect you. It was written for people who know how to use a spreadsheet program. All the hard math and books of tables are replaced by your favorite calc-alike.

If you remember Tom's DOSTalk column from the much-missed pages of Softalk Magazine, or are familiar with his current Open-Apple newsletter, you know that what he writes is easy to read, fun to read, and WORTH READING.

Seven fascinating chapters lead you to an understanding of how financial transactions really work. He starts with simple percentage calculations, at a level your Junior High children can follow. If you think that is starting too simply, try explaining percentages to YOUR children! But he keeps going....

Have you thought about buying a house recently? Tom shows you how to figure the true cost of an adjustable-rate mortgage, how to compare different financing schemes, and how to protect your money. You'll learn about the tricks money lenders sometimes use to take advantage of unwary investors and borrowers. And all is tied to spreadsheet models you can put into your Apple. I only wish I had known how to do these things when I bought a pickup truck last summer. Or leased a copying machine three years ago. And when we bought some land in the country....

The book is 160 pages slim (172 counting everything), only $9.95 at your favorite book store. And worth a trip! Or call Gerald Rafferty at Info Books, (213) 470-6786. Or write to them at P. O. Box 1018, Santa Monica, CA 90406.


Which Processor Am I In? Jim Popenoe

One of the first programs I wrote after receiving my 65802 chip was one which tells me which microprocessor is in my Apple. Since the 65C02 has instructions not in the 6502, and since the 65802 has all of those and still more, it is possible to tell which is which.

The instructions in the 65802 (or 65816) which are not in the 65C02 are all "no-operation" opcodes in the 65C02. The same is not true for the un-implemented codes in the 6502! Bob S-C detailed what all the un-implemented 6502 opcodes do in the March 1981 issue of AAL. Some of them do really exotic things, but some are in fact NOPs. $80 is a two-byte NOP in the 6502, but a Branch Always (BRA) in the 65C02 and 658xx. Therefore, the BRA opcode can be used to distinguish between the 6502 and higher versions.

The XBA instruction ($EB) is a one-byte no-operation in the 65C02. In the 658xx it exchanges the low and high bytes of the 16-bit A-register. Therefore it can be used to distinguish between the 65C02 and the 658xx processors.

The following program will print out either "6502", "65C02", or "65802" depending on which it finds. A few more tests could distinguish the Rockwell 65C02, which has four opcodes beyond those in 65C02s made by other manufacturers. And a few more might distinguish between a 65802 in my motherboard and a 65816 running in a co-processor card. I'll leave those for interested readers to try.

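  1000 *SAVE S.WHICH.PROC
  1010        .OP 65802
  1020 *--------------------------------
  1030 PRBYTE .EQ $FDDA    MONITOR: PRINT A AS TWO HEX DIGITS
  1040 COUT   .EQ $FDED    MONITOR: PRINT CHARACTER IN A
  1050 *--------------------------------
  1060 WHICH.PROCESSOR
  1070        LDA #$65
  1080        JSR PRBYTE   PRINT "65"
  1090        BRA .1       6502: $80 IS A 2-BYTE NOP, FALLS THROUGH
  1100        JMP .2       ...SO A 6502 SKIPS THE LETTER
  1110 .1     LDA #"8"
  1120        XBA          658XX: "8" GOES TO B;  65C02: NOP
  1130        LDA #"C"
  1140        XBA          658XX: A="8" AGAIN;    65C02: A="C"
  1150        JSR COUT     PRINT "8" OR "C"
  1160 .2     LDA #$02
  1170        JMP PRBYTE   PRINT "02" AND RETURN
  1180 *--------------------------------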
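
For readers who want to trace the logic without swapping chips, here is a rough Python model of the listing's control flow. It is a sketch only, not an emulator: the function name is made up, and each processor's behavior for $80 and $EB is simply taken from the description above.

```python
def which_processor(cpu):
    """Return the text the listing would print on the given processor."""
    if cpu not in ("6502", "65C02", "65802"):
        raise ValueError("unknown processor: " + cpu)
    out = "65"                  # LDA #$65 / JSR PRBYTE
    if cpu != "6502":           # BRA is taken; a 6502 treats $80 as a 2-byte NOP
        a, b = "8", None        # LDA #"8" (b stands for the hidden high byte)
        if cpu == "65802":
            a, b = b, a         # XBA swaps the two halves; a NOP on the 65C02
        a = "C"                 # LDA #"C"
        if cpu == "65802":
            a, b = b, a         # second XBA brings "8" back into A
        out += a                # JSR COUT
    return out + "02"           # LDA #$02 / JMP PRBYTE

for cpu in ("6502", "65C02", "65802"):
    print(cpu, "->", which_processor(cpu))
```

Each processor name maps to itself, which is the whole point of the trick: the answer is built from the side effects of the very opcodes that differ.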

Apple Assembly Line is published monthly by S-C SOFTWARE CORPORATION, P.O. Box 280300, Dallas, Texas 75228. Phone (214) 324-2050. Subscription rate is $18 per year in the USA, sent Bulk Mail; add $3 for First Class postage in USA, Canada, and Mexico; add $14 postage for other countries. Back issues are available for $1.80 each (other countries add $1 per back issue for postage).

All material herein is copyrighted by S-C SOFTWARE CORPORATION, all rights reserved. (Apple is a registered trademark of Apple Computer, Inc.)