Apple Assembly Line
Volume 3 -- Issue 12 September 1983

65C02 Notes

We now have a sample from Rockwell, and it shares the problem of not working in an older Apple. It's running just fine in the //e, but it doesn't work in the ][+. Rockwell's distributor says that regular delivery is now scheduled for November. Sigh....

There's a bug in the 65C02 chips! Among the new features are several new addressing modes for the BIT instruction, including BIT #immediate.

The BIT instruction actually does two operations:

  1. It ANDs together the Accumulator and the specified memory byte, and sets the Zero flag according to the result.
  2. It sets the Overflow and Negative flags to the values of bits 6 and 7 of the memory byte.

Well, the BIT #immediate instruction does not do step two; it only modifies the Zero flag. The other new address modes for BIT behave correctly. BIT #$40 sure would have come in handy for a SEV (SEt oVerflow flag) instruction.
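
A minimal Python sketch of the two steps, assuming the behavior described above. The function name bit and the flag dictionary are illustrative only; this models the flags and is not taken from any assembler or emulator.

```python
def bit(a, m, flags, immediate=False):
    """Model BIT as described in the text.

    a, m:      accumulator and operand byte (0..255)
    flags:     dict with boolean "Z", "V", "N" entries
    immediate: True models BIT #immediate on the 65C02, which
               skips step two and leaves V and N untouched
    """
    new = dict(flags)
    new["Z"] = (a & m) == 0          # step 1: Z from A AND operand
    if not immediate:
        new["V"] = bool(m & 0x40)    # step 2: V from bit 6
        new["N"] = bool(m & 0x80)    #         N from bit 7
    return new


start = {"Z": False, "V": False, "N": False}
print(bit(0xFF, 0x40, start))                  # memory operand: V is set
print(bit(0xFF, 0x40, start, immediate=True))  # BIT #$40: V is unchanged
```

In this model BIT #$40 cannot serve as SEV, because V keeps whatever value it had before the instruction.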

As always, we'll keep you posted.

Jump Vectoring Bob Sander-Cederlof

Applesoft has a statement which allows branching according to a computed index:

       ON X GO TO 100,200,300,400

Integer BASIC has a different method, simply allowing the line number after a GOTO, THEN, or GOSUB to be a computed value:

       GO TO X*100

Most other languages have some technique for vectoring to one of a series of places based on the value of a variable. Modern languages like Pascal have a CASE statement, which can combine a comparison step.

       case PIECE of
          Pawn : ...;
          Knight : ...;
          Bishop : ...;
          Rook : ...;
          Queen : ...;
          King : ...;
       end;

I frequently find myself building various schemes to handle the CASE statement in assembly language. For example, I might accept a character from the keyboard and then compare it to a series of legal inputs, and branch accordingly to process the input.

One common way involves a series of CMP BEQ pairs, like this:

       CMP #$81        control-A?
       BEQ ...            yes
       CMP #$84        control-D?
       BEQ ...            yes
       CMP #$8D        return?
       BEQ ...            yes
       et cetera

If there are not too many cases, and if the processing routines are not too far away for the BEQs to reach, this is a good way to do the job. If the routines are bigger, and therefore tend to be too far away (causing RANGE ERRORS at assembly time), I might string together CMP BNE pairs instead:

       CMP #$81        control-A?
       BNE TRY.D       no, try ctrl-D

      <code to process ctrl-A here>

TRY.D  CMP #$84        control-D?
       BNE TRY.M       no, try return

      <code for ctrl-D here>
TRY.M  CMP #$8D        return?
       BNE ... et cetera

      <code for ctrl-M here>

The trouble with the latter way is that programs get strung all over the place, and become very difficult to follow. Unstructured, some would say. The structure is really there, because we are just implementing a CASE statement; however, assembly language code over a sheet of paper long LOOKS unstructured, no matter what it is implementing. And once a programmer gets his CASE statement spread over several sheets of paper, the temptation to begin making a "rat's nest" out of it can be overwhelming.

I prefer to put things into nice neat data tables. Back in the August 1982 issue of AAL I presented a "Search and Perform" subroutine to handle a table like this:

       .DA #$81,CTRL.A-1
       .DA #$84,CTRL.D-1
       .DA #$8D,RETURN-1

The table consists of three bytes per line, the first byte being the CASE value, and the other two being the address of the processing routine.

Another method is handy when the variable has a nice numeric range. For example, what if I have processing routines for every possible control character from ctrl-A through ESC? That is ASCII codes $81 through $9B. If I subtract $81, I get a value from 0 through 26 (decimal). If I then multiply the value by three, and add it to a base address, and store the result into another variable, and JMP indirect, I can access a series of JMPs to each processing routine:

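       SEC
       SBC #$81
       CMP #27
       BCS ...ERROR
       STA ADDR        TIMES THREE
       ASL
       ADC ADDR
       ADC #TABLE      ADD LOW BYTE OF TABLE ADDRESS
       STA ADDR
       LDA #0
       ADC /TABLE      HIGH BYTE, PLUS CARRY
       STA ADDR+1
       JMP (ADDR)
ADDR   .BS 2
TABLE  JMP CTRL.A
       JMP CTRL.B
       ...
       JMP ESCAPE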

Note that if we got to the CASE program by doing a JSR CASE, then each processing routine can do an RTS to return to the main line program. This makes our CASE look like it is doing a series of JSR's instead of JMP's.
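
The address arithmetic behind the table of JMPs can be checked with a short Python sketch. TABLE here is a hypothetical address chosen only for illustration; the real value depends on where the program is assembled.

```python
TABLE = 0x0900   # hypothetical base address of the JMP table


def jmp_slot(ch):
    """Address of the 3-byte JMP instruction for control character ch."""
    index = ch - 0x81                # subtract $81: ctrl-A becomes 0
    if not 0 <= index < 27:          # the CMP #27 range check
        raise ValueError("not ctrl-A through ESC")
    return TABLE + 3 * index         # times three, plus the table base


print(hex(jmp_slot(0x81)))   # ctrl-A: first JMP in the table
print(hex(jmp_slot(0x9B)))   # ESC: 27th JMP, 78 bytes further on
```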

We can shave bytes off the above technique by only keeping the address in TABLE, without all the JMP opcodes. Then the variable only needs to be multiplied by two instead of three. We will have to use the doubled variable for an index to pick up the address in the table and put it into ADDR:

       SEC
       SBC #$81
       CMP #27
       BCS ...ERROR
       ASL             DOUBLE FOR INDEX
       TAX
       LDA TABLE,X
       STA ADDR
       LDA TABLE+1,X
       STA ADDR+1
       JMP (ADDR)
ADDR   .BS 2
TABLE  .DA CTRL.A
       .DA CTRL.B
       ...
       .DA ESCAPE

I don't recommend self-modifying code, but I still use it sometimes. If you want to save two more bytes above, you can store the jump address directly into the second and third bytes of a direct JMP instruction:

       LDA TABLE,X
       STA ADDR+1
       LDA TABLE+1,X
       STA ADDR+2
ADDR   JMP $FFFF       OPERAND FILLED IN ABOVE

A much better way involves pushing the processing routine address onto the stack, and using an RTS to branch to the pushed address. Since RTS adds 1 to the address on the stack before branching, we have to push the address-1:

       LDA TABLE+1,X
       PHA             HIGH BYTE FIRST
       LDA TABLE,X
       PHA             THEN LOW BYTE
       RTS             "RETURN" TO THE ROUTINE
TABLE  .DA CTRL.A-1
       .DA CTRL.B-1
       ...
       .DA ESCAPE-1

Note that this method not only is not self-modifying, it also is a few bytes shorter and a tad faster.
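
Why the table must hold address-1 is easy to see in a small Python model of the stack. This is only a sketch of the push-and-RTS idea; the helper names and the routine address $0A00 are made up for the example.

```python
def push_target(stack, routine):
    """Push routine-1 in the order used above: high byte, then low byte."""
    target = (routine - 1) & 0xFFFF
    stack.append(target >> 8)      # high byte goes on the stack first
    stack.append(target & 0xFF)    # low byte ends up on top


def rts(stack):
    """Model RTS: pull low byte, pull high byte, add 1, continue there."""
    low = stack.pop()
    high = stack.pop()
    return (((high << 8) | low) + 1) & 0xFFFF


stack = []
push_target(stack, 0x0A00)         # hypothetical routine on a page boundary
print([hex(b) for b in stack])     # the table entry is $09FF, not $0A00
print(hex(rts(stack)))             # execution resumes at the routine itself
```

The page-boundary case shows that the subtraction has to be done on the whole 16-bit address, which is why it is left to the assembler (CTRL.A-1) rather than done byte by byte at run time.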

All this is only necessary because the designers of the 6502 did not give us a JMP (addr,X) instruction. If they had, we could do it like this:

       SEC
       SBC #$81
       CMP #27
       BCS ...ERROR
       ASL             DOUBLE FOR INDEX
       TAX
       JMP (TABLE,X)

Then the hardware would add the doubled character offset (0,2,4,...52 for ctrl-A thru ESC) to the base address of the table, pick up the address from the table, and jump to the corresponding processing routine.

Since that would be so nice, and the designers agreed, the new 65C02 chip has it! So if you know you are writing for a 65C02, and don't EVER intend to run in a plain 6502, you can use the JMP (TABLE,X).

It would also be nice to have JSR (TABLE,X), but you can simulate that by calling CASE with a JSR. Or in other situations, you might merely do it this way:

       JSR CALL        the routine's RTS returns here
CALL   JMP (TABLE,X)   (placed elsewhere in the program)

Sometimes it so happens that your program can be arranged so that all the processing routines are in the same memory page. Then there is no need to store the high byte of the address in the table, right? Steve Wozniak thought this way, and you can see the result in the Apple monitor at $FFBE and following:

       STY MODE
       .DA #USR-1      CTRL-Y
       .DA #BEGZ-1     CTRL-E
       .DA #BLANK-1    BLANK

Steve also used this technique inside the SWEET-16 interpreter. You can see the code at $F69E through $F6C6 in the Integer BASIC ROM or RAM image.

If the routines are not necessarily all in one page, but are all within one 256-byte range, you can add an offset from the table to a known starting address.

Here is a method I would NEVER use, but it is cute, and short:

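       LDA TABLE,X     GET OFFSET FOR THIS CASE
       STA BRANCH+1    PLUG IT INTO THE BCC
       CLC             make branch always...
BRANCH BCC BRANCH      OPERAND FILLED IN ABOVE
BASE   .EQ *
       ...all the routines here
TABLE  .DA #CTRL.A-BASE
       .DA #CTRL.B-BASE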

The table has pre-computed relative offsets from BASE, so that the values can be plugged directly into the BCC instruction. This is a fast and short technique, but somehow it scares me to think about self-modifying code. If you need it, go ahead and use it!

Using QUICKTRACE with S-C Assembler Bob Urschel
Valparaiso, IN

I wanted to use QUICKTRACE in conjunction with the S-C Assembler without having QUICKTRACE interfere with either my source file or any object code generated. Since I always use the LC version of the assembler, I modified the HELLO program on the S-C assembler disk as follows:

     20 POKE 40192,211:POKE40193,142:CALL42964
     60 END

Line 20 in the HELLO program moves the DOS buffers down by $E00 bytes to make room for the QUICKTRACE program. After running the HELLO program, when the S-C prompt appears and BEFORE loading any S-C source files, enter:

       :$8F00G  <return>

This initializes QUICKTRACE.

I also changed the address at MON$ (from within QUICKTRACE) to MON$=D003 so when I press M from single-step mode, I return to the S-C Assembler with my source file intact.
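
The two POKE values in line 20 can be checked with a little arithmetic. This Python sketch assumes that the DOS 3.3 buffer pointer at $9D00 (decimal 40192) normally holds $9CD3 on a 48K system; that starting value is an assumption, not something stated in the letter.

```python
STOCK_POINTER = 0x9CD3    # assumed stock contents of $9D00-$9D01
SHIFT = 0x0E00            # room wanted for QUICKTRACE

new_pointer = STOCK_POINTER - SHIFT
low, high = new_pointer & 0xFF, new_pointer >> 8

print(hex(new_pointer), low, high)   # 0x8ed3 211 142
```

Under that assumption the low and high bytes come out as the 211 and 142 POKEd into 40192 and 40193.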

Generate Machine Code with Applesoft Bob Sander-Cederlof

Apparently nobody picked up my challenge at the end of the article about Charlie Putney's faster spiral screen clear program (August 1983 AAL, page 16). I suggested someone write a program in Applesoft which would in turn construct a machine language screen clear.

Nobody else did it, so I did. And whether you are interested in fancy ways to clear the screen or not, the techniques I used may be put to other uses.

The task of building a screen clear program can be divided into two parts. First, generate the memory addresses of the 960 cells on the screen, in the order (or path) that the spiral shift will follow. Second, using that table of addresses, generate the 959 pairs of LDA and STA instructions necessary to move the screen one position along the spiral path. There is really a third part: to generate the necessary prologue and postlogue instructions to make those 959 LDA-STA pairs be executed 960 times, and to clear the vacated byte at the tail end of the spiral path.

After trying various ways to understand the spiral path, I arrived at a table-driven approach. I put the table into data statements (lines 3000-3110 below), and made a simple loop to generate the 960 addresses (lines 100-150).

You might notice that the twelve lines of data correspond very closely to the parameters on Charlie Putney's macro calls. After I typed in the twelve lines, I noticed a definite pattern. I could have used only the first line of data, and computed the others by a simple algorithm: increment each value smaller than 13, and decrement each value 13 or larger. Well, no program is ever finished....
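
The pattern is easy to confirm. The following Python sketch (the names FIRST_ROW, next_row, and build_table are only illustrative) starts from the first DATA line and applies the increment/decrement rule eleven times:

```python
FIRST_ROW = [0, 23, 0,  0, 1, 39,  39, 1, 23,  23, 38, 1]   # DATA line 3000


def next_row(row):
    """Increment each value smaller than 13, decrement each value 13 or larger."""
    return [v + 1 if v < 13 else v - 1 for v in row]


def build_table(first_row, rows=12):
    """Return all twelve rows, one per ring of the spiral."""
    table = [list(first_row)]
    while len(table) < rows:
        table.append(next_row(table[-1]))
    return table


table = build_table(FIRST_ROW)
for row in table:
    print(row)
```

The last row printed is 11,12,11, 11,12,28, 28,12,12, 12,27,12, which matches DATA line 3110.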

Once the 960 addresses are stored in array A%(0) through A%(959), I proceed to generate machine language code. Line 180 does it all, with the help of four simple subroutines. Then line 190 rings the bell, and line 200 calls the machine language program just generated for a fast two-and-a-half second demonstration.

During the address array building process, I fill up the screen with the letters U, D, L, and R. These show the direction (up, down, left, and right) which a given character will be shifted along the spiral path. The directions are just the opposite from the order in which the letters are displayed, because I generate the address list backwards (from head to tail).
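
As a cross-check on the address list, here is a compact Python sketch of the same spiral. It is not a translation of the Applesoft program: it replaces the DATA table with the ring number k, and it computes each line's base address arithmetically (assuming text page 1 at $400) instead of using VTAB and PEEKs.

```python
def text_base(line):
    """Base address of text line 0..23 on page 1."""
    return 0x400 + (line & 7) * 0x80 + (line >> 3) * 0x28


def spiral_addresses():
    """Return the 960 screen addresses in the order of A%(0)..A%(959)."""
    addrs = []
    for k in range(12):                          # one ring per DATA line
        for y in range(23 - k, k - 1, -1):       # column k, bottom to top
            addrs.append(text_base(y) + k)
        for x in range(k + 1, 40 - k):           # row k, left to right
            addrs.append(text_base(k) + x)
        for y in range(k + 1, 24 - k):           # column 39-k, top to bottom
            addrs.append(text_base(y) + 39 - k)
        for x in range(38 - k, k, -1):           # row 23-k, right to left
            addrs.append(text_base(23 - k) + x)
    return addrs


addrs = spiral_addresses()
print(len(addrs), len(set(addrs)))         # 960 cells, no duplicates
print(hex(addrs[0]), hex(addrs[-1]))       # first and last cell
```

The list begins at $7D0 (bottom left corner) and ends at $634, which is decimal 1588, the cell that line 510 toggles while the code is being generated.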

During the generation of the machine language program, which takes about two minutes, I toggle the tail end character between normal and inverse video. This gives you something to watch for those lloooonnggg two minutes.

The generation process is broken into four parts, represented by four subroutines at 5000, 5100, 5200, and 5300.

GOSUB 5000 generates a four byte prologue, starting at memory address $2710, or 10000. The code looks like this:

       LDX /-960
       LDY #-960

Actually, not -960, but -960/S. S gives a step size. Sidestepping a little from the main discussion, let me tell you about S.

Don Lancaster called last week to talk about a few things with Bill, and passed on the results of his experiments with Charlie's program. He noted that the video refresh rate is 60 times per second, and that a 7.5 second screen clear moves a little more than two steps for each frame time. Therefore you don't really SEE each step. Therefore the screen clear routine could move each character two steps ahead at a time with the same smooth effect on the screen, but clearing the screen in half the time. Or three steps, clearing in one third the time. The variable S in my program lets you experiment with the number of steps each character moves during each pass. As listed, S=3, so the screen clears in 2.5 seconds.

GOSUB 5100 generates the requisite number of LDA-STA pairs to move the screen one step of size S along the spiral path.

GOSUB 5200 generates the instructions to clear the bytes at the tail end of the spiral. If S=3, you will get:

       LDA #$A0        BLANK
       STA $634
       STA $635
       STA $636

GOSUB 5300 generates the end-of-loop code:

       INY
       BNE LP
       INX
       BNE LP
       RTS
  LP   JMP 10004
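
The postlogue bytes in line 5350 (200,208,4,232,208,1,96,76,20,39) decode to INY, BNE, INX, BNE, RTS, and JMP $2714, so the X and Y registers together form a 16-bit counter that counts up to zero. A short Python model (count_passes is a made-up name) shows how many passes that gives for each step size, assuming S divides 960 evenly:

```python
def count_passes(s):
    """Count trips through the LDA-STA pairs for step size s."""
    t = 65536 - 960 // s             # as in line 5010
    x, y = t >> 8, t & 0xFF          # prologue: high byte in X, low byte in Y
    passes = 0
    while True:
        passes += 1                  # one pass over the generated pairs
        y = (y + 1) & 0xFF           # INY
        if y != 0:                   # BNE LP
            continue
        x = (x + 1) & 0xFF           # INX
        if x != 0:                   # BNE LP
            continue
        return passes                # RTS


for s in (1, 2, 3):
    print(s, count_passes(s))
```

With S=3 the loop runs 320 times, and each pass moves every character three cells, for 960 steps in all.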

The screen need not necessarily be cleared to all blanks. By changing the value POKEd in the second part of line 5210 you can fill with all stars, or all white, or whatever.

Another interesting option occurs to me. Given a table in the A% array of all the screen addresses, in any arrangement that suits my fancy, I can clear the screen in 2.5 to 7.5 seconds by shifting the screen along that particular path. It could be random, spiral, kaleidoscopic, or whatever.

There are so many other things I could explain about this little program, I hardly know where to stop. I think I'll stop here, and leave the rest for your own rewarding investigation and analysis.

     100  TEXT : HOME : DIM A%(1000)
     105 N = 0
     110  READ X,YB,YT: GOSUB 1000
     120  READ Y,XL,XR: GOSUB 1200
     130  READ X,YT,YB: GOSUB 1100
     140  READ Y,XR,XL: GOSUB 1300
     150  IF N < 960 THEN 110
     160  REM  
     170 S = 3
     180  GOSUB 5000: GOSUB 5100: GOSUB 5200: GOSUB 5300
     190  CALL  - 1054: REM RING BELL
     200  CALL 10000: END 
     500  REM  
                    POKE ADDRESS
     510 AH =  INT (A / 256):AL = A - AH * 256: POKE L + 1,AL: POKE L + 
          2,AH:L = L + 3: POKE 1588,256 -  PEEK (1588): RETURN 
     1000  REM  
                    MOVE DOWN COLUMN X FROM YB TO YT
     1010 C$ = "D": FOR Y = YB TO YT STEP  - 1
     1020  VTAB Y + 1: GOSUB 2000: NEXT : RETURN 
     1100  REM  
                    MOVE UP COLUMN X FROM YT TO YB
     1110 C$ = "U": FOR Y = YT TO YB: VTAB Y + 1: GOSUB 2000: NEXT : RETURN 
     1200  REM  
                    MOVE LEFT ROW Y FROM XL TO XR
     1210 C$ = "L": VTAB Y + 1: FOR X = XL TO XR: GOSUB 2000: NEXT : RETURN 
     1300  REM  
                    MOVE RIGHT ROW Y FROM XR TO XL
     1310 C$ = "R": VTAB Y + 1: FOR X = XR TO XL STEP  - 1: GOSUB 2000:
            NEXT : RETURN 
     2000  REM  
                    POST ADDRESS
     2010 A =  PEEK (40) +  PEEK (41) * 256 + X:A%(N) = A:N = N + 1: POKE 
           A, ASC (C$) + 128
     2020  RETURN 
     3000  DATA  0,23,0,   0,1,39,   39,1,23,  23,38,1
     3010  DATA  1,22,1,   1,2,38,   38,2,22,  22,37,2
     3020  DATA  2,21,2,   2,3,37,   37,3,21,  21,36,3
     3030  DATA  3,20,3,   3,4,36,   36,4,20,  20,35,4
     3040  DATA  4,19,4,   4,5,35,   35,5,19,  19,34,5
     3050  DATA  5,18,5,   5,6,34,   34,6,18,  18,33,6
     3060  DATA  6,17,6,   6,7,33,   33,7,17,  17,32,7
     3070  DATA  7,16,7,   7,8,32,   32,8,16,  16,31,8
     3080  DATA  8,15,8,   8,9,31,   31,9,15,  15,30,9
     3090  DATA  9,14,9,   9,10,30,  30,10,14, 14,29,10
     3100  DATA 10,13,10, 10,11,29,  29,11,13, 13,28,11
     3110  DATA 11,12,11, 11,12,28,  28,12,12, 12,27,12
     5000  REM  
                    COMPILE PROLOGUE
     5010 T = 65536 - 960 / S:TH =  INT (T / 256):TL = T - TH * 256
     5020  POKE 10000,162: POKE 10001,TH
     5030  POKE 10002,160: POKE 10003,TL
     5040  RETURN 
     5100  REM  
                    COMPILE LDA-STA PAIRS
     5110 L = 10004: FOR I = 0 TO 959 - S: POKE L,173:A = A%(I + S): GOSUB 500
     5120  POKE L,141:A = A%(I): GOSUB 500: NEXT 
     5130  RETURN 
     5200  REM  
                    COMPILE CLEAR S BYTES 
     5210  POKE L,169: POKE L + 1,160:L = L + 2
     5220  FOR I = 1 TO S: POKE L,141:A = A%(960 - I): GOSUB 500: NEXT 
     5230  RETURN 
     5300  REM  
                    COMPILE POSTLOGUE
     5310  FOR I = 0 TO 9: READ A: POKE L + I,A: NEXT 
     5320  RETURN 
     5350  DATA 200,208,4,232,208,1,96,76,20,39

Amper-Monitor Bob Sander-Cederlof

It would be nice to be able to use monitor commands from within Applesoft, both in direct commands and within running Applesoft programs. At least Kraig Arnett, from Homestead, Florida, thinks so.

I agree, and so I whipped out another handy-dandy &-subroutine for just that purpose. I call it Amper-Monitor. You can install it by BRUNning it from a binary file, or by adding some POKEs to your Applesoft program. My listing shows it residing at the ever popular $300 address, but it can be reassembled to run anywhere. Just remember to connect it properly to the Ampersand Vector.

Once Amper-Monitor is installed and hooked to the ampersand vector, you call it by typing an ampersand, a quotation mark, and a monitor command. Here is a sample program showing some uses of the Amper-Monitor.

    100 FOR I = 768 TO 855
    110 READ D : POKE I,D : NEXT
    120 CALL 768

    130 &"300.357
    140 &"380:12 34 56 78 9A BC DE F0
    150 &"FBE2G
    160 &"300L 380.387

    200 DATA 169,11,141,246,3,169,3,141,247,3,96
    210 DATA 201,34,208,70,32,177,0,160,0,177,184,201,0
    220 DATA 240,8,9,128,153,0,2,200,208,242,169,141
    230 DATA 153,0,2,152,24,101,184,133,184,144,2,230
    240 DATA 185,32,199,255,32,167,255,132,52,160,23
    250 DATA 136,48,23,217,204,255,208,248,192,21,240
    260 DATA 8,32,190,255,164,52,76,52,3,32,197,255
    270 DATA 76,0,254,76,201,222

Why did I choose to require the quotation mark after the ampersand? Because normally Applesoft would parse the line, eliminating blanks, changing DEF to a token instead of three hex digits, using ":" to end a line, and so on. Using the "-mark prevents all this, leaving the line in raw ASCII form. Here is a listing of the program in assembly language:

  1010 *--------------------------------
  1020 *      &-MONITOR COMMANDS
  1030 *--------------------------------
  1040 MON.MODE         .EQ $31
  1050 MON.YSAV         .EQ $34
  1060 TXTPTR           .EQ $B8 AND B9
  1070 MON.BUFFER       .EQ $200
  1080 AMPERSAND.VECTOR .EQ $3F5
  1090 *--------------------------------
  1100 AS.CHRGET        .EQ $00B1
  1110 AS.SYNERR        .EQ $DEC9
  1120 MON.BL1          .EQ $FE00
  1130 MON.GETNUM       .EQ $FFA7
  1140 MON.TOSUB        .EQ $FFBE
  1150 MON.ZMODE        .EQ $FFC7
  1160 MON.CHRTBL       .EQ $FFCC
  1170 *--------------------------------
  1180        .OR $300
  1190 *--------------------------------
  1200        LDA #AMPER.MONITOR
  1210        STA AMPERSAND.VECTOR+1
  1220        LDA /AMPER.MONITOR
  1230        STA AMPERSAND.VECTOR+2
  1240        RTS
  1250 *--------------------------------
  1260 AMPER.MONITOR
  1270        CMP #$22     MUST BE QUOTATION MARK HERE
  1280        BNE .6       SYNTAX ERROR
  1290        JSR AS.CHRGET
  1300        LDY #0
  1310 .1     LDA (TXTPTR),Y
  1320        BEQ .2
  1330        ORA #$80
  1340        STA MON.BUFFER,Y
  1350        INY
  1360        BNE .1
  1370 .2     LDA #$8D
  1380        STA MON.BUFFER,Y
  1390        TYA
  1400        CLC
  1410        ADC TXTPTR
  1420        STA TXTPTR
  1430        BCC .25
  1440        INC TXTPTR+1
  1450 .25    JSR MON.ZMODE
  1460 .3     JSR MON.GETNUM
  1470        STY MON.YSAV
  1480        LDY #23
  1490 .4     DEY
  1500        BMI .6       SYNTAX ERROR
  1510        CMP MON.CHRTBL,Y
  1520        BNE .4       NOT THIS ENTRY
  1530        CPY #21
  1540        BEQ .5       <RETURN> ALONE
  1550        JSR MON.TOSUB
  1560        LDY MON.YSAV
  1570        JMP .3
  1580 .5     JSR MON.ZMODE-2
  1590        JMP MON.BL1
  1600 .6     JMP AS.SYNERR

Lines 1200-1240 link in the ampersand vector. This is the only part that would have to be changed if you move the routine.

When Applesoft sees an "&", it will JSR to AMPER.MONITOR. The A-register will hold the character following the "&", which we hope is a quotation mark. Lines 1270 and 1280 do this hoping.

Lines 1290-1380 copy the characters following the quotation mark into the monitor buffer starting at $200. If you typed in the &"... as a direct command, it is already in the monitor buffer but starts at $202, so it gets shifted over two bytes. If the command is in a program, it will be copied out of program space into $200. Applesoft has stripped off the sign bit from every byte, so my loop adds the sign bit back in to satisfy the monitor's requirements. Applesoft ends the line with a $00 byte, and the monitor wants $8D, so I fix that up too. I don't let colon terminate the line, because colon is a valid character in a monitor command line. I use "LDA (TXTPTR),Y" rather than repeated calls to AS.CHRGET because AS.CHRGET would eliminate blanks.
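
The effect of the copy loop can be sketched in a few lines of Python. The function name to_monitor_buffer is hypothetical, and the sketch models only the byte transformation, not the buffer at $200 or the Applesoft text pointer:

```python
def to_monitor_buffer(line):
    """Convert a zero-terminated Applesoft line to monitor input form."""
    out = bytearray()
    for b in line:
        if b == 0x00:          # Applesoft's end-of-line marker
            break
        out.append(b | 0x80)   # put the sign bit back (ORA #$80)
    out.append(0x8D)           # the monitor wants a carriage return
    return bytes(out)


raw = b"300.357\x00"
print(to_monitor_buffer(raw).hex())
```

A colon or a blank inside the command passes through like any other character, which is the reason for reading the bytes directly.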

Lines 1390-1440 adjust the Applesoft pointer to the end of the line, so upon returning we won't get false syntax errors and the Applesoft program can continue executing.

Lines 1450-1590 parse the command line one command at a time, call on the monitor to execute each command, and finally return to Applesoft after the last command on the line. (The idea for this code came originally from code Steve Wozniak wrote for the mini-assembler in the old Apple monitor ROM.) Note that an illegal monitor command will result in a syntax error.

I thought it would now be possible to use the Amper-Monitor to write hex dumps on text files...BUT: Unfortunately DOS uses some critical zero page locations which prevent using the Amper-Monitor while writing on a text file. Monitor commands use locations $3D through $42, and so does DOS. I tried using the &"300.357 to do a hex dump into a text file, but DOS went wild and clobbered itself. Sorry, but I see no solution without changing DOS or recoding the entire monitor.

Yet Another New Version of DOS 3.3 Bob Sander-Cederlof

In the July issue of AAL I outlined the changes Apple made to DOS 3.3 early this year. Today I received a new "Developer's System Master", with a cover letter claiming another correction to the APPEND routine. The letter binds developers to begin using the new version no later than November 1st.

If you like APPEND, or would like to like it, you might want to make these patches in your own system master. I am going to assume you already have the "early 1983" version, either because you bought a //e or a disk drive this year, or you copied one from a friend, or you made the patches from my July article. Here are the new changes:

"early 1983"                August, 1983
---------------------      -----------------------
B683:4C 84 BA JMP $BA84    B683:4C B3 B6 JMP $B6B3

                           B6B6:8D E6 B5 STA $B5E6
                           B6B9:8D EA B5 STA $B5EA
                           B6BC:AD BE B5 LDA $B5BE
                           B6BF:8D E7 B5 STA $B5E7
                           B6C2:8D EB B5 STA $B5EB
                           B6C5:8D E4 B5 STA $B5E4
                           B6C8:BA       TSX
                           B6C9:8E 9B B3 STX $B39B
                           B6CC:4C 7F B3 JMP $B37F

$BA84-BA93:PATCH           BA84-BA93:ALL ZEROES

What Apple has done is move the patch they had put at $BA84 down to $B6B3 and added four extra lines to that patch. I HOPE IT IS NOW CORRECT!

Base Address Calculation Bob Sander-Cederlof

I believe that Steve Wozniak was the first to use the following tricks in a microcomputer, back in 1976 and 1977. All of the other designs I recall either used the more expensive static RAM, or used a complex circuit to refresh dynamic RAM arrays. Steve's design allowed the use of dynamic RAM without any separate circuitry for refresh.

Dynamic RAM needs refreshing because each bit cell is really only a capacitor, and the charge runs out after a few milliseconds. By reading each bit and re-writing it every few milliseconds, the data in memory is maintained as long as you like. Each 16384-bit RAM chip is organized as 128 rows by 128 columns of bits, and the chips are designed so that merely addressing each row often enough will keep the bits fresh as a daisy. Steve hooked up the Apple so that the process of keeping data displayed on the screen also ran through all the row addresses.

His second trick was to keep the screen (and therefore the RAM) happy without stealing any time from the CPU. He did this by using alternate half cycles of the clock. The one-megahertz clock runs the 6502 every other half cycle, and the screen gets its whacks at memory in between.

What has all the above to do with an article titled "Base Address Calculation"? Well, I'm getting to that. In order to address each row often enough, Steve re-arranged the address bits in a rather complicated way. As the screen is refreshed, scan-line by scan-line, bytes are read from RAM in an order that assures every RAM row is accessed about every 2 milliseconds. [ For the exact details of this process, see Winston Gayler's "Apple II Circuit Description", pages 41-57. ]

All this boils down to a need to go through a complicated calculation to convert a display line number into a base address in RAM. The process is implemented for the text screen at $FBC1 in the monitor ROM; for the lo-res graphics screen at $F847 in the monitor ROM; for the hi-res graphics screen at $F417 in the Applesoft ROM.

If we represent the 8-bit value for the line number on the text screen as "000abcde", the base calculation computes the address in RAM for the first character on that line and stores the result in two bytes at $28 and $29 in the form "000001cd eabab000". The two bits "ab" may have values "00", "01", or "10" for lines 0-7, 8-15, and 16-23 respectively. The "abab000" part of the least significant byte of the base address represents "ab" times 40. Remember there are 40 characters on a line?
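
The bit pattern can be tried out in a few lines of Python. This is only a model of the rearrangement described above (the function name text_base is an arbitrary choice), not a transcription of the monitor routine:

```python
def text_base(line):
    """Base address of text line 0..23, built bit by bit."""
    ab = line >> 3                  # which group of eight lines (0, 1, 2)
    cde = line & 0x07               # line within the group
    high = 0x04 | (cde >> 1)                         # 000001cd
    low = ((cde & 1) << 7) | (ab << 5) | (ab << 3)   # eabab000
    return (high << 8) | low


for line in (0, 1, 8, 16, 23):
    print(line, hex(text_base(line)))
```

Since (ab << 5) | (ab << 3) equals ab times 40 for ab = 0, 1, or 2, the result is the same as $400 + cde*$80 + ab*40.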

The hi-res base address calculation is more complicated, but it is really the same thing. If we think of a text line as being made up of 8 hi-res lines, both calculations ARE the same, except that the lo-res RAM starts at $400 and the hi-res starts at $2000. A hi-res line number runs from 0 through 191, or $00 - $BF. If we visualize it as "abcdefgh", the base address calculation merely re-arranges the bits to "001fghcd eabab000". Note that if we multiply the text line number by 8 and run it through the hi-res calculation we will get "001000cd eabab000" which is correct except for starting at $2000 rather than $400.
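
Here is the hi-res rearrangement as a Python sketch, again only a model of the bit pattern and not of any ROM code. The page parameter is an assumption standing in for the high-byte bits that select $2000 ($20) or $4000 ($40).

```python
def hires_base(y, page=0x20):
    """Base address of hi-res line 0..191 for the given page byte."""
    ab = y >> 6                     # which third of the screen
    cde = (y >> 3) & 0x07           # text-sized row within that third
    fgh = y & 0x07                  # scan line within the row
    high = page | (fgh << 2) | (cde >> 1)            # 001fghcd
    low = ((cde & 1) << 7) | (ab << 5) | (ab << 3)   # eabab000
    return (high << 8) | low


for y in (0, 1, 8, 64, 191):
    print(y, hex(hires_base(y)))
```

Feeding it a text line number times 8 gives the text base address with $2000 in place of $400, as the paragraph above points out.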

The hi-res calculation inside Applesoft takes 33 bytes and 61 cycles. Harry Cheung, who lives in Onitsha, Nigeria, wrote a letter to Call APPLE (page 70, July, 1983) to present his shorter, faster version. Harry did it in 25 bytes and only 49 cycles (one more byte and 6 more cycles if you count the RTS, but I didn't count an RTS in the Applesoft version). Here is Harry's code, with my comments.

  1200 *--------------------------------
  1220 *      HARRY CHEUNG
  1230 *      PMB 1601, ONITSHA, NIGERIA
  1240 *      CALL APPLE, JULY 1983, PAGE 70
  1250 *--------------------------------
  1260 CALC   TAY          (TAY..TYA COULD BE PHA..PLA)
  1270        AND #$C7     ABCDEFGH
  1280        STA 0        AB000FGH
  1290        ORA #$08     FOR BASE = $2000, $10 FOR $4000
  1300        STA 1        AB001FGH
  1310        TYA          ABCDEFGH
  1320 *                CARRY..A-REG......$00.......$01...
  1330        ASL          A--BCDEFGH0  AB000FGH  AB001FGH
  1340        ASL          B--CDEFGH00     "         "
  1350        ROR 0        H--   "      BAB000FG     "
  1360        ASL          C--DEFGH000     "         "
  1370        ROL 1        A--   "         "      B001FGHC
  1380        ROR 0        G--   "      ABAB000F     "
  1390        ASL          D--EFGH0000
  1400        ROL 1        B--   "         "      001FGHCD
  1410        ASL          E--FGH00000     "         "
  1420        ROR 0        F--   "      EABAB000  001FGHCD
  1430        RTS

I need to point out several things here. Harry used page zero locations $00 and $01 for the resulting base address. If you want to use his program with Applesoft, change them to $26 and $27. Harry saved the line number temporarily in the Y-register. If the Y-register is already holding something important (it is in the Applesoft case), you can substitute PHA and PLA for the TAY and TYA above. Same number of bytes, but 3 cycles longer.
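
Harry's shifts and rotates can be followed step by step in Python. The sketch below mimics the listing with an explicit carry bit; the helper names are invented for the example, and reference is the arithmetic form of the hi-res base address for page 1.

```python
def asl(v):
    """ASL: shift left, bit 7 goes to carry."""
    return (v << 1) & 0xFF, v >> 7


def rol(v, c):
    """ROL: rotate left through carry."""
    return ((v << 1) & 0xFF) | c, v >> 7


def ror(v, c):
    """ROR: rotate right through carry."""
    return (v >> 1) | (c << 7), v & 1


def harry_calc(y):
    """Follow the CALC listing; returns the 16-bit base address."""
    z0 = y & 0xC7            # AND #$C7 / STA 0
    z1 = z0 | 0x08           # ORA #$08 / STA 1
    a = y                    # TYA
    a, c = asl(a)
    a, c = asl(a)
    z0, c = ror(z0, c)
    a, c = asl(a)
    z1, c = rol(z1, c)
    z0, c = ror(z0, c)
    a, c = asl(a)
    z1, c = rol(z1, c)
    a, c = asl(a)
    z0, c = ror(z0, c)
    return (z1 << 8) | z0


def reference(y):
    return 0x2000 + (y & 7) * 0x400 + ((y >> 3) & 7) * 0x80 + (y >> 6) * 0x28


print(all(harry_calc(y) == reference(y) for y in range(192)))
```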

If you want REAL speed, and can spare a few more bytes, you need to pre-compute all the base addresses and store them in a table. Then you can use the line number as an index into the table and do a base address TRANSLATION in just a few cycles. For example, assume you store all the low-order bytes in a 192-byte table called LO.BASE, and similarly the high-order bytes at HI.BASE. If you get the line number in the Y-register, then you can convert the line number to a base address like this:

       LDA LO.BASE,Y
       STA $26
       LDA HI.BASE,Y
       STA $27

That takes 10 bytes of program, 384 bytes of table, and only 14 to 16 cycles. I say 14 to 16, because it depends on whether either or both of the two tables cross page boundaries. If they each are entirely within a memory page, 14 cycles.
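
The two tables could be produced by any program that knows the base address formula. As one possibility, this Python sketch builds them for hi-res page 1; LO_BASE and HI_BASE mirror the labels above, while hires_base and lookup are names made up for the example.

```python
def hires_base(y):
    """Base address of hi-res line y (0..191) on page 1."""
    return 0x2000 + (y & 7) * 0x400 + ((y >> 3) & 7) * 0x80 + (y >> 6) * 0x28


LO_BASE = bytes(hires_base(y) & 0xFF for y in range(192))   # low-order bytes
HI_BASE = bytes(hires_base(y) >> 8 for y in range(192))     # high-order bytes


def lookup(y):
    """What the four-instruction translation leaves in $26 and $27."""
    return (HI_BASE[y] << 8) | LO_BASE[y]


print(len(LO_BASE) + len(HI_BASE))     # total table size in bytes
print(hex(lookup(0)), hex(lookup(191)))
```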

Now here is a little piece of code I wrote to test out Harry's calculator. It runs through each of the 192 lines and prints out the line number, an equal sign, the base address, and a space for each line (all in hex).

  1010 *--------------------------------
  1040 *--------------------------------
  1050 TEST   LDX #0
  1060 .1     TXA
  1070        JSR CALC
  1080        TXA
  1090        JSR $FDDA
  1100        LDA 1
  1110        JSR $FDD3
  1120        LDA 0
  1130        JSR $FDDA
  1140        LDA #$A0
  1150        JSR $FDED
  1160        INX
  1170        CPX #192
  1180        BCC .1
  1190        RTS

The monitor address $FDD3 is not a labelled entry point, but I think it will probably stay consistent in future editions of the Apple ROMs. It saves whatever is in the A-register, prints "=", restores the A-register, and falls into $FDDA. The routine at $FDDA prints the contents of A in hex.

Just for fun I also wrote some new versions of the text base address calculator. One of them is shorter but takes more time, and the other is longer but takes less time. Oh well, can't win every race! Here are listings of them both, followed by a commented listing of the Applesoft hi-res calculator.

  1440 *--------------------------------
  1450 LRCALC.1
  1460        PHA
  1470        AND #$18     000DE000
  1480        ASL          00DE0000
  1490        STA 0
  1500        ASL          0DE00000
  1510        ASL          DE000000
  1520        ORA 0        DEDE0000
  1530        STA 0
  1540        PLA          000DEFGH
  1550        LSR          0000DEFG
  1560        ROR 0        HDEDE000
  1570        AND #$03     000000FG
  1580        ORA #$04     000001FG  (FOR PAGE 1)
  1590        STA 1
  1600        RTS
  1610 *--------------------------------
  1620 LRCALC.2
  1630        PHA
  1640        AND #$18     000DE000
  1650        BEQ .1
  1660        CMP #$10
  1670        LDA #$A0
  1680        BCS .1
  1690        LSR
  1700 .1     STA 0        DEDE0000
  1710        PLA          000DEFGH
  1720        LSR          0000DEFG
  1730        ROR 0        HDEDE000
  1740        AND #$03     000000FG
  1750        ORA #$04     000001FG  (FOR PAGE 1)
  1760        STA 1
  1770        RTS
  1780 *--------------------------------
  1790 *      FROM APPLESOFT ROM AT $F417-$F437
  1800 *--------------------------------
  1810 MON.GBASL  .EQ $26
  1820 MON.GBASH  .EQ $27
  1830 HGR.PAGE   .EQ $E6
  1840 AS.HRCALC
  1850        PHA          Y-POS ALSO ON STACK
  1860        AND #$C0          AB000000
  1870        STA MON.GBASL     FOR Y=ABCDEFGH
  1880        LSR               GBASL=ABAB0000
  1890        LSR
  1900        ORA MON.GBASL
  1910        STA MON.GBASL
  1920        PLA           (C)   (A)     (GBASH)   (GBASL)
  1930        STA MON.GBASH  0-ABCDEFGH  ABCDEFGH  ABAB0000
  1940        ASL            A-BCDEFGH0  ABCDEFGH  ABAB0000
  1950        ASL            B-CDEFGH00  ABCDEFGH  ABAB0000
  1960        ASL            C-DEFGH000  ABCDEFGH  ABAB0000
  1970        ROL MON.GBASH  A-DEFGH000  BCDEFGHC  ABAB0000
  1980        ASL            D-EFGH0000  BCDEFGHC  ABAB0000
  1990        ROL MON.GBASH  B-EFGH0000  CDEFGHCD  ABAB0000
  2000        ASL            E-FGH00000  CDEFGHCD  ABAB0000
  2010        ROR MON.GBASL  0-FGH00000  CDEFGHCD  EABAB000
  2020        LDA MON.GBASH  0-CDEFGHCD  CDEFGHCD  EABAB000
  2030        AND #$1F       0-000FGHCD  CDEFGHCD  EABAB000
  2040        ORA HGR.PAGE   0-001FGHCD  CDEFGHCD  EABAB000  (PAGE 1)
  2050        STA MON.GBASH  0-001FGHCD  001FGHCD  EABAB000
  2060 *--------------------------------
  2070        RTS
  2080 *--------------------------------
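
The bit pictures in the hi-res listing can be followed step by step in a short Python model. This is only a sketch: the helper names are invented, and the `page` argument stands in for HGR.PAGE ($20 for page 1, $40 for page 2), which I am assuming is ORed into the high byte at the end. The masking and load steps follow what the register-state comments imply.

```python
def asl(v):
    """ASL: shift left, return (result, carry out)."""
    return (v << 1) & 0xFF, v >> 7


def rol(v, c):
    """ROL: rotate left through carry."""
    return ((v << 1) | c) & 0xFF, v >> 7


def ror(v, c):
    """ROR: rotate right through carry."""
    return (v >> 1) | (c << 7), v & 1


def hires_base(y, page=0x20):
    """Model of the base address calculation for hi-res row y (0..191)."""
    gbasl = y & 0xC0                # AB000000
    gbasl |= gbasl >> 2             # ABAB0000
    gbash = y                       # ABCDEFGH
    a = y
    for _ in range(3):              # three ASLs: carry ends up holding bit C
        a, c = asl(a)
    gbash, c = rol(gbash, c)        # BCDEFGHC
    a, c = asl(a)                   # carry = D
    gbash, c = rol(gbash, c)        # CDEFGHCD
    a, c = asl(a)                   # carry = E
    gbasl, c = ror(gbasl, c)        # EABAB000
    gbash = (gbash & 0x1F) | page   # 000FGHCD plus the page bits
    return (gbash << 8) | gbasl


for y in (0, 1, 8, 64, 191):
    print("Y=%3d  base=$%04X" % (y, hires_base(y)))
```

Rows 0, 1, 8 and 64 exercise each of the three bit groups in turn, which makes it easy to see the $400, $80 and $28 strides in the result.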

By the way, if you want to see the WHOLE thing, a commented listing of the entire Applesoft ROM, we have it on disk in S-C Macro Assembler format.

Saving Source with Apple's Mini-Assembler Jim Church
Trumbull, CT

I have discovered a way to store source code, complete with comments, on disk files for the Apple mini-assembler (at $F666 in the Integer BASIC ROM or Language Card load). I use what I call "the world's best word processor", the one you get from S-C Software for $50. I create a text file that looks like this:

     300:LDX #C0 ;START WITH "A"-1
      INX        ;LOOP COMES HERE
      TXA        ;CHAR TO PRINT
      JSR FDED   ;PRINT IT
      CPX #DA    ;STOP AFTER "Z"
      BCC 302    ;NOT THERE YET
      RTS        ;FINISHED!

Assuming I have Integer BASIC in my RAM card, EXECing the above text file assembles the code very nicely and even runs the program once! Note that the Mini-Assembler does allow comments following a ";".

Generic Screen Dump Steve Knouse
Tomball, TX

Some computer terminals have a special key on the keyboard which will dump whatever is on the screen to a printer. The following program will give the same function to an Apple, using the ctrl-P key.

Many different versions of screen dump programs have been written, and published hither and yon. Most of them work with the particular author's printer and interface combination, but not mine or yours. I found the one Bob S-C published in the July 81 issue of AAL to be like that, so I worked it over. Now I believe it can truly be called "generic", or at least general, because it runs on every combination of printers and interfaces I can find.

I tested it on systems using the following interfaces:

The screen dump should work with any interface which recognizes the Apple standard method for turning off video output. The standard is to "print" a control-I followed by an "N". Lines 2190 through 2250 perform the output of these two characters.

The only board I found which did not work with this convention was the SSM AIO board, so the program which follows has a special conditional assembly mode to make it assemble slightly different object code for that board. If you have that board, change line 1610 to say "VERSION .EQ AIO" and it will assemble your version. Instead of lines 2190 through 2250 being assembled, lines 2260 through 2310 will be. They do not show up in the listing, so here they are:

       2260    .DO VERSION=AIO
       2270    LDA #$80
       2280    JSR COUT
       2290    LDX SLOT
       2300    STA NOVID,X
       2310    .FIN

If your assembler does not support conditional assembly, you can simply type in lines 2270-2300 above in place of lines 2190-2310.

If your printer interface is not plugged into slot 1, change the slot number in line 2030, or at $0319.

Install the program by BRUNning the binary file of the object code, or by BLOADing it and doing a CALL 768. Then whenever you type control-P, the screen will be printed. You can also call the screen dump from a running Applesoft program with CALL 794.
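
The addresses quoted above ($0319 for the slot byte, 768 and 794 for the two CALLs) can be checked by adding up instruction sizes from the origin at $300. The sketch below does that in Python. Two entries in the table are assumptions, since the listing as shown does not include the first instruction of the installer or the one at ENTRY: I have assumed a two-byte immediate load and a three-byte JSR respectively.

```python
# Byte sizes of everything assembled ahead of SLOT, in listing order.
SIZES = [
    ("LDA #ENTRY", 2),     # assumed: not shown in the listing
    ("STA KSWL", 2),
    ("LDA /ENTRY", 2),
    ("STA KSWH", 2),
    ("JMP DOS.HOOK", 3),
    ("JSR KEYIN", 3),      # assumed: this would be the ENTRY line
    ("CMP #$90", 2),
    ("BNE .1", 2),
    ("JSR DUMP", 3),
    ("JMP RDKEY", 3),
    ("RTS", 1),
]

ORIGIN = 0x300                                   # from .OR $300
slot = ORIGIN + sum(size for _, size in SIZES)   # address of the SLOT byte
dump = slot + 1                                  # SLOT is a single data byte

print("install: CALL %d" % ORIGIN)
print("SLOT at $%04X" % slot)
print("DUMP:    CALL %d" % dump)
```

With those assumptions the totals agree with the figures in the text, which suggests the guessed sizes are at least consistent with the published object code.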

  1010 *--------------------------------
  1020 *
  1040 * 
  1560 *--------------------------------
  1580 GENERIC    .EQ 1
  1590 AIO        .EQ 2
  1630 CH         .EQ $24
  1640 BASL       .EQ $28
  1650 CSWL       .EQ $36 
  1660 CSWH       .EQ CSWL+1
  1670 KSWL       .EQ $38
  1680 KSWH       .EQ KSWL+1
  1700 DOS.HOOK   .EQ $3EA
  1720 BASCALC    .EQ $FBC1
  1730 COUT       .EQ $FDED
  1740 KEYIN      .EQ $FD1B
  1750 RDKEY      .EQ $FD0C
  1760 OUTPORT    .EQ $FE95  
  1770 VTAB       .EQ $FC22
  1790 CR         .EQ $8D      CARRIAGE RETURN
  1800 NOVID      .EQ $578
  1810 *--------------------------------
  1820        .OR $300
  1890 *--------------------------------
  1910        STA KSWL
  1920        LDA /ENTRY
  1930        STA KSWH
  1940        JMP DOS.HOOK
  1950 *--------------------------------
  1970        CMP #$90     ^P ?
  1980        BNE .1       NO
  1990        JSR DUMP     YES
  2000        JMP RDKEY    
  2010 .1     RTS
  2020 *--------------------------------
  2030 SLOT   .DA #1
  2040 *--------------------------------
  2050 DUMP   PHA          SAVE A, X, Y
  2060        TXA 
  2070        PHA
  2080        TYA
  2090        PHA
  2100        LDA CH       SAVE CH
  2110        PHA
  2120        LDA CSWL     SAVE OUTPUT HOOKS
  2130        PHA
  2140        LDA CSWH
  2150        PHA  
  2160 *
  2170        LDA SLOT     COLD START BOARD
  2180        JSR OUTPORT    IN SLOT 1
  2190        .DO VERSION=GENERIC
  2200        LDA #$89     KILL VIDEO ECHO
  2210        JSR COUT
  2220        LDA #"N"     
  2230        JSR COUT
  2240        NOP          PAD TO STAY ALIGNED W/ AIO VERSION
  2250        .FIN
  2260        .DO VERSION=AIO
  2270        LDA #$80     KILL VIDEO ECHO
  2280        JSR COUT
  2290        LDX SLOT
  2300        STA NOVID,X
  2310        .FIN
  2320 *
  2330        LDA #CR      START ON A NEW LINE
  2340        JSR COUT
  2350 *
  2360        LDX #0       START W/ 1ST LINE (0TH)
  2390 .1     TXA          LINE LOOP
  2410        LDY #0       START W/ 1ST CHARACTER (0TH)
  2420 .2     LDA (BASL),Y GET A CHAR
  2440        BCS .4         NON-FLASHING U.C.
  2450        ADC #$40
  2460        BNE .3       ..ALWAYS
  2470 .4     AND #$7F     MASK OFF HI BIT TO AVOID
  2480 *                      EPSON BLOCK GRAPHICS
  2490        JSR COUT     PRINT IT
  2500        INY          LOOP FOR ANOTHER CHAR
  2510        CPY #40
  2520        BCC .2
  2530        LDA #CR      END OF LINE 
  2540        JSR COUT      
  2550        INX          LOOP FOR ANOTHER LINE
  2560        CPX #24
  2570        BCC .1
  2590        PLA          RESTORE OUTPUT HOOKS
  2600        STA CSWH
  2610        PLA
  2620        STA CSWL
  2630        PLA          RESTORE CH
  2640        STA CH
  2650        JSR VTAB       AND LINE
  2660        PLA          RESTORE Y, X, A
  2670        TAY
  2680        PLA
  2690        TAX
  2700        PLA 
  2710        RTS          ..THAT'S ALL FOLKS
  2720 *

New CATALOG Interrupt Col. Paul L. Shetler
Tripler AMC, Hawaii

Most of the routines I've seen to terminate a CATALOG listing involve patching in a routine that checks for a particular key input and adding code to do different actions, like aborting or single-stepping the catalog list. Here is a modification I came up with that requires only a small change and no additional code.

This is the section of DOS that handles a new line in the CATALOG display:

               1000        .OR $AE2C
AE2C- 4C 7F B3 1020        JMP $B37F    leave File Manager
AE2F- A9 8D    1030 NEWLN  LDA #$8D     carriage return
AE31- 20 ED FD 1040        JSR $FDED    MON.COUT
AE34- CE 9D B3 1050        DEC $B39D    line count
AE37- D0 08    1060        BNE .1
AE39- 20 0C FD 1070        JSR $FD0C    MON.RDKEY
AE3C- A9 15    1080        LDA #$15     count 21 lines
AE3E- 8D 9D B3 1090        STA $B39D    reset line count
AE41- 60       1100 .1     RTS

Line 1020 is really the end of the previous routine, but we're going to be borrowing it, so I'll show it here. NEWLN is called every time the catalog list finishes a file name.

Notice that two bytes are wasted in lines 1030-1040. Why do LDA #$8D, JSR $FDED, when JSR $FD8E does the same thing? Two bytes may not sound like much, but in this case it's enough to work some magic! Try replacing the above piece of DOS with this:

               1000        .OR $AE2C
AE2C- 4C 7F B3 1020 EXIT   JMP $B37F    leave File Manager
AE2F- 20 8E FD 1030 NEWLN  JSR $FD8E    MON.CROUT
AE32- CE 9D B3 1040        DEC $B39D    line count
AE35- D0 0A    1050        BNE .1       return if not done
AE37- 20 0C FD 1060        JSR $FD0C    get a keypress
AE3A- 29 17    1070        AND #$17     the magic number
AE3C- F0 EE    1080        BEQ EXIT     abort CATALOG
AE3E- 8D 9D B3 1090        STA $B39D    new line count
AE41- 60       1100 .1     RTS

Slipping in that AND #$17, BEQ EXIT, has several effects:

1. Space Bar or Back Arrow will terminate the listing.
2. Forward Arrow will advance the listing one page (just like normal).
3. The "A" key will advance the listing one line.

And it all fits into the original space! The other keys will have different effects, depending on the value left in the accumulator after AND #$17. Most keys will advance the listing by between 1 and 23 lines.

Try substituting other values for the $17 in line 1070. Remember that the value of (Keypress AND Value) will be the new line count. The catalog display will scroll up by that number of lines. If the result is zero, the catalog display will end. The maximum result is the same as the mask value, that is, 23 lines for a $17 mask.

[ My favorite mask value is $4F. With that value SPACE still breaks the display, but now the numeral keys scroll up by the same number of lines, i.e., pressing the "1" key gives one more line, "2" shows two more names, and so on. Also, the "O" (oh, not zero) key scrolls up by 79 lines, which usually means all the way to the end of the catalog....Bill ]
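
To see what a given mask will do before patching DOS, the AND can be tried out on paper or in a few lines of Python. This is just an illustration of the arithmetic described above; the function and constant names are invented. It assumes the usual Apple keyboard codes (ASCII with the high bit set, back arrow = control-H, forward arrow = control-U).

```python
def apple_key(ch):
    """Apple keyboard code: ASCII with the high bit set."""
    return ord(ch) | 0x80


BACK_ARROW = 0x88       # control-H
FORWARD_ARROW = 0x95    # control-U


def lines_after_key(key, mask=0x17):
    """New line count after a keypress; 0 means the CATALOG is aborted."""
    return key & mask


for name, key in [("space", apple_key(" ")), ("back arrow", BACK_ARROW),
                  ("forward arrow", FORWARD_ARROW), ("A", apple_key("A"))]:
    print("mask $17, %-13s -> %2d" % (name, lines_after_key(key)))

for ch in "159O":
    print("mask $4F, key %s -> %d" % (ch, lines_after_key(apple_key(ch), 0x4F)))
```

Changing the second argument is a quick way to hunt for a mask whose key assignments you like.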

  1010 *--------------------------------
  1020 FMEXIT .EQ $B37F
  1030 COUNT  .EQ $B39D
  1040 RDKEY  .EQ $FD0C
  1050 CROUT  .EQ $FD8E
  1060 *--------------------------------
  1070        .OR $AE2C
  1090 EXIT   JMP FMEXIT   leave File Manager
  1100 NEWLN  JSR CROUT    send <CR>
  1110        DEC COUNT    line count
  1120        BNE .1       return if not done
  1130        JSR RDKEY    get a keypress
  1140        AND #$17     the magic number
  1150        BEQ EXIT     abort CATALOG
  1160        STA COUNT    new line count
  1170 .1     RTS

80 Column ASCII Monitor Dump Mike Dobe
O'Fallon, IL

I have been trying out the monitor patches in the July issue of AAL for adding an ASCII display to the memory dump, and I have two problems with them. Because the routines place the characters directly into the Apple's screen memory, they do not work with my 80 column card. The same problem also arises when I want to send a dump to a printer. As a solution to this problem I present still another monitor patch for an ASCII display. My version is slightly longer than the others, but it still fits in the cassette tape portion of the monitor (just barely, I might add).

In order to take advantage of the 80 column display I first made the following patches to the monitor:

     FDA6:0F
     FDB0:0F

These changes allow the dump routine to print 16 values on each line, rather than the usual eight.

Since the characters have to be printed after the current line of the dump is finished, I need a place to buffer up to 16 characters. $BCDF, an unused area in DOS, serves this purpose. My routine buffers each byte before calling PRBYTE to display the hex value. If a particular byte will be the last one on that line of the dump, the patch calls PRBYTE to print the byte, then tabs to column 60 and displays the contents of the buffer. Upper and lower case characters are printed as they are, and control characters are replaced with blanks. (That's my style. As Bob said in July, choose your own favorite!)
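
The layout the patch is aiming for can be sketched in Python. This is a model of the output format only, not of the patch itself: the function name is invented, and it assumes the monitor's usual "address, dash, then a space before each hex byte" line format.

```python
def dump_line(addr, data, tab=60):
    """Format one dump line: hex bytes, then an ASCII column at `tab`."""
    hex_part = "%04X-" % addr + "".join(" %02X" % b for b in data)
    text = ""
    for b in data:
        c = b | 0x80                                  # like ORA #$80
        # codes below $A0 are control characters and become blanks
        text += chr(c & 0x7F) if c >= 0xA0 else " "
    return hex_part.ljust(tab) + text


line = dump_line(0x300, [0xC8, 0xC5, 0xCC, 0xCC, 0xCF, 0x8D, 0x68, 0x69])
print(line)
```

Sixteen bytes need 5 + 16*3 = 53 columns for the hex part, which is why the ASCII column can start at column 60 on an 80 column display.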

Of course the following patch needs to be made to the dump code, to call my routine (this is the same as shown in the July article):

     FDBE:C9 FC

The patch can be used with a 40 column display by ignoring the above patches to $FDA6 and $FDB0, and by making the following changes to my patch routine:

     1160     AND #7
     1210     CPX #7
     1320     LDA #30
     1440     CPX #8

This patch was tested on a Microtek Magnum 80 card, but it should work on other brands as well.

[ It also works fine with the STB80 card, and the Apple //e...Bill ]

  1010 *--------------------------------
  1020 CH     .EQ $24
  1030 A1L    .EQ $3C
  1040 A1H    .EQ $3D
  1050 A2L    .EQ $3E
  1060 A2H    .EQ $3F
  1090 COUT   .EQ $FDED
  1100 *--------------------------------
  1110        .OR $FCC9
  1120        .TA $CC9
  1140 PATCH  PHA          save byte
  1150        LDA A1L      low byte of dump address
  1160        AND #$F        is transformed to
  1170        TAX            offset in buffer
  1180        PLA          get original byte back
  1190        PHA            but keep it on the stack
  1200        STA BUFFER,X buffer the character
  1210        CPX #$F      last byte of line?
  1220        BEQ .0       if so, print the buffer
  1230        LDA A2L
  1240        CMP A1L      done with range?
  1250        BNE .3       return to monitor if not
  1260        LDA A2H
  1270        CMP A1H      check high bytes
  1280        BNE .3       return if more
  1300 .0     PLA
  1310        JSR PRBYTE   print the last byte
  1320        LDA #60      tab to column 60
  1330        STA CH
  1340        LDX #0
  1350 .1     LDA BUFFER,X display the buffer
  1360        ORA #$80
  1370        CMP #$A0     control character?
  1380        BCS .2
  1390        LDA #$A0     if so, substitute blank
  1400 .2     JSR COUT     print the character
  1410        LDA #$A0
  1420        STA BUFFER,X blank out buffer as we go
  1430        INX
  1440        CPX #$10     done?
  1450        BCC .1       no, go on
  1460        RTS
  1480 .3     PLA          restore original byte
  1490        JMP PRBYTE   returns to caller

Apple Assembly Line is published monthly by S-C SOFTWARE CORPORATION, P.O. Box 280300, Dallas, Texas 75228. Phone (214) 324-2050. Subscription rate is $15 per year in the USA, sent Bulk Mail; add $3 for First Class postage in USA, Canada, and Mexico; add $13 postage for other countries. Back issues are available for $1.50 each (other countries add $1 per back issue for postage).

All material herein is copyrighted by S-C SOFTWARE, all rights reserved. Unless otherwise indicated, all material herein is authored by Bob Sander-Cederlof. (Apple is a registered trademark of Apple Computer, Inc.)