Apple Assembly Line
Volume 7 -- Issue 2 November 1986

Some Price Reductions...

The other day I noticed that I could buy ten 3M diskettes in a nice hard-plastic library case for less than $11 at the local Safeway store. Wow! Times have changed! Our price keeps going down too, though. Now you can buy disks from us for 60 cents apiece. A shrink-wrapped pack of 25 is only $15, including tyvek sleeves.

We told you a few months ago about the Minuteman UPS from Para Systems. I still love mine, and I think you would enjoy one as much as I do. We are lowering our price this month from $350 to $320, plus shipping charges. The Minuteman handles up to 250 watts. I run my Sider, printer, monitor, and //e with a full deck of cards including 1Meg RAMWORKS; there is probably ample power left for a few more items. My power is now filtered, surge protected, brownout protected, and blackout protected.

Ultra-Fast Integer Square Roots Charles H. Putney
and Bob Sander-Cederlof

Well, Bob wanted a faster integer square root program, so here it is! This method uses table lookup with as little as two pages of tables. The fastest version uses 2.75 pages of tables (704 bytes), and averages only 37 microseconds per root when taking all 65536 possible. The version which uses only 512 bytes of tables is a little slower, but still a lot faster than the IBM PC program Bob mentioned a few months ago.

Here's how my method works. First the input argument is shifted left two bits at a time until it is in the range from $4000 to $FFFF. I keep track of how many double-bit shifts this takes, from 0 to 7 times. Then I use the high byte of this value, which will be a number from $40 to $FF, to find the root in a table of 192 roots. Then I shift the root right one bit for each double-bit shift used before, from 0 to 7 bits in all. The result is either the correct integer square root of the original number, or one less than the correct root. I can make the final correction by testing the original argument against a table of squares, using the shifted root as the index into that table. The table entry for root=N is (N+1)*(N+1). If the original argument is less than N-plus-1-squared, then N is the correct root; otherwise, N+1 is the correct root.
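The steps just described can be sketched in Python for anyone who wants to experiment away from the Apple. This is only an illustration of the method, not a translation of the assembly listing: the names TRIAL_ROOT, NEXT_SQUARE, and table_sqrt are invented here, the tables are built with math.isqrt rather than taken from the listing, and a zero argument is treated as a special case because it can never be normalized.

```python
import math

# Trial roots for normalized high bytes $40..$FF (192 entries).
# Entry [hi - 0x40] holds the integer square root of hi * 256.
TRIAL_ROOT = [math.isqrt(hi * 256) for hi in range(0x40, 0x100)]

# NEXT_SQUARE[n] is (n + 1) squared, used for the final correction.
NEXT_SQUARE = [(n + 1) * (n + 1) for n in range(256)]


def table_sqrt(arg):
    """Integer square root of a 16-bit value by table lookup (sketch)."""
    if arg == 0:
        return 0                      # zero cannot be normalized
    shifts = 0
    normalized = arg
    while normalized < 0x4000:        # shift left two bits at a time
        normalized <<= 2
        shifts += 1
    # Look up the trial root by high byte, then undo the normalization:
    # each two-bit shift of the argument is one bit of the root.
    root = TRIAL_ROOT[(normalized >> 8) - 0x40] >> shifts
    # The trial root is either correct or one too small.
    if arg >= NEXT_SQUARE[root]:
        root += 1
    return root
```

Comparing table_sqrt(n) with math.isqrt(n) for every n from 0 to 65535 is a quick way to confirm that a single correction step is enough.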

The program is shown below, with the tables. I used an Applesoft program to actually generate the data for the tables, in a form which can be EXECed directly into the S-C Macro Assembler:

     10 D$ =  CHR$ (4)
     40 H$ = "0123456789ABCDEF"
     110  PRINT "5000 TABLE1"
     120  FOR I = 0 TO 23
     130  PRINT 5001 + I"    >HS ";
     140  FOR J = 64 TO 71
     150 N = (I * 8 + J) * 256
     160 R =  INT ( SQR (N))
     170  GOSUB 1000
     180  NEXT : PRINT : NEXT 
     210  PRINT "6000 TABLE2"
     220  FOR I = 0 TO 31
     230  PRINT 6001 + I"    >HS ";
     240  FOR J = 1 TO 8
     250 N = I * 8 + J:N2 = N * N
     260 R = N2 - 256 *  INT (N2 / 256)
     270  GOSUB 1000
     280  NEXT : PRINT : NEXT 

     310  PRINT "7000 TABLE3"
     320  FOR I = 0 TO 31
     330  PRINT 7001 + I"    >HS ";
     340  FOR J = 1 TO 8
     350 N = I * 8 + J:N2 = N * N
     360 R =  INT (N2 / 256): IF R > 255 THEN R = 0
     370  GOSUB 1000
     380  NEXT : PRINT : NEXT 

     900  PRINT D$"CLOSE"
     910  END 

     1000  REM  PRINT IN HEX
     1010  PRINT  MID$ (H$, INT (R / 16) + 1,1);
     1020  PRINT  MID$ (H$,R - 16 *  INT (R / 16) + 1,1)".";
     1030  RETURN 
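For comparison, here is a rough Python equivalent of the table generator. It is a sketch under a few assumptions: the function names build_tables and hex_rows are invented for illustration, math.isqrt is swapped in for Applesoft's floating-point INT(SQR(N)), and only the hex data is produced, not the assembler line numbers or the >HS macro prefix.

```python
import math


def build_tables():
    """Return the three byte tables as lists of integers."""
    # TABLE1: roots of $4000, $4100, ... $FF00 (192 entries).
    table1 = [math.isqrt(hi * 256) for hi in range(0x40, 0x100)]
    # TABLE2 and TABLE3: low and high bytes of N*N for N = 1..256.
    squares = [n * n for n in range(1, 257)]
    table2 = [sq & 0xFF for sq in squares]
    # The high byte of 256*256 does not fit in a byte; it is stored as 0,
    # just as line 360 of the Applesoft program does.
    table3 = [(sq >> 8) & 0xFF for sq in squares]
    return table1, table2, table3


def hex_rows(values, per_row=8):
    """Format bytes eight to a row, each followed by a period."""
    rows = []
    for start in range(0, len(values), per_row):
        chunk = values[start:start + per_row]
        rows.append("".join(f"{v:02X}." for v in chunk))
    return rows
```

Calling hex_rows on each table gives rows in the same dotted form the Applesoft program prints after each >HS.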

The way I have written the SQRT subroutine, the high byte of the argument is expected in the X-register and the low byte in the A-register. The square root is returned in the Y-register, with A and X destroyed. This combination seemed to me to give the best speed. Lines 2290 and 2300 divide the arguments into two ranges: $4000-$FFFF, and below $4000. The higher range comprises 75% of the possible arguments, a total of 49152.

The top 256 possible arguments, from $FF00 to $FFFF, must be handled as a special case. The logic which compares roots against values in TABLE2 and TABLE3 is confused by the fact that the entry for $FF is $0000. (It really is $10000, but the leading 1 is not in either table.) Lines 2320-2330 strip out these arguments, and lines 2860-2870 return the correct root ($FF). These 256 arguments take a total of only 11 cycles each (not counting JSR SQRT or RTS).

If the range is from $4000 to $FEFF, as it is in 48896 cases, lines 2340-2410 return the correct root. The high-byte of the argument is already in the X-register, so line 2340 loads the Y-register with the trial root from the table. No shifting must be done, so lines 2350-2390 proceed to compare with the square of the root+1 in TABLE2 and TABLE3. If the entry there is larger than the original argument, the root is correct; if not, line 2400 adds one, making it correct. The longest path from beginning to end for these arguments is only 28 cycles. If the test at line 2360 branches, it is only 19 cycles. Wow! And this takes care of three-fourths of all cases!

Arguments below $4000 are handled by lines 2430 and following. Lines 2440-2450 test for arguments from $0000 to $00FF. These will be handled by lines 2740-2840. If the argument is exactly $0000, the root is $00, and this is detected at lines 2740-2750. All other arguments below $0100 need to be shifted at least 4 two-bit steps. By merely starting to work on the low-byte, and with a shift-count of 4, we accomplish the first four steps without taking any time at all. The loop in lines 2790-2840 normalizes the byte and continues to count shift steps. Then we join the processing of values between $0100 and $3FFF.

Arguments in the range from $0100 to $3FFF are handled beginning at line 2470. The loop in lines 2510-2570 normalizes the argument by shifting left in two-bit steps until the value is $4000 or more, counting the number of steps it takes. It will be 1, 2, or 3 steps.

We come to line 2590 with a shift count in the Y-register. The count will be 1, 2, or 3 if the original argument was $0100 or more; it will be from 4 to 7 if the original argument was below $0100. We also come to line 2590 with the high-byte of the normalized argument in the A-register. Lines 2590-2600 use this byte to get a trial root from TABLE1. The loop in lines 2610-2630 shifts the trial root right the same number of bits as we took in two-bit steps to normalize the argument earlier. Finally, lines 2640-2720 check the trial root against the value in TABLE2 and TABLE3, and correct the root if necessary.

I had it all counted out at one time, and the arguments below $4000 (with the exception of $0000) take on the order of 80 cycles. It gets very involved to try to count these paths, so I wrote a timing program instead. Lines 2070-2260 call the SQRT subroutine for each argument from $0000 to $FFFF, and do it ten times. This takes about 41 seconds to execute. For fun, I inserted line 2170 to toggle the speaker after taking each square root; the sound is interesting, and also reveals the fact that lower roots take longer than higher roots. Lines 2130 and 2250 turn on and off the AN0 signal in the game port, for the timing setup I describe in another article in this newsletter. Line 2220 makes a visible mark on the screen so I will not get too impatient while the program is running.

I ran the timing loop as shown first, and as I said it took about 41 seconds. My timing setup using another Apple to count cycles gave a result of 41,821,940 cycles. Then I changed line 2280 from "SQRT" to "SQRT RTS", so I could time the overhead of the timing program itself. My other Apple said this took 17,056,520 cycles. The difference is the time of SQRT itself, and this is 24,765,420. Remember that I took 65536 square roots ten times: therefore I divide by 655360 to get an average cycle count of only 37.8 cycles. In English, that is about 37 microseconds. Wow! Tell that to your IBM friend!
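The arithmetic can be replayed in a few lines of Python. The cycle totals are the ones quoted above; the clock rate of 1,020,484 cycles per second is an assumption added here (the usual figure for an Apple II), since the article only says "about 37 microseconds".

```python
TOTAL_CYCLES = 41_821_940       # timing loop calling the real SQRT
OVERHEAD_CYCLES = 17_056_520    # same loop with SQRT replaced by RTS
ROOTS_TAKEN = 65536 * 10        # every argument, ten passes
CLOCK_HZ = 1_020_484            # assumed Apple II clock rate

sqrt_cycles = TOTAL_CYCLES - OVERHEAD_CYCLES
average_cycles = sqrt_cycles / ROOTS_TAKEN
average_microseconds = average_cycles / CLOCK_HZ * 1_000_000

print(f"cycles spent in SQRT: {sqrt_cycles:,}")
print(f"average per root: {average_cycles:.1f} cycles")
print(f"average per root: {average_microseconds:.1f} microseconds")
```

Because the clock runs slightly faster than 1 MHz, 37.8 cycles works out to a shade over 37 microseconds.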

The program slows down if the tables are not properly placed in memory. Indexed instructions take an extra clock cycle if the indexing crosses a page boundary. Therefore, I adjusted the start of the tables so that they fit in a page. Notice that TABLE1 really starts 64 bytes into the page. This is so because the index we use to access TABLE1 runs from $40 to $FF. The label ROOT is equated to TABLE1-64, at line 4010.
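The padding expression at line 2920 of the listing, *+255/256*256-*+64, can be checked with a small Python sketch. It assumes the assembler evaluates the expression strictly left to right with integer division; the function name pad_before_table1 is made up for this example.

```python
PAGE = 256


def pad_before_table1(pc):
    """Bytes to skip so the next byte lands 64 bytes into a page.

    Mirrors *+255/256*256-*+64 evaluated left to right:
    round the location counter up to a page boundary, subtract the
    counter itself to get the distance, then add 64.
    """
    next_page = (pc + 255) // PAGE * PAGE
    return next_page - pc + 64


# Example: if the code ended at $60D5, TABLE1 would start at $6140,
# so TABLE1-64 falls exactly on the page boundary $6100.
table1 = 0x60D5 + pad_before_table1(0x60D5)
print(hex(table1), hex(table1 - 64))
```

With TABLE1 starting at offset 64, its 192 entries end exactly at the last byte of the page, so an index from $40 to $FF added to TABLE1-64 never crosses a page boundary.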

All this timing is irrelevant if the program produces incorrect results. Therefore an exhaustive test is necessary. I wrote a test program in Applesoft, but it was very slow. Therefore I converted it to assembly language, with the result in lines 1340-2050. The test program has some interesting wrinkles in it. It checks all the square roots from SQRT without actually having any code to multiply, divide, or take a square root.

The test program runs through the possible arguments in sequential order, from $0000 to $FFFF. If the answer returned by SQRT is correct, it will pass the following tests:

  1. it will be the same as the previous root,
    or it will be the previous root + 1.
  2. if it is the same as the previous root,
    the argument must still be less than the "next perfect square"
  3. if it is prev. root + 1,
    the argument must be greater than or equal to the "next perfect square"

I keep a running value for "next perfect square". I start with 1 (the next perfect square after 0*0 is 1*1). Then each time I find that the argument has reached the value of the "next perfect square", I bump it up by adding 2*root+1. Remember that (n+1)^2 = n^2 + 2*n + 1.
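The same checking idea can be sketched in Python. This is not the assembly test program, only an illustration of the three rules and the running "next perfect square"; the function name check_roots and its return convention (the first failing argument, or None) are assumptions made for the example.

```python
import math


def check_roots(sqrt_fn, limit=65536):
    """Verify sqrt_fn over 0..limit-1 using only compares and adds.

    Returns the first argument that fails a rule, or None if all pass.
    """
    old_root = 0
    next_square = 1                    # the square after 0*0 is 1*1
    for number in range(limit):
        root = sqrt_fn(number)
        if root == old_root + 1:
            # Rule 3: the root may only advance once the argument
            # has reached the next perfect square.
            if number < next_square:
                return number
            # (n+1)^2 = n^2 + 2n + 1, done with additions only.
            next_square += root + root + 1
            old_root = root
        elif root != old_root:
            return number              # Rule 1: same root or root + 1
        # Rule 2: the argument must stay below the next perfect square.
        if number >= next_square:
            return number
    return None


print(check_roots(math.isqrt))         # a correct routine gives None
```

Passing in a routine that is wrong for even one argument makes check_roots return that argument, much as the assembly version stops and prints the number and the two roots.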

Lines 1550, 1780, and 1860 indicate visually that the program is running, and helped me find a bug or two. Lines 1930-2050 print out the important information when an error is detected.

Lines 1280-1320 allowed me to call SQRT from inside an Applesoft program. However, this is not foolproof because there may be page-zero conflicts as the program is now written. It worked fine for my tests, though.

  1010 *--------------------------------
  1030 *
  1050 *      18 QUINNS ROAD
  1070 *
  1090 *        A = ARG LOW BYTE
  1100 *
  1120 *         X AND A DESTROYED
  1130 *
  1140 *--------------------------------
  1150 BAS.ARG    .EQ $00,01
  1160 NUMBER     .EQ $02,03
  1170 ARGSAV     .EQ $04,05
  1180 ARGLO      .EQ $06
  1190 TEN.TIMES  .EQ $07
  1200 OLD.ROOT   .EQ $08
  1210 NEW.ROOT   .EQ $09
  1220 RR         .EQ $0A,0B
  1230 SS         .EQ $0C,0D,0E
  1240 *--------------------------------
  1250        .OR $6000    OUT OF THE WAY
  1260 *      .TF SQUARE ROOT.OBJ
  1270 *--------------------------------
  1290        LDX BAS.ARG+1    HIGH = 1
  1300        JSR SQRT     TEST IT
  1310        STY BAS.ARG  RETURN IN 0
  1320        RTS
  1330 *--------------------------------
  1340 TEST
  1350        LDA #0
  1360        STA NUMBER
  1370        STA NUMBER+1
  1380        STA OLD.ROOT
  1390        STA RR
  1400        STA RR+1
  1410        STA SS+1
  1420        STA SS+2
  1430        LDA #1
  1440        STA SS
  1460        LDX NUMBER+1
  1480        STY NEW.ROOT
  1490        CPY OLD.ROOT
  1500        BEQ .2       SAME AS OLD ROOT
  1510        INC OLD.ROOT
  1520        BEQ .99      ERROR
  1530        CPY OLD.ROOT
  1540        BNE .99      ERROR
  1550   INC $7F4
  1560        LDA SS       SS = RR
  1570        STA RR
  1580        LDA SS+1
  1590        STA RR+1
  1600        SEC          SS = SS + R + R + 1
  1610        ROL NEW.ROOT
  1620        LDA #0
  1630        ROL
  1640        PHA          SAVE HIBYTE OF 2*R+1
  1650        LDA SS
  1660        ADC NEW.ROOT
  1670        STA SS
  1680        PLA
  1690        ADC SS+1
  1700        STA SS+1
  1710        BCC .2
  1720        INC SS+2
  1740        CMP RR
  1750        LDA NUMBER+1
  1760        SBC RR+1
  1770        BCC .99
  1780   INC $7F5
  1790        LDA NUMBER   ERROR IF NUMBER >= SS
  1800        CMP SS
  1810        LDA NUMBER+1
  1820        SBC SS+1
  1830        LDA #0
  1840        SBC SS+2
  1850        BCS .99
  1860   INC $7F6
  1870        INC NUMBER
  1880        BNE .1       WRAPPED ?
  1890        INC NUMBER+1
  1900        BNE .1       DONE 65536 ?
  1910        RTS
  1920 *--------------------------------
  1930 .99    LDA NUMBER+1
  1940        JSR $FDDA
  1950        LDA NUMBER
  1960        JSR $FDDA
  1970        LDA #"-"
  1980        JSR $FDED
  1990        LDA OLD.ROOT
  2000        JSR $FDDA
  2010        LDA #"-"
  2020        JSR $FDED
  2030        LDA NEW.ROOT
  2040        JSR $FDDA
  2050        RTS
  2060 *--------------------------------
  2070 TIMING
  2090        STA NUMBER
  2100        STA NUMBER+1
  2110        LDA #10      DO IT ALL TEN TIMES
  2120        STA TEN.TIMES
  2130   LDA $C059         START TIMER
  2150        LDX NUMBER+1
  2170 * LDA $C030     REMOVE "*" TO GET NEAT SOUNDS
  2180        INC NUMBER
  2190        BNE .1       WRAPPED ?
  2200        INC NUMBER+1
  2210        BNE .1       DONE 65536 ?
  2220        INC $7F7
  2230        DEC TEN.TIMES
  2240        BNE .1
  2250   LDA $C058         STOP TIMER
  2260        RTS
  2270 *--------------------------------
  2280 SQRT
  2290        CPX #$40     VALUE ALREADY NORMALIZED?
  2300        BCC .2       ...NO
  2310 *---ARG = $4000...FFFF-----------49152 CASES
  2320        CPX #$FF     CHECK FOR ARG-HI = $FF
  2330        BEQ .9       ...YES, SPECIAL CASE
  2350        CMP TABLE2,Y
  2360        BCC .1       ...SPEEDS UP AVERAGE BY 0.8 CYCLE
  2370        TXA          ARG-HI
  2380        SBC TABLE3,Y
  2390        BCC .1
  2400        INY
  2410 .1     RTS
  2420 *---ARG = $0000...3FFF-----------
  2430 .2     STX ARGSAV+1 SAVE ARG-HI
  2440        CPX #0       IS ARG-HI ZERO?
  2450        BEQ .7       ...YES
  2460 *---ARG = $0100...3FFF-----------16128 CASES
  2490        TXA          ARG-HI TO A-REG
  2500        LDY #0       START SHIFT COUNT = 0
  2510 .3     ASL ARGLO
  2520        ROL
  2530        ASL ARGLO
  2540        ROL
  2550        INY
  2560        CMP #$40
  2570        BCC .3
  2580 *---A=NORM-ARG, Y=SHIFT-CNT------
  2590 .4     TAX          USE NORM-ARG FOR INDEX
  2610 .5     LSR          HALF ROOT SHIFT-CNT TIMES
  2620        DEY
  2630        BNE .5
  2640        TAY          USE SHIFTED ROOT FOR INDEX NOW
  2650        LDA ARGSAV   GET ARG-LO
  2660        CMP TABLE2,Y
  2670        BCC .6       ...SPEEDS UP AVERAGE BY 0.7 CYCLE
  2680        LDA ARGSAV+1
  2690        SBC TABLE3,Y
  2700        BCC .6
  2710        INY
  2720 .6     RTS
  2730 *---ARG = $0000...00FF-----------
  2740 .7     TAY          IS ARG-LO ALSO ZERO?
  2750        BEQ .1       ...YES, SQRT=0
  2760 *---ARG = $0001...00FF-----------255 CASES
  2780        LDY #4       START SHIFT COUNT = 4
  2790 .8     CMP #$40     NORMALIZED YET?
  2800        BCS .4       ...YES, GET ROOT NOW
  2810        ASL
  2820        ASL
  2830        INY          COUNT THE SHIFT
  2840        BNE .8       ...ALWAYS
  2850 *---ARG = $FFXX------------------
  2860 .9     LDY #$FF
  2870        RTS
  2880 *--------------------------------
  2890 ZZ     .EQ *-SQRT
  2900 *--------------------------------
  2920        .BS *+255/256*256-*+64
  2930 *--------------------------------
  2940 *      DON'T WASTE PAPER
  2950        .LIST MOFF
  2960        .MA HS
  2970        .HS ]1
  2980        .EM
  2990 *--------------------------------
  3000 *      SQUARE ROOT TABLE OF N
  3010 *      FROM $4000 (16384)
  3020 *      TO $FF00   (65280)
  3030 *      BY $100    (256)
  3040 TABLE1 >HS
  3050        >HS
  3060        >HS 8F.
  3070        >HS
  3080        >HS 9C.9D.9E.9F.A0.A0.A1.A2.
  3090        >HS A3.A3.A4.A5.A6.A7.A7.A8.
  3100        >HS A9.AA.AA.AB.AC.AD.AD.AE.
  3110        >HS AF.B0.B0.B1.B2.B2.B3.B4.
  3120        >HS B5.B5.B6.B7.B7.B8.B9.B9.
  3130        >HS BA.BB.BB.BC.BD.BD.BE.BF.
  3140        >HS C0.C0.C1.C1.C2.C3.C3.C4.
  3150        >HS C5.C5.C6.C7.C7.C8.C9.C9.
  3160        >HS CA.CB.CB.CC.CC.CD.CE.CE.
  3170        >HS CF.D0.D0.D1.D1.D2.D3.D3.
  3180        >HS D4.D4.D5.D6.D6.D7.D7.D8.
  3190        >HS D9.D9.DA.DA.DB.DB.DC.DD.
  3200        >HS DD.DE.DE.DF.E0.E0.E1.E1.
  3210        >HS E2.E2.E3.E3.E4.E5.E5.E6.
  3220        >HS E6.E7.E7.E8.E8.E9.EA.EA.
  3230        >HS EB.EB.EC.EC.ED.ED.EE.EE.
  3240        >HS EF.F0.F0.F1.F1.F2.F2.F3.
  3250        >HS F3.F4.F4.F5.F5.F6.F6.F7.
  3260        >HS F7.F8.F8.F9.F9.FA.FA.FB.
  3270        >HS FB.FC.FC.FD.FD.FE.FE.FF.
  3280 *--------------------------------
  3290 *
  3310 *      BYTE OF (N+1)
  3320 TABLE2 >HS
  3330        >HS
  3340        >HS
  3350        >HS 71.A4.D9.10.49.84.C1.00.
  3360        >HS 41.84.C9.10.59.A4.F1.40.
  3370        >HS 91.E4.39.90.E9.44.A1.00.
  3380        >HS 61.C4.29.90.F9.64.D1.40.
  3390        >HS B1.
  3400        >HS
  3410        >HS D1.64.F9.90.29.C4.61.00.
  3420        >HS A1.44.E9.90.39.E4.91.40.
  3430        >HS F1.A4.59.10.C9.84.41.00.
  3440        >HS C1.84.49.10.D9.A4.71.40.
  3450        >HS 11.E4.B9.
  3460        >HS E1.C4.A9.
  3470        >HS
  3480        >HS
  3490        >HS
  3500        >HS
  3510        >HS 71.A4.D9.10.49.84.C1.00.
  3520        >HS 41.84.C9.10.59.A4.F1.40.
  3530        >HS 91.E4.39.90.E9.44.A1.00.
  3540        >HS 61.C4.29.90.F9.64.D1.40.
  3550        >HS B1.
  3560        >HS
  3570        >HS D1.64.F9.90.29.C4.61.00.
  3580        >HS A1.44.E9.90.39.E4.91.40.
  3590        >HS F1.A4.59.10.C9.84.41.00.
  3600        >HS C1.84.49.10.D9.A4.71.40.
  3610        >HS 11.E4.B9.
  3620        >HS E1.C4.A9.
  3630        >HS
  3640 *--------------------------------
  3650 *
  3670 *      BYTE OF (N+1)
  3680 TABLE3 >HS
  3690        >HS
  3700        >HS
  3710        >HS
  3720        >HS
  3730        >HS
  3740        >HS 09.09.0A.0A.0A.0B.0B.0C.
  3750        >HS 0C.0D.0D.0E.0E.0F.0F.10.
  3760        >HS
  3770        >HS
  3780        >HS 19.1A.1A.1B.1C.1C.1D.1E.
  3790        >HS 1E.1F.
  3800        >HS
  3810        >HS 2B.2B.2C.2D.2E.2F.30.31.
  3820        >HS
  3830        >HS 39.3A.3B.3C.3D.3E.3F.40.
  3840        >HS
  3850        >HS 49.4A.4B.4C.4D.4E.4F.51.
  3860        >HS
  3870        >HS 5B.5C.5D.5F.
  3880        >HS
  3890        >HS 6F.
  3900        >HS 7A.7B.7D.7E.7F.81.82.84.
  3910        >HS
  3920        >HS
  3930        >HS 9D.9F.A0.A2.A4.A5.A7.A9.
  3940        >HS AA.AC.AD.AF.B1.B2.B4.B6.
  3950        >HS B7.B9.BB.BD.BE.C0.C2.C4.
  3960        >HS C5.C7.C9.CB.CC.CE.D0.D2.
  3970        >HS D4.D5.D7.D9.DB.DD.DF.E1.
  3980        >HS E2.E4.E6.E8.EA.EC.EE.F0.
  3990        >HS F2.F4.F6.F8.FA.FC.FE.00.
  4000 *--------------------------------
  4020 EXACTL .EQ TABLE2    SET UP SO 0 INDEX        (OF $4000)
  4040 *--------------------------------

New ProDOS Bug and Fix Bob Sander-Cederlof

The November 1986 issue of Open-Apple (Tom Weishaar's wonderful newsletter) tells of an important new discovery. For about a year Tom has been reporting on the symptom: Appleworks and Applewriter data disks suddenly turning up with track 0 destroyed. It only happened to 5.25" diskettes, and only on certain machines, and otherwise seemingly at random. For a complete description, get all of Tom's back issues.

Some of his readers from Australia seem to have tracked down the problem, and they suggest a solution. In the floppy driver code inside ProDOS, at $D6C3, there are four STA commands that turn off all four stepper motor windings. Tom says the purpose is to disable any 3.5" drives connected in a daisy chain to the same controller. I wonder, because this code has been here since 1983, long before the possibility of 3.5" drives. Anyway, the code has a bad side-effect in some systems.

A quirk of the controller card is that STA operations to the stepper motor winding soft-switches also cause the card to write on the data bus. So you have the bus being driven in two directions at once: the cpu trying to store the A-register, and the controller card trying to send something meaningless. Besides resulting in garbage on the data bus, which causes no real damage in this case, apparently in some Apples with some controller cards it causes the card to go into WRITE mode. Whatever track the head is sitting on will then be clobbered.

The solution is to change the four STA operations to LDA. The disk drives will get the same message, without causing the bus contention. You can patch the PRODOS system file and re-SAVE it, on all your disks. If you have a hard disk, you should only have to do it one time. If you BLOAD the PRODOS file at $2000, the four instructions will be found at $56D3:

       56D3: 9D 80 C0  STA $C080,X
       56D6: 9D 82 C0  STA $C082,X
       56D9: 9D 84 C0  STA $C084,X
       56DC: 9D 86 C0  STA $C086,X

If you change all those "9D" bytes to "BD", which is the opcode for "LDA addr,X", the bug is supposed to disappear. Doing it from inside the S-C Macro Assembler, I did it this way:

       :BLOAD PRODOS,TSYS,A$2000
       :$56D3:BD N 56D6:BD N 56D9:BD N 56DC:BD
       :BSAVE PRODOS,TSYS,A$2000,L14848
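If you have a copy of the PRODOS file on another machine, the same patch can be sketched in Python. This is a hedged illustration only: the function name patch_prodos_image is invented, the offsets are simply the addresses above minus the $2000 load address, and the sketch refuses to touch an image that does not contain the four STA instructions exactly as listed.

```python
LOAD_ADDRESS = 0x2000
PATCH_ADDRESSES = (0x56D3, 0x56D6, 0x56D9, 0x56DC)
STA_ABS_X = 0x9D    # opcode for STA addr,X
LDA_ABS_X = 0xBD    # opcode for LDA addr,X


def patch_prodos_image(image):
    """Return a copy of the file image with the four STAs made LDAs."""
    patched = bytearray(image)
    for index, address in enumerate(PATCH_ADDRESSES):
        offset = address - LOAD_ADDRESS
        # Expect STA $C080,X / $C082,X / $C084,X / $C086,X in turn.
        expected = bytes([STA_ABS_X, 0x80 + 2 * index, 0xC0])
        if bytes(patched[offset:offset + 3]) != expected:
            raise ValueError(f"unexpected bytes at ${address:04X}")
        patched[offset] = LDA_ABS_X
    return bytes(patched)
```

Checking the bytes first guards against patching a different ProDOS version in which the code has moved.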

I personally have never had ProDOS clobber a diskette. I have trashed some myself, by stupidity, but this hardware/software bug has never caused it. Nevertheless, I have now patched my disks, just in case. Many thanks to Tom, Open-Apple, and to the men in Australia.

Timing Apple Programs with Another Apple Bob Sander-Cederlof

While I was working on Charles Putney's integer square root program, I longed for a better way to time it. I was wasting a lot of my time using a stopwatch, and still getting inaccurate (or at least imprecise) times.

For around $3000 I could buy a logic analyzer and hook it up to count machine cycles. That is obviously out of the question. Maybe I could hunt around among my old boards and find one with a 6522 on it: that chip has an interval timer that could give me fairly accurate times. I might be able to find one, but then I would have to figure out how to program it again.

Then I thought about using the game port to communicate with another Apple, and put a timing loop in the other Apple. I hooked one of the Annunciator output lines in my first Apple to a Push Button input line on the second one. Then I set up the program being clocked to set the annunciator on at the beginning and turn it off at the end. I wrote a timing program to run in the other Apple which waited until the push button input went on, and then counted loops until it went low again. The results were better than I hoped for!

To hook up the Apples, I started by finding some wire. I needed about 12 feet of at least two wires. I found about six feet of four-line telephone wire, and another six feet of twisted pair left over from my burglar alarm installation. I connected them together, very crudely, and stretched them across the room. The Apple on the south side of the room is my nine-year-old. It has a nice ZIF-socket in the game port, so I inserted the ground wire into pin 8 and the signal wire into pin 2, and clamped the socket. If you do not have a ZIF socket in yours, the telephone wire fits very nicely into the holes in a regular socket.

The Apple //e on the north side of the room challenged me a little more. First, the game socket is unreachable, way under the top right lip of the upper case. I can't even see it without a flashlight! There is a nine-pin D-connector on the back panel, but the Annunciator lines do not come to this connector. A little research led to the knowledge that the Annunciator signals come directly from pins 10-13 of the IOU chip. I chose AN0, which is pin 10. I hooked a red miniclip lead to that pin, and a black miniclip lead to ground at pin 1 of the same chip. The IOU chip is the 40-pin chip at position E5 on my //e motherboard, conveniently labeled "IOU". Facing the computer from the front, pin one is the first one on the right-hand side of the chip. Pin 10 is on the same side, about half way back. I then connected the other end of those leads to my wires, and the circuit was complete.

The program in the //e is the program whose time I want to measure. At the beginning of the section to be timed, I insert the instruction "LDA $C059" to turn on AN0. At the end, I insert the instruction "LDA $C058" to turn off AN0.

The timing program in the other Apple is shown below. Lines 1130-1180 set up a page zero location to contain $01, which I need later to make all the timing correct. They also clear the three registers, which I am going to use for accumulating a 24-bit count. Lines 1190-1200 then wait until the input signal goes high. This will happen when the program in the //e does the "LDA $C059" instruction.

Lines 1260-1400 increment the 24-bit count once each 20 cycles, until the PB0 signal falls. The signal is tested only once each 20 cycles, so there is a built-in resolution of 20 cycles. If I want to measure a program down to the exact cycle, I will have to run it at least 20 times. Actually, there are two other sources of "error": the signal on my 12 feet of wire will not necessarily rise and fall at exactly the same speed; and the two Apples may not be running at exactly the same speed.

The various paths in lines 1260-1400 are all carefully timed so they all take exactly 20 cycles. The interval between BIT PB0 executions should always be 20 cycles. Of course, that is, unless I made a mistake. There is one exception: When the A-register wraps around, after 16,777,216 counts, lines 1340-1350 add 5 cycles. The total interval on this path is 24 cycles. But this only happens once every 6 or 7 minutes, so who cares!
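The path lengths can be tallied mechanically. The sketch below uses the standard 6502 cycle counts and assumes none of the branches crosses a page boundary (the listing does not show where the program is assembled); the path names are just labels invented for this example.

```python
# Standard 6502 timings for the instructions in the counting loop.
CYCLES = {
    "BIT abs": 4, "INX": 2, "INY": 2, "ADC zp": 3, "NOP": 2, "CLC": 2,
    "branch taken": 3, "branch not taken": 2,
}

# Each path starts with BIT PB0 and a BPL that falls through.
COMMON = ["BIT abs", "branch not taken", "INX"]
PATHS = {
    "X did not wrap": COMMON + ["branch taken",
                                "NOP", "NOP", "NOP", "branch taken"],
    "X wrapped": COMMON + ["branch not taken", "INY", "branch taken",
                           "NOP", "branch taken"],
    "Y wrapped": COMMON + ["branch not taken", "INY", "branch not taken",
                           "ADC zp", "branch taken"],
    "A wrapped": COMMON + ["branch not taken", "INY", "branch not taken",
                           "ADC zp", "branch not taken",
                           "CLC", "branch taken"],
}

path_cycles = {name: sum(CYCLES[op] for op in ops)
               for name, ops in PATHS.items()}
for name, total in path_cycles.items():
    print(f"{name}: {total} cycles")
```

On the last path the CLC and BCC cost 5 cycles, but the BNE before them falls through and saves one, which is why the total comes to 24 rather than 25.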

Finally, lines 1420-1490 print out the resulting count in hexadecimal. I then take my handy Radio Shack calculator out, convert to decimal, multiply by 20, and have the cycle count.
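The calculator step is easy to reproduce in Python. The function name cycles_from_display is invented for this sketch, and the two sample readings are worked backwards from the cycle totals quoted in the square root article rather than copied from a screen.

```python
CYCLES_PER_COUNT = 20    # one count per trip through the timing loop


def cycles_from_display(hex_digits):
    """Convert the six hex digits printed by the timer into cycles."""
    return int(hex_digits, 16) * CYCLES_PER_COUNT


# A reading of 1FE859 corresponds to 41,821,940 cycles,
# and a reading of 0D035A corresponds to 17,056,520 cycles.
print(cycles_from_display("1FE859"))
print(cycles_from_display("0D035A"))
```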

I like this arrangement so well, and I need to time programs so frequently, that I plan to make a more permanent hookup. And that reminds me of an old idea... a way to network several Apples using just the gameport....

  1000 *SAVE S.TIMER
  1010 *--------------------------------
  1040 *--------------------------------
  1050 CNT0   .EQ 0
  1060 CNT1   .EQ 1
  1070 CNT2   .EQ 2
  1080 ONE    .EQ 3
  1090 *--------------------------------
  1100 PB0    .EQ $C061    BIT 7 = 1 WHEN PRESSED
  1110 *--------------------------------
  1120 T
  1130        LDY #1
  1140        STY ONE
  1150        DEY          Y=0
  1160        TYA          A=0
  1170        TAX          X=0
  1180        CLC
  1190 .1     BIT PB0
  1200        BPL .1
  1210 *--------------------------------
  1230 *      24-BIT COUNTER GIVES 16,777,216 COUNTS
  1240 *      WHICH IS 335,544,320 CYCLES
  1250 *--------------------------------
  1260 .2     BIT PB0
  1270        BPL .5       END OF COUNTING
  1280        INX
  1290        BNE .3
  1300        INY
  1310        BNE .4
  1320        ADC ONE
  1330        BNE .2
  1340        CLC
  1350        BCC .2
  1360 *
  1370 .3     NOP
  1380        NOP
  1390 .4     NOP
  1400        BNE .2
  1410 *--------------------------------
  1420 .5     STA CNT2
  1430        STY CNT1
  1440        STX CNT0
  1450        JSR $FDDA
  1460        LDA CNT1
  1470        JSR $FDDA
  1480        LDA CNT0
  1490        JMP $FDDA
  1500 *--------------------------------
  1510        .LIF

The //gs Reference Manuals Bob Sander-Cederlof

When Apple sent me the prototype //gs they included 11 fat 3-ring binders full of documentation. Much of it is destined to eventually be published as reference manuals by Addison-Wesley. I can hardly wait, because in the present form it is incomplete, inconvenient, inconsistent, inaccurate, and takes up too much space. I am sure the finished product will be up to Apple's usual standard, eliminating all the negatives just mentioned.

Addison-Wesley has released a little folder which describes the new manuals, with projected publishing dates. We will carry some of these, as soon as they are available.

The first book out will be "Technical Introduction to the Apple //gs". It is due in December, but there is not much REAL information in it. It is more like a complete marketing description, without the kind of detailed information programmers need. It is only 120 pages.

Three books are due in "Spring, 1987". I suppose that means we can expect copies by June 21st, at least. These look like books worth ordering:

    "Programmer's Introduction to the Apple //gs", 150 pgs, $19.95
    "Apple //gs Hardware Reference", 250 pgs, $26.95
    "Apple //gs Firmware Reference", 250 pgs, $24.95

Three more are due in "Summer, 1987", which means no later than September 21st:

    "Apple //gs Toolbox Reference"
         Volume 1, 400 pgs, $29.95
         Volume 2, 400 pgs, $29.95
    "Apple //gs ProDOS 16 Reference"
         Disk included, 200 pgs, $39.95

I expect Gary Little's new book, "Inside the Apple //gs", to be coming out by next March or April. No doubt there will be many more books coming out. Apple has a way of triggering whole new industries....

PAUSE Directive for S-C Macro Assembler 2.0 Bill Morgan

Many times we want to call "time out" during an assembly, for various reasons. Maybe a program has outgrown the available disk space and we need to swap source disks, or maybe we need to check the value of a label during assembly. We can't manually pause an assembly at a specific place during pass one, and it's difficult to do during pass two if the listing is on, since you have to sit and stare at the screen to tell where the assembly is. What we need is a PAUSE directive to tell the assembler to stop and wait for a keypress before continuing.

Back in May of 1983 Mike Laumer wrote up such a directive for the S-C Macro Assembler. The Assembler provides a .US directive for just such cases and Mike supplied the routines to use a line like .US SWAP SOURCE DISK to pause the assembly and display "SWAP SOURCE DISK" on the screen in inverse text. The Macro Assembler, Apple computers, and people's expectations have all changed in the last 3 1/2 years, so it seems like time to update and expand that article.

The .US vector normally contains JMP CMNT, a jump to the assembler's comment routine to just list a line. We can patch in the address of our handler and then have our code exit to CMNT when we're through. When control transfers to the .US vector the source line is in the system input buffer at $200 and location $7B contains an index into the buffer, pointing at the first character following the ".US" (normally a space). Another assembler variable that can come in handy for a .US feature is PASS, at $60. This location contains a 0 during pass one of assembly and a 1 during pass two.

It only takes a few changes to adapt Mike's code to the Version 2.0 Macro Assemblers. The .US vector is now at $D015 ($8015 for ProDOS), so we have to change that .EQuate line. CMNT has shifted around between various releases of Version 2.0, so I redid the code in INSTALL to transfer the correct address for CMNT out of the vector before installing PAUSE.

We only need to make two more changes to create a ProDOS version: alter the .US vector definition as shown in lines 1230-1240; and delete lines 1180-1190, 1290-1300, and 1400, since we don't need to worry about enabling/disabling the "Language Card" memory under the ProDOS assembler.

Most of the changes have to do with accommodating the //e 80-column display, with its division between main and auxiliary memory. I kept the technique of using the Y-register to index through the source line, and the X-register to index through screen memory. Toggling the Carry bit keeps track of which bank we need to store into, and incrementing Y after each store and incrementing and testing X after every other store takes care of the different indexes we need.
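The indexing scheme can be pictured with a short Python sketch. It assumes the usual //e 80-column layout, in which even columns live in auxiliary memory and odd columns in main memory, and it uses the CORNER address $7D0 from the listing as the start of the bottom screen line. The function names screen_cell and layout are hypothetical, and no character-code conversion for inverse text is attempted.

```python
CORNER = 0x7D0    # start of the bottom text line, from the listing


def screen_cell(column):
    """Map an 80-column position to (bank, address)."""
    # The bank alternates with every character (the Carry toggle),
    # while the address advances only every other character (X).
    bank = "aux" if column % 2 == 0 else "main"
    return bank, CORNER + column // 2


def layout(message):
    """List (bank, address, character) for each character in message."""
    return [screen_cell(col) + (ch,) for col, ch in enumerate(message)]


for bank, address, ch in layout("SWAP"):
    print(f"{ch!r} -> {bank} ${address:04X}")
```

The source index (Y in the assembly) is the column number here, and the screen index (X) is the column divided by two, which is why X is bumped only after every second store.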

I thought we were just about done when I realized that this program wouldn't properly handle lower case text in the message string. To use inverse lower case we have to take the AltChar soft switch into account and adjust the ASCII values. I added that code in at the last minute, and left it with odd line numbers and lower case opcodes, so you can see exactly how much extra effort it takes to deal with inverse lower case. If you're always going to use upper case text in your Pause messages you can save 24 bytes by leaving out those lines.

This program is specifically for the Apple //e 80-column display. For 40-column display you can just change the addresses in Mike's original article. For other 80-column displays you will probably have to give up some transparency, since you are unlikely to be able to display something on the screen without going through the usual I/O hooks. Maybe you can store directly into the card's memory, if the manufacturer documents how to do it.

  1010 *--------------------------------
  1030 *
  1040 *      SYNTAX:  .US <phrase>
  1050 *      RESULT:  Displays <phrase> in inverse text
  1060 *               and waits for a keypress
  1070 *
  1080 *--------------------------------
  1090 CHAR.PTR .EQ $7B
  1110 WBUF     .EQ $200
  1120 CORNER   .EQ $7D0
  1140 KEYBOARD .EQ $C000
  1141  .eq $c00e
  1142 alt.on   .eq $c00f
  1150 STROBE   .EQ $C010
  1151 .eq $c01e
  1160 PAGE1    .EQ $C054
  1170 PAGE2    .EQ $C055
  1180 PROTECT  .EQ $C080
  1190 ENABLE   .EQ $C083
  1210 BELL     .EQ $FBE2
  1220 *--------------------------------
  1230 USR.VECT  .EQ $D015      DOS 3.3
  1240 *             $8015      ProDOS
  1250 *--------------------------------
  1260        .OR $300
  1280 INSTALL
  1290        LDA ENABLE        write enable
  1300        LDA ENABLE        RAM card
  1310        LDX #1            start with hi-bytes
  1320 .1     LDA USR.VECT+1,X  get SC.CMNT address
  1330        PHA               stash it
  1340        LDA EXIT+1,X      get PAUSE address
  1350        STA USR.VECT+1,X  set .US vector
  1360        PLA               recover stash
  1370        STA EXIT+1,X      set exit address
  1380        DEX               now do lo-bytes
  1390        BPL .1
  1400        LDA PROTECT       protect card
  1410        RTS
  1420 *--------------------------------
  1430 PAUSE  LDX #0            start at beginning of screen line
  1440        CLC               clear toggle
  1441        lda      get altchar status
  1442        php               stash it
  1443        sta alt.on        altchars on
  1450        LDY CHAR.PTR      index into source line
  1460 .1     LDA WBUF,Y        get char from call line
  1470        BEQ .3            .EQ. is end of line
  1471        php               preserve carry
  1472        cmp #'`'          test for lower case
  1480        AND #%00111111    invert char
  1481        bcc .15           .CC. if upper case
  1482        ora #%01000000    correct inverse lower case
  1483 .15    plp               restore carry
  1490        BCS .2            branch if odd screen position
  1510        STA PAGE2         even, so use aux memory
  1520        STA CORNER,X      show character
  1530        INY               next message character
  1540        SEC               set toggle
  1550        BCS .1            always
  1570 .2     STA PAGE1         odd, so use main memory
  1580        STA CORNER,X      show character
  1590        INY               next message character
  1600        INX               next screen position
  1610        CPX #40           line full?
  1620        BCC .1            no, get another char, clear toggle
  1640 .3     JSR BELL          beep
  1650 .4     LDA KEYBOARD
  1660        BPL .4            wait for keypress
  1670        STA STROBE
  1671        sta       assume altchars off
  1672        plp               get altchar status
  1673        bpl exit          .PL. if altchar was off
  1674        sta alt.on        set altchars on
  1680 EXIT   JMP PAUSE         address modified by INSTALL
  1690 *--------------------------------

//gs Battery RAM and Clock/Calendar Bob Sander-Cederlof

There are 256 bytes of RAM inside the clock chip in the Apple //gs. These bytes are backed up by the same battery that keeps the clock ticking when you turn off your Apple. You can read and write the battery RAM locations, but not in the same way as regular RAM. You can either do it the hard way, by talking directly to the hardware, or you can do it through the built-in firmware.

First, the easy way. When you turn on your //gs, the power-up routines install a lot of stuff in RAM in banks $E0 and $E1. At the beginning of $E1 there are a lot of JMP opcodes, with long (24-bit) addresses. The one at $E10000 is a jump to the Tool Locator. The Tool Locator is simply a way to access a lot of firmware subroutines without knowing their actual addresses. Instead of calling a firmware subroutine directly, you load up a subroutine number in a register and call the single known address, $E10000.

To keep things organized, the //gs firmware designers require you to call $E10000 with the 65816 in Native mode, with a JSL $E10000. Any parameters the subroutines need must be pushed onto the stack before the JSL, and any results will be on the stack when the subroutine is finished. The carry status will indicate whether the subroutine returned an error code or not, just as in ProDOS MLI. If carry is clear, there was no error; if carry is set, there was an error and the error code is in the A-register. Regardless of the setting of the m- and x-status bits when you call $E10000, it will return with both of them zero (full 16-bit mode).

You tell the Tool Locator which "tool" to call by a code number in the X-register. This is a 16-bit value, so you must have 16-bit mode on for the X-register when you call $E10000 (x-status bit=0). It doesn't matter whether m-status is 0 or 1. The tool code is made up of a tool set number (00-FF, in the low byte) and a tool number (00-FF, in the high byte). The tool code to read all 256 bytes of battery RAM is $0A03; to write 256 bytes out to battery RAM, the tool code is $0903. The following program will read battery RAM:

  1010 *--------------------------------
  1020        .OP 65816
  1030 *--------------------------------
  1040 R      CLC
  1050        XCE
  1060        REP #$30
  1070 *--------------------------------
  1080        PEA BUF/256/256
  1090        PEA BUF
  1100        LDX ##$0A03  READ BATTERY RAM
  1110        JSL $E10000
  1120 *--------------------------------
  1130        SEC
  1140        XCE
  1150        RTS
  1160 *--------------------------------
  1170 W      CLC
  1180        XCE
  1190        REP #$30
  1200 *--------------------------------
  1210        PEA BUF/256/256
  1220        PEA BUF
  1230        LDX ##$0903  WRITE BATTERY RAM
  1240        JSL $E10000
  1250 *--------------------------------
  1260        SEC
  1270        XCE
  1280        RTS
  1290 *--------------------------------
  1300 BUF    .EQ $900
  1310 *--------------------------------

When I did this on my //gs prototype, this is what I got:

     0900-00 00 00 01 00 00 0d 06 02 01 01 00 01 00 00 00-................
     0910-00 00 07 06 02 01 01 00 00 00 00 0F 07 00 08 0B-................
     0920-01 01 00 00 00 00 01 01 05 00 00 00 03 02 02 02-................
     0930-00 00 00 00 00 00 00 0C 08 00 01 02 03 04 05 06-................
     0940-07 0A 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D-................
     0950-0E 0F FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     0960-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     0970-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     0980-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     0990-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     09A0-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     09B0-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     09C0-C2 CF C2 A0 D3 C1 CE C4 C5 D2 AD C3 C5 C4 C5 D2-BOB SANDER-CEDER
     09D0-CC CF C6 A0 FF FF FF FF FF FF FF FF FF FF FF FF-LOF ............
     09E0-FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF-................
     09F0-FF FF FF FF FF FF FF FF FF FF FF FF 27 CE 8D 64-............'N.d

Those last four bytes are some kind of checksum, handled automatically by the tool. I suppose that if the checksum is incorrect on power up, you will be popped into the configurator instead of going into a boot. You can read these bytes, but you cannot write them with the tool: the tool will calculate a checksum and write it when you write the other 252 bytes. Bytes $52 through $FB are either used by the operating system or reserved for the future. Just for fun, I have now written my own name in ASCII code into the bytes starting at $C0.

The rest of the bytes are used as shown in the following table. Where two choices are shown, separated by a slash, the left one has a code of $00 and the right choice has a code of $01.

     Port 1   Port 2
      $00      $0C     Printer/Modem
      $01      $0D     Line Length (Any/40/72/80/132)
      $02      $0E     Delete LF after CR (No/Yes)
      $03      $0F     Add LF after CR (No/Yes)
      $04      $10     Echo (No/Yes)
      $05      $11     Buffer (No/Yes)
      $06      $12     Baud Rate
      $07      $13     Data & Stop Bits
      $08      $14     Parity
      $09      $15     DCD Handshake (No/Yes)
      $0A      $16     DSR Handshake (No/Yes)
      $0B      $17     XON-XOFF Handshake (No/Yes)

     Display Parameters
       $18     Color/Monochrome
       $19     40/80 Column
       $1A     Text Color (00-0F)
       $1B     Background Color (00-0F)
       $1C     Border Color (00-0F)
       $1D     60/50 Hertz Operation
       $29     Text Language (0=English)
       $2F     Flash Rate

     Keyboard Parameters
       $2A     Language (0=English)
       $2B     Buffering (No/Yes)
       $2C     Repeat Speed
       $2D     Repeat Delay
       $30     Shift Caps-LowerCase (No/Yes)
       $31     Fast Space-Delete Keys (No/Yes)
       $32     Dual Speed (Normal/Fast)

     Slot Configuration
       $21-27  Slot 1-7 Internal/External
       $28     Boot Slot

       $1E     User Volume
       $1F     Bell Volume
       $20     System Speed (Normal/Fast)
       $2E     Double-Click Time
       $33     High Mouse Resolution
       $34     Date Format
       $35     Time Format
       $36     Min RAM for Ramdisk
       $37     Max RAM for Ramdisk
       $38-40  Count & Languages
       $41-51  Count & Layouts
       $80     AppleTalk Node Number
       $81-A1  Operating System Variables

It is possible, as I said before, to talk directly to the battery RAM via I/O addresses. If you learn how to do this, and you use the skill to write values into battery RAM, you will probably do so without properly changing the checksum. In that case you have violated your system, and your next power-up will revert to default values for all parameters. It will stay that way until you reconfigure everything and/or install a proper checksum. The best policy is to use the standard firmware tools for all reading and writing, so that the checksum stays current.

You do not have to read or write the whole battery RAM at once. There are two tools for reading and writing a single byte. Tool code $0B03 will write one byte, and tool code $0C03 will read one byte. The following code segments illustrate how to do it. The code as shown must be in Native Mode, with both x- and m-bits zero (full 16-bit mode).

   WR  PEA $00xx     xx is new value for byte
       PEA $00yy     yy is address in battery RAM
       LDX ##$0B03   write xx at yy
       JSL $E10000
   RD  PEA $0000     make room for result
       PEA $00yy     yy is address in battery RAM
       LDX ##$0C03   read value at yy
       JSL $E10000
       PLA           get result from stack (00xx)
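The four battery RAM tool codes used so far all follow the layout described earlier: tool set number in the low byte, tool number in the high byte. A short Python sketch of that packing, with function names made up for the illustration:

```python
def tool_code(tool_set, tool):
    """Pack a tool set number (low byte) and tool number (high byte)."""
    if not (0 <= tool_set <= 0xFF and 0 <= tool <= 0xFF):
        raise ValueError("both numbers must fit in one byte")
    return (tool << 8) | tool_set


def split_tool_code(code):
    """Return (tool_set, tool) for a 16-bit tool code."""
    return code & 0xFF, code >> 8
```

So $0A03 is tool $0A in tool set $03, and the write-all, write-byte, and read-byte codes ($0903, $0B03, $0C03) are tools $09, $0B, and $0C in the same set.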

The Clock/Calendar Chip not only contains the battery RAM; it also contains the date and time information, naturally. There are three tools for reading and writing the time and date. You can read time/date in either hexadecimal format or as an ASCII string, and you can write a new time/date in a hex format. The following code segments illustrate how to use the tools.


       PEA BUFFER/256/256  Hi 16-bits of buffer address
       PEA BUFFER          Lo 16-bits of buffer address
       LDX ##$0F03         Tool Code
       JSL $E10000

The date and time will be converted to ASCII (with msb = 1) and stored in BUFFER, according to the formats selected in the configuration menu (stored in battery RAM locations $34 and $35). The most likely choice among North Americans will be the format "mm/dd/yy HH:MM:SS xM", but you have five other possibilities.


       PEA 0   Make room for 8 bytes
       PEA 0   to be returned
       PEA 0
       PEA 0
       LDX ##$0D03
       JSL $E10000
       PLA     Get $MMSS (minutes, seconds)
       STA MMSS
       PLA     Get $yyHH (year, hours)
       STA YYHH
       PLA     Get $mmdd (month, day)
       STA MMDD
       PLA     Get day of week (in low byte)
       STA DOW

The value for day of week runs from 0 to 6, with 0=Sunday. The value for "day" is 0-30, meaning that you have to add 1 to get the true day number. (Why? This is a little ridiculous!) Likewise, the value for month is 0-11, with 0 standing for January. (I can understand why the hardware might work with 0-based values for day and month, but why couldn't the firmware do the correction to "real" day and month numbers?) The year is specified as the actual year number minus 1900. I hope that means my //gs will still give correct dates after 1999. If the value of the "yy" byte can go all the way to 255, then we could use the //gs until the end of the year 2155. Frankly, I think I'll get tired of computers before then.
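Here is a Python sketch of unpacking those four words into ordinary calendar values. It assumes the notation $MMSS means minutes in the high byte and seconds in the low byte, and likewise for the other words; `decode_time_hex` and `DAY_NAMES` are names invented for the example.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def decode_time_hex(mmss, yyhh, mmdd, dow):
    """Convert the four pulled words to human-style date and time fields."""
    return {
        "minute": mmss >> 8,
        "second": mmss & 0xFF,
        "year": 1900 + (yyhh >> 8),      # yy is the year minus 1900
        "hour": yyhh & 0xFF,
        "month": (mmdd >> 8) + 1,        # stored 0-11, 0 = January
        "day": (mmdd & 0xFF) + 1,        # stored 0-30
        "weekday": DAY_NAMES[dow & 0xFF],
    }
```

For example, the words $1E2D, $560E, $0A00, and $0006 decode to 14:30:45 on Saturday, November 1, 1986. A "yy" byte of $FF gives 2155, the limit mentioned above.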

To write a new date and time out to the Clock chip, you have to push the values onto the stack and call the tool:


       PEA $mmdd    month, day
       PEA $yyHH    year, hour
       PEA $MMSS    minute, second
       LDX ##$0E03
       JSL $E10000

Again, the month and day values are zero-based. Note that you cannot update the day-of-week directly; apparently it is only a CALCULATED value provided when you READ the date/time in hex format.
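Going the other way, this Python sketch builds the three words to push for a write, applying the zero-based and minus-1900 adjustments. It assumes the high-byte/low-byte reading of $mmdd, $yyHH, and $MMSS; the function name and its range checks are additions for the illustration, not something the firmware is known to enforce.

```python
def encode_time_hex(year, month, day, hour, minute, second):
    """Return (mmdd, yyhh, mmss) in the order they are pushed."""
    if not 1900 <= year <= 2155:
        raise ValueError("year must fit in one byte after subtracting 1900")
    if not (1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("month or day out of range")
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time out of range")
    mmdd = ((month - 1) << 8) | (day - 1)   # both stored zero-based
    yyhh = ((year - 1900) << 8) | hour
    mmss = (minute << 8) | second
    return mmdd, yyhh, mmss
```

For 14:30:45 on November 1, 1986, this gives $0A00, $560E, and $1E2D for the three PEA operands.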

You might wonder whether anyone would really NEED all the above information. After all, Apple has provided the configuration system to see/modify all those parameters. The problem is you cannot really use that system unless you can SEE. A lot of Apple owners are not able to see, so they use the ECHO or some other brand of speech synthesizer to speak everything that goes out to the screen. The configuration program cannot be made to speak, as it is now written. Larry Skutchan is planning to write some sort of speaking version of the configurator, and the information above is just what he needs.

Apple Assembly Line is published monthly by S-C SOFTWARE CORPORATION, P.O. Box 280300, Dallas, Texas 75228. Phone (214) 324-2050. Subscription rate is $18 per year in the USA, sent Bulk Mail; add $3 for First Class postage in USA, Canada, and Mexico; add $14 postage for other countries. Back issues are available for $1.80 each (other countries add $1 per back issue for postage).

All material herein is copyrighted by S-C SOFTWARE CORPORATION, all rights reserved. (Apple is a registered trademark of Apple Computer, Inc.)