Volume 5 -- Issue 3 December, 1984 Apple Assembly Line - V5N3 - Dec 1984 Apple Assembly Line
Volume 5 -- Issue 3 December 1984

In This Issue...

New Source for 65802's

I talked to Constantine Geromnimon at Alliance Computers this morning. His company has ordered hundreds of 65802's, and offers them to you at $49.95 each. They expect their next shipment to come in around the middle of January, so now is the time to order. Call them at (718) 672-0684, or write to P. O. Box 408, Corona, NY 11368.

EPROM Programmer

A new EPROM Programmer, called the PROmGRAMER, is out from SCRG (the makers of quikLoader). This one burns anything from 2716's up to 27256's, and retails at $149.50. We'll sell 'em to you for a nice round $140. The software comes on disk, with instructions for loading it into EPROM for the quikLoader card.

Tom Weishaar Writes Again!

If you are among the throng who mourn the passing of Softalk, and particularly of the many informative columns such as DOStalk by Tom Weishaar, you will be as glad as I am that Tom has started publishing his own monthly newsletter.

Called "Open-Apple", you can subscribe for $24. In an unprecedented move toward international goodwill and the wholesome exchange of information, Tom has set the price the same for everyone, everywhere. We promptly sent him a check. If you love your Apple, do likewise. Send to Open-Apple, 10026 Roe, Overland Park, Kansas 66207. If you are cautious, send no money; Tom will bill you with the first issue, and you can cancel if you lose interest.

18-Digit Arithmetic, Part 8 Bob Sander-Cederlof

Someone pointed out last week that this series is getting a little long. Well, we are nearing the end. What we are doing is probably unprecedented in the industry: listing the source code and explaining it for a large commercially valuable software product. It takes time and space to break precedents.

This month's installment completes the normal set of math functions, with sine, cosine, and arc tangent. We even slipped in a simple form of the tangent function. Still to come are the formatted INPUT and PRINT routines.

Some Elementary Info:

Trigonometry is a frightening word. (If it doesn't scare you, skip ahead several paragraphs.) The "-ometry" refers to measurement, but what is a "trigon". Believe it or not, "trigon" is another name for a triangle. Trigon means three sides, and figures with three sides just happen to also have three angles. "Trig" (a nice nickname) is a branch of mathematics dealing with triangles, without which we could not fly to the moon, draw a map, or build bridges. Strangely enough, much of electronics also uses trig funtions ... are electrons triangular?

When I took trig in high school, long before the day of personal calculators, we used trig tables. (These were not articles of furniture made in the local woodshop, but rather long lists of strange numbers printed and bound into books.) The tables contained values for various ratios of the sides of a triangle having one 90-degree angle. Now we use calculators or computers, but obviously the trig tables would not fit in them. Instead, approximation formulas are used.

In high school, we talked about six different ratios: sine, cosine, tangent, cotangent, secant, and cosecant. When it is all boiled down, we really only need the sine; all the rest are derivable from those. The sine function gives a a number for any angle. We frequently need to be able to go from a trig value back to an angle, and the most useful function for that is called the inverse tangent, or arctangent.

Even though I have been talking about triangles, trig functions are even more related to circles. We compute functions of the angle between any two radii, like the hands on an old-fashioned, pre-digital wrist watch. When we start talking about circles, we get into radians vs. degrees.

Just as scientists like logarithms to the base e (rather than 10), they also like trig functions based on angles expressed in radians, rather than degrees. Degrees were invented back in Babylon, I understand, and are nice and clean: 360 make a complete circle. Radians are not clean: 360 degrees is two-times-pi radians. Nevertheless, many physical and electronic formulas simplify when angles are expressed in radians. Consequently, calculators and computer languages usually expect your angles to be expressed in radians. Some allow both options. Applesoft expects radians, and so do my DP18 programs.

We commonly think of an angle as being somewhere between 0 and 360 degrees, or the equivalent range in radians. However, angles can actually be any number, from -infinity to +infinity. The numbers beyond one complete circle are valid, but they don't buy much. If you stand in one place and spin around 1445 degrees (4*360 + 5) you will end up pointing the same direction as if you merely swiveled 5 degrees. Therefore the first step in a sine function calculation involves subtracting out all the multiples of a full circle from the angle.

The arctangent function could return an infinite number of answers, but that is impractical. We will return only the principal value, which is the one closest to 0. All others are that value plus or minus any number of full circles. In DP18 the ATN function may have one or two arguments. If you only have one argument, the result will be an angle between -pi/2 and +pi/2. If you specify two arguments, a value between -pi and +pi will be returned.

The Nitty-Gritty:

Enough of this preliminary stuff, let's get into the code. In the listing which follows, you will find entries for four functions: SIN, COS, TAN, and ATN.

Perhaps the easiest is the TAN function, at lines 2530-2630. Since tan=sin/cos, that is all this code does. We lose a little speed and possibly some precision with this simplistic solution, but the TAN function is relatively rarely called.

Next in difficulty is the COS function, lines 1630-1710. Since cos(-x)=cos(x), we start by making the sign positive (lines 1690-1700. Since cos(x)=sin(x+pi/2), we add pi/2 and fall into the SIN function. Simple, but effective.

The SIN function gets more interesting. For very very small angles, within the precision of 20 digits, sin(x)=x. Lines 1780-1810 check for exponents below -10; all angles smaller than 10^-10 are small enough that sin(x)=x.

Next we take advantage of the fact that sin(-x)=-sin(x), at lines 1820-1860. We remember the sign by shoving it on the stack, and force the sign of x positive.

Lines 1870-1950 get the principal angle. I divide x by twopi, and throw away the integral part. The fractional part that remains is a fraction of a full circle, a value between 0 and .999999...9 (not radians, and not degrees either). Note that if x was extremely large there will be no fractional part, and the remainder will be zero. Some SIN function calculators give an error message when this happens, but I chose to let it ride.

Lines 1960-2000 multiply the circle-fraction by four. This gives a number between 0 and 3.99999...9, which I will refer to later as the "circle fraction times four", or c-f-t-f. The integer part is effectively a quadrant number, and the fractional part a fraction within the quadrant:

               1 | 0
               2 | 3

Lines 2010-2030 determine if the angle is in the first (0) quadrant. If so, no folding need be done.

Lines 2040-2070 determine if the angle is in the second (1) quadrant. If so, we skip ahead to apply the fact that sin(pi/2 + x) = sin(pi/2 - x).

Lines 2080-2160 are executed if the angle is in the 3rd or 4th quadrants (integral part is 2 or 3). Here I apply the fact that sin(pi+x)=-sin(x). I pull the saved sign off the stack, complement it, and shove it back on (lines 2090-2110). Then I subtract 2 from the c-f-t-f, yielding a number between 0 and 1.99999...9. We have folded the third and fourth quadrants over the first and second quadrants. Next lines 2170-2190 determine if the result was in the first quadrant or not.

Lines 2200-2240 fold a second quadrant number into the first quadrant, by applying the fact that sin(pi/2+x) = sin(pi/2-x). Subtacting the c-f-t-f from 2 flips us into the first quadrant.

Lines 2260-2270 pull the sign off the stack and make it the sign of the angle. Remember that now the angle is a fraction (between 0 and .99999...9) of a quadrant. After all these folding operations, the angle might again be very very small, so lines 2280-2300 check for that possibility. If so, sin(x)=x, but that is only true when x is in radians. Lines 2490-2520 convert the quadrant-fraction to radians by multiplying by pi/2, and exits.

Lines 2310-2470 handle larger angles by computing x*P/Q, where P and Q are polynomials in x^2. The constants for P and Q are given in lines 1420-1550, and come from the Hart book. [ I should mention here that I wrote those constants with pretty periods separating groups of five digits. This will not assemble in some older versions of the S-C Macro Assembler. If you get a syntax error, just leave out the periods. ]

Turning the Tables:

ATN is hardest to compute. First we have to deal with the two variants of calls, having one or two arguments. While all the previous function programs were called with the argument already in DAC, DP.ATN is called immediately after parsing the ATN-token. Lines 2960-3070 parse and process the following parentheses and whatever is between them.

Lines 2960-2970 require an opening parenthesis. Line 3070 requires the closing parenthesis. In between we expect one expression, or two expressions separated by a comma. If there is only one, we fake a second one (= 1.0).

What are the two arguments? Looking at a cartesian system, with the vector shown below, the arguments are (Y,X). If you call with one argument, it is (Y/X). vector showing x and y

By using two separate arguments, rather than just the ratio, we can tell which of the four quadrants the vector was in. DP.ATAN will return a value between -pi and +pi, depending on the two signs. If you specify only the ratio, DP.ATAN will return a value between 0 and +pi depending on the sign.

Lines 3120-3160 save the two signs in bits 6 and 7 of UV.SIGN. Way at the end, lines 4100 and following, UV.SIGN determines the final value. If the sign of the denominator (X-vector) was negative, the composite vector is in the 2nd or 3rd quadrant: computing pi - angle gives a result between pi/2 and pi.

If the numerator (Y-vector) was negative, the composite vector is in the 3rd or 4th quadrant. Flipping the sign gives a result between 0 and -pi.

Lines 3180-3220 check for special cases of Y=0 or X=0. If the first argument (Y-vector) is zero, the angle is 0 or pi depending on the sign of the second argument. If the second argument (X-vector) is zero, the angle is either +pi/2 or -pi/2, depending on the sign of the first argument. What if both arguments are zero? That should produce an error message, but I am overlooking it: I will return an angle of 0 in this case.

If neither argument is zero, some special checks are made to see if the value of the ratio is very small or very large. I check before actually dividing, so the divide routine won't kick out on an overflow error. If the ratio would be greater than 10^20, I return a value of pi/2. This is accurate within the precision of DP18. On the other hand, if the ratio is smaller than 10^-63 I return 0. If neither extreme is true, I go ahead an divide to get the actual ratio. Then I check for an extremely small ratio, in which case atan(x)=x.

If we find our way down to line 3390, the ratio is between 10^-10 and 10^20. That is still too large a range for comfort, so we apply the fact that atan(1/x) = atan(pi/2 - x). If the ratio of Y/X is greater than 1.0, then we take the reciprocal and remember that we did so. This in effect folds the range at pi/4. The resulting argument range is between 10^-10 and 1. The variable N holds either 0 or 2 as a flag: 0 if we were already under 1, 2 if we formed the reciprocal.

The shape of the curve of the arctangent function between 0 and 1 (an angle between 0 and pi/4) is deceptive. It looks nice and easy, but a polynomial over that range with 20 digits of precision is much too long. We can easily reduce the range still further by applying another identity. If the reduced argument is now already below tan(pi/12), fine. If not, calculating (x*sqr(3)-1) / (sqr(3)+x) will bring it into that range. If we have to apply that formula, N will be incremented (making it 1 or 3).

The curve between 0 and tan(pi/12) looks almost like a straight line to the naked eye, but it really is far from straight. It takes a ratio of the form P/xQ where P and Q are polynomials in x^2. The coefficients are given in lines 2650-2770, again from Hart. The ratio is computed in lines 3800-3960.

Lines 3970-4080 start the unfolding process. The variable N is either 0, 1, 2, or 3 by this time. If N is 0, no folding was done. If N is 1, only folding above pi/12 was done. If N is 2, only folding above pi/4 was done. If N is 3, both folds were done. These lines convert the angle back to the correct value, using a table of addends and an optional sign flip:

       N       unfolding formula
       0       none
       1       pi/6 + x
       2       pi/2 - x
       3       pi/2 - ( pi/6 + x) = pi/3 - x

That's it! We already discussed the code beyond line 4100, which figures out which quadrant the angle is in.

Any questions?
  1000 *SAVE S.DP18 TRIG
  1010 *--------------------------------
  1020 AS.CHRGET  .EQ $B1
  1030 AS.CHRGOT  .EQ $B7
  1050 AS.CHKOPN  .EQ $DEB8
  1060 *--------------------------------
  1070 POLY.1     .EQ $FFFF
  1080 POLY.N     .EQ $FFFF
  1090 DADD       .EQ $FFFF
  1100 DSUB       .EQ $FFFF
  1110 DMULT      .EQ $FFFF
  1120 DDIV       .EQ $FFFF
  1130 DP.INT     .EQ $FFFF
  1140 DP.EXP     .EQ $FFFF
  1150 DP.TRUE    .EQ $FFFF
  1160 DP.FALSE   .EQ $FFFF
  1180 MOVE.YA.ARG.1  .EQ $FFFF
  1190 MOVE.YA.DAC.1  .EQ $FFFF
  1300 *--------------------------------
  1310 DAC.EXPONENT   .BS 1
  1320 DAC.HI         .BS 10
  1330 DAC.SIGN       .BS 1
  1340 *--------------------------------
  1350 ARG.EXPONENT   .BS 1
  1360 ARG.HI         .BS 10
  1370 ARG.SIGN       .BS 1
  1380 *--------------------------------
  1390 N          .BS 1
  1400 UV.SIGN    .BS 1
  1410 *--------------------------------
  1420 P.SIN      .EQ *
  1430 P.SIN.N    .EQ 6    P6*X^6 + P5*X^5 + ... + P1*X + P0
  1440            .HS 3C.50312.63884.64664.12845   P6
  1450            .HS BE.82818.08039.29577.39110   P5
  1460            .HS 40.62919.63490.93113.55230   P4
  1470            .HS C2.25642.44036.60338.57070   P3
  1480            .HS 43.53892.64053.57788.76289   P2
  1490            .HS C4.49326.67470.47152.36677   P1
  1500            .HS 45.12596.16380.91365.41816   P0
  1510 *--------------------------------
  1520 Q.SIN      .EQ *
  1530 Q.SIN.N    .EQ 2    X^2 + Q1*X + Q0
  1540            .HS 43.15743.43316.33194.13935   Q1
  1550            .HS 44.80189.66936.87727.15787   Q0
  1560 *--------------------------------
  1570 CON.ONE    .HS 41.10000.00000.00000.00000
  1580 CON.TWO    .HS 41.20000.00000.00000.00000
  1590 CON.2PI    .HS 41.62831.85307.17958.64769
  1600 CON.PI.2   .HS 41.15707.96326.79489.66192
  1610 CON.PI     .HS 41.31415.92653.58979.32385
  1620 CON.1..2PI .HS 40.15915.49430.91895.33577   1/2PI
  1630 *--------------------------------
  1640 *      COS (DAC)
  1650 *--------------------------------
  1660 DP.COS LDA #CON.PI.2     PI/2
  1670        LDY /CON.PI.2
  1680        JSR MOVE.YA.ARG.1 COS(X) = SIN(X+PI/2)
  1690        LDA #0            GET ABS(DAC) TO FORCE
  1700        STA DAC.SIGN      ...COS(-X)=COS(X)
  1710        JSR DADD
  1720 *--------------------------------
  1730 *      SIN (DAC)
  1740 *      #3371
  1750 *--------------------------------
  1760 DP.SIN
  1770 *---IF X VERY SMALL...-----------
  1780        LDA DAC.EXPONENT
  1790        CMP #$40-10
  1800        BCS .1            NOT VERY SMALL
  1810        RTS               VERY SMALL, SIN(X)=X
  1820 *---ADJUST FOR SIGN OF X---------
  1830 .1     LDA DAC.SIGN      SIN(-X) = - SIN(X)
  1840        PHA               ...SO SAVE SIGN OF X
  1850        LDA #0            ...AND MAKE X POSITIVE
  1860        STA DAC.SIGN
  1870 *---X*(1/2PI)--------------------
  1880        LDA #CON.1..2PI
  1890        LDY /CON.1..2PI
  1900        JSR MOVE.YA.ARG.1
  1910        JSR DMULT
  1920 *---GET FRACTIONAL PART----------
  1930        JSR MOVE.DAC.ARG
  1940        JSR DP.INT
  1950        JSR DSUB
  1960 *---FOLD QUADRANTS INTO ONE------
  1980        JSR DADD          BY DOUBLING TWICE
  1990        JSR MOVE.DAC.ARG
  2000        JSR DADD          0 <= DAC < 4
  2010        LDA DAC.EXPONENT  IS DAC < 1?
  2020        CMP #$41
  2030        BCC .4            ...YES, IT IS IN 1ST QUADRANT
  2040 *---2ND, 3RD, OR 4TH-------------
  2050        LDA DAC.HI
  2060        CMP #$20          IS DAC < 2.0?
  2070        BCC .3            ...YES, 1ST OR 2ND QUADRANT
  2080 *---FOLD 3RD-4TH OVER 1ST-2ND----
  2090        PLA               ...NO, FLIP SIGN FOR
  2100        EOR #$80                 3RD OR 4TH QUADRANTS
  2110        PHA
  2120        LDA #CON.TWO      FOLD 3RD & 4TH OVER 1ST & 2ND
  2130        LDY /CON.TWO
  2140        JSR MOVE.YA.ARG.1
  2150        JSR SWAP.ARG.DAC
  2160        JSR DSUB
  2170        LDA DAC.EXPONENT
  2180        CMP #$41
  2190        BCC .4            ...ALREADY IN 1ST
  2200 *---FOLD 2ND OVER 1ST------------
  2210 .3     LDA #CON.TWO      LET X=2-X
  2220        LDY /CON.TWO
  2230        JSR MOVE.YA.ARG.1
  2240        JSR DSUB
  2250 *---ANGLE NOW IN 1ST QUADRANT----
  2260 .4     PLA               PUT FINAL SIGN ON X
  2270        STA DAC.SIGN
  2290        CMP #$40-9
  2300        BCC .5            ...YES, SIN(X)=X*PI/2
  2320        JSR MOVE.DAC.TEMP1      X IN TEMP1
  2330        JSR DMULT               X*X IN TEMP2
  2340        JSR MOVE.DAC.TEMP2
  2350        LDA #P.SIN
  2360        LDY /P.SIN
  2370        LDX #P.SIN.N
  2380        JSR POLY.N
  2390        JSR MOVE.DAC.TEMP3
  2400        LDA #Q.SIN
  2410        LDY /Q.SIN
  2420        LDX #Q.SIN.N
  2430        JSR POLY.1
  2440        JSR MOVE.TEMP3.ARG
  2450        JSR DDIV               P/Q
  2460        JSR MOVE.TEMP1.ARG     XP/Q
  2470        JMP DMULT
  2480 *--------------------------------
  2490 .5     LDA #CON.PI.2     FOR VERY SMALL X
  2500        LDY /CON.PI.2     SIN(2X/PI) = X*PI/2
  2510        JSR MOVE.YA.ARG.1
  2520        JMP DMULT
  2530 *--------------------------------
  2540 *      TAN (DAC) = SIN(DAC) / COS(DAC)
  2550 *--------------------------------
  2570        JSR DP.SIN          TAN=SIN/COS
  2600        JSR SWAP.ARG.DAC
  2610        JSR DP.COS          GET COSINE
  2620        JSR POP.STACK.ARG   GET SIN
  2630        JMP DDIV            SIN/COS
  2640 *--------------------------------
  2650 P.ATN      .EQ *    HART # 5505
  2660 P.ATN.N    .EQ 3    P3*X^3 + P2*X^2 + P1*X + P0
  2670            .HS 42.12595.80226.30295.47240   P3
  2680            .HS 43.12557.91664.37980.65520   P2
  2690            .HS 43.29892.80380.69396.22448   P1
  2700            .HS 43.19720.30956.84935.02854   P0
  2710 *--------------------------------
  2720 Q.ATN      .EQ *
  2730 Q.ATN.N    .EQ 4    X^4 + Q3X^3 + Q2X^2 + Q1X + Q0
  2740            .HS 42.37066.08632.20190.23801   Q3
  2750            .HS 43.20769.26817.33604.63361   Q2
  2760            .HS 43.36466.24032.97707.76242   Q1
  2770            .HS 43.19720.30956.84935.02861   Q0
  2780 *--------------------------------
  2790 ATN.TBL.H
  2800        .DA /CON.PI.6
  2810        .DA /CON.PI.2
  2820        .DA /CON.PI.3
  2830 ATN.TBL.L
  2840        .DA #CON.PI.6
  2850        .DA #CON.PI.2
  2860        .DA #CON.PI.3
  2870 *--------------------------------
  2880 CON.TAN.PI.12 .HS 40.26794.91924.31122.70647
  2890 CON.PI.6   .HS 40.52359.87755.98298.87308
  2900 CON.PI.3   .HS 41.10471.97551.19659.77462
  2910 CON.SQR.3  .HS 41.17320.50807.56887.72935
  2920 *--------------------------------
  2930 *      ATN FUNCTION
  2940 *      1 OR 2 ARGUMENTS
  2950 *--------------------------------
  2970        JSR AS.CHKOPN CHECK FOR (
  2990        JSR PUSH.DAC.STACK
  3010        JSR AS.CHRGOT
  3020        CMP #',      TWO-ARG?
  3030        BNE .1       NO
  3040        JSR AS.CHRGET GOBBLE ,
  3050        JSR DP.EXP   YES,GET OTHER ONE
  3070        JSR AS.CHKCLS     REQUIRE ")"
  3080 *--------------------------------
  3090 *      ATN (ARG,DAC)  ARG/DAC
  3100 *--------------------------------
  3110 DP.ATAN
  3120        LDA DAC.SIGN      SAVE BOTH SIGNS
  3130        ASL               SIGN OF DENOMINATOR
  3150        ROR               BIT 7 = DENOM SIGN
  3160        STA UV.SIGN       BIT 6 = NUMER SIGN
  3170 *---CHECK FOR BOUNDARIES---------
  3190        BEQ .1            ...V/0, SO RETURN PI/2
  3200        SEC
  3210        LDA ARG.EXPONENT
  3220        BEQ .12           ...0/U, SO RETURN 0
  3230        SBC DAC.EXPONENT
  3240        BMI .13
  3250        CMP #20           IF >10^20, RETURN PI/2
  3260        BCC .11           ...NOT >10^20
  3270 .1     LDA #CON.PI.2     V/0 OR OVERFLOW
  3280        LDY /CON.PI.2     SO RETURN PI/2
  3290        JSR MOVE.YA.DAC.1
  3300        JMP DP.ATN.C
  3310 .13    CMP #-63          IF <10^-63, RETURN 0
  3320        BCS .11
  3330 .12    JSR DP.FALSE      RETURN 0
  3340 .14    JMP DP.ATN.B
  3350 .11    JSR DDIV          CALCULATE V/U
  3360        LDA DAC.EXPONENT
  3370        CMP #$40-10       IF X VERY SMALL, ATAN(X)=X
  3380        BCC .14           ...VERY SMALL INDEED!
  3390 *---FOLD AT PI/4-----------------
  3400        LDA #0            GET ABS(X), BECAUSE
  3420        STA N
  3430        LDA DAC.EXPONENT  IS X<1?
  3440        CMP #$41
  3450        BCC .3            ...YES, X<1
  3460        LDA #CON.ONE      FORM RECIPROCAL
  3470        LDY /CON.ONE
  3480        JSR MOVE.YA.ARG.1
  3490        JSR DDIV          1/X
  3500        LDA #2            AND REMEMBER WE DID IT
  3510        STA N
  3520 *---FOLD AT PI/12----------------
  3530 .3     JSR MOVE.DAC.TEMP1 SAVE X
  3540        LDA #CON.TAN.PI.12 TAN(PI/12)
  3550        LDY /CON.TAN.PI.12
  3560        JSR MOVE.YA.ARG.1
  3570        JSR DSUB           IS X>TAN(PI/12)?
  3580        LDA DAC.SIGN
  3590        PHA
  3610        PLA
  3620        BPL .4             ...NO, WE DON'T HAVE TO FOLD
  3630        INC N              ...YES, SO FORM
  3640        LDA #CON.SQR.3     (X*SQR(3)-1) / (SQR(3)+X)
  3650        LDY /CON.SQR.3
  3660        JSR MOVE.YA.ARG.1
  3670        JSR DMULT          X*SQR(3)
  3680        JSR MOVE.DAC.ARG
  3690        JSR DP.TRUE
  3700        JSR DSUB           X*SQR(3)-1
  3710        JSR MOVE.DAC.TEMP2 SAVE IT
  3720        JSR MOVE.TEMP1.ARG GET X
  3730        LDA #CON.SQR.3
  3740        LDY /CON.SQR.3
  3750        JSR MOVE.YA.DAC.1
  3760        JSR DADD           SQR(3)+X
  3770        JSR MOVE.TEMP2.ARG
  3780        JSR DDIV           THE ANSWER
  3800 *---ATAN(0...PI/12)--------------
  3810 .4     JSR MOVE.DAC.ARG
  3820        JSR DMULT          X^2
  3830        JSR MOVE.DAC.TEMP2 SAVE X^2
  3840        LDA #P.ATN
  3850        LDY /P.ATN
  3860        LDX #P.ATN.N
  3870        JSR POLY.N
  3880        JSR MOVE.DAC.TEMP3
  3890        LDA #Q.ATN
  3900        LDY /Q.ATN
  3910        LDX #Q.ATN.N
  3920        JSR POLY.1
  3930        JSR MOVE.TEMP3.ARG GET P
  3940        JSR DDIV           P/Q
  3950        JSR MOVE.TEMP1.ARG GET X
  3960        JSR DMULT          P(X^2)/Q(X^2)*X
  3970 *---UNFOLD FROM PI/12, PI/4------
  3980        LDX N             0, 1, 2, OR 3
  3990        BEQ DP.ATN.B      ...NO ADDEND
  4000        DEX               0, 1, OR 2
  4010        BEQ .5            ...NO COMPLEMENT
  4020        LDA DAC.SIGN      ATAN(1/X)=ATAN(PI/2 - X)
  4030        EOR #$80
  4040        STA DAC.SIGN
  4050 .5     LDA ATN.TBL.L,X   GET A(N)
  4060        LDY ATN.TBL.H,X
  4070        JSR MOVE.YA.ARG.1
  4080        JSR DADD          X + A(N)
  4090 *---UNFOLD INTO QUADRANTS--------
  4100 DP.ATN.B
  4120        BPL DP.ATN.C      ...POSITIVE, 1ST OR 4TH
  4130        LDA #CON.PI       ...NEGATIVE, 2ND OR 3RD
  4140        LDY /CON.PI           SO DO PI-X
  4150        JSR MOVE.YA.ARG.1
  4160        JSR DSUB
  4170 *--------------------------------
  4180 DP.ATN.C
  4200        BVC .6            ...POSITIVE, 1ST OR 2ND
  4210        LDA DAC.SIGN      ...NEGATIVE, 3RD OR 4TH
  4220        EOR #$80          -X
  4230        STA DAC.SIGN
  4240 .6     RTS
  4250 *--------------------------------

More Detail on Using 65C02's in old Apples Andrew Jackson

In recent issues of AAL there have been several articles on the 65C02 and how to get it running in the Apple II+. I too was keen to get a 65C02 working in my machine, and had spent some time trying to get first a 1MHz part and then a 2MHz part to work.

William D. O'Ryan's letter in the June 84 AAL prompted me to try again and I am happy to report that the modification he described does work (replacing the LS257's at B6 and B7 with F257's). I wanted to find exactly why I could not simply substitute a 65C02 for a 6502, and so I spent some time looking at the circuit and specifications, using an oscilloscope to check my results.

The reasons that I eventually came up with are as follows. The Apple II circuit relies on various 'features' of the 6502 so that all the various parts of the Apple will work. The circuit diagram shows that the system timing is derived from o/0; the 6502 actually expects system timing to be derived from o/2. There is a slight delay between these two signals: on a 6502 it is about 50ns and on a 65C02 it is about 30ns. This difference in delays is what causes the problems when fitting a 65C02.

To simplify its circuit design the Apple uses a rather dirty trick when reading data from RAM memory. Normally when the 6502 reads data it expects the data on the bus to be valid 100ns before the end of o/2, and it latches the data into its internal registers when o/2 changes. The setup time allows the data bus to settle into a consistent state before being read. The Apple reduces the setup time to about 45 ns (worst case). This setup time would be ample for the 65C02 were it not for the shift between o/0 and o/2; this shift reduces the setup time to 25ns. A 2MHz 65C02 specifies a MINIMUM 40ns setup time; obviously there is a -15ns tolerance on the setup time, and hence the processor works erratically when timings fall into worst case conditions.

The tolerance is regained by substituting 74F257's for the two 74LS257's at board locations B6 and B7. These two chips multiplex the RAM data and the keyboard data; in doing so they add a delay of 30ns worst case to the data. By substituting F257's, the added delay is reduced to 5 ns; this changes the tolerance on the data setup time from -15ns to +10ns.

The Apple //e must use a slightly modified technique when reading data from RAM which explains why a 65C02 works in it without any modifications. I cannot check this as I do not have a //e circuit description. Anyway, it is probably all inside the MMU chip.

[ The 65816 specifications state a minimum read data setup time of 50ns, 10ns longer than the 65C02. One AAL reader has called us to report that the 65802 works wonderfully well in his old II+, even better than the original 6502. Some of you have wondered where to get the F257's: try Jameco Electronics, 1355 Shoreway Road, Belmont, CA 94002, phone (415) 592-8097. Their ad in Byte, Dec '84, page 349, says they have 74F257's at $1.79 each. (editor) ]

Gary Little's New Book, "Inside the Apple //e" Bob Sander-Cederlof

This is a useful book. The kind you want to keep, read, and constantly use as a reference. About 400 pages thick, 6x9, published by Brady Communications at $19.95.

Gary, a lawyer in Vancouver, has been serious about Apples since 1978 (almost as long as me). He's a long-time subscriber to AAL, Call APPLE, and other sources of the in-depth knowledge crammed into his book. He's also a programmer, with serious software on the market such as "Modem Magician". He knows what he's writing about, and writes it well.

A walk through the chapters may be the quickest way to get the measure of the book.

1--condensed history of Apple; intro. to binary, hex, and assembly language.

2--inside the 6502 itself: zero page, stack, registers, status, opcodes, address modes, I/O, interrupts, and the memory layout in the //e.

3--the Apple monitor: the commands explained, plus a table of the most useful subroutines in the monitor ROM.

4--Applesoft: memory map, tokenization, variable storage, integer and real numbers, the CHRGET subroutine, linking to assembly language programs, subroutines in ROM, and more.

5--DOS: internal structure, memory map, page 3 vectors, VTOC, catalog, track/sector lists, RWTS, and a read.sector program. ProDOS: memory map, page 3 vectors, volume bit map, directory, MLI, and a read.block program.

6--character input and the keyboard: RDKEY, 80-column firmware, RDCHAR, reading a line, changing input devices, encoding of keys, auto-repeat, type-ahead, all about RESET.

7--character and graphic output: too much to list here, all the way through double hi-res.

8--memory management: bank switching of ROM and RAM, auxiliary RAM, running co-resident programs.

9--speaker and cassette ports: music and voice.

10--game port: experiments, push button inputs, annunciators, strobe.

11--peripheral slots: I/O memory locations, slot ROM, expansion ROM, scratchpad RAM, auxiliary slot, software protocols.

Many useful and interesting programs are listed in the book. There is an optional diskette available (coupon bound in the book offers it for $20). The diskette also includes a few bonus utility programs for use with DOS 3.3, including RAMDISK and DISK MAP.

Each chapter ends with a bibliography of related books, manuals, and articles. (You'll find lots of references to AAL.)

If you grew along with Apple, as I did, you probably don't really need this book. On the other hand, you will still enjoy it, and probably want it for you collection. If you are relatively new, and having difficulty gathering all the information from past publications and scattered sources, you will want Gary's book too.

As you might suspect, we like the book so well we have decided to stock it. You can get from us for $18 plus shipping (and tax where applicable).

Correction re MVN and MVP in 65802 Bob Sander-Cederlof

In the October AAL I presented a general memory mover written in 65802 code. I stated that the MVP and MVN instructions took 3 cycles-per-byte during the move. I was wrong.

In looking through small tiny print in the preliminary documentation for the chip, I came across the number "7". Shocked, I wrote a little test program which moved 10000 bytes 1000 times. That means the MVN in my test would move a total of 10,000,000 bytes. With a stop watch I clocked the running time at just under 70 seconds. If it had been 3 cycles-per-byte, the test would have run in 30 seconds.

I don't know how I got that "3" in my head, but the right number is "7". Still considerably faster than 6502, though.
  1010        .OP 65816
  1020        .OR $300
  1030 *--------------------------------
  1040 CNTR   .EQ 0 AND 1
  1050 *--------------------------------
  1060 MVN.TIMER
  1070        CLC          65816 MODE
  1080        XCE
  1090        REP #$30     16-BIT MODE
  1100 *--------------------------------
  1110        LDA ##1000
  1120        STA CNTR
  1130 *--------------------------------
  1140 .1     LDX ##$3000  Source start address
  1150        LDY ##$4000  Destination start address
  1160        LDA ##9999   # Bytes - 1
  1170        MVN 0,0
  1180        DEC CNTR
  1190 *      BNE .1
  1200 *--------------------------------
  1210        SEC          RETURN TO 6502 MODE
  1220        XCE
  1230        RTS

Strange Way to Divide by 7 Bob Sander-Cederlof

Division by seven is a necessary step for hi-res plotting routines. The quotient is the byte index on a given scan line. The remainder gives the bit position within that byte.

The hi-res code inside the Applesoft ROMs uses a subtraction loop to divide by seven, which can loop up to 36 times at 7 cycles per loop. This is a maximum of over 250 cycles, which is why super-fast hi-res usually uses lookup tables for the quotient and remainder.

I stumbled on a faster way of dividing any value up to 255 by seven. This is not directly usable by standard hi-res, because the x-coordinate can be as large as 279. My trick also does not give the remainder, just the quotient.

Here is the program, along with a test routine which tries every value from 0 to $FF, printing the quotient. The output from the test program is also shown, and you can see that the quotient is correct in every case. Can you explain why it works?

[ Hint: 1/7 = 1/8 + 1/64 + 1/512 + 1/4096 + ... ]
  1010 *--------------------------------
  1020 BYTE   .EQ 0
  1030 *--------------------------------
  1040 T      LDA #0
  1050        STA BYTE
  1060 .2     LDX #14
  1070 .1     CPX #7
  1080        BNE .4
  1090        LDA #$A0
  1100        JSR $FDED              0000000000000000 0101010101010101
  1110 .4     JSR DIVIDE.BY.SEVEN    0202020202020202 0303030303030303
  1120        JSR $FDDA              0404040404040404 0505050505050505
  1130        INC BYTE               0606060606060606 0707070707070707
  1140        BEQ .3                 0808080808080808 0909090909090909
  1150        DEX                    0A0A0A0A0A0A0A0A 0B0B0B0B0B0B0B0B
  1160        BNE .1                 0C0C0C0C0C0C0C0C 0D0D0D0D0D0D0D0D
  1170        JSR $FD8E              0E0E0E0E0E0E0E0E 0F0F0F0F0F0F0F0F
  1180        JMP .2                 1010101010101010 1111111111111111
  1190 .3     RTS                    1212121212121212 1313131313131313
  1200 *---------------------------  1414141414141414 1515151515151515
  1210 DIVIDE.BY.SEVEN               1616161616161616 1717171717171717
  1220        LDA BYTE               1818181818181818 1919191919191919
  1230        LSR                    1A1A1A1A1A1A1A1A 1B1B1B1B1B1B1B1B
  1240        LSR                    1C1C1C1C1C1C1C1C 1D1D1D1D1D1D1D1D
  1250        LSR                    1E1E1E1E1E1E1E1E 1F1F1F1F1F1F1F1F
  1260        ADC BYTE               2020202020202020 2121212121212121
  1270        ROR                    2222222222222222 2323232323232323
  1280        LSR                    24242424
  1290        LSR
  1300        ADC BYTE
  1310        ROR 
  1320        LSR
  1330        LSR
  1340        RTS
  1350 *--------------------------------

It is possible to divide by 3 or 15 using a program based on the same principle as the divide-by-seven above. Here is the code for those.
  1220        LDA BYTE
  1230        LSR
  1240        LSR
  1250        LSR
  1260        LSR
  1270        ADC BYTE
  1280        ROR 
  1290        LSR
  1300        LSR
  1310        LSR
  1320        ADC BYTE
  1330        ROR 
  1340        LSR
  1350        LSR
  1360        LSR
  1370        RTS
  1220        LDA BYTE
  1230        LSR
  1240        LSR
  1250        ADC BYTE
  1260        ROR 
  1270        LSR
  1280        ADC BYTE
  1290        ROR 
  1300        LSR
  1310        ADC BYTE
  1320        ROR 
  1330        LSR
  1340        ADC BYTE
  1350        ROR 
  1360        LSR
  1370        RTS

Using the divide by 15, you could make a divide by ten. First multiply the original number by three (by shifting one bit left and adding), then divide by 15 using the above program, and then by 2 (by shifting one bit right). Since 3X/30 = X/10, there you have it.

Sly Hex Conversion Bob Sander-Cederlof

Have you ever wondered what would happen if you added, in the 6502 decimal mode,values that were not decimal? I have. I also wondered if any of the results might be useful.

For example, what happens if I add 0 to $0A, in decimal mode? The following little piece of code will tell me:

       SED             set decimal mode
       LDA #$0A
       ADC #0
       CLD             clear decimal mode
       JMP $FDDA       monitor print byte routine

Lo! The $0A turns into $10! It makes sense, because of course adding zero does not change anything. But the automatic "decimal adjust" that occurs after the add when the 6502 is in decimal mode detects the "A" nybble, generates a carry to the next nybble, and subtracts $0A.

It turns out the same process turns $0B into $11, $0C into $12, and so on up to $0F into $15.

That is a useful result! That means that I can convert a hex nybble to BCD byte by merely adding zero when in decimal mode!

A little further experimentation will lead to another useful trick. If I add first $90 and then $40, both additions in decimal mode, a value between $00 and $0F will be converted to the ASCII code for the digits 0-9 and letter A-F. Believe it or not!

The first addition, of $90, gives us $90-$9F. The automatic "decimal adjust" does nothing to $90-$99, and carry will be clear afterwards. If the intermediate result was $9A-$9F, the decimal adjust will first generate a nybble carry because the A-F nybble is greater than 9, and reduce that nybble by A. The nybble carry will increment the 9 nybble to A, which gets reduced back to 0 and a byte carry is set. This means we end up with $90-$99 with carry clear or $00-$05 with carry set.

Adding $40 in the next step brings the $90-$99 up to $30-$39 (with carry out of the byte, which we will ignore). The $00-$05 will be brought up to $41-$45, ASCII codes for A-F. Voila!

Useful, but maybe not the best. It turns out that a more traditional approach is only one byte longer and saves a few cycles. With the value $00-$0F in the A-register:

       CMP #$0A
       BCC .1          0-9
       ADC #6          convert A-F to $11-16
   .1  ADC #$30

will convert to ASCII.
  1010 T      LDX #0
  1020 .1     TXA
  1030        JSR $FDDA
  1040        LDA #"-"
  1050        JSR $FDED
  1060        TXA
  1070 *--------------------------------
  1080        SED
  1090        CLC
  1110        ADC #0
  1120        CLD
  1130 *--------------------------------
  1140        JSR $FDDA
  1150        LDA #"-"
  1160        JSR $FDED
  1170        TXA
  1180 *--------------------------------
  1190        SED
  1200        CLC
  1210        ADC #$90
  1220        ADC #$40
  1230        CLD
  1240 *--------------------------------
  1250        JSR $FDDA
  1260        LDA #"-"
  1270        JSR $FDED
  1280        TXA
  1290 *--------------------------------
  1300        CMP #10
  1310        BCC .2
  1320        ADC #6
  1330 .2     ADC #$30
  1340 *--------------------------------
  1350        JSR $FDDA
  1360 *--------------------------------
  1370        JSR $FD8E
  1380        INX
  1390        CPX #16
  1400        BCC .1
  1410        RTS

Remembering When Bob Sander-Cederlof

There is a lot of grumbling going on, or at least so says the media. Supposedly Mac owners are MAD over Apple's $995 price tag for the 512K upgrade kit. And the fact that new buyers get a lower system price makes them even madder.

If it's true, then I guess the computer "for the rest of us" has found a market with a real-estate or Detroit mentality. Haven't they noticed that prices on virtually all electronic items go down every year? (I always say, "If houses and cars had gone the way electronics has over the last 30 years, we would now be able to buy a 3-bedroom home for two dollars and a nice car for 50 cents. Of course they would both fit on the head of a pin....")

I remember when I bought my Apple, with two rows of 4K RAM chips totalling 8K bytes. Adding another row of 4K chips would have cost me about $50. The price at that time for one set of 8 16K chips was $520. Through a special arrangement at Mostek, members of our local club were able to get them for $150. So to raise my Apple from 8K to 48K cost me $450. Retail price would have been $1560, plus tax.

Looking back even further, I found a letter from a Raymond Hoobler to the editor of the Journal of Dentistry, from October 1976. Ray owned an Apple 1, which was populated with 1K RAM chips. He was VERY happy with Apple's promise of an upgrade kit consisting of 4K RAM chips for ONLY $500!

It will not be too long before the price of 256K RAMs drops. Then we can start grumbling about the price of 4-megabyte upgrade kits. Or, we could rejoice at the blessings of ever improving technology, mass marketing, and understanding wives.

Generating Tables for Faster Hi-Res Bob Sander-Cederlof

Look on page A23 in the Apple Supplement in the back of the December 1984 issue of Byte for an excellent article for the hi-res graphics buff: "Preshift-Table Graphics on Your Apple", by Bill Budge, Gregg Williams, and Rob Moore.

The article presents another of Bill Budge's secrets for fast animation using block graphics. If you want to move a block a few dots left or right, it is time-consuming to shift the 7-bits-in-8 dot images. Older techniques stored pre-shifted sets for each image that might be moved. The neater method described in this article stores a 14x256 byte table of all possible shifts of all possible bytes, and uses a fast lookup technique. I am not going to repeat all that here ... get the article.

The article also included some sample programs that used two other tables: a 192 entry address table for the addresses of each hi-res line, and a 280 entry table for the quotient and remainder of each horizontal position. Both of these tables were originally generated by Applesoft programs, and BSAVEd. The example program BLOADed them.

It dawned on me that a machine language program to generate those two tables would take less than half a page of code and be considerably faster than BLOADing pre-generated tables. Furthermore, once the tables were generated, the half-page of code could be overlaid with other programs or data. In a commercial product, this could cut down the boot time significantly.

First I wrote a program to generate the 192 addresses. This was almost a hand-compilation of the Applesoft program in the Byte article, but not quite. (I wrote the comments in near- Basic, as you can see.)

Then I merged into that program the stuff to generate the first 192 quotients and remainders. This is the horizontal dot position divided by 7 (7 dots per byte) to give the byte position on the line and the bit position in that byte.

After the 192 trips through that code, I added a loop to generate the rest of the Q/R pairs, from dot position 192 up to 279.

I timed the program by running it 250 times. All 250 took roughly 3 seconds, which means building the tables once takes about 12 milliseconds. Compare that to loading them from disk, which would take at least a half second.

I haven't tried it yet, but I think the preshift tables which were the meat of the Byte article could also be generated by a machine language program much quicker than BLOADing the same. And since the program only needs to be used once, during initialization, it too could be burned after using.
  1010 *--------------------------------
  1020 I      .EQ 0
  1030 JL     .EQ 1
  1040 JH     .EQ 2
  1050 K      .EQ 3
  1060 Q      .EQ 4
  1070 R      .EQ 5
  1080 *--------------------------------
  1090 ADDRL  .EQ $900
  1100 ADDRH  .EQ $9C0
  1110 QUO.1  .EQ $A80
  1120 QUO.2  .EQ QUO.1+192
  1130 REM.1  .EQ QUO.1+280
  1140 REM.2  .EQ REM.1+192
  1150 *--------------------------------
  1160 BUILD  LDX #0       FOR X = 0 TO 191 STEP 1
  1170        STX I        FOR I = 0 TO $50 STEP $28
  1180        STX JL       FOR J = 0 TO $0380 STEP $0080
  1190        STX JH
  1200        STX K        FOR K = 0 TO $1C STEP $04
  1210        STX Q        QUOTIENT = 0
  1220        STX R        REMAINDER = 0
  1230 *---BUILD NEXT HI-RES ADDR-------
  1240 .1     LDA I
  1250        ORA JL
  1260        STA ADDRL,X
  1270        LDA #$20
  1280        ORA JH
  1290        ORA K
  1300        STA ADDRH,X
  1310 *---SAVE NEXT Q/R PAIR-----------
  1320        LDA Q
  1330        STA QUO.1,X
  1340        LDA R
  1350        STA REM.1,X
  1360 *---NEXT K-----------------------
  1370        CLC
  1380        LDA K
  1390        ADC #4
  1400        STA K
  1410        EOR #$20
  1420        BNE .2
  1430 *---NEXT J-----------------------
  1440        STA K
  1450        LDA JL
  1460        EOR #$80
  1470        STA JL
  1480        BNE .2
  1490        INC JH
  1500        LDA JH
  1510        EOR #4
  1520        BNE .2
  1530 *---NEXT I-----------------------
  1540        STA JH
  1550        CLC
  1560        LDA I
  1570        ADC #$28
  1580        STA I
  1590 *---BUMP Q/R PAIR----------------
  1600 .2     INC R        R COUNTS 0...6
  1610        LDA R
  1620        EOR #7       IF R=7, MAKE 0 AND BUMP Q
  1630        BNE .3       ...NOT 7 YET
  1640        STA R        ...R=7, SO MAKE IT 0
  1650        INC Q             AND BUMP Q
  1660 *---NEXT X-----------------------
  1670 .3     INX
  1680        CPX #192
  1690        BCC .1
  1700 *---NOW FINISH Q/R PAIRS---------
  1710 *---BETWEEN 192 AND 279----------
  1720        LDX #0       FOR X = 0 TO 280-192-1
  1730 .4     LDA Q
  1740        STA QUO.2,X
  1750        LDA R
  1760        STA REM.2,X
  1770 *---BUMP Q/R PAIR AS BEFORE------
  1780        INC R
  1790        LDA R
  1800        EOR #7
  1810        BNE .5
  1820        STA R
  1830        INC Q
  1840 *---NEXT X-----------------------
  1850 .5     INX
  1860        CPX #280-192
  1870        BCC .4
  1880        RTS
  1890 *--------------------------------
  1900        .LIST OFF

Blankenship's Basic Bob Sander-Cederlof

John Blankenship has put together an Applesoft enhancement package, at a mouth-watering price. (See his ad elsewhere in this issue for his $20 introductory offer.) He sent me a review copy, so I tried it out.

BBASIC is a large chunk of machine language code that sits between HIMEM and the DOS file buffers. It also sits between you and Applesoft, hiding itself behind a facade of new editing and listing features. BBASIC takes control even in direct mode, giving you an EDIT command, structured listings, and the ability to skip out of long catalogs.

In pure BBASIC, line numbers are used only as line numbers, not as destinations for GOTOs or GOSUBs. A built-in RENUM command soon convinces you to live this way and like it. In place of line-number branches, you use alphabetic "names" for subroutines, and WHEN-ELSE-ENDWHEN for logic flow. John has also added WHILE-ENDWHILE, REPEAT-UNTIL, CASE, and other structured looping and branching words.

During execution, a special COMPILE verb creates a table of "names" used in your program. This speeds up execution.

Hires Text generation is built-in, along with some extensions to the hires graphics. Musical tone generation with control over pitch, duration, and timbre is also included. You also get SORT, SEARCH, and PRINT USING.

I am just scratching the surface. I didn't like every feature, but there is plenty left over. Worth a lot more than $20.

By the way, if John's name sounds familiar, it may be because he is the author of "The Apple House", a book on controlling your home published by Prentice-Hall. John also is a Professor at DeVry Institute.

A Solution to Overlapping DOS Patches Paul Lewis
Fairfax, Virginia

I have recently resolved a compatibility problem between two desirable sets of DOS 3.3 patches: the RAMdisk of the 192K Neptune extended memory card, and the DOS Dater that comes with Applied Engineering's Timemaster II. It seems they both want to put patches into the same "unused" spaces inside DOS.

After examining the two patches carefully, I found out which parts of the patches were overlapping. Being unable to find a truly unused area inside DOS, I used the technique on page 7.3 of "Beneath Apple DOS" of placing routines in the "safe" area between DOS and its buffers. This seems to work fine. [ Until you try to run some other program that does the same thing, like PLE... (editor) ]

The file DATER.OBJ0 contains the DOS.DATER patch that I use. I noticed that the patch could be placed anywhere, since there are no internal references. Using an Applesoft program (part of my HELLO), I move the DOS buffers down far enough to fit this code in, and then BLOAD the patches.

     120 POKE 40192,128 : REM Lower the buffers
     130 PRINT CHR$(4)"MAXFILES 3"
     160 POKE 45571,15 :REM Patch file name length
     170 POKE 42883,14
     180 POKE 44085,208 :REM Hook DOS to the DATER code
     190 POKE 44086,156                   

Apple Assembly Line is published monthly by S-C SOFTWARE CORPORATION, P.O. Box 280300, Dallas, Texas 75228. Phone (214) 324-2050. Subscription rate is $18 per year in the USA, sent Bulk Mail; add $3 for First Class postage in USA, Canada, and Mexico; add $12 postage for other countries. Back issues are available for $1.80 each (other countries add $1 per back issue for postage).

All material herein is copyrighted by S-C SOFTWARE CORPORATION, all rights reserved. (Apple is a registered trademark of Apple Computer, Inc.)