

                          XAA.HOWTO

  This file describes how to add basic XAA support to a chipset driver.

0)  What is XAA
1)  XAA Initialization and Shutdown
2)  The Primitives
  2.0  Generic Flags
  2.1  Screen to Screen Copies
  2.2  Solid Fills
  2.3  Solid Lines
  2.4  Dashed Lines
  2.5  Color Expand Fills
    2.5.1 Screen to Screen Color Expansion
    2.5.2 CPU to Screen Color Expansion
      2.5.2.1 The Direct Method
      2.5.2.2 The Indirect Method
  2.6  8x8 Mono Pattern Fills
  2.7  8x8 Color Pattern Fills
  2.8  Image Writes
    2.8.1 The Direct Method
    2.8.2 The Indirect Method
  2.9 Clipping
3)  The Pixmap Cache
4)  Offscreen Pixmaps

/********************************************************************/

0) WHAT IS XAA
	
   XAA (the XFree86 Acceleration Architecture) is a device dependent
layer that encapsulates the unaccelerated framebuffer rendering layer,
intercepting rendering commands sent to it from higher levels of the
server.  For rendering tasks where hardware acceleration is not 
possible, XAA allows the requests to proceed to the software rendering
code.  Otherwise, XAA breaks the sometimes complicated X primitives
into simpler primitives more suitable for hardware acceleration and
will use accelerated functions exported by the chipset driver to 
render these.

   XAA provides a simple, easy to use driver interface that allows
the driver to communicate its acceleration capabilities and restrictions
back to XAA.  XAA will use the information provided by the driver
to determine whether or not acceleration will be possible for a
particular X primitive.



1) XAA INITIALIZATION AND SHUTDOWN

   All relevant prototypes and defines are in xaa.h.

   To Initialize the XAA layer, the driver should allocate an XAAInfoRec
via XAACreateInfoRec(), fill it out as described in this document
and pass it to XAAInit().  XAAInit() must be called _after_ the 
framebuffer initialization (usually cfb?ScreenInit or similar) since 
it is "wrapping" that layer.  XAAInit() should be called _before_ the 
cursor initialization (usually miDCInitialize) since the cursor
layer needs to "wrap" all the rendering code including XAA.

   When shutting down, the driver should free the XAAInfoRec
structure in its CloseScreen function via XAADestroyInfoRec().
The prototypes for the functions mentioned above are as follows:

   XAAInfoRecPtr XAACreateInfoRec(void);
   Bool XAAInit(ScreenPtr, XAAInfoRecPtr);
   void XAADestroyInfoRec(XAAInfoRecPtr);

   The driver informs XAA of its acceleration capabilities by
filling out an XAAInfoRec structure and passing it to XAAInit().
The XAAInfoRec structure contains many fields, most of which are
function pointers and flags.  Each primitive will typically have
two functions and a set of flags associated with it, but it may
have more.  These two functions are the "SetupFor" and "Subsequent" 
functions.  The "SetupFor" function tells the driver that the 
hardware should be initialized for a particular type of graphics 
operation.  After the "SetupFor" function, one or more calls to the 
"Subsequent" function will be made to indicate that an instance
of the particular primitive should be rendered by the hardware.
The details of each instance (width, height, etc...) are given
with each "Subsequent" function.   The set of flags associated
with each primitive lets the driver tell XAA what its hardware
limitations are (eg. It doesn't support a planemask, it can only
do one of the raster-ops, etc...).

  Of the XAAInfoRec fields, one is required.  This is the
Sync function.  XAA initialization will fail if this function
is not provided.

void Sync(ScrnInfoPtr pScrn)			/* Required */

   Sync will be called when XAA needs to be certain that all
   graphics coprocessor operations are finished, such as when
   the framebuffer must be written to or read from directly
   and it must be certain that the accelerator will not be
   overwriting the area of interest.

   One needs to make certain that the Sync function not only
   waits for the accelerator fifo to empty, but that it waits for
   the rendering of that last operation to complete.

   It is guaranteed that no direct framebuffer access will
   occur after a "SetupFor" or "Subsequent" function without
   the Sync function being called first.
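
   A minimal sketch of the wait loop a Sync function might contain,
   honoring both conditions above (FIFO drained and last operation
   finished).  The status register layout, the MOCK_* bits and the
   mock_* names are hypothetical; a real Sync takes a ScrnInfoPtr and
   polls whatever status register the chipset actually provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status bits; real hardware defines its own. */
#define MOCK_FIFO_EMPTY   0x1u
#define MOCK_ENGINE_BUSY  0x2u

typedef struct {
    volatile uint32_t status;   /* stand-in for a memory mapped register */
} MockRegs;

/* Returns 1 only when the FIFO has drained AND the engine is idle. */
static int mock_engine_idle(const MockRegs *regs)
{
    uint32_t s = regs->status;
    return (s & MOCK_FIFO_EMPTY) && !(s & MOCK_ENGINE_BUSY);
}

/* Sync body: spin until the last operation has fully rendered.
 * max_polls bounds the loop so a wedged engine cannot hang the caller.
 * Returns 1 on success and 0 on timeout. */
static int mock_sync(const MockRegs *regs, unsigned long max_polls)
{
    assert(regs != NULL);
    while (max_polls--) {
        if (mock_engine_idle(regs))
            return 1;
    }
    return 0;
}
```

   Checking only the FIFO bit would return too early: an empty FIFO
   says nothing about whether the final command is still drawing.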



2)  THE PRIMITIVES

2.0  Generic Flags

  Each primitive type has a set of flags associated with it which
allow the driver to tell XAA what the hardware limitations are.
The common ones are as follows:

/* Foreground, Background, rop and planemask restrictions */

   GXCOPY_ONLY

     This indicates that the accelerator only supports GXcopy
     for the particular primitive.

   ROP_NEEDS_SOURCE

     This indicates that the accelerator doesn't support a
     particular primitive with rops that don't involve the source.
     These rops are GXclear, GXnoop, GXinvert and GXset. If neither
     this flag nor GXCOPY_ONLY is defined, it is assumed that the
     accelerator supports all 16 raster operations (rops) for that
     primitive.

   NO_PLANEMASK

     This indicates that the accelerator does not support a hardware
     write planemask for the particular primitive.

   RGB_EQUAL

     This indicates that the particular primitive requires the red, 
     green and blue bytes of the foreground color (and background color,
     if applicable) to be equal. This is useful for 24bpp when a graphics
     coprocessor is used in 8bpp mode, which is not uncommon in older
     hardware since some have no support for or only limited support for 
     acceleration at 24bpp. This way, many operations will be accelerated 
     for the common case of "grayscale" colors.  This flag should only
     be used in 24bpp.

  In addition to the common ones listed above which are possible for
nearly all primitives, each primitive may have its own flags specific
to that primitive.  If such flags exist they are documented in the
descriptions of those primitives below.
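
  The sketch below illustrates how the four common restrictions could
be tested against a request.  It is not XAA's own code.  The MOCK_* bit
values are invented for the example (the real values are in xaa.h), and
can_accelerate() is a hypothetical helper.  The GX* raster-op codes are
the ones X11 defines in X.h.

```c
#include <assert.h>

/* Raster-op codes as defined in X11's X.h. */
enum { GXclear = 0x0, GXcopy = 0x3, GXnoop = 0x5, GXinvert = 0xa, GXset = 0xf };

/* Hypothetical bit assignments; the real values live in xaa.h. */
#define MOCK_GXCOPY_ONLY       0x01u
#define MOCK_ROP_NEEDS_SOURCE  0x02u
#define MOCK_NO_PLANEMASK      0x04u
#define MOCK_RGB_EQUAL         0x08u

/* 1 if the rop ignores the source (GXclear, GXnoop, GXinvert, GXset). */
static int rop_ignores_source(int rop)
{
    return rop == GXclear || rop == GXnoop || rop == GXinvert || rop == GXset;
}

/* 1 if the red, green and blue bytes of a 24bpp color are equal. */
static int rgb_bytes_equal(unsigned int color)
{
    unsigned int r = (color >> 16) & 0xff;
    unsigned int g = (color >> 8) & 0xff;
    unsigned int b = color & 0xff;
    return r == g && g == b;
}

/* Decide whether a primitive with the given restriction flags can be
 * accelerated.  planemask_full is nonzero when every plane is enabled. */
static int can_accelerate(unsigned int flags, int rop, unsigned int fg,
                          int planemask_full)
{
    assert(rop >= 0 && rop <= 0xf);
    if ((flags & MOCK_GXCOPY_ONLY) && rop != GXcopy)
        return 0;
    if ((flags & MOCK_ROP_NEEDS_SOURCE) && rop_ignores_source(rop))
        return 0;
    if ((flags & MOCK_NO_PLANEMASK) && !planemask_full)
        return 0;
    if ((flags & MOCK_RGB_EQUAL) && !rgb_bytes_equal(fg))
        return 0;
    return 1;
}
```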
 



2.1  Screen to Screen Copies

   The SetupFor and Subsequent ScreenToScreenCopy functions provide
   an interface for copying rectangular areas from video memory to
   video memory.  To accelerate this primitive the driver should
   provide both the SetupFor and Subsequent functions and indicate
   the hardware restrictions via the ScreenToScreenCopyFlags.  The
   NO_PLANEMASK, GXCOPY_ONLY and ROP_NEEDS_SOURCE flags as described
   in Section 2.0 are valid as well as the following:

    NO_TRANSPARENCY
     
      This indicates that the accelerator does not support skipping
      of color keyed pixels when copying from the source to the destination.

    TRANSPARENCY_GXCOPY_ONLY

      This indicates that the accelerator supports skipping of color keyed
      pixels only when the rop is GXcopy.

    ONLY_LEFT_TO_RIGHT_BITBLT

      This indicates that the hardware only accepts blitting when the
      x direction is positive.

    ONLY_TWO_BITBLT_DIRECTIONS

      This indicates that the hardware can only cope with blitting when
      the direction of x is the same as the direction in y.


void SetupForScreenToScreenCopy( ScrnInfoPtr pScrn,
			int xdir, int ydir,
			int rop,
			unsigned int planemask,
			int trans_color )

    When this is called, SubsequentScreenToScreenCopy will be called
    one or more times directly after.  If ydir is 1, then the accelerator
    should copy starting from the top (minimum y) of the source and
    proceed downward.  If ydir is -1, then the accelerator should copy
    starting from the bottom of the source (maximum y) and proceed
    upward.  If xdir is 1, then the accelerator should copy each
    y scanline starting from the leftmost pixel of the source.  If
    xdir is -1, it should start from the rightmost pixel.  
       If trans_color is not -1 then trans_color indicates that the
    accelerator should not copy pixels with the color trans_color
    from the source to the destination, but should skip them. 
    Trans_color is always -1 if the NO_TRANSPARENCY flag is set.
 

void SubsequentScreenToScreenCopy(ScrnInfoPtr pScrn,
			int x1, int y1,
			int x2, int y2,
			int width, int height)

    Copy a rectangle "width" x "height" from the source (x1,y1) to the 
    destination (x2,y2) using the parameters passed by the last
    SetupForScreenToScreenCopy call. (x1,y1) and (x2,y2) always denote 
    the upper left hand corners of the source and destination regardless 
    of which xdir and ydir values are given by SetupForScreenToScreenCopy.  
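
    Because the corners passed in are always the upper left ones, an
    engine that wants the first pixel it will touch needs a small
    conversion.  The sketch below shows that conversion, plus one safe
    way of picking directions for an overlapping copy.  Both helpers
    and the Point type are hypothetical, and whether a given chip needs
    the corner adjustment at all is hardware specific.

```c
#include <assert.h>

typedef struct { int x, y; } Point;

/* Pick copy directions so that an overlapping blit never reads a
 * source pixel after it has been overwritten.  (x1,y1) is the source
 * corner, (x2,y2) the destination corner. */
static void choose_blit_dirs(int x1, int y1, int x2, int y2,
                             int *xdir, int *ydir)
{
    *xdir = (x2 > x1) ? -1 : 1;   /* moving right: copy right to left */
    *ydir = (y2 > y1) ? -1 : 1;   /* moving down:  copy bottom to top */
}

/* Convert an upper left corner into the first pixel the engine
 * processes for the given directions. */
static Point blit_start(int x, int y, int w, int h, int xdir, int ydir)
{
    Point p;
    assert(w > 0 && h > 0);
    p.x = (xdir < 0) ? x + w - 1 : x;
    p.y = (ydir < 0) ? y + h - 1 : y;
    return p;
}
```

    For a 4 x 3 rectangle at (10,10) copied with xdir = ydir = -1, the
    engine would start at (13,12), the lower right pixel.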



2.2 Solid Fills

   The SetupFor and Subsequent SolidFill(Rect/Trap) functions provide
   an interface for filling rectangular areas of the screen with a
   foreground color.  To accelerate this primitive the driver should
   provide both the SetupForSolidFill and SubsequentSolidFillRect 
   functions and indicate the hardware restrictions via the SolidFillFlags.
   The driver may optionally provide a SubsequentSolidFillTrap if
   it is capable of rendering the primitive correctly.  
   The GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags
   as described in Section 2.0 are valid.

  
void SetupForSolidFill(ScrnInfoPtr pScrn, 
                       int color, int rop, unsigned int planemask)

    SetupForSolidFill indicates that any combination of the following 
    may follow it.

	SubsequentSolidFillRect
	SubsequentSolidFillTrap


 
void SubsequentSolidFillRect(ScrnInfoPtr pScrn, int x, int y, int w, int h)

     Fill a rectangle of dimensions "w" by "h" with origin at (x,y) 
     using the color, rop and planemask given by the last 
     SetupForSolidFill call.
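
     The split between SetupFor and Subsequent can be pictured with a
     software model: state is latched once, then each rectangle passes
     only geometry.  This is a sketch against a mock framebuffer, not
     driver code.  MockEngine and the mock_* functions are hypothetical,
     and only GXcopy (rop value 3) is modelled.

```c
#include <assert.h>
#include <stdint.h>

enum { FB_W = 16, FB_H = 8 };

/* Stand-in for engine registers latched by the SetupFor call. */
typedef struct {
    uint32_t fb[FB_H][FB_W];   /* mock framebuffer */
    uint32_t color;
    uint32_t planemask;
} MockEngine;

/* SetupFor: program state shared by every rectangle that follows. */
static void mock_setup_solid_fill(MockEngine *e, uint32_t color, int rop,
                                  uint32_t planemask)
{
    assert(rop == 3);          /* GXcopy only in this model */
    e->color = color;
    e->planemask = planemask;
}

/* Subsequent: only the per-rectangle geometry is passed. */
static void mock_subsequent_fill_rect(MockEngine *e, int x, int y,
                                      int w, int h)
{
    assert(x >= 0 && y >= 0 && x + w <= FB_W && y + h <= FB_H);
    for (int row = y; row < y + h; row++) {
        for (int col = x; col < x + w; col++) {
            /* planes outside the mask keep their old contents */
            e->fb[row][col] = (e->color & e->planemask) |
                              (e->fb[row][col] & ~e->planemask);
        }
    }
}
```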

void SubsequentSolidFillTrap(ScrnInfoPtr pScrn, int y, int h, 
	int left, int dxL, int dyL, int eL,
	int right, int dxR, int dyR, int eR)

     These parameters describe a trapezoid via a version of
     Bresenham's parameters. "y" is the top line. "h" is the
     number of spans to be filled in the positive Y direction.
     "left" and "right" indicate the starting X values of the
     left and right edges.  dy/dx describes the edge slope.
     These are not the deltas between the beginning and ending
     points on an edge.  They merely describe the slope. "e" is
     the initial error term.  It's the relationships between dx,
     dy and e that define the edge.
	If your engine does not do bresenham trapezoids or does
     not allow the programmer to specify the error term then
     you are not expected to be able to accelerate them.


2.3  Solid Lines

    XAA provides an interface for drawing thin lines.  In order to
    draw X lines correctly a high degree of accuracy is required.
    This usually limits line acceleration to hardware which has a
    Bresenham line engine, though depending on the algorithm used,
    other line engines may come close if they accept 16 bit line 
    deltas.  XAA has both a Bresenham line interface and a two-point
    line interface for drawing lines of arbitrary orientation.  
    Additionally there is a SubsequentSolidHorVertLine which will
    be used for all horizontal and vertical lines.  Horizontal and
    vertical lines are handled separately since hardware that doesn't
    have a line engine (or has one that is unusable due to precision
    problems) can usually draw these lines by some other method such
    as drawing them as thin rectangles.  Even for hardware that can
    draw arbitrary lines via the Bresenham or two-point interfaces,
    the SubsequentSolidHorVertLine is used for horizontal and vertical
    lines since most hardware is able to render the horizontal lines
    and sometimes the vertical lines faster by other methods (Hint:
    try rendering horizontal lines as flattened rectangles).  If you have 
    not provided a SubsequentSolidHorVertLine but you have provided 
    Bresenham or two-point lines, a SubsequentSolidHorVertLine function 
    will be supplied for you.

    The flags field associated with Solid Lines is SolidLineFlags and 
    the GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags as
    described in Section 2.0 are valid restrictions.  

    Some line engines have line biases hardcoded to comply with
    Microsoft line biasing rules.  A tell-tale sign of this is the
    hardware lines not matching the software lines in the zeroth and
    fourth octants.  The driver can set the flag:
	
	MICROSOFT_ZERO_LINE_BIAS

    in the XAAInfoRec.Flags field to adjust the software lines to
    match the hardware lines.   This is in the generic flags field
    rather than the SolidLineFlags since this flag applies to all
    software zero-width lines on the screen and not just the solid ones.


void SetupForSolidLine(ScrnInfoPtr pScrn, 
                       int color, int rop, unsigned int planemask)

    SetupForSolidLine indicates that any combination of the following 
    may follow it.

	SubsequentSolidBresenhamLine
	SubsequentSolidTwoPointLine
        SubsequentSolidHorVertLine 	


void SubsequentSolidHorVertLine( ScrnInfoPtr pScrn,
        			int x, int y, int len, int dir )

    All vertical and horizontal solid thin lines are rendered with
    this function.  The line starts at coordinate (x,y) and extends
    "len" pixels inclusive in the direction indicated by "dir".
    The direction is either DEGREES_0 or DEGREES_270.  That is, it
    always extends to the right or down.
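
    For hardware without a usable line engine, the hint about drawing
    these lines as thin rectangles can be sketched as a simple mapping.
    The Rect type and hv_line_to_rect() are hypothetical, and the
    MOCK_DEGREES_* values are stand-ins; consult xaa.h for the real
    direction codes.

```c
#include <assert.h>

/* Stand-ins for the direction codes; check xaa.h for the real values. */
enum { MOCK_DEGREES_0 = 0, MOCK_DEGREES_270 = 3 };

typedef struct { int x, y, w, h; } Rect;

/* Map a horizontal or vertical line onto the one pixel thick
 * rectangle that a solid fill engine would paint. */
static Rect hv_line_to_rect(int x, int y, int len, int dir)
{
    Rect r = { x, y, 1, 1 };
    assert(dir == MOCK_DEGREES_0 || dir == MOCK_DEGREES_270);
    if (dir == MOCK_DEGREES_0)
        r.w = len;      /* extends to the right */
    else
        r.h = len;      /* extends downward */
    return r;
}
```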



void SubsequentSolidTwoPointLine(ScrnInfoPtr pScrn,
        	int x1, int y1, int x2, int y2, int flags)

    Draw a line from (x1,y1) to (x2,y2).  If the flags field contains
    the flag OMIT_LAST, the last pixel should not be drawn.  Otherwise,
    the pixel at (x2,y2) should be drawn.

    If you use the TwoPoint line interface there is a good possibility
    that your line engine has hard-coded line biases that do not match
    the default X zero-width lines.  If so, you may need to set the
    MICROSOFT_ZERO_LINE_BIAS flag described above.  Note that since
    any vertex in the 16-bit signed coordinate system is valid, your
    line engine is expected to handle 16-bit values if you have hardware
    line clipping enabled.  If your engine cannot handle 16-bit values,
    you should not use hardware line clipping.


void SubsequentSolidBresenhamLine(ScrnInfoPtr pScrn,
        int x, int y, int major, int minor, int err, int len, int octant)

    "X" and "y" are the starting point of the line.  "Major" and "minor" 
    are the major and minor step constants.  "Err" is the initial error
    term.  "Len" is the number of pixels to be drawn (inclusive). "Octant"
    can be any combination of the following flags OR'd together:

      Y_MAJOR		Y is the major axis (X otherwise)
      X_DECREASING	The line is drawn from right to left
      Y_DECREASING	The line is drawn from bottom to top
	  
    The major, minor and err terms are the "raw" Bresenham parameters
    consistent with a line engine that does:

	e = err;
	while(len--) {
	   DRAW_POINT(x,y);
	   e += minor;
	   if(e >= 0) {
		e -= major;
		TAKE_ONE_STEP_ALONG_MINOR_AXIS;
	   }
	   TAKE_ONE_STEP_ALONG_MAJOR_AXIS;
	}

    IBM 8514 style Bresenham line interfaces require their parameters
    modified in the following way:

	Axial = minor;
	Diagonal = minor - major;
	Error = minor + err;
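
    To check a driver's parameter handling it can help to run the
    engine loop above in software.  The sketch below is a direct
    transcription of that loop plus the 8514 conversion.  The MOCK_*
    octant bits, the Pt and Params8514 types and both function names
    are hypothetical.

```c
#include <assert.h>

/* Hypothetical octant bits; the real ones are defined by the server. */
enum { MOCK_Y_MAJOR = 1, MOCK_X_DECREASING = 2, MOCK_Y_DECREASING = 4 };

typedef struct { int x, y; } Pt;

/* Run the engine loop from the text, storing each pixel in out[].
 * Returns the number of pixels written (at most max_out). */
static int bres_points(int x, int y, int major, int minor, int err,
                       int len, int octant, Pt *out, int max_out)
{
    int sx = (octant & MOCK_X_DECREASING) ? -1 : 1;
    int sy = (octant & MOCK_Y_DECREASING) ? -1 : 1;
    int e = err, n = 0;

    assert(len >= 0);
    while (len-- && n < max_out) {
        out[n].x = x;
        out[n].y = y;
        n++;
        e += minor;
        if (e >= 0) {
            e -= major;
            /* one step along the minor axis */
            if (octant & MOCK_Y_MAJOR) x += sx; else y += sy;
        }
        /* one step along the major axis */
        if (octant & MOCK_Y_MAJOR) y += sy; else x += sx;
    }
    return n;
}

/* IBM 8514 style parameters derived as described in the text. */
typedef struct { int axial, diagonal, error; } Params8514;

static Params8514 to_8514(int major, int minor, int err)
{
    Params8514 p = { minor, minor - major, minor + err };
    return p;
}
```

    With the textbook choice major = 2*dx, minor = 2*dy, err = -dx
    (an assumption for this example, not necessarily what XAA passes
    once line biasing is applied), a 5 pixel line from (0,0) with dx = 4
    and dy = 2 visits (0,0), (1,1), (2,1), (3,2) and (4,2).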

SolidBresenhamLineErrorTermBits

    This field allows the driver to tell XAA how many bits large its
    Bresenham parameter registers are.  Many engines have registers that
    only accept 12 or 13 bit Bresenham parameters, and the parameters
    for clipped lines may overflow these if they are not scaled down.
    If this field is not set, XAA will assume the engine can accommodate
    16 bit parameters, otherwise, it will scale the parameters to the
    size specified.
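
    One plausible way to scale the parameters is to halve all three
    together until each fits, which keeps their ratios roughly intact.
    This is only an illustration of the idea and not a copy of XAA's
    scaling code; scale_bres_terms() is a hypothetical name, and a
    signed register of "bits" bits is assumed.

```c
#include <assert.h>
#include <stdlib.h>

/* Halve all three Bresenham terms together until each magnitude fits
 * in a signed register of "bits" bits.  Returns the number of halving
 * steps that were needed. */
static int scale_bres_terms(int *major, int *minor, int *err, int bits)
{
    int limit, shifts = 0;

    assert(bits > 1 && bits < 31);
    limit = (1 << (bits - 1)) - 1;
    while (abs(*major) > limit || abs(*minor) > limit || abs(*err) > limit) {
        /* division truncates toward zero, so the loop always ends */
        *major /= 2;
        *minor /= 2;
        *err /= 2;
        shifts++;
    }
    return shifts;
}
```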


2.4  Dashed Lines

    The same degree of accuracy required by the solid lines is required
    for drawing dashed lines as well.  The dash pattern itself is a
    buffer of binary data where ones are expanded into the foreground
    color and zeros either correspond to the background color or
    indicate transparency, depending on whether DoubleDash or
    OnOffDash lines are being drawn.  

    The flags field associated with dashed Lines is DashedLineFlags and 
    the GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags as
    described in Section 2.0 are valid restrictions.  Additionally, the
    following flags are valid:

      NO_TRANSPARENCY

	This indicates that the driver cannot support dashed lines
	with transparent backgrounds (OnOffDashes).

      TRANSPARENCY_ONLY

	This indicates that the driver cannot support dashes with
	both a foreground and background color (DoubleDashes).

      LINE_PATTERN_POWER_OF_2_ONLY

	This indicates that only patterns with a power of 2 length
	can be accelerated.

      LINE_PATTERN_LSBFIRST_MSBJUSTIFIED
      LINE_PATTERN_LSBFIRST_LSBJUSTIFIED
      LINE_PATTERN_MSBFIRST_MSBJUSTIFIED
      LINE_PATTERN_MSBFIRST_LSBJUSTIFIED

	These describe how the line pattern should be packed.
	The pattern buffer is DWORD padded.  LSBFIRST indicates
	that the pattern runs from the LSB end to the MSB end.
	MSBFIRST indicates that the pattern runs from the MSB end
	to the LSB end.  When the pattern does not completely fill
	the DWORD padded buffer, the pattern will be justified 
	towards the MSB or LSB end based on the flags above.
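
    The two packings in which bit order and justification agree can be
    sketched as follows for a pattern of at most 32 bits.  This is one
    reading of the flag descriptions above, not code from XAA, and
    pack_dash_pattern() is a hypothetical helper.  The mixed variants
    (LSBFIRST with MSB justification and the reverse) are not covered.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the first "length" (<= 32) pattern bits, given one per array
 * element, into a single DWORD.
 *   msb_first == 0: pattern bit i lands in DWORD bit i
 *                   (LSBFIRST, justified toward the LSB end).
 *   msb_first != 0: pattern bit i lands in DWORD bit 31 - i
 *                   (MSBFIRST, justified toward the MSB end). */
static uint32_t pack_dash_pattern(const unsigned char *bits, int length,
                                  int msb_first)
{
    uint32_t dword = 0;

    assert(length > 0 && length <= 32);
    for (int i = 0; i < length; i++) {
        if (bits[i])
            dword |= msb_first ? (UINT32_C(1) << (31 - i))
                               : (UINT32_C(1) << i);
    }
    return dword;
}
```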


    The following field indicates the maximum length dash pattern that
    should be accelerated.

	int DashPatternMaxLength


void SetupForDashedLine(ScrnInfoPtr pScrn,
		int fg, int bg, int rop, unsigned int planemask,
        	int length, unsigned char *pattern)

    
    SetupForDashedLine indicates that any combination of the following 
    may follow it.

	SubsequentDashedBresenhamLine
	SubsequentDashedTwoPointLine

    If "bg" is -1, then the background (pixels corresponding to clear
    bits in the pattern) should remain unmodified. "Bg" indicates the
    background color otherwise.  "Length" indicates the length of
    the pattern in bits and "pattern" points to the DWORD padded buffer
    holding the pattern which has been packed according to the flags
    set above.  

    
void SubsequentDashedTwoPointLine( ScrnInfoPtr pScrn,
        int x1, int y1, int x2, int y2, int flags, int phase)

void SubsequentDashedBresenhamLine(ScrnInfoPtr pScrn,
        int x1, int y1, int major, int minor, int err, int len, int octant,
        int phase)
  
    These are the same as the SubsequentSolidTwoPointLine and
    SubsequentSolidBresenhamLine functions except for the addition
    of the "phase" field which indicates the offset into the dash 
    pattern that the pixel at (x1,y1) corresponds to.
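
    The role of "phase" can be made concrete with a small lookup: the
    n-th pixel of the line uses pattern bit (phase + n) modulo the
    pattern length.  The helper below is a hypothetical sketch that
    assumes the whole pattern fits in one DWORD, packed LSB first and
    LSB justified.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if the n-th pixel of the line (n = 0 at (x1,y1)) falls on a
 * set pattern bit.  "pattern" is assumed to be a single DWORD packed
 * LSB first and LSB justified, with 0 < length <= 32. */
static int dash_bit_for_pixel(uint32_t pattern, int length, int phase, int n)
{
    assert(length > 0 && length <= 32 && phase >= 0 && n >= 0);
    int bit = (phase + n) % length;   /* wrap around the pattern */
    return (int)((pattern >> bit) & 1u);
}
```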

    As with the SubsequentSolidBresenhamLine, there is an
 
	int DashedBresenhamLineErrorTermBits 
   
    field which indicates the size of the error term registers
    used with dashed lines.  This is usually the same value as
    the field for the solid lines (because it's usually the same
    register).
       
      

2.5   Color Expansion Fills

    When filling a color expansion rectangle, the accelerator
    paints each pixel depending on whether or not a bit in a
    corresponding bitmap is set or clear. Opaque expansions are 
    when a set bit corresponds to the foreground color and a clear 
    bit corresponds to the background color.  A transparent expansion
    is when a set bit corresponds to the foreground color and a
    clear bit indicates that the pixel should remain unmodified.
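
    A software reference for one scanline makes the two modes easy to
    compare.  This is a sketch, not driver code: expand_scanline() is a
    hypothetical name, the least significant bit of each byte is assumed
    to be the leftmost pixel, and a bg of -1 is used to request a
    transparent expansion.

```c
#include <assert.h>
#include <stdint.h>

/* Expand "w" bits of a 1bpp scanline into dst[].  A set bit paints fg.
 * A clear bit paints bg, or leaves dst untouched when bg is -1.
 * Bit 0 of each byte is assumed to be the leftmost pixel. */
static void expand_scanline(const uint8_t *src, int w,
                            uint32_t *dst, int fg, int bg)
{
    assert(w >= 0);
    for (int i = 0; i < w; i++) {
        int set = (src[i >> 3] >> (i & 7)) & 1;
        if (set)
            dst[i] = (uint32_t)fg;
        else if (bg != -1)
            dst[i] = (uint32_t)bg;   /* opaque: paint the background */
        /* transparent: clear bits leave dst[i] as it was */
    }
}
```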
   
    The graphics accelerator usually has access to the source 
    bitmap in one of two ways: 1) the bitmap data is sent serially
    to the accelerator by the CPU through some memory mapped aperture
    or 2) the accelerator reads the source bitmap out of offscreen
    video memory.  Some types of primitives are better suited towards 
    one method or the other.  Type 2 is useful for reusable patterns
    such as stipples which can be cached in offscreen memory.  The
    aperture method can be used for stippling but the CPU must pass
    the data across the bus each time a stippled fill is to be performed.  
    For expanding 1bpp client pixmaps or text strings to the screen,
    the aperture method is usually superior because the intermediate
    copy in offscreen memory needed by the second method would only be 
    used once.  Unfortunately, many accelerators can only do one of these
    methods and not both.  

    XAA provides both ScreenToScreen and CPUToScreen color expansion 
    interfaces for doing color expansion fills.  The ScreenToScreen
    functions can only be used with hardware that supports reading
    of source bitmaps from offscreen video memory, and these are only
    used for cacheable patterns such as stipples.  There are two
    variants of the CPUToScreen routines - a direct method intended
    for hardware that has a transfer aperture, and an indirect method
    intended for hardware without transfer apertures or hardware
    with unusual transfer requirements.  Hardware that can only expand
    bitmaps from video memory should supply ScreenToScreen routines
    but also ScanlineCPUToScreen (indirect) routines to optimize transfers 
    of non-cacheable data.  Hardware that can only accept source bitmaps
    through an aperture should supply CPUToScreen (or ScanlineCPUToScreen) 
    routines. Hardware that can do both should provide both ScreenToScreen 
    and CPUToScreen routines.

    For both ScreenToScreen and CPUToScreen interfaces, the GXCOPY_ONLY,
    ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags described in
    Section 2.0 are valid as well as the following:

    /* bit order requirements (one of these must be set) */
   
    BIT_ORDER_IN_BYTE_LSBFIRST

      This indicates that the least significant bit in each byte of the
      source data corresponds to the leftmost of that block of 8 pixels.
      This is the preferred format.

    BIT_ORDER_IN_BYTE_MSBFIRST    

      This indicates that the most significant bit in each byte of the
      source data corresponds to the leftmost of that block of 8 pixels.

    /* transparency restrictions */

    NO_TRANSPARENCY

      This indicates that the accelerator cannot do a transparent expansion.

    TRANSPARENCY_ONLY

      This indicates that the accelerator cannot do an opaque expansion.
      In cases where the background needs to be filled, XAA will
      render the primitive in two passes when using the CPUToScreen
      interface, but will not do so with the ScreenToScreen interface 
      since that would require caching of two patterns.  Some 
      ScreenToScreen hardware may be able to render two passes at the
      driver level and remove the TRANSPARENCY_ONLY restriction if
      it can render pixels corresponding to the zero bits.
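
    The difference between the two bit orders listed above amounts to
    reversing the bits within each byte.  The helpers below are a
    hypothetical sketch of that relationship and are not taken from
    XAA.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order within one byte, converting between the
 * LSBFIRST and MSBFIRST layouts. */
static uint8_t reverse_bits8(uint8_t b)
{
    b = (uint8_t)(((b & 0xF0u) >> 4) | ((b & 0x0Fu) << 4));
    b = (uint8_t)(((b & 0xCCu) >> 2) | ((b & 0x33u) << 2));
    b = (uint8_t)(((b & 0xAAu) >> 1) | ((b & 0x55u) << 1));
    return b;
}

/* Value (0 or 1) of the pixel "i" places from the left (0..7). */
static int pixel_bit(uint8_t byte, int i, int msb_first)
{
    assert(i >= 0 && i < 8);
    return msb_first ? (byte >> (7 - i)) & 1 : (byte >> i) & 1;
}
```

    Reading a byte as MSBFIRST yields the same pixels as reading its
    bit-reversed form as LSBFIRST.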



2.5.1  Screen To Screen Color Expansion

    The ScreenToScreenColorExpandFill routines provide an interface
    for doing expansion blits from source patterns stored in offscreen
    video memory.

    void SetupForScreenToScreenColorExpandFill (ScrnInfoPtr pScrn,
        			int fg, int bg, 
				int rop, unsigned int planemask)


    Ones in the source bitmap will correspond to the fg color.
    Zeros in the source bitmap will correspond to the bg color
    unless bg = -1.  In that case the pixels corresponding to the
    zeros in the bitmap shall be left unmodified by the accelerator.

    For hardware that doesn't allow an easy implementation of skipleft, the
    driver can replace CacheMonoStipple function with one that stores multiple
    rotated copies of the stipple and select between them. In this case the
    driver should set CacheColorExpandDensity to tell XAA how many copies of
    the pattern are stored in the width of a cache slot. For instance if the
    hardware can specify the starting address in bytes, then 8 rotated copies
    of the stipple are needed and CacheColorExpandDensity should be set to 8.

    void SubsequentScreenToScreenColorExpandFill( ScrnInfoPtr pScrn,
				int x, int y, int w, int h,
				int srcx, int srcy, int offset )

   
    Fill a rectangle "w" x "h" at location (x,y).  The source pitch
    between scanlines is the framebuffer pitch (pScrn->displayWidth
    pixels) and srcx and srcy indicate the start of the source pattern 
    in units of framebuffer pixels. "Offset" indicates the bit offset
    into the pattern that corresponds to the pixel being painted at
    "x" on the screen.  Some hardware accepts source coordinates in
    units of bits which makes implementation of the offset trivial.
    In that case, the bit address of the source bit corresponding to
    the pixel painted at (x,y) would be:
	
     (srcy * pScrn->displayWidth + srcx) * pScrn->bitsPerPixel + offset

    It should be noted that the offset assumes LSBFIRST hardware.  
    For MSBFIRST hardware, the driver may need to implement the 
    offset by blitting only from byte boundaries and hardware clipping.
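
    For hardware that takes the source address in bits, the formula
    above translates directly into code.  src_bit_address() is a
    hypothetical helper; display_width and bits_per_pixel stand for
    pScrn->displayWidth and pScrn->bitsPerPixel.

```c
#include <assert.h>

/* Bit address of the source bit that corresponds to the pixel painted
 * at (x,y), for hardware that addresses the source in bits. */
static unsigned long src_bit_address(int srcx, int srcy, int offset,
                                     int display_width, int bits_per_pixel)
{
    assert(srcx >= 0 && srcy >= 0 && offset >= 0);
    return ((unsigned long)srcy * (unsigned long)display_width
            + (unsigned long)srcx) * (unsigned long)bits_per_pixel
           + (unsigned long)offset;
}
```

    For example, srcx = 16, srcy = 2, offset = 3 on a 1024 pixel wide,
    8bpp framebuffer gives (2 * 1024 + 16) * 8 + 3 = 16515.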



2.5.2  CPU To Screen Color Expansion


    The CPUToScreenColorExpandFill routines provide an interface for 
    doing expansion blits from source patterns stored in system memory.
    There are two varieties of this primitive, a CPUToScreenColorExpandFill
    and a ScanlineCPUToScreenColorExpandFill.  With the 
    CPUToScreenColorExpandFill method, the source data is sent serially
    through a memory mapped aperture.  With the Scanline version, the
    data is rendered a scanline at a time into intermediate buffers with
    a call to SubsequentColorExpandScanline following each scanline.

    These two methods have separate flags fields, the
    CPUToScreenColorExpandFillFlags and ScanlineCPUToScreenColorExpandFillFlags
    respectively.  Flags specific to one method or the other are described 
    in sections 2.5.2.1 and 2.5.2.2 but for both cases the bit order and
    transparency restrictions listed at the beginning of section 2.5 are 
    valid as well as the following:
    
    /* clipping  (optional) */
    
    LEFT_EDGE_CLIPPING
 
      This indicates that the accelerator supports omission of up to
      31 pixels on the left edge of the rectangle to be filled.  This
      is beneficial since it allows transfer of the source bitmap to
      always occur from DWORD boundaries. 

    LEFT_EDGE_CLIPPING_NEGATIVE_X

      This flag indicates that the accelerator can render color expansion
      rectangles even if the value of x origin is negative (off of
      the screen on the left edge).

    /* misc */

    TRIPLE_BITS_24BPP

      When enabled (must be in 24bpp mode), color expansion functions
      are expected to require three times the amount of bits to be
      transferred so that 24bpp grayscale colors can be used with color
      expansion in 8bpp coprocessor mode. Each bit is expanded to 3
      bits when writing the monochrome data.
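
      The bit tripling can be illustrated as follows.  This is only a
      sketch of what "each bit is expanded to 3 bits" means for one
      source byte.  The function name is hypothetical, the bit order is
      assumed to be LSBFIRST, and it is not XAA's internal routine.

```c
#include <stdint.h>

/* Expand each of the 8 bits in src into 3 identical adjacent bits,
 * giving 24 bits of output.  Assumes LSBFIRST ordering: source bit i
 * becomes output bits 3i, 3i+1 and 3i+2. */
static uint32_t
triple_bits_lsbfirst(uint8_t src)
{
    uint32_t out = 0;
    int i;

    for (i = 0; i < 8; i++) {
        if (src & (1u << i))
            out |= 7u << (3 * i);
    }
    return out;
}
```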


2.5.2.1  The Direct Method


    Using the direct method of color expansion, XAA will send all
    bitmap data to the accelerator serially through a memory mapped
    transfer window defined by the following two fields:

      unsigned char *ColorExpandBase

        This indicates the memory address of the beginning of the aperture.

      int ColorExpandRange

        This indicates the size in bytes of the aperture.

    The driver should specify how the transferred data should be padded.
    There are options for both the padding of each Y scanline and for the
    total transfer to the aperture.
    One of the following two flags must be set:

      CPU_TRANSFER_PAD_DWORD

        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be DWORD padded.  This is the default behavior.

      CPU_TRANSFER_PAD_QWORD

        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be QWORD padded.  With this set, XAA will send
        an extra DWORD to the aperture when needed to ensure that only
        an even number of DWORDs are sent.

    And then there are the flags for padding of each scanline:

      SCANLINE_PAD_DWORD

	This indicates that each Y scanline should be DWORD padded.
        This is the only option available and is the default.
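
    The effect of the two levels of padding on the amount of data sent
    can be sketched as below.  The function name and the pad_qword
    argument are hypothetical, and "w" is taken to be the width in
    source bits actually transferred per scanline (including any
    skipped left edge bits), which is an assumption of this example.

```c
/* Number of DWORDs XAA would send for a w x h monochrome transfer.
 * Each scanline is DWORD padded (SCANLINE_PAD_DWORD).  If pad_qword
 * is non-zero (CPU_TRANSFER_PAD_QWORD) one extra DWORD is added when
 * the total would otherwise be odd. */
static int
cpu_transfer_dwords(int w, int h, int pad_qword)
{
    int per_line = (w + 31) / 32;
    int total = per_line * h;

    if (pad_qword && (total & 1))
        total++;
    return total;
}
```

    A 10 x 3 bitmap therefore takes 3 DWORDs with CPU_TRANSFER_PAD_DWORD
    and 4 DWORDs with CPU_TRANSFER_PAD_QWORD.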

    Finally, there is the CPU_TRANSFER_BASE_FIXED flag which indicates
    that the aperture is a single register rather than a range of
    registers, and XAA should write all of the data to the first DWORD.
    If the ColorExpandRange is not large enough to accommodate scanlines
    the width of the screen, this option will be forced.  To avoid that,
    the ColorExpandRange must be:

        ((virtualX + 31)/32) * 4   bytes or more.

        ((virtualX + 62)/32) * 4   bytes or more if
        LEFT_EDGE_CLIPPING_NEGATIVE_X is set.
  
    If the TRIPLE_BITS_24BPP flag is set, the required area should be 
    multiplied by three.
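
    The size requirement above can be written out as a small helper.
    This is a sketch only: the function name and its boolean arguments
    are hypothetical and simply stand in for the flags named in the
    text.

```c
/* Minimum aperture size in bytes needed to hold one full-width
 * monochrome scanline.
 *   negative_x   non-zero if LEFT_EDGE_CLIPPING_NEGATIVE_X is set
 *   triple_bits  non-zero if TRIPLE_BITS_24BPP is set */
static int
min_color_expand_bytes(int virtualX, int negative_x, int triple_bits)
{
    int bytes = ((virtualX + (negative_x ? 62 : 31)) / 32) * 4;

    return triple_bits ? bytes * 3 : bytes;
}
```

    For a virtual width of 1024 this gives 128 bytes, 132 bytes with
    LEFT_EDGE_CLIPPING_NEGATIVE_X, and 384 bytes with TRIPLE_BITS_24BPP.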
     
    
void SetupForCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
        		int fg, int bg,
			int rop,
			unsigned int planemask)

  
 
     Ones in the source bitmap will correspond to the fg color.
     Zeros in the source bitmap will correspond to the bg color
     unless bg = -1.  In that case the pixels corresponding to the
     zeros in the bitmap shall be left unmodified by the accelerator.
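
     The fg/bg rule can be modelled in software as follows.  This is a
     reference sketch of the expected result for one scanline, not
     driver code: the function name is hypothetical, and it assumes
     LSBFIRST bit order, a rop of GXcopy and a planemask of all ones.

```c
#include <stdint.h>

/* Expand w monochrome source bits into w destination pixels.
 * Ones become fg.  Zeros become bg, unless bg is -1, in which case
 * the destination pixel is left unmodified (transparent expansion). */
static void
expand_scanline_lsbfirst(const uint8_t *bits, int w,
                         int fg, int bg, uint32_t *dst)
{
    int i;

    for (i = 0; i < w; i++) {
        int set = (bits[i >> 3] >> (i & 7)) & 1;

        if (set)
            dst[i] = (uint32_t)fg;
        else if (bg != -1)
            dst[i] = (uint32_t)bg;
    }
}
```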


void SubsequentCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
			int x, int y, int w, int h,
			int skipleft )

     When this function is called, the accelerator should be set up
     to fill a rectangle of dimension "w" by "h" with origin at (x,y)
     in the fill style prescribed by the last call to 
     SetupForCPUToScreenColorExpandFill.  XAA will pass the data to 
     the aperture immediately after this function is called.  If the 
     skipleft is non-zero (and LEFT_EDGE_CLIPPING has been enabled), then 
     the accelerator _should_not_ render skipleft pixels on the leftmost
     edge of the rectangle.  Some engines have an alignment feature
     like this built in, some others can do this using a clipping
     window.

     It can be arranged for XAA to call Sync() after it is through 
     calling the Subsequent function by setting SYNC_AFTER_COLOR_EXPAND 
     in the  CPUToScreenColorExpandFillFlags.  This can provide the driver 
     with an opportunity to reset a clipping window if needed.

    
2.5.2.2  The Indirect Method

     Using the indirect method, XAA will render the bitmap data scanline
     at a time to one or more buffers.  These buffers may be memory
     mapped apertures or just intermediate storage.

     int NumScanlineColorExpandBuffers

       This indicates the number of buffers available.

     unsigned char **ScanlineColorExpandBuffers

       This is an array of pointers to the memory locations of each buffer.
       Each buffer is expected to be large enough to accommodate scanlines
       the width of the screen.  That is:

        ((virtualX + 31)/32) * 4   bytes or more.

        ((virtualX + 62)/32) * 4   bytes or more if
        LEFT_EDGE_CLIPPING_NEGATIVE_X is set.
  
     Scanlines are always DWORD padded.
     If the TRIPLE_BITS_24BPP flag is set, the required area should be 
     multiplied by three.


void SetupForScanlineCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
        		int fg, int bg,
			int rop,
			unsigned int planemask)
 
     Ones in the source bitmap will correspond to the fg color.
     Zeros in the source bitmap will correspond to the bg color
     unless bg = -1.  In that case the pixels corresponding to the
     zeros in the bitmap shall be left unmodified by the accelerator.

     
void SubsequentScanlineCPUToScreenColorExpandFill(ScrnInfoPtr pScrn,
			int x, int y, int w, int h,
			int skipleft )

void SubsequentColorExpandScanline(ScrnInfoPtr pScrn, int bufno)


    When SubsequentScanlineCPUToScreenColorExpandFill is called, XAA 
    will begin transferring the source data scanline at a time, calling  
    SubsequentColorExpandScanline after each scanline.  If more than
    one buffer is available, XAA will cycle through the buffers.
    Subsequent scanlines will use the next buffer and go back to the
    buffer 0 again when the last buffer is reached.  The index into
    the ScanlineColorExpandBuffers array is presented as "bufno"
    with each SubsequentColorExpandScanline call.
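
    The buffer cycling described above amounts to the following.  The
    helper name is hypothetical and only illustrates the order in which
    "bufno" values would be presented; num_buffers stands in for
    NumScanlineColorExpandBuffers.

```c
/* Index of the buffer used for the scanline after the one that used
 * bufno.  Wraps back to buffer 0 after the last buffer. */
static int
next_scanline_buffer(int bufno, int num_buffers)
{
    return (bufno + 1 < num_buffers) ? bufno + 1 : 0;
}
```

    With three buffers the sequence of bufno values is 0, 1, 2, 0, 1, ...
    and with a single buffer it is always 0.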

    The skipleft field is the same as for the direct method.

    The indirect method can be used to send the source data directly 
    to a memory mapped aperture represented by a single color expand
    buffer, scanline at a time, but more commonly it is used to place 
    the data into offscreen video memory so that the accelerator can 
    blit it to the visible screen from there.  In the case where the
    accelerator permits rendering into offscreen video memory while
    the accelerator is active, several buffers can be used so that
    XAA can be placing source data into the next buffer while the
    accelerator is blitting the current buffer.  For cases where
    the accelerator requires some special manipulation of the source
    data first, the buffers can be in system memory.  The CPU can
    manipulate these buffers and then send the data to the accelerator.



2.6   8x8 Mono Pattern Fills

    XAA provides support for two types of 8x8 hardware patterns -
    "Mono" patterns and "Color" patterns.  Mono pattern data is
    64 bits of color expansion data with ones indicating the
    foreground color and zeros indicating the background color.
    The source bitmaps for the 8x8 mono patterns can be presented
    to the graphics accelerator in one of two ways.  They can be
    passed as two DWORDS to the 8x8 mono pattern functions or
    they can be cached in offscreen memory and their locations
    passed to the 8x8 mono pattern functions.  In addition to the
    GXCOPY_ONLY, ROP_NEEDS_SOURCE, NO_PLANEMASK and RGB_EQUAL flags
    defined in Section 2.0, the following are defined for the
    Mono8x8PatternFillFlags:

    HARDWARE_PATTERN_PROGRAMMED_BITS

      This indicates that the 8x8 patterns should be packed into two
      DWORDS and passed to the 8x8 mono pattern functions.  The default
      behavior is to cache the patterns in offscreen video memory and
      pass the locations of these patterns to the functions instead.
      The pixmap cache must be enabled for the default behavior (8x8 
      pattern caching) to work.  See Section 3 for how to enable the
      pixmap cache. The pixmap cache is not necessary for 
      HARDWARE_PATTERN_PROGRAMMED_BITS.

    HARDWARE_PATTERN_PROGRAMMED_ORIGIN

      If the hardware supports programmable pattern offsets then
      this option should be set. See the table below for further
      information.

    HARDWARE_PATTERN_SCREEN_ORIGIN

      Some hardware wants the pattern offset specified with respect to the
      upper left-hand corner of the primitive being drawn.  Other hardware 
      needs the option HARDWARE_PATTERN_SCREEN_ORIGIN set to indicate that 
      all pattern offsets should be referenced to the upper left-hand 
      corner of the screen.  HARDWARE_PATTERN_SCREEN_ORIGIN is preferable 
      since this is more natural for the X-Window system and offsets will 
      have to be recalculated for each Subsequent function otherwise.

    BIT_ORDER_IN_BYTE_MSBFIRST
    BIT_ORDER_IN_BYTE_LSBFIRST

      As with other color expansion routines this indicates whether the
      most or the least significant bit in each byte from the pattern is 
      the leftmost on the screen.

    TRANSPARENCY_ONLY
    NO_TRANSPARENCY

      This means the same thing as for the color expansion rect routines
      except that for TRANSPARENCY_ONLY XAA will not render the primitive
      in two passes since this is more easily handled by the driver.
      It is recommended that TRANSPARENCY_ONLY hardware handle rendering
      of opaque patterns in two passes (the background can be filled as
      a rectangle in GXcopy) in the Subsequent function so that the
      TRANSPARENCY_ONLY restriction can be removed. 



    Additional information about cached patterns...
    For the case where HARDWARE_PATTERN_PROGRAMMED_BITS is not set and 
    the pattern must be cached in offscreen memory, the first pattern
    starts at the cache slot boundary which is set by the 
    CachePixelGranularity field used to configure the pixmap cache.
    One should ensure that the CachePixelGranularity reflects any 
    alignment restrictions that the accelerator may put on 8x8 pattern 
    storage locations.  When HARDWARE_PATTERN_PROGRAMMED_ORIGIN is set 
    there is only one pattern stored.  When this flag is not set,
    all 64 pre-rotated copies of the pattern are cached in offscreen memory.
    The MonoPatternPitch field can be used to specify the X position pixel
    granularity that each of these patterns must align on.  If the
    MonoPatternPitch is not supplied, the patterns will be densely packed
    within the cache slot.  The behavior of the default XAA 8x8 pattern
    caching mechanism is to store all 8x8 patterns linearly in video memory.
    If the accelerator needs the patterns stored in a more unusual fashion,
    the driver will need to provide its own 8x8 mono pattern caching 
    routines for XAA to use. 

    The following table describes the meanings of the "patx" and "paty"
    fields in both the SetupFor and Subsequent functions.

    With HARDWARE_PATTERN_SCREEN_ORIGIN
    -----------------------------------

    HARDWARE_PATTERN_PROGRAMMED_BITS and HARDWARE_PATTERN_PROGRAMMED_ORIGIN

	SetupFor: patx and paty are the first and second DWORDS of the
		  8x8 mono pattern.

	Subsequent: patx and paty are the x,y offset into that pattern.
		    All Subsequent calls will have the same offset in 
		    the case of HARDWARE_PATTERN_SCREEN_ORIGIN so only
		    the offset specified by the first Subsequent call 
		    after a SetupFor call will need to be observed.

    HARDWARE_PATTERN_PROGRAMMED_BITS only

	SetupFor: patx and paty hold the first and second DWORDS of
		  the 8x8 mono pattern pre-rotated to match the desired
		  offset.

	Subsequent: These just hold the same patterns and can be ignored.

    HARDWARE_PATTERN_PROGRAMMED_ORIGIN only

	SetupFor: patx and paty hold the x,y coordinates of the offscreen
		  memory location where the 8x8 pattern is stored.  The
		  bits are stored linearly in memory at that location.

	Subsequent: patx and paty hold the offset into the pattern.
		    All Subsequent calls will have the same offset in 
		    the case of HARDWARE_PATTERN_SCREEN_ORIGIN so only
		    the offset specified by the first Subsequent call 
		    after a SetupFor call will need to be observed.

    Neither programmed bits nor origin

	SetupFor: patx and paty hold the x,y coordinates of the offscreen 	
		  memory location where the pre-rotated 8x8 pattern is
		  stored.

	Subsequent: patx and paty are the same as in the SetupFor function
		    and can be ignored.
		  

    Without HARDWARE_PATTERN_SCREEN_ORIGIN
    -------------------------------------- 

    HARDWARE_PATTERN_PROGRAMMED_BITS and HARDWARE_PATTERN_PROGRAMMED_ORIGIN

	SetupFor: patx and paty are the first and second DWORDS of the
		  8x8 mono pattern.

	Subsequent: patx and paty are the x,y offset into that pattern.

    HARDWARE_PATTERN_PROGRAMMED_BITS only

	SetupFor: patx and paty hold the first and second DWORDS of
		  the unrotated 8x8 mono pattern.  This can be ignored. 

	Subsequent: patx and paty hold the rotated 8x8 pattern to be 
		    rendered.

    HARDWARE_PATTERN_PROGRAMMED_ORIGIN only

	SetupFor: patx and paty hold the x,y coordinates of the offscreen
		  memory location where the 8x8 pattern is stored.  The
		  bits are stored linearly in memory at that location.

	Subsequent: patx and paty hold the offset into the pattern.

    Neither programmed bits nor origin

	SetupFor: patx and paty hold the x,y coordinates of the offscreen 	
		  memory location where the unrotated 8x8 pattern is
		  stored.  This can be ignored.

	Subsequent: patx and paty hold the x,y coordinates of the
		    rotated 8x8 pattern to be rendered.
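
    To make "pre-rotated" concrete, the sketch below rotates an 8x8 mono
    pattern held in two DWORDs by an (xoff, yoff) offset.  It is an
    illustration under stated assumptions, not the routine XAA itself
    uses: the function name is hypothetical, row 0 is assumed to be the
    low byte of the first DWORD, and the bit order is assumed to be
    LSBFIRST (bit 0 of each byte is the leftmost pixel).

```c
#include <stdint.h>

/* Rotate an 8x8 mono pattern so that the pixel at pattern position
 * (xoff, yoff) ends up at position (0, 0).
 * Assumptions: pat0 holds rows 0-3 and pat1 rows 4-7, one byte per
 * row with row 0 in the low byte; LSBFIRST bit order within a byte. */
static void
rotate_mono8x8(uint32_t pat0, uint32_t pat1, int xoff, int yoff,
               uint32_t *out0, uint32_t *out1)
{
    uint8_t rows[8], rot[8];
    int r;

    /* Unpack the two DWORDs into eight row bytes. */
    for (r = 0; r < 4; r++) {
        rows[r]     = (uint8_t)(pat0 >> (8 * r));
        rows[r + 4] = (uint8_t)(pat1 >> (8 * r));
    }

    xoff &= 7;
    yoff &= 7;

    /* Rotate rows vertically by yoff and bits horizontally by xoff. */
    for (r = 0; r < 8; r++) {
        uint8_t src = rows[(r + yoff) & 7];

        rot[r] = (uint8_t)((src >> xoff) | (src << (8 - xoff)));
    }

    /* Repack the rotated rows into two DWORDs. */
    *out0 = 0;
    *out1 = 0;
    for (r = 0; r < 4; r++) {
        *out0 |= (uint32_t)rot[r] << (8 * r);
        *out1 |= (uint32_t)rot[r + 4] << (8 * r);
    }
}
```

    Hardware with HARDWARE_PATTERN_PROGRAMMED_ORIGIN performs the
    equivalent of this rotation itself, which is why only the unrotated
    pattern and an offset are passed in that case.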



void SetupForMono8x8PatternFill(ScrnInfoPtr pScrn, int patx, int paty,
        int fg, int bg, int rop, unsigned int planemask)

    SetupForMono8x8PatternFill indicates that any combination of the 
    following  may follow it.

	SubsequentMono8x8PatternFillRect
	SubsequentMono8x8PatternFillTrap

    The fg, bg, rop and planemask fields have the same meaning as the
    ones used for the other color expansion routines.  Patx's and paty's
    meaning can be determined from the table above.

 
void SubsequentMono8x8PatternFillRect( ScrnInfoPtr pScrn,
        	int patx, int paty, int x, int y, int w, int h)

     Fill a rectangle of dimensions "w" by "h" with origin at (x,y) 
     using the parameters given by the last SetupForMono8x8PatternFill
     call.  The meanings of patx and paty can be determined by the
     table above.

void SubsequentMono8x8PatternFillTrap( ScrnInfoPtr pScrn,
     			   int patx, int paty, int y, int h, 
     			   int left, int dxL, int dyL, int eL,
     			   int right, int dxR, int dyR, int eR )

     The meanings of patx and paty can be determined by the table above.
     The rest of the fields have the same meanings as those in the 
     SubsequentSolidFillTrap function. 



2.7   8x8 Color Pattern Fills
  
    8x8 color pattern data is 64 pixels of full color data that
    is stored linearly in offscreen video memory.  8x8 color patterns 
    are useful as a substitute for 8x8 mono patterns when tiling,
    doing opaque stipples, or in the case where transparency is
    supported, regular stipples.  8x8 color pattern fills also have
    the additional benefit of being able to tile full color 8x8
    patterns instead of just 2 color ones like the mono patterns.
    However, full color 8x8 patterns aren't used very often in the
    X Window system so you might consider passing this primitive
    by if you already can do mono patterns, especially if they 
    require a lot of cache area.  Color8x8PatternFillFlags is
    the flags field for this primitive and the GXCOPY_ONLY,
    ROP_NEEDS_SOURCE and NO_PLANEMASK flags as described in
    Section 2.0 are valid as well as the following:


    HARDWARE_PATTERN_PROGRAMMED_ORIGIN

      If the hardware supports programmable pattern offsets then
      this option should be set.  

    HARDWARE_PATTERN_SCREEN_ORIGIN

      Some hardware wants the pattern offset specified with respect to the
      upper left-hand corner of the primitive being drawn.  Other hardware 
      needs the option HARDWARE_PATTERN_SCREEN_ORIGIN set to indicate that 
      all pattern offsets should be referenced to the upper left-hand 
      corner of the screen.  HARDWARE_PATTERN_SCREEN_ORIGIN is preferable 
      since this is more natural for the X-Window system and offsets will 
      have to be recalculated for each Subsequent function otherwise.

    NO_TRANSPARENCY
    TRANSPARENCY_GXCOPY_ONLY

      These mean the same as for the ScreenToScreenCopy functions.


    The following table describes the meanings of patx and paty passed
    to the SetupFor and Subsequent functions:

    HARDWARE_PATTERN_PROGRAMMED_ORIGIN && HARDWARE_PATTERN_SCREEN_ORIGIN
	
	SetupFor: patx and paty hold the x,y location of the unrotated 
		  pattern.

	Subsequent: patx and paty hold the pattern offset.  For the case
		    of HARDWARE_PATTERN_SCREEN_ORIGIN all Subsequent calls
		    have the same offset so only the first call will need
		    to be observed.

    
    HARDWARE_PATTERN_PROGRAMMED_ORIGIN only

	SetupFor: patx and paty hold the x,y location of the unrotated
		  pattern.

	Subsequent: patx and paty hold the pattern offset. 

    HARDWARE_PATTERN_SCREEN_ORIGIN

	SetupFor: patx and paty hold the x,y location of the rotated pattern.

	Subsequent: patx and paty hold the same location as the SetupFor
		    function so these can be ignored.

    neither flag

	SetupFor: patx and paty hold the x,y location of the unrotated
		  pattern.  This can be ignored.

	Subsequent: patx and paty hold the x,y location of the rotated
		    pattern.

    Additional information about cached patterns...
    All 8x8 color patterns are cached in offscreen video memory so
    the pixmap cache must be enabled to use them. The first pattern
    starts at the cache slot boundary which is set by the 
    CachePixelGranularity field used to configure the pixmap cache.
    One should ensure that the CachePixelGranularity reflects any 
    alignment restrictions that the accelerator may put on 8x8 pattern 
    storage locations.  When HARDWARE_PATTERN_PROGRAMMED_ORIGIN is set 
    there is only one pattern stored.  When this flag is not set,
    all 64 rotations of the pattern are accessible but it is assumed
    that the accelerator is capable of accessing data stored on 8
    pixel boundaries.  If the accelerator has stricter alignment 
    requirements than this, the driver will need to provide its own 
    8x8 color pattern caching routines. 


void SetupForColor8x8PatternFill(ScrnInfoPtr pScrn, int patx, int paty,
        	int rop, unsigned int planemask, int trans_color)

    SetupForColor8x8PatternFill indicates that any combination of the 
    following  may follow it.

	SubsequentColor8x8PatternFillRect
	SubsequentColor8x8PatternFillTrap	(not implemented yet)

    For the meanings of patx and paty, see the table above.  Trans_color
    means the same as for the ScreenToScreenCopy functions.


 
void SubsequentColor8x8PatternFillRect( ScrnInfoPtr pScrn,
        	int patx, int paty, int x, int y, int w, int h)

     Fill a rectangle of dimensions "w" by "h" with origin at (x,y) 
     using the parameters given by the last SetupForColor8x8PatternFill
     call.  The meanings of patx and paty can be determined by the
     table above.

void SubsequentColor8x8PatternFillTrap( ScrnInfoPtr pScrn,
     			   int patx, int paty, int y, int h, 
     			   int left, int dxL, int dyL, int eL,
     			   int right, int dxR, int dyR, int eR )

    For the meanings of patx and paty, see the table above. 
    The rest of the fields have the same meanings as those in the 
    SubsequentSolidFillTrap function. 



2.8  Image Writes

    XAA provides a mechanism for transferring full color pixel data from
    system memory to video memory through the accelerator.  This is 
    useful for dealing with alignment issues and performing raster ops
    on the data when writing it to the framebuffer.  As with color
    expansion rectangles, there is a direct and indirect method.  The
    direct method sends all data through a memory mapped aperture.
    The indirect method sends the data to an intermediate buffer one
    scanline at a time.

    The direct and indirect methods have separate flags fields, the
    ImageWriteFlags and ScanlineImageWriteFlags respectively.
    Flags specific to one method or the other are described in sections 
    2.8.1 and 2.8.2 but for both cases the GXCOPY_ONLY, ROP_NEEDS_SOURCE
    and NO_PLANEMASK flags described in Section 2.0 are valid as well as
    the following:

    NO_GXCOPY

      In order to have accelerated image transfers faster than the 
      software versions for GXcopy, the engine needs to support clipping,
      be using the direct method and have a large enough image transfer
      range so that CPU_TRANSFER_BASE_FIXED doesn't need to be set.
      If these are not supported, then it is unlikely that transferring
      the data through the accelerator will be of any advantage for the
      simple case of GXcopy.  In fact, it may be much slower.  For such
      cases it's probably best to set the NO_GXCOPY flag so that 
      Image writes will only be used for the more complicated rops.

    /* transparency restrictions */

    NO_TRANSPARENCY
     
      This indicates that the accelerator does not support skipping
      of color keyed pixels when copying from the source to the destination.

    TRANSPARENCY_GXCOPY_ONLY

      This indicates that the accelerator supports skipping of color keyed
      pixels only when the rop is GXcopy.

    /* clipping  (optional) */
    
    LEFT_EDGE_CLIPPING
 
      This indicates that the accelerator supports omission of up to
      3 pixels on the left edge of the rectangle to be filled.  This
      is beneficial since it allows transfer from the source pixmap to
      always occur from DWORD boundaries. 

    LEFT_EDGE_CLIPPING_NEGATIVE_X

      This flag indicates that the accelerator can fill areas with
      image write data even if the value of x origin is negative (off of
      the screen on the left edge).


2.8.1 The Direct Method

    Using the direct method of ImageWrite, XAA will send all
    bitmap data to the accelerator serially through a memory mapped
    transfer window defined by the following two fields:

      unsigned char *ImageWriteBase

        This indicates the memory address of the beginning of the aperture.

      int ImageWriteRange

        This indicates the size in bytes of the aperture.

    The driver should specify how the transferred data should be padded.
    There are options for both the padding of each Y scanline and for the
    total transfer to the aperture.
    One of the following two flags must be set:

      CPU_TRANSFER_PAD_DWORD

        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be DWORD padded.  This is the default behavior.

      CPU_TRANSFER_PAD_QWORD

        This indicates that the total transfer (sum of all scanlines) sent
        to the aperture must be QWORD padded.  With this set, XAA will send
        an extra DWORD to the aperture when needed to ensure that only
        an even number of DWORDs are sent.

    And then there are the flags for padding of each scanline:

      SCANLINE_PAD_DWORD

	This indicates that each Y scanline should be DWORD padded.
        This is the only option available and is the default.

    Finally, there is the CPU_TRANSFER_BASE_FIXED flag which indicates
    that the aperture is a single register rather than a range of
    registers, and XAA should write all of the data to the first DWORD.
    XAA will automatically select CPU_TRANSFER_BASE_FIXED if the 
    ImageWriteRange is not large enough to accommodate an entire scanline.   


void SetupForImageWrite(ScrnInfoPtr pScrn, int rop, unsigned int planemask,
        			int trans_color, int bpp, int depth)

     If trans_color is not -1 then trans_color indicates the transparency
     color key and pixels with color trans_color passed through the 
     aperture should not be transferred to the screen but should be 
     skipped.  Bpp and depth indicate the bits per pixel and depth of
     the source pixmap.  Trans_color is always -1 if the NO_TRANSPARENCY
     flag is set.


void SubsequentImageWriteRect(ScrnInfoPtr pScrn, 
				int x, int y, int w, int h, int skipleft)

     
     Data passed through the aperture should be copied to a rectangle
     of width "w" and height "h" with origin (x,y).  If LEFT_EDGE_CLIPPING
     has been enabled, skipleft will correspond to the number of pixels
     on the left edge that should not be drawn.  Skipleft is zero 
     otherwise.

     It can be arranged for XAA to call Sync() after it is through 
     calling the Subsequent functions by setting SYNC_AFTER_IMAGE_WRITE 
     in the  ImageWriteFlags.  This can provide the driver with an
     opportunity to reset a clipping window if needed.

2.8.2  The Indirect Method

     Using the indirect method, XAA will render the pixel data scanline
     at a time to one or more buffers.  These buffers may be memory
     mapped apertures or just intermediate storage.

     int NumScanlineImageWriteBuffers

       This indicates the number of buffers available.

     unsigned char **ScanlineImageWriteBuffers

       This is an array of pointers to the memory locations of each buffer.
       Each buffer is expected to be large enough to accommodate scanlines
       the width of the screen.  That is:

         pScrn->virtualX * pScrn->bitsPerPixel/8   bytes or more.

       If LEFT_EDGE_CLIPPING_NEGATIVE_X is set, add an additional 4
       bytes to that requirement in 8 and 16bpp, 12 bytes in 24bpp.
  
     Scanlines are always DWORD padded.
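
     The buffer size rule above can be sketched as a helper.  The
     function name and arguments are hypothetical.  The text gives no
     extra allowance for 32bpp, so the sketch adds nothing in that case,
     which is an assumption of this example.

```c
/* Minimum size in bytes of each scanline image write buffer.
 *   negative_x  non-zero if LEFT_EDGE_CLIPPING_NEGATIVE_X is set */
static int
min_image_write_buffer_bytes(int virtualX, int bitsPerPixel,
                             int negative_x)
{
    int bytes = virtualX * bitsPerPixel / 8;

    if (negative_x) {
        switch (bitsPerPixel) {
        case 8:
        case 16:
            bytes += 4;
            break;
        case 24:
            bytes += 12;
            break;
        default:
            /* Not covered by the text; nothing added (assumption). */
            break;
        }
    }
    return bytes;
}
```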

void SetupForScanlineImageWrite(ScrnInfoPtr pScrn, int rop, 
				unsigned int planemask, int trans_color, 
				int bpp, int depth)

     If trans_color is not -1 then trans_color indicates the transparency
     color key and pixels with color trans_color in the buffer should not 
     be transferred to the screen but should be skipped.  Bpp and depth 
     indicate the bits per pixel and depth of the source bitmap.  
     Trans_color is always -1 if the NO_TRANSPARENCY flag is set.


void SubsequentScanlineImageWriteRect(ScrnInfoPtr pScrn, 
				int x, int y, int w, int h, int skipleft)

     
void SubsequentImageWriteScanline(ScrnInfoPtr pScrn, int bufno)


    When SubsequentScanlineImageWriteRect is called, XAA will begin
    transferring the source data scanline at a time, calling  
    SubsequentImageWriteScanline after each scanline.  If more than
    one buffer is available, XAA will cycle through the buffers.
    Subsequent scanlines will use the next buffer and go back to the
    buffer 0 again when the last buffer is reached.  The index into
    the ScanlineImageWriteBuffers array is presented as "bufno"
    with each SubsequentImageWriteScanline call.

    The skipleft field is the same as for the direct method.

    The indirect method can be used to send the source data directly 
    to a memory mapped aperture represented by a single image write
    buffer, scanline at a time, but more commonly it is used to place 
    the data into offscreen video memory so that the accelerator can 
    blit it to the visible screen from there.  In the case where the
    accelerator permits rendering into offscreen video memory while
    the accelerator is active, several buffers can be used so that
    XAA can be placing source data into the next buffer while the
    accelerator is blitting the current buffer.  For cases where
    the accelerator requires some special manipulation of the source
    data first, the buffers can be in system memory.  The CPU can
    manipulate these buffers and then send the data to the accelerator.


2.9 Clipping

    XAA supports hardware clipping rectangles.  To use clipping
    in this way it is expected that the graphics accelerator can
    clip primitives with vertices anywhere in the 16 bit signed 
    coordinate system. 

void SetClippingRectangle ( ScrnInfoPtr pScrn,
        		int left, int top, int right, int bottom)

void DisableClipping (ScrnInfoPtr pScrn)

    When SetClippingRectangle is called, all hardware rendering
    following it should be clipped to the rectangle specified
    until DisableClipping is called.

    The ClippingFlags field indicates which operations this sort
    of Set/Disable pairing can be used with.  Any of the following
    flags may be OR'd together.

	HARDWARE_CLIP_SCREEN_TO_SCREEN_COLOR_EXPAND
	HARDWARE_CLIP_SCREEN_TO_SCREEN_COPY
	HARDWARE_CLIP_MONO_8x8_FILL
	HARDWARE_CLIP_COLOR_8x8_FILL
	HARDWARE_CLIP_SOLID_FILL
	HARDWARE_CLIP_DASHED_LINE
	HARDWARE_CLIP_SOLID_LINE



3)  XAA PIXMAP CACHE

   /* NOTE:  XAA has no knowledge of framebuffer particulars so until
	the framebuffer is able to render into offscreen memory, usage
	of the pixmap cache requires that the driver provide ImageWrite
	routines or a WritePixmap or WritePixmapToCache replacement so
	that patterns can even be placed in the cache.

      ADDENDUM: XAA can now load the pixmap cache without requiring
	that the driver supply an ImageWrite function, but this can
	only be done on linear framebuffers.  If you have a linear
	framebuffer, set LINEAR_FRAMEBUFFER in the XAAInfoRec.Flags
	field and XAA will then be able to upload pixmaps into the
	cache without the driver providing functions to do so.
   */


   The XAA pixmap cache provides a mechanism for caching of patterns
   in offscreen video memory so that tiled fills and in some cases
   stippling can be done by blitting the source patterns from offscreen
   video memory. The pixmap cache also provides the mechanism for caching 
   of 8x8 color and mono hardware patterns.  Any unused offscreen video
   memory gets used for the pixmap cache and that information is 
   provided by the XFree86 Offscreen Memory Manager. XAA registers a 
   callback with the manager so that it can be informed of any changes 
   in the offscreen memory configuration.  The driver writer does not 
   need to deal with any of this since it is all automatic.  The driver 
   merely needs to initialize the Offscreen Memory Manager as described 
   in the DESIGN document and set the PIXMAP_CACHE flag in the 
   XAAInfoRec.Flags field.  The Offscreen Memory Manager initialization 
   must occur before XAA is initialized or else pixmap cache 
   initialization will fail.  

   PixmapCacheFlags is an XAAInfoRec field which allows the driver to
   control pixmap cache behavior to some extent.  Currently only one
   flag is defined:

   DO_NOT_BLIT_STIPPLES

     This indicates that the stippling should not be done by blitting
     from the pixmap cache.  This does not apply to 8x8 pattern fills. 


   CachePixelGranularity is an optional field.  If the hardware requires
   that 8x8 patterns have some particular pixel alignment it should
   be reflected in this field.  Ignoring this field or setting it to
   zero or one means there are no alignment issues.


4)  OFFSCREEN PIXMAPS

   XAA has the ability to store pixmap drawables in offscreen video 
   memory and render into them with full hardware acceleration.  Placement
   of pixmaps in the cache is done automatically on a first-come basis and 
   only if there is room.  To enable this feature, set the OFFSCREEN_PIXMAPS
   flag in the XAAInfoRec.Flags field.  This is only available when a
   ScreenToScreenCopy function is provided, when the Offscreen memory 
   manager has been initialized and when the LINEAR_FRAMEBUFFER flag is
   also set.

   int maxOffPixWidth
   int maxOffPixHeight

     These two fields allow the driver to limit the maximum dimensions
     of an offscreen pixmap.  If one of these is not set, it is assumed
     that there is no limit on that dimension.  Note that if an offscreen
     pixmap with a particular dimension is allowed, then your driver will be
     expected to render primitives as large as that pixmap.  

$XFree86: xc/programs/Xserver/hw/xfree86/xaa/XAA.HOWTO,v 1.12 2000/04/12 14:44:42 tsi Exp $