
 Almirante Peskanov / Zaborra presents:
 
  A new Copper Chunky Screen 25% faster than normal  

      ,---------------------------.
      | Index                     |
      |     - What it is all about|
      |     - The source          |
      |     - The color 0 problem |
      |     - Possibilities       |
      |     - At last             |
      `---------------------------'
      
      
     -==================-
     What it is all about
     -==================-
     
  Well, this text is for people who have already worked with Copper Chunky
  screens or understand them. You can also read it for a laugh at my
  spelling mistakes and my strange English (pitinglis). I have written this
  text because I think what I'm telling may be interesting to somebody, and
  I still haven't seen a text talking about this trick before.
  
  Now to work:
  
   If you have been using Copper Chunky Screens for a while, you have seen
   that processor performance decreases by 25% or more (I'm talking
   about screens opened with 256, 128 or 64 colors, not screens of continuous
   Color 0 alteration, like in Chaosland). Not only does the processor run
   slowly (and we are not even looking at blitter performance! Take note
   that the blitter, copper and CPU ran only on odd cycles on the A500), but
   the copper loses cycles because of the intense bitplane DMA access.
   Remember that bitplane DMA has priority over copper DMA, and to open a
   256-color screen you have to set 2*32 bit access in the fetch mode. There
   are many demos where you see strange irregularities on horizontal lines
   (for example, the doom section in Mindflow/Stellar). This occurs because
   the copper loses cycles and runs in a strange manner (I would like to see
   somebody documenting this). As a result, you have to play with the number
   of copper instructions you put on a line, and you are restricted to
   45 instructions/line. People with fast RAM will not care about this
   problem, although they still have slower chip RAM access.
   
   Well, this is the problem. Here goes my solution.
   
   First a brief explanation for advanced coders:
    My solution consists of displaying 3 or 4 sprites, each 64 pixels wide
    with 16 colors, positioned along the screen with the color pattern:
        0 0 1 1 2 2 ... 15 15 0 0 ....15 15 -=> One sprite (16*2 pixels *2)
    When the first line of sprite data has been read by DMA, I cut the sprite
    DMA access. Now the data in the video sprite buffer is displayed on every
    line without asking for memory access. No bitplane access, no sprite
    access = faster computer. Now you only have the copper running
    to feed the colors.
    To make the chunky display work with only 16 colors, you must make
    wise use of the BPLCON4 register ($DFF10C). The copper list looks like this:
         ..
         Wait X position of the first pixel of sprite
         Move XYZ,$10C  Select the sprite color bank 
         Move XX,ColorX    \
         Move YY,ColorX+1   |> There is time to feed 3 registers between
         Move ZZ,ColorX+2  /    sprite bank changes (8 pixels * 4= 32)
         Move XYZ2,$10C Select the next sprite color bank 
          ..
          ..
         Move LAST,$10C 
         You have to change $10C while the sprite is being displayed. After
         the last $10C change you will still have some time to feed more
         colors (about 54 copper instructions/line in total).
         In my example program I have used 3 sprites, giving me a 192*260
         display area. I have to spend 6 copper instructions/line to feed
         $10C, 1 for the wait, and 1 for BPLCON3. I still have time to change
         45-46 colors.
         
         As you can see, I have used 2-pixel-wide blocks. Of course, you
         can use the size you want and reduce the number of $10C changes, or
         leave the odd or even lines black with only one $10C inst. and
         change 52 colors on those lines, etc. The real limit is that you
         have 54 inst./line, or 54*260=14040 inst./screen. (Why don't those
         guys from VD explain their enhanced chunky method to us a little?
         I can't do anything with the tiny text they released, and I
         can't understand it very well. Does it overheat monitors? Does it
         run only on selected displays? Has it anything to do with voodoo or
         black magic? Will there ever be an AAA machine? Will David Pleasance
         win another Flamenco guitar competition?)
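
         The budget above is plain arithmetic, and a few lines of Python
         (not AMOS, just a quick sketch) make it easy to play with. The
         figure of 54 instructions/line is the approximate one quoted in
         this text; the function name and the idea that a black line needs
         no BPLCON3 move are my own assumptions for the illustration.

```python
# Copper budget per scanline, using only the figures quoted in the text.
SLOTS_PER_LINE = 54   # approximate copper instructions that fit on one line
LINES = 260           # height of the example display


def color_moves_per_line(bank_changes, waits=1, bplcon3_moves=1,
                         slots=SLOTS_PER_LINE):
    """Instructions left for colour registers once the overhead is paid."""
    return slots - bank_changes - waits - bplcon3_moves


# Example program: 3 sprites -> 6 writes to $10C, 1 wait, 1 BPLCON3 move.
normal_line = color_moves_per_line(bank_changes=6)

# Black line: a single $10C write and the wait (assumed: no BPLCON3 move).
black_line = color_moves_per_line(bank_changes=1, bplcon3_moves=0)

# Upper limit for the whole screen.
total_instructions = SLOTS_PER_LINE * LINES

print(normal_line, black_line, total_instructions)
```

         Change bank_changes to see what wider blocks would buy you.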
                  
     -===============-
        The source
     -===============-
   
    Well, enough of this. As I have said, this was a brief explanation. Now I
    will explain the source that comes with this text. What? An AMOS source?
    Yes. Please don't spit on me! I'm too lazy. I will confess, I like to
    code in AMOS. I suppose all of you have at some time programmed in Basic
    or Pascal or anything like that (except gengis and lonestarr: they were
    born speaking in binary). Well, here you have the strangest commands and
    functions of AMOS I used:
       - Reserve as chip No.,Size \ Reserves a block of Chip or best   
                work No.,Size / available memory (No.= 1 to 15)
       - = Start(No.) -=> Returns the start address of a reserved block
       - = Cop logic  -=> Returns the address of the current Copperlist
       - Poke Address,number -=> You really don't know this?
         Doke Address,number -=> equal to Move.w number,(address)
         Loke Address,number -=> equal to Move.l number,(address)   
       - = Peek(Address) \
         = Deek(Address)  > I love peek & poke. How can you live without them?
         = Leek(Address) /
    Hey! I was forgetting this! You can't run the source with Easy
    AMOS, as it has all those instructions cut out. (Does anybody have
    Easy AMOS?)
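
    For those who have never touched AMOS, here is a rough Python model of
    the six memory commands. It is only a sketch: the "memory" bytearray is
    a hypothetical stand-in for a reserved block, and the functions are my
    own, not part of any library. The one thing to remember is that the
    68000 stores words and longwords with the high byte first.

```python
import struct

memory = bytearray(16)  # hypothetical stand-in for a reserved AMOS block


def poke(addr, value):
    """Write one byte."""
    memory[addr] = value & 0xFF


def doke(addr, value):
    """Write one 16-bit word, high byte first (like Move.w)."""
    struct.pack_into(">H", memory, addr, value & 0xFFFF)


def loke(addr, value):
    """Write one 32-bit longword, high byte first (like Move.l)."""
    struct.pack_into(">I", memory, addr, value & 0xFFFFFFFF)


def peek(addr):
    return memory[addr]


def deek(addr):
    return struct.unpack_from(">H", memory, addr)[0]


def leek(addr):
    return struct.unpack_from(">I", memory, addr)[0]


# One copper instruction is one longword, so a single Loke writes it.
loke(0, 0x1F09FFFE)
print(hex(deek(0)), hex(deek(2)), hex(peek(3)))
```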
    
      *** Starting ***
    - I reserve block 14 for the copperlist (100 Kb; I have not trimmed it
      to the exact size).
    - DIMension PRSO (holds the sprite pointers).
    - Jump to the subroutine that makes the Copperlist.
    
    Make chunky:      
    - I put the X offsets at the start of block 14. I will come back to this 
      in a while. 
      *** Sprites images ***
    - The first 3 lines search for the next 64-bit aligned address available 
      in block 14.
    - For t=0 to 5 -=> Number of the sprite
    - Now I build the sprite control double-double-words.
      If the sprite is odd then it has the attached bit on (AASP=$8x); if
      it's even then it hasn't (AASP=$x).
      You will find the structure of 64-bit sprites at the end of this
      doc.
    - For P=1 to 2
       It makes the image data. If you want to alter it then modify
       VB1,VB2,VB3,VB4. Every vertical group of 4 bits is a color. (I
       recommend you put Rem at the beginning of the lines while editing and
       remove them when finished; you will see why.)
       There are 4 "if"'s, but these really work like 2. 
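
       To see what "every vertical group of 4 bits is a color" means, here
       is a small Python sketch that builds the 0 0 1 1 2 2 ... 15 15
       pattern for one 64-pixel line as four bit planes. That the four
       planes correspond to VB1..VB4 is my assumption; in an attached pair
       the even sprite carries the two low planes and the odd sprite the
       two high ones.

```python
WIDTH = 64        # pixels in one sprite line
BLOCK = 2         # pixels per chunky block, as in the example program


def sprite_line_planes(width=WIDTH, block=BLOCK):
    """Return 4 integers, one per bit plane, leftmost pixel in the top bit."""
    planes = [0, 0, 0, 0]
    for pixel in range(width):
        color = (pixel // block) % 16          # 0 0 1 1 2 2 ... 15 15 0 0 ...
        for plane in range(4):
            if (color >> plane) & 1:
                planes[plane] |= 1 << (width - 1 - pixel)
    return planes


planes = sprite_line_planes()
low_pair = planes[0:2]    # planes carried by the even sprite (low 2 bits)
high_pair = planes[2:4]   # planes carried by the odd, attached sprite

for value in planes:
    print(f"{value:016X}")
```

       Reading the four printed values column by column (top to bottom is
       plane 0 to plane 3) gives back the color number of each pixel.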
       *** Copperlist ***
       Bof!! The real show begins here.
       The copperlist is written in memory after the sprite data.
     - First two instructions : DMA -=> bitplanes off, sprites on.
     - Fetchmode -=> 64 bit sprites
     - I feed BPLCONx. I don't remember what values are there. Remove it 
       and see what happens.
       I think there is a bit that allows sprite display on borders. That's
       essential. If it is clear, you will get no display. 
     - Color 0 -=> Black background.
     - For t= 0 to 7
        Feeding the sprite pointers. I kept them in PRSO(x)
     - Loke PR,$1F09FFFE -=> Wait for the second sprite line. Here is the 
       TRICK. Loke PR,$960020 cuts the sprite DMA access and now the same
       sprite lines are displayed down the screen without memory access.
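
       If those magic longwords look like noise, this Python sketch shows
       how they are put together. The helper names are mine (hypothetical);
       the layout is the usual copper one: a MOVE is register offset + data,
       a WAIT is position (with bit 0 set) + mask (with bit 0 clear).

```python
DMACON = 0x096    # DMA control register, offset from $DFF000
BPLCON4 = 0x10C   # sprite/bitplane color bank register
SPREN = 0x0020    # DMACON bit 5 = sprite DMA; bit 15 clear means "clear bits"
COP_END = 0xFFFFFFFE


def cop_move(reg, value):
    """MOVE: first word = register offset, second word = data."""
    return ((reg & 0x1FE) << 16) | (value & 0xFFFF)


def cop_wait(vpos, hpos, mask=0xFFFE):
    """WAIT: first word = position with bit 0 set, second word = mask."""
    first = ((vpos & 0xFF) << 8) | (hpos & 0xFE) | 1
    return (first << 16) | (mask & 0xFFFE)


trick = [
    cop_wait(0x1F, 0x08),      # wait for the second sprite line
    cop_move(DMACON, SPREN),   # cut sprite DMA: the buffers keep their data
]

print([f"${w:08X}" for w in trick])
print(f"${cop_move(BPLCON4, 0x0011):08X}", f"${COP_END:08X}")
```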
    
     - For lin=43 to 292 step 4
       Now comes the real copper color structure.
       It reads four lines of copper instructions for four lines of display,
       each one preceded by a wait.
       The copper data is in the Data lines at the end of the source. You 
       can see it using Action Replay (C Address). I will give you a look:
          Wait Xpos_of_spr,Y  -=> Wait for the beginning of Spr.
          Move XX,Bplcon4     -=> Select color bank of the first 32 pixels
          Move XYZ,ColorZZ    \      
          Move MNL,ColorZZ+1  |> Feed colors while 32 pixels pass by
          Move IJK,ColorZZ+2  /  
          Move XX,Bplcon4     -=> Select color bank of the second 32 pixels   
           ... And the rest is the same, until the copper runs past the end
               of the sprite display. Then you don't have to control
               Bplcon4 any more, and only feed colors.
               By the way, there are some "Move BB,Bplcon3". These control
               the number of the bank where you are writing colors.
                          
     - Loke PR,$10C0011
       For BAN=0 to 7
       ...
       Next BAN
       Loke PR,$FFFFFFFE
       Set all colors to black to clear the rest of the display, then end
       the Copperlist.
       
       *** Testing program ***
       Here you can see how to use the screen.
     - Loke,doke -=> Swap the old copperlist for the new one.
     - For y..,for x.. You know
       Accessing a point is easy. As the structure of the copperlist
       is very irregular, I have had to make a list of offsets, one
       for every possible X position on screen (0-89). This is what I 
       did at the beginning of Makechunky. The offsets are, as
       you can see, at the beginning of block 14.
       The address will be: 
              Address=DIPAN+Deek(Start(14)+X*2)+(Y*(108*4))
       DIPAN is the address of the first copper color.
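
       The same lookup, sketched in Python. The offset table here is made
       up for the illustration (the real one is built by Makechunky and is
       longer); the only thing it tries to show is that the offsets jump
       wherever a non-color instruction such as a $10C move sits between
       two color moves. The line stride 108*4 is taken from the formula.

```python
LINE_STRIDE = 108 * 4   # bytes from one display line to the next (from the text)

# Hypothetical offsets in bytes for the first few X positions: colors are
# 4 bytes apart, with a gap where a non-color copper instruction sits.
X_OFFSETS = [0, 4, 8, 16, 20, 24, 32]


def color_address(dipan, x, y, x_offsets=X_OFFSETS, stride=LINE_STRIDE):
    """Address of the copper color word for chunky pixel (x, y)."""
    return dipan + x_offsets[x] + y * stride


DIPAN = 0x20000          # hypothetical address of the first copper color
print(hex(color_address(DIPAN, 3, 2)))
```

       Walking down a column only adds the stride, which is why vertical
       drawing needs no table at all.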
           
     - Lokes -=> Restore DMAs,Bplconxs, and the old copperlist.
        
    -==================- 
    The color 0 problem 
    -==================-
    
 If you have seen the source and the examples you will have noticed that
  there are some ugly black vertical lines along the screen. Well, I have
  not fixed that problem, but I will tell you what the problem is and
  leave this work to you. A 16-color sprite really has 15 colors +
  background color. This means that color 0 of the sprite doesn't care
  about which color bank is active: it's always the same color as $DFF180.
  The possible solutions are:
    - Don't use color 0 in the sprite. I've tried this by making two of the
      colors one pixel longer: 13*2+2*3=32 pixels. But there are problems in
      the bank-changing area. I think that by fine-tuning the Waits and X
      sprite positions it can be done. Look at the end of this doc; you will
      find how to move sprites with 1/4 pixel precision.
    - Keep changing color 0 all along the line. This is a nice solution, but
      it is only interesting for X*1 chunky displays. And you need to keep
      Bplcon3 pointing to bank 0 while the display runs. You can do an
      interesting 4*1 display with this technique, and you can gain some
      copper instructions too! (You have to change $10C less often.)
    - Open 3 bitplanes to cover the vertical lines. Arrgggghhhhh! I HATE
      THIS SOLUTION. Pure chunky display rulez!!

     -===============-
       Possibilities
     -===============-  
 
 I think there are a bunch of things you can do with this shit. A 4*1
  display is a thing that I will do sooner or later, because it can be
  adapted easily to an A500. In fact I think that there is something similar
  in the demo Clairvoyance, by Absolute. (It's sad, but there's always
  somebody who thought of the same thing before. Everything is invented.
  Sigh!)
 Another possibility is to leave the even or odd lines black. It reduces
  chunkiness and gives at least 6 more colors.
 Doing texture mapping on this display has some problems. For example, if
  you are doing Doom-type walls you will find it easy, as you work
  vertically, in columns. To access the next pixel you only have to add
  an offset to the address (in my example it is 104*4). But if you are doing
  ground texture mapping you need to access the pixels horizontally.
  This requires taking the horizontal offset from a table (Crap!).
  If you go for heavier things, like real texture mapping, then it gets
  easier, because it doesn't matter whether you draw horizontal or vertical
  lines. Always draw vertical lines where you can; after all, adding an
  offset (Add.l D6,A0 for example) only takes 1.3 cycles and runs at full
  speed.
 I think it is possible to get more than 256 pixels wide by changing the
  sprites' X position registers, but I haven't tried this yet. Try it
  yourself, if all this madness hasn't been enough for you.
 
 Well, enjoy yourself and play with copper chunky screens. We must play
  with true color, because soon we will have AAA on our desks (I will
  not bet on this, but we still have 64-bit consoles.)
    
 Yep! One last thing: if you need 24-bit textures there is an easy way to
  get them. Convert your Iff, Jpg, etc. to 24-bit BMP (without compression).
  The structure of a BMP is very simple. 
         Byte   Content
            0   ..
            1   ..
            .
            .
           18   X size low    \  Picture sizes, in inverted byte
           19   X size high   /  order. It's the PeeCeee way of
            .                    life : Ugh!
            .
           22   Y size low    \
           23   Y size high   /
            .
            .
           54   B component pixel 1
           55   G component pixel 1
           56   R component pixel 1
           57   B    "            2
           58   G    "            2
           59   R    "            2
            .
            .
                     
 AMOS is particularly easy for reading and writing these files.
 You can save a lot of time using it.        
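
 For those who prefer to convert on another machine, here is a Python
 sketch of the same reading. It follows the offsets listed above (sizes at
 18 and 22, pixels from 54, B-G-R order). Two things not in my table but
 standard for uncompressed BMP: each row is padded to a multiple of 4
 bytes, and the bottom row normally comes first in the file. The function
 name is made up, and reading the sizes as 16-bit values only works for
 pictures smaller than 65536 pixels per side.

```python
import struct


def read_bmp24(data):
    """Return (width, height, rows) with rows of (r, g, b), top row first."""
    width = struct.unpack_from("<H", data, 18)[0]    # low byte first
    height = struct.unpack_from("<H", data, 22)[0]
    row_bytes = (width * 3 + 3) & ~3                 # rows padded to 4 bytes
    rows = []
    for y in range(height):
        start = 54 + (height - 1 - y) * row_bytes    # file is bottom-up
        row = []
        for x in range(width):
            b, g, r = data[start + x * 3:start + x * 3 + 3]
            row.append((r, g, b))
        rows.append(row)
    return width, height, rows


# Tiny hand-made 2x1 picture for a quick check (not a real texture).
header = bytearray(54)
header[0:2] = b"BM"
struct.pack_into("<H", header, 18, 2)
struct.pack_into("<H", header, 22, 1)
demo = bytes(header) + bytes([1, 2, 3, 4, 5, 6, 0, 0])

print(read_bmp24(demo))
```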

     -===============-
          At last
     -===============-                               
     
Well, I hope someone will find this boring text useful. I believe
that some demos and games (like Fears or AB3D) could be faster using 
my trick.
Anyway, if you use this thing and improve it, or know a better method
to do the same, or have any complaint, or make any variation, or want to
damn me, I will be thankful if you write a text about it and leave it on
the internet. Or a comment in your program like: "Used Zaborra Chunky
Screen; at least until I lost my mind and went mad".
 You can send your crap as well to relex@ceratonia.eui.upv.es


A final note: Does anybody know a guy called Kpt. Iglo? This was my handle
 until I read a greeting in the Shocked? intro. Arrrggghhhh!! It's
 impossible to have a decent handle in this world! My temporary handle now
 is Almirante Peskanov, but I will change it in a few days.
  
 
 
   ,-----.    ,-----.   '         Rulez                     Z
   |     |    |        /   |                                 a
   |       -- |-----.  ----+-     forever !!                 b
   |     |    |      |     |                                 O
   `-----    `-----'           Prepare yourself for        r
                                                             r
                                   a scene in Chinesse !!  h  a
   
   
     
,-=============================-.  
|32-64 bit Sprite data structure|
`-=============================-'
(I think it is from the YRAGAEL & JUNKIE doc. Heavy material!)
       
For 32-bit and 64-bit wide sprites use bits 3 and 2 of register $01FC.
The sprite format (in particular the control words) varies for each width.

bit 3 | bit 2 | Width       | Control Words
------+-------+-------------+----------------------------------
  0   |   0   | 16 pixels   | 2 words (normal)
  1   |   0   | 32 pixels   | 2 longwords
  0   |   1   | 32 pixels   | 2 longwords
  1   |   1   | 64 pixels   | 2 double long words (4 longwords)
---------------------------------------------------------------
Wider sprites are not available under all conditions.
The hardware doesn't read the sprite list in the same way for each width
you choose for your sprite.

The address of a 16 pixels wide sprite must be a multiple of 2
The address of a 32 pixels wide sprite must be a multiple of 4
The address of a 64 pixels wide sprite must be a multiple of 8 (64-bit
aligned, which is what CNOP 0,8 gives in the examples below)
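
A rough Python sketch of the table and the alignment rule, in case you want
to compute sprite addresses outside the assembler. The names are made up
for the illustration, and I assume 64-bit (8-byte) alignment for 64 pixels
wide sprites, as CNOP 0,8 does in the examples below. The size formula
counts two control units, two data units per line and two end units, as in
the layouts shown in this doc.

```python
# (bit 3, bit 2) of $01FC -> (width in pixels, alignment in bytes)
SPRITE_FETCH = {
    (0, 0): (16, 2),
    (1, 0): (32, 4),
    (0, 1): (32, 4),
    (1, 1): (64, 8),   # assumed: 64-bit alignment, as CNOP 0,8 gives
}


def align_up(address, alignment):
    """Next address that is a multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


def sprite_bytes(width, lines):
    """Control pair + one data pair per line + end pair, in bytes."""
    unit = width // 8                  # bytes per control or data item
    return (1 + lines + 1) * 2 * unit


width, alignment = SPRITE_FETCH[(1, 1)]
start = align_up(0x1234, alignment)
print(hex(start), sprite_bytes(width, 1))
```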

16 pixels wide reading:

word C1, word C2
word A1, word B1
.
.
.
word An, word Bn
$0000 0000

C1=first control word
C2=second control word

Ai and Bi are combined via OR to form the sprite

32 pixels wide reading:

   CNOP  0,8      ;ALIGN 64 BIT

SPRITE32:         ;EXAMPLE OF 32 PIXELS WIDE AGA SPRITE
VSTART:
   dc.b 0      ;LONG C1
HSTART:
   DC.b 0
   DC.W 0
VSTOP:
   DC.b 0,0 ;LONG C2
   dc.w 0
 dc.L %00000000000000111100000000000000,%00000000000001000000000000000000;sprite
 dc.L %00000000000011111111000000000000,%00000000000010111100000000000000
long A3, long B3
.
.
.
long An, long Bn
   DC.W  0,0,0,0     ;END OF THE SPRITE

C1=first control long
   the  first control word is the high word of C1.  The low word of C1 must
   contain the second control word.
C2=second control long
   the second control word is the high word of C2. Low word of C2 is $0000

Ai and Bi are combined via OR to form the sprite

64 pixels wide reading:

   CNOP  0,8      ;ALIGN 64 BIT

SPRITE64:         ;EXAMPLE OF 64 PIXELS WIDE AGA SPRITE
VSTART:
   DC.B  0     ;DOUBLE C1
HSTART:
   DC.B  0
   DC.W  0
   DC.L  0
VSTOP:
   DC.B  0,0      ;DOUBLE C2
   DC.W  0
   DC.L  0
double A1, double B1
.
.
.
double An, double Bn
   DC.W  0,0,0,0,0,0,0,0   ;END OF THE SPRITE

C1=first control double
   C1=W3:W2:W1:W0 (Wi=words)
   W3 is first control word
   W2 and W1 are second control word
C2=second control double
   C2=W3:W2:W1:W0 (Wi=words)
   W3 is second control word

Ai and Bi are combined via OR to form the sprite

***************************************************************************

Moving sprites with 1/4 pixel precision: (from Yragael)

Use bits 3 and 4 of the second control word of a sprite to adjust its
position to 1/4 of a lowres pixel (one SuperHires pixel):

 bit 0 of second control word=bit 2 horizontal position
 bit 3 of second control word=bit 0 horizontal position
 bit 4 of second control word=bit 1 horizontal position

 The position of a sprite is now coded on 11 bits instead of 9 bits!
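
A small Python sketch of that bit shuffling. The three low bits go exactly
where the list above says; that the remaining bits 10..3 sit in the low
byte of the first control word is the usual sprite layout and is my
assumption here, as is the function name.

```python
def sprite_hpos_bits(x):
    """Split an 11-bit X position (in 1/4 lowres pixels) into control bits.

    Returns (low byte of first control word, bits for second control word).
    """
    x &= 0x7FF
    pos_low = (x >> 3) & 0xFF        # bits 10..3 (assumed standard layout)
    ctl = (x >> 2) & 1               # bit 2 of X -> bit 0 of second word
    ctl |= ((x >> 1) & 1) << 4       # bit 1 of X -> bit 4 of second word
    ctl |= (x & 1) << 3              # bit 0 of X -> bit 3 of second word
    return pos_low, ctl


pos_low, ctl = sprite_hpos_bits(0x87)
print(hex(pos_low), hex(ctl))
```

OR the second value into the second control word together with the attach
and vertical bits you already use.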

                 
     
      
   
   
   



















































What you did here was unfair!
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    
    

    
