VDP Internals

Mask of Destiny
Very interested
Posts: 615
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Tue Nov 13, 2012 6:40 am

With some more work, I think yes. We would need the timing of HINTs or certain HV counter changes first though. You can then use that to get a position in the line with a certain amount of uncertainty. Delay until you're sure you're in between two external access slots (taking into account the uncertainty) and then start a DMA operation.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue Nov 13, 2012 11:31 am

Chilly Willy wrote: Could the internal timing be used to figure out a way to start the DMA at the same time every frame? Right now, the code used in the DMA direct color example is black magic.
I can take the mystery out of it. I've actually specifically been working on correctly supporting the direct colour demo, so I've been looking very closely at the exact timings that occur. Yes, knowledge of the VDP internal timings, along with analysis of the M68000 timings, can be used to predict scenarios under which a stable starting point can be obtained. First let me explain exactly how the existing demos manage to obtain a stable raster position for DMA. There are a few different timing edge-cases which need to play off each other in order for these demos to work. I'm frankly amazed that the formula for a stable image was ever discovered. The basic steps are as follows:

1. Use the VBlank flag to roughly align to a position in the frame. This will have a potential margin of error larger than the total number of MCLK cycles involved in the sequence of instructions used to test the flag, since it's possible for the flag to be set before the opcode testing it finishes executing. In other words, it could be dozens of pixels out of alignment.
2. Use a full FIFO to align to a specific position in the frame. Note that this needs to be done where there is a large enough gap between access slots to absorb all the inaccuracy in the above alignment method. Since the VBlank flag is cleared just after the two consecutive access slots near the end of a line, this provides a very large gap between consecutive access slots in H40 mode, which is large enough to eliminate the error in the above synchronization method. We are still left with a few SC cycles of inaccuracy though, since VCLK is slower than SC, and the counters are not aligned, so any modification we attempt to VDP state now could be 1-2 pixels off.
3. Trigger a DMA transfer operation, carefully padding the cycle at which the DMA operation is triggered onto the exact position where a refresh cycle occurs. The refresh cycle will align the start of the DMA operation onto a 4 SC (2 pixel clock) boundary, eliminating the error introduced by the full FIFO alignment method.

Effectively, each consecutive step is designed to provide synchronization from the best and worst case timing of the preceding step. The combination of these three methods allows a perfect synchronization to be achieved. You'll find in the supplied demo code, the position of the NOP instructions is unimportant, and can even occur after the display is re-enabled, as long as they occur after we synchronize on VBlank, and before the DMA operation is triggered. These NOP instructions are just required to pad the MCLK cycle count so that the DMA operation is triggered at the right cycle to align with a refresh cycle. You'll find that 22 or 40 NOP instructions are also stable, as these manage to "hit" the next consecutive refresh slots, and you can go on adding 18 NOP instructions to find each additional refresh slot until the end of the line.
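The NOP padding arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of the demo: it assumes a NOP takes 4 M68000 cycles and that the M68000 clock is MCLK/7, and the function names are made up for this example. It just enumerates the counts quoted above (22, 40, then every further 18 NOPs) and does not model the VDP, so "until the end of the line" is approximated as one line's worth of MCLK cycles past the first count.

```python
# Assumptions (not taken from the demo source):
#   - a NOP takes 4 M68000 cycles
#   - the M68000 clock is MCLK / 7
M68K_CYCLES_PER_NOP = 4
MCLK_PER_M68K_CYCLE = 7
MCLK_PER_LINE = 3420  # line length from the horizontal timing tables in this thread


def nops_to_mclk(nop_count):
    """Convert a run of NOP instructions to MCLK cycles."""
    return nop_count * M68K_CYCLES_PER_NOP * MCLK_PER_M68K_CYCLE


def candidate_nop_counts(first, step=18, max_extra_mclk=MCLK_PER_LINE):
    """List first, first+step, ... while the extra padding beyond `first`
    stays below `max_extra_mclk` MCLK cycles (roughly one scanline)."""
    counts = []
    n = first
    while nops_to_mclk(n - first) < max_extra_mclk:
        counts.append(n)
        n += step
    return counts


if __name__ == "__main__":
    print("MCLK cycles per 18-NOP step:", nops_to_mclk(18))
    print("candidate NOP counts:", candidate_nop_counts(22))
```

Each candidate still has to be checked on hardware; the list only says where to look.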

Post by Nemesis » Tue Nov 13, 2012 11:42 am

I'm going to publish my very latest timing information that I've obtained about the alignment of various digital and analog events with the HV counter. Here's the data I've sampled for horizontal scan information:

Code:

	//Analog screen sections in relation to HCounter (H32 mode):
	//-----------------------------------------------------------------
	//| Screen section | HCounter  |Pixel| Pixel |Serial|Serial |MCLK |
	//| (PAL/NTSC H32) |  value    |clock| clock |clock |clock  |ticks|
	//|                |           |ticks|divider|ticks |divider|     |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|Left border     |0x00B-0x017|  13 |SCLK/2 |   26 |MCLK/5 | 130 |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|Active display  |0x018-0x117| 256 |SCLK/2 |  512 |MCLK/5 |2560 |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|Right border    |0x118-0x125|  14 |SCLK/2 |   28 |MCLK/5 | 140 |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|Back porch      |0x126-0x127|   9 |SCLK/2 |   18 |MCLK/5 |  90 |
	//|(Right Blanking)|0x1D2-0x1D8|     |       |      |       |     |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|Horizontal sync |0x1D9-0x1F2|  26 |SCLK/2 |   52 |MCLK/5 | 260 |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|Front porch     |0x1F3-0x00A|  24 |SCLK/2 |   48 |MCLK/5 | 240 |
	//|(Left Blanking) |           |     |       |      |       |     |
	//|----------------|-----------|-----|-------|------|-------|-----|
	//|TOTALS          |           | 342 |       |  684 |       |3420 |
	//-----------------------------------------------------------------

	//Analog screen sections in relation to HCounter (H40 mode):
	//--------------------------------------------------------------------
	//| Screen section |   HCounter    |Pixel| Pixel |EDCLK| EDCLK |MCLK |
	//| (PAL/NTSC H40) |    value      |clock| clock |ticks|divider|ticks|
	//|                |               |ticks|divider|     |       |     |
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|Left border     |0x00D-0x019    |  13 |EDCLK/2|  26 |MCLK/4 | 104 |
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|Active display  |0x01A-0x159    | 320 |EDCLK/2| 640 |MCLK/4 |2560 |
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|Right border    |0x15A-0x167    |  14 |EDCLK/2|  28 |MCLK/4 | 112 |
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|Back porch      |0x168-0x16C    |   9 |EDCLK/2|  18 |MCLK/4 |  72 |
	//|(Right Blanking)|0x1C9-0x1CC    |     |       |     |       |     |
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|Horizontal sync |0x1CD.0-0x1D4.5| 7.5 |EDCLK/2|  15 |MCLK/5 |  75 |
	//|                |0x1D4.5-0x1D5.5|   1 |EDCLK/2|   2 |MCLK/4 |   8 |
	//|                |0x1D5.5-0x1DC.0| 7.5 |EDCLK/2|  15 |MCLK/5 |  75 |
	//|                |0x1DD.0        |   1 |EDCLK/2|   2 |MCLK/4 |   8 |
	//|                |0x1DE.0-0x1E5.5| 7.5 |EDCLK/2|  15 |MCLK/5 |  75 |
	//|                |0x1E5.5-0x1E6.5|   1 |EDCLK/2|   2 |MCLK/4 |   8 |
	//|                |0x1E6.5-0x1EC.0| 6.5 |EDCLK/2|  13 |MCLK/5 |  65 |
	//|                |===============|=====|=======|=====|=======|=====|
	//|        Subtotal|0x1CD-0x1EC    | (32)|       | (64)|       |(314)|
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|Front porch     |0x1ED          |   1 |EDCLK/2|   2 |MCLK/5 |  10 |
	//|(Left Blanking) |0x1EE-0x00C    |  31 |EDCLK/2|  62 |MCLK/4 | 248 |
	//|                |===============|=====|=======|=====|=======|=====|
	//|        Subtotal|0x1ED-0x00C    | (32)|       | (64)|       |(258)|
	//|----------------|---------------|-----|-------|-----|-------|-----|
	//|TOTALS          |               | 420 |       | 840 |       |3420 |
	//--------------------------------------------------------------------

	//Digital render events in relation to HCounter:
	//----------------------------------------------------
	//|        Video |PAL/NTSC         |PAL/NTSC         |
	//|         Mode |H32     (RSx=00) |H40     (RSx=11) |
	//|              |V28/V30 (M2=*)   |V28/V30 (M2=*)   |
	//| Event        |Int any (LSMx=**)|Int any (LSMx=**)|
	//|--------------------------------------------------|
	//|HCounter      |[1]0x000-0x127   |[1]0x000-0x16C   |
	//|progression   |[2]0x1D2-0x1FF   |[2]0x1C9-0x1FF   |
	//|9-bit internal|                 |                 |
	//|--------------------------------------------------|
	//|VCounter      |HCounter changes |HCounter changes |
	//|increment     |from 0x109 to    |from 0x149 to    |
	//|              |0x10A in [1].    |0x14A in [1].    |
	//|--------------------------------------------------| //Logic analyzer tests conducted on 2012-11-03 confirm 18 SC
	//|HBlank set    |HCounter changes |HCounter changes | //cycles between HBlank set in status register and HSYNC
	//|              |from 0x125 to    |from 0x165 to    | //asserted in H32 mode, and 21 SC cycles in H40 mode.
	//|              |0x126 in [1].    |0x166 in [1].    | //Note this actually means in H40 mode, HBlank is set at 0x166.5.
	//|--------------------------------------------------| //Logic analyzer tests conducted on 2012-11-03 confirm 46 SC
	//|HBlank cleared|HCounter changes |HCounter changes | //cycles between HSYNC cleared and HBlank cleared in status
	//|              |from 0x009 to    |from 0x00A to    | //register in H32 mode, and 61 SC cycles in H40 mode.
	//|              |0x00A in [1].    |0x00B in [1].    | //Note this actually means in H40 mode, HBlank is cleared at 0x00B.5.
	//|--------------------------------------------------|
	//|F flag set    |HCounter changes |HCounter changes | //Logic analyzer tests conducted on 2012-11-03 confirm 28 SC
	//|              |from 0x000 to    |from 0x000 to    | //cycles between HSYNC cleared and F flag set in status
	//|              |0x001 in [1]     |0x001 in [1]     | //register in H32 mode, and 40 SC cycles in H40 mode.
	//|--------------------------------------------------|
	//|ODD flag      |HCounter changes |HCounter changes | //Logic analyzer tests conducted on 2012-11-03 confirm 30 SC
	//|toggled       |from 0x001 to    |from 0x001 to    | //cycles between HSYNC cleared and odd flag toggled in status
	//|              |0x002 in [1]     |0x002 in [1]     | //register in H32 mode, and 42 SC cycles in H40 mode.
	//|--------------------------------------------------|
	//|HINT flagged  |HCounter changes |HCounter changes | //Logic analyzer tests conducted on 2012-11-02 confirm 74 SC
	//|via IPL lines |from 0x109 to    |from 0x149 to    | //cycles between HINT flagged in IPL lines and HSYNC
	//|              |0x10A in [1].    |0x14A in [1].    | //asserted in H32 mode, and 78 SC cycles in H40 mode.
	//|--------------------------------------------------|
	//|VINT flagged  |HCounter changes |HCounter changes | //Logic analyzer tests conducted on 2012-11-02 confirm 28 SC
	//|via IPL lines |from 0x000 to    |from 0x000 to    | //cycles between HSYNC cleared and VINT flagged in IPL lines
	//|              |0x001 in [1].    |0x001 in [1].    | //in H32 mode, and 40 SC cycles in H40 mode.
	//|--------------------------------------------------|
	//|HSYNC asserted|HCounter changes |HCounter changes |
	//|              |from 0x1D8 to    |from 0x1CC to    |
	//|              |0x1D9 in [2].    |0x1CD in [2].    |
	//|--------------------------------------------------|
	//|HSYNC negated |HCounter changes |HCounter changes |
	//|              |from 0x1F2 to    |from 0x1EC to    |
	//|              |0x1F3 in [2].    |0x1ED in [2].    |
	//----------------------------------------------------
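As a quick cross-check of the horizontal tables above, the per-section tick counts can be summed to confirm that both modes come out at 3420 MCLK ticks per line. The sketch below copies the pixel and MCLK columns from the tables (using the subtotal rows for the H40 sync and front porch). The NTSC master clock value used for the line time is an assumption that does not appear in the tables, and the names are invented for the example.

```python
# (section, pixel clock ticks, MCLK ticks), copied from the tables above.
H32_SECTIONS = [
    ("Left border", 13, 130),
    ("Active display", 256, 2560),
    ("Right border", 14, 140),
    ("Back porch", 9, 90),
    ("Horizontal sync", 26, 260),
    ("Front porch", 24, 240),
]
H40_SECTIONS = [
    ("Left border", 13, 104),
    ("Active display", 320, 2560),
    ("Right border", 14, 112),
    ("Back porch", 9, 72),
    ("Horizontal sync", 32, 314),  # subtotal row (mixed MCLK/5 and MCLK/4)
    ("Front porch", 32, 258),      # subtotal row (mixed MCLK/5 and MCLK/4)
]

# Assumption: NTSC master clock in Hz (not stated in the tables).
NTSC_MCLK_HZ = 53_693_175


def line_totals(sections):
    """Return (pixel clock ticks, MCLK ticks) for one full scanline."""
    pixels = sum(p for _, p, _ in sections)
    mclk = sum(m for _, _, m in sections)
    return pixels, mclk


def line_time_us(mclk_ticks, mclk_hz):
    """Duration of a scanline in microseconds."""
    return mclk_ticks / mclk_hz * 1e6


if __name__ == "__main__":
    for name, sections in (("H32", H32_SECTIONS), ("H40", H40_SECTIONS)):
        pixels, mclk = line_totals(sections)
        print(name, pixels, mclk, round(line_time_us(mclk, NTSC_MCLK_HZ), 3))
```

Both modes having the same line length in MCLK ticks is what lets a CPU-side delay loop counted in M68000 cycles cover the same fraction of a line in either mode.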
And here's my data for vertical scan information:

Code:

	//Analog screen sections in relation to VCounter:
	//-------------------------------------------------------------------------------------------
	//|           Video |NTSC             |NTSC             |PAL              |PAL              |
	//|            Mode |H32/H40(RSx00/11)|H32/H40(RSx00/11)|H32/H40(RSx00/11)|H32/H40(RSx00/11)|
	//|                 |V28     (M2=0)   |V30     (M2=1)   |V28     (M2=0)   |V30     (M2=1)   |
	//|                 |Int none(LSMx=*0)|Int none(LSMx=*0)|Int none(LSMx=*0)|Int none(LSMx=*0)|
	//|                 |------------------------------------------------------------------------
	//|                 | VCounter  |Line | VCounter  |Line | VCounter  |Line | VCounter  |Line |
	//| Screen section  |  value    |count|  value    |count|  value    |count|  value    |count|
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|Active display   |0x000-0x0DF| 224 |0x000-0x1FF| 240*|0x000-0x0DF| 224 |0x000-0x0EF| 240 |
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|Bottom border    |0x0E0-0x0E7|   8 |           |   0 |0x0E0-0x0FF|  32 |0x0F0-0x107|  24 |
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|Bottom blanking  |0x0E8-0x0EA|   3 |           |   0 |0x100-0x102|   3 |0x108-0x10A|   3 |
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|Vertical sync    |0x1E5-0x1E7|   3 |           |   0 |0x1CA-0x1CC|   3 |0x1D2-0x1D4|   3 |
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|Top blanking     |0x1E8-0x1F4|  13 |           |   0 |0x1CD-0x1D9|  13 |0x1D5-0x1E1|  13 |
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|Top border       |0x1F5-0x1FF|  11 |           |   0 |0x1DA-0x1FF|  38 |0x1E2-0x1FF|  30 |
	//|-----------------|-----------|-----|-----------|-----|-----------|-----|-----------|-----|
	//|TOTALS           |           | 262 |           | 240*|           | 313 |           | 313 |
	//-------------------------------------------------------------------------------------------
	//*When V30 mode and NTSC mode are both active, no border, blanking, or retrace
	//occurs. A 30-row display is setup and rendered, however, immediately following the
	//end of the 30th row, the 1st row starts again. In addition, the VCounter is never
	//reset, which usually happens at the beginning of vertical blanking. Instead, the
	//VCounter continuously counts from 0x000-0x1FF, then wraps around back to 0x000 and
	//begins again. Since there are only 240 lines output as part of the display, this
	//means the actual line being rendered is desynchronized from the VCounter. Digital
	//events such as vblank flags being set/cleared, VInt being triggered, the odd flag
	//being toggled, and so forth, still occur at the correct VCounter positions they
	//would occur in (IE, the same as PAL mode V30), however, since the VCounter has 512
	//lines per cycle, this means VInt is triggered at a slower rate than normal.
	//##TODO## Confirm on the hardware that the rendering row is desynchronized from the
	//VCounter. This would seem unlikely, since a separate render line counter would have
	//to be maintained apart from VCounter for this to occur.

	//Digital render events in relation to VCounter under NTSC mode:
	//#ODD - Runs only when the ODD flag is set
	//----------------------------------------------------------------------------------------
	//|        Video |NTSC             |NTSC             |NTSC             |NTSC             |
	//|         Mode |H32/H40(RSx00/11)|H32/H40(RSx00/11)|H32/H40(RSx00/11)|H32/H40(RSx00/11)|
	//|              |V28     (M2=0)   |V28     (M2=0)   |V30     (M2=1)   |V30     (M2=1)   |
	//| Event        |Int none(LSMx=*0)|Int both(LSMx=*1)|Int none(LSMx=*0)|Int both(LSMx=*1)|
	//|--------------------------------------------------------------------------------------|
	//|VCounter      |[1]0x000-0x0EA   |[1]0x000-0x0EA   |[1]0x000-0x1FF   |[1]0x000-0x1FF   |
	//|progression   |[2]0x1E5-0x1FF   |[2]0x1E4(#ODD)   |                 |                 |
	//|9-bit internal|                 |[3]0x1E5-0x1FF   |                 |                 |
	//|--------------------------------------------------------------------------------------|
	//|VBlank set    |VCounter changes |                 |VCounter changes |                 |
	//|              |from 0x0DF to    |     <Same>      |from 0x0EF to    |     <Same>      |
	//|              |0x0E0 in [1].    |                 |0x0F0 in [1].    |                 |
	//|--------------------------------------------------------------------------------------|
	//|VBlank cleared|VCounter changes |                 |VCounter changes |                 |
	//|              |from 0x1FE to    |     <Same>      |from 0x1FE to    |     <Same>      |
	//|              |0x1FF in [2].    |                 |0x1FF in [1].    |                 |
	//|--------------------------------------------------------------------------------------|
	//|F flag set    |At indicated     |                 |At indicated     |                 |
	//|              |HCounter position|                 |HCounter position|                 |
	//|              |while VCounter is|     <Same>      |while VCounter is|     <Same>      |
	//|              |set to 0x0E0 in  |                 |set to 0x0F0 in  |                 |
	//|              |[1].             |                 |[1].             |                 |
	//|--------------------------------------------------------------------------------------|
	//|VSYNC asserted|VCounter changes |                 |      Never      |                 |
	//|              |from 0x0E7 to    |     <Same>      |                 |     <Same>      |
	//|              |0x0E8 in [1].    |                 |                 |                 |
	//|--------------------------------------------------------------------------------------|
	//|VSYNC cleared |VCounter changes |                 |      Never      |                 |
	//|              |from 0x1F4 to    |     <Same>      |                 |     <Same>      |
	//|              |0x1F5 in [2].    |                 |                 |                 |
	//|--------------------------------------------------------------------------------------|
	//|ODD flag      |At indicated     |                 |At indicated     |                 |
	//|toggled       |HCounter position|                 |HCounter position|                 |
	//|              |while VCounter is|     <Same>      |while VCounter is|     <Same>      |
	//|              |set to 0x0E0 in  |                 |set to 0x0F0 in  |                 |
	//|              |[1].             |                 |[1].             |                 |
	//----------------------------------------------------------------------------------------

	//Digital render events in relation to VCounter under PAL mode:
	//#ODD - Runs only when the ODD flag is set
	//----------------------------------------------------------------------------------------
	//|        Video |PAL              |PAL              |PAL              |PAL              |
	//|         Mode |H32/H40(RSx00/11)|H32/H40(RSx00/11)|H32/H40(RSx00/11)|H32/H40(RSx00/11)|
	//|              |V28     (M2=0)   |V28     (M2=0)   |V30     (M2=1)   |V30     (M2=1)   |
	//| Event        |Int none(LSMx=*0)|Int both(LSMx=*1)|Int none(LSMx=*0)|Int both(LSMx=*1)|
	//|--------------------------------------------------------------------------------------|
	//|VCounter      |[1]0x000-0x102   |[1]0x000-0x101   |[1]0x000-0x10A   |[1]0x000-0x109   |
	//|progression   |[2]0x1CA-0x1FF   |[2]0x1C9(#ODD)   |[2]0x1D2-0x1FF   |[2]0x1D1(#ODD)   |
	//|9-bit internal|                 |[3]0x1CA-0x1FF   |                 |[3]0x1D2-0x1FF   |
	//|--------------------------------------------------------------------------------------|
	//|VBlank set    |VCounter changes |                 |VCounter changes |                 |
	//|              |from 0x0DF to    |     <Same>      |from 0x0EF to    |     <Same>      |
	//|              |0x0E0 in [1].    |                 |0x0F0 in [1].    |                 |
	//|--------------------------------------------------------------------------------------|
	//|VBlank cleared|VCounter changes |                 |VCounter changes |                 |
	//|              |from 0x1FE to    |     <Same>      |from 0x1FE to    |     <Same>      |
	//|              |0x1FF in [2].    |                 |0x1FF in [2].    |                 |
	//|--------------------------------------------------------------------------------------|
	//|F flag set    |At indicated     |                 |At indicated     |                 |
	//|              |HCounter position|                 |HCounter position|                 |
	//|              |while VCounter is|     <Same>      |while VCounter is|     <Same>      |
	//|              |set to 0x0E0 in  |                 |set to 0x0F0 in  |                 |
	//|              |[1].             |                 |[1].             |                 |
	//|--------------------------------------------------------------------------------------|
	//|VSYNC asserted|VCounter changes |                 |VCounter changes |                 |
	//|              |from 0x0FF to    |     <Same>      |from 0x107 to    |     <Same>      |
	//|              |0x100 in [1].    |                 |0x108 in [1].    |                 |
	//|--------------------------------------------------------------------------------------|
	//|VSYNC cleared |VCounter changes |                 |VCounter changes |                 |
	//|              |from 0x1D9 to    |     <Same>      |from 0x1E1 to    |     <Same>      |
	//|              |0x1DA in [2].    |                 |0x1E2 in [2].    |                 |
	//|--------------------------------------------------------------------------------------|
	//|ODD flag      |At indicated     |                 |At indicated     |                 |
	//|toggled       |HCounter position|                 |HCounter position|                 |
	//|              |while VCounter is|     <Same>      |while VCounter is|     <Same>      |
	//|              |set to 0x0E0 in  |                 |set to 0x0F0 in  |                 |
	//|              |[1].             |                 |[1].             |                 |
	//----------------------------------------------------------------------------------------
The samplings of the analog screen sections are mostly based on the information that's already been published here on SpritesMind by various other authors, and I haven't independently verified all of it. All the measurements about the alignment of the status register flags, sync lines, and interrupt events, have been sampled by me though, and have been confirmed to an exact SC cycle. I used a very useful fact about the VDP in order to do this, which is that for the first couple of clock cycles when a value is being read from the status register or HV counter, but before it's been latched by the M68000, the data is actually output "live" from the internal register state, meaning if the register state changes while the data is being read and its value asserted over the external bus, its value will actually immediately change. This can be observed using a logic analyser, and the exact SC cycle at which the change occurs can then be determined. I used this method to obtain these values.
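For anyone who wants to sanity-check the vertical tables, here is a rough Python sketch that adds up the line counts and turns them into frame rates, including the slower VInt rate described in the NTSC V30 footnote (one VInt per 512-line VCounter cycle). The line counts and the 3420 MCLK line length come from the tables above; the two master clock frequencies are assumptions that are not stated in this post, and the names are invented for the example.

```python
MCLK_PER_LINE = 3420  # from the horizontal tables

# Assumptions: master clock frequencies in Hz (not stated in this post).
NTSC_MCLK_HZ = 53_693_175
PAL_MCLK_HZ = 53_203_424

# Line counts per screen section, top to bottom of the vertical table:
# active, bottom border, bottom blanking, vsync, top blanking, top border.
NTSC_V28_LINES = [224, 8, 3, 3, 13, 11]
PAL_V28_LINES = [224, 32, 3, 3, 13, 38]
PAL_V30_LINES = [240, 24, 3, 3, 13, 30]

# In NTSC V30 mode the VCounter runs 0x000-0x1FF without being reset.
NTSC_V30_VCOUNTER_LINES = 512


def frame_rate_hz(mclk_hz, lines_per_frame):
    """Events per second for something that happens once every
    `lines_per_frame` scanlines."""
    return mclk_hz / (MCLK_PER_LINE * lines_per_frame)


if __name__ == "__main__":
    print("NTSC V28:", sum(NTSC_V28_LINES), "lines,",
          round(frame_rate_hz(NTSC_MCLK_HZ, sum(NTSC_V28_LINES)), 4), "Hz")
    print("PAL V28: ", sum(PAL_V28_LINES), "lines,",
          round(frame_rate_hz(PAL_MCLK_HZ, sum(PAL_V28_LINES)), 4), "Hz")
    print("PAL V30: ", sum(PAL_V30_LINES), "lines,",
          round(frame_rate_hz(PAL_MCLK_HZ, sum(PAL_V30_LINES)), 4), "Hz")
    print("NTSC V30 VInt rate:",
          round(frame_rate_hz(NTSC_MCLK_HZ, NTSC_V30_VCOUNTER_LINES), 4), "Hz")
```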

Post by Nemesis » Tue Nov 13, 2012 12:31 pm

I've just done some more messing around with the direct color demo timing. The earliest possible timing that's stable happens when you reduce the number of FIFO writes from 13 to 5, and then insert 10 NOP instructions. This will give you a stable alignment for triggering DMA partway through the first line before the active display, with VCounter = 0x1FF. Using 5 FIFO writes with 94 NOP instructions will give you the same timing as the original demo. Using this timing is probably better, since you now have 94 NOPs (376 M68000 cycles, or 2632 MCLK cycles) to play with, which could be used to perform calculations while you're waiting for alignment.

The next step is obviously to get a stable DMA position in H32 mode. That fails for two reasons with the H40 timing. First of all, since there's an extra access slot in the middle of the blanking period in H32 mode, we end up with our first FIFO write sometimes ending up in that slot and sometimes missing that slot. That's easily fixed by running 4 NOP opcodes just after waiting for the VBlank flag to be cleared. We'll now have alignment with the next access slot. We now need to pad to hit a refresh cycle before starting our DMA operation. The earliest timing for that is to use 5 FIFO writes, with 17 NOP instructions. That will give alignment partway through the first line before active display. Using 92 NOP instructions will hit the same refresh slot as the H40 mode demo, with a stable display.
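To keep the four combinations straight, here is a small Python summary of the numbers above, with the padding converted to M68000 and MCLK cycles. It is only a bookkeeping sketch: the conversion assumes 4 M68000 cycles per NOP and an M68000 clock of MCLK/7, the 4 leading NOPs are assumed to apply to both H32 entries, and the `SyncConfig` and `pad_budget` names are invented for the example.

```python
from collections import namedtuple

# Assumptions: NOP = 4 M68000 cycles, M68000 clock = MCLK / 7.
M68K_CYCLES_PER_NOP = 4
MCLK_PER_M68K_CYCLE = 7

# pre_nops: NOPs run right after the VBlank flag clears (H32 only)
# pad_nops: NOPs run between the FIFO writes and the DMA trigger
SyncConfig = namedtuple("SyncConfig", "mode pre_nops fifo_writes pad_nops note")

CONFIGS = [
    SyncConfig("H40", 0, 5, 10, "earliest stable point"),
    SyncConfig("H40", 0, 5, 94, "same timing as the original demo"),
    SyncConfig("H32", 4, 5, 17, "earliest stable point"),
    SyncConfig("H32", 4, 5, 92, "same refresh slot as the H40 demo"),
]


def pad_budget(cfg):
    """Return (M68000 cycles, MCLK cycles) spent in the padding NOPs,
    i.e. the time that could be used for other work instead."""
    m68k = cfg.pad_nops * M68K_CYCLES_PER_NOP
    return m68k, m68k * MCLK_PER_M68K_CYCLE


if __name__ == "__main__":
    for cfg in CONFIGS:
        m68k, mclk = pad_budget(cfg)
        print(cfg.mode, cfg.pad_nops, "NOPs =", m68k, "M68000 cycles =",
              mclk, "MCLK cycles,", cfg.note)
```

Any replacement for the padding NOPs would need to take exactly the same number of cycles, since the whole point is to land on the refresh cycle.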

Post by Nemesis » Tue Nov 13, 2012 12:45 pm

Just to make this direct color timing info I just wrote up a bit more accessible, I'll post some code. Based on the original source code provided by Oerg866 in this thread:
viewtopic.php?p=16233#16233
Modify the synchronization code block to the following for H40 mode:

Code:

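        ; Wait for the VBlank flag (status bit 3) to be set...
@SyncToNewFrame: 
        btst #3, (a3)
        beq.s @SyncToNewFrame
        ; ...then wait for it to be cleared again
@SyncToNewFrame1: 
        btst #3, (a3)
        bne.s @SyncToNewFrame1

        ; 5 FIFO writes to align with an access slot
        rept 5
            move.w d0, (a5)
        endr
        ; Pad so the DMA trigger lands on a refresh cycle
        rept 94
            nop
        endr

        ; Execute DMA
        MOVE.l  #$934094ad,             (a6)  ; INIT dma to cram, DMA length = $AD40
        MOVE.l  #$95009600,             (a6)  ; DMA source low/mid
        MOVE.l  #$97018114,             (a6)  ; DMA source high, display off
        MOVE.l  #$C0000080,             (a6)  ; start DMA (micro optimisations :UUUUU)
        bra.w HBLwait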
And for H32 mode, apart from doing "move.w #$8c00, (a6)" instead of "move.w #$8c81, (a6)", modify the synchronization code block to the following:

Code:

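        ; Wait for the VBlank flag (status bit 3) to be set...
@SyncToNewFrame: 
        btst #3, (a3)
        beq.s @SyncToNewFrame
        ; ...then wait for it to be cleared again
@SyncToNewFrame1: 
        btst #3, (a3)
        bne.s @SyncToNewFrame1
        ; 4 NOPs to get past the extra H32 access slot in the blanking period
        nop
        nop
        nop
        nop

        ; 5 FIFO writes to align with the next access slot
        rept 5
            move.w d0, (a5)
        endr
        ; Pad so the DMA trigger lands on a refresh cycle
        rept 92
            nop
        endr

        ; Execute DMA
        MOVE.l  #$934094ad,             (a6)  ; INIT dma to cram, DMA length = $AD40
        MOVE.l  #$95009600,             (a6)  ; DMA source low/mid
        MOVE.l  #$97018114,             (a6)  ; DMA source high, display off
        MOVE.l  #$C0000080,             (a6)  ; start DMA (micro optimisations :UUUUU)
        bra.w HBLwait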
That will give you a stable raster in H32 mode, but the existing image will of course be garbage since it's designed for a H40 mode display.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Tue Nov 13, 2012 5:21 pm

I did a few H32 demos in that thread as well. Where I had the problem was H32 mode using the SCD word ram as the source of the bitmap. I'll try the SCD demo with both H40 and H32 images again with your code and see what happens. :D

That's a great explanation of what's going on. It's no longer black magic... just normal magic instead. :lol:

Okay, the code proposed by Nemesis does an excellent job of giving a stable display, for both H40 and H32. I posted a new SCD demo over in the demo forum. Here's the code for the dma:

Code:

dma_screen:
        move.l  4(sp),d0                /* buffer (assumed to be 0x200000) */
        move.l  8(sp),d1                /* wide flag */
        movem.l d2-d7/a2-a6,-(sp)
        move.w  #0x2700,sr              /* disallow interrupts */
        move.l  d1,d7                   /* save wide flag */

        /* clear palette */
        moveq   #0,d0
        lea     0xC00000,a2
        lea     0xC00004,a3
        moveq   #31,d1
        move.l  #0xC0000000,(a3)        /* write CRAM address 0 */
1:
        move.l  d0,(a2)                 /* clear palette */
        dbra    d1,1b

        /* init VDP regs */
        move.w  #0x8F00,(a3)            /*  clear INC register */
        tst.b   d7
        bne.b   dma_wide                /* wide screen display loop */
        move.w  #0x8C00,(a3)            /* H32 mode, no lace, no shadow/hilite */
        bra.w   dma_narrow              /* narrow screen display loop */

        /* loop - turn on display and wait for vblank */
dma_wide:
        moveq   #0,d0
        move.w  #0x8154,(a3)            /* turn on display (no VB int, V28 mode) */
        move.l  #0x40000000,(a3)        /* write vram address 0 */
1:
        btst    #3,1(a3)
        beq.b   1b                      /* wait for VB */
2:
        btst    #3,1(a3)
        bne.b   2b                      /* wait for not VB */

        .rept   5
        move.w  d0,(a2)
        .endr

        .rept   94
        nop
        .endr

        /* Execute DMA */
        move.l  #0x934094AD,(a3)        /* DMALEN LO/HI = 0xAD40 (198*224) */
        move.l  #0x95009600,(a3)        /* DMA SRC LO/MID */
        move.l  #0x97108114,(a3)        /* DMA SRC HI/MODE, Turn off Display */
        move.l  #0xC0000080,(a3)        /* start DMA */
        /* CPU is halted until DMA is complete */

        /* display finished, check request switch ram banks */
        btst    #7,0xA1200F
        beq.b   4f
        bset    #1,0xA12003             /* request switch ram banks */
3:
        btst    #1,0xA12003
        bne.b   3b                      /* wait for bank switch */
4:
        /* do other tasks here */
        pea     0.w
        bsr.w   get_pad
        addq.l  #4,sp
        move.w  d0,0xA12018

        pea     1.w
        bsr.w   get_pad
        addq.l  #4,sp
        move.w  d0,0xA1201A

        move.l  0xA1201C,d0
        addq.l  #1,d0
        move.l  d0,0xA1201C             /* increment ticks */

        move.w  0xA12000,d0
        ori.w   #0x0100,d0
        move.w  d0,0xA12000             /* generate level 2 int for CD */

        moveq   #0x7F,d0
        and.b   0xA1200F,d0
        cmpi.b  #MD_CMD_DMA_SCREEN,d0
        bne.b   exit_wide               /* no longer requesting DMA color display */
        bra.w   dma_wide
exit_wide:
        bsr.w   md_init_hw
        movem.l (sp)+,d2-d7/a2-a6
        rts

        /* loop - turn on display and wait for vblank */
dma_narrow:
        moveq   #0,d0
        move.w  #0x8154,(a3)            /* turn on display (no VB int, V28 mode) */
        move.l  #0x40000000,(a3)        /* write vram address 0 */
1:
        btst    #3,1(a3)
        beq.b   1b                      /* wait for VB */
2:
        btst    #3,1(a3)
        bne.b   2b                      /* wait for not VB */

        .rept   4
        nop
        .endr

        .rept   5
        move.w  d0,(a2)
        .endr

        .rept   92
        nop
        .endr

        /* Execute DMA */
        move.l  #0x93E0948C,(a3)        /* DMALEN LO/HI = 0x8CE0 (161*224) */
        move.l  #0x95009600,(a3)        /* DMA SRC LO/MID */
        move.l  #0x97108114,(a3)        /* DMA SRC HI/MODE, Turn off Display */
        move.l  #0xC0000080,(a3)        /* start DMA */
        /* CPU is halted until DMA is complete */

        /* display finished, check request switch ram banks */
        btst    #7,0xA1200F
        beq.b   4f
        bset    #1,0xA12003             /* request switch ram banks */
3:
        btst    #1,0xA12003
        bne.b   3b                      /* wait for bank switch */
4:
        /* do other tasks here */
        pea     0.w
        bsr.w   get_pad
        addq.l  #4,sp
        move.w  d0,0xA12018

        pea     1.w
        bsr.w   get_pad
        addq.l  #4,sp
        move.w  d0,0xA1201A

        move.l  0xA1201C,d0
        addq.l  #1,d0
        move.l  d0,0xA1201C             /* increment ticks */

        move.w  0xA12000,d0
        ori.w   #0x0100,d0
        move.w  d0,0xA12000             /* generate level 2 int for CD */

        moveq   #0x7F,d0
        and.b   0xA1200F,d0
        cmpi.b  #MD_CMD_DMA_SCREEN,d0
        bne.b   exit_narrow             /* no longer requesting DMA color display */
        bra.w   dma_narrow
exit_narrow:
        bsr.w   md_init_hw
        movem.l (sp)+,d2-d7/a2-a6
        rts
Note that I now assume the SCD word RAM buffer is always at 0x200000. That's not really a problem, since that's where it's normally going to be. You could change that if need be, and it's easy to deal with on either side. Notice that I set the DMA length to the correct value for H32 mode (161*224 = 0x8CE0). Note also that I use the nops and do the other tasks before waiting for the vblank. The nops right after the vblank do the trick for making H32 mode stable. :D
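For anyone who wants to retarget the buffer address or the DMA length, here is a minimal host-side sketch in C of how the constants in the listing can be derived. The function names are hypothetical (they are not part of the posted code); the bit layouts are the usual VDP register-write and control-word formats, and the 161 words per line figure is simply taken from the comment in the listing.

```c
#include <assert.h>
#include <stdint.h>

/* VDP register write word: 100R RRRR DDDD DDDD. */
uint16_t vdp_reg_word(unsigned reg, unsigned value)
{
    assert(reg < 32 && value <= 0xFF);
    return (uint16_t)(0x8000u | (reg << 8) | value);
}

/* Two register writes packed into one long, as in "move.l #...,(a3)". */
uint32_t vdp_reg_pair(unsigned reg_a, unsigned val_a,
                      unsigned reg_b, unsigned val_b)
{
    return ((uint32_t)vdp_reg_word(reg_a, val_a) << 16)
         | vdp_reg_word(reg_b, val_b);
}

/* DMA length in words goes to reg 19 (low byte) and reg 20 (high byte). */
uint32_t dma_length_long(unsigned words_per_line, unsigned lines)
{
    unsigned len = words_per_line * lines;
    assert(len <= 0xFFFF);
    return vdp_reg_pair(19, len & 0xFF, 20, len >> 8);
}

/* DMA source is stored as (address >> 1) across regs 21, 22 and 23. */
void dma_source_words(uint32_t addr, uint16_t out[3])
{
    uint32_t half = addr >> 1;
    out[0] = vdp_reg_word(21, half & 0xFF);
    out[1] = vdp_reg_word(22, (half >> 8) & 0xFF);
    out[2] = vdp_reg_word(23, (half >> 16) & 0x7F); /* bit 7 clear: 68K to VDP */
}

/* Control port long: CD1 CD0 A13..A0 in the first word,
   CD5..CD2 in bits 7..4 and A15 A14 in bits 1..0 of the second. */
uint32_t vdp_ctrl_long(unsigned cd, unsigned addr)
{
    return ((uint32_t)(cd & 0x03) << 30)
         | ((uint32_t)(addr & 0x3FFF) << 16)
         | ((uint32_t)(cd & 0x3C) << 2)
         | ((addr >> 14) & 0x03);
}
```

With these, dma_length_long(161, 224) gives 0x93E0948C, dma_source_words(0x200000, ...) gives 0x9500, 0x9600 and 0x9710, vdp_ctrl_long(0x01, 0) gives the VRAM write 0x40000000, and vdp_ctrl_long(0x23, 0) (CRAM write with the DMA bit set) gives 0xC0000080, which are the values used in the listing.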

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue Nov 13, 2012 9:50 pm

Looking back at this thread, I just wanted to say sorry Mask of Destiny for hijacking/derailing your thread. I should have created a separate thread for the direct colour demo stuff.

With the timing info I posted, I'm pretty confident I have accurate timings for everything that's externally visible, but exactly when the VDP accesses VSRAM and the sprite cache is important information that we still don't have. I'm keen to try some of those tests to try and identify when access to the sprite cache occurs. I'm not sure how to go about testing access to VSRAM though. Any ideas?

Eke
Very interested
Posts: 884
Joined: Wed Feb 28, 2007 2:57 pm

Post by Eke » Tue Nov 13, 2012 10:40 pm

I remember some games (I think Road Rash) updating vscroll during the HINT callback (during hblank) and expecting the scroll value to take effect on the next line only, not the coming line, so you might want to try that.

I am also pretty sure the vscroll mode makes some difference too: see my post in the Regen forum about Batman & Robin. That game uses 2-cell scrolling and continuously updates vscroll entries during active display BUT expects the writes to take effect on the next line only. The write timing is arranged so that each write occurs after the associated 2-cell column has already been rendered.

About the various register latches, thank you for posting your info: this seems to fit with what I figured out from analysis (the more you think about it, the VDP is a very logical design), with one difference: are you sure V28/V30 mode cannot be switched during active display, just like mode 4 switching can be used during active display to change the screen height? Also, what happens if V30 mode is set at the start of vblank and then reset to V28 before the end of vblank? I think I tested it on my PAL Mega Drive but can't remember more; my issue was actually figuring out how the bottom/top borders were affected by this kind of switching...

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Wed Nov 14, 2012 12:33 am

Eke wrote:About the various register latches, thank you for posting your info: this seems to fit with what I figured out from analysis (the more you think about it, the VDP is a very logical design), with one difference: are you sure V28/V30 mode cannot be switched during active display, just like mode 4 switching can be used during active display to change the screen height? Also, what happens if V30 mode is set at the start of vblank and then reset to V28 before the end of vblank? I think I tested it on my PAL Mega Drive but can't remember more; my issue was actually figuring out how the bottom/top borders were affected by this kind of switching...
Now that's an interesting thought... you can't use V30 mode on NTSC because it uses too many lines, but if you could switch to V28 mode before it screwed up the vertical period, you could get nearly V30 height on NTSC.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Wed Nov 14, 2012 12:47 am

The vscroll data latching definitely needs more testing. I'll try and construct some tests to interrogate the timing. As for the V28/V30 mode, I can re-test, but I'm quite confident this register is only latched at the start of vblank, at the same point where the interlace mode settings are latched. I performed these tests by actively toggling the target setting back and forth infinitely during execution. For the V28/V30 setting, I only ever got complete, stable frames of either V28 or V30 height with this test.

Oerg866
Very interested
Posts: 211
Joined: Sat Apr 19, 2008 10:58 am
Location: Frankfurt, Germany

Post by Oerg866 » Wed Nov 14, 2012 2:01 pm

You guys rock. Keep it up :)

My current project is to exploit any possibility of doing calculations at all during the free cycles.

Nemesis,

do you know whether (and if so, how much) ending the DMA early, to fill only, say, half of the screen, will affect the timing? This could lead to interesting results by mixing the direct color technique with an ordinary display, for scrollers or whatever.

Oerg866
Very interested
Posts: 211
Joined: Sat Apr 19, 2008 10:58 am
Location: Frankfurt, Germany

Post by Oerg866 » Thu Feb 14, 2013 1:55 am

Haha what do you know, it works :)

Image

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Thu Feb 14, 2013 6:31 am

Nifty! Seems like a smaller direct color display over a normal display is possible, which means the mode is usable for MD games vs MCD games (something like Zero Tolerance).

I suppose the display in the bottom is from the "proper" line of the name table? I would guess the vertical count (which clearly continues while the display is off or you wouldn't get the vblank at the proper time) determines where the display is fetched from.

Charles MacDonald
Very interested
Posts: 292
Joined: Sat Apr 21, 2007 1:14 am

Post by Charles MacDonald » Fri Feb 15, 2013 11:31 pm

It's probably not too useful:

With shadow/highlight mode enabled, if you switch into V30 mode at any point in the active display and then into V28 mode on line $E1 or later, the bottom border area of the current frame and the top border of the next frame are shown at half brightness instead of normal brightness.

However it's only the center 320-pixel area associated with the active display that is affected, the left and right horizontal borders are still at normal brightness.

You have to use some caution, as switching into V30 during line $E0 makes the V-Blank interrupt occur later, at line $F0, but then when you switch back you are already past the V-Blank interrupt's trigger point, so it doesn't happen at all.

If you try this, remember that you need to be in V28 mode by line $EA at the latest, else the V30 mode setting is (I guess) latched and takes effect for the next frame.

There seems to be another latching point later, around $F5 or $F6: if you are in V30 mode at that point, the next frame is blanked black, though you don't get the synchronization distortion you'd get at line $EA. So for whatever reason, maybe there are two times at which the VDP samples the V28/V30 state to make some internal decisions.

Generally speaking, turn V30 mode on after line 0 and off before line $EA.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Sat Feb 16, 2013 1:50 am

That's still 234 lines on NTSC... I'd probably just go one more cell for 232 lines. I'd combine functions and not just switch modes, but then jump to the vblank code. No need to even have a vblank interrupt.
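To make the numbers concrete, here is a small host-side sketch in C of the rule of thumb Charles MacDonald gives above (V30 on after line 0, back to V28 before line $EA) together with the cell arithmetic. The function names are made up for illustration, the base value 0x54 is just the mode register 2 setting from the earlier listing, and this only models the rule as stated in the post; it has not been checked against hardware.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    V30_FIRST_SAFE_LINE = 0x01, /* "on after line 0" */
    V30_OFF_DEADLINE    = 0xEA, /* must be back in V28 by this line */
    V30_BIT             = 0x08, /* mode register 2, bit 3 selects V30 */
    LINES_PER_CELL      = 8
};

/* True if the rule of thumb allows V30 to be active on this line. */
bool v30_allowed_on_line(unsigned line)
{
    return line >= V30_FIRST_SAFE_LINE && line < V30_OFF_DEADLINE;
}

/* Register-write word for mode register 2 (reg 1) on a given line:
   V30 is set inside the window and cleared outside it. */
unsigned mode2_word_for_line(unsigned base_value, unsigned line)
{
    assert(base_value <= 0xFF);
    unsigned value = v30_allowed_on_line(line)
                   ? (base_value | V30_BIT)
                   : (base_value & ~(unsigned)V30_BIT);
    return 0x8100u | value;
}

/* Number of whole 8-line cells that fit in a given line budget. */
unsigned whole_cells(unsigned lines)
{
    return lines / LINES_PER_CELL;
}
```

Lines $00 to $E9 make 0xEA = 234 lines, and whole_cells(0xEA) is 29, i.e. 29 * 8 = 232 lines, one cell more than the 28 cells of V28.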
