Super VDP

Ask anything you want about 32X Mushroom programming.

Moderator: BigEvilCorporation

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy »

You need to either avoid using the 68K->SH2 "DMA" channel, or add error detection to it as it's buggy. Real games that use the 68K->SH2 DMA do retries for when the transfer fails. My own experiments have shown it's really damn buggy. I go with a processor polled exchange of data through the communication registers instead.

Redesigning the software might be better. Instead of transferring the data from the 68000 side, perhaps it might be stuck into the 32X framebuffer instead. Have code to manage the frame buffer memory, allocate space for the tiles and tile map, fill it from the 68000 or SH2, and then just send commands to the SH2 side for things like scrolling and palette changes.
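
For what it's worth, a processor-polled exchange through the communication registers might look roughly like the sketch below. It is written against a plain struct so it can be tried on a host machine. The two-word layout (one flag word, one data word), the flag values, and the names `comm_mailbox`, `comm_try_send`, and `comm_try_recv` are all assumptions for illustration, not anything from the 32X SDK or from the code discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mailbox: two 16-bit words standing in for a pair of COMM
 * registers. On hardware these would be the memory-mapped registers. */
typedef struct {
    volatile uint16_t flag;  /* assumed protocol: 0 = empty, 1 = word waiting */
    volatile uint16_t data;
} comm_mailbox;

/* Sender side: publish one word if the mailbox is empty. Returns false when
 * the receiver has not consumed the previous word yet, so the caller polls. */
static bool comm_try_send(comm_mailbox *mb, uint16_t word)
{
    if (mb->flag != 0)
        return false;
    mb->data = word;  /* write the payload first */
    mb->flag = 1;     /* then signal that it is valid */
    return true;
}

/* Receiver side: take one word if present, then acknowledge it. */
static bool comm_try_recv(comm_mailbox *mb, uint16_t *out)
{
    if (mb->flag != 1)
        return false;
    *out = mb->data;
    mb->flag = 0;     /* acknowledge; the sender may write the next word */
    return true;
}
```

Each side would call its function in a loop until it returns true. It moves one word per handshake, so it is slow, but every word is explicitly acknowledged.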
ehaliewicz
Very interested
Posts: 50
Joined: Tue Dec 24, 2013 1:00 am

Post by ehaliewicz »

Chilly Willy wrote:You need to either avoid using the 68K->SH2 "DMA" channel, or add error detection to it as it's buggy. Real games that use the 68K->SH2 DMA do retries for when the transfer fails. My own experiments have shown it's really damn buggy. I go with a processor polled exchange of data through the communication registers instead.

Redesigning the software might be better. Instead of transferring the data from the 68000 side, perhaps it might be stuck into the 32X framebuffer instead. Have code to manage the frame buffer memory, allocate space for the tiles and tile map, fill it from the 68000 or SH2, and then just send commands to the SH2 side for things like scrolling and palette changes.
Yeah, re-trying DMA requests sounds like a pain, but the second idea seems like it's worth a shot. I'll post results once I have time to try it out.
Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy »

I've thought about trying my hand at a SuperVDP implementation as well, but have been swamped with other things.

Anywho, there are two modes for the 68K->SH2 DMA channel in the 32X IO chip.

The first mode is meant for carts, and uses the 68000 to store data to the FIFO to trigger DMA operations. This works somewhat... you need to keep lengths short (no more than a couple hundred bytes), checksum the data sent, use a timeout on SH2 DMA, and retry on checksum errors and timeouts.

The second mode is meant for CD32X games, and uses a MD VDP DMA operation to read the CD word ram (when switched to the MD side); you set the VDP increment to 0 so that it doesn't actually write more than one word of vram. The 32X IO chip then recognizes the DMA read to trigger DMA operations. This doesn't work at all... Sega engineers couldn't get the timing right in time for the production run of the 32X chipset, so all you get is garbage. I suspect that in trying to get the timing for CD transfers to work, they slightly screwed up the timing for 68000 transfers, which is where the corruption comes from.

In any case, existing CD32X software uses the 68000 transfer as well since the mode meant for the CD doesn't work.
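
The checksum-and-retry scheme for the 68000-fed mode could be sketched on the receiving side roughly as follows. This is only an illustration under stated assumptions: the additive 16-bit checksum is my choice (the post does not say which checksum to use), `transfer_fn` is a hypothetical callback that performs one DMA attempt with its own timeout, and `fake_link` / `fake_transfer` are host-side stand-ins so the loop can be tried without hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Simple 16-bit additive checksum (an assumption, not from the thread). */
static uint16_t checksum16(const uint16_t *words, size_t count)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum = (uint16_t)(sum + words[i]);
    return sum;
}

/* Hypothetical callback: one transfer attempt into dst.
 * Returns 0 when the transfer completed, nonzero on timeout. */
typedef int (*transfer_fn)(uint16_t *dst, size_t count, void *ctx);

/* Retries on timeouts and checksum mismatches.
 * Returns 0 on success, -1 if all attempts failed. */
static int receive_with_retry(transfer_fn xfer, void *ctx, uint16_t *dst,
                              size_t count, uint16_t expected_sum,
                              int max_attempts)
{
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        if (xfer(dst, count, ctx) != 0)
            continue;                       /* timed out: try again */
        if (checksum16(dst, count) == expected_sum)
            return 0;                       /* data arrived intact */
    }
    return -1;
}

/* Host-side stand-in for the real transfer: copies src to dst and corrupts
 * one word on the first bad_attempts calls. */
typedef struct {
    const uint16_t *src;
    int bad_attempts;
    int calls;
} fake_link;

static int fake_transfer(uint16_t *dst, size_t count, void *ctx)
{
    fake_link *link = (fake_link *)ctx;
    for (size_t i = 0; i < count; i++)
        dst[i] = link->src[i];
    if (link->calls++ < link->bad_attempts && count > 0)
        dst[0] ^= 0x0100;                   /* simulate a corrupted word */
    return 0;
}
```

Keeping packets short, as suggested above, keeps the cost of each retry small.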
Vic
Interested
Posts: 31
Joined: Wed Nov 03, 2021 6:01 pm

Re: Super VDP

Post by Vic »

In the end, Chilly Willy was able to get DREQ DMA to work by changing his code from this:

        SH2_DMA_CHCR0; // read TE
        SH2_DMA_CHCR0 = 0; // clear TE
        SH2_DMA_DAR0 = 0x20000000 | (int)&data[0];
        SH2_DMA_TCR0 = j;
        SH2_DMA_CHCR0 = 0x44E1; // set DMA control mode - start DMA
        MARS_SYS_COMM0 = 0x0003; // SH2 DMA started
to this:

        SH2_DMA_CHCR0; // read TE
        SH2_DMA_CHCR0 = 0x44E0; // clear TE
        SH2_DMA_DAR0 = 0x20000000 | (int)&data[0];
        SH2_DMA_TCR0 = j;
        SH2_DMA_CHCR0 = 0x44E1; // set DMA control mode - start DMA
        MARS_SYS_COMM0 = 0x0003; // SH2 DMA started
Seems to work fine without any delays on the 68k side.
ob1
Very interested
Posts: 467
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Re: Super VDP

Post by ob1 »

The only difference I spot is when clearing TE.
It looks like, when writing to CHCR, the common bits ($44Ex) must stay the same.
So, when clearing TE (aborting the current DMA), don't do this:

        SH2_DMA_CHCR0; // read TE
        SH2_DMA_CHCR0 = 0; // clear TE
, but do this instead:

        SH2_DMA_CHCR0; // read TE
        SH2_DMA_CHCR0 = 0x44E0; // clear TE
I did not find anything related to this in the Hitachi / Renesas SH7604 Hardware Manual*,
I did not find anything related to this in the Sega 32X Hardware Manual,
so it's a great piece of knowledge.
Thank you Vic and Chilly Willy ^^

* : the only thing about writing to CHCR is in the Usage notes (9.5) : “Before rewriting CHCR0, CHCR1, DRCR0, and DRCR1, first clear the DE bit for the specified channel to 0 or clear the DME bit in DMAOR to 0”.
Vic
Interested
Posts: 31
Joined: Wed Nov 03, 2021 6:01 pm

Re: Super VDP

Post by Vic »

On page 240 of the manual it's said that "To clear the TE bit, read 1 from it and then write 0." <- it wants you to clear the bit, not the whole register. At least that's what we assume it wants based on how real hardware seems to work :)
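
A read-modify-write version of "clear the bit, not the whole register" might look like the sketch below. The bit positions (DE as bit 0, TE as bit 1) are my reading of the SH7604 manual and are consistent with the 0x44E0 / 0x44E1 constants used earlier in the thread, but treat them and the helper name `chcr_clear_te` as assumptions. The register is passed as a pointer so the function can be tried on a host.

```c
#include <stdint.h>

/* CHCR bit positions, assumed from the SH7604 manual. */
#define CHCR_DE 0x00000001u  /* DMA enable */
#define CHCR_TE 0x00000002u  /* transfer-end flag */

/* Reads CHCR (the manual wants TE read as 1 before it is cleared), then
 * writes the value back with TE and DE cleared and every other mode bit
 * left as it was. */
static void chcr_clear_te(volatile uint32_t *chcr)
{
    uint32_t value = *chcr;
    *chcr = value & ~(CHCR_TE | CHCR_DE);
}
```

If CHCR reads back 0x44E3 after a finished transfer, this leaves 0x44E0, the same value the working code writes. Before the first transfer the register has not been configured yet, so writing the full constant as in the working code also covers that case.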
Vic
Interested
Posts: 31
Joined: Wed Nov 03, 2021 6:01 pm

Re: Super VDP

Post by Vic »

Note that this only works in cases where the main CPU is kept off the bus and is waiting in the hot loop for the DMA to complete. Otherwise, you're still going to get data loss and/or corruption. If you want to perform the DMA transfer without halting the CPU, here's what I found works, at least to a certain degree (there's no data loss, just data corruption): modify the write loop on the 68000 by inserting additional nops after each write:

2:
        move.w  (a0)+,(a1)              /* FIFO = next word */
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        move.w  (a0)+,(a1)
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        move.w  (a0)+,(a1)
        nop
        nop
        nop
        nop
        nop
        nop
        move.w  (a0)+,(a1)
3:
        btst    #7,0xA15107             /* check FIFO full flag */
        bne.b   3b
        dbra    d0,2b
Shit's going to be pretty slow compared to the normal version without the nops, so combined with data corruption I don't think this path is worth pursuing.

This version also seems to work in the sense that it's fast, there's no data loss, just occasional corruption:

2:
        move.w  (a0)+,(a1)              /* FIFO = next word */
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
3:
        btst    #7,0xA15107             /* check FIFO full flag */
        bne.b   3b
        dbra    d0,2b

        lea     (-8,a0),a0
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
4:        
        btst    #7,0xA15107             /* check FIFO full flag */
        bne.b   4b

        lea     (-8,a0),a0
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
        move.w  (a0)+,(a1)
ob1
Very interested
Posts: 467
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Re: Super VDP

Post by ob1 »

I can't remember who (Chilly?), but someone once said to send a small amount of data (circa 200 bytes, I guess this way, without the NOPs), then compute a checksum, and if the checksum is wrong, send the data again.
That's the way I did it. Ugly and slow, so not very useful.
#ItShouldNotBeThatHard #ThereSGottaBeABetterWay

Edit: Chilly indeed.
Chilly Willy wrote: Wed Oct 31, 2012 1:21 am A work around would be to send packets of data (say 256 words), don't check the FIFO full flag, and send extra words at the end to make sure the SH2 DMA finishes. Then put a checksum for the packet into the COMM registers. If the checksum is good, go on to the next packet, and if it's bad, retry the packet.
It was … thirteen friggin' years ago!!
Vic
Interested
Posts: 31
Joined: Wed Nov 03, 2021 6:01 pm

Re: Super VDP

Post by Vic »

Yes, checksumming on the MD's 68000 is going to be slow, unless you incorporate checksums into your data in a pre-processing step.
Stupid Sega didn't think of a "FIFO empty" flag. The "FIFO full" flag alone isn't terribly useful.
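
Incorporating checksums in a pre-processing step could be done with a small host-side tool along these lines. This is a sketch under assumptions: the additive 16-bit checksum, the layout (one checksum word appended after each packet), and the name `pack_with_checksums` are mine, not an existing tool or format.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies count words from src into dst in packets of packet_words words,
 * appending one additive checksum word after each packet.
 * Returns the number of words written, or 0 if packet_words is 0 or dst is
 * too small (dst may be partially written in that case). */
static size_t pack_with_checksums(const uint16_t *src, size_t count,
                                  size_t packet_words,
                                  uint16_t *dst, size_t dst_capacity)
{
    size_t out = 0;
    if (packet_words == 0)
        return 0;
    for (size_t pos = 0; pos < count; pos += packet_words) {
        size_t remaining = count - pos;
        size_t n = remaining < packet_words ? remaining : packet_words;
        if (out + n + 1 > dst_capacity)
            return 0;
        uint16_t sum = 0;
        for (size_t i = 0; i < n; i++) {
            dst[out++] = src[pos + i];
            sum = (uint16_t)(sum + src[pos + i]);
        }
        dst[out++] = sum;  /* checksum travels with the packet data */
    }
    return out;
}
```

With the checksum stored next to each packet, the 68000 only has to copy words, and the checking can be left to the receiving side.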