Super VDP
Moderator: BigEvilCorporation
-
You need to either avoid using the 68K->SH2 "DMA" channel, or add error detection to it as it's buggy. Real games that use the 68K->SH2 DMA do retries for when the transfer fails. My own experiments have shown it's really damn buggy. I go with a processor polled exchange of data through the communication registers instead.
Redesigning the software might be better. Instead of transferring the data from the 68000 side, perhaps it could be stored in the 32X framebuffer. Have code to manage the framebuffer memory, allocate space for the tiles and tile map, fill it from the 68000 or SH2, and then just send commands to the SH2 side for things like scrolling and palette changes.
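For what it's worth, here is a minimal sketch in C of the kind of polled command exchange described above. It is only an illustration: the `comm_mailbox` struct stands in for a pair of communication registers, and the command numbers and function names are made up for the example, not taken from any real game or SDK. Plain variables replace the memory-mapped registers so it can be tried on a host machine.

```c
#include <stdint.h>

/* Stand-in for two 16-bit communication registers shared by both CPUs.
   ASSUMPTION: on hardware these would be memory-mapped; here they are
   ordinary variables so the sketch can run anywhere. */
typedef struct {
    volatile uint16_t cmd;  /* 0 = idle, nonzero = command pending */
    volatile uint16_t arg;  /* argument for the pending command */
} comm_mailbox;

/* Hypothetical command numbers, invented for this example. */
enum { CMD_NONE = 0, CMD_SET_SCROLL_X = 1, CMD_SET_SCROLL_Y = 2 };

/* Sender side: returns 1 if the command was posted, 0 if the previous
   command has not been consumed yet (caller polls and tries again). */
static int mailbox_try_post(comm_mailbox *mb, uint16_t cmd, uint16_t arg)
{
    if (mb->cmd != CMD_NONE)
        return 0;
    mb->arg = arg;  /* write the argument first ... */
    mb->cmd = cmd;  /* ... then publish the command word */
    return 1;
}

/* Receiver side: returns 1 and fills *cmd / *arg if a command was
   pending, 0 otherwise. Clearing cmd acknowledges the command. */
static int mailbox_try_take(comm_mailbox *mb, uint16_t *cmd, uint16_t *arg)
{
    uint16_t pending = mb->cmd;
    if (pending == CMD_NONE)
        return 0;
    *cmd = pending;
    *arg = mb->arg;
    mb->cmd = CMD_NONE;
    return 1;
}
```

The argument is written before the command word so that the receiver should not pick up a command with a stale argument. On real hardware each side would simply spin on its call until it returns 1.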
-
Yeah, re-trying DMA requests sounds like a pain, but the second idea seems like it's worth a shot. I'll post results once I have time to try it out.
-
I've thought about trying my hand at a SuperVDP implementation as well, but have been swamped with other things.
Anywho, there are two modes for the 68K->SH2 DMA channel in the 32X IO chip. The first mode is meant for carts, and uses the 68000 to store data to the FIFO to trigger DMA operations. This works somewhat... you need to keep lengths short (no more than a couple hundred bytes), checksum the data sent, use a timeout on SH2 DMA, and retry on checksum errors and timeouts.
The second mode is meant for CD32X games, and uses an MD VDP DMA operation to read the CD word RAM (when switched to the MD side); you set the VDP increment to 0 so that it doesn't actually write more than one word of VRAM. The 32X IO chip then recognizes the DMA read to trigger DMA operations. This doesn't work at all... Sega engineers couldn't get the timing right in time for the production run of the 32X chipset, so all you get is garbage.
I suspect that in trying to get the timing for CD transfers to work, they slightly screwed up the timing for 68000 transfers, which is where the corruption comes from.
In any case, existing CD32X software uses the 68000 transfer as well since the mode meant for the CD doesn't work.
Re: Super VDP
In the end, Chilly Willy was able to get DREQ DMA to work by changing his code from this:
Code:
SH2_DMA_CHCR0; // read TE
SH2_DMA_CHCR0 = 0; // clear TE
SH2_DMA_DAR0 = 0x20000000 | (int)&data[0];
SH2_DMA_TCR0 = j;
SH2_DMA_CHCR0 = 0x44E1; // set DMA control mode - start DMA
MARS_SYS_COMM0 = 0x0003; // SH2 DMA started
to this:
Code:
SH2_DMA_CHCR0; // read TE
SH2_DMA_CHCR0 = 0x44E0; // clear TE
SH2_DMA_DAR0 = 0x20000000 | (int)&data[0];
SH2_DMA_TCR0 = j;
SH2_DMA_CHCR0 = 0x44E1; // set DMA control mode - start DMA
MARS_SYS_COMM0 = 0x0003; // SH2 DMA started
Seems to work fine without any delays on the 68k side.
Re: Super VDP
The only difference I spot is in how TE is cleared.
It looks like the common bits ($44Ex) must stay the same whenever you write to CHCR.
So, when clearing TE (aborting the current DMA), don't do this:
Code:
SH2_DMA_CHCR0; // read TE
SH2_DMA_CHCR0 = 0; // clear TE
but do this instead:
Code:
SH2_DMA_CHCR0; // read TE
SH2_DMA_CHCR0 = 0x44E0; // clear TE
I did not find anything related to this in the Hitachi / Renesas SH7604 Hardware Manual*,
nor in the Sega 32X Hardware Manual,
so it's a great piece of knowledge.
Thank you Vic and Chilly Willy ^^
* : the only thing about writing to CHCR is in the Usage notes (9.5) : “Before rewriting CHCR0, CHCR1, DRCR0, and DRCR1, first clear the DE bit for the specified channel to 0 or clear the DME bit in DMAOR to 0”.
Re: Super VDP
On page 240, the manual says "To clear the TE bit, read 1 from it and then write 0." <- it wants you to clear the bit, not the whole register. At least that's what we assume it wants, based on how real hardware seems to work.
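A rough sketch of that "clear the bit, not the register" idea in C, in case it helps. Assumptions: DE is bit 0 (which matches the 0x44E1 vs 0x44E0 values posted earlier) and TE is bit 1; the macro names and the `chcr_clear_te` helper are invented for this example, and the register is passed in as a pointer so the function can be tried against an ordinary variable.

```c
#include <stdint.h>

#define CHCR_DE 0x0001u  /* DMA enable (bit 0, per the 0x44E1/0x44E0 values above) */
#define CHCR_TE 0x0002u  /* transfer end (ASSUMED to be bit 1) */

/* Hypothetical helper: clear TE (and DE) while leaving the transfer
   mode bits untouched, instead of writing 0 to the whole register. */
static void chcr_clear_te(volatile uint32_t *chcr)
{
    uint32_t value = *chcr;                /* read first, so a set TE is read as 1 */
    *chcr = value & ~(CHCR_TE | CHCR_DE);  /* write 0 to TE/DE only, keep the rest */
}
```

Note this differs slightly from the hard-coded `0x44E0` write: if the channel was never programmed, the mode bits read back as whatever they were, so the hard-coded value may still be the safer choice on real hardware.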
Re: Super VDP
Note that this only works in cases where the main CPU is kept off the bus and is waiting in the hot loop for the DMA to complete. Otherwise, you're still going to get data loss and/or corruption. If you want to perform DMA transfer without halting the CPU, here's what I found works at least to a certain degree: there's no data loss, just data corruption: you need to modify the write loop on the 68000 by inserting additional nops after each write:
Code:
2:
move.w (a0)+,(a1) /* FIFO = next word */
nop
nop
nop
nop
nop
nop
nop
move.w (a0)+,(a1)
nop
nop
nop
nop
nop
nop
nop
move.w (a0)+,(a1)
nop
nop
nop
nop
nop
nop
move.w (a0)+,(a1)
3:
btst #7,0xA15107 /* check FIFO full flag */
bne.b 3b
dbra d0,2b
Shit's going to be pretty slow compared to the normal version without the nops, so combined with data corruption I don't think this path is worth pursuing.
This version also seems to work in the sense that it's fast, there's no data loss, just occasional corruption:
Code:
2:
move.w (a0)+,(a1) /* FIFO = next word */
move.w (a0)+,(a1)
move.w (a0)+,(a1)
move.w (a0)+,(a1)
3:
btst #7,0xA15107 /* check FIFO full flag */
bne.b 3b
dbra d0,2b
lea (-8,a0),a0
move.w (a0)+,(a1)
move.w (a0)+,(a1)
move.w (a0)+,(a1)
move.w (a0)+,(a1)
4:
btst #7,0xA15107 /* check FIFO full flag */
bne.b 4b
lea (-8,a0),a0
move.w (a0)+,(a1)
move.w (a0)+,(a1)
move.w (a0)+,(a1)
move.w (a0)+,(a1)
Re: Super VDP
I can't remember who (Chilly?), but someone once said to send a small amount of data (circa 200 B, I guess; this way, without the NOPs), then compute a checksum, and if the checksum is wrong, send the data again.
That's the way I did it. Ugly, slow, so not useful.
#ItShouldNotBeThatHard #ThereSGottaBeABetterWay
Edit: Chilly Indeed
Chilly Willy wrote: Wed Oct 31, 2012 1:21 am
A work around would be to send packets of data (say 256 words), don't check the FIFO full flag, and send extra words at the end to make sure the SH2 DMA finishes. Then put a checksum for the packet into the COMM registers. If the checksum is good, go on to the next packet, and if it's bad, retry the packet.
It was … thirteen friggin' years ago!!
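To make the packet-plus-checksum workaround concrete, here is a small host-side sketch in C. Everything in it is hypothetical: the additive 16-bit checksum is just one simple choice, `send_packet_fn` is an invented hook standing in for "push the words through the FIFO and read back the checksum the other side reports", and `flaky_link_send` is a fake transport that corrupts the first few attempts so the retry path can be exercised.

```c
#include <stddef.h>
#include <stdint.h>

/* Simple 16-bit additive checksum over a packet of words
   (an assumed choice; any checksum both sides agree on would do). */
static uint16_t packet_checksum(const uint16_t *words, size_t count)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum = (uint16_t)(sum + words[i]);
    return sum;
}

/* Hypothetical transport hook: sends one packet and returns the
   checksum reported by the receiving side. */
typedef uint16_t (*send_packet_fn)(const uint16_t *words, size_t count, void *ctx);

/* Retries the packet until the reported checksum matches.
   Returns the number of attempts used, or 0 if all attempts failed. */
static int send_with_retry(const uint16_t *words, size_t count,
                           send_packet_fn send, void *ctx, int max_attempts)
{
    uint16_t expected = packet_checksum(words, count);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (send(words, count, ctx) == expected)
            return attempt;
    }
    return 0;
}

/* Fake transport for testing: ctx points to an int holding how many
   upcoming transfers should come back with a wrong checksum. */
static uint16_t flaky_link_send(const uint16_t *words, size_t count, void *ctx)
{
    int *bad_left = (int *)ctx;
    uint16_t sum = packet_checksum(words, count);
    if (*bad_left > 0) {
        (*bad_left)--;
        return (uint16_t)(sum ^ 0x0100);  /* pretend a bit flipped in flight */
    }
    return sum;
}
```

On the real machine the checksum comparison would happen on the SH2 side and the verdict would travel back through the COMM registers; the sketch folds that round trip into the return value of the hook to keep it short.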
Re: Super VDP
Yes, checksumming on the MD's 68000 is going to be slow, unless you incorporate checksums into your data in a pre-processing step.
Stupid Sega didn't think of a "FIFO empty" flag. The "FIFO full" flag alone isn't terribly useful.
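The pre-processing idea mentioned above could look roughly like the following C sketch, meant to run in a build tool on the PC rather than on the console. It is an assumption-heavy illustration: the packet size, the additive checksum, and the `pack_with_checksums` name are all invented here. It copies the payload and appends one checksum word after every packet, so the 68000 only has to stream words and never has to add them up.

```c
#include <stddef.h>
#include <stdint.h>

#define PACKET_WORDS 4  /* hypothetical packet size, kept tiny for the example */

/* Copies `count` payload words from `in` to `out`, appending a 16-bit
   additive checksum after each group of PACKET_WORDS words (the last
   group may be shorter). Returns the number of words written.
   `out` must have room for count + ceil(count / PACKET_WORDS) words. */
static size_t pack_with_checksums(const uint16_t *in, size_t count, uint16_t *out)
{
    size_t written = 0;
    size_t i = 0;
    while (i < count) {
        size_t n = (count - i < PACKET_WORDS) ? (count - i) : PACKET_WORDS;
        uint16_t sum = 0;
        for (size_t k = 0; k < n; k++) {
            out[written++] = in[i + k];
            sum = (uint16_t)(sum + in[i + k]);
        }
        out[written++] = sum;  /* checksum word closes the packet */
        i += n;
    }
    return written;
}
```

The receiving side would then sum each packet as it arrives and compare against the trailing word, which keeps the slow arithmetic off the 68000 entirely.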