3d software rendering on megacd

Ask anything you want about Mega/SegaCD programming.

Moderator: Mask of Destiny

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Thu Nov 22, 2007 7:09 pm

Hi :)

I just wanted to start a little discussion about this (since it was asked about on a general Sega board)... Serious talk only, please.

Compared to software 3D rendering on the Mega Drive, the Sega CD has these advantages:
- Rendering CPU (sub-CPU) is 75% faster
- True 4bpp framebuffer
- Automatic bmp2tiles conversion
- While the main CPU DMAs the current frame and does other work, the sub-CPU can render the next frame.

I'm sticking with software rendering, since using the Mega CD rendering hardware takes away some nice features (namely rendering the next frame while the current one is being DMA'd). And even if it can be tweaked to render vector lines, I guess it would be tricky to get it rendering polygons...

So my rough estimate is that software rendering would be around 3 times faster, because:
- DMA is possible while the next frame is still being rendered.
- Direct framebuffer access (4bpp or 8bpp>4bpp)
- Overwrite/clear of the framebuffer may be possible in hardware
- No bmp2tiles conversion needed
- 75% faster CPU
- Since the main CPU is freer, it can even use the insane DMA-in-extended-VBLANK trick :D

Any input, confirmation, or more tricks to add? It's quite a shame no polygonal games were released for this hardware...

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Fri Nov 23, 2007 5:55 pm

I think you covered almost all the nice advantages of the Sega CD hardware.
It's true that all these features can speed up 3D rendering a lot !
You forgot something by the way : the font data converter.
It's a sort of 1 bit to 4 bits color converter.
You can use it for fast mask or font decompression :)

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Fri Nov 23, 2007 8:27 pm

Stef wrote: You forgot something by the way : the font data converter.
It's a sort of 1 bit to 4 bits color converter.
You can use it for fast mask or font decompression :)
I noticed that - that would be handy in minimizing the space taken by your fonts, not to mention being able to set the color of the font on the fly. The CD has a lot of nifty features to help the programmer.

One thing I didn't see an answer for... the timer in the CD: can it be used by programs, or is it reserved for the CD BIOS?

tomaitheous
Very interested
Posts: 256
Joined: Tue Sep 11, 2007 9:10 pm

Post by tomaitheous » Fri Nov 23, 2007 11:34 pm

Stef wrote: You forgot something by the way : the font data converter.
It's a sort of 1 bit to 4 bits color converter.
You can use it for fast mask or font decompression :)
Is that a BIOS routine or something that the ASIC can do (I didn't see it listed in the specs)?

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Sat Nov 24, 2007 3:12 am

tomaitheous wrote:
Stef wrote: You forgot something by the way : the font data converter.
It's a sort of 1 bit to 4 bits color converter.
You can use it for fast mask or font decompression :)
Is that a BIOS routine or something that the ASIC can do (I didn't see it listed in the specs)?
Part of the ASIC. It's in the manual, not the specs. You store the background and foreground colors to registers, then the font data to a register where each bit represents a pixel (on or off), and get back data in VDP format. It's basic color expansion hardware.
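
The colour expansion described above can be modelled in software. This is only a sketch of the idea, not the ASIC's register interface: the function name, the 8-pixel row width, and the MSB-first bit order are assumptions made for illustration.

```c
#include <stdint.h>

/* Software model of 1bpp -> 4bpp colour expansion (hypothetical helper,
 * not the real register interface).
 * bits:   one row of 8 font pixels, most significant bit = leftmost pixel.
 * fg, bg: 4-bit palette indices for "on" and "off" pixels.
 * Returns 8 packed 4bpp pixels, leftmost pixel in the top nibble. */
uint32_t expand_row_1bpp_to_4bpp(uint8_t bits, uint8_t fg, uint8_t bg)
{
    uint32_t row = 0;
    for (int i = 7; i >= 0; --i) {
        uint8_t colour = ((bits >> i) & 1u) ? (fg & 0x0Fu) : (bg & 0x0Fu);
        row = (row << 4) | colour;   /* append the next pixel's nibble */
    }
    return row;
}
```

Changing fg and bg between calls recolours the same 1bpp glyph data, which is the "set the color of the font on the fly" use mentioned earlier in the thread.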

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Sun Nov 25, 2007 12:17 am

Ha yeah, the fast 1bpp>4bpp thing...
There is also something obscure where you enter the full palette (16 colors, 9 bits per color)... Is that just a misunderstanding on my part, or is there a hardware fading/alpha capability when writing to the "framebuffer" (so the hardware finds the nearest matching color in the palette to fake fading)?

I say that because the game Batman & Robin (which uses the scaling hardware) has some crazy fade-in (from black) effects on sprites.

There is also the "write behind" or "write on top" or "write if !=0" feature, even more powerful than the 32X write modes. I suppose they are still available when writing in "software"...

--------------

I wanted to know precisely: in 1M RAM mode, the ASIC is pretty much deactivated, right? Because if it were still available for some operations, it could be used to clear the framebuffer or even write polygon lines...

Interesting talk, thx dudes!

tomaitheous
Very interested
Posts: 256
Joined: Tue Sep 11, 2007 9:10 pm

Post by tomaitheous » Sun Nov 25, 2007 11:21 pm

Fonzie wrote: There is also the "write behind" or "write on top" or "write if !=0" feature, even more powerful than the 32X write modes. I suppose they are still available when writing in "software"...
That's through the ASIC? For overlaying multiple objects on the frame buffer? I figured it had some sort of overlaying method from the looks of Adventures of Batman & Robin, I just didn't see it in the manual. Nice :D

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Mon Nov 26, 2007 8:14 am

tomaitheous wrote:
Fonzie wrote: There is also the "write behind" or "write on top" or "write if !=0" feature, even more powerful than the 32X write modes. I suppose they are still available when writing in "software"...
That's through the ASIC? For overlaying multiple objects on the frame buffer? I figured it had some sort of overlaying method from the looks of Adventures of Batman & Robin, I just didn't see it in the manual. Nice :D
The manual doesn't have it that I recall... it was in a bulletin if I remember correctly. It's only for the ASIC in 2M mode - the rendered stamps can be set to overwrite, underwrite, or write normally. At least, that's what I remember offhand.
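
As a rough illustration of the three write behaviours, here is a per-pixel software model. The semantics are an assumption based on the descriptions in this thread ("write on top" skips transparent source pixels, "write behind" only fills empty destination pixels); the enum and function names are made up for the sketch and are not hardware register names.

```c
#include <stdint.h>

/* Hypothetical names for the three behaviours discussed above. */
typedef enum { WRITE_NORMAL, WRITE_OVER, WRITE_UNDER } write_mode;

/* Combine one 4-bit source pixel with the existing destination pixel.
 * Colour 0 is treated as "empty/transparent" (assumption). */
uint8_t write_pixel(uint8_t dst, uint8_t src, write_mode mode)
{
    dst &= 0x0Fu;
    src &= 0x0Fu;
    switch (mode) {
    case WRITE_OVER:  return (src != 0) ? src : dst; /* new pixel on top   */
    case WRITE_UNDER: return (dst == 0) ? src : dst; /* new pixel behind   */
    default:          return src;                    /* plain overwrite    */
    }
}
```

Under this model, drawing objects front to back with WRITE_UNDER, or back to front with WRITE_OVER, gives the same layering without a separate masking pass.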

tomaitheous
Very interested
Posts: 256
Joined: Tue Sep 11, 2007 9:10 pm

Post by tomaitheous » Tue Nov 27, 2007 1:33 pm

I have a few of the bulletins; I'll sift through them. How old is the bulletin? I'd consider that bit of information fairly important if I were a developer.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Tue Nov 27, 2007 7:42 pm

tomaitheous wrote: I have a few of the bulletins; I'll sift through them. How old is the bulletin? I'd consider that bit of information fairly important if I were a developer.
Okay, it's in the Mega CD Software Development Manual. The manual you see floating around the most is the Hardware Manual. Anywho, the overwrite/underwrite settings are on page 10, the font registers are on page 18, and rotation/scaling is on pages 19 to 31. The Software Manual is also a much cleaner scan, so it's easy to read.

TascoDLX
Very interested
Posts: 262
Joined: Tue Feb 06, 2007 8:18 pm

Re: 3d software rendering on megacd

Post by TascoDLX » Wed Nov 28, 2007 7:35 am

Fonzie wrote: So my rough estimate is that software rendering would be around 3 times faster, because:
- DMA is possible while the next frame is still being rendered.
- Direct framebuffer access (4bpp or 8bpp>4bpp)
- Overwrite/clear of the framebuffer may be possible in hardware
- No bmp2tiles conversion needed
- 75% faster CPU
- Since the main CPU is freer, it can even use the insane DMA-in-extended-VBLANK trick :D

Any input, confirmation, or more tricks to add? It's quite a shame no polygonal games were released for this hardware...
Caution: you won't get 60 fps. You simply cannot copy the framebuffer to VRAM in time. If you limit the size of the framebuffer or make some tricky cuts, you *might* get 30 fps (20 fps is more realistic). In any case, don't expect the main CPU to be doing much more than idling through DMA transfers.

Note that if you want to render fullscreen, it'll take multiple frames to update the screen, which will result in page tearing. If you want to avoid tearing, you'll need two buffers in VRAM, which will eliminate the possibility of fullscreen (or maybe you can squeeze it in H32 cell mode). But you could avoid the need for two buffers if you implement a split screen. I don't know how that would look though.

3d object rendering, however, is definitely within the realm of possibility. There's a lot of good stuff to work with here. Keep it up!

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Re: 3d software rendering on megacd

Post by Chilly Willy » Wed Nov 28, 2007 8:21 am

TascoDLX wrote: Note that if you want to render fullscreen, it'll take multiple frames to update the screen, which will result in page tearing. If you want to avoid tearing, you'll need two buffers in VRAM, which will eliminate the possibility of fullscreen (or maybe you can squeeze it in H32 cell mode).
Well, H40 cell mode would need 40x28x2x2 = 4480 bytes for two name tables. From 65536 leaves 61056 bytes (we're not going to bother with sprites or scrolling in a game like this). 61056 / (160 *2) gives us two 320x190 size screens worth of storage. Or you could render to only 38 cells horizontally (304 pixels) to give two screens of 304x200. That would probably be best, and you wouldn't have to worry about overscan much either.

Doing 304x200 would probably look better than trying to go down to H32 cell mode, which would also reduce the DMA bandwidth as we've seen in other threads. If you wanted sprites or another plane, then you'd DEFINITELY have to go to H32 mode to free up enough space.
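
The arithmetic above can be checked with a small helper. This is just a sketch of the budget calculation: the function name is made up, and the 4480 bytes reserved for name tables is the figure from this post, passed in as a parameter.

```c
#include <stdint.h>

enum { VRAM_BYTES = 65536 };

/* Lines available per buffer when `buffers` 4bpp framebuffers share
 * whatever VRAM is left after `reserved_bytes` (name tables etc.). */
unsigned lines_per_buffer(unsigned width_px, unsigned reserved_bytes,
                          unsigned buffers)
{
    unsigned bytes_per_line = width_px / 2;   /* 4bpp = 2 pixels per byte */
    return (VRAM_BYTES - reserved_bytes) / (bytes_per_line * buffers);
}
```

With 4480 bytes reserved and two buffers, a 320-pixel width gives 190 lines and a 304-pixel width gives 200 lines. Since patterns are 8 lines tall, a real layout would round down to whole cell rows (190 becomes 184, while 200 is already 25 rows).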

tomaitheous
Very interested
Posts: 256
Joined: Tue Sep 11, 2007 9:10 pm

Post by tomaitheous » Wed Nov 28, 2007 3:53 pm

Is it possible to switch between H32 and H40 mode mid-screen and back again during VBLANK (right before active display)?

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Wed Nov 28, 2007 4:51 pm

tomaitheous wrote: Is it possible to switch between H32 and H40 mode mid-screen and back again during VBLANK (right before active display)?
No, it breaks the sync on TVs... (not on emulators, of course).
However, the following method works:
- Put the HINT (or whatever) on line 180 (the fewer lines you display, the more DMA power you get).
- The HINT stops the VDP display and starts the DMA.
- Once the DMA is finished, start the VDP display again and exit the HINT...

Like Stef said, starting a DMA at line 180 or so almost doubles the DMA bandwidth available in a single VBLANK.

In H32 mode, I got 208 tiles DMA'd during extended VBLANK without any problem; I wonder if it's a record or what...

Fonz
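
To get a feel for what the extended blanking buys, here is a back-of-the-envelope estimate. The function names are made up, and the figures used in the note below (262 lines per NTSC frame, about 205 bytes of DMA per blanked line in H40) are commonly quoted values taken as assumptions here, not measurements from this thread. DMA during active display is ignored.

```c
#include <stdint.h>

/* Bytes that can be DMA'd to VRAM in one frame if the display is blanked
 * from first_blank_line until the end of the frame. */
uint32_t dma_bytes_per_frame(uint32_t first_blank_line,
                             uint32_t lines_per_frame,
                             uint32_t bytes_per_blank_line)
{
    return (lines_per_frame - first_blank_line) * bytes_per_blank_line;
}

/* Whole video frames needed to upload fb_bytes, rounded up. */
uint32_t frames_per_upload(uint32_t fb_bytes, uint32_t bytes_per_frame)
{
    return (fb_bytes + bytes_per_frame - 1) / bytes_per_frame;
}
```

Under those assumptions, blanking from line 180 leaves 82 lines, or 16810 bytes per frame, so a 320x180 4bpp buffer (28800 bytes) needs two frames per update. A normal blank starting at line 224 leaves 38 lines (7790 bytes), and a full 320x224 buffer (35840 bytes) would then need five frames.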

TascoDLX
Very interested
Posts: 262
Joined: Tue Feb 06, 2007 8:18 pm

Post by TascoDLX » Fri Nov 30, 2007 5:41 pm

Chilly Willy wrote:Well, H40 cell mode would need 40x28x2x2 = 4480 bytes for two name tables. From 65536 leaves 61056 bytes (we're not going to bother with sprites or scrolling in a game like this). 61056 / (160 *2) gives us two 320x190 size screens worth of storage. Or you could render to only 38 cells horizontally (304 pixels) to give two screens of 304x200. That would probably be best, and you wouldn't have to worry about overscan much either.
Two nametables in H40 mode would be 64x28x2x2 = 7168 bytes because the scroll plane must be 64 cells wide. But since we only need one scroll plane, perhaps we can put the nametables on top of one another. Both scroll planes would just be displaying the same pattern. If that works, it would cut the size down to 3584 bytes.
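
The three name table sizes mentioned so far all come from the same formula, which can be written as a one-line helper (the function name is made up for this sketch; each name table entry is 2 bytes):

```c
/* Bytes used by `tables` name tables on a scroll plane that is
 * `plane_cells_wide` by `rows` cells, at 2 bytes per entry. */
unsigned name_table_bytes(unsigned plane_cells_wide, unsigned rows,
                          unsigned tables)
{
    return plane_cells_wide * rows * 2u * tables;
}
```

A 40-cell-wide plane with two tables gives the earlier 4480-byte figure, a 64-cell-wide plane gives 7168 bytes, and overlapping both planes on a single 64-cell-wide table gives 3584 bytes.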

To set up the VDP, just set all nametables, the sprite attribute table, and H-scroll table to VRAM 0000. For this to work, the first 32 bytes of VRAM need to be zeroed, but this is no problem since the top cell line won't be used anyway.
Fonzie wrote: - Put the HINT (or whatever) on line 180 (the fewer lines you display, the more DMA power you get).
- The HINT stops the VDP display and starts the DMA.
- Once the DMA is finished, start the VDP display again and exit the HINT...

Like Stef said, starting a DMA at line 180 or so almost doubles the DMA bandwidth available in a single VBLANK.
That should get you 30 fps but I'm concerned that it may look awkward -- a chunk of blank at the bottom of the screen. I'm thinking line 208, maybe 200, but perhaps I should reserve judgment until I see it. Nevertheless, there are other ways to preserve bandwidth.

You could reduce the render window and use some static patterns to draw a window frame and possibly a status bar. If you need more bandwidth, you could always just DMA all the way through the active display period. It depends if you want to do anything with the main cpu other than DMA to VRAM.

Another consideration is that you need to update the nametable to update the palette assignments, and this must be done in VBLANK. Unless you only need 16 colors -- I suppose it depends on the game.

And if you're looking to squeeze a bit more out of VRAM, you can store patterns in between lines in the scroll nametable. That's a bit extreme, though.
Fonzie wrote: In H32 mode, I got 208 tiles DMA'd during extended VBLANK without any problem; I wonder if it's a record or what...
208? That's it? Well, I guess it's a record until someone claims otherwise. Congratulations!
