Some questions about Saturn HW


Post by danibus »

Hi all! I'm learning how the Sega Saturn works, just for fun, and I have some questions I couldn't find answers to.
Hope you can help me! :wink:

- Main RAM: I read 2MB in some places, but 1.5MB in others (labelled as "work RAM"). Why?
- Main RAM: I read that it is 2x1MB in total, on 2 different RAM chips (slow + fast), but that the SCU makes this transparent to the SH2. Is that right?
- SCU DSP: The SCU has to do its job of letting the SH2 "speak" with the other chips, like the VDPs and the sound block. Does this still work while the DSP is in use?
I mean, lots of people dream about the DSP power that programmers left unused, but it seems like too much for the DSP to do 3D (matrix) calculations and at the same time handle communication between the SH2 and the other chips. Is this possible?
- VDP1: it runs at 28MHz, but what about its VRAM? I read that it runs at half speed, which makes no sense if true.
- VDP2: it runs at 28MHz, but what about its VRAM? I read that it runs at double speed. Why, if this is true?
- VDP1 and VDP2: VDP1 writes to the framebuffer. VDP2 reads this framebuffer and copies it to one of its layers, then mixes all layers and sends the result to the TV. If this is true, then all "sprites" are in one layer. This seems to be the usual way.

But when researching the game "BURNING RANGERS", it seems they do this twice.
First, VDP1 puts the transparent elements (like fire) in the framebuffer. VDP2 reads them and puts them in a layer.
Second, VDP2 erases the framebuffer (is this possible?) and the sprites are put in. VDP2 reads them and puts them in another layer.
Finally, VDP2 mixes everything.

- SH2 and VDP2: I read that it is possible to use only VDP2 (avoiding VDP1) because it draws faster. But if VDP1 is the one that draws sprites/quads, how can the SH2 draw?

- Virtua Fighter 1 vs Virtua Fighter REMIX: I see that the floor is different. In VF1 it seems to be made of polygons, but in VF REMIX it seems to be a big texture managed by VDP2. Is this true?


- VDP1: Some people say that VDP1 loses time when "writing" a texture to a polygon even if the polygon is smaller than the texture. I don't understand why (if that is the case). I read this about texture rendering:
Saturn used a sort of "forward rendering", where it mapped lines from a sprite to stretched lines in screen space (the norm, "backwards rendering", maps pixels in screen space back to the texture, and marches through the screen one at a time). This technique suffered from a serious amount of internal overdraw, where pixels in the same quad are drawn on top of each other. This happens for two reasons. Because the line algorithm suffers from integer roundoff at the edges, adjacent lines would normally leave some gaps between them, so the lines are drawn thicker by inserting an extra pixel where they change slope. And, when the two control edges of the quad are not the same size, multiple lines from the sprite naturally overlap.
This overdraw results in wasted fillrate. Especially if you drew triangles (as degenerate quads) in the wrong way, resulting in a ton of overdraw as every line converged on the same point. It also makes the 50% blended draw mode very glitchy, because the overdraw pixels get blended multiple times. That's why very few 3D games use it, instead opting to use the garish mesh mode.
I didn't understand this explanation. Could someone clarify? Thanks

Post by Chilly Willy »

danibus wrote: Fri Apr 13, 2018 1:26 am Hi all! I'm learning how the Sega Saturn works, just for fun, and I have some questions I couldn't find answers to.
Hope you can help me! :wink:

- Main RAM: I read 2MB in some places, but 1.5MB in others (labelled as "work RAM"). Why?
- Main RAM: I read that it is 2x1MB in total, on 2 different RAM chips (slow + fast), but that the SCU makes this transparent to the SH2. Is that right?
It has 1MB of 32 bit wide "fast" or "high" memory, and 1MB of 16 bit wide "slow" or "low" memory. The intended use of each is to put code and critical data into high mem, and general game data into low mem. You can put code into low mem, and general game data in high mem, but it won't be as fast as the other way around. You may see main memory labeled as 1.5M if it is considering both banks, but reserving some of the fast memory for "OS" functions, like all the Saturn libs.

- SCU DSP: The SCU has to do its job of letting the SH2 "speak" with the other chips, like the VDPs and the sound block. Does this still work while the DSP is in use?
I mean, lots of people dream about the DSP power that programmers left unused, but it seems like too much for the DSP to do 3D (matrix) calculations and at the same time handle communication between the SH2 and the other chips. Is this possible?
The DSP in the SCU is not terribly flexible. You have very specific ways to feed it data via the DMA, and to DMA the results elsewhere. There's a SEGA tool to assemble DSP code, but little in the way of examples. It got very little usage in the mainstream.

- VDP1: it runs at 28MHz, but what about its VRAM? I read that it runs at half speed, which makes no sense if true.
- VDP2: it runs at 28MHz, but what about its VRAM? I read that it runs at double speed. Why, if this is true?
Actually, the VDPs and SH2s use the same clock, and it's either about 26.8MHz for 320 wide screens, or about 28.6MHz for 352 wide screens. The VRAM is actually split in two, with half being for VDP1 and half for VDP2. Both blocks of VRAM are 16 bit wide, and are timed for the amount of data needed to draw/fetch for display modes. It gets pretty complex - I suggest reading both VDP manuals a few times to get all the info.

- VDP1 and VDP2: VDP1 writes to the framebuffer. VDP2 reads this framebuffer and copies it to one of its layers, then mixes all layers and sends the result to the TV. If this is true, then all "sprites" are in one layer. This seems to be the usual way.

But when researching the game "BURNING RANGERS", it seems they do this twice.
First, VDP1 puts the transparent elements (like fire) in the framebuffer. VDP2 reads them and puts them in a layer.
Second, VDP2 erases the framebuffer (is this possible?) and the sprites are put in. VDP2 reads them and puts them in another layer.
Finally, VDP2 mixes everything.
I'm not sure I remember it exactly, but VDP1 writes some of the display into a separate part of the frame buffer to make use of transparency (you have to be very careful on using transparency on the Saturn or it looks wrong due to the way quads are drawn by VDP1). It then draws the sprites in a completely different part of the frame buffer. VDP2 then combines the two along with the cell layers for the final output.

- SH2 and VDP2: I read that it is possible to use only VDP2 (avoiding VDP1) because it draws faster. But if VDP1 is the one that draws sprites/quads, how can the SH2 draw?
Via software, like all old games. Same way the 32X does "hardware" scaling and rotation - with SH2 software. :wink:

But more games pretended the VDP2 didn't exist than the other way around. VDP1 is a pretty simple blitter, whereas VDP2 is a rather complex cell layer combiner. Most games set up VDP2 to just render a frame buffer, then use VDP1 as a blitter, along with common drawing algorithms run on the SH2.

- Virtua Fighter 1 vs Virtua Fighter REMIX: I see that the floor is different. In VF1 it seems to be made of polygons, but in VF REMIX it seems to be a big texture managed by VDP2. Is this true?
I think so, but not 100% on that.

- VDP1: Some people say that VDP1 loses time when "writing" a texture to a polygon even if the polygon is smaller than the texture. I don't understand why (if that is the case). I read this about texture rendering:
Saturn used a sort of "forward rendering", where it mapped lines from a sprite to stretched lines in screen space (the norm, "backwards rendering", maps pixels in screen space back to the texture, and marches through the screen one at a time). This technique suffered from a serious amount of internal overdraw, where pixels in the same quad are drawn on top of each other. This happens for two reasons. Because the line algorithm suffers from integer roundoff at the edges, adjacent lines would normally leave some gaps between them, so the lines are drawn thicker by inserting an extra pixel where they change slope. And, when the two control edges of the quad are not the same size, multiple lines from the sprite naturally overlap.
This overdraw results in wasted fillrate. Especially if you drew triangles (as degenerate quads) in the wrong way, resulting in a ton of overdraw as every line converged on the same point. It also makes the 50% blended draw mode very glitchy, because the overdraw pixels get blended multiple times. That's why very few 3D games use it, instead opting to use the garish mesh mode.
I didn't understand this explanation. Could someone clarify? Thanks
VDP1 does bowtie drawing. It steps down both sides of the quad, drawing a line from one point of the quad side to the other point on the other side, where the two points are often not on the same y coord. It does NOT (necessarily) render a raster line. Because it isn't rasterising the line, pixels in the middle tend to be overdrawn, which is why transparency tends to come out wrong when drawing quads.
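To make the overdraw easier to picture, here is a rough software model of that edge-stepping approach. It is only a sketch under my own assumptions (plain Bresenham lines, one line per step along the longer edge, no gap-filling pixels) and not a description of the actual VDP1 circuitry; `line_points` and `draw_quad` are names made up for this example.

```python
from collections import Counter

def line_points(x0, y0, x1, y1):
    """Integer Bresenham line from (x0, y0) to (x1, y1), endpoints included."""
    points = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        points.append((x, y))
        if (x, y) == (x1, y1):
            return points
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy

def draw_quad(a, b, c, d):
    """Step down edges A->D and B->C together, drawing one line per step.

    Returns a Counter mapping each pixel to how many times it was written.
    """
    left = line_points(*a, *d)
    right = line_points(*b, *c)
    steps = max(len(left), len(right))
    writes = Counter()
    for i in range(steps):
        # Pick the proportional position on each edge for this step.
        l = left[i * (len(left) - 1) // max(steps - 1, 1)]
        r = right[i * (len(right) - 1) // max(steps - 1, 1)]
        for p in line_points(*l, *r):
            writes[p] += 1
    return writes

# An axis-aligned square: every pixel is written exactly once.
square = draw_quad((0, 0), (7, 0), (7, 7), (0, 7))

# A triangle drawn as a degenerate quad (B and C coincide):
# all eight lines end on the same pixel, so it is written eight times.
triangle = draw_quad((0, 0), (7, 3), (7, 3), (0, 7))
```

With 50% blending, a pixel that is written eight times gets blended eight times, which is the glitch described in the quoted text.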

Post by Huge »

danibus wrote: Fri Apr 13, 2018 1:26 am - VDP1 and VDP2: VDP1 writes to the framebuffer. VDP2 reads this framebuffer and copies it to one of its layers, then mixes all layers and sends the result to the TV. If this is true, then all "sprites" are in one layer. This seems to be the usual way.
The VDP1 draws one frame to its framebuffer 0, while the VDP2 is reading framebuffer 1.
When the VDP1 finishes, it draws the next frame to VDP1 framebuffer 1 while the VDP2 is accessing framebuffer 0.
Repeat.
The VDP1 only ever has one framebuffer active and accessed, because the VDP2 is using the other one, and composes the final image sent to the TV using that, as well as the other background layers in either or both banks of the VDP2 VRAM. So the VDP2 doesn't "copy" the framebuffer out, it reads it directly, and so it works out of as many as three 16-bit banks simultaneously.

If you use RGB sprites, then yes, all sprites are treated as 1 layer by the VDP2. 15 bits are for colours, and the MSB tells the VDP2 that this is an RGB sprite, not a palette one.
If you use palette sprites, then you can put priority bits in the framebuffer, which the VDP2 uses to determine the position of a pixel. It is still treated as 1 layer, but some pixels can be below or above other backgrounds. Kind of like a VDP2 only z-index.
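As a small illustration of the RGB case, this is how a 16-bit framebuffer word could be picked apart in software. The MSB flag and the 15 colour bits come from the description above; the channel order (red in the low 5 bits, blue in the high 5) is my assumption here, and `decode_rgb_pixel` is just a name for the example. The palette formats are left out because their bit layout depends on the sprite type selected.

```python
def decode_rgb_pixel(word):
    """Decode a 16-bit framebuffer word.

    Returns None when the MSB is clear (palette-format pixel),
    otherwise the three 5-bit colour components as (r, g, b).
    Channel order is an assumption: red in bits 0-4, blue in bits 10-14.
    """
    if not word & 0x8000:
        return None
    r = word & 0x1F
    g = (word >> 5) & 0x1F
    b = (word >> 10) & 0x1F
    return r, g, b
```

Note that there is no room left in the word for priority bits once the MSB is set, which is why RGB sprites end up as a single flat layer.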

It's a completely stupid setup because this way you have a mode suitable for 3D (RGB mode) with shading and transparency, but you are limited in how to use the VDP2. OR, you have a mode suitable for 2D (palette mode) where you can mix VDP2 backgrounds better - but you are limited in VDP1 colour calculations (sprite transparency and Gouraud shading only work in RGB mode). I would've used a 12-bit RGB mode where you have 4096 colours and 4 bits for VDP2 priority/transparency/shadowing. It would've made things SO much easier (as an alias of ARGB4444), you could have used all advantages of both chips together and with the same colour fidelity as palette mode. But, alas.
danibus wrote: Fri Apr 13, 2018 1:26 am But when researching the game "BURNING RANGERS", it seems they do this twice.
First, VDP1 puts the transparent elements (like fire) in the framebuffer. VDP2 reads them and puts them in a layer.
Second, VDP2 erases the framebuffer (is this possible?) and the sprites are put in. VDP2 reads them and puts them in another layer.
Finally, VDP2 mixes everything.
Burning Rangers works in a roundabout way in order to display transparent polygons. What it does is:
- draws explosions etc. with the VDP1, fully opaque, at half resolution
- does some trickery to erase parts of the screen obscured by other polygons (think of it like software z-indexing)
- SCU DMA the VDP1 framebuffer to VDP2 VRAM
- VDP1 framebuffer is erased and VDP1 proceeds to draw the non-transparent parts.
- VDP2 then composes the final image normally from the other framebuffer and the VDP2 VRAM, and applies blending to the explosions, which is now a VDP2 background even though it was drawn by the VDP1.

Essentially you have one VDP2 background, in VDP2 VRAM, being continuously, dynamically re-drawn by the VDP1. Or depending on which way you look at it, you are using the VDP2 to render what the VDP1 drew.
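The final composition step can be sketched roughly like this. It is a toy model only: pixels are single grey values, the effect layer is stored at half resolution in both directions, and the 50/50 blend ratio is my assumption (the real ratio is whatever the game programs into the VDP2). `compose` is a made-up name for the example.

```python
def compose(opaque, effects_half, ratio=0.5):
    """Blend a half-resolution effect layer over a full-resolution opaque layer.

    opaque:       full-resolution rows of grey values (second VDP1 pass)
    effects_half: half-resolution rows (first VDP1 pass, copied to VDP2 VRAM),
                  with None where nothing was drawn or where it was erased
    ratio:        weight of the effect layer, an assumed value
    """
    out = []
    for y, row in enumerate(opaque):
        out_row = []
        for x, base in enumerate(row):
            fx = effects_half[y // 2][x // 2]  # each effect pixel covers 2x2
            if fx is None:
                out_row.append(base)
            else:
                out_row.append(round(base * (1 - ratio) + fx * ratio))
        out.append(out_row)
    return out

frame = compose(
    [[100, 100, 100, 100],
     [100, 100, 100, 100]],
    [[None, 200]],
)
```

Here the right half of `frame` comes out as 150, the blend of the opaque pixel and the explosion, while the left half is untouched.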

The advantage of this is that it can display transparent explosions which blend with both sprites and backgrounds (normally you could only have one or the other). The disadvantage is that you only have 1 transparent layer, you have to take care to manually obscure parts which get hidden by other parts (like if someone is standing in front of the explosion), and that your transparent layer is half the resolution to speed it up. Also the framerate might be uneven as sometimes only 1 of the VDP1 passes can be finished in time, not both.

It is really complex but it works. Chris Coffin explained that this was something they came up with after STI dissolved and they went on to work on Saturn devkits. Burning Rangers seems to be the only game it was put into, as far as we know.
danibus wrote: Fri Apr 13, 2018 1:26 am - SH2 and VDP2: I read that it is possible to use only VDP2 (avoiding VDP1) because it draws faster. But if VDP1 is the one that draws sprites/quads, how can the SH2 draw?
Two ways. One is to draw in software and upload it to VDP2. Doom does this for everything but the HUD, AMOK does this for the voxel landscape and draws polygons and sprites on top, and Sonic R does it for the environmental mapping on the Sonic R logo and the loading screen.

The other way is to use expansion hardware which gets fed into the VDP2 as a background. The MPEG card does this.
danibus wrote: Fri Apr 13, 2018 1:26 am - Virtua Fighter 1 vs Virtua Fighter REMIX: I see that the floor is different. In VF1 it seems to be made of polygons, but in VF REMIX it seems to be a big texture managed by VDP2. Is this true?
This is correct. VF Remix also loses all lighting effects on the polygons. I think the original VF looked better because of the lighting. If only they could've done it in hi-res.
danibus wrote: Fri Apr 13, 2018 1:26 am - VDP1: Some people say that VDP1 loses time when "writing" a texture to a polygon even if the polygon is smaller than the texture. I don't understand why (if that is the case). I read this about texture rendering:
Normal renderers determine which part of the screen they are drawing to, and then use UV texture coordinates to determine which pixel of a texture is to be written to where they are drawing. So the only textures sampled are the ones displayed on the screen.

The Saturn works the other way around (what the quoted text calls "forward rendering"). It samples every pixel of the texture and determines whether it needs to be written to the framebuffer or not. So if you have a 64x64 texture but only draw it at 32x32, only a quarter of the texels you sample end up on screen, and the other three quarters of that work are wasted. There are some mitigating factors like texture end codes that can be used to reduce the number of pixels sampled, but it's still bloody stupid either way.
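To put numbers on the 64x64 example, here is the arithmetic spelled out. It is a simple counting model that assumes every texel is read exactly once and every output pixel is written exactly once, so it only applies when the sprite is being shrunk; `sampled_vs_drawn` is a name invented for this example.

```python
def sampled_vs_drawn(tex_w, tex_h, out_w, out_h):
    """Return (texels sampled, pixels drawn, fraction of samples wasted).

    Assumes a shrunk sprite where each texel is read once and each
    output pixel is written once.
    """
    sampled = tex_w * tex_h
    drawn = out_w * out_h
    return sampled, drawn, 1 - drawn / sampled

sampled, drawn, wasted = sampled_vs_drawn(64, 64, 32, 32)
```

For the 64x64 texture drawn at 32x32 this gives 4096 texels read for 1024 pixels written, so 75% of the reads contribute nothing.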

One thing to note though is that while it samples every pixel of the texture, it doesn't write multiple values per line to the framebuffer. So you don't actually get pixels written multiple times (in one line, anyway); the speed is wasted when reading the texture. Where you DO get overwrites is when lines intersect, i.e. when the polygon is not a perfect rectangle (4-point transformations, the manual calls these "Distorted Sprites").

I don't know if you get overdrawn pixels when you draw a poly where the right side is larger.... logic would dictate that you get dropouts here, but the VDP1 does some anti-aliasing to get around this, but I don't know how.

I'm not clear on the actual speed of the VDP1. An old Sega tutorial lists an equation to approximate the cycles it takes for the VDP1 to draw something, but if I assume the 28.6 million cycles from the main clock and 16x16 sprites, I get something ridiculous, like ~4 MPixel/sec. Using no textures and very large sprites (to reduce memory read and VDP1 setup overhead), I got up to ~10 MPixel/sec, the last time I checked the equation. This is disturbingly low and yet it jibes with the few developers who commented on how slow the VDP1 is compared to the PSX (which has a theoretical peak of 33 MPixel/s, and some demos have done ~24 MPixel/s in practice).
I also don't know if that equation accounts for 8-bit or 16-bit sprites.

Post by Sik »

Huge wrote: Sat Apr 21, 2018 10:06 pm I don't know if you get overdrawn pixels when you draw a poly where the right side is larger.... logic would dictate that you get dropouts here, but the VDP1 does some anti-aliasing to get around this, but I don't know how.
Whenever the line would step diagonally in its slope it draws two pixels (on both ends of the crossing). In the edge case of a 45º line, you basically end up with a thick line as a result of this.
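In software terms, that could look something like the following. This is a sketch of the idea only: it is a standard Bresenham loop that emits one extra pixel whenever both x and y change in the same step, and which of the two corner pixels gets filled is an assumption on my part. `aa_line` is a made-up name.

```python
def aa_line(x0, y0, x1, y1):
    """Bresenham line that adds a filler pixel on every diagonal step."""
    points = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        points.append((x, y))
        if (x, y) == (x1, y1):
            return points
        e2 = 2 * err
        step_x = e2 > -dy
        step_y = e2 < dx
        if step_x and step_y:
            # Diagonal step: also plot the corner pixel so that the next
            # line of the quad cannot leave a hole here (corner choice assumed).
            points.append((x + sx, y))
        if step_x:
            err -= dy
            x += sx
        if step_y:
            err += dx
            y += sy

diagonal = aa_line(0, 0, 3, 3)
flat = aa_line(0, 0, 3, 0)
```

The 45º line from (0, 0) to (3, 3) comes out as 7 pixels instead of 4, which is the "thick line" case, while a horizontal line gets no extra pixels at all.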
Sik is pronounced as "seek", not as "sick".

Post by Huge »

Sik wrote: Sat Apr 21, 2018 11:43 pm
Huge wrote: Sat Apr 21, 2018 10:06 pm I don't know if you get overdrawn pixels when you draw a poly where the right side is larger.... logic would dictate that you get dropouts here, but the VDP1 does some anti-aliasing to get around this, but I don't know how.
Whenever the line would step diagonally in its slope it draws two pixels (on both ends of the crossing). In the edge case of a 45º line, you basically end up with a thick line as a result of this.
I'm not sure I understand how that works. What happens when you draw a poly that is 8 pixels on the left side (AD) but 80 pixels on the right side (BC)? Assuming the original graphic is 8x8 pixels. It would need to fill way more than just 1 pixel per diagonal crossing or else it would get dropouts. Or are those values cumulative for every time the Y coordinate changes, meaning that it would end up drawing a triangle?

Post by Sik »

It draws multiple lines with the same texture Y coordinate in order to stretch it vertically.

That's a different thing, I was just talking about how it avoids dropouts between consecutive lines (as they're drawn diagonally, and hence won't align perfectly). Though after working on some software polygon routines meant to draw Saturn-like quads I'm left baffled as to why it needs to go through all pixels (is Bresenham-like interpolation that hard to do in hardware? serious question), not sure what it's doing internally in that respect.
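For the 8 vs 80 pixel example, the vertical stretching can be sketched like this. The assumption (mine, not something confirmed from the manual) is that one line is drawn per pixel of the longer edge and that the texture row is chosen proportionally; `texture_rows` is a name invented for the example.

```python
def texture_rows(tex_h, left_len, right_len):
    """Texture row used for each drawn line of a quad.

    Assumes one line per pixel of the longer edge, with the texture
    row picked proportionally along that edge.
    """
    lines = max(left_len, right_len)
    return [i * tex_h // lines for i in range(lines)]

rows = texture_rows(8, 8, 80)
```

Under that assumption an 8x8 texture on a quad with an 8 pixel left edge and an 80 pixel right edge is drawn as 80 lines, with each texture row repeated 10 times. All 80 lines start from only 8 distinct pixels on the short edge, so the area near that edge is where most of the overdraw would pile up.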

Post by danibus »

Thanks all for posting.

I'm still trying to understand the SEGA VDP1 documentation, pages 8 and 9.

Please check the attached files.


Also please check this video. It is not about the Saturn itself but about the uoYabause Saturn emulator, but it may clarify some things about distorted sprites.

https://www.youtube.com/watch?v=8TleepxIORU
Attachments: antialiassing.JPG, distorted.JPG

Post by Huge »

Sik wrote: Sun Apr 22, 2018 3:35 am Though after working on some software polygon routines meant to draw Saturn-like quads I'm left baffled as to why it needs to go through all pixels (is Bresenham-like interpolation that hard to do in hardware? serious question), not sure what it's doing internally in that respect.
It goes through every pixel of the texture because some things like end codes can't work otherwise. If it skips sampling of a texture pixel, it might skip sampling an end code.

Post by Sik »

Yeah, but then didn't you say that the end code was to work around this issue for starters?

Actually, heck - what's even the point of the end code? Because they could have just told devs to fill the line up to the end if it was just to skip transparent pixels past the silhouette of a sprite. The whole thing doesn't make much sense at all.

Post by Huge »

End codes are just Sega's idea of speed optimization. They tell the VDP1 to stop sampling the texture, so if large parts of the texture are transparent, you can get a potentially significant speedup at drawing.
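A very simplified model of that, for one texture line. The end-code value used here (0xF, as for a 4-bit texture) is an assumption for the example, and the rule is reduced to "stop at the first end code"; the manual's exact rules are more involved, so treat this as an illustration rather than the real behaviour. `texels_until_end_code` is a made-up name.

```python
END_CODE = 0xF  # assumed end-code value for a 4-bit texture; check the VDP1 manual

def texels_until_end_code(row, end_code=END_CODE):
    """Return the texels that are actually sampled from one texture line."""
    sampled = []
    for texel in row:
        if texel == end_code:
            break  # the rest of the line is never read
        sampled.append(texel)
    return sampled

line = [1, 2, 3, END_CODE, 0, 0, 0, 0]
used = texels_until_end_code(line)
```

In this toy line only 3 of the 8 texels are processed, so a sprite with a lot of empty space on the right of each line is cheaper to draw.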

Post by Chilly Willy »

Huge wrote: Tue Apr 24, 2018 2:32 am End codes are just Sega's idea of speed optimization. They tell the VDP1 to stop sampling the texture, so if large parts of the texture are transparent, you can get a potentially significant speedup at drawing.
And it's clearly meant for "sprite" rendering where overdraw is minimized. As mentioned, it's not as useful for "3d" where warping causes overdraw and the like. It was probably meant to help for the big 2D fighting games from Capcom and the like where you had tons of sprites that needed to be in vram.

Post by Sik »

Now that you bring that up... you could use that to store smaller sprites inside larger ones (in the excess space to their right), saving quite a bunch of memory. Though splitting up the larger sprites would probably work just as well >_>

It was probably one of those things somebody thought would be a good idea without thinking it through.

Post by Huge »

You'd need to do quite some magic to "defragment" your sprite table in a way like that, although I suppose it is possible, especially if you use very large sprites. But I don't think it's practical because for animation-heavy games, the VRAM is not enough and you need to load new sprites from low ram or cart ram. Some Capcom games end up running slower with the RAM cart because of that.
Chilly Willy wrote: Tue Apr 24, 2018 1:28 pm And it's clearly meant for "sprite" rendering where overdraw is minimized. As mentioned, it's not as useful for "3d" where warping causes overdraw and the like. It was probably meant to help for the big 2D fighting games from Capcom and the like where you had tons of sprites that needed to be in vram.
They are useful in the sense that you can speed up the VDP1 with them. As slow as that thing is, you need every edge. It's probably why they added High Speed Shrink mode as well (where only every 2nd pixel of the texture is sampled). Apparently, it needs 2 cycles to draw a pixel (3 if it is textured), plus setup overhead.
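Those per-pixel figures can be turned into a rough fill rate like this. The 28.6 MHz clock and the 2 and 3 cycle costs are the numbers mentioned in this thread; the setup cost per primitive is a free parameter because no figure is given for it, and the 256 cycles used below is purely a placeholder. `pixels_per_second` is a name made up for the example.

```python
def pixels_per_second(clock_hz, cycles_per_pixel,
                      setup_cycles=0, pixels_per_primitive=1):
    """Estimate fill rate from per-pixel and per-primitive cycle costs."""
    cycles = setup_cycles + cycles_per_pixel * pixels_per_primitive
    return clock_hz * pixels_per_primitive / cycles

CLOCK_HZ = 28_600_000

flat_peak = pixels_per_second(CLOCK_HZ, 2)      # untextured, no overhead
textured_peak = pixels_per_second(CLOCK_HZ, 3)  # textured, no overhead

# 16x16 textured sprites with a hypothetical 256-cycle setup cost each.
small_sprites = pixels_per_second(CLOCK_HZ, 3, setup_cycles=256,
                                  pixels_per_primitive=16 * 16)
```

Without overhead this gives 14.3 MPixel/s untextured and about 9.5 MPixel/s textured, and the rate drops quickly once a per-primitive cost is added for small sprites.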

Post by danibus »

Huge wrote: Tue Apr 24, 2018 2:32 am End codes are just Sega's idea of speed optimization. They tell the VDP1 to stop sampling the texture, so if large parts of the texture are transparent, you can get a potentially significant speedup at drawing.
I was reading the VDP1 doc but didn't understand end codes.
A sprite is made of "x" lines, and it seems that in each line of the sprite you can put an end code to jump to the next line without reaching the end of the line.
Does it work like this?

Post by Chilly Willy »

Look at page 87 of the VDP1 manual. There are two figures (6.10 a and b) showing how end codes work. It's pretty easy to work out from the figures where the description is kinda hard to follow.

Pretty much, the first "end" code tells where to start drawing, and the second end code tells where to stop drawing for each line in the sprite.