GLide 32x

ob1 · Post by **ob1** » Mon Feb 05, 2007 9:50 pm

Although I have real difficulty getting my code to run on the 32X (debugger, anyone?), I have a silly idea in my head: a rendering library for the 32X. Let's call it GLide.

link : http://www.alasir.com/software/glide/index.html

The 68k would put data on the Comm Port, and each SH2 would poll it for available data. Each SH2 would then run one primitive. So it's not exactly SLI, which I've already read about here, but I think this way would be faster.

The catch is synchronization. If both SH2s are already busy and the 68k has several primitives to hand out, it has to wait for one SH2 to finish and empty the CommPort before it can post another primitive.
Moreover, if one SH2 is defining a vertex that the other SH2 needs (for a triangle, for example), the second SH2 has to wait for the first to finish its job.

By the way, how should the CommPort be polled? Here's a way to optimize the access for an SH2:
- tell everyone (68k and the other SH2) I am reading the CommPort
- read cache (long) CommPort(0)
- read cache (long) CommPort(4)
...
- read cache (long) CommPort(12)
- tell everyone (68k and the other SH2) the CommPort is available
- run my job
- invalidate CommPort cache
- poll CommPort
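
A minimal sketch of the polling steps above, in C. Everything here is an assumption made for illustration: the `CommPort` struct is a plain-memory stand-in for the real 16-byte port, and the `busy` and `full` flags are hypothetical names for the "tell everyone" signalling. The cache invalidation step is left out because it is hardware-specific, and the check-then-set on `busy` is not atomic, so a real version would still need proper arbitration between the two SH2s.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 16-byte CommPort. On real hardware this
   would be a pointer to memory-mapped registers, not a plain struct. */
typedef struct {
    volatile uint32_t words[4];  /* CommPort(0), (4), (8), (12) */
    volatile uint32_t busy;      /* assumed flag: someone is reading the port */
    volatile uint32_t full;      /* assumed flag: the 68k has posted a primitive */
} CommPort;

/* Try to take one primitive from the port.
   Returns 1 and fills out[0..3] on success, 0 if nothing could be taken. */
int comm_try_take(CommPort *port, uint32_t out[4])
{
    if (!port->full || port->busy)
        return 0;                 /* nothing posted, or the other CPU is reading */

    port->busy = 1;               /* tell everyone I am reading the CommPort */
    for (int i = 0; i < 4; i++)
        out[i] = port->words[i];  /* four long reads */
    port->full = 0;               /* the port is empty again */
    port->busy = 0;               /* tell everyone the CommPort is available */
    return 1;                     /* caller now runs its job, then polls again */
}
```

The caller would loop on `comm_try_take`, run the primitive when it returns 1, and go back to polling.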

Reading the CommPort would be even faster if it were in SDRAM: the hardware manual states that SDRAM is optimized for cache line reads, so each read fetches 16 bytes. It would take only one read to fill the cache line!!! Alas, the CommPort is not in SDRAM.

Wait n' see.

ob1 · Post by **ob1** » Mon Feb 12, 2007 9:26 pm

I'm working hard on rasterization.
I've found this site
http://www.devmaster.net/articles/softw ... /part3.php
It is quite handy, but not as easy as I would have guessed. So it's quite hard, for sure. For example, I haven't decided yet whether I'd go for SLI or have each CPU draw a primitive.
Or sorting! I have to sort the primitives from far to near.
And what about optimizing? Hidden surface removal, anyone?
But I've got some answers.
FIFO is not the way. I'd use the CommPort.
I can use either Packed Color or Direct Color.
I've learned the 32X is said to draw 50k polys/s. That's huge, and yet very few at the same time (~2k polys/frame).
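
As a quick check of that budget, here is a small C sketch. It only assumes that the quoted 50k polys/s figure holds; the function name is made up for the example. The ~2k polys/frame figure corresponds to about 25 frames per second, and a higher frame rate shrinks the budget accordingly.

```c
/* Polygons available per frame for a given throughput and frame rate.
   Integer division, so the result is rounded down. */
int polys_per_frame(int polys_per_second, int frames_per_second)
{
    return polys_per_second / frames_per_second;
}
```

With 50000 polys/s this gives 2000 per frame at 25 fps, 1666 at 30 fps and 833 at 60 fps.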

While I haven't written a single line, not even a
add #1,r1
I'm learning, and learning, and learning!!!! But I'm not sure my brain will handle everything! :D

My final aim, for now (and if you know me, you know I can change my mind faster than I can say it), is to make something that would look like Dragon Quest VIII.

No play, no gain.

Fonzie · Post by **Fonzie** » Mon Feb 12, 2007 9:36 pm

"I can use either Packed Color or Direct Color. "
I think packed color can be super fast.

Games like Stellar Assault seem to use it and it looks great (a bit grainy, but perfect for a space shooter).

Direct color can be super cool too... you may restrict the screen to a 16/9 area to be sure of the framebuffer space and drawing speed.

As for Dragon Quest, will you use scaled 2D for objects/characters?

A last question: will your 32X engine work autonomously? I mean, would the 68K handle the gameplay and the 32X all the silly stuff?

Mad guy

ob1 · Post by **ob1** » Mon Feb 12, 2007 9:40 pm

In direct color, you only get 204 lines in the frame buffer. I'd go for 320x200, which gives a 16/10 ratio. And packed color can be speedy, that's brilliant.
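
The 204-line figure can be reproduced with a little arithmetic. The sketch below assumes a 128 KB frame buffer whose first 256 words hold the line table, and 2 bytes per pixel in direct color; the constant and function names are made up for the example.

```c
enum {
    FB_BYTES         = 128 * 1024, /* assumed size of one frame buffer */
    LINE_TABLE_BYTES = 256 * 2     /* assumed: 256 line-table entries, 16 bits each */
};

/* Number of whole lines that fit after the line table. */
int fb_max_lines(int width_pixels, int bytes_per_pixel)
{
    int bytes_per_line = width_pixels * bytes_per_pixel;
    return (FB_BYTES - LINE_TABLE_BYTES) / bytes_per_line;
}
```

Under these assumptions, `fb_max_lines(320, 2)` gives 204 for direct color, while packed color at 1 byte per pixel leaves room for far more lines than the display needs.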

For my DraQue, I want cel-shaded characters, and images for the background. For the rest... I'd have to buy a PS2 to find out what DraQue really is!!! :D

Fonzie · Post by **Fonzie** » Mon Feb 12, 2007 9:47 pm

320x200... Well, I always wondered whether a framebuffer width that is a multiple of 8 (256 pixels, or 512 in the 32X's case) would speed up the CPUs' vertical seeking a lot...

You can always use the genesis display to "borderize" the 32x display.
In this case, 256*224 pixels would be great for the buffer.

Or it can also be 512*127 pixels buffer (half the vertical size)...
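
To illustrate the idea: 256 and 512 are powers of two, so the start of row y can be found with a shift instead of a multiply. The C sketch below uses made-up function names and counts offsets in pixels. Whether this is a measurable gain on the SH2 is an open assumption, not something tested here.

```c
#include <stdint.h>

/* Pixel offset of (x, y) for a 320-pixel-wide buffer: needs a multiply. */
uint32_t offset_320(uint32_t x, uint32_t y) { return y * 320u + x; }

/* Same for 256- and 512-pixel-wide buffers: a shift replaces the multiply. */
uint32_t offset_256(uint32_t x, uint32_t y) { return (y << 8) + x; }
uint32_t offset_512(uint32_t x, uint32_t y) { return (y << 9) + x; }
```

Moving down one line is then just adding 256 or 512 to the current offset, which is equally cheap for any width.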

Well, it's probably just me who is nuts about multiples of 8.

About the cel-shaded characters... wow

WOW

2D sprites are still a solution if you cannot reach a decent framerate.

ob1 · Post by **ob1** » Tue Feb 13, 2007 8:52 am

Here's my way of rasterizing. Let's take a triangle primitive. It's basically made of 3 points, or vertices: A, B and C. These 3 points define 3 edges: [AB], [BC] and [CA]. First of all, let's compute the slope of these edges. But this slope is the inverse of what I learned in high school.
I learned y = ax + b, with a and b constant, and "a" being (yM-yN)/(xM-xN) for any two points M and N on the line.
Here, on a computer, what I can do very easily is draw a horizontal line. So, for a given "y", I have to get the starting "x" and the ending "x". Thus, my slope is (xM-xN)/(yM-yN).
Step 1
For the [AB] segment, the slope is (xB-xA)/(yB-yA). Then, for every "y" between yA and yB, "x" is equal to xA + slope*(y-yA). I get a set of points.
Do the same for [BC] and [CA] (Steps 1-b and 1-c).
Step 2
Then, for each segment, I take the corresponding set of points.
For every point M on this segment, I search the 2 other segments for a point that has the same "y" and an abscissa greater than xM.
If such a point exists, I draw a line between these 2 points.
And so on with the 2 other segments (Steps 2-b and 2-c).
That's a lot of work.
Does anybody have a better way of rasterizing, or a book/site to recommend?
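
A rough C sketch of Step 1, under a few assumptions: 16.16 fixed point for the inverse slope, on-screen non-negative coordinates, and a 224-line display. Instead of searching the other segments for a matching "y" as in Step 2, it keeps the smallest and largest "x" seen on each line, which gives the same horizontal span for a triangle. All names (`Point`, `SpanTable`, `spans_add_edge`) are made up for the example.

```c
#include <stdint.h>

#define MAX_LINES 224  /* assumed visible height */

typedef struct { int x, y; } Point;

typedef struct {
    int16_t xmin[MAX_LINES];  /* leftmost x seen on each line */
    int16_t xmax[MAX_LINES];  /* rightmost x seen on each line */
} SpanTable;

/* Mark every line as empty (xmin > xmax). */
void spans_reset(SpanTable *t)
{
    for (int y = 0; y < MAX_LINES; y++) {
        t->xmin[y] = INT16_MAX;
        t->xmax[y] = INT16_MIN;
    }
}

/* Record one edge point, ignoring lines outside the table. */
static void spans_touch(SpanTable *t, int y, int x)
{
    if (y < 0 || y >= MAX_LINES) return;
    if (x < t->xmin[y]) t->xmin[y] = (int16_t)x;
    if (x > t->xmax[y]) t->xmax[y] = (int16_t)x;
}

/* Step 1 for one edge: x = xA + slope * (y - yA), slope = dx / dy. */
void spans_add_edge(SpanTable *t, Point a, Point b)
{
    if (a.y > b.y) { Point tmp = a; a = b; b = tmp; }  /* walk top to bottom */
    if (a.y == b.y) {                                  /* flat edge: no slope */
        spans_touch(t, a.y, a.x);
        spans_touch(t, b.y, b.x);
        return;
    }
    int32_t slope = (int32_t)(b.x - a.x) * 65536 / (b.y - a.y);
    int32_t x = (int32_t)a.x * 65536 + 32768;          /* +0.5 for rounding */
    for (int y = a.y; y <= b.y; y++) {
        spans_touch(t, y, (int)(x >> 16));
        x += slope;
    }
}
```

After calling `spans_add_edge` for [AB], [BC] and [CA], each line y with `xmin[y] <= xmax[y]` is one horizontal line to draw from `xmin[y]` to `xmax[y]`.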

Nevertheless, there are ways of optimizing. Drawing a horizontal line, for example, can be done in hardware with the 32X VDP FILL. But then you've got to synchronize, or implement a stack. The most notable optimization would be to use the 2 SH2s in an SLI way, with the master computing even lines and the slave computing odd ones. But drawing lines (the whole Step 2 thing) isn't that computationally intensive. Then again... Moreover, the sets of points defined in Step 1 would fit really nicely in private RAM, so one primitive per CPU would also be nice.
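
To make the SLI-style split concrete, here is a small C sketch in which each CPU only touches the lines of its own parity. It fills a rectangle rather than a triangle to stay short, uses a tiny demo buffer, and assumes non-negative on-screen coordinates. The names and the `cpu_id` convention (0 = master, 1 = slave) are made up for the example.

```c
#include <stdint.h>
#include <string.h>

enum { W = 32, H = 8 };  /* tiny demo buffer, 1 byte per pixel */

/* Fill the rectangle (x0,y0)-(x1,y1), but only on this CPU's lines.
   cpu_id 0 takes the even lines, cpu_id 1 takes the odd lines. */
void fill_rect_interleaved(uint8_t fb[H][W], int cpu_id,
                           int x0, int y0, int x1, int y1, uint8_t color)
{
    /* First line at or below y0 whose parity matches cpu_id. */
    int y = y0 + ((y0 ^ cpu_id) & 1);
    for (; y <= y1; y += 2)
        memset(&fb[y][x0], color, (size_t)(x1 - x0 + 1));
}
```

Both CPUs run the same routine on the same primitive, so no line is ever written by both and the drawing itself needs no locking. The cost is that each CPU repeats the per-primitive setup.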

Lots of ways of thinking, lots of ways of writing...

ob1 · Post by **ob1** » Tue Feb 13, 2007 9:09 am

Load balancing is all the rage !!!

Stef · Post by **Stef** » Tue Feb 13, 2007 2:55 pm

ob1 wrote: Here's my way of rasterizing. Let's take a triangle primitive. It's basically made of 3 points, or vertices: A, B and C. These 3 points define 3 edges: [AB], [BC] and [CA]. First of all, let's compute the slope of these edges. But this slope is the inverse of what I learned in high school.
I learned y = ax + b, with a and b constant, and "a" being (yM-yN)/(xM-xN) for any two points M and N on the line.
Here, on a computer, what I can do very easily is draw a horizontal line. So, for a given "y", I have to get the starting "x" and the ending "x". Thus, my slope is (xM-xN)/(yM-yN).
Step 1
For the [AB] segment, the slope is (xB-xA)/(yB-yA). Then, for every "y" between yA and yB, "x" is equal to xA + slope*(y-yA). I get a set of points.
Do the same for [BC] and [CA] (Steps 1-b and 1-c).
Step 2
Then, for each segment, I take the corresponding set of points.
For every point M on this segment, I search the 2 other segments for a point that has the same "y" and an abscissa greater than xM.
If such a point exists, I draw a line between these 2 points.
And so on with the 2 other segments (Steps 2-b and 2-c).
That's a lot of work.
Does anybody have a better way of rasterizing, or a book/site to recommend?

Nevertheless, there are ways of optimizing. Drawing a horizontal line, for example, can be done in hardware with the 32X VDP FILL. But then you've got to synchronize, or implement a stack. The most notable optimization would be to use the 2 SH2s in an SLI way, with the master computing even lines and the slave computing odd ones. But drawing lines (the whole Step 2 thing) isn't that computationally intensive. Then again... Moreover, the sets of points defined in Step 1 would fit really nicely in private RAM, so one primitive per CPU would also be nice.

Lots of ways of thinking, lots of ways of writing...

Imo, the best way to rasterize a triangle: rasterize horizontal lines between [AB] and [AC], then between [BC] and [AC], where:
- A is the topmost point (smallest Y)
- B is the middle point in Y
- C is the bottommost point (largest Y)

Use a Bresenham-type line algorithm to determine your [AB], [AC] and [BC] segments.
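
A minimal C sketch of that two-half scheme. To keep it short it steps the edges with 16.16 fixed point rather than a true Bresenham error term, which is a substitution made for this example. It also assumes on-screen, non-negative coordinates (no clipping), does not treat fully flat triangles specially, and draws into a small demo buffer. All names are made up for the example.

```c
#include <stdint.h>
#include <string.h>

enum { FB_W = 32, FB_H = 32 };  /* small demo buffer, 1 byte per pixel */

typedef struct { int x, y; } Vertex;

static void swap_v(Vertex *p, Vertex *q) { Vertex t = *p; *p = *q; *q = t; }

/* Fill one horizontal line between x0 and x1, in either order. */
static void hline(uint8_t fb[FB_H][FB_W], int y, int x0, int x1, uint8_t c)
{
    if (x0 > x1) { int t = x0; x0 = x1; x1 = t; }
    memset(&fb[y][x0], c, (size_t)(x1 - x0 + 1));
}

/* 16.16 change in x per scanline along edge p->q (0 for a flat edge). */
static int32_t edge_step(Vertex p, Vertex q)
{
    int dy = q.y - p.y;
    return dy ? (int32_t)(q.x - p.x) * 65536 / dy : 0;
}

void fill_triangle(uint8_t fb[FB_H][FB_W], Vertex a, Vertex b, Vertex c,
                   uint8_t color)
{
    /* Sort so that A is the top, B the middle and C the bottom point. */
    if (a.y > b.y) swap_v(&a, &b);
    if (b.y > c.y) swap_v(&b, &c);
    if (a.y > b.y) swap_v(&a, &b);

    int32_t x_long = (int32_t)a.x * 65536 + 32768;   /* walks [AC] */
    int32_t step_long = edge_step(a, c);

    /* Upper half: lines between [AB] and [AC]. */
    int32_t x_short = (int32_t)a.x * 65536 + 32768;
    int32_t step_short = edge_step(a, b);
    int y = a.y;
    for (; y < b.y; y++) {
        hline(fb, y, (int)(x_short >> 16), (int)(x_long >> 16), color);
        x_short += step_short;
        x_long += step_long;
    }

    /* Lower half: lines between [BC] and [AC]. */
    x_short = (int32_t)b.x * 65536 + 32768;
    step_short = edge_step(b, c);
    for (; y <= c.y; y++) {
        hline(fb, y, (int)(x_short >> 16), (int)(x_long >> 16), color);
        x_short += step_short;
        x_long += step_long;
    }
}
```

The vertices can be passed in any order since the routine sorts them itself. The inner loops only add and shift, so the two divisions per half are the only expensive operations.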

I wrote some polygon drawing routines in C in my mini dev kit. Check the vdp_bmpx.c and vdp_bmpw.h files in the lib directory, maybe they can help you:
http://www.spritesmind.net/_GenDev/foru ... 14&start=0

ob1 · Post by **ob1** » Tue Feb 13, 2007 9:14 pm

Stef wrote: Use Bresenham line algo

Great resource!!! Thank you!!!

Stef · Post by **Stef** » Tue Feb 13, 2007 10:46 pm

Actually a single word, "Bresenham", would have been sufficient here.

The good thing about Bresenham's algorithm is that it's very well suited to assembly code.
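
For reference, a compact all-direction integer form of Bresenham's line algorithm in C. The inner loop uses only additions, comparisons and one doubling, which is why it maps so directly to assembly. The small demo buffer and the function name are assumptions for the example, and coordinates are expected to lie inside the buffer.

```c
#include <stdint.h>
#include <stdlib.h>

enum { LW = 16, LH = 16 };  /* small demo buffer, 1 byte per pixel */

/* Plot the line from (x0,y0) to (x1,y1), both endpoints included. */
void draw_line(uint8_t fb[LH][LW], int x0, int y0, int x1, int y1,
               uint8_t color)
{
    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;               /* running error term */

    for (;;) {
        fb[y0][x0] = color;
        if (x0 == x1 && y0 == y1) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
}
```

For triangle edges, the same error term can be stepped one scanline at a time to get the left and right end of each horizontal line instead of plotting pixels.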

Fonzie · Post by **Fonzie** » Tue Feb 13, 2007 11:17 pm

If this algo is quite new, would it mean you could get better performance than an actual released 32X game?

I mean, were the 32X games well coded compared to what you can do?
Seeing two coders like Stef and ob1 sharing some knowledge just gets me groovy.

ob1 · Post by **ob1** » Wed Feb 14, 2007 5:54 am

http://en.wikipedia.org/wiki/Bresenham's_line_algorithm

Not so new. It was invented in 1962!!! I guess the developers back in 1993-94 knew it ;)

And as for our skills, wait for my first demo!!! For now, you may think I have some, but it is in theory only :D
I still haven't written a single line of code yet ;)

ob1 · Post by **ob1** » Wed Feb 14, 2007 8:05 am

http://freespace.virgin.net/hugo.elias/ ... x_main.htm
Great resource (yet another)

Stef · Post by **Stef** » Wed Feb 14, 2007 9:24 am

The Bresenham line algo has always been used in drawing code as it's fast and simple. There are a lot of tricks to optimise it here and there. In fact it was just to give ob1 a starting point, as he was searching for stuff on line drawing. There is already tons of good stuff about fast triangle rendering, and we have very little chance of inventing something new about it today.

The 32X packed color mode can help here; it's used in Virtua Racing as well as in Virtua Fighter...

ob1 wrote: http://freespace.virgin.net/hugo.elias/ ... x_main.htm
Great resource (yet another)

Very nice resource indeed!

ob1 · Post by **ob1** » Mon Feb 19, 2007 9:24 pm

Hey, I've been thinking the whole weekend about my rasterization engine. It's getting bigger but, for now, I still have one nightmare. See:

Does anyone know how I could handle z data?
