3D on the Sega Genesis is possible

ob1 · Post by **ob1** » Sun Jan 05, 2014 11:58 pm

What you guys have done is incredible!

Congratulations !!!!

Edit : it's not incredible, it's friggin' magic. No one would have thought it would be possible, and you made it !
Fookin' awesome !!!

kubilus1 · Post by **kubilus1** » Mon Jan 06, 2014 10:58 pm

Is it possible to repost the VGZ of the Corneria theme? I'm working on the VGM driver and want to throw as many things at it as I can. Thanks!

gasega68k · Post by **gasega68k** » Fri Feb 07, 2014 9:28 pm

Hi, I created this topic about two months ago to show some 3D demos I had made. Until now I had only shown the starfox3d demo; now I'm going to show two more:

The first is a test of an engine to draw polygons with greater precision. I do not know if this has a technical name, so here it is:
http://www.mediafire.com/download/9d5fj ... _hprec.rar
In this file I included a normal version and an "hprec" version for comparison.
The Start button switches between objects (there are 3). The more distant (or smaller) the object is, the greater the difference between the two versions.

This is another demo of one of the first tests I made of textured polygons:
http://www.mediafire.com/download/ay7vt ... xtured.rar
Use up, down, left, right, B and C to move the object; while you hold A, the same controls rotate it instead.

Orion_ · Post by **Orion_** » Sat Feb 08, 2014 11:14 am

I'm really impressed by your starfox demo because showing a 3D cube is one thing, but navigating in a 3D world is a whole lot more.
In all my attempts to write a 3D engine, I never managed to get rid of fixed-point overflow when polygons start to go off screen.
How do you manage to do proper 3D clipping without the 3D coordinates overflowing? I heard about frustum culling, but it requires lots of math, and still your demo is pretty fast doing all of this on a single 7.6 MHz processor + the tile constraint.
I'm really not good at math >_<

Stef · Post by **Stef** » Sat Feb 08, 2014 11:38 am

Gasega68k>
I tried the different 3D tests, and the only difference I could see between hprec and normal is that hprec is less accurate when you zoom in on the object!
I guess it has something to do with the global coordinate system, allowing far-distance views but at the cost of lower accuracy up close.
The textured car is really impressive! Imagine a game made with it (mixing flat and textured objects) on the Sega Genesis X'D

Orion> With the 68000 you are more or less limited to 16-bit numbers for fixed-point operations, as the native multiplication and division do 16x16=32 and 32/16=16. Fortunately, 16-bit fixed point is not that bad to work with. In SGDK I use a fix16 type where:
- 1 bit is used for the sign
- 9 bits for the integer part
- 6 bits for the fractional part

This way you get values in the range -512 to just under +512, in steps of 1/64. I believe it's a nice compromise =)
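To make the layout above concrete, here is a minimal C sketch of a 1 sign + 9 integer + 6 fractional bit format. The names (`q9_6`, `q9_6_mul`, and so on) are hypothetical and chosen for illustration; they are not SGDK's actual API. The sketch assumes the compiler uses an arithmetic right shift for negative values, which is the usual behaviour but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not SGDK's API. */
typedef int16_t q9_6;                       /* 1 sign + 9 integer + 6 fractional bits */

#define Q9_6_FRAC_BITS 6
#define Q9_6_ONE       (1 << Q9_6_FRAC_BITS)    /* 1.0 is stored as 64 */

/* Convert an integer in [-512, 511] to fixed point. */
static q9_6 q9_6_from_int(int v)
{
    return (q9_6)(v * Q9_6_ONE);
}

/* Integer part (assumes arithmetic right shift for negative values). */
static int q9_6_to_int(q9_6 v)
{
    return v >> Q9_6_FRAC_BITS;
}

/* 16x16 -> 32 multiply, then drop the 6 extra fractional bits. */
static q9_6 q9_6_mul(q9_6 a, q9_6 b)
{
    return (q9_6)(((int32_t)a * b) >> Q9_6_FRAC_BITS);
}

/* 32/16 -> 16 divide: scale the dividend first to keep 6 fractional bits. */
static q9_6 q9_6_div(q9_6 a, q9_6 b)
{
    return (q9_6)(((int32_t)a * Q9_6_ONE) / b);
}
```

The multiply and divide each need only one native 68000 operation plus a shift, which is the reason a 16-bit format is attractive here. The caller is responsible for keeping results inside the representable range.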

gasega68k · Post by **gasega68k** » Tue Feb 11, 2014 9:36 pm

About the "hprec" demo: I expected that almost no one would see the difference because, as I said, it shows up more when the object is far away or viewed from certain angles. The error you mention is there because the edge calculations need more accuracy.

In this one, you will notice the difference (only rotate on the Z axis):

http://www.mediafire.com/download/v17c8 ... hprecb.rar

Orion_ wrote: I'm really impressed by your starfox demo because showing a 3D cube is one thing, but navigating in a 3D world is a whole lot more.
In all my attempts to write a 3D engine, I never managed to get rid of fixed-point overflow when polygons start to go off screen.
How do you manage to do proper 3D clipping without the 3D coordinates overflowing? I heard about frustum culling, but it requires lots of math, and still your demo is pretty fast doing all of this on a single 7.6 MHz processor + the tile constraint.
I'm really not good at math >_<

To create a 3D engine, one of the best places I found information was here:
http://www.ffd2.com/fridge/chacking/
The article "3D for the Masses: Cool 3D World and the Library" in Issue #16 is the most complete one I found.

Chilly Willy · Post by **Chilly Willy** » Sat Feb 15, 2014 2:01 am

Stef wrote: Gasega68k>
I tried the different 3D tests, and the only difference I could see between hprec and normal is that hprec is less accurate when you zoom in on the object!
I guess it has something to do with the global coordinate system, allowing far-distance views but at the cost of lower accuracy up close.
The textured car is really impressive! Imagine a game made with it (mixing flat and textured objects) on the Sega Genesis X'D

Orion> With the 68000 you are more or less limited to 16-bit numbers for fixed-point operations, as the native multiplication and division do 16x16=32 and 32/16=16. Fortunately, 16-bit fixed point is not that bad to work with. In SGDK I use a fix16 type where:
- 1 bit is used for the sign
- 9 bits for the integer part
- 6 bits for the fractional part

This way you get values in the range -512 to just under +512, in steps of 1/64. I believe it's a nice compromise =)

That's not bad. I used 16.16 fixed point on my raycast example. I used inline assembly for the multiply:


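/* fixed_t is assumed to be a signed 32-bit integer holding a 16.16 value. */
static fixed_t FIX_MUL( fixed_t a, fixed_t b )
{
    fixed_t res = 0, c = 0, d = 0, e = 0;
    asm volatile (
        /* take |a| and |b|; e ends up 0xFF if the result is positive, 0 if negative */
        "tst.l %1\n\t"
        "spl %5\n\t"
        "bpl.b 1f\n\t"
        "neg.l %1\n"
        "1:\n\t"
        "tst.l %2\n\t"
        "bpl.b 2f\n\t"
        "not.b %5\n\t"
        "neg.l %2\n"
        "2:\n\t"
        /* split both operands into 16-bit halves */
        "move.w %1,%3\n\t"      /* c   = a_lo */
        "swap %1\n\t"           /* a.w = a_hi */
        "move.w %2,%4\n\t"      /* d   = b_lo */
        "move.w %2,%0\n\t"      /* res = b_lo */
        "swap %2\n\t"           /* b.w = b_hi */
        /* four 16x16 -> 32 partial products */
        "mulu %3,%0\n\t"        /* res = a_lo * b_lo */
        "mulu %1,%4\n\t"        /* d   = a_hi * b_lo */
        "mulu %2,%1\n\t"        /* a   = a_hi * b_hi */
        "mulu %3,%2\n\t"        /* b   = a_lo * b_hi */
        /* keep only the bits that land in the middle 32 bits of the product */
        "swap %1\n\t"
        "move.w #0,%1\n\t"      /* a   = (a_hi * b_hi) << 16 */
        "move.w #0,%0\n\t"
        "swap %0\n\t"           /* res = (a_lo * b_lo) >> 16 */
        "add.l %4,%0\n\t"
        "addx.l %2,%0\n\t"
        "addx.l %1,%0\n\t"
        /* restore the sign */
        "tst.b %5\n\t"
        "bne.b 3f\n\t"
        "neg.l %0\n"
        "3:\n\t"
        : "=d" (res), "=d" (a), "=d" (b), "=d" (c), "=d" (d), "=d" (e)
        : "0" (res), "1" (a), "2" (b), "3" (c), "4" (d), "5" (e)
        : "cc"
    );
    return(res);
}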

That skips some of the work you would do for 32x32->64: the four 16x16 products are still computed, but only the bits that land in the 16.16 result are kept, since we only need 48 bits for fixed point multiplication. So it's slightly faster than a general purpose 32x32 multiplication.

Stef · Post by **Stef** » Sat Feb 15, 2014 11:21 am

Oh, seeing your code I realize that my fix32 multiplication does not work correctly, as it uses the standard 32x32=32 multiplication code when it should use a complete 32x32=64 one :-/
16.16 is practical, as you can use the fast swap instruction to get the integer part, but the multiplication is still very slow compared to 16-bit fixed point (and definitely too slow for use in a 3D engine where you need to do tons of multiplications :p).

Chilly Willy · Post by **Chilly Willy** » Sat Feb 15, 2014 6:32 pm

It's four multiplies instead of one, so yeah, it increases the load a bit.

In case anyone is wondering how you do larger multiplies with a smaller multiply command, use standard algebra. Consider each letter a word...

AB * CD = ?

Represent this as ((A<<16)+B)*((C<<16)+D)
and now use the FOIL method

= (A<<16)*(C<<16) + (A<<16)*D + B*(C<<16) + B*D

Then pull out the shifts

= ((A*C)<<32) + ((A*D)<<16) + ((B*C)<<16) + B*D

That gives a full 64 bit result using 16 bit multiplies. My code was optimized around the realization that you don't need the full 64 bits for a 16.16 * 16.16 -> 16.16 multiply operation; only the middle 32 bits need to be calculated.
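The expansion above can be sketched in portable C, which makes it easy to check on a PC before writing any assembly. This is only an illustration under stated assumptions: `fixed_t` is assumed to be a signed 32-bit 16.16 value, and `mul16` and `fix_mul_sketch` are hypothetical names, with `mul16` standing in for one 68000 `mulu.w`. Like the assembly version, it works on magnitudes and restores the sign at the end, and it does not detect overflow.

```c
#include <stdint.h>

typedef int32_t fixed_t;    /* assumed 16.16 fixed point */

/* Stands in for one mulu.w: unsigned 16x16 -> 32. */
static uint32_t mul16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}

/* 16.16 * 16.16 -> 16.16 using four 16x16 products (hypothetical helper). */
static fixed_t fix_mul_sketch(fixed_t x, fixed_t y)
{
    int negative = (x < 0) != (y < 0);

    /* Magnitudes, computed in unsigned arithmetic. */
    uint32_t ux = (x < 0) ? 0u - (uint32_t)x : (uint32_t)x;
    uint32_t uy = (y < 0) ? 0u - (uint32_t)y : (uint32_t)y;

    uint16_t A = (uint16_t)(ux >> 16), B = (uint16_t)(ux & 0xFFFFu);
    uint16_t C = (uint16_t)(uy >> 16), D = (uint16_t)(uy & 0xFFFFu);

    /* Middle 32 bits of ((A*C)<<32) + ((A*D)<<16) + ((B*C)<<16) + B*D. */
    uint32_t r = (mul16(A, C) << 16)    /* only the low 16 bits of A*C survive */
               + mul16(A, D)
               + mul16(B, C)
               + (mul16(B, D) >> 16);   /* only the high 16 bits of B*D matter */

    return negative ? -(fixed_t)r : (fixed_t)r;
}
```

For example, 1.5 * 2.0 is `fix_mul_sketch(0x00018000, 0x00020000)`, which gives `0x00030000` (3.0).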

M-374 LX · Post by **M-374 LX** » Mon Apr 07, 2014 12:54 pm

Does the demo run at the Genesis' full frame rate (60 FPS in NTSC, 50 in PAL)?

r57shell · Post by **r57shell** » Mon Apr 07, 2014 2:46 pm

How about that? http://en.wikipedia.org/wiki/Karatsuba_algorithm
It's very easy and uses 3 multiply operations.

Chilly Willy · Post by **Chilly Willy** » Tue Apr 08, 2014 4:04 am

r57shell wrote: How about that? http://en.wikipedia.org/wiki/Karatsuba_algorithm
It's very easy and uses 3 multiply operations.

Uh... I wouldn't say VERY easy, and there are certain things implied that may not hold on the 68000. From the article:

A more efficient implementation of Karatsuba multiplication can be set as [5]

x y = (b^2 + b)x1y1 − b(x1 − x0)(y1 − y0) + (b + 1)x0y0,

where b is the weight of x1.

This assumes the (b^2 + b), b, and (b + 1) all are simpler to calculate and their implied multiplications in the above equation don't actually require multiplications. See, (b^2 + b)x1y1 is multiplying THREE terms, not two. Ditto for the rest.

Stef · Post by **Stef** » Tue Apr 08, 2014 7:49 am

Indeed i am definitely not sure that you can get a substantial improvement by using this formula on the 68000. Still that is the type of trick i like to learn

r57shell · Post by **r57shell** » Tue Apr 08, 2014 1:52 pm

Chilly Willy wrote: See, (b^2 + b)x1y1 is multiplying THREE terms, not two. Ditto for the rest.

lol

b is just a shift. In our case (16.16) it is b = (1<<16).
For example, if x1*y1 is in d0, then b*x1*y1 = swap.w d0
So you only have to calculate three products: x0y0, x1y1, and (x1 − x0)(y1 − y0)
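The three-product form can be checked in portable C. The sketch below is an illustration, not tuned 68000 code, and `karatsuba_32x32` is a hypothetical name. It uses the formula quoted from the article with b = 1<<16. One thing it makes visible: the differences (x1 − x0) and (y1 − y0) are signed 17-bit values, so the third product does not fit a single unsigned 16x16 multiply without extra handling, which is part of the shift-and-fixup cost being discussed.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 32x32 -> 64 multiply using three products (hypothetical helper). */
static uint64_t karatsuba_32x32(uint32_t x, uint32_t y)
{
    uint32_t x1 = x >> 16, x0 = x & 0xFFFFu;
    uint32_t y1 = y >> 16, y0 = y & 0xFFFFu;

    uint32_t hi = x1 * y1;                      /* product 1: x1*y1 */
    uint32_t lo = x0 * y0;                      /* product 2: x0*y0 */

    int32_t dx = (int32_t)x1 - (int32_t)x0;     /* signed, up to 17 bits */
    int32_t dy = (int32_t)y1 - (int32_t)y0;
    int64_t diff = (int64_t)dx * dy;            /* product 3: (x1-x0)*(y1-y0) */

    /* x1*y1 + x0*y0 - (x1-x0)*(y1-y0) == x1*y0 + x0*y1, never negative. */
    int64_t cross = (int64_t)hi + (int64_t)lo - diff;
    assert(cross >= 0);

    /* b^2*hi + b*cross + lo, with b = 1<<16. */
    return ((uint64_t)hi << 32) + ((uint64_t)cross << 16) + lo;
}
```

Comparing `karatsuba_32x32(x, y)` against `(uint64_t)x * y` for a range of inputs is a quick way to confirm the identity before deciding whether the saved multiply is worth the extra additions and sign handling in assembly.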

Stef · Post by **Stef** » Tue Apr 08, 2014 4:26 pm

Yeah, got it, but with the shifts and other extra operations I'm still not sure you gain one multiplication (which is 70 cycles). Needs to be tested!