3D on the Sega Genesis is possible
Moderator: Mask of Destiny
Hi, I created this topic about two months ago to show some 3D demos I had made. Until now I had only shown the starfox3d one; now I am going to show two more demos:
The first is a test of an engine to draw polygons with greater precision. I do not know if this has a technical name, so here it is:
http://www.mediafire.com/download/9d5fj ... _hprec.rar
In this file I included a normal version and version "hprec" for comparison.
Press the Start button to switch between objects (there are 3). The more distant (or smaller) the object is, the greater the difference it shows.
This is another demo of one of the first tests I made of textured polygons:
http://www.mediafire.com/download/ay7vt ... xtured.rar
Up, down, left, right, B and C move the object; if you hold A, the object rotates instead.
I'm really impressed by your starfox demo because showing a 3D cube is something, but navigating in a 3D world is a whole lot more.
In all my attempts to do a 3D engine, I never managed to get rid of fixed-point overflow when polygons start to go off screen.
How do you manage to do proper 3D clipping without the 3D coordinates overflowing? I heard about frustum culling, but it requires lots of math, and still your demo is pretty fast doing all of this on a single 7.6 MHz processor + the tile constraint.
I'm really not good at math >_<
Retro game programming !

 Very interested
 Posts: 3043
 Joined: Thu Nov 30, 2006 9:46 pm
 Location: France  Sevres
Gasega68k>
I tested the different 3D demos, and the only difference I could see between hprec and normal is that hprec is less accurate when you zoom in on the object!
I guess it has something to do with the global coordinate system, allowing far-distance views at the cost of lower accuracy up close.
The textured car is really impressive! Imagine a game made with it (mixing flat and textured objects) on the Sega Genesis X'D
Orion> With the 68000 you are somewhat limited to 16-bit numbers for fixed-point operations, as the native multiplication and division do 16x16=32 and 32:16=16. Fortunately, 16-bit fixed point is not that bad to work with. In SGDK I use a fix16 type where:
 1 bit is used for sign
 9 bits for integer part
 6 bits for fractional part
This way you can have values in the -512 to 512 range with a fractional part (in steps of 1/64). I believe it is a nice compromise =)
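As a rough illustration of the format described above, here is a minimal C sketch of a 16-bit fixed-point type with 6 fractional bits. The names (fx6, FX6_FRAC_BITS, fx6_mul and so on) are made up for this example and are not SGDK's actual definitions; the point is only that one 16x16=32 multiply followed by a shift is enough.

```c
#include <stdint.h>

/* Hypothetical names for illustration only, not SGDK's API. */
typedef int16_t fx6;                 /* 1 sign bit, 9 integer bits, 6 fractional bits */
#define FX6_FRAC_BITS 6
#define FX6_ONE       (1 << FX6_FRAC_BITS)   /* 1.0 is stored as 64 */

/* Convert an integer in roughly -512..511 to fixed point. */
static fx6 fx6_from_int(int v)
{
    return (fx6)(v * FX6_ONE);
}

/* Integer part; assumes >> on a negative value is an arithmetic shift (true for GCC). */
static int fx6_to_int(fx6 v)
{
    return v >> FX6_FRAC_BITS;
}

/* 16x16=32 multiply, then drop the extra 6 fractional bits. */
static fx6 fx6_mul(fx6 a, fx6 b)
{
    int32_t p = (int32_t)a * (int32_t)b;
    return (fx6)(p >> FX6_FRAC_BITS);    /* same arithmetic-shift assumption */
}

/* Pre-scale the dividend so the quotient keeps 6 fractional bits (32:16=16 style). */
static fx6 fx6_div(fx6 a, fx6 b)
{
    return (fx6)(((int32_t)a * FX6_ONE) / b);
}
```

With this layout 1.5 is stored as 96 and 2.5 as 160, and fx6_mul(96, 160) gives 240, which is 3.75. Results outside the -512 to 512 range simply wrap, which is the overflow problem Orion describes.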
Last edited by Stef on Tue Feb 11, 2014 10:20 pm, edited 1 time in total.
About the "hprec" demo: well, I imagined that almost no one would see the difference because, as I said, it shows a greater difference when the object is far away or seen from certain angles. The error you mention is there because the edge calculations need more accuracy.
In this one, you will notice the difference (only rotate on the Z axis):
http://www.mediafire.com/download/v17c8 ... hprecb.rar
Orion_ wrote:
I'm really impressed by your starfox demo because showing a 3D cube is something, but navigating in a 3D world is a whole lot more.
In all my attempts to do a 3D engine, I never managed to get rid of fixed-point overflow when polygons start to go off screen.
How do you manage to do proper 3D clipping without the 3D coordinates overflowing? I heard about frustum culling, but it requires lots of math, and still your demo is pretty fast doing all of this on a single 7.6 MHz processor + the tile constraint.
I'm really not good at math >_<
To create a 3D engine, one of the best places I found information was here:
http://www.ffd2.com/fridge/chacking/
The article in Issue #16, "3D for the Masses: Cool 3D World and the Library", is the most complete that I found.

 Very interested
 Posts: 2849
 Joined: Fri Aug 17, 2007 9:33 pm
Stef wrote:
Orion> With the 68000 you are somewhat limited to 16-bit numbers for fixed-point operations, as the native multiplication and division do 16x16=32 and 32:16=16. Fortunately, 16-bit fixed point is not that bad to work with. In SGDK I use a fix16 type where:
 1 bit is used for sign
 9 bits for integer part
 6 bits for fractional part
This way you can have values in the -512 to 512 range with a fractional part (in steps of 1/64). I believe it is a nice compromise =)
That's not bad. I used 16.16 fixed point on my raycast example. I used inline assembly for the multiply:
Code: Select all
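/* fixed_t is assumed here to be a signed 32-bit integer holding a 16.16 value */
static fixed_t FIX_MUL( fixed_t a, fixed_t b )
{
    fixed_t res = 0, c = 0, d = 0, e = 0;
    asm volatile (
        /* take absolute values; %5 = 0xFF if the result is positive, 0 if negative */
        "tst.l %1\n\t"
        "spl %5\n\t"
        "bpl.b 1f\n\t"
        "neg.l %1\n"
        "1:\n\t"
        "tst.l %2\n\t"
        "bpl.b 2f\n\t"
        "not.b %5\n\t"
        "neg.l %2\n"
        "2:\n\t"
        /* split both operands into 16-bit halves */
        "move.w %1,%3\n\t"      /* c.w = a.lo */
        "swap %1\n\t"           /* a.w = a.hi */
        "move.w %2,%4\n\t"      /* d.w = b.lo */
        "move.w %2,%0\n\t"      /* res.w = b.lo */
        "swap %2\n\t"           /* b.w = b.hi */
        /* four 16x16=32 partial products */
        "mulu %3,%0\n\t"        /* res = a.lo * b.lo */
        "mulu %1,%4\n\t"        /* d = a.hi * b.lo */
        "mulu %2,%1\n\t"        /* a = a.hi * b.hi */
        "mulu %3,%2\n\t"        /* b = a.lo * b.hi */
        /* keep only the middle 32 bits of the 64-bit product */
        "swap %1\n\t"
        "move.w #0,%1\n\t"      /* a = (a.hi * b.hi) << 16 */
        "move.w #0,%0\n\t"
        "swap %0\n\t"           /* res = (a.lo * b.lo) >> 16 */
        "add.l %4,%0\n\t"       /* res += a.hi * b.lo */
        "addx.l %2,%0\n\t"      /* res += a.lo * b.hi (X is clear unless the result overflows) */
        "addx.l %1,%0\n\t"      /* res += (a.hi * b.hi) << 16 */
        /* restore the sign */
        "tst.b %5\n\t"
        "bne.b 3f\n\t"
        "neg.l %0\n"
        "3:\n\t"
        : "=d" (res), "=d" (a), "=d" (b), "=d" (c), "=d" (d), "=d" (e)
        : "0" (res), "1" (a), "2" (b), "3" (c), "4" (d), "5" (e)
        : "cc"
    );
    return(res);
}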

Oh, seeing your code I realize that my fix32 multiplication does not work correctly, as it uses the standard 32x32=32 multiplication code when it should use a complete 32x32=64 one :/
16.16 is practical because you can use the fast swap instruction to get the integer part, but the multiplication is still very slow compared to 16-bit fixed point (and definitely too slow for a 3D engine, where you need to do tons of multiplications :p).
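To make the 32x32=32 problem concrete, here is a small C sketch comparing a truncated product with a full 64-bit one. The names fix16_16, mul_narrow, mul_wide and int_part are hypothetical and are not taken from SGDK; the wide version relies on the compiler's 64-bit arithmetic, so it shows the expected result rather than a fast 68000 implementation.

```c
#include <stdint.h>

typedef int32_t fix16_16;   /* hypothetical name: 16 integer bits, 16 fractional bits */

/* 32x32=32: the product wraps modulo 2^32 before the shift, so the
   upper bits (the integer part of the result) are lost. */
static fix16_16 mul_narrow(fix16_16 a, fix16_16 b)
{
    uint32_t p = (uint32_t)a * (uint32_t)b;
    return (fix16_16)(p >> 16);
}

/* 32x32=64: keep the whole product, then take bits 16..47.
   Assumes >> on a negative value is an arithmetic shift (true for GCC). */
static fix16_16 mul_wide(fix16_16 a, fix16_16 b)
{
    int64_t p = (int64_t)a * (int64_t)b;
    return (fix16_16)(p >> 16);
}

/* Integer part of a 16.16 value: the upper word, which is what
   swap brings into the low word on the 68000. */
static int16_t int_part(fix16_16 v)
{
    return (int16_t)(v >> 16);
}
```

For 1.5 * 2.5 (0x18000 * 0x28000), mul_wide returns 0x3C000, which is 3.75, while mul_narrow returns 0xC000, which is 0.75: the integer part of the result is gone.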
Last edited by Stef on Sat Feb 15, 2014 9:35 pm, edited 1 time in total.

It's four multiplies instead of one, so yeah, it increases the load a bit.
In case anyone is wondering how you do larger multiplies with a smaller multiply command, use standard algebra. Consider each letter a word...
AB * CD = ?
Represent this as ((A<<16)+B)*((C<<16)+D)
and now use the FOIL method
= (A<<16)*(C<<16) + (A<<16)*D + B*(C<<16) + B*D
Then pull out the shifts
= ((A*C)<<32) + ((A*D)<<16) + ((B*C)<<16) + (B*D)
That gives a full 64-bit result using 16-bit multiplies. My code was optimized around the realization that you don't need the full 64 bits for a 16.16 * 16.16 -> 16.16 multiply operation; only the middle 32 bits need to be calculated.
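The expansion above can be sketched in portable C as follows. This is an unsigned illustration with made-up function names, not the assembly routine itself: the first function builds the full 64-bit product from four 16x16=32 multiplies, and the second keeps only the middle 32 bits, as a 16.16 multiply needs. Signs would have to be handled separately, as the assembly version does.

```c
#include <stdint.h>

/* Full 32x32=64 product built from four 16x16=32 partial products. */
static uint64_t mul32x32_64(uint32_t x, uint32_t y)
{
    uint32_t a = x >> 16, b = x & 0xFFFFu;   /* x = A:B */
    uint32_t c = y >> 16, d = y & 0xFFFFu;   /* y = C:D */
    uint32_t ac = a * c, ad = a * d, bc = b * c, bd = b * d;

    return ((uint64_t)ac << 32) + ((uint64_t)ad << 16)
         + ((uint64_t)bc << 16) + bd;
}

/* Unsigned 16.16 * 16.16 -> 16.16: only bits 16..47 of the product.
   The low word of B*D is dropped, the high word of A*C is discarded by
   the 32-bit shift, and an out-of-range result wraps. */
static uint32_t mul16_16_mid(uint32_t x, uint32_t y)
{
    uint32_t a = x >> 16, b = x & 0xFFFFu;
    uint32_t c = y >> 16, d = y & 0xFFFFu;

    return ((a * c) << 16) + (a * d) + (b * c) + ((b * d) >> 16);
}
```

Dropping the low word of B*D loses no carry, because the other three terms have nothing in those 16 bits to add it to.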
How about that? http://en.wikipedia.org/wiki/Karatsuba_algorithm
It's very easy and uses only 3 multiply operations.

r57shell wrote:
How about that? http://en.wikipedia.org/wiki/Karatsuba_algorithm
It's very easy and using 3 multiply operations.
Uh... I wouldn't say VERY easy, and there are certain things implied that may not hold on the 68000. From the article:
A more efficient implementation of Karatsuba multiplication can be set as [5]
x y = (b^2 + b)x1y1 − b(x1 − x0)(y1 − y0) + (b + 1)x0y0,
where b is the weight of x1.
This assumes the (b^2 + b), b, and (b + 1) all are simpler to calculate and their implied multiplications in the above equation don't actually require multiplications. See, (b^2 + b)x1y1 is multiplying THREE terms, not two. Ditto for the rest.
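For reference, here is a hedged C sketch of the three-multiply form quoted from the article, assuming b = 2^16 (the thread does not state a value for b), in which case each factor of b is a 16-bit shift. The function name is made up. One catch that matters on the 68000: (x1 - x0) and (y1 - y0) are signed 17-bit values, so the middle product does not fit a single 16x16 mulu or muls. The sketch sidesteps that by using 64-bit C arithmetic, so it shows the algebra rather than a faster routine.

```c
#include <stdint.h>

/* Karatsuba 32x32=64 with three multiplies, assuming b = 2^16. */
static uint64_t karatsuba32(uint32_t x, uint32_t y)
{
    uint32_t x1 = x >> 16, x0 = x & 0xFFFFu;
    uint32_t y1 = y >> 16, y0 = y & 0xFFFFu;

    uint64_t hi = (uint64_t)x1 * y1;          /* x1*y1 */
    uint64_t lo = (uint64_t)x0 * y0;          /* x0*y0 */

    /* Signed 17-bit differences; their product needs more than 32 bits. */
    int64_t dx  = (int64_t)x1 - (int64_t)x0;
    int64_t dy  = (int64_t)y1 - (int64_t)y0;
    int64_t mid = dx * dy;                    /* (x1 - x0)*(y1 - y0) */

    /* x1*y0 + x0*y1 = hi + lo - mid, which is never negative. */
    uint64_t cross = (uint64_t)((int64_t)hi + (int64_t)lo - mid);

    /* x*y = b^2*hi + b*cross + lo, with the multiplications by b done as shifts. */
    return (hi << 32) + (cross << 16) + lo;
}
```

Collecting terms this way is the same as (b^2 + b)x1y1 − b(x1 − x0)(y1 − y0) + (b + 1)x0y0; the saving is one multiply at the cost of extra additions and a wider middle product.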