Untitled 32X Super Scalar Project

Ask anything you want about 32X Mushroom programming.

Moderator: BigEvilCorporation

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Untitled 32X Super Scalar Project

Post by pw_32x » Sat Jan 01, 2022 3:33 pm

Hello,

I've been working on a Sega 32X project since October. Thought I'd post about it here.

It currently looks like this.
[Attachment: gens_L4pwf813QD.png]
I mainly post on Twitter for updates, but I might post longer articles or status updates here.

https://twitter.com/pw_32x

Thanks!
-pw

danibus
Very interested
Posts: 135
Joined: Sat Feb 03, 2018 12:41 pm

Re: Untitled 32X Super Scalar Project

Post by danibus » Sat Jan 01, 2022 10:56 pm

I've been reading your posts on Twitter; very interesting. Good luck!

P.S.: put <ironic> or a :D in your tweets when you are being ironic!

P.P.S.: Try replacing the plane with a Ferrari (not flying, of course), add a highway and wow, Out Run 32X :mrgreen:

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Re: Untitled 32X Super Scalar Project

Post by pw_32x » Sun Jan 02, 2022 1:04 am

Super easy!

Almost arcade perfect! </ironic>
[Attachment: Fusion_Eqtk7GysYV.png]

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Re: Untitled 32X Super Scalar Project

Post by pw_32x » Sun Jan 02, 2022 1:07 am

You make me want to create a driving game. Maybe later! First I want to concentrate on whatever the plane game becomes.

saxman
Interested
Posts: 19
Joined: Mon Sep 15, 2008 6:35 am

Re: Untitled 32X Super Scalar Project

Post by saxman » Sun Jan 02, 2022 5:20 am

I was going through all those tweets. Looks like you're off to a good start. Reminds me a lot of Super Thunder Blade. Looking forward to seeing where this goes.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Re: Untitled 32X Super Scalar Project

Post by Chilly Willy » Sun Jan 02, 2022 3:39 pm

Looks good. I love the post on twitter where you say the FPS counter is totally accurate while it's showing 65535 FPS. :D

On the timer issue... the FRT must be used to support a certain revision of buggy SH2 processors that made it into early runs of the 32X. If you want support of all 32X models, you need a unified interrupt handler and to use the FRT to bump said int handler. If you look at the crt0.s from Doom 32X Resurrection, you'll see the latest code I came up with for proper handling of those buggy processors, along with handling interrupts for the FRT, DMA, and the WDT. Since the FRT is used in bumping interrupts on those buggy processors, we used the watchdog timer for high resolution timing. It works rather well on real hardware and in Fusion.

Remember that the 32X code we did in D32XR is all MIT license, so it's no problem being used on any type of project, from closed source to GPL. I always make my example code MIT so that it can help as many people as possible.

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Re: Untitled 32X Super Scalar Project

Post by pw_32x » Sun Jan 02, 2022 4:06 pm

Chilly Willy wrote:
Sun Jan 02, 2022 3:39 pm
Remember that the 32X code we did in D32XR is all MIT license, so it's no problem being used on any type of project, from closed source to GPL. I always make my example code MIT so that it can help as many people as possible.
I will definitely check that out. Thanks so much!

saxman
Interested
Posts: 19
Joined: Mon Sep 15, 2008 6:35 am

Re: Untitled 32X Super Scalar Project

Post by saxman » Sun Jan 02, 2022 5:57 pm

Is that airplane polygon-based, or just a bunch of sprites at different angles? Looks spritey, but angle changes are very fluid like polygons

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Re: Untitled 32X Super Scalar Project

Post by pw_32x » Sun Jan 02, 2022 6:04 pm

The plane is made out of sprites, yep. They're rendered from a 3d model I made in Blender.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Re: Untitled 32X Super Scalar Project

Post by Chilly Willy » Mon Jan 03, 2022 2:59 pm

Your animation of the plane is very smooth. Me likey!

Just wanted to add, picodrive also supports the WDT, but gives larger values for the times than Fusion. I'd guess it's not taking into account the system clock divisor you can set for the WDT. I'll have to check the code on that to see about a fix. Fortunately, picodrive is open source. Gotta love projects that are open source. :D
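
To make the divisor point concrete, here is a minimal sketch of turning timer ticks into microseconds. It assumes an SH2 clock of about 23.01 MHz (NTSC 32X) and a counter that advances once every `divisor` CPU cycles. The function name and the constant are illustrative only; they are not taken from D32XR or picodrive.

```c
#include <stdint.h>

/* Assumed SH2 clock on an NTSC 32X (about 23.01 MHz); adjust for your target. */
#define SH2_CLOCK_HZ 23011360u

/* Convert elapsed timer ticks to microseconds.
 * Hypothetical helper: the timer is assumed to count once per `divisor`
 * CPU cycles, so leaving the divisor out skews every reading by that factor. */
static uint64_t timer_ticks_to_us(uint32_t ticks, uint32_t divisor)
{
    uint64_t cycles  = (uint64_t)ticks * divisor;
    uint64_t whole_s = cycles / SH2_CLOCK_HZ;   /* full seconds        */
    uint64_t rem     = cycles % SH2_CLOCK_HZ;   /* leftover CPU cycles */

    /* Splitting seconds and remainder keeps the multiply from overflowing. */
    return whole_s * 1000000u + (rem * 1000000u) / SH2_CLOCK_HZ;
}
```

With the same tick count, doubling the divisor doubles the reported time, which is the kind of mismatch you would expect to see between two emulators if one of them ignores the divisor setting.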

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Re: Untitled 32X Super Scalar Project

Post by ob1 » Mon Jan 03, 2022 6:05 pm

Chilly Willy wrote:
Sun Jan 02, 2022 3:39 pm
I always make my example code MIT so that it can help as many people as possible.
Slow clap

Thank you Sir!!

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Re: Untitled 32X Super Scalar Project

Post by pw_32x » Wed Jan 05, 2022 12:08 am

I posted this in another thread, but I thought it useful to add it here.

Re: performance
In my 32X project, in a frame that looks like this:
[Attachment: gens_gzWgqbV7sg.png]
There are
- a sky, horizon, and ground
- several dozen trees
- a dozen clouds
- the player
- five spheres
- five shadows for the spheres

According to the stats I'm tracking, I'm pushing about 105,000 to 112,000 pixels a frame, for a little more than 30 fps (32 - 35).

Out of those pixels:
- ~71k are from the hardware fill line function. This is when the sky, horizon and ground are drawn to clear the screen
- ~35k are from drawing sprites, which I'm doing by word (two pixels at a time)

Since the entire screen is 71680 pixels, my rough rule of thumb is that I only get about a screen and a half of pixel bandwidth per frame.
I've got a few ideas to improve this. Hopefully at least one of them will work. :)
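
The "screen and a half" figure can be checked with a few lines of arithmetic. This is a rough sketch that assumes a 320x224 display (which gives the 71680 pixels mentioned above); the helper names are made up for illustration.

```c
#include <stdint.h>

#define SCREEN_W 320u
#define SCREEN_H 224u

/* Pixels in one full screen: 320 * 224 = 71680. */
static uint32_t screen_pixels(void)
{
    return SCREEN_W * SCREEN_H;
}

/* Pixels pushed per frame, expressed in hundredths of a full screen
 * (integer math, rounded down). */
static uint32_t screens_x100(uint32_t pixels_per_frame)
{
    return (pixels_per_frame * 100u) / screen_pixels();
}
```

For 105,000 and 112,000 pixels per frame this works out to roughly 1.46 and 1.56 screens.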

Things like:
- don't erase the entire screen, just dirty rectangles. If I'm wiping 71k pixels for only 35k of sprites, it just might be worth it.
- look at assembly for the drawing routines
- split rendering across both CPUs? One erases, one draws? No idea if splitting drawing chores is a good idea. Haven't even attempted to use the second CPU yet.
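
A minimal sketch of the dirty-rectangle idea, assuming an 8-bit, 320x224 framebuffer and a single erase colour. All names here are hypothetical, and a real version would redraw the sky/horizon/ground for each rectangle rather than fill it with one colour.

```c
#include <stdint.h>

#define SCREEN_W  320
#define SCREEN_H  224
#define MAX_DIRTY 64

typedef struct { int16_t x, y, w, h; } dirty_rect_t;

static dirty_rect_t dirty[MAX_DIRTY];
static int dirty_count;

/* Record the area a sprite covered this frame, clipped to the screen.
 * Returns 0 if the rectangle is fully off-screen or the list is full. */
static int dirty_add(int x, int y, int w, int h)
{
    if (x < 0) { w += x; x = 0; }
    if (y < 0) { h += y; y = 0; }
    if (x + w > SCREEN_W) w = SCREEN_W - x;
    if (y + h > SCREEN_H) h = SCREEN_H - y;
    if (w <= 0 || h <= 0 || dirty_count >= MAX_DIRTY)
        return 0;
    dirty[dirty_count++] =
        (dirty_rect_t){ (int16_t)x, (int16_t)y, (int16_t)w, (int16_t)h };
    return 1;
}

/* Erase only the recorded rectangles, then reset the list.
 * Returns the number of pixels written (overlaps are written twice). */
static uint32_t dirty_erase(uint8_t *fb, uint8_t color)
{
    uint32_t written = 0;
    for (int i = 0; i < dirty_count; i++) {
        const dirty_rect_t *r = &dirty[i];
        for (int row = 0; row < r->h; row++) {
            uint8_t *p = fb + (r->y + row) * SCREEN_W + r->x;
            for (int col = 0; col < r->w; col++)
                p[col] = color;
        }
        written += (uint32_t)r->w * (uint32_t)r->h;
    }
    dirty_count = 0;
    return written;
}
```

Comparing the return value of `dirty_erase` against the 71680 pixels of a full clear gives a quick way to tell whether the approach pays off for a given scene.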

danibus
Very interested
Posts: 135
Joined: Sat Feb 03, 2018 12:41 pm

Re: Untitled 32X Super Scalar Project

Post by danibus » Wed Jan 05, 2022 12:41 am

I have no idea, but if I could manage 2 CPUs, I would use one CPU to draw the background, trees, clouds... (+music/FM) and the other one just to draw "sprites" (plane, enemies, spheres fired by the player and enemies) and also handle collisions.

cero
Very interested
Posts: 338
Joined: Mon Nov 30, 2015 1:55 pm

Re: Untitled 32X Super Scalar Project

Post by cero » Wed Jan 05, 2022 11:39 am

Code:

cpu 1 ======------
cpu 2 ------======
How useful ;)

pw_32x
Interested
Posts: 19
Joined: Thu Dec 16, 2021 12:26 am

Re: Untitled 32X Super Scalar Project

Post by pw_32x » Thu Jan 06, 2022 6:34 pm

I tried a few things that Vic suggested in Saxman's 32X thread:
vic wrote: 3) try different optimization settings: generally -Os works better, but also try -O2 to see if that improves performance
I was using -O3 for 31-33fps. I switched to -Os and I get 33-35fps for the same scene. Nice!

I also tried -O2 with those added flags Vic suggested and I get a hang on startup. I basically appended these to the existing list.
This is what I currently have.

Code:

release: SHEXTRA  = -O2 -fomit-frame-pointer -fshort-enums -flto -fuse-linker-plugin -fno-align-loops -fno-align-functions -fno-align-jumps -fno-align-labels


For adding
vic wrote: Generally that means declaring your function with the following attributes:

Code:

__attribute__((section(".data"), aligned(16)))
You can call other functions from functions in SDRAM without any restrictions. Make sure that your interrupt handlers and all callees are in SDRAM as well.
I've set the attribute on my base drawing functions and their callees. I've verified from the symbols file that they're indeed in RAM, as well as the various 32X interrupt handlers. Unfortunately I'm having trouble seeing any performance difference. I seriously doubt my stuff was super optimized before! :) So I wonder what's up with that.
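
For anyone following along, this is roughly what applying that attribute to a function looks like. It is only a sketch: the macro name `ATTR_SDRAM_CODE` and the `fill_words` function are hypothetical, and the `__sh__` check assumes GCC's usual predefined macro for SH targets. On other targets the macro expands to nothing so the same file still builds and runs on a PC.

```c
#include <stdint.h>

/* Hypothetical macro: on an SH target, place the function in .data
 * (which ends up in SDRAM, per the quoted advice) and align it to
 * 16 bytes. Elsewhere it is a no-op. */
#if defined(__sh__)
#define ATTR_SDRAM_CODE __attribute__((section(".data"), aligned(16)))
#else
#define ATTR_SDRAM_CODE
#endif

/* Fill `count` 16-bit words (two 8-bit pixels each) with one value. */
ATTR_SDRAM_CODE
void fill_words(uint16_t *dst, uint16_t value, int count)
{
    while (count-- > 0)
        *dst++ = value;
}
```

Checking the linker map or symbols file, as described above, is still the way to confirm where the function actually ended up.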
Quote: Half screen for tiles, half clipped rectangle for sprites. The former caches better, the latter ensures that both CPUs will draw an equal amount of pixels, regardless of the sprite's scale or size.
Is that left-right halves or top-bottom halves?
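
One possible reading of "half clipped rectangle", sketched below, is to clip the sprite to the screen first and then split the visible part into top and bottom halves, one per CPU. The top/bottom choice is an assumption made purely for illustration (the question above is still open), and all names are hypothetical.

```c
typedef struct { int x, y, w, h; } sprite_rect_t;

/* Clip a sprite rectangle to the screen in place.
 * Returns 0 if nothing is visible. */
static int clip_sprite(sprite_rect_t *r, int screen_w, int screen_h)
{
    if (r->x < 0) { r->w += r->x; r->x = 0; }
    if (r->y < 0) { r->h += r->y; r->y = 0; }
    if (r->x + r->w > screen_w) r->w = screen_w - r->x;
    if (r->y + r->h > screen_h) r->h = screen_h - r->y;
    return r->w > 0 && r->h > 0;
}

/* Split an already clipped rectangle into top and bottom halves.
 * With an odd height the top half gets the extra row. */
static void split_rows(const sprite_rect_t *r,
                       sprite_rect_t *top, sprite_rect_t *bottom)
{
    *top = *r;
    *bottom = *r;
    top->h    = (r->h + 1) / 2;
    bottom->y = r->y + top->h;
    bottom->h = r->h - top->h;
}
```

Because the split happens after clipping, a sprite that is mostly off-screen still gives each CPU about the same number of visible pixels.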

RE: DMA stuff
Quote: You can do it at any time, not necessarily during vblank, e.g. while the game logic is executing. It's just that setting up DMA transfers for each asset and handling the interrupt is going to take some cycles, probably negating the potential win. You'd probably be better off allocating a LRU cache in SDRAM and copying stuff on the fly using the CPU right before the draw call. Doom 32X Resurrection uses a similar approach.
One challenge that's always in the back of my mind is how I'm going to pull off screen rotation/tilt. I don't think I can do software sprite rotation fast enough and I have doubts I'll be able to fit all the sprites and their tilted versions in ram. (most of the sprites are asymmetrical so mirroring to save ram doesn't work). So I wonder if loading a new set of rotated sprites per frame is close enough to feasible.
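
As a rough illustration of the LRU cache suggestion, here is a tiny fixed-slot cache that copies an asset into RAM on a miss and evicts the least recently used slot. This is a sketch under stated assumptions, not the Doom 32X Resurrection code: the slot count, slot size, and all names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 4
#define SLOT_BYTES  1024

typedef struct {
    int      asset_id;    /* -1 while the slot is empty          */
    uint32_t last_used;   /* value of use_clock at the last hit  */
    uint8_t  data[SLOT_BYTES];
} cache_slot_t;

static cache_slot_t slots[CACHE_SLOTS];
static uint32_t use_clock;

static void cache_init(void)
{
    for (int i = 0; i < CACHE_SLOTS; i++) {
        slots[i].asset_id  = -1;
        slots[i].last_used = 0;
    }
    use_clock = 0;
}

/* Return the cached copy of an asset, copying it from `src` on a miss.
 * `*was_hit` reports whether a copy was avoided. Returns NULL if the
 * asset does not fit in a slot. */
static const uint8_t *cache_get(int asset_id, const uint8_t *src,
                                size_t size, int *was_hit)
{
    int victim = 0;

    if (size > SLOT_BYTES)
        return NULL;

    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (slots[i].asset_id == asset_id) {
            slots[i].last_used = ++use_clock;
            *was_hit = 1;
            return slots[i].data;
        }
        if (slots[i].last_used < slots[victim].last_used)
            victim = i;   /* oldest slot seen so far */
    }

    /* Miss: overwrite the least recently used slot with a CPU copy. */
    memcpy(slots[victim].data, src, size);
    slots[victim].asset_id  = asset_id;
    slots[victim].last_used = ++use_clock;
    *was_hit = 0;
    return slots[victim].data;
}
```

Calling `cache_get` right before each draw call would keep the tilt angles in current use resident, while rarely used ones get evicted and recopied on demand.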
