Super VDP

Ask anything you want about 32X Mushroom programming.

Moderator: BigEvilCorporation

TotOOntHeMooN
Interested
Posts: 38
Joined: Sun Jun 01, 2008 1:12 pm
Location: Lyon, France

Post by TotOOntHeMooN » Sat May 16, 2009 5:33 pm

In fact, we just have to use the AND operator.

Code:

0 : skip
1 : write

Sprite Mask    Screen Mask    Merged Mask
00000000       01010000       00000000
01111110       00001111       00001110
01111110       10100010       00100010
00000000       01101100       00000000
The reason for drawing "front to back" was to avoid writing hidden pixels.
Sprites hide other sprites, the B-plane and the A-plane, ...
Why draw the A-plane first if you can't see it in the end? ;)

We could also give the end "user" access to the screen mask,
so that he can initialize it according to what the Mega Drive displays (OSD, menus, ...).
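
The merge in the table above can be sketched in C. This is only an illustration: the names `merge_mask` and `close_written` are made up here, one byte is assumed to hold the mask for 8 pixels, and the step that clears written bits from the screen mask is an assumption about how front-to-back drawing would keep the mask up to date, not something stated in the post.

```c
#include <stdint.h>

/* Bit convention from the post: 0 = skip, 1 = write.
   One byte is assumed to hold the mask for 8 pixels. */

/* Pixels of the sprite that are still allowed to reach the screen. */
static uint8_t merge_mask(uint8_t sprite_mask, uint8_t screen_mask)
{
    return sprite_mask & screen_mask;
}

/* Assumption (not in the post): after a layer is drawn, the pixels it
   wrote are closed in the screen mask so that lower layers skip them. */
static uint8_t close_written(uint8_t screen_mask, uint8_t merged_mask)
{
    return screen_mask & (uint8_t)~merged_mask;
}
```

With the second row of the table, `merge_mask(0x7E, 0x0F)` gives `0x0E` (00001110), and `close_written(0x0F, 0x0E)` leaves `0x01` open for the layers underneath.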

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Sat May 16, 2009 9:09 pm

TotOOntHeMooN wrote: In fact, we just have to use the AND operator.

Code:

0 : skip
1 : write

Sprite Mask    Screen Mask    Merged Mask
00000000       01010000       00000000
01111110       00001111       00001110
01111110       10100010       00100010
00000000       01101100       00000000
The reason for drawing "front to back" was to avoid writing hidden pixels.
Sprites hide other sprites, the B-plane and the A-plane, ...
Why draw the A-plane first if you can't see it in the end? ;)
Let's take the extreme example and say the lowest plane is solid and the plane above it is a checkerboard pattern, so every other pixel is transparent. If you were drawing individual pixels, then making an effort to avoid working on pixels that are already set would be worth it. However, if you write multiple pixels using a mask, it's not... unless after masking the result is zero for ALL pixels... which it probably won't be much of the time. As mentioned before, you're wasting time doing the screen mask test when you're normally just going to draw anyway.
We could also give the end "user" access to the screen mask,
so that he can initialize it according to what the Mega Drive displays (OSD, menus, ...).
OSD or menus will simply be overwriting the display. I can't think of an "OSD" that is UNDER the graphics on the screen. Same for menus. They're OVER the graphics so the user can read them.

Perhaps you meant that the user would initialize the screen mask so that areas covered by the OSD or menus would be cleared at the start, so they aren't drawn by the game.

Basically, a screen mask would only be worth it if the majority of the screen is covered by the topmost layer of graphics, or the frame buffer is incredibly slow. Anytime a significant amount of lower layers are drawn, it probably will COST more time than it saves, given the frame buffer is (reasonably) fast. That's why I was asking about the cycle timing of the overwrite region - if it's the same as the frame buffer (3 to 5 cycles), then it's better to simply use it than to try to find ways around it. If the overwrite region were something like 5 best to 10 worst, then it might be worth finding ways to avoid drawing pixels. If it were 10 or more, you would definitely be using some other method of drawing.

Your idea of using the screen mask to avoid drawing lower layers seems to come from portals: you avoid drawing areas farther away because the time spent computing those areas (in 3D) makes drawing those pixels expensive. This is a 2D engine, not 3D. It's not nearly as expensive unless you make the actual drawing of the cells more complex (like having alpha blending or lighting or similar things).
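
The trade-off described above can be put as a rough break-even check. This is a hypothetical cost model, not a measurement: `mask_saves_time` and its parameters are invented for illustration, costs are in cycles per group of pixels, and only the 3 to 5 cycle frame buffer figure comes from the thread.

```c
/* Hypothetical cost model, in cycles per group of pixels.
   write_cost:  cycles to write one group to the frame buffer.
   test_cost:   extra cycles to load and test the screen mask per group.
   covered_pct: percentage of groups fully hidden by upper layers.
   Returns 1 if testing the mask is cheaper than always drawing. */
static int mask_saves_time(int write_cost, int test_cost, int covered_pct)
{
    /* Cost of 100 groups when every group is simply written. */
    int direct = 100 * write_cost;
    /* Cost of 100 groups when each is tested and only visible ones written. */
    int masked = 100 * test_cost + (100 - covered_pct) * write_cost;
    return masked < direct;
}
```

With a 4-cycle write and a 2-cycle test, the mask only pays off once more than half of the groups are fully hidden; with a 10-cycle write it already pays off above 20% coverage.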

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Registers to FrameBuffer

Post by ob1 » Wed Mar 03, 2010 3:44 pm

All Right.
I have one tile line (ie, 8 pixels from a tile) in 2 registers (for example, R1 and R2), and I want to put them in FrameBuffer (the address where I want to put them is, let's say, in R8).
This is needed if I intend to do some scrolling.
Here's how I can do it, depending on the address modulo:

Modulo 0

Code:

R1	0123
R2	4567
R8	0123
R9	4567
	ADD	#8,R8
	MOV.L	R2,@-R8
	MOV.L	R1,@-R8		; 3 cycles	(2 Writes)
Modulo 1

Code:

R1	0123
R2	4567
R8	X012
R9	3456
R10	7XXX
	ADD	#8,R8
	MOV.B	R2,@R8		; R10 = 7XXX
	SHLR8	R2		; R2 = -456
	MOV.L	R2,@-R8		; R9 = -456
	MOV.B	R1,@R8		; R9 = 3456
	SHLR8	R1		; R1 = -012
	MOV.W	R1,@-R8		; R8 = XX12
	SHLR16	R1		; R1 = ---0
	MOV.B	R1,@-R8		; R8 = X012	9 cycles (5 Write)
Modulo 2

Code:

R1	0123
R2	4567
R8	XX01
R9	2345
R10	67XX
	ADD	#8,R8
	MOV.W	R2,@R8		; R10 = 67XX
	XTRCT	R1,R2		; R2 = 2345
	MOV.L	R2,@-R8		; R9 = 2345
	SHLR16	R1		; R1 = --01
	MOV.W	R1,@-R8		; R8 = XX01	6 cycles (3 Write)
Modulo 3 (this one is a bitch)

Code:

R1	0123
R2	4567
R8	XXX0
R9	1234
R10	567X
	ADD	#10,R8
	MOV.B	R2,@R8		; R10 = XX7X
	SHLR8	R2		; R2 = -456
	MOV.W	R2,@-R8		; R10 = 567X
	SHLR16	R2		; R2 = ---4
	MOV.B	R2,@-R8		; R9 = XXX4
	MOV.B	R1,@-R8		; R9 = XX34
	SHLR8	R1		; R1 = -012
	MOV.W	R1,@-R8		; R9 = 1234
	SHLR16	R1		; R1 = ---0
	MOV.B	R1,@-R8		; R8 = XXX0	11 cycles (6 Write)
For a modulo between 4 and 7, we just have to change the first offset.

Here we have ~70k CPU cycles and ~40k FB accesses.
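
The modulo 1, 2 and 3 sequences above can be checked with a small C model that replays each instruction on a byte array. It is a sketch under assumptions: R8 is taken to hold the long-aligned base address on entry, memory is big-endian as on the SH-2, and the frame buffer's special handling of zero bytes is not modelled (the test pixels are non-zero). All function names are invented for this example.

```c
#include <stdint.h>

/* Big-endian stores into a byte array standing in for the frame buffer. */
static void put_b(uint8_t *m, uint32_t a, uint32_t v) { m[a] = (uint8_t)v; }
static void put_w(uint8_t *m, uint32_t a, uint32_t v)
{
    m[a] = (uint8_t)(v >> 8);
    m[a + 1] = (uint8_t)v;
}
static void put_l(uint8_t *m, uint32_t a, uint32_t v)
{
    put_w(m, a, v >> 16);
    put_w(m, a + 2, v);
}

/* r1 = pixels 0-3, r2 = pixels 4-7, r8 = long-aligned base (assumed). */
static void store_mod1(uint8_t *m, uint32_t r8, uint32_t r1, uint32_t r2)
{
    r8 += 8;                       /* ADD    #8,R8                */
    put_b(m, r8, r2);              /* MOV.B  R2,@R8    pixel 7    */
    r2 >>= 8;                      /* SHLR8  R2                   */
    r8 -= 4; put_l(m, r8, r2);     /* MOV.L  R2,@-R8   -,4,5,6    */
    put_b(m, r8, r1);              /* MOV.B  R1,@R8    pixel 3    */
    r1 >>= 8;                      /* SHLR8  R1                   */
    r8 -= 2; put_w(m, r8, r1);     /* MOV.W  R1,@-R8   pixels 1,2 */
    r1 >>= 16;                     /* SHLR16 R1                   */
    r8 -= 1; put_b(m, r8, r1);     /* MOV.B  R1,@-R8   pixel 0    */
}

static void store_mod2(uint8_t *m, uint32_t r8, uint32_t r1, uint32_t r2)
{
    r8 += 8;                       /* ADD    #8,R8                */
    put_w(m, r8, r2);              /* MOV.W  R2,@R8    pixels 6,7 */
    r2 = (r1 << 16) | (r2 >> 16);  /* XTRCT  R1,R2     2,3,4,5    */
    r8 -= 4; put_l(m, r8, r2);     /* MOV.L  R2,@-R8              */
    r1 >>= 16;                     /* SHLR16 R1                   */
    r8 -= 2; put_w(m, r8, r1);     /* MOV.W  R1,@-R8   pixels 0,1 */
}

static void store_mod3(uint8_t *m, uint32_t r8, uint32_t r1, uint32_t r2)
{
    r8 += 10;                      /* ADD    #10,R8               */
    put_b(m, r8, r2);              /* MOV.B  R2,@R8    pixel 7    */
    r2 >>= 8;                      /* SHLR8  R2                   */
    r8 -= 2; put_w(m, r8, r2);     /* MOV.W  R2,@-R8   pixels 5,6 */
    r2 >>= 16;                     /* SHLR16 R2                   */
    r8 -= 1; put_b(m, r8, r2);     /* MOV.B  R2,@-R8   pixel 4    */
    r8 -= 1; put_b(m, r8, r1);     /* MOV.B  R1,@-R8   pixel 3    */
    r1 >>= 8;                      /* SHLR8  R1                   */
    r8 -= 2; put_w(m, r8, r1);     /* MOV.W  R1,@-R8   pixels 1,2 */
    r1 >>= 16;                     /* SHLR16 R1                   */
    r8 -= 1; put_b(m, r8, r1);     /* MOV.B  R1,@-R8   pixel 0    */
}

/* Returns 1 if m[first..first+7] holds the test pixels 0x10..0x17. */
static int holds_line(const uint8_t *m, uint32_t first)
{
    for (uint32_t i = 0; i < 8; i++)
        if ((uint32_t)m[first + i] != 0x10u + i)
            return 0;
    return 1;
}
```

Calling `store_mod2(fb, 0, 0x10111213, 0x14151617)` on a zeroed buffer leaves the eight values in `fb[2]` to `fb[9]` and touches nothing else, which is what `holds_line(fb, 2)` checks.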
Last edited by ob1 on Tue Apr 06, 2010 8:39 am, edited 1 time in total.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Wed Mar 03, 2010 5:20 pm

One thing to remember about the frame buffer and bytes - byte writes ALWAYS act like they are to the overwrite area, even when they aren't. Writing a byte of 0x00 to the frame buffer is always ignored. I ran into that interesting tidbit with Wolf32X. Of course, that might be advantageous for your SuperVDP code if you plan things just right. :D

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Thu Mar 04, 2010 8:11 am

I hadn't thought about that, but I didn't intend to write any 0x00 to the background anyway.

notaz
Very interested
Posts: 193
Joined: Mon Feb 04, 2008 11:58 pm
Location: Lithuania

Post by notaz » Thu Mar 04, 2010 11:46 am

Chilly Willy wrote:One thing to remember about the frame buffer and bytes - byte writes ALWAYS act like they are to the overwrite area, even when they aren't. Writing a byte of 0x00 to the frame buffer is always ignored.
True, several games even rely on this.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Thu Mar 04, 2010 5:22 pm

It's odd, but the "frame buffer" ignores BYTE WRITES that are 0, while the "over-write buffer" ignores WORD WRITES that are 0. Knowing that could be handy for certain games.

I think they did the first because they wanted to be able to use the Z80 for graphics. The frame buffer is one of the areas the Z80 can write, but only as bytes. So you wouldn't be able to do overlay style graphics unless they ignored 0 bytes.
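
To make the "overlay style" point concrete, here is a minimal C model of the byte-write rule as described in this thread. It models only that one rule (a 0x00 byte written to the frame buffer is ignored); it does not model the overwrite region or word and long writes, and the names `fb_store_byte` and `overlay_row` are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte store to the frame buffer as described in the thread:
   a value of 0x00 is ignored, any other value is stored. */
static void fb_store_byte(uint8_t *fb, size_t addr, uint8_t value)
{
    if (value != 0)
        fb[addr] = value;
}

/* Overlay a row of pixels onto what is already in the buffer using
   byte stores only; colour 0 acts as transparent with no explicit
   test in the drawing loop. */
static void overlay_row(uint8_t *fb, size_t addr, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fb_store_byte(fb, addr + i, src[i]);
}
```

Overlaying the row `{0, 5, 0, 7}` on a background of `{9, 9, 9, 9}` leaves `{9, 5, 9, 7}`: the zero pixels let the background show through.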

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Sat Jan 15, 2011 7:58 pm

PAL
240 lines.
2 planes.
28 fps.

Now fixing ...

Edit : scrollable planes.
Last edited by ob1 on Sun Jan 16, 2011 7:53 am, edited 1 time in total.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Sat Jan 15, 2011 9:50 pm

ob1 wrote:PAL
240 lines.
2 planes
28 fps.

Now fixing ...
Awesome! So an NTSC screen, being shorter, may be able to handle 30 FPS. I think 30 FPS in NTSC and 25 in PAL is plenty fine for almost any platformer or shooter.
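
A back-of-the-envelope version of that estimate, as a sketch: it assumes an NTSC display of 224 lines (the thread only says "shorter") and that render time is proportional to the number of lines drawn. `scale_fps` is a made-up helper name.

```c
/* Scale a measured frame rate from one screen height to another,
   assuming render time is proportional to the number of lines drawn.
   Integer arithmetic; the result is rounded down. */
static int scale_fps(int fps, int lines_from, int lines_to)
{
    return fps * lines_from / lines_to;
}
```

Under those assumptions, `scale_fps(28, 240, 224)` gives 30.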

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Sun Jan 16, 2011 9:56 pm

NTSC : 31fps

Of course, everything would be faster without scrolling.
And it's only on GENS. On real hardware, it'd be slower.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sun Jan 16, 2011 10:46 pm

ob1 wrote:NTSC : 31fps

Of course, everything would be faster without scrolling.
And it's only on GENS. On real hardware, it'd be slower.
Gens doesn't emulate RAM access timing. Did you test on Kega Fusion? You should get results closer to the real hardware :)
By the way, did you see the Pitfall 32X game? It also does a sort of "super VDP", but only for 1 plane, and it displays at 30 FPS.

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Jan 17, 2011 5:40 am

Stef wrote: Did you test on Kega Fusion?
I just did and sadly, it won't run. I guess I trust the 32X too much. Sadly, there isn't any debug mode in Kega and I can't easily see what's wrong. Maybe one day ...
Stef wrote: By the way, did you see the Pitfall 32X game? It also does a sort of "super VDP", but only for 1 plane, and it displays at 30 FPS.
How did you measure 30 fps?
Anyway, as soon as I get a stable version, I will post mine ;)

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Jan 17, 2011 9:07 am

ob1 wrote:
Stef wrote: Did you test on Kega Fusion?
I just did and sadly, it won't run. I guess I trust the 32X too much. Sadly, there isn't any debug mode in Kega and I can't easily see what's wrong. Maybe one day ...
Stef wrote: By the way, did you see the Pitfall 32X game? It also does a sort of "super VDP", but only for 1 plane, and it displays at 30 FPS.
How did you measure 30 fps?
Anyway, as soon as I get a stable version, I will post mine ;)
Too bad for Kega, I guess it doesn't work on real hardware either :-/
As for Pitfall 32X, I can only say it runs at 30 FPS or more but below 60 FPS. I tested it on an NTSC 32X and you can easily see that the 32X scrolling plane isn't refreshed at 60 FPS but only at 30...

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Jan 17, 2011 4:31 pm

Stef wrote: Too bad for Kega, I guess it doesn't work on real hardware either :-/
You're right.
For now, I can't plan on coding with Kega, since it hasn't got a debugger.
But I'll definitely try to fix this issue as soon as my project is stable.

BTW, PAL : 31fps, NTSC : 34fps

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Jan 17, 2011 6:03 pm

Yep, without a debugger you can't easily find what is wrong with it...
Nice improvement over the initial FPS by the way, I think this is enough to get a stable 25 FPS in PAL or 30 FPS in NTSC :)
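
Why 31 and 34 fps end up as "stable" 25 and 30 can be sketched like this, assuming the buffer swap waits for the vertical blank (so a frame always takes a whole number of refresh periods) and refresh rates of 50 Hz for PAL and 60 Hz for NTSC. `locked_fps` is a hypothetical helper, not part of any posted code.

```c
/* Highest frame rate of the form refresh/n (n = 1, 2, 3, ...) that does
   not exceed the measured free-running rate. Assumes one buffer swap
   per vertical blank and measured_fps > 0. */
static int locked_fps(int refresh_hz, int measured_fps)
{
    /* Refresh periods needed per frame, rounded up. */
    int n = (refresh_hz + measured_fps - 1) / measured_fps;
    return refresh_hz / n;
}
```

`locked_fps(50, 31)` gives 25 and `locked_fps(60, 34)` gives 30; reaching the full refresh rate would need a free-running rate of at least 50 or 60 fps.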
