Super VDP

Ask anything you want about 32X Mushroom programming.

Moderator: BigEvilCorporation

Mask of Destiny
Very interested
Posts: 555
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Fri Mar 23, 2007 6:16 pm

About the RAM bus, I always thought it was full 32-bit... I still think it is, else, why would the cartridge access be 3-4 times slower?
I believe access to the cartridge is split evenly between the 68K and the 32X, with the 32X getting one access per 68K bus cycle. The SH-2s run at 3 times the clock of the 68K, but a 68K bus operation takes 4 cycles, for a total of 12 SH-2 cycles for every 68K bus operation. If the SH-2 is only allowed one bus operation per 68K bus operation, it will take the SH-2 up to 12 cycles or so to access the cartridge bus. I believe it was said that DRAM access takes ~3 cycles. 12/3 = 4
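
The arithmetic above can be written out as a small C sketch. The constants are only the figures quoted in this post (3x clock ratio, 4 cycles per 68K bus operation, ~3 cycles per DRAM access), not measured values, and the function names are made up for illustration.

```c
/* Figures quoted in the post above; none of them are measured values. */
enum {
    SH2_CLOCKS_PER_68K_CLOCK = 3,  /* SH-2 runs at 3x the 68K clock */
    M68K_CLOCKS_PER_BUS_OP   = 4,  /* one 68K bus operation takes 4 cycles */
    SH2_CLOCKS_PER_DRAM_OP   = 3   /* "~3 cycles" per DRAM access */
};

/* Worst-case SH-2 cycles per cartridge access, assuming the SH-2 gets
   exactly one cartridge slot per 68K bus operation. */
int sh2_cycles_per_cart_access(void)
{
    return SH2_CLOCKS_PER_68K_CLOCK * M68K_CLOCKS_PER_BUS_OP;
}

/* How many times slower a cartridge access is than a DRAM access. */
int cart_slowdown_vs_dram(void)
{
    return sh2_cycles_per_cart_access() / SH2_CLOCKS_PER_DRAM_OP;
}
```

Under those assumptions the ratio lands in the "3-4 times slower" range mentioned in the quote.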

That said, even with a 16-bit bus, longword moves will still be more efficient than word or byte moves.

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Mar 26, 2007 10:04 am

Hurray !!!
Using this algorithm


currFB = FB + tile_row*8*320 + tile_col*8;
for (row=0;row<8;row++) {
	for (col=0;col<8;col++) {
		*(currFB++) = *(tile++);
	}
	currFB += 312;	// 320 - 8
}
, then unrolling the inner loop and using longwords instead of bytes


R2 = tile
R3 = currFB
repeat(8) {
	mov.l	@R2+,R0
	mov.l	R0,@R3
	mov.l	@R2+,R0
	mov.l	R0,@(4,R3)
	mov.l	VALUE_320,R4
	add	R4,R3
}
, I achieve up to 2500 tiles / CPU !!!
It will be slower on a real 32X, and the pipeline will stall, but there is still room for improvement.
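
For reference, here is a rough C equivalent of the unrolled longword copy, as a minimal sketch only. It assumes a 320-byte-wide, one-byte-per-pixel buffer and 64-byte tiles as in the pseudocode above; `blit_tile_long` is a name made up for this example, and `memcpy` with a constant size of 4 stands in for `mov.l` so the code stays portable.

```c
#include <stdint.h>
#include <string.h>

#define FB_WIDTH  320   /* bytes per framebuffer line (8-bit pixels) */
#define TILE_SIZE 8     /* 8x8 pixel tiles, 64 bytes each */

/* Copy one 8x8 tile into a byte-per-pixel framebuffer, two longwords per row.
   memcpy with a constant size of 4 is used instead of casting to uint32_t*
   to avoid alignment and aliasing problems in portable C; compilers usually
   turn it into a single 32-bit load or store. */
void blit_tile_long(uint8_t *fb, const uint8_t *tile, int tile_row, int tile_col)
{
    uint8_t *dst = fb + tile_row * TILE_SIZE * FB_WIDTH + tile_col * TILE_SIZE;

    for (int row = 0; row < TILE_SIZE; row++) {
        uint32_t left, right;
        memcpy(&left,  tile,     4);   /* mov.l @R2+,R0 */
        memcpy(&right, tile + 4, 4);   /* mov.l @R2+,R0 */
        memcpy(dst,     &left,  4);    /* mov.l R0,@R3 */
        memcpy(dst + 4, &right, 4);    /* mov.l R0,@(4,R3) */
        tile += TILE_SIZE;
        dst  += FB_WIDTH;              /* next framebuffer line */
    }
}
```

Because the destination pointer is not post-incremented inside the row, the per-row step is the full 320 bytes rather than the 312 used in the byte loop.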

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Mon Mar 26, 2007 1:04 pm

5000 tiles total? O_o wow, that's great, almost twice the Mega Drive's display power :D
I can't imagine the possibilities :D displaying big backgrounds & huge sprites :D

By using 16*16 pixel tiles instead of 8*8, you should be able to get more speed :D

Btw, are you storing the tiles in RAM or in ROM?

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Mar 26, 2007 1:19 pm

Tiles will be stored in RAM.
For now, I just draw one tile, so 5000 tiles total is pretty much theoretical. But choosing the right algorithm is all the rage.
Scrolling remains my main concern, since I really have no clue how to handle it (except the screen-scroll you'll find at the beginning of this post).
Starting at 16x16 pixels, I guess I'll choose DMA instead of mov.l
I'll try to post the demo tonight (no FTP access from work).

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Mon Mar 26, 2007 2:16 pm

If I remember correctly, the 32X can push the bytes (pixels) of a single line left or right in hardware, or maybe that is what you already meant by "screen scrolling"

Stef
Very interested
Posts: 2619
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Mar 26, 2007 2:24 pm

Fonzie wrote:If I remember correctly, the 32X can push the bytes (pixels) of a single line left or right in hardware, or maybe that is what you already meant by "screen scrolling"
As far as I remember, you can do easy scrolling by varying the frame buffer address, but as the address is word-aligned, you have to use a special VDP register for byte-wide scrolling :)
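
To make that split concrete, here is a minimal sketch in C. It only does the arithmetic: a pixel scroll position is divided into a word-aligned start offset plus a one-pixel remainder. The struct and function names are hypothetical, and how the two values are actually written to the hardware is left out on purpose.

```c
#include <stdint.h>

/* Hypothetical helper: split a horizontal scroll position (in pixels, one
   byte per pixel) into a word-aligned start offset plus a 1-pixel shift flag. */
typedef struct {
    uint16_t word_offset;  /* added to the line start address, in 16-bit words */
    uint8_t  shift;        /* 1 = the line also needs a one-pixel shift */
} scroll_split;

scroll_split split_scroll(uint16_t scroll_x)
{
    scroll_split s;
    s.word_offset = (uint16_t)(scroll_x >> 1);  /* two 8-bit pixels per word */
    s.shift       = (uint8_t)(scroll_x & 1);    /* odd positions need the extra pixel */
    return s;
}
```

For example, a scroll position of 7 pixels becomes a 3-word offset plus the one-pixel shift.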

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Mar 26, 2007 2:25 pm

Yes, I see what you mean,
and no, this is not what I meant.
But no, it's not exactly what I intend to do.
What I want to do is scroll a plane, not the whole screen. And that's the big deal !!!

Stef
Very interested
Posts: 2619
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Mar 26, 2007 2:32 pm

ob1 wrote:Yes, I see what you mean,
and no, this is not what I meant.
But no, it's not exactly what I intend to do.
What I want to do is scroll a plane, not the whole screen. And that's the big deal !!!
Doing plane scrolling will be a bit more complex; in fact you'll need to do byte-wide transfers :-/

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Mon Mar 26, 2007 2:35 pm

Sure, actually, you have a minimum horizontal scrolling precision of two pixels (if writing with words) or even four pixels (if using longs)...

That isn't so bad if you want fast scrolling, but for slow and smooth scrolling, it's not perfect...
With two-pixel precision (writing with words), I think it's okay for slow scrolling (you can smoothly scroll 320 pixels in about 2.5 seconds at the slowest, not so bad).
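
A quick way to check that kind of estimate, as a sketch only: the function below assumes the picture moves by one step on every displayed frame, and the 60 frames per second used in the example is an assumption, not something stated above.

```c
/* Seconds needed to scroll distance_px pixels when the picture moves by
   step_px pixels on every displayed frame at fps frames per second. */
double scroll_seconds(int distance_px, int step_px, double fps)
{
    double frames = (double)distance_px / step_px;
    return frames / fps;
}
```

With `scroll_seconds(320, 2, 60.0)` the result is a little under 2.7 seconds, so the same ballpark as the figure quoted; with 4-pixel steps it drops to roughly 1.3 seconds.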

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Mar 26, 2007 3:26 pm

Stef wrote:Doing plane scrolling will be a bit more complex; in fact you'll need to do byte-wide transfers :-/
With byte transfers, I'm afraid I won't reach more than 1200 tiles total :(

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Mon Mar 26, 2007 7:52 pm

By the way, does anyone have a clue or links where I could find scrolling algorithms or concepts?

Stef
Very interested
Posts: 2619
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Tue Mar 27, 2007 7:14 am

The simplest method to handle horizontal scrolling is to use a buffer with 2 tiles of overhead (one on the left side and one on the right side):

             displayed area
   <--------------------------------->
| |-----------------------------------| |
| |-----------------------------------| |


Then, before doing your tile copy, you adjust the destination address this way:
dest -= hor_scroll & X;

where X + 1 = horizontal tile size
so X = 7 or 15.

and your source tilemap address this way:
src += hor_scroll >> Y;

where 2^Y = horizontal tile size
so Y = 3 or 4;

hmm, ok that's the basic idea... vertical scrolling just needs some source adjustment :)
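
Put together, the two adjustments could look like the following C sketch for a single row of 8-pixel-wide tiles. Everything beyond the `dest`/`src` adjustment itself is an assumption made for the example: the buffer layout (`BUF_W`, `MARGIN`), the one-byte tile indices, and the function name `draw_tile_row`. The copy is done byte by byte to keep it simple.

```c
#include <stdint.h>

#define TILE_W      8                        /* horizontal tile size */
#define TILE_MASK   (TILE_W - 1)             /* X in the post: 7 */
#define TILE_SHIFT  3                        /* Y in the post: 2^3 = 8 */
#define VISIBLE_W   320
#define MARGIN      TILE_W                   /* one spare tile on each side */
#define BUF_W       (VISIBLE_W + 2 * MARGIN)

/* Draw one row of tiles (8 pixel lines) into buf, scrolled by hor_scroll.
   buf has BUF_W bytes per line; the visible area starts at column MARGIN.
   tilemap is one row of tile indices; tiles holds 64 bytes per tile.
   tilemap must hold at least (hor_scroll >> TILE_SHIFT) + 41 entries. */
void draw_tile_row(uint8_t *buf, const uint8_t *tilemap,
                   const uint8_t *tiles, unsigned hor_scroll)
{
    uint8_t *dest = buf + MARGIN;
    const uint8_t *src = tilemap;

    dest -= hor_scroll & TILE_MASK;     /* fine scroll: 0..7 pixels to the left */
    src  += hor_scroll >> TILE_SHIFT;   /* coarse scroll: skip whole tiles */

    /* One extra tile so the right edge stays covered after the shift. */
    for (int t = 0; t < VISIBLE_W / TILE_W + 1; t++) {
        const uint8_t *tile = tiles + (unsigned)src[t] * TILE_W * TILE_W;
        for (int y = 0; y < TILE_W; y++)
            for (int x = 0; x < TILE_W; x++)
                dest[y * BUF_W + t * TILE_W + x] = tile[y * TILE_W + x];
    }
}
```

The partially hidden tiles on both edges land in the margins, so the visible 320 columns can then be shown or copied without any clipping logic in the inner loop.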

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Project collapsed

Post by ob1 » Wed May 09, 2007 7:54 am

Unfortunately, I completed the SuperVDP project.
Unfortunately, it can handle horizontal mirror, vertical mirror and 90° rotation.
Unfortunately, I learned a lot of things, especially about the SH2 pipeline and subroutines.
And unfortunately, displaying two frames takes more than 1/60 sec.

I can reach a higher speed if I give up tile handling (mirror and rotation). But then it won't be a VDP (P stands for Processor) anymore, but rather a DMAC. Since 1) that is not what SuperVDP was intended for and 2) there's no challenge in it, I won't go that far. Feel free to go for it.

So, there's no point in continuing this project. The source will be available on my site.

Anyway ... what a pity. :(

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Wed May 09, 2007 10:29 am

ob1 wrote:And unfortunately, displaying two frames takes more than 1/60 sec.
Two layers?
That is good performance to me :D

Did you know that the bus bandwidth (to the framebuffer) does not allow much more anyway? :D
Maybe you can use 16*16 tiles to increase the performance a bit.
Adding ultra-basic scrolling (long alignment) will not slow it down anyway :D

ob1
Very interested
Posts: 399
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Wed May 09, 2007 12:19 pm

I can't reach two layers. 2 planes take more than 1/60 sec, so I can't reach 30fps :(

But actually, I can reach much higher speed.
But then I must not process tiles. Just copying tiles, straight tiles, and using longwords, allows me to draw, let's say, 4 planes !!!
By the way, that means no mirror, no scrolling and no transparency.
So ?
So it's not what I want. In this case, I don't need a tile engine. In the end, I just get a still image, composed of 4 tile layers, yes, but just a still image. I'm not sure a still image needs two 23MHz 32-bit RISC CPUs. The 68k alone would be more than enough for this. And if you were clever, you'd use the RLE mode.
There's no point in writing such a Super(lousy)VDP.
