wolfenstein demo for sega genesis

gasega68k
Very interested
Posts: 141
Joined: Thu Aug 22, 2013 3:47 am
Location: Venezuela - Caracas

Re: wolfenstein demo for sega genesis

Post by gasega68k » Sat Apr 15, 2017 3:42 am

Hello all, I wanted to post a new "enhanced" version of Wolf3d. :)

Although the game is complete (that is, it has all levels, music, sfx, etc.), there are some things I still want to add, such as an automap. I also increased the resolution of the view area to 256x144 (before it was 256x128). It still runs at a slightly higher framerate than the previous version, since I used faster code to draw the wall columns, but to make this change I had to modify many things.

In the 256x128 version (the one you know), I only needed a 16KB buffer in Vram, since the entire framebuffer (which is in Ram) could be transferred to Vram in one frame, during the "Extended Vblank" of 86 scanlines.
All the frames of all the weapons and all the frames of BJ's face are also present in Vram, and there is still a little free space.

In the 256x144 version (which requires 18KB for the framebuffer in Ram), double buffering in Vram is necessary (well, not exactly 2 full buffers, as I explain below), since it now takes 2 frames to transfer the framebuffer to Vram; the increased resolution makes it impossible to do in one frame.
Double buffering in Vram is needed to avoid tearing, but it does not really require 36KB of Vram, because only half of the image needs to be double buffered. One copy of the first half (let's call it "L1") is transferred in one frame while the other copy (call it "L2") is shown on the screen; in the next frame the second half of the image (call it "R") is transferred to Vram and the two copies of the first half ("L1", "L2") are exchanged. So only 27KB of Vram is required.
In this version only the weapons and all their frames are kept in Vram; of BJ's faces, only the ones currently needed are present.
There are also some minor bug fixes.
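
The buffer sizes quoted above can be checked with a little arithmetic. This is only a sketch: it assumes the view area is packed at 4 bits per pixel (16 colors, 2 pixels per byte), which is inferred from the 16KB figure for 256x128 and is not stated explicitly in the post. The names `framebuffer_bytes`, `old_fb`, `new_fb` and `vram_needed` are made up for the illustration.

```python
# Sanity check of the buffer sizes quoted in the post.
# Assumption: packed 4 bits per pixel (16 colours, 2 pixels per byte).
BITS_PER_PIXEL = 4

def framebuffer_bytes(width, height):
    """Size in bytes of a packed framebuffer of the given dimensions."""
    return width * height * BITS_PER_PIXEL // 8

old_fb = framebuffer_bytes(256, 128)   # previous version
new_fb = framebuffer_bytes(256, 144)   # this version

# Only one half of the image is double buffered: L1 + L2 + R,
# each of them half a frame in size.
half = new_fb // 2
vram_needed = 3 * half

for label, size in [("256x128 framebuffer", old_fb),
                    ("256x144 framebuffer", new_fb),
                    ("Vram, half double buffered", vram_needed),
                    ("Vram, full double buffering", 2 * new_fb)]:
    print(f"{label}: {size} bytes = {size / 1024:g} KB")
```

Under that assumption the numbers come out as 16KB, 18KB, 27KB and 36KB, matching the figures in the post.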

Here are some pictures comparing the two versions. :)

[Six pairs of comparison screenshots: BEFORE on the left, AFTER on the right]


And here is the rom; it is a "demo version" (only the first episode, sorry) with the increased resolution.


Because there are some big changes, I wanted to post this version so everyone can test it on real hardware and/or on different versions of the Genesis/MD. Enjoy. :D
Last edited by gasega68k on Mon Jun 25, 2018 6:04 am, edited 1 time in total.

BigEvilCorporation
Very interested
Posts: 209
Joined: Sat Sep 08, 2012 10:41 am

Post by BigEvilCorporation » Sat Apr 15, 2017 9:17 am

This is EXCEPTIONAL stuff!

Are the enemies/weapons/items drawn with the same raycasting method as the environment? The style is consistent, they don't stand out as "sprites" like in other versions.
A blog of my Megadrive programming adventures: http://www.bigevilcorporation.co.uk

SegaTim
Very interested
Posts: 177
Joined: Thu Nov 19, 2015 1:59 pm
Location: East Prussia

Post by SegaTim » Sat Apr 15, 2017 9:46 am

WOW, very fast moving!

Sik
Very interested
Posts: 939
Joined: Thu Apr 10, 2008 3:03 pm

Post by Sik » Sat Apr 15, 2017 5:25 pm

BigEvilCorporation wrote:Are the enemies/weapons/items drawn with the same raycasting method as the environment? The style is consistent, they don't stand out as "sprites" like in other versions.
That was always the case as far as I know @_@

What I did notice is that it seems a lot of sprites may have been reconverted (the lack of greens is hurting it, but again, 16 colors isn't helping). But I may just be misremembering. That would certainly explain this though.
Sik is pronounced as "seek", not as "sick".

Kabuto
Interested
Posts: 27
Joined: Sun Aug 25, 2013 6:56 pm

Post by Kabuto » Sun Apr 23, 2017 4:15 pm

Awesome progress!

Just wanted to point out the byte granularity DMA trick as explained in here, it might or might not give you a speed boost, depending on how your rendering works. All mega drives support it in one way or another, and some emulators already emulate it as well. MD1 and MD2 do as described in there (byte granularity), not 100% sure about Genesis 3 but it looks like it behaves like ordinary DMA with a huge speed boost.

Post by SegaTim » Mon Jul 24, 2017 12:46 pm

Commercial application of an engine:

https://www.youtube.com/watch?v=VQmHwAEOXl0

Post by gasega68k » Sat Jun 16, 2018 4:47 am

Hello everyone, I'm back! :)

It's been a little over a year since the last version I posted, but I wanted to post a new version of wolf3d.

Some of the changes are:

NTSC/PAL (50Hz) compatible: this means that on PAL systems the game should run at the same speed.

Added the automap (finally :)), also includes statistics of the game, such as "kills", "secrets", "treasures" and time. I'll let you find out how to access this screen, it's a bit hidden. ;)

Removed all the "CRAM dots" artifacts (I think), especially during the fade-in, fade-out, although I have not tested this on real hardware, at least I have tested it with Blastem.

I have corrected a bug where the contents of the Sram were erased, losing all the saved files. This happened when entering the "high-score" screen and "saving" the name, score and level. I do not know in which version this bug first appeared, but at least the last version had it.

I have also made some small optimizations, graphics changes and minor code modifications.

Here is the new version:
http://www.mediafire.com/file/4w7xtsa6a ... r.rar/file

I noticed that almost all the images of my last post are gone, I'll see if I can fix it.


@kabuto, I'm still not sure if this "byte-wide" DMA has any use for this, because as I read in that document:
"just the lower byte of each word written to VRAM will get stored"
Does that mean there is no way for the high byte of each word to be read? Or am I wrong?

Post by Sik » Sat Jun 16, 2018 6:56 am

gasega68k wrote:
Sat Jun 16, 2018 4:47 am
@kabuto, I'm still not sure if this "byte-wide" DMA has any use for this, because as I read in that document:
"just the lower byte of each word written to VRAM will get stored"
Does that mean there is no way for the high byte of each word to be read? Or am I wrong?
I'll refer to the relevant effect (skip to 1:30)
https://youtu.be/gWVmPtr9O0g?t=90

Those clouds (which get kind of hidden by the rain, ugh) are rendered in vertical strips, much like you'd do in raycasting. The problem is that while each column is 2px wide (1 byte), DMA only works with words, and this means that even if you use autoincrement to load vertical stripes, you'll still need to store them in groups of two interleaved columns, which leads to drawing code like this:

    move.b  d0, (a0)
    addq.w  #2, a0
    move.b  d0, (a0)
    addq.w  #2, a0
    move.b  d0, (a0)
    addq.w  #2, a0
    ...
When you switch to 128KB mode however, the VRAM bus becomes 16-bit wide... with half of it going nowhere. So the idea here is that now each stripe is word-wide (wasting every other byte). The downside is that of course now you spend twice as much RAM, but the upside is that now the drawing code can look like this:

    move.w  d0, (a0)+
    move.w  d0, (a0)+
    move.w  d0, (a0)+
    ...
Note that 128KB mode also messes up with the VRAM address line arrangement, so you'll need to account for this when doing the DMA transfer.


This is basically the same kind of hack as that trick where you abuse autoincrement to convert a linear framebuffer into tiles in VRAM (that wastes half of every tile), but in reverse (being applied to RAM instead).
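
To make the trade-off concrete, here is a toy model of the word-wide stripe idea. It is a sketch under stated assumptions, not real VDP behaviour: it takes the claim quoted earlier in the thread ("just the lower byte of each word written to VRAM will get stored") at face value, and it does not model the VRAM address line rearrangement mentioned above. The function names `draw_stripe_wordwide` and `dma_to_vram_128k` are hypothetical.

```python
# Toy model of the 128KB-mode trick.
# Assumptions: the low byte of each word is the one that reaches VRAM (as the
# document quoted in the thread says); address-line scrambling is NOT modelled.

def draw_stripe_wordwide(pixel_bytes, junk_high_byte=0xAA):
    """Model a run of `move.w d0,(a0)+`: one 16-bit word per 2px-wide pixel byte."""
    return [(junk_high_byte << 8) | b for b in pixel_bytes]

def dma_to_vram_128k(words):
    """Model the DMA in 128KB mode: only the low byte of each word is stored."""
    return bytes(w & 0xFF for w in words)

stripe = [0x12, 0x34, 0x56, 0x78]          # one vertical stripe, 2 pixels per byte
ram_words = draw_stripe_wordwide(stripe)   # what the 68000 wrote to RAM
vram = dma_to_vram_128k(ram_words)         # what ends up in VRAM

ram_bytes_used = 2 * len(ram_words)        # RAM cost is twice the useful data
print(vram.hex(), ram_bytes_used, len(vram))
```

The junk high byte never reaches the modelled VRAM, which is the whole point: the drawing loop gets a plain postincrement store, paid for with twice the RAM.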

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sat Jun 16, 2018 8:51 pm

gasega68k wrote:
Sat Jun 16, 2018 4:47 am
Hello everyone, I'm back! :)

It's been a little over a year since the last version I posted, but I wanted to post a new version of wolf3d.

Some of the changes are:
...
Glad to see you back gasega :D
Well done with this almost final version compatible with NTSC/PAL system now =)
Always impressed by how smooth it runs!
Last edited by Stef on Sun Jun 17, 2018 8:16 pm, edited 1 time in total.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Sun Jun 17, 2018 3:49 pm

Yes, it's just incredible how well it runs on a stock Genny.

Post by gasega68k » Mon Jun 18, 2018 1:20 am

Sik wrote:
Sat Jun 16, 2018 6:56 am
The problem is that while each column is 2px wide (1 byte), DMA only works with words, and this means that even if you use autoincrement to load vertical stripes, you'll still need to store them in groups of two interleaved columns, which leads to drawing code like this:

    move.b  d0, (a0)
    addq.w  #2, a0
    move.b  d0, (a0)
    addq.w  #2, a0
    move.b  d0, (a0)
    addq.w  #2, a0
    ...
You could do that using a7 (sp) reg like this:

    move.b  d0, (a7)+
    move.b  d0, (a7)+
    move.b  d0, (a7)+
    ...
because the sp register autoincrements by 2 even on byte accesses.
But you have to avoid using subroutines (bsr, jsr) and take care to disable interrupts.
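
The difference between the two loops can be sketched with a small model of the postincrement step. It only covers the documented case where a7 starts at an even address; what happens with an odd a7 is left open, as it is in this thread. `postincrement_step` and `store_bytes_postinc` are names invented for the illustration.

```python
def postincrement_step(size_bytes, is_a7):
    """Amount added to the address register by (An)+ on a 68000."""
    if is_a7 and size_bytes == 1:
        return 2              # a byte access through a7 still moves by a whole word
    return size_bytes

def store_bytes_postinc(memory, start, values, is_a7):
    """Model a run of `move.b d0,(An)+` and return the final address."""
    addr = start
    for v in values:
        memory[addr] = v
        addr += postincrement_step(1, is_a7)
    return addr

mem_a0 = bytearray(8)
mem_a7 = bytearray(8)
end_a0 = store_bytes_postinc(mem_a0, 0, [0x11, 0x22, 0x33], is_a7=False)
end_a7 = store_bytes_postinc(mem_a7, 0, [0x11, 0x22, 0x33], is_a7=True)
print(mem_a0.hex(), end_a0)   # bytes packed back to back
print(mem_a7.hex(), end_a7)   # bytes two addresses apart
```

With a0 the three bytes land at consecutive addresses, while with a7 they land two apart, which is what the interleaved column layout needs without the extra addq.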
Sik wrote: When you switch to 128KB mode however, the VRAM bus becomes 16-bit wide... with half of it going nowhere. So the idea here is that now each stripe is word-wide (wasting every other byte). The downside is that of course now you spend twice as much RAM, but the upside is that now the drawing code can look like this:

    move.w  d0, (a0)+
    move.w  d0, (a0)+
    move.w  d0, (a0)+
    ...
Note that 128KB mode also messes up with the VRAM address line arrangement, so you'll need to account for this when doing the DMA transfer.


This is basically the same kind of hack as that trick where you abuse autoincrement to convert a linear framebuffer into tiles in VRAM (that wastes half of every tile), but in reverse (being applied to RAM instead).
The way I do it, at least in the latest versions (256 x 144) is using the "movep.w" instruction, like this:

    movep.w  d0, xxxx(a0)
    movep.w  d0, xxxx(a0)
    movep.w  d0, xxxx(a0)
    ...
Why? I'll try to explain:

If you have seen the 3D level of Toy Story, you will know that in reality only half of the screen is drawn and the other half is just a mirror image. What I do is use movep.w to draw two "bytes" at a time: the idea is that one byte is for the top half and the other is for the bottom half. The difference from the Toy Story 3D level is that here two different "bytes" are actually drawn, so the image is not mirrored.

With the 128KB mode I was hoping that something similar could be done, but using only "move.w", this would have meant a major improvement in the drawing code, but it seems that it is not possible. :(
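
For anyone not familiar with movep.w, here is a minimal model of what a single instruction stores. It only shows the instruction's memory access pattern; how the framebuffer is arranged so that the two bytes end up in the top and bottom halves is not shown in this thread, so the tiny buffer below is purely hypothetical.

```python
def movep_w(memory, a0, disp, d0):
    """Model of `movep.w d0,disp(a0)`: two bytes stored two addresses apart."""
    addr = a0 + disp
    memory[addr] = (d0 >> 8) & 0xFF      # high byte of the word goes first
    memory[addr + 2] = d0 & 0xFF         # low byte lands two addresses later

framebuffer = bytearray(8)               # hypothetical tiny buffer, not the game's layout
movep_w(framebuffer, 0, 1, 0x1234)
print(framebuffer.hex())
```

One instruction writes two independent bytes and skips the byte in between, so two different pixel pairs can be placed with one store.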

Post by Sik » Mon Jun 18, 2018 1:37 am

gasega68k wrote:
Mon Jun 18, 2018 1:20 am
You could do that using a7 (sp) reg like this:

    move.b  d0, (a7)+
    move.b  d0, (a7)+
    move.b  d0, (a7)+
    ...
because the sp register autoincrements by 2 even on byte accesses.
But you have to avoid using subroutines (bsr, jsr) and take care to disable interrupts.
Does that work when you need to write to odd bytes? (honestly I'm still not sure about all the details of the correction other than it seems to apply only on postincrement and predecrement, it may act weirdly if a7 is odd)
gasega68k wrote:
Mon Jun 18, 2018 1:20 am
With the 128KB mode I was hoping that something similar could be done, but using only "move.w", this would have meant a major improvement in the drawing code, but it seems that it is not possible. :(
Erm, the whole point is that you can do it o_O (and move.l should work too, by extension) It automatically throws away every other byte during the DMA transfer because they go nowhere (literally). Just make sure that 128KB mode is active only when the VDP is in blanking period (since otherwise of course it will be displaying garbage). Yes, you can safely toggle it at any point in the screen.

Post by Stef » Mon Jun 18, 2018 8:41 am

I think what Gasega meant is that he hoped 128 KB mode would act more like the movep instruction (so all source data is used), but with 128 KB mode you throw away half of the data, so you lose the benefit. With the movep instruction Gasega can process 2 bytes in 16 cycles; with 128 KB mode you need 2 move.w instructions to process 2 bytes, so 16 cycles as well --> no gain.

@Gasega> I guess you already tried, but is there no way to use the movep.l instruction so you can process 4 bytes in 24 cycles?
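
The comparison above boils down to cycles per useful byte. A quick sketch using only the cycle counts quoted in this post (the dictionary layout and the "useful bytes" label are just for illustration):

```python
# Cycle figures as quoted in the thread; "useful bytes" = bytes that reach the picture.
options = {
    "movep.w":                {"cycles": 16, "useful_bytes": 2},
    "2x move.w (128KB mode)": {"cycles": 16, "useful_bytes": 2},
    "movep.l":                {"cycles": 24, "useful_bytes": 4},
}

def cycles_per_byte(entry):
    """Cost of one useful byte for a given store sequence."""
    return entry["cycles"] / entry["useful_bytes"]

cost = {name: cycles_per_byte(e) for name, e in options.items()}
for name, c in cost.items():
    print(f"{name}: {c:g} cycles per useful byte")
```

By these figures movep.w and the 128 KB mode variant tie at 8 cycles per byte, and only movep.l does better, at 6 cycles per byte.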

Post by gasega68k » Mon Jul 02, 2018 6:25 am

Sik wrote:
Mon Jun 18, 2018 1:37 am
gasega68k wrote:
Mon Jun 18, 2018 1:20 am
You could do that using a7 (sp) reg like this:

    move.b  d0, (a7)+
    move.b  d0, (a7)+
    move.b  d0, (a7)+
    ...
because the sp register autoincrements by 2 even on byte accesses.
But you have to avoid using subroutines (bsr, jsr) and take care to disable interrupts.
Does that work when you need to write to odd bytes? (honestly I'm still not sure about all the details of the correction other than it seems to apply only on postincrement and predecrement, it may act weirdly if a7 is odd)
I tested this on emulators and it works, unless the emulators are wrong. :?
The 68000 docs only say that when the size is byte, the address is incremented by 2 to keep the sp aligned to a word boundary; they don't say anything about the case when a7 is odd :?: . So does it check whether a7 is odd and then autoincrement by one instead? Or is bit 0 of the a7 register ignored, so that it is always treated as zero to stay aligned to a word boundary?
Stef wrote:
Mon Jun 18, 2018 8:41 am
I think what Gasega meant is that he hoped 128 KB mode would act more like the movep instruction (so all source data is used), but with 128 KB mode you throw away half of the data, so you lose the benefit. With the movep instruction Gasega can process 2 bytes in 16 cycles; with 128 KB mode you need 2 move.w instructions to process 2 bytes, so 16 cycles as well --> no gain.
Yes, that is what I meant: the idea was that a single move.w, instead of one movep.w, would process 2 bytes, but that is not possible. :(
Stef wrote:
Mon Jun 18, 2018 8:41 am
@Gasega> I guess you already tried, but is there no way to use the movep.l instruction so you can process 4 bytes in 24 cycles?
It can work in some circumstances, so it would not be a great improvement. But there are other uses for move.l, like using 2 planes to get 31 colors without a huge impact on performance, at least in the draw routines. It would need 2x more Ram for the framebuffer, though, and also 2x more time to transfer to Vram and 2x more Vram...
But I want to mention that I used movep.l to make the fast G-zero logo scaling, in that case processing 4 bytes at once, one for each "corner": top left, top right, bottom left, bottom right. :wink:

Post by Chilly Willy » Mon Jul 02, 2018 2:41 pm

I suspect that using sp odd works with byte read/write, but you'd have to disable interrupts or the first int will crash the system. The strange stack operation for bytes is most useful when fetching data from odd locations as bytes while avoiding shifts.

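    move.b (d16,a0),-(sp)  ; 16 cycles: push byte; sp drops by 2, byte lands in the high half of the stack word
    move.w (sp)+,d0        ; 8 cycles: pop the word, so the byte is now in bits 15-8 of d0
    move.b (d16,a0),d0     ; 12 cycles: load the second byte into bits 7-0
    movep.w d0,(d16,a1)    ; 16 cycles: store both bytes to alternate addresses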
which is 52 cycles as opposed to

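    move.b (d16,a0),d0     ; 12 cycles: load the first byte into bits 7-0
    lsl.w #8,d0            ; 22 cycles: shift it up to bits 15-8
    move.b (d16,a0),d0     ; 12 cycles: load the second byte into bits 7-0
    movep.w d0,(d16,a1)    ; 16 cycles: store both bytes to alternate addresses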
which is 62 cycles.
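
As a cross-check, the two sequences can be modelled to confirm that they build the same word and that the totals add up. This is a sketch with hypothetical function names; the stale low byte on the stack (`stack_junk`) stands in for whatever happened to be in memory, and the cycle counts are the ones given in the comments above.

```python
def pack_via_stack(first, second, stack_junk=0x5A):
    """Model of the stack sequence: byte pushed, then popped as a word."""
    # move.b first,-(sp): sp drops by 2 and the byte is stored at the new
    # (even) address, i.e. the high byte of a big-endian word; low byte is stale.
    stack_word = bytes([first, stack_junk])
    d0 = int.from_bytes(stack_word, "big")   # move.w (sp)+,d0
    d0 = (d0 & 0xFF00) | second              # move.b second,d0
    return d0

def pack_via_shift(first, second):
    """Model of the shift sequence: byte loaded, then shifted up by 8."""
    d0 = first & 0xFF                        # move.b first,d0
    d0 = (d0 << 8) & 0xFFFF                  # lsl.w #8,d0
    d0 = (d0 & 0xFF00) | second              # move.b second,d0
    return d0

stack_cycles = sum([16, 8, 12, 16])
shift_cycles = sum([12, 22, 12, 16])
print(hex(pack_via_stack(0x12, 0x34)), hex(pack_via_shift(0x12, 0x34)))
print(stack_cycles, shift_cycles)
```

Both routes produce the same word for movep.w, and the stack route saves 10 cycles per pair of bytes.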
