Posted: Sun Apr 06, 2014 10:22 pm
by buricco
I guess that means it works on PAL, then =P, as iirc that concern came up. </slowpoke?>
music
Posted: Sat Apr 12, 2014 5:01 pm
by Mills
I don't know if somebody has already talked about this; this is a big thread.
I realized the music sounds just like the original YM3812 music. How did they make the YM2612 play the original files?
I'd love to use some modified YM3812 tunes for a port.
Posted: Sun Apr 13, 2014 5:20 pm
by Kuroto
I'm not entirely sure, but as the original Wolf3D music is in MIDI format, I'm guessing they used a converter.
Obviously the MIDI file would have to be modified a bit, and then an appropriate (TFI?) instrument would have to be made to produce the correct YM2612 values.
I'm looking into these things as well, as I'm trying to create some YM2612 music using the FMDrive VST by Aly James.
Here's a converter i stumbled upon.
-Steve
Posted: Mon Apr 14, 2014 4:24 am
by gasega68k
Hello everyone.
About the music: the game actually uses the "imf" file format, which is similar to MIDI and was created by id.
http://www.fileinfo.com/extension/imf
The method I use to make the music is as follows:
I first extracted all the "imf" files from the game, then downloaded a plugin for Winamp to play this format. Then, with a program I downloaded, I converted the .imf files to MIDI format; the result is really bad, but that's not important. Then I use Audacity (version 1.2.6) to import the MIDI file, just to "see" the notes, and then, using TFM Music Maker, I enter the notes and make the instruments, just by listening to the music played by Winamp.
This is how I do it; I really do not know if there is a better way, but at least the resulting files (tfc) are really small (between 2 and 4 KB on average).
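For anyone who wants to inspect the extracted files directly, here is a minimal C sketch of walking IMF data. It assumes the commonly described layout, which the post does not spell out: a stream of 4-byte records, each holding an OPL2 register number, a value byte, and a 16-bit little-endian delay before the next record. The names `imf_event`, `imf_decode` and `imf_total_ticks` are made up for this sketch, and the optional 2-byte length header found in some files is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One IMF record (assumed layout): OPL2 register, value, delay in ticks. */
typedef struct {
    uint8_t  reg;
    uint8_t  value;
    uint16_t delay;
} imf_event;

/* Decode up to max_events records from raw bytes; returns the count decoded.
   Trailing bytes that do not form a full 4-byte record are ignored. */
size_t imf_decode(const uint8_t *data, size_t len,
                  imf_event *out, size_t max_events)
{
    assert(data != NULL && out != NULL);
    size_t n = 0;
    for (size_t i = 0; i + 4 <= len && n < max_events; i += 4, n++) {
        out[n].reg   = data[i];
        out[n].value = data[i + 1];
        /* delay is stored low byte first */
        out[n].delay = (uint16_t)(data[i + 2] | (data[i + 3] << 8));
    }
    return n;
}

/* Sum of all delays: the length of the tune in player ticks. */
uint32_t imf_total_ticks(const imf_event *ev, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += ev[i].delay;
    return total;
}
```

Because each record is just a register write plus a delay, there are no notes or instruments as such in the file, which may be part of why an automatic conversion to MIDI gives poor results.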
About the port: right now I have added the final boss of the episode (Hans Grosse). I've also added the final sequence, where you see BJ running and jumping, and I just need to make the end-of-episode screens.
I have also been doing a lot of testing to improve the engine, and I've found several ways to do it. One of them is not really an improvement but a better way to make use of the processor; I'll explain it later and also show you some examples (it is also applicable to the 3D engine I did for Starfox).

Posted: Tue Apr 15, 2014 8:32 am
by Stef
gasega68k wrote:
I have also been doing a lot of testing to improve the engine, and I've found several ways to do it. One of them is not really an improvement but a better way to make use of the processor; I'll explain it later and also show you some examples (it is also applicable to the 3D engine I did for Starfox).

Hey,
That last part sounds quite mysterious and interesting, you have to explain that!

Posted: Wed Apr 16, 2014 5:43 am
by gasega68k
Well, first I want to show (roughly) what the main loop of the game looks like now, and the new way I did it:
Current:
main_loop:
    ; - Clear buffer
    ; - Raycast walls, draw to buffer
    ; - wait_for_line 192
    ; - Disable display, move buffer to VRAM, enable display
    ; - Control player
    ; - Move doors, objects AI, ...
    jmp     main_loop
New:
    clr.w   (w_draw)
    ; enable H-int at line 192
main_loop:
    ; - Raycast walls, store wall information in a buffer: tilehit, texture (column), wallheight
waitdr_:
    tst.w   (w_draw)            ; previous frame still waiting to be copied?
    bne.s   waitdr_
    ; - Clear buffer
    ; - Draw walls to buffer
    move.w  #1,(w_draw)         ; draw true
    ; - Control player
    ; - Move doors, objects AI, ...
    jmp     main_loop
;----H-interrupt----
hint:
    tst.w   (w_draw)
    beq     nodrawframe
    clr.w   (w_draw)
    ; - Disable display, move buffer to VRAM, enable display
nodrawframe:
    rte
When I say it makes better use of the processor, it is because in the current engine, after it finishes drawing to the buffer and before moving it to VRAM, it waits for line 192. This is a waste of processor cycles, because in the worst case it can spend a full frame (262 lines) just waiting before moving the buffer to VRAM, and this explains why in some cases the game suddenly drops from 15 to 12 fps.
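The 15-to-12 fps drop follows from simple arithmetic, sketched here in C. It assumes a 60 Hz NTSC display with 262 lines per frame (the figure quoted above) and that the buffer copy can only start at line 192 of some frame. `frames_per_update` and `fps_when_waiting` are hypothetical helper names, not part of the game.

```c
#include <assert.h>

enum { LINES_PER_FRAME = 262, REFRESH_HZ = 60 };

/* Whole frames consumed by one game update when the copy to VRAM must
   wait for the next line 192. work_lines is the CPU time of one update,
   measured in scanlines. */
int frames_per_update(int work_lines)
{
    assert(work_lines > 0);
    /* round up: a partly used frame is lost entirely to waiting */
    return (work_lines + LINES_PER_FRAME - 1) / LINES_PER_FRAME;
}

/* Resulting frame rate under the same assumption. */
int fps_when_waiting(int work_lines)
{
    return REFRESH_HZ / frames_per_update(work_lines);
}
```

Under these assumptions, an update that fits in 4 frames (1048 lines) runs at 15 fps, and a single extra scanline of work pushes it to 5 frames, so the rate falls to 12 fps.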
In the new method, after drawing to the buffer, the flag "w_draw" is set to 1, indicating that the buffer can be copied to VRAM (during the H-int), but instead of waiting, the loop continues. Thus the processor does not have to wait, because in this case only the raycast may take more than a frame (more than 262 lines). I even tried removing the "tst.w (w_draw) / bne waitdr_" and it still works because, as I said, only the main engine (raycast) can take more than 262 lines.
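The handshake can be modelled in plain C, as a sketch only. This is not the game's code: the 68000 interrupt is replaced by an ordinary function call, and `frame_ready`, `back_buffer`, `vram`, `render_frame` and `hint_handler` are invented names standing in for `w_draw`, the buffer, VRAM and the two routines.

```c
#include <assert.h>
#include <string.h>

enum { BUF_SIZE = 16 };

static volatile int frame_ready;            /* plays the role of w_draw */
static unsigned char back_buffer[BUF_SIZE]; /* RAM buffer the walls are drawn into */
static unsigned char vram[BUF_SIZE];        /* stand-in for video memory */
static int frames_shown;                    /* number of copies actually made */

/* Stand-in for the H-int at line 192; here it is simply called by hand. */
void hint_handler(void)
{
    assert(frame_ready == 0 || frame_ready == 1); /* flag is a plain boolean */
    if (!frame_ready)
        return;                 /* nothing new: the old frame stays on screen */
    frame_ready = 0;
    memcpy(vram, back_buffer, BUF_SIZE);
    frames_shown++;
}

/* Main-loop side. Returns 1 if a frame was drawn, 0 if the previous frame
   has not been copied yet (the real loop would spin at waitdr_ instead). */
int render_frame(unsigned char colour)
{
    if (frame_ready)
        return 0;
    memset(back_buffer, colour, BUF_SIZE);  /* "clear buffer, draw walls" */
    frame_ready = 1;
    return 1;
}
```

The point of the design is that the main loop never blocks on a scanline: it only refuses to overwrite a buffer that has not been copied yet, and the copy happens whenever the next line-192 interrupt finds the flag set.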
Well, in this file I have included two ROMs (early versions, also with "filter2") for comparison: "wolf3d_test.bin" is the "normal" version, and "wolf3d_test_hint" uses the new method.
http://www.mediafire.com/download/fa7r1 ... st_new.rar
Posted: Wed Apr 16, 2014 5:56 am
by TmEE co.(TM)
This is really smooth !
Posted: Wed Apr 16, 2014 6:40 pm
by cleeg
That IS smooth, great work.
Posted: Wed Apr 16, 2014 11:55 pm
by Stef
gasega68k wrote:Well, first I want to show (roughly) what the main loop of the game looks like now, and the new way I did it:
...
...
When I say it makes better use of the processor, it is because in the current engine, after it finishes drawing to the buffer and before moving it to VRAM, it waits for line 192. This is a waste of processor cycles, because in the worst case it can spend a full frame (262 lines) just waiting before moving the buffer to VRAM, and this explains why in some cases the game suddenly drops from 15 to 12 fps.
In the new method, after drawing to the buffer, the flag "w_draw" is set to 1, indicating that the buffer can be copied to VRAM (during the H-int), but instead of waiting, the loop continues. Thus the processor does not have to wait, because in this case only the raycast may take more than a frame (more than 262 lines). I even tried removing the "tst.w (w_draw) / bne waitdr_" and it still works because, as I said, only the main engine (raycast) can take more than 262 lines.
Well, in this file I have included two ROMs (early versions, also with "filter2") for comparison: "wolf3d_test.bin" is the "normal" version, and "wolf3d_test_hint" uses the new method.
http://www.mediafire.com/download/fa7r1 ... st_new.rar
Oh got it

Actually this is the method I am using in SGDK for the bitmap engine. If you look at the source code, you will see that I am using the H interrupt at line 192 (as you do) to disable the VDP and then start the conversion of the bitmap buffer to VRAM. On an NTSC system I need the blank periods of 3 frames to convert and transfer the whole 256x160 bitmap (so 20 FPS at best). As I'm extending blanking from line 192 to line 31, I need to be sure that my code ends before line 31. On a PAL system I can do the whole transfer in only 2 frames (25 FPS at best), and the second frame still has a lot of free blank time.

Of course the good thing with this method is that you never wait for anything, all the bitmap conversion is done "asynchronously".
The code for the bitmap transfer is written in C, but I optimized it and verified that the generated asm code is good enough, so I don't need to convert it. At least with GCC 3.4.6 the generated code is almost optimal, but it can be different (and worse) with newer versions.
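The frame-rate figures above can be checked with a little arithmetic, sketched in C. The line counts are assumptions not stated in the post: 262 lines per frame at 60 Hz for NTSC and 313 lines at 50 Hz for PAL, with the display enabled only on lines 32 to 191 (160 lines) as described. `video_std`, `blank_lines_per_frame` and `best_fps` are hypothetical names, not SGDK functions.

```c
#include <assert.h>

/* Timing of one video standard (assumed values, see above). */
typedef struct {
    int refresh_hz;
    int lines_per_frame;
} video_std;

static const video_std NTSC = { 60, 262 };
static const video_std PAL  = { 50, 313 };

enum { VISIBLE_LINES = 160 };   /* display enabled on lines 32..191 */

/* Scanlines per frame during which the display is off and VRAM is free. */
int blank_lines_per_frame(video_std v)
{
    assert(v.lines_per_frame > VISIBLE_LINES);
    return v.lines_per_frame - VISIBLE_LINES;
}

/* Best frame rate when one full bitmap transfer spans the given number
   of displayed frames. */
int best_fps(video_std v, int frames_per_transfer)
{
    assert(frames_per_transfer > 0);
    return v.refresh_hz / frames_per_transfer;
}
```

With these assumptions, `best_fps(NTSC, 3)` gives 20 and `best_fps(PAL, 2)` gives 25, matching the post. PAL gets 153 blanked lines per frame against 102 on NTSC, about 1.5 times as many, which is consistent with PAL needing 2 frames where NTSC needs 3.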
Posted: Fri Apr 18, 2014 5:31 am
by gasega68k
Stef wrote:Oh got it. Actually this is the method I am using in SGDK for the bitmap engine. If you look at the source code, you will see that I am using the H interrupt at line 192 (as you do) to disable the VDP and then start the conversion of the bitmap buffer to VRAM. On an NTSC system I need the blank periods of 3 frames to convert and transfer the whole 256x160 bitmap (so 20 FPS at best). As I'm extending blanking from line 192 to line 31, I need to be sure that my code ends before line 31. On a PAL system I can do the whole transfer in only 2 frames (25 FPS at best), and the second frame still has a lot of free blank time. Of course the good thing with this method is that you never wait for anything, all the bitmap conversion is done "asynchronously".
I did not know you were doing it that way; this is definitely the best way to take as much advantage as possible of the CPU, and it's something I should have done from the beginning.

I also tried this on the Starfox 3D engine, and of course there is also an improvement. By the way, I recently improved the polygon filling code: now the worst case is 628 cycles for a line (previously it was 722). I also made a "different" method which is faster, because I use the DMA to transfer the buffer, but it has its pros and cons. I will talk about this in a few days (and show some demos) in the topic "3d on the sega genesis...", but for now I want to concentrate on Wolf3D.
Posted: Fri Apr 18, 2014 9:45 am
by Stef
gasega68k wrote:
I did not know you were doing it that way; this is definitely the best way to take as much advantage as possible of the CPU, and it's something I should have done from the beginning.
Yeah, actually you avoid any wait from the CPU, which is definitely important here.
gasega68k wrote:
I also tried this on the Starfox 3D engine, and of course there is also an improvement. By the way, I recently improved the polygon filling code: now the worst case is 628 cycles for a line (previously it was 722). I also made a "different" method which is faster, because I use the DMA to transfer the buffer, but it has its pros and cons. I will talk about this in a few days (and show some demos) in the topic "3d on the sega genesis...", but for now I want to concentrate on Wolf3D.
Again, that sounds interesting! To be honest, I tried to rewrite my rendering methods a bit to optimize them for triangle / quad rendering. But I always fall back to the generic approach because, even if I want the code to be fast, the main purpose of SGDK is also to work in all the different cases (so being able to render 8-sided polygons, for instance); I also want to keep the code "readable". The current one is really a mess and I will probably end up with simpler code, but not really faster :-/
It's good news that you were finally able to use the DMA =) I made some calculations and indeed, for flat 3D rendering, DMA can definitely help! It has some drawbacks, such as no more easy pixel access for line or single-pixel drawing, but the speed boost in VRAM copy is so significant that it is still the best choice to get the maximum performance! I'm impatient to see that in action.

Posted: Sat Apr 19, 2014 5:59 pm
by ICEknight
Just tried the test versions on real hardware:
-On the "regular" one, things look jaggy/pixellated when turning around, since surfaces are rendered with different pixels/squares for odd and even frames. Perhaps rendering alternating vertical lines as seen previously would look better than squares, in this context?
-The "new" one just hangs right after booting up.
=(
Posted: Sat Apr 19, 2014 8:12 pm
by Stef
I tested the ROM with the new hint method, very smooth! Up to 21 FPS.

Posted: Tue Apr 22, 2014 10:39 am
by twosixonetwo
Hey, great work once again! One thing though: is the new method supposed to work on a real console? I've tried it on my PAL Mega Drive (both in 50Hz and in 60Hz mode) and all I could see was:
The old method works fine on the same console.
Posted: Tue Apr 22, 2014 5:06 pm
by Stef
Yeah, ICEknight also reported the issue. I guess there is something wrong with the HInt transfer code. I remember having a hard time getting my code working too. It's definitely not easy to get the hint occurring on the desired scanline, because you have to set it 2 steps ahead (when the hint occurs, the hint counter has already been loaded for the next hint). But as it works in emulators, I guess the problem is somewhere else here.
