Questions on emulating the 32X

Ask anything you want about 32X Mushroom programming.

Moderator: BigEvilCorporation

Near
Very interested
Posts: 109
Joined: Thu Feb 28, 2008 4:45 pm

Questions on emulating the 32X

Post by Near » Mon Mar 22, 2021 11:32 am

Hi all, I started on a 32X emulator. It's far enough to run Chaotix, but not much else.

Image

Problem is, it's feature-complete. I implemented the VDP, PWM, FIFOs, DREQ, the vector 4 address (0x70-0x73) as RAM, and the half of the SH7604 that games use: full cache emulation (including all CCR bits, the LRU, address read/write, purge, write-through, etc.), 32-bit and 64-bit division (with the division register mirroring Virtua Fighter relies on, and IRQs on overflow), DREQ DMAs (all sizes and address increment modes supported) with completion IRQs, 16-bit timer support along with the frequency scaler and IRQ support, and master<>slave SCI communication with transmit/receive IRQ support. For everything else, I implemented reading and writing the internal SH7604 registers, but not the actual functionality yet (no way to test it if no games use it, and some of it would be very costly to do just for fun).

Throwing my questions / issues out there in the hopes someone recognizes one ^-^;

1. I fail the Mars Test HINT (2) tests. The manual says an HCOUNT of 0 means every line, but from the Mars Test, I assume an HCOUNT of 1 is equal to 0 as every single line? Mars Test sets a value of 5, and reports "actual: 0000" if I take it to mean 6, or "actual: 061e" if I take it to mean 5. The expected value seems to be ~0x4d0 - 0x520, which implies to me something is running too fast. But no matter how much I slow down the SH2s, the result doesn't change. I don't believe my 68K/VDP emulation timing is that far off. I do emulate the "HINTs during VINTs" part. Removing that (which is wrong, the bit is set) drops it to "actual: 0540" which still fails.

2. I fail the Mars Test SH2 DMA tests (#121 and 122.) They are expecting 68S to get set to zero when the length decrements to zero. It sets length to 0x800. If I clear 68S after 0x800 (writes to the FIFO register from the 68K -or- reads from the FIFO register from one of the SH2s), then the test hangs forever. If I don't clear 68S, then the tests fail expecting it to have been cleared. Mars Test #123 locks up. I know the test overwrites program memory, but my cache emulation is complete, so I suspect it's related to failing #121/122.

3. Virtua Fighter shows no polygons when you start a fight.

4. Virtua Racing Deluxe shows no graphics after the SEGA splash screen, but I can hear the music, and the game is playing.

5. Doom shows no in-game graphics, just a brown screen. The rest of the HUD all renders correctly.

6. I have to do the frame buffer swapping immediately, or lots of graphics render incorrectly. Whether I report the true framebuffer or delayed framebuffer state when reading the VDP status register, games break badly if I delay the frame buffer swapping until Vblank. I am reporting the Vblank status in the VDP status register.

7. I don't really understand DREQ DMA. What does the DMA bit (d1) do exactly? It states it's for ROM to VRAM, but do we need to do anything beyond disable SH2 access to the ROM during this time? I presume the 68K DMAs from ROM to the FIFO register in this mode. How would the 16-byte DMA transfer size work? The FIFO only holds 2x4 words, we'd need twice that or it'll run out halfway through.

srg320
Newbie
Posts: 6
Joined: Sat Feb 06, 2021 9:40 pm
Location: Ukraine

Re: Questions on emulating the 32X

Post by srg320 » Mon Mar 22, 2021 7:23 pm

Some games require proper FEN flag emulation (including DRAM refresh).
After Burner uses UBC registers as variables for sound processing.

About DMA flag. I think, 32X logic monitors the address bus (VA) for a DREQ source address register match and then captures data from the data bus (VD).

TascoDLX
Very interested
Posts: 262
Joined: Tue Feb 06, 2007 8:18 pm

Re: Questions on emulating the 32X

Post by TascoDLX » Tue Mar 23, 2021 5:31 am

Near wrote:
Mon Mar 22, 2021 11:32 am
1. I fail the Mars Test HINT (2) tests. The manual says an HCOUNT of 0 means every line, but from the Mars Test, I assume an HCOUNT of 1 is equal to 0 as every single line? Mars Test sets a value of 5, and reports "actual: 0000" if I take it to mean 6, or "actual: 061e" if I take it to mean 5. The expected value seems to be ~0x4d0 - 0x520, which implies to me something is running too fast.
Yes, HCOUNT=0 is every line, HCOUNT=5 is every 6th line. Your numbers have me thinking you're running in PAL res.

It's a 30-frame test, so upper bound is: 0x520 * 6 / 30 = 262.4 ... close enough?
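A minimal sketch of the arithmetic above, not taken from the Mars Test source. It assumes HINT keeps firing during vertical blanking, that HCOUNT=n means one interrupt every (n+1) lines, and the usual 262 (NTSC) and 313 (PAL) lines per frame. The name `expected_hints` is made up for illustration.

```python
def expected_hints(lines_per_frame: int, hcount: int, frames: int = 30) -> int:
    """Number of HINTs over `frames` frames when one fires every (hcount + 1) lines."""
    total_lines = lines_per_frame * frames
    return total_lines // (hcount + 1)


ntsc = expected_hints(262, 5)  # assumed NTSC line count
pal = expected_hints(313, 5)   # assumed PAL line count
print(hex(ntsc), hex(pal))
```

Under these assumptions the NTSC figure falls inside the ~0x4d0 - 0x520 window quoted above, while the PAL figure is within one count of the 0x061e reported in the first post.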

cero
Very interested
Posts: 338
Joined: Mon Nov 30, 2015 1:55 pm

Re: Questions on emulating the 32X

Post by cero » Tue Mar 23, 2021 7:42 am

Uh, Near, have you considered actually finishing one emulator before starting 2058 new ones? It's like every month there's a notification you've started a new emulator.

Near
Very interested
Posts: 109
Joined: Thu Feb 28, 2008 4:45 pm

Re: Questions on emulating the 32X

Post by Near » Tue Mar 23, 2021 10:34 pm

I figured out the bug with the HINT(2) tests, a dumb logic bug I made refactoring ^-^;
I was incrementing the counter on both Hblank start and end by mistake.
It is indeed HCOUNT of 5 == every 6th line. Thank you!
Also, reading the HCOUNT register returns the set value and not the current counter, otherwise Virtua Fighter locks.
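A rough sketch of the counter behaviour described above, assuming the counter is clocked once per line (at Hblank start only) and that reads return the written value rather than the live counter. `HintCounter` and its method names are hypothetical, and whether a write reloads the live counter is left out because the thread does not say.

```python
class HintCounter:
    def __init__(self, hcount: int = 0):
        self.hcount = hcount   # value written by software (the reload value)
        self.counter = hcount  # internal down-counter, never exposed to reads

    def write_hcount(self, value: int) -> None:
        self.hcount = value & 0xFF

    def read_hcount(self) -> int:
        # Returns the set value, not the current counter.
        return self.hcount

    def on_hblank_start(self) -> bool:
        """Clock the counter once per line; return True when a HINT fires."""
        if self.counter == 0:
            self.counter = self.hcount
            return True
        self.counter -= 1
        return False


counter = HintCounter(5)
fired = [counter.on_hblank_start() for _ in range(12)]
print([line for line, hit in enumerate(fired) if hit])
```

Calling `on_hblank_start` from both the Hblank start and Hblank end events would clock the counter twice per line, which is the refactoring bug mentioned above.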
Still no polygons in Virtua Fighter for me just yet though.
srg320 wrote:
Some games require proper FEN flag emulation (including DRAM refresh).
Oh that's bad. I was faking it as always being accessible currently.
Initial attempts at it were breaking games and tests. I'll pay closer attention to getting it working then.
But, DRAM refresh? Really? So every time there's a CPU DRAM refresh it has to go inaccessible for short bursts? That's ... epic.
Do you recall which games require the DRAM refresh? I do emulate it, but I have it disabled for now because I'm not sure if I implemented both of them correctly.

EDIT: well no matter if I always return 0 or always return 1, none of my bugs change at all.
Can you tell me how FEN should be emulated? I'd be happy to implement it now.
Documentation only says 0 = framebuffer accessible, 1 = framebuffer access prohibited. Doesn't say when that occurs.
I'm just giving it the value of FM for now.
srg320 wrote:
After Burner uses UBC registers as variables for sound processing.
Clever. Pointless since there's enough RAM on the system, but still.
I do emulate them being readable and writable, just not the actual debug break events yet.
srg320 wrote:
About DMA flag. I think, 32X logic monitors the address bus (VA) for a DREQ source address register match and then captures data from the data bus (VD).
Hmm, so it only works from a fixed source address then?

---

Update: got another one. Reading CHCR0 was falling through to SA1, whoops.
Clearing 68S now works without dead-locking. Now I pass all tests up to #123 before the test locks up.

---

Update 2: added DMA auto-request mode and made TCR decrement by 4 in 16-byte transfer mode.
All 161 tests now pass.
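A small sketch of the TCR handling mentioned in Update 2, assuming the SH7604 CHCR.TS encoding (0 = byte, 1 = word, 2 = longword, 3 = 16-byte) and that TCR counts longwords in 16-byte mode. The function names are hypothetical, and clamping at zero is a simplification of this sketch, not hardware behaviour.

```python
from typing import Tuple

TRANSFER_BYTES = {0: 1, 1: 2, 2: 4, 3: 16}  # CHCR.TS field -> bytes per transfer


def dma_transfer_once(tcr: int, ts: int) -> Tuple[int, int]:
    """Return (new_tcr, bytes_moved) for a single DMA transfer unit."""
    bytes_moved = TRANSFER_BYTES[ts]
    step = 4 if ts == 3 else 1     # 16-byte mode: TCR drops by 4 per transfer
    new_tcr = max(tcr - step, 0)   # clamp only to keep the sketch simple
    return new_tcr, bytes_moved


def run_dma(tcr: int, ts: int) -> int:
    """Run transfers until TCR reaches zero; return the total bytes moved."""
    total = 0
    while tcr > 0:
        tcr, moved = dma_transfer_once(tcr, ts)
        total += moved
    return total


print(run_dma(8, 3))
```

With a decrement of 1 in 16-byte mode, the same TCR value would move four times as much data, which may be one way a test that checks the final count could fail.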

Image

Lot of good it does me for running games, though.
* Virtua Racing Deluxe and Virtua Fighter show no polygons, only backgrounds
* After Burner Complete doesn't run
* Doom doesn't show the in-game graphics
Last edited by Near on Fri Mar 26, 2021 7:36 am, edited 1 time in total.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres
Contact:

Re: Questions on emulating the 32X

Post by Stef » Wed Mar 24, 2021 8:58 am

You can also try reading other 32X emulators' sources to sort out some of your issues. To be honest, in Gens I do not emulate much of the SH2 CPU; I only emulated the parts required to make all 32X games (as far as I remember) work.
Last edited by Stef on Wed Mar 24, 2021 11:01 am, edited 1 time in total.

srg320
Newbie
Posts: 6
Joined: Sat Feb 06, 2021 9:40 pm
Location: Ukraine

Re: Questions on emulating the 32X

Post by srg320 » Wed Mar 24, 2021 9:01 am

Near wrote:
Tue Mar 23, 2021 10:34 pm
Oh that's bad. I was faking it as always being accessible currently.
Initial attempts at it were breaking games and tests. I'll pay closer attention to getting it working then.
But, DRAM refresh? Really? So every time there's a CPU DRAM refresh it has to go inaccessible for short bursts? That's ... epic.
Do you recall which games require the DRAM refresh? I do emulate it, but I have it disabled for now because I'm not sure if I implemented both of them correctly.
This code from Metal Head:

L00D174:
00D174 4F22     	STS.L   	PR,@-R15		;
00D176 D112     	MOV.L   	@(#048,PC),R1	;
00D178 6010     	MOV.B   	@R1,R0		;
00D17A CB80     	OR      	#80,R0		;
00D17C 2100     	MOV.B   	R0,@R1		;1->IMR.FM
00D17E D111     	MOV.L   	@(#044,PC),R1	;
L00D180:
00D180 841B     	MOV.B   	@(#0B,R1),R0	;
00D182 C902     	AND     	#02,R0		;
00D184 8800     	CMP/EQ  	#00,R0		;
00D186 89FB     	BT      	L00D180		;FBCR.FEN == 0
00D188 E340     	MOV     	#40,R3		;
00D18A D10F     	MOV.L   	@(#03C,PC),R1	;
00D18C D20F     	MOV.L   	@(#03C,PC),R2	;
00D18E 6015     	MOV.W   	@R1+,R0		;
00D190 2201     	MOV.W   	R0,@R2		;
00D192 4310     	DT      	R3			;
00D194 8FFB     	BF/S    	#1F6			;
00D196 7202     	ADD     	#02,R2		;
00D198 7FF0     	ADD     	#F0,R15		;
00D19A D10D     	MOV.L   	@(#034,PC),R1	;
00D19C 8419     	MOV.B   	@(#09,R1),R0	;
00D19E 1F00     	MOV.L   	R0,@(#0,R15)	;
00D1A0 841A     	MOV.B   	@(#0A,R1),R0	;
00D1A2 1F01     	MOV.L   	R0,@(#4,R15)	;
00D1A4 841B     	MOV.B   	@(#0B,R1),R0	;
00D1A6 1F02     	MOV.L   	R0,@(#8,R15)	;
00D1A8 841C     	MOV.B   	@(#0C,R1),R0	;
00D1AA 1F03     	MOV.L   	R0,@(#C,R15)	;
00D1AC BEAA     	BSR     	#1D54		;
00D1AE 0009     	NOP     				;
00D1B0 7F10     	ADD     	#10,R15		;
00D1B2 D108     	MOV.L   	@(#020,PC),R1	;
00D1B4 E000     	MOV     	#00,R0		;
00D1B6 2100     	MOV.B   	R0,@R1		;
00D1B8 4F26     	LDS.L   	@R15+,PR		;
00D1BA 000B     	RTS     				;
00D1BC 0009     	NOP     				;
00D1BE 0000     	
00D1C0 20004000
00D1C4 20004100
Without DRAM refreshing, the code loops on waiting for FEN. I am working on emulating the 32X on FPGA and it was enough for me to emulate only period B (40 SCLK).
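A sketch of why the loop at L00D180 hangs when FEN is stubbed to 0, assuming FEN reads as 1 only during a 40-SCLK refresh window ("period B" in the post above). The position of that window within a line, the 1000-cycle line length, and the 4 cycles per poll are placeholders, not measured values, and the function names are hypothetical.

```python
REFRESH_CYCLES = 40  # "period B" length in SCLK, as quoted in the post above


def fen(cycle_in_line: int, refresh_start: int = 0) -> int:
    """FBCR.FEN: 1 while frame buffer access is prohibited, else 0."""
    in_refresh = refresh_start <= cycle_in_line < refresh_start + REFRESH_CYCLES
    return 1 if in_refresh else 0


def polls_until_fen_set(read_fen, start_cycle=0, cycles_per_poll=4, max_polls=100_000):
    """Mimic the loop at L00D180, which spins while (FBCR & 0x02) == 0."""
    cycle = start_cycle
    for polls in range(max_polls):
        if read_fen(cycle) == 1:
            return polls
        cycle += cycles_per_poll  # placeholder cost of one loop iteration
    return None  # the real code would spin forever


LINE_CYCLES = 1000  # placeholder line length in SCLK
print(polls_until_fen_set(lambda c: 0, max_polls=1000))
print(polls_until_fen_set(lambda c: fen(c % LINE_CYCLES), start_cycle=100))
```

The first call models an emulator that always returns FEN=0 and never leaves the loop; the second leaves it once the poll lands inside the refresh window.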
Attachments
111.jpg (244.75 KiB)

srg320
Newbie
Posts: 6
Joined: Sat Feb 06, 2021 9:40 pm
Location: Ukraine

Re: Questions on emulating the 32X

Post by srg320 » Wed Mar 24, 2021 9:24 am

Near wrote:
Tue Mar 23, 2021 10:34 pm
About DMA flag. I think, 32X logic monitors the address bus (VA) for a DREQ source address register match and then captures data from the data bus (VD).
Hmm, so it only works from a fixed source address then?
Looks like an address match serves as a trigger to start the DMA. You can find the source code of WTOV test in DDK (32X-DDK\SEGADTS\32X\HWDIAG\SOURCE\MD\SOURCE\SHDMATST.ASM, label sh_kasume_m_test).

Near
Very interested
Posts: 109
Joined: Thu Feb 28, 2008 4:45 pm

Re: Questions on emulating the 32X

Post by Near » Wed Mar 24, 2021 10:18 am

Not having much luck with rendering.

Take Amazing Spider-Man: Web of Fire.

This is the Mega Drive layer:
Image

And this is the Super 32X layer.
Image

The combination is supposed to look like this:
Image

Throughbit is never set on a single pixel for the entire screen, and it's always in priority 1 mode (32X on top.)
If I draw the 32X area where it's not transparent, it covers up the MD layer's spider and text.
The only way to get those to draw is to treat the backdrop color (or palette color 0) as transparent, and let the MD layer overlap the 32X layer wherever the MD layer isn't transparent.

But if I do that, then this happens to Chaotix:
Image

Again, it's priority 1 mode. The same logic above results in the MD background layer overriding the 32X sprite layer because the pixels are not transparent there, and the sprites have throughbit=0. The background area gets throughbit=1, and I could probably make that work, I just didn't bother since there's already a contradiction with throughbit=0.

---

I can't imagine the video cable passthrough can detect the backdrop color anyway, right? It's supposed to only be checking for black color output, post-CRAM lookup. But there's no way to render Spider-Man properly otherwise that I can see.

Could anyone who understands this provide a detailed explanation of how to render the layers for both priority and throughbit modes please? ^-^;

The information my renderer has for the MD layer: 3-bit{R,G,B},2-bit mode,1-bit backdrop/palette 0. For the 32X layer: 1-bit throughbit,5-bit{R,G,B}.

Perhaps the VDP priority setting only takes effect on the next entire display frame and can't be changed mid-frame?
srg320 wrote:
Without DRAM refreshing, the code loops on waiting for FEN.
I gave it a shot, but I can't get past the Sega splash screen with or without setting FEN=1 during both CPU DRAM refreshes (external + CPU RAM.) Guess I have to fix other bugs before I can get to that.
Stef wrote:
You can also try to read others 32X emulators sources to sort some of your issues.
I tried to look at Picodrive but I had a really hard time following the code for it (my fault.) I'll try Gens and see if that's easier for me.
srg320 wrote:
Looks like an address match serves as a trigger to start the DMA. You can find the source code of WTOV test in DDK (32X-DDK\SEGADTS\32X\HWDIAG\SOURCE\MD\SOURCE\SHDMATST.ASM, label sh_kasume_m_test).
It worked from my end to just use SAR+DAR+TCR when in auto-request mode, and DREQ0 when in DRCR=0 mode. I haven't seen any DRCR=1/2 (SCI transmit/receive) DMA requests yet, so I left those unimplemented. I didn't need to monitor any address bus values for matches on my end so far.

RV mode isn't about the SH2 DMA or DREQ at all. The MD uploads code into CPU RAM to run a VDP ROM to VRAM DMA, so it obviously is just pulling the ROM off the SH2 bus and then mapping it into 0x100-0x3fffff on the 68K bus during that window. It seems that the rest of the 68K bus has to be left intact (0x00-0xff vector ROM, 0x400000+ 32X stuff), or games lock up right away.
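A minimal sketch of the 68K-side decode described above, assuming only the three regions named in the post. The returned region names are made up for illustration, and what sits at 0x100-0x3fffff while RV is clear is deliberately left as an opaque label.

```python
def decode_68k(address: int, rv: bool) -> str:
    """Classify a 68K bus address while the 32X is active."""
    address &= 0xFFFFFF  # the 68K bus is 24 bits wide
    if address < 0x000100:
        return "vector_rom"  # 0x00-0xff stays the vector ROM even with RV set
    if address < 0x400000:
        # RV set: cartridge ROM is visible here for the VDP ROM to VRAM DMA.
        return "cart_rom" if rv else "not_cart_rom"
    return "unchanged"       # 0x400000 and up: left exactly as it was


for addr in (0x000070, 0x000100, 0x3FFFFF, 0x400000):
    print(hex(addr), decode_68k(addr, rv=True))
```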

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City
Contact:

Re: Questions on emulating the 32X

Post by TmEE co.(TM) » Wed Mar 24, 2021 10:38 am

VDP outputs a signal called !YS which is active when nothing/backdrop color is being output, it is present on cartslot along with !HSYNC and !VSYNC signals and 32X uses them for image alignment and composition purposes.

There are 3 signals involved in controlling if 32X or MD RGB signal is switched to the final output. !YS, top bit of pixel color (either from palette or direct color pixel) and the PRI bit. I'm not completely sure what the logic is, the documentation is a bit difficult to understand due to broken English...
Mida sa loed ? Nagunii aru ei saa ;)
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen

Near
Very interested
Posts: 109
Joined: Thu Feb 28, 2008 4:45 pm

Re: Questions on emulating the 32X

Post by Near » Thu Mar 25, 2021 5:42 am

I found the key to most of my issues: I was failing to extend the DMULS/DMULU operands to 64 bits before multiplying, so the products were being truncated to 32 bits. With that fix, compatibility is now around 80-90% or so. Not bad!
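A short sketch of the multiply behaviour in question: DMULS.L and DMULU.L produce a 64-bit product split across MACH:MACL. Python integers are unbounded, so the masks below stand in for register widths; the helper names are made up, and `dmulu_truncated` only imitates the kind of bug described (a product kept in 32 bits).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign32(value: int) -> int:
    """Interpret a 32-bit register value as a signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def dmuls(rm: int, rn: int):
    """Signed 32x32 -> 64 multiply; returns (MACH, MACL)."""
    product = (sign32(rm) * sign32(rn)) & MASK64
    return product >> 32, product & MASK32


def dmulu(rm: int, rn: int):
    """Unsigned 32x32 -> 64 multiply; returns (MACH, MACL)."""
    product = (rm & MASK32) * (rn & MASK32)
    return product >> 32, product & MASK32


def dmulu_truncated(rm: int, rn: int):
    """Imitates the bug: the product never leaves 32 bits, so MACH is lost."""
    product = ((rm & MASK32) * (rn & MASK32)) & MASK32
    return product >> 32, product & MASK32


print(dmulu(0x00010000, 0x00010000), dmulu_truncated(0x00010000, 0x00010000))
```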
TmEE co.(TM) wrote:
VDP outputs a signal called !YS which is active when nothing/backdrop color is being output
Is this the case only when it's the backdrop color (no background/sprites), or when it's the backdrop color OR palette color 0 (from a background or sprite)?

I'm still not able to render both Spider-Man and Chaotix at the same time. Spider-Man seems to require the opposite settings as Chaotix to work.

Aside from that, some issues with games like After Burner, Metal Head, etc still. Virtua Fighter's camera goes wild right after a fight starts.

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Re: Questions on emulating the 32X

Post by ob1 » Thu Mar 25, 2021 7:43 am

Hello Near,
and thank you so much for your investment in this 32X emulator.
If there is a wishlist,
and it is still open,
would you consider :
- a debugger (a must. 'definitely would pay for it)
- throwing DEI when DMA finishes (see bit CHCR0/1.IE and register VCRDMA0/1). It looks like no game made use of this, but I think this could be very useful.
?

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City
Contact:

Re: Questions on emulating the 32X

Post by TmEE co.(TM) » Thu Mar 25, 2021 8:37 am

Near wrote:
Thu Mar 25, 2021 5:42 am
Is this the case only when it's the backdrop color (no background/sprites), or when it's the backdrop color OR palette color 0 (from a background or sprite)?
VDP never renders color 0 of any palette when drawing BGs and sprites on the screen, it is the backdrop color that is output when a particular pixel doesn't contain any BG or sprite colors, and the color can be any of the 64 from the CRAM. !YS signal is not tied to any color per se, it only means "no BG or sprite pixel was rendered".
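A sketch of the rule stated above, assuming each layer contributes a 6-bit CRAM index (palette in the upper 2 bits, color in the lower 4) and that layers are already sorted highest priority first. `md_pixel` is a hypothetical helper, not code from any emulator.

```python
def md_pixel(layer_indices, backdrop_index: int):
    """Return (cram_index, ys_active) for one Mega Drive pixel.

    layer_indices: 6-bit CRAM indices from BG/sprite layers, highest priority first.
    ys_active is True when no BG or sprite pixel was rendered.
    """
    for index in layer_indices:
        if index & 0x0F:           # color 0 of any palette is never drawn
            return index & 0x3F, False
    # Nothing opaque: the backdrop color is output and !YS is asserted.
    return backdrop_index & 0x3F, True


print(md_pixel([0x00, 0x10, 0x20], backdrop_index=0x3F))
print(md_pixel([0x10, 0x25], backdrop_index=0x00))
```

Note that `ys_active` depends only on whether a layer pixel was drawn, not on the resulting color, so an opaque pixel that happens to match the backdrop color still leaves it False.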

Near
Very interested
Posts: 109
Joined: Thu Feb 28, 2008 4:45 pm

Re: Questions on emulating the 32X

Post by Near » Thu Mar 25, 2021 10:30 am

ob1 wrote:
- a debugger (a must. 'definitely would pay for it)
I have a pseudo-debugger. You can see, edit, and export any RAM contents, create trace logs (with optional masking and automatic loop elision) of all chips and their associated interrupts, certain debugging information like DMA requests (limited so far), and toggle individual sound sources on and off. It does not allow you to set breakpoints or single-step instructions. It has a graphics viewer but so far only hooked up for the SNES and PS1.
ob1 wrote:
- throwing DEI when DMA finishes (see bit CHCR0/1.IE and register VCRDMA0/1). It looks like no game made use of this, but I think this could be very useful.
I do that already, but my DMA timing is not going to be very accurate. I just do one transfer between each instruction in the background.
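A loose sketch of the scheme described above: one DMA transfer slotted in after each CPU instruction, with DEI raised when the count reaches zero and the interrupt-enable bit is set. `DmaChannel` and `run` are hypothetical names, CPU execution is omitted, and real DMA timing is not modelled.

```python
class DmaChannel:
    def __init__(self, tcr: int, interrupt_enable: bool):
        self.tcr = tcr
        self.ie = interrupt_enable  # stands in for CHCR.IE
        self.te = False             # stands in for CHCR.TE (transfer end)

    def step(self) -> bool:
        """Perform one transfer; return True if DEI should be raised now."""
        if self.te or self.tcr == 0:
            return False
        self.tcr -= 1               # the data move itself is omitted here
        if self.tcr == 0:
            self.te = True
            return self.ie
        return False


def run(instructions: int, channel: DmaChannel) -> list:
    """Interleave one (omitted) CPU instruction with one DMA transfer."""
    dei_at = []
    for n in range(instructions):
        # a real core would execute one CPU instruction here
        if channel.step():
            dei_at.append(n)
    return dei_at


print(run(10, DmaChannel(3, interrupt_enable=True)))
```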

I'm willing to emulate anything I am able to within the SH7604 peripheral space, with the caveat that I will probably disable the parts no games use in official releases for the sake of performance. The accuracy level of my Mega Drive emulator gives me very little additional headroom if I want to maintain 60fps. Right now you need a 5800X and an SH2 downclock. I should be able to get rid of the latter (I hope) with a priority queue for events and moving from a cached interpreter to a true dynamic recompiler, but we'll see. If I can get my MD VDP scanline renderer going then I'll definitely pull it off at full SH2 speeds.
TmEE co.(TM) wrote:
VDP never renders color 0 of any palette when drawing BGs and sprites on the screen
I'm sorry, I've been jumping around so much I missed this super obvious thing ^-^;
In that case, I'm outputting /YS properly. But it seems the logic to handle Chaotix and Spider-Man are at odds.
Short of another CPU bug that might somehow be setting the throughbit incorrectly on 32X VRAM, I'm stumped.

Near
Very interested
Posts: 109
Joined: Thu Feb 28, 2008 4:45 pm

Re: Questions on emulating the 32X

Post by Near » Fri Apr 02, 2021 4:35 am

For whatever it's worth, I posted ares v119 with my 32X emulation source to https://ares.dev

I really appreciate everyone's help here. If anyone wants to work on 32X for their emulator, I would be most happy to return the favor.
Feel free to ping me with any questions and I'll try to answer them. Thank you all again for making this possible! ^-^
