MD hangs after playing any PCM file

prophet36
Very interested
Posts: 234
Joined: Sat Dec 13, 2008 6:58 pm
Location: London, UK

Re: MD hangs after playing any PCM file

Post by prophet36 » Sat Apr 08, 2017 4:38 pm

amushrow wrote: So if we take the common example of Sonic 1, as soon as the game starts you get to hear the wonderful "SEGAAA" sound clip, after which the game crashes (Illegal instruction $00071FC6 in this case).
This is a known issue that several people have identified[1][2][3]. The crash is in the "@busyloop" of Sonic's PCM-player code[4], and happens because of some observed instability in the address bus after /OE asserts during the handover of the bus from the Z80 to the 68000 (see the logic-analyzer traces in [1] & [2]). I'm beginning to suspect it's not due to the board revision, but to some kind of electrical degradation which affects some machines but not others (e.g. PSU noise, degradation of electrolytic capacitors, etc.). I have not been able to investigate it because I don't have access to an affected machine.

[1] viewtopic.php?f=20&t=2239&hilit=address ... 225#p28597
[2] viewtopic.php?f=20&t=2239&start=210#p28454
[3] https://groups.google.com/d/msg/umdkv2- ... NnP1rTHAAJ
[4] https://github.com/sonicretro/s1disasm/ ... #L683-L703

Jorge Nuno
Very interested
Posts: 374
Joined: Mon Jun 11, 2007 3:09 am
Location: Azeitão, PT

Re: MD hangs after playing any PCM file

Post by Jorge Nuno » Sat Apr 08, 2017 7:21 pm

This looks like metastability to me, i.e. inputs that aren't double-registered. I've never experienced this kind of problem with my carts (with DMA, oh yes).
Also prophet36, any special reason for not using CE_0?

HardWareMan
Very interested
Posts: 750
Joined: Sat Dec 15, 2007 7:49 am
Location: Kazakhstan, Pavlodar

Re: MD hangs after playing any PCM file

Post by HardWareMan » Sun Apr 09, 2017 6:17 am

Yes, it is bus metastability. The M68K itself can't do that after asserting AS (by design), so this is a chipset-related problem. Oh, deja vu, haven't we already discussed this before? Anyway, yes, there is a problem. SRAM or ROM can cope with it because the metastability ends before the guaranteed SRAM/ROM data setup time after an address change. With SDRAM we must know the access address as early as we can, because we must go through the RAS-CAS cycles and this takes some time. We must also refresh the SDRAM (some more cycles), so the SDRAM is not always ready to accept a new access at the moment the MD wants one. It is like sampling aliasing in sound digitizing. The only way out is to increase the SDRAM frequency to simulate the SRAM/ROM address setup delay (or to use delayed signals, such as CAS2). Maybe even switch to DDR (dreaming)?

I know this is the hard way. It is not hard for developing a new product; it is hard for supporting users who already have the old one.

amushrow
Interested
Posts: 44
Joined: Mon Jan 02, 2017 12:56 pm

Re: MD hangs after playing any PCM file

Post by amushrow » Sun Apr 09, 2017 10:36 am

Ah, the probe: 1x is 6 MHz and 10x is 100 MHz, so the signal was being heavily attenuated. I've never had a scope before so you'll have to forgive me.

However I have found some more useful information.
The UMDK's bridge board triggers signals at ~2V (for me anyway). With the scope on /OE (on both the MD side and the FPGA side), playing a sound effect in MJ's Moonwalker induces a large amount of noise on /OE, causing the UMDK to see it go high when it shouldn't. I had the scope set to trigger on pulse widths of less than 100ns.

I'll check using Sonic as well to see if it has the same behaviour, and also on the MDII to see why it doesn't have an issue. I'll also have to see if other lines are picking up noise. I'm no expert, but I don't think we can have things randomly asserting (or deasserting in this case, I believe).

EDIT: Same noise picked up on Sonic 1. The MDII is clean as a whistle, it seems completely impervious.

Re: MD hangs after playing any PCM file

Post by prophet36 » Sun Apr 09, 2017 11:48 am

amushrow wrote:playing a sound effect in MJ's Moonwalker induces a large amount of noise on /OE
Nice work. Is it possible for you to repeat that test without UMDK? With Moonwalker on a dumb flash-cart, maybe?
amushrow wrote:The MDII is clean as a whistle
So does that MD2 exhibit any problems with your UMDK cart? Did you use the same power-supply that you used with the noisy-OE machine? Can you do a similar comparison between these two machines with the +5V line?

Re: MD hangs after playing any PCM file

Post by amushrow » Sun Apr 09, 2017 12:04 pm

I already checked with a regular cart and there was no noise. I don't have any other flash cart, if you wanted a comparison for that.

MDII has no problems at all (well, I haven't encountered any).
They're both using the same power supply, actually a bench power supply. My Mega Drive's 10V brick was putting out 13.5V, so it was getting quite toasty and the heat-sink kept burning me.

Re: MD hangs after playing any PCM file

Post by prophet36 » Sun Apr 09, 2017 12:43 pm

OK so in summary, on the question of whether you see noise on /OE during PCM playback, keeping everything else the same (bench PSU, same ROM, same UMDK cart, same way of measuring, same testgear etc):

MD1 + Regular cart: NO
MD2 + UMDK: NO
MD1 + UMDK: YES

At risk of jumping to conclusions from an observed correlation, that implies that the UMDK cart is causing the noise on the MD1's /OE. At a guess I'd say that was down to UMDK drawing current-spikes, which the MD2 is able to supply without difficulty but which are causing some bounce on the MD1's +5V supply rail. So, what does the +5V rail on the bridge-board look like? How does it compare between the MD1 and MD2? Maybe try triggering on /OE and sampling +5V on the other 'scope channel?

Re: MD hangs after playing any PCM file

Post by amushrow » Sun Apr 09, 2017 3:04 pm

5v rail is completely stable, as is the 3.3v on the UMDK.

Just to eliminate another possibility I also checked a little closer, and it is the MD that is triggering the UMDK, which I assume was already assumed. This just means that the noise isn't coming from the 3.3v side of the bridge board and getting sent back the other way.

EDIT: For your delectation
Attachment: BlueOE_Yellow5V.png
Yellow line is the 5V from the cart slot. Blue line is /OE. Sometimes there's more noise (more peaks, similar voltage), sometimes there are just a couple of peaks.

EDIT2: No other pin on the cart has noise that matches /OE (Hooray?)
However the data lines look like they might have something going on, which is giving an extra blip MD side. I don't reckon it's causing an issue (Present on both MD1 & MDII) but it isn't pretty. I've not looked at the signal on the other side of the bridge board yet.

D03 (A29 on the cart) random sample using a regular cart:
Attachment: D03_MD.png
D03 with UMDK running something (another random sample):
Attachment: D03_UMDK.png
Before the UMDK menu boots the signal is clean. Once in the menu it has some ringing, which falls off very rapidly, and when running a ROM it looks like the screenshot, with the extra peak after the signal goes low.

EDIT3: Taken from the MDII, the peaks on D03 after the signal goes low are from a 7v peak (oh my) coming from the UMDK. I'm measuring the normal signal voltage at 3.7v

Re: MD hangs after playing any PCM file

Post by Jorge Nuno » Sun Apr 09, 2017 5:21 pm

It looks like CAS0 (You people like to call it OE for some freaking reason) has either:

1: Disturbance from nearby signals toggling / ground bounce.
2: Resonance due to the signal driver, trace lengths and parasitic elements, both on the PCB and on the component pads.
3: The buffer on the "addon" board is itself driving the signal erroneously, because the component is being subjected to undershoots or dirty voltages.

Solutions?
1: Not much can be done here besides reducing the slew rate on the FPGA, although that is only an indirect interaction (as the buffer is what drives the signal towards the console). This can be done via the HDL or UCF project files.

2: Cut the trace and add a series resistor of 50-500 Ohm between OE and the buffer. This reduces the Q factor of the RLC "filter". The driver of CAS0 is the VDP itself (and possibly the IO chip), which of course can't be changed; that means strong sinking capability and a weaker high-side driver (the high side is probably an NMOS transistor, maybe in depletion mode).

3: Replace the buffer with another kind, like the regular 74xx16245 (which is not a level converter); the 5V rail needs to be cut and connected to 3.3V.

1 & 2 are obviously motherboard dependent. Minute characteristics like copper etching irregularities or IC fabrication tolerances will also affect such things. The thing to remember is that in a normal scenario this situation doesn't occur on CAS0.
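
To put a rough number on solution 2, here is a small Python sketch that treats the driver, trace and buffer input as a lumped series RLC circuit and shows how an added series resistor lowers the Q factor. The 10 Ohm driver resistance, 20 nH trace inductance and 15 pF load capacitance are made-up illustrative values, not measurements from a Mega Drive or a UMDK board, and the lumped model itself is a simplification.

```python
import math

def series_rlc_q(r_ohm, l_henry, c_farad):
    """Q factor of a lumped series RLC circuit: Q = (1/R) * sqrt(L/C)."""
    return math.sqrt(l_henry / c_farad) / r_ohm

# Hypothetical values, for illustration only.
R_DRIVER = 10.0    # assumed output resistance of the CAS0 driver, in Ohm
L_TRACE = 20e-9    # assumed trace + connector inductance, in H
C_LOAD = 15e-12    # assumed buffer input + pad capacitance, in F

if __name__ == "__main__":
    for r_series in (0.0, 50.0, 500.0):
        q = series_rlc_q(R_DRIVER + r_series, L_TRACE, C_LOAD)
        print(f"series R = {r_series:5.0f} Ohm -> Q = {q:.2f}")
```

With these assumed values the circuit goes from clearly underdamped (Q well above 0.5, so it rings) to overdamped once the resistor is added. The price is a slower edge at the buffer, so the real value would have to be found by experiment.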

Data bit line disturbances are normal, since so many devices are able to drive the bus (and sometimes none are, so the line will float towards 5V), although the tiny-duration 5V pulse is not good because it has too high bandwidth and could create some problems.

To further check, you should try to trigger CAS0 in these glitches and check if any address bit or otherwise nearby signal is toggling.


I think a HW redesign might be the ultimate solution though.

EDIT: BTW one day I was checking the rise time of a Spartan-6 FPGA, and it has a 600 picosecond rise/fall time AND I think the probe (2GHz, active probe, single ended) was limiting the bandwidth.
-- just sayin --

EDIT2: After giving it some more thought, it could also be a ground bounce issue due to the data transceiver driving the bus with too much "force", especially if many data bits are changing to zero.

Re: MD hangs after playing any PCM file

Post by prophet36 » Mon Apr 10, 2017 9:59 pm

An interesting thing occurred to me when I looked at your 'scope traces, concerning the instruction-trace Montserrat recorded last year (when one of his consoles was crashing in a similar place, on the "SEGA!" PCM on Sonic1's intro). We saw two adjacent reads of the same PCM address:

Code:

167355419 C RD 071FC4 4E71  ; nop
167355444 C RD 071FC6 51C8  ; dbra d0,
167355469 C RD 071FC8 FFFC  ;         -4
167355514 C RD 079852 7581  ; <--- PCM read
167355583 C RD 071FC4 4E71  ; nop
167355608 C RD 071FC6 51C8  ; dbra d0, ... etc
  :
167358463 C RD 071FC4 4E71  ; nop
167358488 C RD 071FC6 51C8  ; dbra d0,
167358514 C RD 071FC8 FFFC  ;         -4
167358545 C RD 079852 7581  ; <--- PCM read
167358560 C RD 079852 7581  ; <--- PCM read (same address!)
167358615 C RD 071FC6 51C8  ; pre-fetch dbra d0, ...
167358673 C WB FFFDAC 1FC4  ; save PC, low word
167358699 C WB FFFDA8 2608  ; save SR
167358724 C WB FFFDAA 0007  ; save PC, high word
167358741 C RD 000010 0000  ; illegal instruction vector, high word
167358766 C RD 000012 03E6  ; illegal instruction vector, low word
Usually the PCM reads from the 0x079688 region are far apart, but at timestamp 167358545ns there are two reads from the same address, adjacent to each other. Whilst the FPGA is busy processing the latter read, it misses the expected 68000 read from 0x071FC4, so rather than reading 0x4E71 (NOP) it reads 0x7581, which throws an exception and crashes the 68000.
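
If anyone wants to scan a longer capture for the same pattern, something like the Python sketch below would do it. It is not part of the UMDK tooling; it just assumes the trace is saved as text in the format shown above (timestamp in ns, "C", RD or WB, a six-digit hex address, a four-digit hex data word) and flags any two consecutive reads of the same address. The function names are my own.

```python
import re

# Excerpt of the instruction trace above, used as sample input.
TRACE = """\
167358463 C RD 071FC4 4E71  ; nop
167358488 C RD 071FC6 51C8  ; dbra d0,
167358514 C RD 071FC8 FFFC  ;         -4
167358545 C RD 079852 7581  ; <--- PCM read
167358560 C RD 079852 7581  ; <--- PCM read (same address!)
167358615 C RD 071FC6 51C8  ; pre-fetch dbra d0, ...
"""

# timestamp, "C", RD/WB, 24-bit address, 16-bit data; anything after is ignored
LINE = re.compile(r"^\s*(\d+)\s+C\s+(RD|WB)\s+([0-9A-Fa-f]{6})\s+([0-9A-Fa-f]{4})")

def parse(text):
    """Return a list of (timestamp_ns, op, address, data) tuples."""
    records = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:  # lines such as the "  :" elision marker are skipped
            records.append((int(m.group(1)), m.group(2),
                            int(m.group(3), 16), int(m.group(4), 16)))
    return records

def repeated_reads(records):
    """Find consecutive RD cycles that hit the same address."""
    hits = []
    for prev, cur in zip(records, records[1:]):
        if prev[1] == "RD" and cur[1] == "RD" and prev[2] == cur[2]:
            hits.append((prev[0], cur[0], cur[2], cur[0] - prev[0]))
    return hits

if __name__ == "__main__":
    for first, second, addr, gap in repeated_reads(parse(TRACE)):
        print(f"repeated read of {addr:06X} at {first} and {second} ({gap}ns apart)")
```

On the excerpt above this picks out the pair of reads from 079852, which are only 15ns apart.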

When we later got a logic-analyzer trace, it looked like this:

Image

At the time I failed to put these two pieces of information together, and always assumed the region pointed to was the gap between two reads, very close together. But in retrospect, looking at your 'scope trace, I think it might be just a bit of noise part-way through what's supposed to be a single, long read. The FPGA incorrectly interprets this noise as the beginning of another read, hence the double-adjacent-read from the same PCM address seen in the instruction trace. As for the source of the noise, I guess it could be due to slew-rate issues as Jorge pointed out (I'm not a real electronic engineer, I just make it up as I go along, so I defer to others with more experience of such things).

I can certainly reduce the slew-rate of the FPGA I/Os, and I can add some debounce logic to ensure that short, logic-high (i.e. deasserted) pulses on /OE are ignored so they don't erroneously trigger a new read.
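
To illustrate what I mean by the debounce, here is a behavioural model in Python. It is only a sketch of the idea, not the actual FPGA code: /OE is treated as a list of samples taken once per FPGA clock (0 = asserted, 1 = deasserted), a falling edge passes straight through, and a rising edge is accepted only after the line has stayed high for min_high consecutive samples. The sample data and the threshold of 3 are made up.

```python
def filter_short_high_pulses(samples, min_high):
    """Ignore logic-high pulses shorter than min_high samples on an active-low line."""
    out = []
    state = 1      # filtered level; the line is assumed to idle high (deasserted)
    high_run = 0   # number of consecutive high samples seen so far
    for s in samples:
        if s == 0:
            high_run = 0
            state = 0              # assertion passes through immediately
        else:
            high_run += 1
            if high_run >= min_high:
                state = 1          # high for long enough: accept the deassertion
        out.append(state)
    return out

def count_read_starts(samples):
    """Count falling edges, i.e. the points where a new read would be started."""
    count = 0
    prev = 1       # assume the line was deasserted before the first sample
    for s in samples:
        if prev == 1 and s == 0:
            count += 1
        prev = s
    return count

if __name__ == "__main__":
    # One long read with a single-sample glitch part-way through (made-up data).
    raw = [1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]
    clean = filter_short_high_pulses(raw, min_high=3)
    print("reads seen without filter:", count_read_starts(raw))
    print("reads seen with filter:   ", count_read_starts(clean))
```

The trade-off is that the filtered signal reports the end of every read min_high samples late, so the threshold has to stay below the shortest genuine high period on /OE.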

Re: MD hangs after playing any PCM file

Post by amushrow » Mon Apr 10, 2017 10:06 pm

Yes, I'd seen those posts, which is what prompted me to poke in that spot (and also read up on more than I actually wanted to know about the Mega Drive)
You could add logic to cover up the issue, but it doesn't fix the real problem (and I become less sure of the problem the more I look at it). I've also noticed an 80ns period just before the noise where it's held high, which also looks suspicious.
Jorge Nuno wrote:CAS0 (You people like to call it OE for some freaking reason)
Well I see the pin is CAS0 on the VDP, but on cart and expansion slot diagrams around the place it seems to be listed as OE. Maybe I'll randomly switch between the two and everyone will have to guess what I'm on about.

Anyway, I'm seeing a little more than noise on CAS0 during the crash. Aside from the one screen grab I posted of the noise, there is usually a ~80ns period where it is high, followed by the noise. When things are running normally (UMDK when it's not crashing, MDII, regular carts) the shortest period where it is high is ~130ns.
I'm not sure what to make of that, but there it is.

A slight change in direction: does anybody know what CAS0 is supposed to be doing during the Bus Grant & Acknowledge malarkey?
Because this noise and whatnot always happens just after BG and BGAK, and every time I look at BG/BGAK when not triggering on the noise, CAS0 is high until the procedure is over.
Whenever I trigger on the noise it still seems to be doing its thing.

Re: MD hangs after playing any PCM file

Post by HardWareMan » Tue Apr 11, 2017 1:47 am

amushrow wrote:Well I see the pin is CAS0 on the VDP, but on cart and expansion slot diagrams around the place it seems to be listed as OE. Maybe I'll randomly switch between the two and everyone will have to guess what I'm on about.
Here, I'll put it closer to you.
Image
Image
Now you see the difference? Also, you can't see CAS0 on the VDP, since it doesn't use it. CAS0 fans out from the arbiter.

Re: MD hangs after playing any PCM file

Post by Jorge Nuno » Tue Apr 11, 2017 8:27 am

HardWareMan VDP pin 118 :wink:

Re: MD hangs after playing any PCM file

Post by HardWareMan » Tue Apr 11, 2017 10:15 am

Jorge Nuno wrote:HardWareMan VDP pin 118 :wink:
Oh, you're right. My bad.

Re: MD hangs after playing any PCM file

Post by prophet36 » Tue Apr 11, 2017 10:13 pm

amushrow wrote:You could add logic to cover up the issue, but it doesn't fix the real problem (and I become less sure of the problem the more I look at it)
OK, there are a few things we can try, to identify (and hopefully fix) the "real problem". Many of these ideas are just shots in the dark so don't be too upset if they don't work. The first thing is a build of the "testing" FPGA binary, with the slew rate on the MD data bus set to "QUIETIO", which is the slowest slew-rate available on this FPGA. You can try it like this (assuming your sonic1.bin is in your $HOME):

Code:

$ wget -q https://www.dropbox.com/s/ch8sa01l34j2f76/fpga-test-20170411.xsvf
$ flcli -v 1d50:602b -p J:A7A0A3A1:fpga-test-20170411.xsvf
Attempting to open connection to FPGALink device 1d50:602b...
Connected to FPGALink device 1d50:602b (firmwareID: 0xFFFF, firmwareVersion: 0x20140311)
Programming device...
$ loader -w $HOME/sonic1.bin:0 -x 2
UMDKv2 Loader Copyright (C) 2014 Chris McClelland

Putting MD in reset...
Writing SDRAM...
Releasing MD from reset...
$
Does this make any difference to the /OE noise situation?