UMDK manufacturing, part 2: Software

Post by Montserrat » Sun Dec 27, 2015 3:59 pm

Oh god... I have good news and bad news.

Bad news first: I was doing some testing with the Mega Turrican games, so I tried them on my other Mega Drives, and I was surprised to get the same behaviour as Burbruee's. I can't load most games on the PAL-G (original) and PAL-I (modded) consoles: I get black screens, red screens, freezes, weird behaviour, screwed-up colour palettes, and more. I also get those errors on Sonic's Sega logo: illegal instruction, emulator, and so on.

The modded Asian PAL-1 (where I normally do all my testing) has some loading problems too: the EU version of Mega Turrican gives a black screen. I tried a US version; it loads, but has graphical glitches.

BTW, the Nomad worked just fine with the US ROM.

So that's the bad news: having different behaviour depending on the hardware version means we're screwed, since the MD has a ton of versions.

The good news, Burbruee, is that there's probably nothing wrong with your UMDK; it has more to do with your console. It also means UMDK manufacturing is more reliable now.

Here are some test logs I made with all the consoles, using both the EU and US Mega Turrican ROMs. The PAL-G and the NTSC Nomad are not modded, so there are only EU and US ROM logs for them, respectively. For the PAL-I and the PAL-1 (Asian) I did both EU and US logs.

The NTSC Nomad is the only one that works fine.
The PAL-1 (Asian) can load the US version, but there are graphical glitches. The EU version won't load.
The rest of the consoles won't load either ROM.

https://mega.nz/#F!2ItUlAoB!y5bXiuziGRexsvZMux9FSw

Post by prophet36 » Sun Dec 27, 2015 5:12 pm

It looks to me like you're trying to load an .smd file in your EU logs (they all show reading 0xAABB at 0x000008, which is a pretty good indication that it's in .smd format). UMDK supports only .bin files. That explains why Mega Turrican fails on all your EU machines.
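The same check can be made on the file itself before loading it. This is only a minimal sketch: it assumes the usual Super Magic Drive layout (a 512-byte header with the marker bytes 0xAA and 0xBB at offsets 8 and 9) and that a raw .bin dump carries the "SEGA" string at offset 0x100. The function names are illustrative and not part of the UMDK tools.

```python
def looks_like_smd(data: bytes) -> bool:
    """Heuristic: True if the image starts with an SMD-style copier header."""
    # Assumed SMD header layout: byte 8 = 0xAA, byte 9 = 0xBB.
    # Read as a big-endian 68000 word at 0x000008 that is 0xAABB,
    # which is the value showing up in the EU logs.
    return len(data) >= 512 and data[8] == 0xAA and data[9] == 0xBB


def looks_like_bin(data: bytes) -> bool:
    """Heuristic: a raw (non-interleaved) dump normally has "SEGA" at 0x100."""
    return data[0x100:0x104] == b"SEGA"


def describe(data: bytes) -> str:
    """Classify an image by content, ignoring its file extension."""
    if looks_like_smd(data):
        return "smd"
    if looks_like_bin(data):
        return "bin"
    return "unknown"
```

Calling `describe(open(path, "rb").read())` on each dump would show whether the extension matches the content; both checks are heuristics and can be fooled.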

Are the Sonic glitches something you've noticed only since upgrading to the 20151220 release? Or have you just not tried Sonic on those machines before? And you've definitely eliminated the possibility of a dirty cart-slot? Unless the UMDK bridge-board's edge-connector is gold-plated, it will gradually tarnish, making for bad connections. And UMDK's synchronous design makes it rather more susceptible to bad connections than a regular flash-cart would be.

The only possibility I can think of is that the older machines have slower signal rise and fall times, resulting in glitches on /C_OE. Fixing that would be tricky enough even if I could reproduce it here. Then again, so pervasive a bug with older hardware would surely have shown up on Minty's MD1, but as far as I'm aware it has not.

Post by Montserrat » Sun Dec 27, 2015 11:04 pm

Both Mega Turrican ROMs are .bin files. Send me your dump if you wish and I will try it (fenix_awing@hotmail.com).

As for the Sonic glitches, I haven't tried anything on the other consoles (except the Nomad) since we cleared the A21 bridge; you told me to focus on the Asian one, so I put the others away. Sonic works well on the Asian console (all regions) and on the Nomad.

I've cleaned all my MDs thoroughly, using compressed air and a little tool used for cleaning teeth (very effective at extracting fluff from inside the slots). Also, the Nomad is working 100% as expected, so I don't think it's a dirty slot, bad contacts or a wrong ROM.

Post by Nemesis » Mon Dec 28, 2015 10:28 am

Ok, dug my UMDK out of storage today, and had a crack at initializing it. I'm getting a failure when trying to verify the FPGA operation, specifically at this step (from the new wiki page):
"Next, write 512KiB of random data to the FPGA, and compare the FPGA-calculated and CPU-calculated checksums"
I'm often getting a slightly different checksum returned, which is obviously a major problem. I can see from use that running the checksum tool multiple times without reflashing will combine the data blocks and so isn't valid to do (might be worth noting in the docs), but after reprogramming and sending the same data, I get a different checksum than the time before, so there's some randomness in the error. I get the right checksum about half the time, other times it's usually off by less than 0x10. Here's an example of a failure:

Code:

ubuntu@ubuntu:~$ flcli -v 1d50:602b -a 'w0 "random.bin";r1;r2' -b
Attempting to open connection to FPGALink device 1d50:602b...
Connected to FPGALink device 1d50:602b (firmwareID: 0xFFFF, firmwareVersion: 0x20140311)
Executing CommFPGA actions on FPGALink device 1d50:602b...
Wrote 524288 bytes (checksum 0xB4E8) to channel 0 at 44.547398 MiB/s
Read 1 bytes (checksum 0x00B4) from channel 1 at 0.014901 MiB/s
Read 1 bytes (checksum 0x00DE) from channel 2 at 0.013821 MiB/s
     00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0000 B4 DE                                           ..

I don't have access to basically any of my equipment right now, but I can't spot any problems from a quick visual inspection. I never seem to have trouble detecting or programming the FPGA. Any suggestion on what could be at fault here?
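For reference, the comparison in that log can be worked through as follows. This is a sketch under two assumptions that the thread does not confirm: that the checksum flcli prints is the 16-bit sum of all bytes written, and that channels 1 and 2 return the high and low bytes of the FPGA's checksum (which is how the log reads). The helper names are made up.

```python
def sum16(data: bytes) -> int:
    """16-bit additive checksum (assumed to match what flcli reports)."""
    return sum(data) & 0xFFFF


def fpga_checksum(hi: int, lo: int) -> int:
    """Combine the byte read from channel 1 (high) and channel 2 (low)."""
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


def checksum_error(cpu: int, fpga: int) -> int:
    """Signed difference fpga - cpu, wrapped into -32768..32767."""
    diff = (fpga - cpu) & 0xFFFF
    return diff - 0x10000 if diff & 0x8000 else diff


cpu = 0xB4E8                      # "Wrote 524288 bytes (checksum 0xB4E8)"
fpga = fpga_checksum(0xB4, 0xDE)  # bytes read back from channels 1 and 2
print(f"CPU 0x{cpu:04X}, FPGA 0x{fpga:04X}, error {checksum_error(cpu, fpga)}")
```

In this run the FPGA value is 10 (0x0A) lower than the CPU value, consistent with "off by less than 0x10".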

Post by prophet36 » Mon Dec 28, 2015 10:48 am

Montserrat wrote:Both mega turrican ROMs are .bin files
They may well both be files with .bin extensions, but your logs clearly indicate the EU dump you have is in .smd format, irrespective of its file extension. It may be possible to download a converter to make it into a genuine .bin format file.
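Such a conversion is simple enough to sketch. The following assumes the commonly described SMD layout, which is not spelled out in this thread: a 512-byte header followed by 16 KiB blocks, each holding the odd-addressed bytes in its first 8 KiB and the even-addressed bytes in its second 8 KiB. The function name is illustrative.

```python
SMD_HEADER = 512      # copier header, discarded
SMD_BLOCK = 16384     # interleaved block size
HALF = SMD_BLOCK // 2


def smd_to_bin(smd: bytes) -> bytes:
    """De-interleave an SMD image into a raw .bin image (assumed layout)."""
    body = smd[SMD_HEADER:]
    if len(body) % SMD_BLOCK:
        raise ValueError("SMD body is not a whole number of 16 KiB blocks")
    out = bytearray(len(body))
    for base in range(0, len(body), SMD_BLOCK):
        block = body[base:base + SMD_BLOCK]
        out[base + 1:base + SMD_BLOCK:2] = block[:HALF]  # first half -> odd addresses
        out[base:base + SMD_BLOCK:2] = block[HALF:]      # second half -> even addresses
    return bytes(out)
```

After converting, the word at 0x000008 should no longer read 0xAABB, and "SEGA" should appear at offset 0x100 if the layout assumption holds for the dump in question.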
Montserrat wrote:so i dont think its a dirty slot problem, contacts or a wrong rom.
OK, let's work on the assumption that you're correct, and this is a genuine problem, but one affecting only some MegaDrive models.

So far as I know, from the last batch of five, only you and Burbruee have actually tried your UMDKs. So far, these Sonic glitches do not appear on my EU MD2, Minty's EU MD1, Minty's EU MD2, your JP MD1 or your US Nomad, but they do appear on your two EU MD1s (PAL-I & PAL-G) and on Burbruee's EU MD1. Is that correct, to the best of your knowledge?

Let's concentrate on one of your crashing machines. Can you get a trace of Sonic crashing with an Address Error or Line 1111 Emulator exception, preferably a trace of an early crash (e.g. a crash in the initial "Sega" banner would be ideal)? Also, if you can get the basic signal test to fail, your compressed result.txt.bz2 file would be useful. Ultimately, to solve this I'll need to implement a better logic analyzer, capable of getting a clear picture of what's going on at the signal level, and that will take some time.

Post by Nemesis » Mon Dec 28, 2015 10:59 am

Jumping ahead a bit in the steps, I should add to my problem that tests on the SDRAM consistently pass. I'm unable to do further tests though, since the gordon tool always crashes with a segmentation fault when I try to program the FPGA flash. Is my FPGA hosed? I did hit it with a continuity test across some random pins by mistake instead of a resistance test during assembly. I was hoping I'd got away with it unscathed, but maybe not...

Post by prophet36 » Mon Dec 28, 2015 11:29 am

Nemesis wrote:running the checksum tool multiple times without reflashing will combine the data blocks and so isn't valid to do (might be worth noting in the docs)
Good point, thanks.
Nemesis wrote:I get the right checksum about half the time, other times it's usually off by less than 0x10
I would check for dry solder-joints on the 15 signals routed from the FX2 to the FPGA. You could also try the test with a file that is smaller but still a multiple of 512 bytes, and with a small file that is +/- a few bytes from being a multiple of 512 bytes. If you can reproduce the checksum mismatch for say 4096 bytes, we can make another .xsvf file with just a big FIFO that can be written to and read back from, allowing us to do a bitwise compare of what went in with what came out.
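The test files suggested above could be produced along these lines. It is only a sketch: the `rand<size>.bin` naming and the choice of three bytes either side of the boundary are illustrative assumptions.

```python
import os


def make_test_files(directory: str, base_size: int = 4096, deltas=(0, -3, 3)):
    """Write random files of base_size + delta bytes and return their paths."""
    if base_size % 512:
        raise ValueError("base_size should be a multiple of 512 bytes")
    paths = []
    for delta in deltas:
        size = base_size + delta
        path = os.path.join(directory, f"rand{size}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))  # random payload, as in the 512KiB test
        paths.append(path)
    return paths
```

With the defaults this writes rand4096.bin, rand4093.bin and rand4099.bin, which can then be passed to the `w0` action one at a time.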

With the swled.xsvf file loaded, can you check the voltages again? The output of the two regulators should be close to 3.3V and 1.2V.

Another thing you can try is to make a small data file (e.g. 4096 bytes) and just do writes in a tight loop, like this:

Code:

while [ 1 ]; do flcli -v 1d50:602b -a 'w0 "rand4096.bin";r1;r2'; done

The thing to do is to leave it running for several hours and watch to see if it eventually hangs. That will tell us whether there's a problem sending to the FPGA or reading from it.
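If watching the terminal for hours is impractical, the same loop can be wrapped with a timeout so that a hang is reported automatically. This is a sketch, not part of the UMDK tooling: the 30-second timeout is arbitrary, and treating a non-zero exit status as a failure assumes flcli signals errors that way.

```python
import subprocess


def run_until_hang(cmd, timeout_s=30.0, max_runs=None):
    """Run cmd repeatedly; return (completed_runs, reason_for_stopping)."""
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            return runs, "hang"  # the command did not finish in time
        if result.returncode != 0:
            return runs, f"exit status {result.returncode}"
        runs += 1
    return runs, "completed"
```

For the loop above, the call would be `run_until_hang(["flcli", "-v", "1d50:602b", "-a", 'w0 "rand4096.bin";r1;r2'])`, and the returned count shows how many iterations succeeded before the first problem.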

I forget, did you receive the components from me or did you source them yourself? Do you remember what kind of decoupling capacitors you used (all the 0805s on the back of the board)?
Nemesis wrote:I did hit it with a continuity test across some random pins by mistake instead of resistance test during assembly
I think these FPGAs are fairly resilient, so unless you hit it with a large voltage or let your cat sit on it or did the soldering wearing a billowy nylon dress(!) you're probably still OK.
Nemesis wrote:tests on the SDRAM consistently pass
That's useful information. Don't proceed further though until we know what's up.
Nemesis wrote:gordon tool always crashes with a segmentation fault
That should never happen, irrespective of what's up with your hardware. Is your Linux machine x86 or x64? I can give you a debug build of gordon which will help track that problem down.

Post by Nemesis » Mon Dec 28, 2015 12:18 pm

prophet36 wrote:I would check for dry solder-joints on the 15 signals routed from the FX2 to the FPGA. You could also try the test with a file that is smaller but still a multiple of 512 bytes, and with a small file that is +/- a few bytes from being a multiple of 512 bytes. If you can reproduce the checksum mismatch for say 4096 bytes, we can make another .xsvf file with just a big FIFO that can be written to and read back from, allowing us to do a bitwise compare of what went in with what came out.
I don't have any magnifying gear, but I'll scrutinize the joints more closely when I can. I'll probably re-flow all the joints connecting the FPGA to the FX2 when I get my soldering iron back, if I haven't resolved it by then.
prophet36 wrote:With the swled.xsvf file loaded, can you check the voltages again? The output of the two regulators should be close to 3.3V and 1.2V.
Yeah... I, erm, skipped that part :). My multimeter is still in storage, couldn't track it down immediately. I'll see if I can get my hands on one and check the voltages.
prophet36 wrote:I forget, did you receive the components from me or did you source them yourself? Do you remember what kind of decoupling capacitors you used (all the 0805s on the back of the board)?
I sourced the parts myself. I can't remember what rating I used for those caps off the top of my head. I took notes though about what I was ordering and what for, I'll try and dig them up, I should be able to identify what they are.
prophet36 wrote:I think these FPGAs are fairly resilient, so unless you hit it with a large voltage or let your cat sit on it or did the soldering wearing a billowy nylon dress(!) you're probably still OK.
That's good to hear, fingers crossed it's just a minor flaw.
Nemesis wrote:gordon tool always crashes with a segmentation fault
prophet36 wrote:That should never happen, irrespective of what's up with your hardware. Is your Linux machine x86 or x64? I can give you a debug build of gordon which will help track that problem down.
x64. More specifically, I'm running Ubuntu 15.10 off a USB stick on a Dell Precision M6700 laptop, in case any of that is relevant in any way.


Also, while I've been writing this reply, I've had the UMDK running that loop test you sent through, no lockups yet. Thanks for your help on this.

Post by Nemesis » Mon Dec 28, 2015 1:42 pm

Tracked down my meter; it wasn't packed away in storage where most of my gear is, it was on the shelf in my garage. Here are my results:
USB: 5.104V
Reg 3.3: 3.311V
Reg 1.2: 1.188V
The USB being at 5.1V isn't unusual, so I don't expect that's a cause for concern. The 3.3V reg is basically spot on. The 1.2V reg is a touch low at 1.188V, but well within tolerance I would have thought (only 1% below spec). Any concerns here?
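The arithmetic behind "only 1% below spec" is easy to check. The sketch below takes 5.0 V, 3.3 V and 1.2 V as the nominal values; what tolerance the parts actually allow is not stated in this thread, so only the deviations are computed.

```python
def deviation_percent(measured: float, nominal: float) -> float:
    """Deviation of a measured voltage from its nominal value, in percent."""
    return (measured - nominal) / nominal * 100.0


readings = {
    "USB": (5.104, 5.0),
    "Reg 3.3": (3.311, 3.3),
    "Reg 1.2": (1.188, 1.2),
}
for name, (measured, nominal) in readings.items():
    print(f"{name}: {deviation_percent(measured, nominal):+.2f}%")
```

The 1.2 V rail comes out 1% low, the 3.3 V rail about a third of a percent high, and the USB supply about 2% high.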

Post by Montserrat » Mon Dec 28, 2015 3:55 pm

prophet36 wrote:They may well both be files with .bin extensions, but your logs clearly indicate the EU dump you have is in .smd format, irrespective of its file extension. It may be possible to download a converter to make it into a genuine .bin format file.
Yep, the Euro ROM dump was weird. I've loaded another Mega Turrican EU .bin file and it now works on the Asian PAL-1.

prophet36 wrote:So far as I know, from the last batch of five, only you and Burbruee have actually tried your UMDKs. So far, these Sonic glitches do not appear on my EU MD2, Minty's EU MD1, Minty's EU MD2, your JP MD1 or your US Nomad, but they do appear on your two EU MD1s (PAL-I & PAL-G) and on Burbruee's EU MD1. Is that correct, to the best of your knowledge?
Note that the Asian PAL-1 is NOT a Japanese NTSC model. The rest is correct.
prophet36 wrote:Let's concentrate on one of your crashing machines. Can you get a trace of Sonic crashing with an Address Error or Line 1111 Emulator exceptions, preferably a trace of an early crash (e.g a crash in the initial "Sega" banner would be ideal). Also, if you can get the basic signal test to fail, your compressed result.txt.bz2 file would be useful.
OK, I've chosen the PAL-G console because it's not modified and, IIRC, it's the most common Mega Drive in Europe.

I tested some games:

Alisia Dragoon and SoR 1 worked well; I played the whole of stage 1 in both games.

Sonic 1 always fails with the illegal instruction error or the address error.

Sonic 3 and Streets of Rage 3 freeze and randomly reset the console.

Mega Turrican just freezes.

Here are the tests:

Sigtest result is OK: d5a670db486de8597e214c6849fdfacb

https://mega.nz/#F!rJk2CKyI!cuQbjeOQDw5XFvxqrVQBAQ

Post by prophet36 » Mon Dec 28, 2015 5:01 pm

Nemesis wrote:The USB being at 5.1V isn't unusual, so I don't expect that's a cause for concern. The 3.3V reg is basically spot on. The 1.2V reg is a touch low at 1.188V, but well within tolerance I would have thought (only 1% below spec). Any concerns here?
Nope, that all looks fine to me. If there was a short somewhere I'd expect the 3.3V rail to be low, because it drives the FPGA's I/O.

One thing which does strike me as a difference is this, from your message from 18 months ago:
Nemesis wrote:Swapped XC6SLX9-2TQG144C for XC6SLX9-3TQG144C. Substituted part has a better speed rating
I wonder if the faster FPGA is causing problems? Possibly the faster rise and fall times are causing some kind of ringing on the PCB? A more likely explanation is that programming an FPGA with speed grade -3 with a file intended for the speed grade -2 part is causing it to misbehave. I built you a cksum.xsvf for the -3 part. Give it a whirl.

Also, I made you a release and debug build of both flcli and gordon. I assume because you're running Linux from a USB stick you normally run Windows, so in addition to the x64 Linux build I also made an x64 Windows build. You can get the tarball from here. To use the Windows build, you'll need to install a driver for your UMDK cart, which zadig.exe (supplied) will help you with (it's mostly self-explanatory, but see here for docs). It's worth programming the FPGA with cksum.xsvf and running the test again from Windows, to see if that makes any difference to the checksum reliability. I admit I'm at a bit of a loss with the checksum failure; flcli is part of FPGALink, and has been used by hundreds of people on all sorts of different hardware for many years. The only hardware on which I've known it to fail was an ugly ratsnest of wires bridging two boards.

To debug the gordon problem, install gdb and run the debug build:

Code:

$ sudo apt-get install gdb
  :
$ wget -q https://dl.dropboxusercontent.com/u/80983693/umdkv2/tools-bin-20151220.tar.gz
$ tar zxf tools-bin-20151220.tar.gz
$ cd tools-bin-20151220
$ gdb gordon/usb/lin.x64/dbg/usb
  :
(gdb) run -v 1d50:602b -t indirect:1 -w ${HOME}/umdkv2-bin/cksum.bin:0
  :
(gdb) where
  :

So I need to see the stack-trace you get from "where". Notice that the "gordon" executable is called "usb" here; this is just an artifact of how the code gets built and I couldn't be bothered to rename it.

Post by prophet36 » Mon Dec 28, 2015 6:08 pm

Montserrat wrote:Sonic 1, has the illegal error, or address error always
The trace shows an illegal instruction exception in pretty much exactly the same place as Burbruee's.

Nemesis, can you help with this? There's a loop in Sonic 1 at 0x071FC4:

Code:

0x071FC4  nop
0x071FC6  dbra d0, 0x071FC4

I can see this executing in Montserrat's trace log, but interleaved in the sequence of 68000 fetches is a bunch of fetches from the region starting at 0x079688, about once every 63μs (±150ns). Every word-aligned location is read twice, so the reads go 0x079688, 0x079688, 0x07968A, 0x07968A, 0x07968C, 0x07968C etc, as if something is reading single bytes from an incrementing address. Is this the Z80 or PSG or something?

This continues until we get to the expected pair of reads from 0x079852:

Code:

167355419 C RD 071FC4 4E71  ; nop
167355444 C RD 071FC6 51C8  ; dbra d0,
167355469 C RD 071FC8 FFFC  ;         -4
167355514 C RD 079852 7581  ; ???
167355583 C RD 071FC4 4E71  ; nop
167355608 C RD 071FC6 51C8  ; dbra d0, ... etc
  :
167358463 C RD 071FC4 4E71  ; nop
167358488 C RD 071FC6 51C8  ; dbra d0,
167358514 C RD 071FC8 FFFC  ;         -4
167358545 C RD 079852 7581  ; ???
167358560 C RD 079852 7581  ; ??? ??? ???
167358615 C RD 071FC6 51C8  ; pre-fetch dbra d0, ...
167358673 C WB FFFDAC 1FC4  ; save PC, low word
167358699 C WB FFFDA8 2608  ; save SR
167358724 C WB FFFDAA 0007  ; save PC, high word
167358741 C RD 000010 0000  ; illegal instruction vector, high word
167358766 C RD 000012 03E6  ; illegal instruction vector, low word

Usually the reads from the 0x079688 region are far apart, but at timestamp 167358545, there are two reads from the same address, adjacent to each other. Worse still, immediately after this "phantom" read, I had expected to see the 68000 read a NOP instruction (0x4E71) from 0x071FC4, but that never happens; it jumps straight to reading 0x071FC6 instead.

I think what's happening is that the /C_OE signal is shared by two internal subsystems, the 68000 and something else (let's guess PSG for the sake of argument), using some kind of internal bus arbitration. On modern MegaDrives, this bus arbitration is smooth. But on older model consoles, when the bus changes ownership, say from PSG to 68000, there's a small chance of a glitch on /C_OE on the rising edge of the last PSG read, which the UMDK FPGA interprets as a new read. So the UMDK memory controller goes off and starts executing what it thinks is another read from the same address, meanwhile it misses the genuine 0x071FC4-read from the 68000. Meanwhile the 68000 thinks it has correctly issued a read of 0x071FC4, but the word it gets back is not a NOP opcode 0x4E71 (that we'd expect to get from that address); instead it gets the result of the "phantom" PSG read, 0x7581, resulting in an illegal instruction exception raised at 0x071FC4. The final read from 0x071FC6 is just the 68000 pre-fetch.

Is this plausible, do you think?
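Finding other occurrences of this pattern in a long trace can be automated. The sketch below assumes every bus cycle is logged in the column layout shown above (timestamp, a tag, RD or WB, six-digit address, four-digit data); it only flags candidates, since a program could legitimately read the same address twice in a row.

```python
import re

# timestamp, tag, RD/WB, 24-bit address, 16-bit data; trailing comments are ignored
TRACE_LINE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(RD|WB)\s+([0-9A-Fa-f]{6})\s+([0-9A-Fa-f]{4})"
)


def parse_trace(text):
    """Return a list of (timestamp, op, address, data) tuples."""
    events = []
    for line in text.splitlines():
        m = TRACE_LINE.match(line)
        if m:  # skips ':' elision markers and anything that is not a bus cycle
            events.append((int(m.group(1)), m.group(3),
                           int(m.group(4), 16), int(m.group(5), 16)))
    return events


def back_to_back_reads(events):
    """Pairs of adjacent reads that hit the same address."""
    return [(a, b) for a, b in zip(events, events[1:])
            if a[1] == b[1] == "RD" and a[2] == b[2]]


SAMPLE = """\
167358463 C RD 071FC4 4E71  ; nop
167358488 C RD 071FC6 51C8  ; dbra d0,
167358514 C RD 071FC8 FFFC  ;         -4
167358545 C RD 079852 7581  ; ???
167358560 C RD 079852 7581  ; ??? ??? ???
167358615 C RD 071FC6 51C8  ; pre-fetch dbra d0, ...
"""

for first, second in back_to_back_reads(parse_trace(SAMPLE)):
    print(f"{first[0]} -> {second[0]}: repeated read of 0x{first[2]:06X}")
```

Run against the excerpt above, it reports the doubled read of 0x079852 at timestamps 167358545 and 167358560.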

Post by prophet36 » Mon Dec 28, 2015 6:20 pm

Actually, I got the Sonic 1 disassembly, ran build.bat, and sure enough, there in the resulting sonic.lst file:

Code:

00079688                                        even
00079688                            
00079688                                        cnop ($8000-Size_of_SegaPCM),$8000
00079688                            SegaPCM:    incbin  "sound/dac/segapcm.bin"

So that's that mystery solved: it's the Z80 reading PCM samples to feed into the PSG.

But it would be good to get someone else who knows about MD internals to "approve" my analysis, above.
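That would also account for the pattern in the trace. Assuming the cartridge bus is 16 bits wide and the low address bit is not visible to the cart, two consecutive byte reads land on the same word address, and one byte every 63μs works out to a playback rate of roughly 15.9 kHz. A small sketch of both calculations (the function names are illustrative):

```python
def word_address(byte_address: int) -> int:
    """Word-aligned address seen on a 16-bit bus for a given byte address."""
    return byte_address & ~1


def sample_rate_hz(period_us: float) -> float:
    """Playback rate implied by one sample byte every period_us microseconds."""
    return 1_000_000.0 / period_us


seen = [word_address(a) for a in range(0x079688, 0x07968E)]
print([f"0x{a:06X}" for a in seen])
print(f"{sample_rate_hz(63.0):.0f} Hz")
```

The first line reproduces the 0x079688, 0x079688, 0x07968A, 0x07968A, ... sequence from the trace.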

Post by Montserrat » Mon Dec 28, 2015 7:00 pm

prophet36 wrote:So that's that mystery solved: it's the Z80 reading PCM samples to feed into the PSG.
Since the error appears just when the "Seeegaa" sound should start, that makes sense, but the PSG (SN76489) has no DAC capability (it can be tricked into it, but that's not the case here); it must be the FM chip (YM2612).

Post by prophet36 » Mon Dec 28, 2015 8:29 pm

Montserrat wrote:it must be the FM (YM2612).
Yup, you're right. My Z80 knowledge is limited but it looks like the "ix" register gets loaded with the YM2612 register address 0x4000, then the sample loop uses it to write each byte of sample data to YM2612 register 0x2A, which apparently is the DAC register:

Code:

Z80Driver_Start:
    di                             ; Disable interrupts. Interrupts will never be reenabled
    di                             ; for the z80, so that no code will be executed on V-Int.
    di                             ; This means that the sample loop is all the z80 does.
    ld   sp, z80_stack             ; Initialize the stack pointer (unused throughout the driver)
    ld   ix, zYM2612_A0            ; ix = Pointer to memory-mapped communication register with YM2612

    :

Play_SegaPCM:	
    ld   de, zmake68kPtr(SegaPCM)  ; de = bank-relative location of the SEGA sound
    ld   hl, SegaPCM_End-SegaPCM   ; hl = size of the SEGA sound
    ld   c, 2Ah                    ; c = Command to select DAC output register

PlaySEGAPCMLoop:
    ld   a, (de)                   ; a = next byte from SEGA PCM
    ld   (ix+0), c                 ; Select DAC output register
    ld   (ix+1), a                 ; Send current data

    ld   b, SEGA_Pitch             ; b = pitch of the SEGA sample
    djnz $                         ; Pitch loop

    inc  de                        ; Point to next byte of DAC sample
    dec  hl                        ; Decrement remaining bytes on DAC sample
    ld   a, l                      ; a = low byte of remaining bytes
    or   h                         ; Are there any bytes left?
    jp   nz, PlaySEGAPCMLoop       ; If yes, keep playing sample
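
For anyone following along, here is a sketch of what `zmake68kPtr(SegaPCM)` presumably evaluates to. It assumes the usual Mega Drive arrangement in which the Z80 sees a 32 KiB window of 68000 address space at 0x8000-0xFFFF, selected by a bank register; the macro's definition is not shown in this thread, so the formulas are assumptions.

```python
Z80_WINDOW_BASE = 0x8000   # start of the banked window in Z80 address space (assumed)
WINDOW_SIZE = 0x8000       # 32 KiB of 68000 address space per bank (assumed)


def z80_bank(addr68k: int) -> int:
    """Bank number that must be selected to reach a 68000 address."""
    return addr68k // WINDOW_SIZE


def z80_pointer(addr68k: int) -> int:
    """Z80-side address of a 68000 address once its bank is selected."""
    return Z80_WINDOW_BASE | (addr68k & (WINDOW_SIZE - 1))


sega_pcm = 0x079688  # SegaPCM address from the sonic.lst listing
print(f"bank 0x{z80_bank(sega_pcm):X}, de = 0x{z80_pointer(sega_pcm):04X}")
```

Under those assumptions the Z80 walks `de` upwards from 0x9688 while the cart sees the corresponding reads from 0x079688 onwards, which lines up with the trace.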
