UMDK manufacturing, part 2: Software


Post by Nemesis » Mon Dec 28, 2015 10:31 pm

I may have located the problem. I spent a fair bit of time inspecting the fine solder joints on the leads of the ICs, but what I didn't remember to inspect was the bigger, simpler joints on the SMD caps. I've spotted some very suspect looking solder work on C3 and C19, which could be the problem here. Once I can unearth my soldering iron, I'll fix them up and see if that makes the difference.

I tried out the speed grade 3 chsum.xsvf file you sent through (on Windows this time, since my USB Linux install self-destructed last night), with no change in results. I'll try again once I've fixed the identified soldering issues though.

On another note, thanks for that debug build of gordon; I've analysed the crash. The issue centers around the findChip function in flash_chips.cpp. On my device right now, the vendorID and deviceID are both being calculated as 0. The loop over the flashChips array therefore runs to the end, then gets an apparent match on the null entry which is meant to terminate the list. The if statement at line 332 then passes, and the code attempts to call the selectorFunc of the matched chip, which is null, hence we get a read attempt from address 0 and the program goes boom.

I've also spotted some entries in the flashChips array which have NULL for their selector functions, which would cause the same problem if they were ever detected. Were these supposed to be nullSelector? At any rate, the if statement at line 332 should be changed from this:

Code:

if ( thisChip->vendorID == vendorID && thisChip->deviceID == deviceID )
to this:

Code:

if ( thisChip->deviceName && thisChip->vendorID == vendorID && thisChip->deviceID == deviceID )
which will prevent the terminating entry from being inspected any further, and cause it to pass through to the else clause in all cases.
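
The effect of the guard can be modelled outside gordon. This is a minimal Python sketch, not gordon's actual C++: the `FlashChip` class, the table contents and the 0x2013 device ID are made up for illustration, and only the field names and the all-zero terminating entry follow the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class FlashChip:
    vendor_id: int
    device_id: int
    device_name: Optional[str]                   # None only on the terminator
    selector_func: Optional[Callable[[], None]]  # None models a NULL pointer


def null_selector() -> None:
    """Selector that does nothing (stand-in for nullSelector)."""


# Hypothetical table: one made-up chip plus the all-zero terminating entry.
FLASH_CHIPS: Tuple[FlashChip, ...] = (
    FlashChip(0x20, 0x2013, "example-chip", null_selector),
    FlashChip(0x00, 0x0000, None, None),
)


def find_chip_unguarded(vendor_id: int, device_id: int) -> Optional[FlashChip]:
    """Original comparison: IDs of 0/0 match the terminating entry."""
    for chip in FLASH_CHIPS:
        if chip.vendor_id == vendor_id and chip.device_id == device_id:
            return chip
    return None


def find_chip_guarded(vendor_id: int, device_id: int) -> Optional[FlashChip]:
    """Patched comparison: an entry without a name can never match."""
    for chip in FLASH_CHIPS:
        if (chip.device_name is not None
                and chip.vendor_id == vendor_id
                and chip.device_id == device_id):
            return chip
    return None


if __name__ == "__main__":
    bad = find_chip_unguarded(0, 0)
    print(bad.selector_func)        # None: calling it is the crash
    print(find_chip_guarded(0, 0))  # None: the lookup fails cleanly instead
```

With IDs of 0/0 the unguarded lookup hands back the terminator, whose selector is missing, while the guarded lookup reports "not found". The guard does nothing for real table entries that carry a NULL selector, so those would still need nullSelector or a separate check.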

Post by Montserrat » Mon Dec 28, 2015 11:30 pm

prophet36 wrote:
Montserrat wrote:it must be the FM (YM2612).
Yup, you're right. My Z80 knowledge is limited but it looks like the "ix" register gets loaded with the YM2612 register address 0x4000, then the sample loop uses it to write each byte of sample data to YM2612 register 0x2A, which apparently is the DAC register:
Yes, the DAC uses two registers: 0x2A is the data register and 0x2B is the control register, which determines whether channel 6 behaves as a DAC or as FM.

There's a third register that controls stereo panning, but it doesn't matter here.
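
As a rough illustration of that register usage, here is a minimal Python sketch that builds the list of register writes for DAC playback. It assumes bit 7 of 0x2B is the DAC-enable bit; `dac_write_sequence` is a made-up helper, not code from Sonic 1 or UMDK.

```python
from typing import Iterable, List, Tuple

REG_DAC_DATA = 0x2A     # 8-bit sample value
REG_DAC_CONTROL = 0x2B  # bit 7 set: channel 6 outputs the DAC instead of FM
DAC_ENABLE = 0x80       # assumed position of the enable bit


def dac_write_sequence(samples: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (register, value) pairs: enable the DAC, then stream samples."""
    writes = [(REG_DAC_CONTROL, DAC_ENABLE)]
    for sample in samples:
        if not 0 <= sample <= 0xFF:
            raise ValueError(f"sample out of 8-bit range: {sample}")
        writes.append((REG_DAC_DATA, sample))
    return writes


if __name__ == "__main__":
    for reg, value in dac_write_sequence([0x80, 0x9A, 0x7F]):
        print(f"reg 0x{reg:02X} <- 0x{value:02X}")
```

The sample loop in the game does the second half of this: one write to 0x2A per byte of sample data.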

Code:

cnop ($8000-Size_of_SegaPCM),$8000
Also, 0x8000 seems to be the start address that the Z80 uses for bank switching. That is probably needed because that PCM sample was famous for being huge (1/8 of the total ROM size).

So maybe the problem is something related. I don't know if this helps.

Post by prophet36 » Mon Dec 28, 2015 11:41 pm

Nemesis wrote:very suspect looking solder work on C3 and C19
Yes, soldering the ICs is surprisingly easy, but soldering all those capacitors is just drudgery.
Nemesis wrote:on Windows this time since my USB linux install self-destructed last night
Do you want to be the Windows guinea-pig? I can build everything for Windows x64 and you can try it. I still have not found a good front-end to GDB on Windows, so having you kick the tyres would be useful. It's up to you: if you prefer the more tried & tested Linux route, you can go that way.
Nemesis wrote:the if statement at line 332 should be changed
Great, thanks for the patch!
Nemesis wrote:on my device right now...the vendorID and deviceID are being calculated as 0
Notwithstanding the bug you fixed, I'd quite like to understand why that is happening in the first place. One thing you can do is ask the flash chip for its JEDEC identifier using flcli (sorry it's a little cryptic...I can explain if you're interested):

Code:

$ flcli -v 1d50:602b -p J:A7A0A3A1:${HOME}/umdkv2-bin/spi-talk.xsvf
Attempting to open connection to FPGALink device 1d50:602b...
Connected to FPGALink device 1d50:602b (firmwareID: 0xFFFF, firmwareVersion: 0x20140311)
Programming device...
$ flcli -v 1d50:602b -a 'w1 07;w0 9F;w1 05;w0 FFFFFF;w1 00;r0 3'
Attempting to open connection to FPGALink device 1d50:602b...
Connected to FPGALink device 1d50:602b (firmwareID: 0xFFFF, firmwareVersion: 0x20140311)
Executing CommFPGA actions on FPGALink device 1d50:602b...
     00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0000 20 20 13                                          .
$
Note the three JEDEC ident bytes 20 20 13. What do you get?
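
For anyone wanting to sanity-check the three bytes, here is a minimal Python sketch. It assumes the common serial-flash convention that the third byte is log2 of the capacity in bytes, which holds for many parts but is not guaranteed for every vendor; the helper names are made up for illustration.

```python
def classify_jedec_id(raw: bytes) -> str:
    """Classify a 3-byte JEDEC ID read as plausible or as a dead bus."""
    if len(raw) != 3:
        raise ValueError("expected exactly three ID bytes")
    if raw == b"\x00\x00\x00":
        return "no response (all zeros)"
    if raw == b"\xff\xff\xff":
        return "no response (all ones)"
    return "plausible"


def capacity_from_code(code: int) -> int:
    """Capacity in bytes, ASSUMING the byte encodes log2(size)."""
    return 1 << code


if __name__ == "__main__":
    ident = bytes.fromhex("20 20 13")
    print(classify_jedec_id(ident))                      # plausible
    print(capacity_from_code(ident[2]) // 1024, "KiB")   # 512 KiB
```

An all-zeros or all-ones read usually suggests the chip isn't answering at all, rather than that a different chip is fitted.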

BTW, can you take a look at my above post analyzing Montserrat's Sonic 1 crash?

Post by Nemesis » Tue Dec 29, 2015 12:47 am

prophet36 wrote:Do you want to be the Windows guinea-pig? I can build everything for Windows x64 and you can try it. I still have not found a good front-end to GDB on Windows, so having you kick the tyres would be useful. It's up to you: if you prefer the more tried & tested Linux route, you can go that way.
Yep, very happy to do that. Linux is a strange universe for me; I have to google the most basic things (e.g. I had to google how to open a shell).
prophet36 wrote:Note the three JEDEC ident bytes 20 20 13. What do you get?
From Windows again, here's what I get:

Code:

D:\Archives\Drivers\UMDK\Tools\umdkv2-bin>"D:\Archives\Drivers\UMDK\tools-bin-20151220.tar\tools-bin-20151220\tools-bin-20151220\flcli\msvc.x64\rel\flcli.exe" -v 1d50:602b -a "w1 07;w0 9F;w1 05;w0 FFFFFF;w1 00;r0 3"
Attempting to open connection to FPGALink device 1d50:602b...
Connected to FPGALink device 1d50:602b (firmwareID: 0xFFFF, firmwareVersion: 0x20140311)
Executing CommFPGA actions on FPGALink device 1d50:602b...
     00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
0000 00 00 00                                        ...
prophet36 wrote:BTW, can you take a look at my above post analyzing Montserrat's Sonic 1 crash?
There are only two other devices in the Mega Drive that can drive the M68K bus: the VDP (during a DMA write), and the Z80, which does it indirectly through the bus arbiter. The Z80 never formally requests and releases the M68K bus. Rather, when the Z80 accesses the banked memory region at 0x8000-0xFFFF in its memory space, the bus arbiter requests the bus and, once it gets ownership, performs the read/write on behalf of the Z80, forwards the result onto the Z80 bus, and releases the M68K bus when it's done. This can occur at potentially any time during system operation. The exact timing of this process is currently undocumented as far as I'm aware, and given the number of revisions that device went through, I wouldn't be surprised to find differences in behaviour between models.

The VDP will request the bus for DMA in response to fairly recent access from the M68K (and potentially even the Z80 through banked access), and will release it when done. If the Z80 attempts to access the banked memory area while a VDP DMA operation is executing, the bus arbiter will wait until the VDP releases the bus before it requests it and forwards the Z80 access attempt, effectively keeping the Z80 suspended until the bus becomes available.
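
To make the banked-window mapping concrete, here is a minimal Python sketch of the address translation as I understand it from commonly available Mega Drive documentation: a 9-bit bank number supplies M68K address bits 15-23, and the low 15 bits come from the Z80 address. The function name is made up, and this models only the address arithmetic, not the arbitration timing discussed above.

```python
BANK_WINDOW_START = 0x8000
BANK_WINDOW_END = 0xFFFF
BANK_BITS = 9  # assumed width of the bank register


def z80_to_m68k_address(z80_addr: int, bank: int) -> int:
    """Translate a Z80 address inside the banked window to an M68K address."""
    if not BANK_WINDOW_START <= z80_addr <= BANK_WINDOW_END:
        raise ValueError(f"0x{z80_addr:04X} is outside the banked window")
    if not 0 <= bank < (1 << BANK_BITS):
        raise ValueError(f"bank {bank} does not fit in {BANK_BITS} bits")
    # The bank selects a 32 KB slice; the low 15 bits are the offset within it.
    return (bank << 15) | (z80_addr & 0x7FFF)


if __name__ == "__main__":
    print(hex(z80_to_m68k_address(0x8000, 1)))      # 0x8000
    print(hex(z80_to_m68k_address(0xFFFF, 0x1FF)))  # 0xffffff
```

Every Z80 access that goes through this translation is one the bus arbiter performs on the M68K bus, so it is also a point where ownership of the bus changes hands.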

From the cartridge port, you may have trouble detecting all of this: the BR, BG, and BGACK lines aren't exposed over any external ports. It's certainly plausible for there to be a glitch on one of the signal lines during "hand-over" of the bus to/from the Z80. It might actually be 68000 related though. When you say the "C_OE" line, which line exactly are you referring to? Is that coming from the "AS" line on the 68000? Because if it is, according to the M68000 User's Manual, under the "3-Wire Bus Arbitration Timing Diagram" (section 5-13), it appears the AS line is left floating for a few cycles during bus hand-over. DTACK, however, is valid the entire time, and that line is available on the cartridge port. Based on my reading of the timing diagrams, I think you should be able to use DTACK to ignore AS when it's not valid during bus arbitration. Does that look possible to you? Or, as a less extreme route (my knowledge of electronics is unfortunately lacking here), maybe you can safely "pull up" the AS signal yourself if it's floating, without affecting the rest of the system?

Post by Montserrat » Tue Dec 29, 2015 4:30 pm

I've opened all my consoles to identify the revisions. It's kind of difficult, since some components depend not only on the revision but also on the country of manufacture, so it's a bit chaotic.

I don't know if it's useful, but I got some data on the Z80 memory bank components; they seem to have different latencies:

This is the Asian PAL, where UMDK works perfectly (some weird glitches, but I'm sure they're due to old capacitors, nothing related to UMDK).

Code:

ASIAN PAL-1 1601-11 VA1

Z80 memory: NEC D4168C-15-SG (8Kx8 XRAM, 150 ns)
This is the non-modded PAL-G we are currently testing; it gets freezes, resets, and the illegal errors.

Code:

EURO PAL-G 1601-18 VA6

Z80 memory: Sanyo LC3664RL (8Kx8 SRAM, 120 ns)

And finally, this is the modded PAL-I: any game loaded via SD-card or terminal gives a black screen, and the SOR series gives a red screen.

Code:

EURO PAL-i 1601-05 VA6

Z80 memory: MB8464A-10 (8Kx8 SRAM, 100 ns)
Other revisions have other components with different latencies. I don't know, but maybe it's some sort of clue.

Post by prophet36 » Tue Dec 29, 2015 6:43 pm

Nemesis wrote:Yep, very happy to [be the Windows guinea-pig]
OK, well if you download the Windows build infrastructure, unzip it to C:\makestuff and then run C:\makestuff\setup.exe, it will create a desktop shortcut for you which will give you a Linux-like command prompt. Once I've uploaded the full set of Windows .exe and .dll files, that prompt *should* accept most of the commands in the Getting Started wiki (e.g. all the "dd" commands). When you run setup.exe you can select an MSVC compiler and Python installation if you wish, or you can just ignore them and create a "vanilla" desktop shortcut.
Nemesis wrote:From Windows again...00 00 00
Hmmm. I would check the connections between the flash chip and the FPGA. It's pointless trying to run gordon until you have a sane JEDEC ID (hopefully 20 20 13).
Nemesis wrote:It's certainly plausible for there to be a glitch on one of the signal lines during "hand-over" of the bus to/from the Z80
Yes, fundamentally I think this is the problem. The 68000 and the bus arbiter must both drive /C_OE at different times (/C_OE is what I've always called it; it goes to the /OE pin of the ROM chips in the game carts), which implies there must be short periods during which nobody is driving it, making it susceptible to noise.

It should be straightforward to fix, but I just need time to think about it. Initially I thought I could just reject duplicate reads from the same address, but there's at least one genuine scenario where that does actually happen: when switching the 68000 from supervisor to user mode, the prefetch pipeline is purged and re-read, because in general user code and supervisor code occupy separate address-spaces.
Montserrat wrote:it's kind of difficult, since some components depend not only on the revision but also on the country of manufacture, so it's a bit chaotic
I think I've tracked down the problem now, I just need some time to fix it. Meanwhile I filed it as a bug here.

Hey, quick question: on your modded PAL-I machine, does the menu program itself ever crash? Does it ever fail the basic signal test?

Post by Montserrat » Tue Dec 29, 2015 9:30 pm

prophet36 wrote:Hey, quick question: on your modded PAL-I machine, does the menu program itself ever crash? Does it ever fail the basic signal test?
About the PAL-I machine:

I just did the signal test, and it's OK:

d5a670db486de8597e214c6849fdfacb

The menu works fine, but any ROM I try to load results in a black screen, with both the SD-card and terminal methods.

Post by prophet36 » Tue Dec 29, 2015 10:02 pm

Montserrat wrote:About the PAL-I machine...any ROM I try to load results in a black screen
OK can you get a Sonic 1 trace? I suspect it's the same problem as the PAL-G machine, it just happens earlier.

Post by Montserrat » Tue Dec 29, 2015 10:51 pm

prophet36 wrote:
Montserrat wrote:About the PAL-I machine...any ROM I try to load results in a black screen
OK can you get a Sonic 1 trace? I suspect it's the same problem as the PAL-G machine, it just happens earlier.
Sure! Here it is:

https://mega.nz/#!zUEhiBrQ!xmt57yxexEGg ... q0SgrnrBiU

Post by mikejmoffitt » Wed Dec 30, 2015 12:29 am

Argh! My package containing most of my belongings is still trapped in the post. On the bright side, another package containing my MD1 with no TMSS has arrived, so once my cartridge is here I can finally run tests on the no-TMSS unit to see what the results are.

Post by prophet36 » Wed Dec 30, 2015 3:02 pm

mikejmoffitt wrote:Argh! My package containing most of my belongings is still trapped in the post. On the bright side, another package containing my MD1 with no TMSS has arrived, so once my cartridge is here I can finally run tests on the no-TMSS unit to see what the results are.
I wonder what happened to Grind and mswan? Weren't they the other two people who bought in Montserrat's batch?

Post by Montserrat » Wed Dec 30, 2015 3:11 pm

prophet36 wrote: I wonder what happened to Grind and mswan? Weren't they the other two people who bought in Montserrat's batch?
Mswan is a friend of moffitt, I think, and I believe Grind is moving now too.

Have you checked on the Sonic trace?

Post by prophet36 » Wed Dec 30, 2015 4:18 pm

Montserrat wrote:Have you checked on the Sonic trace?
Yes; it's a complete mystery. The 68000 just seems to stop executing in the middle of an innocuous bit of code:

Code:

0x00148A  move.w d6, d7
0x00148C  subq.w #8, d7
0x00148E  move.w d5, d1
0x001490  lsr.w d7, d1
0x001492  cmpi.b #-4, d1
0x001496  bcc.s 0x14D6
0x001498  andi.w #255, d1 <---
0x00149C  add.w d1, d1
The marked instruction is the last one to be fetched. There are no nearby DMA or Z80 cycles that could explain it, and no duplicate fetches you'd expect from noise on the /C_OE signal. The most likely cause of the hang is that the actual instructions seen by the 68000 somehow differ from those in the trace, causing it to go off and try to read from somewhere with no DTACK generation. Unfortunately the trace log will only tell us what the FPGA "said", not what the 68000 "heard", so there's no way to know for sure. My first guess would be an unreliable connection, but that possibility is eliminated by the fact that the menu program works, and the basic signal test passes.
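
The duplicate-fetch check itself is simple to express. A minimal sketch in Python, assuming the trace has already been reduced to a plain list of fetch addresses (the real trace format isn't shown here, and `find_duplicate_fetches` is a made-up helper):

```python
from typing import List, Sequence, Tuple


def find_duplicate_fetches(addresses: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (index, address) for each fetch that repeats the one before it."""
    return [
        (i, addr)
        for i, (prev, addr) in enumerate(zip(addresses, addresses[1:]), start=1)
        if prev == addr
    ]


if __name__ == "__main__":
    # Fetch addresses from the listing above: no back-to-back repeats.
    clean = [0x148A, 0x148C, 0x148E, 0x1490, 0x1492, 0x1496, 0x1498]
    print(find_duplicate_fetches(clean))  # []

    # Hypothetical noisy trace: 0x148C is fetched twice in a row.
    noisy = [0x148A, 0x148C, 0x148C, 0x148E]
    for index, addr in find_duplicate_fetches(noisy):
        print(f"duplicate fetch at index {index}: 0x{addr:06X}")
```

Any hit still needs inspecting by hand, because genuine back-to-back reads of the same address do happen, for example when the 68000 switches from supervisor to user mode.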

Can you try grabbing four or five similar traces, to see if it fails in different places?

Also a trace from the game that gives a red screen? Streets of Rage was it?

This PAL-I machine is modded? What is the mod? Where does the mod connect to the circuit board?

Post by Grind » Wed Dec 30, 2015 5:26 pm

prophet36 wrote:I wonder what happened to Grind and mswan? Weren't they the other two people who bought in Montserrat's batch?
Still watching the thread, but have not had time to try the newer firmware/tests just yet. Worst case I will still have a solid chunk of free time this weekend.

Post by prophet36 » Wed Dec 30, 2015 6:04 pm

Grind wrote:Still watching the thread, but have not had time to try the newer firmware/tests just yet. Worst case I will still have a solid chunk of free time this weekend.
Great! I was just worried you'd gotten disheartened by all the problems we've been finding.
