New Documentation: An authoritative reference on the YM2612

Post by Eke » Sun Nov 25, 2018 9:09 am

Nemesis tested the relation between the address and data ports some years ago:

viewtopic.php?f=24&t=386&start=410
It turns out that writing to an address register stores both the written address, and the part number of the address register you wrote to. You can then write to either the data port at $A00001, or the data port at $A00003, and the write will go to the register number you wrote, within the part of the address register you wrote to. This means you can, for example, write an address to $A00000, then write the data to $A00003, and the data will in fact be written to the part 1 register block, not the part 2 register block.
I'm not sure whether he also tested this with the global registers ($2x), though.
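
The behaviour Nemesis describes can be sketched as a small C model. This is only a toy illustration of the quoted finding, not a description of the real circuit: the `ym_ports` type and `ym_port_write` function are hypothetical names, offsets 0..3 stand for $A00000..$A00003, and nothing here says anything about how the global ($2x) registers behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the address/data port relation quoted above.
   Offsets 0..3 stand for $A00000..$A00003 on the 68K side. */
typedef struct {
    uint8_t regs[2][256]; /* [0] = part 1 block, [1] = part 2 block */
    uint8_t address;      /* last register number written */
    uint8_t part;         /* part of the address port that was written */
} ym_ports;

/* Even offsets are address ports, odd offsets are data ports. */
static void ym_port_write(ym_ports *ym, unsigned offset, uint8_t value)
{
    assert(ym != NULL && offset < 4u);
    if ((offset & 1u) == 0u) {
        /* Address write: latch both the register number and the part. */
        ym->address = value;
        ym->part = (uint8_t)((offset >> 1) & 1u);
    } else {
        /* Data write: which data port was used is ignored here;
           the part latched by the address write decides the block. */
        ym->regs[ym->part][ym->address] = value;
    }
}
```

With this model, writing $30 to offset 0 and then a value to offset 3 stores the value in `regs[0][0x30]`, i.e. in the part 1 block, which is the case given in the quote.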

Post by jotego » Sun Nov 25, 2018 5:29 pm

nukeykt wrote:
Sat Nov 24, 2018 7:36 pm
One FM clock in my code is 6 master clocks. For example, in the Genesis Plus GX emulator OPN2_Clock is called 1.27 (7.67/6) million times per second (for comparison, the MAME code works at 53.267 kHz). Like real hardware, my implementation updates registers within 12 FM ticks. Look carefully at the OPN2_DoRegWrite function.
I see, I agree with that approach. However, I have another question.

Are you really implementing the pipeline? For instance, in the function OPN2_PhaseCalcIncrement you perform a lot of operations on the data of the current slot without holding any intermediate value in a latch. Can that be real? Of course the chip has a lot of things going on in parallel that speed things up, but it still looks like a lot to be done in one clock cycle.

Then you have the result stored in pg_inc[slot], but the data couldn't actually be stored in the same "slot" it was passing through.

You use pg_inc in OPN2_PhaseGenerate, but you calculate slot as

    slot = (chip->cycles + 20) % 24;
My interpretation of this is that the previous function actually had 20 pipeline stages (20 latches) in hardware. Writing to the same slot you were using as input does not affect the accuracy of the emulation because that position will not be read until 20 cycles later. So, in general, it looks like you are respecting the pipeline delays but just not writing a function (latch) for each pipeline stage.

Is my interpretation correct?

Another question is about your function OPN2_DoRegWrite. chip->write_fm_data is read at the beginning of the function (in the if statement) but it is modified at the end of the function. Are you representing a latch operation this way? The data written at the end will not be read until the next call to the function.

Post by nukeykt » Sun Nov 25, 2018 6:24 pm

jotego wrote:
Sun Nov 25, 2018 5:29 pm
nukeykt wrote:
Sat Nov 24, 2018 7:36 pm
One FM clock in my code is 6 master clocks. For example, in the Genesis Plus GX emulator OPN2_Clock is called 1.27 (7.67/6) million times per second (for comparison, the MAME code works at 53.267 kHz). Like real hardware, my implementation updates registers within 12 FM ticks. Look carefully at the OPN2_DoRegWrite function.
I see, I agree with that approach. However, I have another question.

Are you really implementing the pipeline? For instance, in the function OPN2_PhaseCalcIncrement you perform a lot of operations on the data of the current slot without holding any intermediate value in a latch. Can that be real? Of course the chip has a lot of things going on in parallel that speed things up, but it still looks like a lot to be done in one clock cycle.

Then you have the result stored in pg_inc[slot], but the data couldn't actually be stored in the same "slot" it was passing through.

You use pg_inc in OPN2_PhaseGenerate, but you calculate slot as

    slot = (chip->cycles + 20) % 24;
My interpretation of this is that the previous function actually had 20 pipeline stages (20 latches) in hardware. Writing to the same slot you were using as input does not affect the accuracy of the emulation because that position will not be read until 20 cycles later. So, in general, it looks like you are respecting the pipeline delays but just not writing a function (latch) for each pipeline stage.

Is my interpretation correct?
All calculations performed in the OPN2_PhaseCalcIncrement function actually take 4 cycles on real hardware. I've simplified it, as all the data used in this function enters the PG block at the same time. Replicating all the latches would be a big headache for a linear C emulator. There are probably some other optimizations like this in my code that should not affect core accuracy.
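
The slot arithmetic being discussed can be checked with a few lines of C. This is a sketch under one stated assumption: that the producer (OPN2_PhaseCalcIncrement in the discussion) writes the slot indexed by the current cycle, as the question describes. The `(cycles + 20) % 24` expression is copied from the snippet quoted above; the helper names are hypothetical. In a 24-slot ring, an index 20 slots ahead is the same as one 4 slots behind, so an entry is consumed 4 cycles after it is written, which lines up with the 4 cycles mentioned in the reply.

```c
#include <assert.h>

#define SLOTS 24 /* ring size, from the "% 24" in the quoted expression */

/* ASSUMPTION: the producer writes the slot of the current cycle. */
static unsigned producer_slot(unsigned cycles)
{
    return cycles % SLOTS;
}

/* Slot read by the consumer, using the expression quoted in the thread. */
static unsigned consumer_slot(unsigned cycles)
{
    return (cycles + 20) % SLOTS;
}

/* Number of cycles after a write at `write_cycle` until the consumer
   first reads that same slot. */
static unsigned read_delay(unsigned write_cycle)
{
    unsigned target = producer_slot(write_cycle);
    unsigned d;
    for (d = 1; d <= SLOTS; d++) {
        if (consumer_slot(write_cycle + d) == target)
            return d;
    }
    assert(0 && "every slot is visited within SLOTS cycles");
    return 0;
}
```

Calling `read_delay` with any small cycle number returns 4, so under this assumption the emulator keeps the delay between the two stages without modelling each intermediate latch.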
jotego wrote:
Sun Nov 25, 2018 5:29 pm
Another question is about your function OPN2_DoRegWrite. chip->write_fm_data is read at the beginning of the function (in the if statement) but it is modified at the end of the function. Are you representing a latch operation this way? The data written at the end will not be read until the next call to the function.
Yes. write_fm_data is actually a register, and it does exist on the YM3438, so it should be consistent with real hardware.
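
The read-first, write-last pattern described here is a common way to model a clocked register in sequential C code. A minimal sketch follows; only the field name `write_fm_data` is taken from the discussion, while the `latch_demo` type and `latch_tick` function are hypothetical and are not code from the emulator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-field register model. */
typedef struct {
    uint8_t write_fm_data; /* value latched at the end of the previous tick */
} latch_demo;

/* One tick: return what the logic sees during this tick, then latch the
   new input so that it only becomes visible on the next tick. */
static uint8_t latch_tick(latch_demo *l, uint8_t next_input)
{
    uint8_t seen;
    assert(l != NULL);
    seen = l->write_fm_data;       /* read at the beginning of the tick */
    l->write_fm_data = next_input; /* modified at the end of the tick */
    return seen;
}
```

Feeding the inputs 1, 0, 0 into a zero-initialized `latch_demo` returns 0, 1, 0: each input shows up one call late, which is the one-clock delay a hardware register would introduce.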

Post by jotego » Mon Nov 26, 2018 12:03 pm

nukeykt wrote:
Sun Nov 25, 2018 6:24 pm
All calculations performed in the OPN2_PhaseCalcIncrement function actually take 4 cycles on real hardware. I've simplified it, as all the data used in this function enters the PG block at the same time. Replicating all the latches would be a big headache for a linear C emulator. There are probably some other optimizations like this in my code that should not affect core accuracy.
I have started translating your code to Verilog so we can have this level of accuracy in FPGA too. I don't know whether it will make an audible difference compared to the current JT12, but once knowledge of the real implementation is published, I think we ought to use it. I understand the original chip well because I wrote a clone, so I may be able to spot a bug in your code, if there is one. This exercise of mine may therefore also be useful to you, as you get a second pair of eyes going through your code.

If you have information about how many cycles (the pipeline length) each function takes, could you share it? That would help me, as in digital design these things have more implications than in software.

By the way, I see the timers do operations in cycles 1 and 2. I wonder, how do timers know the cycle in real hardware? In my implementation, I did find it useful to have a signal indicating the first stage of the pipeline. I called it zero and it goes to many places. However, your code suggests that there are signals to indicate other cycles, not only the first one. Was there another circular shift register (CSR) to indicate the current cycle in hardware? Something like a one-bit CSR feeding different circuits so Yamaha designers could have more freedom to decide when things happened?

This would be a departure from the YM2203: if the timer counted on cycle 1 as in your code, it would count at twice the speed on the YM2203, because that chip has half the number of operators (i.e. half the total cycles per output).

Post by nukeykt » Mon Nov 26, 2018 1:56 pm

jotego wrote:
Mon Nov 26, 2018 12:03 pm
I have started translating your code to Verilog so we can have this level of accuracy in FPGA too. I don't know whether it will make an audible difference compared to the current JT12, but once knowledge of the real implementation is published, I think we ought to use it. I understand the original chip well because I wrote a clone, so I may be able to spot a bug in your code, if there is one. This exercise of mine may therefore also be useful to you, as you get a second pair of eyes going through your code.
Nice to hear it. Good luck with it :)
jotego wrote:
Mon Nov 26, 2018 12:03 pm
If you have information about how many cycles (the pipeline length) each function takes, could you share it? That would help me, as in digital design these things have more implications than in software.
I don't think I can help much here. More than a year has passed since I worked on it.
jotego wrote:
Mon Nov 26, 2018 12:03 pm
By the way, I see the timers do operations in cycles 1 and 2. I wonder, how do timers know the cycle in real hardware? In my implementation, I did find it useful to have a signal indicating the first stage of the pipeline. I called it zero and it goes to many places. However, your code suggests that there are signals to indicate other cycles, not only the first one. Was there another circular shift register (CSR) to indicate the current cycle in hardware? Something like a one-bit CSR feeding different circuits so Yamaha designers could have more freedom to decide when things happened?

This would be a departure from the YM2203: if the timer counted on cycle 1 as in your code, it would count at twice the speed on the YM2203, because that chip has half the number of operators (i.e. half the total cycles per output).
The YM3438 actually has an internal cycle counter that outputs 24 signals, one for each cycle value. The cycle counter is used throughout the entire chip. For example, the operator unit uses it to determine the channel's operator number, which is then used for FM algorithm control.
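
A counter with one output line per cycle value can be pictured as a one-hot decode. The sketch below is illustrative only: all names are hypothetical, and the mapping from cycle number to operator index (`cycle / 6`) is an assumption chosen for the example, not something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLES 24

/* One-hot decode: exactly one of the 24 lines is high per cycle value. */
static uint32_t cycle_lines(unsigned cycle)
{
    assert(cycle < CYCLES);
    return (uint32_t)1 << cycle;
}

/* Advance the counter, wrapping after the last cycle. */
static unsigned cycle_next(unsigned cycle)
{
    return (cycle + 1u) % CYCLES;
}

/* A block that must act on one particular cycle just tests its line. */
static int line_active(uint32_t lines, unsigned which)
{
    assert(which < CYCLES);
    return (int)((lines >> which) & 1u);
}

/* ASSUMPTION (illustration only): the six channels are interleaved, so
   the operator index advances once every 6 cycles. */
static unsigned operator_of(unsigned cycle)
{
    assert(cycle < CYCLES);
    return cycle / 6u;
}
```

In this picture, a timer that works on cycles 1 and 2 would simply be wired to lines 1 and 2, and any other block can pick whichever line suits it.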

Post by Stef » Tue Jan 17, 2023 1:14 am

I finally took some time to run tests on the minimum safe write timings for the YM2612 / YM3438 chip.
I ran a lot of tests. I still couldn't cover every case and every register, but hopefully it is enough to confirm (or not) the data we have from official documents.
I ran all my tests using the 68K CPU (I initially tried with the Z80, but honestly it was too painful), and I assumed there is no extra delay cycle when writing to the YM2612 from it (to be honest I had no idea, but it definitely seems that there is none).
I tested on both the YM2612 (MD1 VA0) and the ASIC YM3438 (Nomad and MD3 clone). So far the write timing seems roughly identical between the two (contrary to my initial belief), but I still need to run some tests with the key register on the YM2612 to confirm that.
Edit: I have just completed my tests on the YM2612, and I can confirm that the timings are identical.

Here are the timings given in the official docs:
- 17 YM cycles (8 Z80 cycles) between writing the address and data
- no wait between writes to addresses $21-$2F
- 83 YM cycles (39 Z80 cycles) between writes to addresses $30-$9E
- 47 YM cycles (22 Z80 cycles) between writes to addresses $A0-$B6
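
The Z80 figures in this list follow from the clock dividers: the YM2612 input clock (like the 68K) is the console master clock divided by 7, and the Z80 clock is the master clock divided by 15. A small sketch of the conversion, with a hypothetical helper name and rounding up to whole Z80 cycles as an assumption about how the documented figures were derived:

```c
#include <assert.h>
#include <limits.h>

/* Dividers from the console master clock. */
#define YM_DIV  7u  /* YM2612 input clock (same as the 68K) */
#define Z80_DIV 15u /* Z80 clock */

/* Smallest whole number of Z80 cycles that covers `ym_cycles` YM cycles. */
static unsigned ym_to_z80_cycles(unsigned ym_cycles)
{
    unsigned master;
    assert(ym_cycles <= (UINT_MAX - Z80_DIV) / YM_DIV); /* no overflow */
    master = ym_cycles * YM_DIV;                /* length in master clocks */
    return (master + Z80_DIV - 1u) / Z80_DIV;   /* round up */
}
```

With this helper, 17, 83 and 47 YM cycles convert to 8, 39 and 22 Z80 cycles, matching the values in the list.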

The good news is that I can confirm almost all of these numbers :)
Still, there are some minor but important differences in some parts:
- The documentation says 17 cycles between the address and data writes.
That really seems pessimistic: I couldn't trigger any problems using 12 cycles (5-6 Z80 cycles).
With 10 cycles or fewer, though, you quickly get missed or corrupted writes.
This is interesting to know, as it means we can chain 2 fast Z80 writes (7 cycles + 7 cycles) to write a register.
- The documentation says there is no wait for any of the $2x registers, and so far that is true except for the $28 register (key on/off)!
As I thought, this register really needs some delay between writes; otherwise you get unwanted results such as completely missed writes or even an altered envelope phase (!?!) when you chain key operations too quickly on the same channel (the note plays, but quieter than expected).
The minimum delay for safe operation is about 112-116 cycles (52-54 Z80 cycles), which is quite a lot (but I expected even more here). That is more than the 83 cycles required for writing the $30-$9E registers.
- The 83 cycles / 47 cycles for the $30-$9E / $A0-$B6 registers seem to match what I measured perfectly.
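
For a sound driver, these measurements could be folded into a small lookup. The sketch below uses only the figures reported in this post; the function and constant names are hypothetical. Two choices are assumptions: the $28 delay is applied to every key write (the post only reports problems when chaining key operations on the same channel), and registers outside the listed ranges fall back to the largest measured value because they were not covered.

```c
#include <assert.h>
#include <stdint.h>

enum {
    YM_GAP_ADDR_TO_DATA = 12,  /* measured safe address -> data gap */
    YM_GAP_KEY          = 116, /* $28: upper end of the 112-116 range */
    YM_GAP_30_9E        = 83,  /* registers $30-$9E */
    YM_GAP_A0_B6        = 47   /* registers $A0-$B6 */
};

/* Minimum number of YM cycles to wait after a data write to `reg`
   before the next write, per the measurements above. */
static unsigned ym_gap_after_write(uint8_t reg)
{
    if (reg == 0x28)                return YM_GAP_KEY;
    if (reg >= 0x21 && reg <= 0x2F) return 0;
    if (reg >= 0x30 && reg <= 0x9E) return YM_GAP_30_9E;
    if (reg >= 0xA0 && reg <= 0xB6) return YM_GAP_A0_B6;
    /* ASSUMPTION: unmeasured registers get the most conservative value. */
    return YM_GAP_KEY;
}
```

A driver would wait `YM_GAP_ADDR_TO_DATA` cycles between the address and data writes, then `ym_gap_after_write(reg)` cycles before touching the chip again.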

There are some quirks of the ASIC YM3438 to be aware of, as it behaves a bit differently when reading the status, and the busy flag in particular:
- unlike on the YM2612, the status can only be read on port #0.
- the busy bit isn't set to 1 immediately when writing a register. I don't know exactly how long it takes to be set, but waiting 20 YM cycles after the register write is enough to avoid any problem with that.
Last edited by Stef on Tue Jan 17, 2023 9:18 am, edited 5 times in total.

Post by TmEE co.(TM) » Tue Jan 17, 2023 1:42 am

I remember needing a larger than normal delay on the key on/off register when I made my sound driver many years ago. The envelope thing you mention happens because the 4 bits react differently: sometimes one or more won't do the same as the others, and this can dramatically alter the sound. I also remember getting away with a slightly shorter address delay, but in the end it got wrapped into using the convenient but slow IX and IY instructions lol
Mida sa loed ? Nagunii aru ei saa ;)
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen

Post by Stef » Tue Jan 17, 2023 9:27 am

Yeah, the address / data write delay seems quite short, but still not short enough to use a Z80 16-bit write such as
LD (YMPORT), HL (16 cycles)
or
LD (YMPORT), DE (20 cycles)

I really wanted to use it because, even if the high byte is written first (at addr + 1), that's not much of an issue, as both address ports share the same register (it doesn't make any difference writing the address to $4001 or $4003).
Unfortunately it's too fast, as you have only 3 Z80 cycles between the address and data writes.
At best you can do the following sequence:
LD (BC), A // write address (7 cycles)
LD (HL), E // write value (7 cycles)

but that's unlikely to happen, as you need to tie up / initialize both the BC and HL registers for it.
In the end I often use something like:
LD (HL), D // write address (7 cycles)
INC L // (4 cycles)
LD (HL), E // write value (7 cycles)
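
These sequences can be checked against the 12 YM cycle minimum with a little arithmetic. The sketch below assumes that the write lands at the same point inside each store instruction, so the gap between two writes equals the cycle counts of everything from the end of the first store up to and including the second one. The helper names are hypothetical, and the conversion uses the master/15 (Z80) and master/7 (YM2612) clock dividers.

```c
#include <assert.h>
#include <limits.h>

/* Convert a gap in Z80 cycles to YM cycles, rounding down so that the
   estimate stays on the pessimistic side. */
static unsigned z80_to_ym_cycles(unsigned z80_cycles)
{
    assert(z80_cycles <= UINT_MAX / 15u); /* no overflow */
    return (z80_cycles * 15u) / 7u;
}

/* 1 if a gap of `z80_cycles` between the address write and the data
   write meets the 12 YM cycle minimum measured earlier in the thread. */
static int gap_is_safe(unsigned z80_cycles)
{
    return z80_to_ym_cycles(z80_cycles) >= 12u;
}
```

Under that assumption, the 3-cycle gap of a 16-bit store is about 6 YM cycles (too short), back-to-back 7-cycle stores give 15 YM cycles, and the `INC L` variant (4 + 7 = 11 Z80 cycles) gives about 23 YM cycles, so both 8-bit sequences clear the minimum.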

About the key register: indeed, the envelope alteration with too short a delay may be caused by only some of the operators having their key ON/OFF state properly updated. That makes some sense after all.

Post by TmEE co.(TM) » Tue Jan 17, 2023 12:01 pm

One thing to note is that the write itself happens at the end of the instruction, so the delay really starts then, and what happens next depends on what comes after. In the end I gave up trying to optimize access speeds and keep YMPORT in IX/IY permanently, since I don't use them for anything else due to their slowness.

Post by HardWareMan » Wed Jun 21, 2023 4:26 pm

As I already said here, we opened the MD decapping project. One of its branches was the YM2612/YM3438 FM chip. Now a new project dedicated to the FM chip has started, and an almost die-perfect Verilog model is available here. Testbench:
Image
Resource usage on DE10-nano:
Image
Some WIP recordings. YM only (no PSG).
https://soundcloud.com/user-656259515/test11
https://soundcloud.com/user-656259515/test14
https://soundcloud.com/user-656259515/test13

Post by jotego » Thu Jun 22, 2023 8:50 pm

Congratulations on such a feat!

I’ll be comparing your implementation to mine for fine details. Thanks for sharing it.
