New Documentation: An authoritative reference on the YM2612

Eke
Very interested
Posts: 829
Joined: Wed Feb 28, 2007 2:57 pm

Re: New Documentation: An authoritative reference on the YM2612

Post by Eke » Sun Nov 25, 2018 9:09 am

Nemesis tested the relationship between the address and data ports some years ago:

viewtopic.php?f=24&t=386&start=410
It turns out that writing to an address register stores both the written address, and the part number of the address register you wrote to. You can then write to either the data port at $A00001, or the data port at $A00003, and the write will go to the register number you wrote, within the part of the address register you wrote to. This means you can, for example, write an address to $A00000, then write the data to $A00003, and the data will in fact be written to the part 1 register block, not the part 2 register block.
I'm not sure whether he also tested this with the global registers ($2x), though.
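
The behaviour Nemesis describes can be modelled roughly as below. This is only a minimal sketch: the names `ym_bus_t` and `ym_bus_write` are hypothetical, the two flat 256-byte register blocks are a simplification, and the global registers ($2x) get no special handling because, as noted above, that case may not have been tested.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the bus interface, for illustration only. */
typedef struct {
    uint8_t latched_addr;  /* register number from the last address write */
    uint8_t latched_part;  /* 0 = part 1 ($A00000), 1 = part 2 ($A00002) */
    uint8_t regs[2][256];  /* simplified register blocks for part 1 and part 2 */
} ym_bus_t;

/* port is the low two bits of the bus address: 0 and 2 are the address
 * ports, 1 and 3 are the data ports. */
static void ym_bus_write(ym_bus_t *ym, unsigned port, uint8_t value)
{
    assert(port < 4);
    if ((port & 1u) == 0) {
        /* An address write latches both the register number and the part. */
        ym->latched_addr = value;
        ym->latched_part = (uint8_t)(port >> 1);
    } else {
        /* Either data port works: the part comes from the latch,
         * not from the data port that was written. */
        ym->regs[ym->latched_part][ym->latched_addr] = value;
    }
}
```

With this model, writing an address to port 0 ($A00000) and then data to port 3 ($A00003) lands in `regs[0]`, i.e. the part 1 block, which matches the quoted result.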

jotego
Interested
Posts: 21
Joined: Sat Jan 28, 2017 8:30 am
Location: Valencia (Spain)

Re: New Documentation: An authoritative reference on the YM2612

Post by jotego » Sun Nov 25, 2018 5:29 pm

nukeykt wrote:
Sat Nov 24, 2018 7:36 pm
One FM clock in my code is 6 master clocks. For example, in the Genesis Plus GX emulator, OPN2_Clock is called 1.27 (7.67/6) million times per second (for comparison, the MAME code works at 53.267 kHz). Like real hardware, my implementation updates registers within 12 FM ticks. Look carefully at the OPN2_DoRegWrite function.
I see, I agree with that approach. However, I have another question.
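
For reference, the two rates in the quote can be related with simple integer arithmetic. This is a sketch under assumptions: the 7670453 Hz input clock (NTSC Mega Drive) is my assumption for the "7.67" figure, and the 24 FM clocks per output sample is assumed from the 24-slot cycle discussed in this thread. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum {
    MASTER_PER_FM_CLOCK  = 6,  /* from the quote: one FM clock is 6 master clocks */
    FM_CLOCKS_PER_SAMPLE = 24  /* assumption: one output sample per 24-slot cycle */
};

/* How many times per second a per-FM-clock routine would be called. */
static uint32_t fm_clock_hz(uint32_t master_hz)
{
    assert(master_hz > 0);
    return master_hz / MASTER_PER_FM_CLOCK;
}

/* Output sample rate of a core that produces one sample per full cycle. */
static uint32_t sample_rate_hz(uint32_t master_hz)
{
    return fm_clock_hz(master_hz) / FM_CLOCKS_PER_SAMPLE;
}
```

With `master_hz = 7670453`, `fm_clock_hz` gives 1278408 (about 1.28 million calls per second) and `sample_rate_hz` gives 53267, which lines up with the 53.267 kHz figure quoted for MAME.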

Are you really implementing the pipeline? For instance, in the function OPN2_PhaseCalcIncrement you perform a lot of operations on the data of the current slot, without holding any intermediate value in a latch. Can that be real? Of course the chip has a lot of things going on in parallel that speed things up, but it still looks like a lot to be done in one clock cycle.

Then you have the result stored in pg_inc[slot], but the data couldn't actually be stored in the same "slot" it was passing through.

You use pg_inc in the OPN2_PhaseGenerate but you calculate slot as

    slot = (chip->cycles + 20) % 24;
My interpretation of this is that the previous function actually had 20 pipeline stages (20 latches) in hardware. Writing to the same slot you were using as input does not affect the accuracy of the emulation, because that position will not be read until 20 cycles later. So, in general, it looks like you are respecting the pipeline delays but just not writing a function (latch) for each pipeline stage.

Is my interpretation correct?

Another question is about your function OPN2_DoRegWrite. chip->write_fm_data is read at the beginning of the function (in the if statement) but it is modified at the end of the function. Are you representing a latch operation in this way? Because the data written at the end will not be read until the next call to the function.

nukeykt
Newbie
Posts: 7
Joined: Thu Sep 20, 2018 10:42 am

Re: New Documentation: An authoritative reference on the YM2612

Post by nukeykt » Sun Nov 25, 2018 6:24 pm

jotego wrote:
Sun Nov 25, 2018 5:29 pm
nukeykt wrote:
Sat Nov 24, 2018 7:36 pm
One FM clock in my code is 6 master clocks. For example, in the Genesis Plus GX emulator, OPN2_Clock is called 1.27 (7.67/6) million times per second (for comparison, the MAME code works at 53.267 kHz). Like real hardware, my implementation updates registers within 12 FM ticks. Look carefully at the OPN2_DoRegWrite function.
I see, I agree with that approach. However, I have another question.

Are you really implementing the pipeline? For instance, in the function OPN2_PhaseCalcIncrement you perform a lot of operations on the data of the current slot, without holding any intermediate value in a latch. Can that be real? Of course the chip has a lot of things going on in parallel that speed things up, but it still looks like a lot to be done in one clock cycle.

Then you have the result stored in pg_inc[slot], but the data couldn't actually be stored in the same "slot" it was passing through.

You use pg_inc in the OPN2_PhaseGenerate but you calculate slot as

    slot = (chip->cycles + 20) % 24;
My interpretation of this is that the previous function actually had 20 pipeline stages (20 latches) in hardware. Writing to the same slot you were using as input does not affect the accuracy of the emulation, because that position will not be read until 20 cycles later. So, in general, it looks like you are respecting the pipeline delays but just not writing a function (latch) for each pipeline stage.

Is my interpretation correct?
All calculations performed in the OPN2_PhaseCalcIncrement function actually take 4 cycles on real hardware. I've simplified it, since all the data used in this function enters the PG block at the same time. Replicating all the latches would be a big headache for a linear C emulator. There are probably some other optimizations like this in my code, but they should not affect core accuracy.
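
The index arithmetic from the quoted snippet can be checked with a small sketch. It assumes (this is not confirmed in the thread) that the producer writes at index `cycles % 24`, while the consumer reads at `(cycles + 20) % 24` as in the quoted line. The helper names are hypothetical.

```c
#include <assert.h>

enum { SLOTS = 24 };

/* Index used by the consumer in the quoted snippet. */
static unsigned reader_slot(unsigned cycles)
{
    return (cycles + 20) % SLOTS;
}

/* Assumption: the producer uses the current cycle as its index. */
static unsigned writer_slot(unsigned cycles)
{
    return cycles % SLOTS;
}

/* Number of cycles after a write at write_cycle until the consumer
 * first addresses the slot that was written. */
static unsigned cycles_until_read(unsigned write_cycle)
{
    unsigned target = writer_slot(write_cycle);
    unsigned delay;

    for (delay = 1; delay <= SLOTS; delay++) {
        if (reader_slot(write_cycle + delay) == target) {
            return delay;
        }
    }
    assert(0 && "every slot is revisited within one full cycle");
    return 0;
}
```

Under that assumption the delay works out to 4 cycles for every slot, because (c + 4 + 20) % 24 equals c % 24. A +20 read offset in a 24-slot ring therefore behaves like a 4-cycle delay rather than a 20-cycle one.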
jotego wrote:
Sun Nov 25, 2018 5:29 pm
Another question is about your function OPN2_DoRegWrite. chip->write_fm_data is read at the beginning of the function (in the if statement) but it is modified at the end of the function. Are you representing a latch operation in this way? Because the data written at the end will not be read until the next call to the function.
Yes. write_fm_data is actually a register, and it does exist on the YM3438. So it should be consistent with real hardware.
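
The "read at the top, update at the bottom" pattern being discussed can be sketched as follows. This is a generic illustration, not the actual OPN2_DoRegWrite code: `reg_demo_t`, `reg_demo_clock` and the `write_pending` field are hypothetical stand-ins for a registered flag such as write_fm_data.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  write_pending;  /* hypothetical registered flag */
    unsigned commits;        /* how many times the flag was seen set */
} reg_demo_t;

/* One clock tick. Returns 1 if a commit happened on this tick. */
static int reg_demo_clock(reg_demo_t *d, uint8_t next_pending)
{
    int committed = 0;

    assert(next_pending <= 1);

    /* Start of tick: act on the value stored during the previous tick. */
    if (d->write_pending) {
        d->commits++;
        committed = 1;
    }

    /* End of tick: store the new value. It only becomes visible on the
     * next call, which is what a clocked register does in hardware. */
    d->write_pending = next_pending;
    return committed;
}
```

Calling `reg_demo_clock(&d, 1)` on a zeroed struct returns 0, and the commit only shows up on the following call, giving the one-tick delay that a latch would introduce.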

jotego
Interested
Posts: 21
Joined: Sat Jan 28, 2017 8:30 am
Location: Valencia (Spain)

Re: New Documentation: An authoritative reference on the YM2612

Post by jotego » Mon Nov 26, 2018 12:03 pm

nukeykt wrote:
Sun Nov 25, 2018 6:24 pm
All calculations performed in the OPN2_PhaseCalcIncrement function actually take 4 cycles on real hardware. I've simplified it, since all the data used in this function enters the PG block at the same time. Replicating all the latches would be a big headache for a linear C emulator. There are probably some other optimizations like this in my code, but they should not affect core accuracy.
I have started translating your code to Verilog so we can have this level of accuracy in FPGA too. I don't know whether it will make an audible difference compared to the current JT12, but once knowledge of the real implementation is published, I think we ought to use it. I understand the original chip well because I wrote a clone, so I may be able to spot a bug in your code, if there is one. This exercise of mine may therefore also be useful to you, as you get a second pair of eyes going through your code.

If you have information about how many cycles (pipeline length) each function took, could you share it? That would help me, as in digital design these things have more implications than in software.

By the way, I see the timers do operations in cycles 1 and 2. I wonder, how do timers know the cycle in real hardware? In my implementation, I did find it useful to have a signal indicating the first stage of the pipeline. I called it zero and it goes to many places. However, your code suggests that there are signals to indicate other cycles, not only the first one. Was there another circular shift register (CSR) to indicate the current cycle in hardware? Something like a one-bit CSR feeding different circuits so Yamaha designers could have more freedom to decide when things happened?

This would be a departure from the YM2203: if the timer counted on cycle 1 as in your code, it would count at twice the speed on the YM2203, because it has half the number of operators (i.e. half the total cycles per output).

nukeykt
Newbie
Posts: 7
Joined: Thu Sep 20, 2018 10:42 am

Re: New Documentation: An authoritative reference on the YM2612

Post by nukeykt » Mon Nov 26, 2018 1:56 pm

jotego wrote:
Mon Nov 26, 2018 12:03 pm
I have started translating your code to Verilog so we can have this level of accuracy in FPGA too. I don't know whether it will make an audible difference compared to the current JT12, but once knowledge of the real implementation is published, I think we ought to use it. I understand the original chip well because I wrote a clone, so I may be able to spot a bug in your code, if there is one. This exercise of mine may therefore also be useful to you, as you get a second pair of eyes going through your code.
Nice to hear it. Good luck with it :)
jotego wrote:
Mon Nov 26, 2018 12:03 pm
If you have information about how many cycles (pipeline length) each function took, could you share it? That would help me, as in digital design these things have more implications than in software.
I don't think I can help much here. More than a year has passed since I worked on it.
jotego wrote:
Mon Nov 26, 2018 12:03 pm
By the way, I see the timers do operations in cycles 1 and 2. I wonder, how do timers know the cycle in real hardware? In my implementation, I did find it useful to have a signal indicating the first stage of the pipeline. I called it zero and it goes to many places. However, your code suggests that there are signals to indicate other cycles, not only the first one. Was there another circular shift register (CSR) to indicate the current cycle in hardware? Something like a one-bit CSR feeding different circuits so Yamaha designers could have more freedom to decide when things happened?

This would be a departure from the YM2203: if the timer counted on cycle 1 as in your code, it would count at twice the speed on the YM2203, because it has half the number of operators (i.e. half the total cycles per output).
The YM3438 actually has an internal cycle counter which outputs 24 signals, one for each cycle value. The cycle counter is used throughout the entire chip. For example, the operator unit uses it to determine the channel's operator number, which is then used for FM algorithm control.
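
A software analogue of such a counter might look like the sketch below. It assumes the 24 signals are one-hot (exactly one active per cycle value); the type and function names are hypothetical, and nothing here is taken from the actual die or from the emulator source.

```c
#include <assert.h>
#include <stdint.h>

enum { CYCLES = 24 };

typedef struct {
    uint8_t count;  /* current cycle value, 0..23 */
} cycle_counter_t;

/* Advance the counter by n FM clocks, wrapping after cycle 23. */
static void cycle_counter_advance(cycle_counter_t *c, unsigned n)
{
    c->count = (uint8_t)((c->count + n) % CYCLES);
}

/* Decode the counter into 24 one-hot signals: bit k is set on cycle k. */
static uint32_t cycle_counter_decode(const cycle_counter_t *c)
{
    assert(c->count < CYCLES);
    return (uint32_t)1 << c->count;
}

/* What a block such as a timer would sample: "is this cycle n?" */
static int cycle_is(const cycle_counter_t *c, unsigned n)
{
    assert(n < CYCLES);
    return (int)((cycle_counter_decode(c) >> n) & 1u);
}
```

A block that should act on cycles 1 and 2, like the timers mentioned earlier, would then simply test `cycle_is(&c, 1)` or `cycle_is(&c, 2)` instead of keeping its own shift register.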
