New Documentation: An authoritative reference on the YM2612

AamirM · Post by **AamirM** » Thu Feb 26, 2009 1:55 pm

Damn, I've got no free time to read all that. But thanks to everyone who is contributing to these new discoveries.

Eke · Post by **Eke** » Thu Feb 26, 2009 5:30 pm

OK, I've finished the new SSG-EG implementation in the MAME core (as well as some other things you mentioned, like correct CSM mode).

Seems fine so far with the test programs...

Nemesis, if you want to have a look and double-check it (I tried to optimize it a little and avoid unnecessary recalculation as much as I could), please check here for fm.c... any advice/corrections appreciated.

Nemesis · Post by **Nemesis** » Thu Feb 26, 2009 10:41 pm

Snake wrote:[edit]Here we go. As you can see it looks like a sine wave at the same frequency as the rest of the wave. I'm not sure your data explains this, or the volume level, but if you can explain it, please do - I've been stupidly busy and haven't had much sleep these last few days, so my brain may just be misfiring

Yep, I can explain what's happening. The recording you've taken is at 44100Hz. The native output of the YM2612 in the Mega Drive is around 52847Hz. With the output being inverted each sample, that creates a wave with a frequency of around 26423.5Hz. Since you need at least two samples per cycle to record a wave at that frequency, the sampling rate you're recording at simply isn't high enough to capture the inversion. The samples are averaging together, which is why you're being left with a regular sine wave at mid-level. You'll also note the strange distortion on the sine waves you posted as the phase of the output causes the output to approach 0. That's because the difference between the normal and inverted output samples is being reduced by the attenuation from the phase generator, so your sound card is able to more accurately observe the true slope of the carrier wave as the output approaches 0.

If you check the massive sampled recording I made on the hardware (which is recorded at 96KHz), you'll see the inversion occurring. Measure the frequency of the generated wave (spectral view is good for that) and you'll see that it matches the expected output frequency caused by the inversion state toggling each sample. You'll also note in the recording I took that there is still some sample averaging taking place. The only way to get a true, clean sampling of the output with such a high frequency wave would be to run at a multiple of the native output sample rate from the YM2612. Running at 192KHz doesn't make things clearer, since the multiplexed output from the YM2612 becomes visible at that sample rate.
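As a rough sketch of the arithmetic above: a signal that flips polarity on every sample completes one cycle per two samples, so its frequency is half the output rate, and a recording can only represent it if that frequency is below half the capture rate. The function names here are made up for illustration, and the 52847Hz figure is simply the one quoted in this post.

```c
#include <stdbool.h>

/* Frequency of the component produced when the output polarity flips
   on every sample: one full cycle takes two output samples. */
double toggle_frequency_hz(double output_rate_hz)
{
    return output_rate_hz / 2.0;
}

/* Nyquist check: a recording made at capture_rate_hz can only represent
   components strictly below half the capture rate. */
bool can_capture(double capture_rate_hz, double signal_hz)
{
    return signal_hz < capture_rate_hz / 2.0;
}
```

With an output rate of 52847Hz the toggle sits at 26423.5Hz, which fails the check at 44100Hz and passes it at 96000Hz.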

Eke wrote:Nemesis, if you want to have a look and double-check it (I tried to optimize it a little and avoid unnecessary recalculation as much as I could), please check here for fm.c... any advice/corrections appreciated.

I can see one thing that might be a problem, if I'm following your code correctly. When SSG-EG is active and the envelope is in the release phase, the attenuation level is still forced directly to 0x3FF when the release phase reaches 0x200. It looks like you're excluding all SSG-EG update steps when release phase is active, which would also exclude this behaviour.

And this is nitpicking, but there are also some old comments under the heading "SSG-EG envelope shapes" which should be looked at. Some of the info under this comment block will no longer match what the code actually does, eg, the bit about a restart of the phase generator.

Snake · Post by **Snake** » Thu Feb 26, 2009 11:19 pm

Nemesis wrote:The samples are averaging together, which is why you're being left with a regular sine wave at mid-level.

I did consider that, but since the sample rate is not divisible by the hardware output rate, I would have expected to see some clear distortion rather than an almost perfect wave - I can see this in other samples. Also - granted, inverted 0x200 will give you 0x000, but every value above 0x200 will also give you a value above 0x200 - I would have expected something much quieter than mid level if the two were just being averaged.

But as I said, it's not clear from this sample alone and I'd have to do more tests.

I do know that a lot of the info you posted is correct, because I confirmed some of it myself a long time ago. But back then 44KHz was "the business", and I knew it wasn't good enough. I paid a lot of money for a true 96KHz card right before I lost all my equipment anyway :-/

I'd love to be able to confirm all this stuff for myself, and see if we can't come up with a simpler or more logical way of implementing it. Not that it's complicated, just... odd. In my experience, odd usually means wrong, but I'm prepared to accept that in this case, it is probably just a strange piece of hardware hackery. Would like to be sure, though, because that's just "how I roll".

[edit] oh - while I'm here - a lot of games write to the upper 4 bits of the SSG-EG registers. I'm not sure what the programmers thought they were doing this for, and I didn't test extensively, but I believe they don't do anything. Did you test this?

Nemesis · Post by **Nemesis** » Fri Feb 27, 2009 1:32 am

Snake wrote:Also - granted, inverted 0x200 will give you 0x000, but every value above 0x200 will also give you a value above 0x200 - I would have expected something much quieter than mid level if the two were just being averaged.

Ahh, but remember that the attenuation isn't moving past 0x200 during a repeating SSG-EG wave. In the pattern you posted, the attenuation decays until it reaches 0x200. When this happens, if AR was 0x1F, the attenuation would be forced back to 0. Since this isn't the case, what happens is the attenuation sits on 0x200 until the attack phase advances. When it does advance, it will drop the attenuation to 0x1DF. From this point on, you get a valid, advancing attack phase, but all the samples that were generated while the attenuation value was sitting on 0x200 will be inverting each sample.

One point where you do get an attenuation level over 0x200 being toggled each sample is when you first key-on an SSG-EG envelope with an attack rate less than 0x1F, if the current attenuation level is over 0x200 that is. Examine the hardware recording I made at the start of patterns 0xA and 0xE. You want to zoom in on the vertical axis a lot and have a really good look at the low-volume lead-in for the attack curve. The DAC starts doing horrible things with this much attenuation, but you'll be able to see the same high-frequency oscillation occurring on this wave. You'll also be able to see a "dead patch" a little before half-way along this block, where the attenuation level is around 0x300, and 0x300 inverted is still 0x300, so the high frequency oscillation reduces as the attenuation approaches 0x300, then increases again as it approaches 0x200. This constantly toggling inversion state just after the key-on event is what causes the initial state of the ATT bit to be overridden.
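The values quoted in this exchange (0x200 inverts to 0x000, 0x300 inverts to 0x300, and anything in 0x201-0x3FF stays in that range) are all consistent with mirroring the 10-bit attenuation around 0x200. A minimal sketch, assuming that is the rule; the function name is hypothetical:

```c
#include <stdint.h>

/* Assumed SSG-EG inversion rule: mirror the 10-bit attenuation
   around 0x200 and wrap the result back into 10 bits. */
uint16_t ssg_invert(uint16_t attenuation)
{
    return (uint16_t)((0x200u - attenuation) & 0x3FFu);
}
```

Under this rule `ssg_invert(0x300)` is 0x300 again, which matches the "dead patch" described above, and the difference between normal and inverted samples grows as the attenuation moves away from 0x300 in either direction.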

Snake wrote:But as I said, it's not clear from this sample alone and I'd have to do more tests. I do know that a lot of the info you posted is correct, because I confirmed some of it myself a long time ago. But back then 44KHz was "the business", and I knew it wasn't good enough. I paid a lot of money for a true 96KHz card right before I lost all my equipment anyway :-/

If you're in the market for a new sound card, this is the one I use for my tests:
http://www.emu.com/products/product.asp?product=9872
It's got the best ADC of any card on the market, at least that I've seen. You can pick this card up on ebay for under $200USD.

For my hardware tests, I do everything with a Tototek flash cart. When I need to sample digital data, I use test roms to dump data to RAM, and I use a Pro Action Replay to read the contents of the RAM on demand (use the trainer feature, "Slow but sure" mode, then "List possibilities").

Snake wrote:I'd love to be able to confirm all this stuff for myself, and see if we can't come up with a simpler or more logical way of implementing it. Not that it's complicated, just... odd. In my experience, odd usually means wrong, but I'm prepared to accept that in this case, it is probably just a strange piece of hardware hackery. Would like to be sure, though, because that's just "how I roll".

I understand exactly what you mean. I'm definitely interested in hearing about any simplified, more logical implementations.

Personally, the thing that irks me the most is the behaviour when the effective attack rate is 62 or 63. If it was just a matter of forcing the attenuation directly to 0 when the attack phase is entered with these rate values, well, that's one thing, but the "stalling" behaviour when changing to this rate during the attack phase bothers me. It just doesn't make any sense for that to occur. When I did the test, I expected to find that the attack curve would proceed with an increment value of 8. I'm sure I'm missing something there.

Snake wrote:[edit] oh - while I'm here - a lot of games write to the upper 4 bits of the SSG-EG registers. I'm not sure what the programmers thought they were doing this for, and I didn't test extensively, but I believe they don't do anything. Did you test this?

I can quite confidently say they do absolutely nothing. Most of my test roms that iterated through patterns, including the test ROM I just posted, use an incrementing counter which allows the upper bits of the SSG-EG mode register to be set. I left some of these roms recording for a very long time (over a day once) to generate large blocks of reference data, and I never saw any change in behaviour based on what these upper bits would have been set to. I also did some specific tests a while back that forced all the upper bits of the SSG-EG mode register to be set, and it had no effect.
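For an emulator, that result means a register write can simply mask off the upper nibble. A small sketch, assuming the usual layout of the low four bits (bit 3 enable, bit 2 the ATT bit mentioned earlier, bit 1 alternate, bit 0 hold); the struct and function names are invented for illustration:

```c
#include <stdint.h>
#include <stdbool.h>

/* Decoded SSG-EG mode; field names are illustrative only. */
typedef struct {
    bool enabled;    /* bit 3 */
    bool attack;     /* bit 2, the ATT bit */
    bool alternate;  /* bit 1 */
    bool hold;       /* bit 0 */
} ssg_eg_mode;

/* Decode a write to the SSG-EG register, ignoring the upper 4 bits. */
ssg_eg_mode ssg_eg_decode(uint8_t reg_value)
{
    uint8_t v = reg_value & 0x0F;  /* upper nibble has no observed effect */
    ssg_eg_mode mode;
    mode.enabled   = (v & 0x08) != 0;
    mode.attack    = (v & 0x04) != 0;
    mode.alternate = (v & 0x02) != 0;
    mode.hold      = (v & 0x01) != 0;
    return mode;
}
```

Writing 0xFA and writing 0x0A then decode to the same mode, which is the behaviour the test ROMs showed.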

Snake · Post by **Snake** » Fri Feb 27, 2009 2:04 am

Nemesis wrote:Ahh, but remember that the attenuation isn't moving past 0x200 during a repeating SSG-EG wave.

Ah, this is what I meant, but I didn't explain it very well

I was assuming the attenuation starts at 0x3FF, and is decremented until it hits 0x200 - at which point it will immediately switch to decay, then immediately to sustain (provided the sustain level doesn't prevent that) - and then (almost) immediately back to attack. In this case, during the attack it inverts every sample. Values in the 0x201-0x3FF range inverted are still in the 0x201-0x3FF range, which is why I found it a bit odd to see what I was seeing, and for it to have no apparent change in amplitude. Also I was half asleep and can see the errors in this thinking now.

Nemesis wrote:what happens is the attenuation sits on 0x200 until the attack phase advances.

Right. So the very first time after key on, it should look different. I'll check that again.

Nemesis wrote:One point where you do get an attenuation level over 0x200 being toggled each sample is when you first key-on an SSG-EG envelope with an attack rate less than 0x1F, if the current attenuation level is over 0x200 that is.

Ah, I see we are actually on the same page

As for sound cards and tototek flash cards - I'm pretty stuck at the moment. Don't have anything with PCI slots nor anything with a parallel port, so I need to find something decent in USB flavour.

Nemesis wrote:I can quite confidently say they do absolutely nothing.

Exactly what I thought, thanks.

Nemesis · Post by **Nemesis** » Fri Feb 27, 2009 3:30 am

Snake wrote:As for sound cards and tototek flash cards - I'm pretty stuck at the moment. Don't have anything with PCI slots nor anything with a parallel port, so I need to find something decent in USB flavour.

Ouch. Yeah, that cuts down the options somewhat. Personally, I use an old crappy P4 box for all this stuff. I do most of my work on my laptop, and just use the test rig to transfer ROM data to the flashcart, or capture audio/video from the console. I think that's the best way to go. If you don't have an old PC lying around, you could pick one up really cheap if you look in the right place.

Eke · Post by **Eke** » Fri Feb 27, 2009 8:54 am

Nemesis wrote:I can see one thing that might be a problem, if I'm following your code correctly. When SSG-EG is active and the envelope is in the release phase, the attenuation level is still forced directly to 0x3FF when the release phase reaches 0x200. It looks like you're excluding all SSG-EG update steps when release phase is active, which would also exclude this behaviour.

well, the attenuation level is forced to max during EG update and I considered that, in RELEASE phase, once the attenuation level has reached 0x200 and been forced to max, there is no need to check for SSG-EG update anymore until next KEY ON event...

see, in advance_eg_channel, I have this:

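    /* release phase with SSG-EG: once attenuation reaches 0x200,
       force it to maximum and stop updating this operator */
    if (SLOT->volume >= 0x200)
    {
      SLOT->volume = MAX_ATT_INDEX;
      SLOT->state = EG_OFF;
    }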

Using the EG_OFF state avoids unnecessary EG update checks, since the only way the EG output can change after that is a KEY ON.

However, to be more correct, I should also force the attenuation level as soon as KEY OFF occurs, in case the attenuation level is already above 0x200... there are some cases where this could happen, especially when an attack phase is programmed.
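A minimal sketch of that extra check, reusing the same clamp at KEY OFF. This is not the actual fm.c code: the trimmed-down `slot_t` struct, the `ssg_on` flag and the function names are assumptions made for illustration, and only `MAX_ATT_INDEX`, `EG_OFF` and the 0x200 threshold come from the snippet above.

```c
#include <stdint.h>
#include <stdbool.h>

#define MAX_ATT_INDEX 0x3FF  /* maximum 10-bit attenuation */

typedef enum { EG_OFF, EG_REL, EG_SUS, EG_DEC, EG_ATT } eg_state;

/* Hypothetical, trimmed-down operator state for illustration only. */
typedef struct {
    uint16_t volume;  /* current attenuation, 0..0x3FF */
    eg_state state;
    bool     ssg_on;  /* SSG-EG enabled for this operator */
} slot_t;

/* In release with SSG-EG on, 0x200 or more means the envelope is done. */
static void ssg_release_clamp(slot_t *slot)
{
    if (slot->ssg_on && slot->state == EG_REL && slot->volume >= 0x200) {
        slot->volume = MAX_ATT_INDEX;
        slot->state  = EG_OFF;
    }
}

/* KEY OFF: enter release, then apply the clamp straight away so an
   operator that is already at or above 0x200 goes silent immediately. */
void slot_key_off(slot_t *slot)
{
    if (slot->state != EG_OFF)
        slot->state = EG_REL;
    ssg_release_clamp(slot);
}
```

The same `ssg_release_clamp` helper could also be called from the per-sample release update, so both paths share one rule.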

thanks for looking

PS: that makes me think of another "extreme" case.
What would happen if a fast KEY OFF / KEY ON occurs at the very start of the Decay phase (or Sustain phase with SL=0), before any EG update has occurred, so the attenuation level is still 0? Doesn't that mean that the EG will immediately restart in the decay phase (or sustain if SL=0)?

Snake · Post by **Snake** » Fri Feb 27, 2009 10:55 pm

Nemesis wrote:Personally, the thing that irks me the most is the behaviour when the effective attack rate is 62 or 63. If it was just a matter of forcing the attenuation directly to 0 when the attack phase is entered with these rate values, well, that's one thing, but the "stalling" behaviour when changing to this rate during the attack phase bothers me. It just doesn't make any sense for that to occur.

I think that the attack phase simply doesn't happen at all when the rate is >=62, it's just skipped over. So when you force it to stay inside the attack state with a rate of 62, by modifying registers afterwards, it just does nothing, and has no way of getting out of the attack state.

The table used for the envelope timings is undoubtedly stored in a small 4-bit ROM, and will be shared for A/D/S/R. But it never goes higher than 8, which means an immediate attack is impossible. An immediate attack is also desirable, even forgetting for a moment SSG-EG and CSM. It would have been awkward to implement for a few reasons - you'd need a separate attack table to do this without also changing the D/S/R settings, and you'd need an extra bit in the table. Much easier to just special case it.
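Here is a sketch of that special case, under the hypothesis in this post rather than anything confirmed from the silicon. The types and function names are invented. The normal attack step is written to match the exponential update used in MAME-derived cores, `volume += (~volume * inc) >> 4`, rewritten with unsigned arithmetic; with an increment of 1 it takes 0x200 to 0x1DF, the step quoted earlier in the thread.

```c
#include <stdint.h>

typedef enum { PHASE_ATTACK, PHASE_DECAY } env_phase;

/* Hypothetical minimal envelope state for illustration only. */
typedef struct {
    uint16_t  attenuation;  /* 0 = loudest, 0x3FF = silent */
    env_phase phase;
} envelope;

/* KEY ON: rates 62 and 63 skip the attack phase entirely. */
void env_key_on(envelope *env, unsigned rate)
{
    if (rate >= 62) {
        env->attenuation = 0;
        env->phase = PHASE_DECAY;
    } else {
        env->phase = PHASE_ATTACK;
    }
}

/* One attack update using a table increment in the range 1..8. */
void env_attack_step(envelope *env, unsigned rate, unsigned increment)
{
    if (env->phase != PHASE_ATTACK || rate >= 62)
        return;  /* a rate of 62 or 63 set mid-attack stalls the envelope */

    /* Same result as volume += (~volume * inc) >> 4 with an arithmetic shift. */
    unsigned drop = ((env->attenuation + 1u) * increment + 15u) >> 4;
    env->attenuation = (drop >= env->attenuation)
                     ? 0
                     : (uint16_t)(env->attenuation - drop);
    if (env->attenuation == 0)
        env->phase = PHASE_DECAY;
}
```

The stall falls out of the early return: once the rate is changed to 62 or 63 while `phase` is still `PHASE_ATTACK`, nothing moves the attenuation and nothing changes the phase.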

vedge · Post by **vedge** » Thu Apr 02, 2009 5:46 pm

Hi Guys

New here (but not new to FM hardware hacking). I've been doing lots of research, and maybe this is new to you:

http://www7.plala.or.jp/kikekike/fm/YM3438_APL.pdf

Now, this is another PDF that requires translation

Kind of awesome since that's the CMOS version of the 2612!

Even in Japanese, this will help me finish up my HW 2612 VGM/MIDI synth with amplifier output setting.

Nemesis · Post by **Nemesis** » Thu Apr 02, 2009 11:41 pm

Holy crap! How long has that been online? I looked really hard for documentation on these chips, including the YM3438, and never came across this file. Thank you for posting this link, it's a major find.

Well, this is quite simply exactly what I was after at the start. The YM3438 is the YM2612 for all intents and purposes. I've just taken a look through the file, and most of what this document contains is also covered in the YM2608 document (as expected), but this document still contains valuable information which we haven't got. For one thing, it confirms what Steve suspected; they've removed the accumulator. It also formally identifies pin 10 as the TEST pin, shows the difference in the CH3 mode flags between the YM2608 and the YM3438, gives us busy flag timing information, and gives us a description of the CH6 DAC mode.

I'll start a translation of this document the same way I did the YM2608 document. Thank you again for sharing this.

Snake · Post by **Snake** » Thu Apr 02, 2009 11:42 pm

vedge wrote:Now, this is another PDF that requires translation
Kind of awesome since that's the CMOS version of the 2612!

I don't know if this document will tell us anything we didn't already know, but it does indeed appear to be hardware and software 100% identical to the YM2612. So this is a very nice thing to have, regardless. Thanks

It does confirm the 9 bit DAC which I've been saying since the dawn of time, though.

[edit] heh - Nemesis beat me to reply and seems to have noticed other nice little bits of info. Get to that translation, kind sir

vedge · Post by **vedge** » Fri Apr 03, 2009 12:17 am

Hi Guys

Was quite lucky and surprised to find it. I just did a Google image search for "ym3438", and that PDF was at the bottom of a page that contained a nice picture of a 3438 on a board.

Just a few days ago I read through this whole thread with great interest.
Can't wait for your translation magic.

Cheers

EDIT: btw, for completeness, there's another document (hidden in the underlying directory structure):

http://www7.plala.or.jp/kikekike/fm/YM3438_Cat.pdf

Paul Jensen · Post by **Paul Jensen** » Mon Apr 06, 2009 5:00 pm

Hi everybody. Some of you might remember me as the author of GYM2MID and VGM2MID (neither updated for a long, long time).

Anyway, I'm very interested in this new documentation about the YM3438. I've skimmed through the latest PDF, and have a little input on the matter, mostly in the way of translation. It's too bad the scan quality isn't higher. Makes OCR pretty difficult.

Nemesis wrote:For one thing, it confirms what Steve suspected; they've removed the accumulator.

Are you sure about this? An accumulator is mentioned on page 33 (page 35 of the PDF):

"The accumulator totals up the (9-bit) output sent to each slot of each channel one at a time and sends it to the D/A converter. Therefore, no special attention is required in terms of sound creation."

I don't know if this is already common knowledge, but the document also mentions some sort of overflow protection for rate calculation (page 28 of the document, page 30 of the PDF):

"'Rate' has a maximum value of 63. When a calculated value is higher than 63, the value becomes 63."

"Rate" = 2R (where R is, for example, the attack rate) + Rks (the key-scaling value).

Cool stuff, everybody. This thread is satisfying a lot of my long-standing curiosity with the YM2612.

Snake · Post by **Snake** » Tue Apr 07, 2009 8:31 pm

This is probably an error copied over from some other document. A lot of the documentation was probably copied then edited where necessary, and the same is probably true of most Yamaha documentation, given the huge similarities between a lot of their chips. But we know this stuff in the YM2612 works differently from other Yamaha chips.

The Rate thing was mentioned early on in the thread, and was well known way before that.
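For reference, the clamp quoted from the manual can be sketched as follows, assuming the effective rate is formed as twice the register rate plus the key-scaling value (2R + Rks). The function name is made up for illustration.

```c
/* Effective envelope rate, assuming rate = 2R + Rks, limited to 63.
   r is the 5-bit register rate (0..31), rks the key-scaling value. */
unsigned effective_rate(unsigned r, unsigned rks)
{
    unsigned rate = 2u * r + rks;
    return rate > 63u ? 63u : rate;
}
```

With R = 31, any non-zero Rks already saturates at 63, and R = 31 with Rks = 0 gives 62, so both of the "immediate attack" rates discussed earlier are only reachable with the attack rate register at its maximum.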