Well, what I do in my emulator probably wouldn't be very adaptable to Regen. I designed the structure of my emulator from the ground up to allow precise time comparisons to be made between cores. My emulator is driven by execution "steps" called timeslices. Each timeslice has a known length, and each core can determine its own progress through the current timeslice, and the progress of any core which it interacts with, at any point in time. A timestamp is a combination of a reference to a timeslice (which gives the length and position of the block of time in which the access occurred), and a time displacement (which gives the current progress of the core which is performing the operation through the timeslice). In this way, the timestamp isn't relative to any other core. A timestamp is represented in "real time" or "system time", against which the YM2612 is able to compare its own progress.
You shouldn't need a system like that to solve this problem though. I really haven't looked at how other emulators like MAME and Gens manage synchronization issues, but all you should need is a running counter of some kind which you can directly relate to the Z80, M68000, and YM2612, and is high resolution enough to record the times you need. It could be as simple as a running cycle counter based on the system clock, as the clock for every chip will be a clean division from this clock source. You could store a copy of the counter indicating what it was up to the last time you advanced the YM2612 core, and from that stored value, you'll know when each buffered write should be committed.
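A rough sketch of that counter idea in C++ follows. The names (toMasterCycles, Ym2612Sync, catchUp) are made up for illustration, and the dividers are the commonly cited Mega Drive values rather than anything stated in this post, so treat them as assumptions and check them against your own timing code.

```cpp
#include <cstdint>

// Assumed clock dividers relative to the master clock (not from the post):
// M68000 = MCLK/7, Z80 = MCLK/15, YM2612 input clock = MCLK/7.
constexpr std::uint64_t kM68kDivider   = 7;
constexpr std::uint64_t kZ80Divider    = 15;
constexpr std::uint64_t kYm2612Divider = 7;

// Convert a core's own cycle count into master-clock cycles, so that
// times from different chips can be compared directly.
constexpr std::uint64_t toMasterCycles(std::uint64_t coreCycles,
                                       std::uint64_t divider) {
    return coreCycles * divider;
}

// Hypothetical helper that remembers how far the YM2612 has been advanced.
struct Ym2612Sync {
    std::uint64_t syncedTo = 0;  // master-clock time the YM2612 has reached

    // Returns how many whole YM2612 clock cycles must be run to catch up
    // to masterNow, and records the new position. Any leftover fraction of
    // a YM2612 cycle stays pending until the next call.
    std::uint64_t catchUp(std::uint64_t masterNow) {
        if (masterNow <= syncedTo) {
            return 0;
        }
        const std::uint64_t ymCycles = (masterNow - syncedTo) / kYm2612Divider;
        syncedTo += ymCycles * kYm2612Divider;
        return ymCycles;
    }
};
```

A buffered write stamped with toMasterCycles(...) is due once syncedTo reaches that stamp, which is all the YM2612 needs to know to commit it at the right point.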
In terms of the buffering itself, well, in my emulator I spent months designing and optimizing a generic thread-safe container to solve this problem. The container is like an array, except that it is able to store and return the correct value for any address over a range of time. When reading from or writing to the buffer, you pass a timestamp. The buffer itself has a "committed" baseline, which is just an array. On top of this, it stores a sorted list of writes that have occurred past the baseline. The YM2612 audio thread works from the baseline, and advances it in steps as it executes. As buffered writes are committed, the new value is written into the array. Reading committed data is as fast as reading from a raw array. Adding new writes to the buffer only requires adding a small structure to a linked list. Ad-hoc reads past the committed state are relatively slow (requires iteration of the buffer), but can be performed. I've only had to use them so far for the VDP when the external chip reads data from VRAM, which is a rare operation.
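To make the shape of that container a bit more concrete, here is a heavily simplified C++ sketch. It is not the actual container: it is single-threaded, it uses a plain 64-bit number as the timestamp instead of a timeslice plus displacement, and the class and method names (TimedBuffer, write, readAt, commitUpTo) are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <vector>

// Simplified, NOT thread-safe sketch of a timed buffer: a committed
// baseline array plus a time-sorted list of writes that lie past it.
class TimedBuffer {
public:
    explicit TimedBuffer(std::size_t size) : committed_(size, 0) {}

    // Buffer a write. Writes usually arrive in time order, so search from
    // the back of the list for the insertion point.
    void write(std::size_t address, std::uint8_t value, std::uint64_t time) {
        auto pos = pending_.end();
        while (pos != pending_.begin() && std::prev(pos)->time > time) {
            --pos;
        }
        pending_.insert(pos, Write{time, address, value});
    }

    // Fast path: read from the committed baseline, same cost as a raw array.
    std::uint8_t readCommitted(std::size_t address) const {
        return committed_[address];
    }

    // Slow path: ad-hoc read past the committed state. Walks the pending
    // writes and returns the value the address would hold at 'time'.
    std::uint8_t readAt(std::size_t address, std::uint64_t time) const {
        std::uint8_t value = committed_[address];
        for (const Write& w : pending_) {
            if (w.time > time) break;
            if (w.address == address) value = w.value;
        }
        return value;
    }

    // Advance the baseline: fold every write at or before 'time' into the array.
    void commitUpTo(std::uint64_t time) {
        while (!pending_.empty() && pending_.front().time <= time) {
            committed_[pending_.front().address] = pending_.front().value;
            pending_.pop_front();
        }
    }

private:
    struct Write {
        std::uint64_t time;
        std::size_t address;
        std::uint8_t value;
    };
    std::vector<std::uint8_t> committed_;  // the "committed" baseline
    std::list<Write> pending_;             // writes past the baseline, by time
};
```

The consumer (the YM2612 thread in the description above) would call commitUpTo as it advances and then use readCommitted, while the writers only ever call write.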
I only designed a system like this because I wanted to be able to solve the "unsolvable" timing problems. It's allowed me to implement support for things like mid-line changes to VRAM, and even emulate the flicker bug on CRAM writes. It is VERY accurate, but it's also VERY slow. At this point in time, my emulator doesn't run full speed on a Core 2 Duo E6750, and that's with each active core running in a dedicated thread. You wouldn't want to try playing a game in it. My emulator is entirely focused on debugging, reverse engineering, hardware prototyping, etc. Being able to play games full speed would just be an added bonus. Maybe in a few years' time.
Again, this is overkill for this particular problem. For DAC in Regen, I'd suggest a running cycle counter like I mentioned above, and a simple linked list in which each entry stores the cycle counter at the time of the write, and the new value written. The Z80 or M68000 adds a new entry to the list when it writes to the DAC data register, and when you advance the YM2612, you read the entries in the list and apply them. The only other things you'll probably need to do are keep the linked list sorted by write time (in case the M68000 and Z80 both try to access it), and deal with the case where the cycle counter overflows.
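Something along these lines would do, as a minimal C++ sketch. The names (DacWrite, DacWriteQueue, isBefore) are hypothetical, and the 32-bit counter with a wrap-aware comparison is just one possible way to deal with overflow; it assumes any two times being compared are less than 2^31 cycles apart.

```cpp
#include <cstdint>
#include <iterator>
#include <list>

// True if time a is earlier than time b on a wrapping 32-bit counter.
// Assumption: the two times are less than 2^31 cycles apart.
inline bool isBefore(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>(a - b) > 0x7FFFFFFFu;
}

struct DacWrite {
    std::uint32_t time;   // cycle counter at the time of the write
    std::uint8_t value;   // new value written to the DAC data register
};

class DacWriteQueue {
public:
    // Called when the Z80 or M68000 writes the DAC data register.
    // Keeps the list sorted by time; writes with equal times keep their
    // arrival order.
    void push(std::uint32_t time, std::uint8_t value) {
        auto pos = writes_.end();
        while (pos != writes_.begin() && isBefore(time, std::prev(pos)->time)) {
            --pos;
        }
        writes_.insert(pos, DacWrite{time, value});
    }

    // Called when advancing the YM2612: applies every write due at or
    // before 'now' and returns the DAC value in effect at that time.
    std::uint8_t applyUpTo(std::uint32_t now, std::uint8_t current) {
        while (!writes_.empty() && !isBefore(now, writes_.front().time)) {
            current = writes_.front().value;
            writes_.pop_front();
        }
        return current;
    }

    bool empty() const { return writes_.empty(); }

private:
    std::list<DacWrite> writes_;
};
```

Calling applyUpTo once per output sample, with the counter value for that sample, gives each DAC write its effect at the sample where it actually happened rather than at the end of the frame.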