Yet Another Z80 PCM Compression Example
Posted: Fri Apr 06, 2012 6:13 am
Posts by another user about using TADPCM in BEX got me thinking about PCM compression again. TADPCM is VERY similar to BTC - Block Truncation Coding. BTC is used in video compression; it breaks an image into 4x4 blocks of pixels, then computes the mean and standard deviation of each block. The purpose of BTC is to preserve the standard deviation of a block of pixels when it is compressed. However, the standard deviation is a little more complex than people want to spend their time on, so a variation called AMBTC was derived.
AMBTC is Absolute Moment Block Truncation Coding. Instead of preserving the standard deviation, it preserves the absolute moment since that is MUCH easier to calculate. Encoding is done like this:
First compute the mean of the 16 values:
Then find the number of inputs greater than or equal to the mean (this is called k):
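Now you find the moment; note that the moment has a high and a low value. The high value is the average of the values greater than or equal to the mean, and the low value is the average of the values below the mean.
Note that the number of values below the mean (16-k) can be 0, e.g. if all the values are the same, since k counts every value that is >= the mean. In that case the low value is skipped to avoid dividing by zero.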
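Written out as equations, the steps above look roughly like this. It is only a restatement of the text, with x_0 .. x_15 standing for the 16 inputs; the floor brackets are an assumption that reflects integer arithmetic.

```latex
\begin{aligned}
\bar{x} &= \left\lfloor \frac{1}{16} \sum_{i=0}^{15} x_i \right\rfloor \\
k &= \left|\{\, i : x_i \ge \bar{x} \,\}\right| \\
\text{high} &= \left\lfloor \frac{1}{k} \sum_{x_i \ge \bar{x}} x_i \right\rfloor \\
\text{low} &= \left\lfloor \frac{1}{16-k} \sum_{x_i < \bar{x}} x_i \right\rfloor \qquad (k < 16)
\end{aligned}
```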
Now encoding the pixels is simple - just go through and compare each one to the mean; output a 1 if it is greater than or equal to the mean, and a 0 if it is less than the mean.
And you're done! While this is meant for images, it CAN be applied to audio as well. Just take 16 consecutive samples, using 8-bit unsigned samples for the audio... just like the YM2612 DAC uses. The compressed data consists of packets of 16 samples compressed to a high byte, a low byte, and two bytes of bits representing the samples. That's 4 bytes for 16 samples: two bits per sample, or 4:1 compression compared to the original 8-bit samples. You could also do the same thing over 8 samples instead of 16. That gives packets of 8 samples compressed to a high byte, a low byte, and one byte of bits representing the samples. That's 3 bytes for 8 samples: 3 bits per sample, and as you'd expect, it sounds better.
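Here is a rough sketch of how one packet of either size could be built in a single function. The `ambtc_block` struct and the `ambtc_encode` name are made up for illustration and are not taken from the archive; the real compressor may lay out its data differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet layout, for illustration only. */
typedef struct {
    uint8_t  high;   /* average of the samples >= mean */
    uint8_t  low;    /* average of the samples <  mean */
    uint16_t flags;  /* one bit per sample, first sample in the top bit of the n-bit field */
} ambtc_block;

/* Encode n samples (n = 8 or 16) of 8-bit unsigned PCM into one block. */
static ambtc_block ambtc_encode(const uint8_t *in, int n)
{
    ambtc_block b = {0, 0, 0};
    unsigned sum = 0, hi_sum = 0, lo_sum = 0, mean;
    int i, k = 0;

    assert(n == 8 || n == 16);

    for (i = 0; i < n; i++)
        sum += in[i];
    mean = sum / (unsigned)n;

    for (i = 0; i < n; i++) {
        if (in[i] >= mean) {
            hi_sum += in[i];
            k++;
            b.flags |= (uint16_t)(1u << (n - 1 - i));
        } else {
            lo_sum += in[i];
        }
    }

    /* k >= 1 always: the largest sample is never below the truncated mean. */
    b.high = (uint8_t)(hi_sum / (unsigned)k);
    /* With no samples below the mean, low is never selected; reuse high. */
    b.low = (k < n) ? (uint8_t)(lo_sum / (unsigned)(n - k)) : b.high;
    return b;
}
```

A block of 8 samples then costs 2 + 1 bytes and a block of 16 costs 2 + 2 bytes, which is where the 3 and 2 bits per sample figures come from.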
Decoding is ridiculously easy - just go through the bits, and when you find a set bit output the high byte, otherwise output the low byte. That's it. Here's the core of the Z80 decompressor:
The key part is the shift and conditional jump at the top of each sample loop (SLA C followed by JP C).
You shift the byte and jump based on the bit shifted into the carry flag. I do a VERY simple filter to make it sound slightly better - I output the average of the current and the last samples. Heavier filtering can make it sound better, especially the 2 bits per sample output, but would take more time. Even just storing the high and low without any averaging isn't too bad.
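For previewing on a PC, the same decode-and-average step might look something like this in C. This is a sketch under assumptions, not the code from the archive: the function name and argument layout are invented, and the flag bits are assumed to be stored with the first sample in the top bit.

```c
#include <assert.h>
#include <stdint.h>

/* Decode one block of n samples (n = 8 or 16).
 * high/low : the two levels stored in the packet
 * flags    : one bit per sample, first sample in the top bit of the n-bit field
 * last     : previous unfiltered level, carried from block to block
 * out      : receives n filtered samples */
static void ambtc_decode(uint8_t high, uint8_t low, uint16_t flags, int n,
                         uint8_t *last, uint8_t *out)
{
    int i;

    assert(n == 8 || n == 16);

    for (i = 0; i < n; i++) {
        /* Set bit selects the high level, clear bit the low level. */
        uint8_t cur = ((flags >> (n - 1 - i)) & 1) ? high : low;
        /* Simple filter: average of the current and the last level. */
        out[i] = (uint8_t)((*last + cur) >> 1);
        *last = cur;
    }
}
```

Keeping `last` outside the function lets the filter run smoothly across block boundaries, in the same way a register that is never reset between packets would.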
One difference from my CVSD examples is that I include a delay loop in the sample output so you can vary the sample rate of the playback. With N as the delay count (DELAY in the code), the sample rate for the three bits per sample version is
Fs ~= 8 * 3.58M / (860 + 112 * N)
while for the two bits per sample version it is
Fs ~= 16 * 3.58M / (1589 + 224 * N)
Using N=4 gives a rate of about 22kHz for the 3-bit decompressor, and about 23kHz for the 2-bit decompressor.
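To see what rate a given N produces, the formulas can be evaluated with a small helper like the one below. The helper name is made up, and the clock value of 3579545 Hz is an assumption (the NTSC figure that the post rounds to 3.58M).

```c
#include <assert.h>

/* Assumed NTSC Z80 clock in Hz; the post rounds this to 3.58M. */
#define Z80_CLOCK_HZ 3579545L

/* Fs ~= samples * clock / (base_cycles + cycles_per_n * N)
 * 3-bit version: samples = 8,  base_cycles = 860,  cycles_per_n = 112
 * 2-bit version: samples = 16, base_cycles = 1589, cycles_per_n = 224 */
static long ambtc_rate_hz(int samples, int base_cycles, int cycles_per_n, int n)
{
    /* A count of 0 would wrap around in an 8-bit DEC/JP NZ loop. */
    assert(n >= 1 && n <= 255);
    return samples * Z80_CLOCK_HZ / (base_cycles + (long)cycles_per_n * n);
}
```

With N=4 this gives 21893 Hz for the 3-bit version and 23047 Hz for the 2-bit version, which are the rates passed to sox with -r.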
Here's the archive with both examples, including rom images, source, and linux binaries of the compressor and decompressor (for previewing how the compressed audio sounds).
audio-ambtc.7z
To make your own compressed audio clips, first convert the sounds to mono 8-bit unsigned raw data, at 21893 Hz (about 22 kHz) for the 3-bit example or 23047 Hz (about 23 kHz) for the 2-bit example. Then use my compressor program to make the compressed files used by the examples:
To decompress on the PC to preview the sound, just do
Then you can use something like mplayer to listen to it:
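Mean of the 16 values:
```c
/* assumed declarations: unsigned char input[16]; int i, k, mean, high, low; unsigned short array; */
for (i=0, mean=0; i<16; i++)
    mean += input[i];
mean >>= 4;   /* divide by 16 */
```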
Number of inputs greater than or equal to the mean (k):
```c
for (i=0, k=0; i<16; i++)
    if (input[i] >= mean)
        k++;
```
High value:
```c
for (i=0, high=0; i<16; i++)
    if (input[i] >= mean)
        high += input[i];
high /= k;
```
Low value:
```c
if (16-k)
{
    for (i=0, low=0; i<16; i++)
        if (input[i] < mean)
            low += input[i];
    low /= (16-k);
}
else
    low = high;   /* no values below the mean; every flag bit is 1, so low is never selected */
```
Flag bits for the 16 values:
```c
for (i=0, array = 0; i<16; i++)
    if (input[i] >= mean)
        array |= (1 << (15-i));
```
Core of the Z80 decompressor (two flag bytes per packet, so 16 samples):
```asm
; best time in code outside sample loop is 175 cycles
outer_loop
        LD      A, (PAUSE)      ; 13
        OR      A               ; 4
        JP      NZ, pause       ; 10 playback paused
resume
        LD      D, (IY+0)       ; 19 X high
        INC     IY              ; 10
        DEC     XH              ; 8
        CALL    Z, expired      ; 10/17
        LD      E, (IY+0)       ; 19 X low
        INC     IY              ; 10
        DEC     XH              ; 8
        CALL    Z, expired      ; 10/17
        LD      C, (IY+0)       ; 19 sample flags
        INC     IY              ; 10
        DEC     XH              ; 8
        CALL    Z, expired      ; 10/17
        LD      B,8             ; 7
; total time of this loop is (675 + 112*DELAY) cycles
sample_loop1
        SLA     C               ; 8 check flag
        JP      C, out_high1    ; 10 flag set
        LD      A, XL           ; 8 A = last sample
        LD      XL, E           ; 8 last sample = current sample
        ADD     E               ; 4 current sample + last sample
        RRA                     ; 4 sample = (current sample + last sample) / 2
        LD      (HL), A         ; 7 set DAC
        JP      next1           ; 10 next sample
out_high1
        LD      A, XL           ; 8 A = last sample
        LD      XL, D           ; 8 last sample = current sample
        ADD     D               ; 4 current sample + last sample
        RRA                     ; 4 sample = (current sample + last sample) / 2
        LD      (HL), A         ; 7 set DAC
        JP      next1           ; 10 next sample - this jump is to keep the timing the same
next1
        LD      A, (DELAY)      ; 13 get sample rate delay count
delay1
        DEC     A               ; 4
        JP      NZ,delay1       ; 10
        DJNZ    sample_loop1    ; 13*8-5 for all 8 samples
; best time is 54 cycles
        LD      C, (IY+0)       ; 19 sample flags
        INC     IY              ; 10
        DEC     XH              ; 8
        CALL    Z, expired      ; 10/17
        LD      B,8             ; 7
; total time of this loop is (675 + 112*DELAY) cycles
sample_loop2
        SLA     C               ; 8 check flag
        JP      C, out_high2    ; 10 flag set
        LD      A, XL           ; 8 A = last sample
        LD      XL, E           ; 8 last sample = current sample
        ADD     E               ; 4 current sample + last sample
        RRA                     ; 4 sample = (current sample + last sample) / 2
        LD      (HL), A         ; 7 set DAC
        JP      next2           ; 10 next sample
out_high2
        LD      A, XL           ; 8 A = last sample
        LD      XL, D           ; 8 last sample = current sample
        ADD     D               ; 4 current sample + last sample
        RRA                     ; 4 sample = (current sample + last sample) / 2
        LD      (HL), A         ; 7 set DAC
        JP      next2           ; 10 next sample - this jump is to keep the timing the same
next2
        LD      A, (DELAY)      ; 13 get sample rate delay count
delay2
        DEC     A               ; 4
        JP      NZ,delay2       ; 10
        DJNZ    sample_loop2    ; 13*8-5 for all 8 samples
        JP      outer_loop      ; 10
```
The key part of the sample loop:
```asm
        SLA     C               ; 8 check flag
        JP      C, out_high1    ; 10 flag set
        LD      A, XL           ; 8 A = last sample
        LD      XL, E           ; 8 last sample = current sample
        ADD     E               ; 4 current sample + last sample
        RRA                     ; 4 sample = (current sample + last sample) / 2
        LD      (HL), A         ; 7 set DAC
        JP      next1           ; 10 next sample
out_high1
        LD      A, XL           ; 8 A = last sample
        LD      XL, D           ; 8 last sample = current sample
        ADD     D               ; 4 current sample + last sample
        RRA                     ; 4 sample = (current sample + last sample) / 2
        LD      (HL), A         ; 7 set DAC
        JP      next1           ; 10 next sample - this jump is to keep the timing the same
```
Converting for the 3-bit example:
```sh
sox -v 0.95 BadAppleEn.ogg -t raw -u -b 8 -c 1 -r 21893 BadAppleEn.raw
```
Converting for the 2-bit example:
```sh
sox -v 0.95 BadAppleEn.ogg -t raw -u -b 8 -c 1 -r 23047 BadAppleEn.raw
```
Compressing:
```sh
./pcm2ambtc BadAppleEn.raw BadAppleEn.amb
```
Decompressing on the PC for a preview:
```sh
./ambtc2pcm BadAppleEn.amb BadAppleEn-preview.raw
```
Listening with mplayer:
```sh
mplayer -af volume=-5 -rawaudio samplesize=1:channels=1:rate=21893 -demuxer rawaudio BadAppleEn-preview.raw
```