CVSD Compression of Sounds
Posted: Fri Jun 26, 2009 8:32 pm
As part of my work on Wolf32X, I've been experimenting with various ways to handle the music. One option is to simply sample or synthesize the music to raw data, then compress it for storage in the ROM and decompress it on the fly during playback. The trick is to find a format that compresses the data enough to be worthwhile while still being simple enough to decompress in real time on the Z80 or 68000. Doing that AND keeping the noise down is a real chore.
MP3, Ogg Vorbis, and other such modern compression schemes are clearly out from the get-go. MP3 takes more power to decode than Ogg Vorbis, and even Vorbis takes a minimum of a 40-45 MHz ARM processor to decode in real time using specially coded versions of Tremor. One of the alternatives I've been looking at is CVSD (Continuously Variable Slope Delta modulation). This is a codec used primarily in voice communications, and it typically gives 8 to 1 compression.
We start with a raw stream of 16 bit mono samples at a certain sample rate, 11025 Hz for this example. You can generate these from a MIDI file using timidity; the exact command lines for this and the later steps are listed together further down.
You then compress that to 1 bit per sample using the CVSD algorithm. It seems that would yield 16:1 compression, but the noise is too great if you do that. What you do instead is use each sample twice, for an effective sample rate of 22050 samples per second. Each 16 bit input sample then costs 2 bits, which gives us 8:1 compression.
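The arithmetic behind the 8:1 figure can be checked with a couple of small helpers. This is only an illustration: the function names cvsd_ratio() and cvsd_bytes_per_second() and their parameters are made up here and are not part of the tools in this post.

```c
#include <assert.h>

/* Ratio of input bits to output bits for a 1-bit-per-step delta coder.
 * bits_per_sample:  width of each raw input sample (16 in this post).
 * steps_per_sample: how many 1-bit steps are spent on each input sample
 *                   (2 in this post, since every sample is encoded twice). */
static double cvsd_ratio(int bits_per_sample, int steps_per_sample)
{
    assert(bits_per_sample > 0 && steps_per_sample > 0);
    return (double)bits_per_sample / (double)steps_per_sample;
}

/* Compressed size in bytes per second of audio:
 * one output bit per step, eight steps per output byte. */
static double cvsd_bytes_per_second(int sample_rate, int steps_per_sample)
{
    assert(sample_rate > 0 && steps_per_sample > 0);
    return (double)sample_rate * (double)steps_per_sample / 8.0;
}
```

With the numbers used here, cvsd_ratio(16, 2) is 8.0 and cvsd_bytes_per_second(11025, 2) is 2756.25, against 22050 bytes per second for the raw 16 bit input.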
To get the audio back, you decompress the data. The decompressor app I provide generates raw unsigned 8 bit mono samples; its command line is listed further down with the other commands.
While the algorithm generates 16 bit samples, I convert those to 8 bit before outputting them, since that's the format they'd need to be in for use by the Genesis. This app is just so you can hear how the compressed data would sound once decompressed.
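The 16 bit to unsigned 8 bit conversion is small enough to show on its own. The sketch below is an assumption-labelled variant, not the exact expression from the decompressor: the name s16_to_u8() is hypothetical, and it adds the offset before shifting so the result does not depend on how a compiler shifts negative numbers.

```c
#include <assert.h>

/* Convert a signed 16-bit sample (-32768..32767) to an unsigned
 * 8-bit sample (0..255). Adding 32768 first makes the value
 * non-negative, so the shift by 8 simply keeps the top byte. */
static unsigned char s16_to_u8(int sample)
{
    assert(sample >= -32768 && sample <= 32767);
    return (unsigned char)((sample + 32768) >> 8);
}
```

Silence (0) maps to 128, the most negative sample to 0, and the most positive to 255.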
Let's look at the compressor, pcm2cvsd.c. The full listing is further down.
Notice that we read 4 samples, then use each of them twice in the compressor loop, so every 4 input samples produce one output byte. The compression is done in the function pcm2cvsd(). It takes an input sample and the bit position for the output bit. The compression is slightly different from standard CVSD algorithms; this is my own modification. Instead of changing the delta value after three or four 1s or 0s in a row, I double it whenever the current bit is the same as the previous one and halve it otherwise. It doesn't seem to affect the sound quality, and it's easier to code. The maxDelta variable is actually the most important one in this algorithm: too low and it filters out all the high frequencies; too high and you get excessive granular noise.
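For comparison, here is a rough sketch of the run-of-three style rule mentioned above, where the delta only grows after three identical bits in a row. It is a simplified illustration under stated assumptions: it reuses the same doubling/halving and the same 2..1024 limits as my code, whereas real CVSD codecs generally adapt the step more gradually. The names run3_state, run3_init() and run3_update() are hypothetical.

```c
#include <assert.h>

#define MIN_DELTA 2
#define MAX_DELTA 1024

/* State for the hypothetical run-of-three rule. */
struct run3_state {
    unsigned history;   /* last three bits, newest in bit 0 */
    int delta;          /* current step size */
};

static void run3_init(struct run3_state *s)
{
    assert(s != 0);
    s->history = 0x2u;  /* binary 010: no run in progress */
    s->delta = MIN_DELTA;
}

/* Feed one bit and return the step size to apply for it.
 * The step doubles only when the last three bits are all 1 or all 0,
 * and halves otherwise, clamped to MIN_DELTA..MAX_DELTA. */
static int run3_update(struct run3_state *s, int bit)
{
    assert(s != 0);
    s->history = ((s->history << 1) | (unsigned)(bit & 1)) & 0x7u;
    if (s->history == 0x7u || s->history == 0x0u) {
        s->delta <<= 1;
        if (s->delta > MAX_DELTA)
            s->delta = MAX_DELTA;
    } else {
        s->delta >>= 1;
        if (s->delta < MIN_DELTA)
            s->delta = MIN_DELTA;
    }
    return s->delta;
}
```

Compared with the same-as-previous test, this needs a small bit history instead of a single last_bit, which is the extra bookkeeping my version avoids.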
Now let's look at the decompressor, cvsd2pcm.c. Again, the full listing is further down.
Notice how it reads a byte, then outputs eight samples. Each bit in the stream is a separate sample decoded by cvsd2pcm(). This routine mirrors the compressor: it rebuilds the same delta from the bit stream and adds it to (or subtracts it from) the last value to give our output value. Note that I do a simple two value average, of the new value and the previous filtered output, to filter the granular noise a little. You could go with a more powerful filter if you have more CPU time available, or eliminate the filter for the quickest decoding possible. I then convert the 16 bit sample into an unsigned 8 bit sample.
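On the console the decoder would fill a buffer rather than write to a file. Here's a minimal sketch of that shape, assuming the same rules as cvsd2pcm() (first-bit inversion, doubling/halving delta, two value filter). The struct and the names cvsd_dec, cvsd_dec_init() and cvsd_dec_byte() are hypothetical, and the arithmetic is biased before shifting so it does not rely on shifting negative numbers.

```c
#include <assert.h>

#define MIN_DELTA 2
#define MAX_DELTA 1024

struct cvsd_dec {
    int started;        /* 0 until the first byte has been seen */
    int last_sample;    /* reconstructed 16-bit value */
    int last_bit;       /* previous bit in the stream */
    int delta;          /* current step size */
    int filter_sample;  /* previous filtered output */
};

static void cvsd_dec_init(struct cvsd_dec *d)
{
    assert(d != 0);
    d->started = 0;
    d->last_sample = 0;
    d->last_bit = 0;
    d->delta = 0;
    d->filter_sample = 0;
}

/* Decode one compressed byte, most significant bit first,
 * into eight unsigned 8-bit samples. */
static void cvsd_dec_byte(struct cvsd_dec *d, unsigned char in,
                          unsigned char out[8])
{
    int bit;
    assert(d != 0 && out != 0);
    if (!d->started) {
        d->started = 1;
        d->last_bit = ((in >> 7) & 1) ^ 1;  /* inverse of first bit */
    }
    for (bit = 7; bit >= 0; bit--) {
        int this_bit = (in >> bit) & 1;
        if (this_bit == d->last_bit) {
            d->delta <<= 1;
            if (d->delta > MAX_DELTA) d->delta = MAX_DELTA;
        } else {
            d->last_bit = this_bit;
            d->delta >>= 1;
            if (d->delta < MIN_DELTA) d->delta = MIN_DELTA;
        }
        d->last_sample += this_bit ? d->delta : -d->delta;
        if (d->last_sample > 32767) d->last_sample = 32767;
        if (d->last_sample < -32768) d->last_sample = -32768;
        /* average with the previous filtered output (rounds down) */
        d->filter_sample =
            ((d->last_sample + d->filter_sample + 65536) >> 1) - 32768;
        out[7 - bit] = (unsigned char)((d->filter_sample + 32768) >> 8);
    }
}
```

A playback routine would call cvsd_dec_init() once per tune and then cvsd_dec_byte() for each byte of the embedded data, copying the eight results into whatever buffer feeds the DAC.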
So how does it sound? Pretty decent. Not as clean as the original, obviously, but fairly good for 8:1 compression using such a simple algorithm. You can get all the files here:
cvsd.7z
Note that only the cvsd2pcm() function needs to be converted for use on the Genesis. The raw compressed data would be embedded in the ROM. RLE could also be applied to this for a little more compression with virtually no extra slowdown.
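To make the RLE idea concrete, here is a minimal sketch of a byte-level run-length coder for the compressed stream. The (count, value) pair format and the names rle_encode() and rle_decode() are assumptions for illustration, not something from cvsd.7z. It only pays off where bytes repeat; for example, with the compressor listed here, a stream that begins with digital silence encodes as repeated 0xAA bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Encode src[0..n) as (count, value) byte pairs, count in 1..255.
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t rle_encode(const unsigned char *src, size_t n,
                         unsigned char *dst, size_t cap)
{
    size_t i = 0, o = 0;
    assert(src != NULL && dst != NULL);
    while (i < n) {
        unsigned char v = src[i];
        size_t run = 1;
        while (i + run < n && src[i + run] == v && run < 255)
            run++;
        if (o + 2 > cap)
            return 0;
        dst[o++] = (unsigned char)run;
        dst[o++] = v;
        i += run;
    }
    return o;
}

/* Expand (count, value) pairs back into bytes.
 * Returns the number of bytes written, or 0 if dst is too small. */
static size_t rle_decode(const unsigned char *src, size_t n,
                         unsigned char *dst, size_t cap)
{
    size_t i, o = 0;
    assert(src != NULL && dst != NULL);
    for (i = 0; i + 1 < n; i += 2) {
        size_t run = src[i];
        if (o + run > cap)
            return 0;
        while (run--)
            dst[o++] = src[i + 1];
    }
    return o;
}
```

Note that this plain pair format doubles the size of data with no runs at all, so a practical version would probably want an escape for literal stretches. The decode loop itself is only a counter and a copy, which is why the extra cost at playback time is so small.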
Generate the raw 16 bit samples from the MIDI file:
timidity -Or1lM -o wolf_24.raw -s 11025 wolf_24.mid
Compress to CVSD:
./pcm2cvsd ./wsw_music/wolf_24.raw wolf_24.bin
Decompress to raw unsigned 8 bit samples:
./cvsd2pcm wolf_24.bin wolf_24.raw
pcm2cvsd.c
#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h> /* read(), write(), close() */
#ifndef O_BINARY
#define O_BINARY 0
#endif

static int first = 1;
static int last_sample; /* encoder's running estimate of the signal */
static int last_bit;    /* previous output bit */
static int delta;       /* current step size */

#define minDelta 2
#define maxDelta 1024
#define minValue -32768
#define maxValue 32767

/* Encode one sample as one bit, returned already shifted to position 'bit'. */
unsigned char pcm2cvsd(short int input, int bit)
{
    int this_bit = ((input - last_sample) >= 0) ? 1 : 0;

    if (this_bit == last_bit)
    {
        /* same direction as the previous bit: double the step */
        delta = delta << 1;
        if (delta > maxDelta)
            delta = maxDelta;
    }
    else
    {
        /* direction changed: halve the step */
        last_bit = this_bit;
        delta = delta >> 1;
        if (delta < minDelta)
            delta = minDelta;
    }

    /* update the estimate exactly as the decoder will */
    last_sample += (this_bit) ? delta : -delta;
    if (last_sample > maxValue)
        last_sample = maxValue;
    if (last_sample < minValue)
        last_sample = minValue;

    return (unsigned char)(this_bit << bit);
}

int main(int argc, char **argv)
{
    int cvsd_fd, pcm_fd;
    short int pcm_values[4];

    printf("PCM to CVSD converter 1.0\n");
    if (argc != 3)
    {
        printf("use '%s <pcmfile> <cvsdfile>' ", argv[0]);
        printf("to convert <pcmfile> to <cvsdfile>\n");
        return 1;
    }
    if ((pcm_fd = open(argv[1], O_RDONLY | O_BINARY)) == -1)
    {
        printf("Error opening %s\n", argv[1]);
        return 1;
    }
    if ((cvsd_fd = open(argv[2], O_CREAT | O_TRUNC | O_RDWR | O_BINARY, 0644)) == -1)
    {
        printf("Error creating %s\n", argv[2]);
        close(pcm_fd);
        return 1;
    }
    printf("PCM to CVSD conversion started\n");

    /* Samples are read in host byte order. The buffer is zeroed before each
       read so a short final block is padded with silence. read() returns 0
       at end of file and -1 on error; stop on either. */
    memset(pcm_values, 0, sizeof(pcm_values));
    while (read(pcm_fd, pcm_values, sizeof(pcm_values)) > 0)
    {
        unsigned char cvsd_value = 0;

        if (first)
        {
            first = 0;
            last_sample = 0;
            last_bit = (pcm_values[0] >= 0) ? 0 : 1; // make last_bit the inverse of the first bit in stream
        }
        /* each sample is encoded twice; 4 samples fill one byte, MSB first */
        cvsd_value |= pcm2cvsd(pcm_values[0], 7);
        cvsd_value |= pcm2cvsd(pcm_values[0], 6);
        cvsd_value |= pcm2cvsd(pcm_values[1], 5);
        cvsd_value |= pcm2cvsd(pcm_values[1], 4);
        cvsd_value |= pcm2cvsd(pcm_values[2], 3);
        cvsd_value |= pcm2cvsd(pcm_values[2], 2);
        cvsd_value |= pcm2cvsd(pcm_values[3], 1);
        cvsd_value |= pcm2cvsd(pcm_values[3], 0);
        write(cvsd_fd, &cvsd_value, 1);
        memset(pcm_values, 0, sizeof(pcm_values));
    }
    close(cvsd_fd);
    close(pcm_fd);
    printf("PCM to CVSD conversion completed\n");
    return 0;
}
cvsd2pcm.c
#include <string.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h> /* read(), write(), close() */
#ifndef O_BINARY
#define O_BINARY 0
#endif

static int first = 1;
static int last_sample;   /* reconstructed 16 bit value */
static int last_bit;      /* previous bit in the stream */
static int delta;         /* current step size */
static int filter_sample; /* previous filtered output */

#define minDelta 2
#define maxDelta 1024
#define minValue -32768
#define maxValue 32767

/* Decode bit number 'bit' of 'input' into one unsigned 8 bit sample. */
unsigned char cvsd2pcm(unsigned char input, int bit)
{
    int this_sample;
    int this_bit = (input >> bit) & 1;

    if (this_bit == last_bit)
    {
        /* same direction as the previous bit: double the step */
        delta = delta << 1;
        if (delta > maxDelta)
            delta = maxDelta;
    }
    else
    {
        /* direction changed: halve the step */
        last_bit = this_bit;
        delta = delta >> 1;
        if (delta < minDelta)
            delta = minDelta;
    }

    last_sample += (this_bit) ? delta : -delta;
    if (last_sample > maxValue)
        last_sample = maxValue;
    if (last_sample < minValue)
        last_sample = minValue;

    /* two value average of the new value and the previous filtered output;
       the shifts below assume arithmetic right shift of negative values */
    this_sample = (last_sample + filter_sample)>>1;
    filter_sample = this_sample;

    /* signed 16 bit -> unsigned 8 bit */
    return (unsigned char)((this_sample >> 8) + 128);
}

int main(int argc, char **argv)
{
    int cvsd_fd, pcm_fd;
    int bit;
    unsigned char cvsd_value;

    printf("CVSD to PCM converter 1.0\n");
    if (argc != 3)
    {
        printf("use '%s <cvsdfile> <pcmfile>' ", argv[0]);
        printf("to convert <cvsdfile> to <pcmfile>\n");
        return 1;
    }
    if ((cvsd_fd = open(argv[1], O_RDONLY | O_BINARY)) == -1)
    {
        printf("Error opening %s\n", argv[1]);
        return 1;
    }
    if ((pcm_fd = open(argv[2], O_CREAT | O_TRUNC | O_RDWR | O_BINARY, 0644)) == -1)
    {
        printf("Error creating %s\n", argv[2]);
        close(cvsd_fd);
        return 1;
    }
    printf("CVSD to PCM conversion started\n");

    /* read() returns 0 at end of file and -1 on error; stop on either */
    while (read(cvsd_fd, &cvsd_value, 1) > 0)
    {
        unsigned char pcm_value;

        if (first)
        {
            first = 0;
            last_sample = 0;
            last_bit = (~cvsd_value >> 7) & 1; // make last_bit the inverse of the first bit in stream
            filter_sample = 0;
        }
        /* one output sample per bit, most significant bit first */
        for (bit = 7; bit >= 0; bit--)
        {
            pcm_value = cvsd2pcm(cvsd_value, bit);
            write(pcm_fd, &pcm_value, 1);
        }
    }
    close(cvsd_fd);
    close(pcm_fd);
    printf("CVSD to PCM conversion completed\n");
    return 0;
}