Note that the 8 MB version works only with a Mega Everdrive or a custom flash cart supporting full 8 MB mapping (without SSF2-style bank switching).
Some special emulators can support it as well (such as this one).
Last edited by Stef on Wed Jun 21, 2017 8:47 am, edited 18 times in total.
Well, it's looking good - now you just have to get the speed up a bit. Nice job on the compression - getting it to look decent and fit in 4M is no easy task.
Speed is a bit better (not by much) but at least I fixed the last bugs. The video is no longer choppy and buggy in some places.
I am still in C, but even with ASM I will need some major optimizations to get things running at the correct speed.
Better speed but still far from what I need :p
I had to add a 128 KB lookup table to improve the 2 bpp to 4 bpp tile conversion speed... that hurts when you are so close to the 4 MB limit.
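To illustrate the lookup-table idea, here is a minimal sketch in C. It is not the actual 128 KB table from the demo (its layout is not described here); it assumes a much smaller 256-entry table that expands one byte holding four 2 bpp pixels into a 16-bit word holding four 4 bpp pixels, with the leftmost pixel in the top bits. The names `expand_lut`, `build_expand_lut` and `expand_row` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical table: 256 entries x 2 bytes = 512 bytes.
 * Index = 4 pixels at 2 bpp, value = the same 4 pixels at 4 bpp. */
static uint16_t expand_lut[256];
static int expand_lut_ready = 0;

/* Fill the table once at startup. */
static void build_expand_lut(void)
{
    for (unsigned b = 0; b < 256; b++) {
        uint16_t out = 0;
        for (int p = 0; p < 4; p++) {
            /* Pixel p sits in bits (7-2p)..(6-2p) of the source byte. */
            unsigned pix = (b >> (6 - 2 * p)) & 3u;
            out = (uint16_t)((out << 4) | pix);
        }
        expand_lut[b] = out;
    }
    expand_lut_ready = 1;
}

/* Expand one 8-pixel tile row: 16 bits of 2 bpp in, 32 bits of 4 bpp out.
 * Two table reads replace eight shift/mask steps. */
static uint32_t expand_row(uint16_t row2bpp)
{
    assert(expand_lut_ready);
    return ((uint32_t)expand_lut[row2bpp >> 8] << 16)
         | (uint32_t)expand_lut[row2bpp & 0xFFu];
}
```

After calling `build_expand_lut()`, `expand_row(0x1B1B)` gives `0x01230123` (pixels 0,1,2,3 twice). A bigger table indexed by more bits trades ROM space for fewer lookups per tile, which is the same space/speed trade-off mentioned above.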
I moved almost all of the tile unpack algorithm to ASM code.
Unfortunately that is still too slow :-/
I do not see much more room for big improvements now...
~600,000 of the 850,000 total tiles are packed with the dictionary method.
Unfortunately the dictionary unpack code is the most complex and slowest one: I believe that 20% to 70% of CPU time (depending on the frame complexity) is eaten by that code.
I profiled the time to unpack a single 2bpp tile with the dictionary method: 5 to 16 scanlines (close to 8000 cycles in the worst case)! And we can have 250 tiles to unpack per frame. I think I should find a simpler unpacking method :p
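To put those numbers in perspective, here is a rough back-of-the-envelope calculation in C. The constants are my assumptions, not figures from the posts: about 488 68000 cycles per scanline and 262 scanlines per frame (NTSC Mega Drive). The function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed NTSC Mega Drive timing (not taken from the posts). */
enum {
    CYCLES_PER_SCANLINE = 488,  /* approx. 68000 cycles per scanline */
    SCANLINES_PER_FRAME = 262   /* NTSC display frame */
};

/* CPU cycles spent unpacking `tiles` tiles at `scanlines_per_tile` each. */
static uint32_t unpack_cycles(uint32_t tiles, uint32_t scanlines_per_tile)
{
    /* Keep inputs small enough that the product fits in 32 bits. */
    assert(tiles <= 65535u && scanlines_per_tile <= 100u);
    return tiles * scanlines_per_tile * CYCLES_PER_SCANLINE;
}

/* CPU cycles available in one display frame. */
static uint32_t frame_budget_cycles(void)
{
    return (uint32_t)SCANLINES_PER_FRAME * CYCLES_PER_SCANLINE;
}
```

With these assumptions, one worst-case tile costs `unpack_cycles(1, 16)` = 7808 cycles, which matches the "close to 8000" figure. If all 250 tiles of a frame were dictionary tiles, that would be 610,000 cycles at 5 scanlines each and 1,952,000 cycles at 16 scanlines each, against a budget of 127,856 cycles per display frame, so roughly 4.8 to 15.3 display frames for a single video frame.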
The speed on the latest is actually rather good. While it's still slow on large changes, it's not THAT slow - just not real-time. There are packing schemes that are very fast at depacking... of course, the tradeoff is usually space. You won't know until you try.
Yeah, speed is much better in the latest version, basically because I moved all the "bitstream" code (reading a buffer bit by bit), which is used everywhere, to ASM, as well as the tile unpack code. Some other parts could be ported to ASM, but they are not the bottleneck so I don't bother with them...
As you said, the whole problem is finding the right trade-off between speed and space. I will try to find how I can simplify the compression code without sacrificing too much space... There are always ways to do better :p
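For readers wondering what the "bitstream" code looks like, here is a minimal sketch of a bit reader in C. It is not the code from the demo: the `BitReader` struct, the function names, the MSB-first bit order and the zero padding past the end of the buffer are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;  /* packed input */
    size_t size;          /* input length in bytes */
    size_t pos;           /* index of the next byte to load */
    uint32_t acc;         /* accumulator; unread bits are the low `count` bits */
    unsigned count;       /* number of unread bits in `acc` */
} BitReader;

static void br_init(BitReader *br, const uint8_t *data, size_t size)
{
    br->data = data;
    br->size = size;
    br->pos = 0;
    br->acc = 0;
    br->count = 0;
}

/* Read n bits (1..16) and return them as an unsigned value.
 * Reading past the end of the buffer yields zero bits (a choice
 * made for this sketch). */
static uint32_t br_read(BitReader *br, unsigned n)
{
    assert(n >= 1 && n <= 16);
    while (br->count < n) {
        uint32_t byte = (br->pos < br->size) ? br->data[br->pos++] : 0u;
        br->acc = (br->acc << 8) | byte;  /* refill one byte at a time */
        br->count += 8;
    }
    br->count -= n;
    return (br->acc >> br->count) & ((1u << n) - 1u);
}
```

On the bytes `0xA5, 0x3C`, reading 1, 3, 4 and then 8 bits returns 1, 2, 5 and `0x3C`. Since a routine like this is called for nearly every decoded symbol, it is easy to see why it is a good first candidate for hand-written ASM.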