Sega Megadrive Portable

For hardware talk only (please avoid ROM dumper stuff)
Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Aug 20, 2007 5:53 pm

Shiru wrote:For small devices like today's PDAs, smartphones, and current-gen handhelds, assembler is not the solution. These devices often have a relatively fast RISC CPU (like ARM) with really slow memory, so it is very hard to make hand-written code run faster than what today's optimizing C compilers generate. They still generate a bunch of crap, but that crap can run (if written specifically for the given architecture) as fast as your beautiful and perfect code, because in the end everything is limited by memory speed. The only advantage of writing assembly for these devices is that you can always write much shorter code than the compiler generates.

So there is no problem with C/C++ itself; the problem is more about understanding the architecture of small devices and optimizing algorithms and their C/C++ code for RISC CPUs and slow memory.
Have you tried Genesis Plus on your 400 MHz PDA? It is pure C, but the code is good :)

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm

Post by Eke » Mon Aug 20, 2007 8:16 pm

Yes, I can confirm this: if you manage to understand how the rendering code works (mostly in render.c), you will know what smart code design really means :)

Charles MacDonald has done really awesome work with this emulator, and even though it's coded in pure C, the rendering code, for example, is damn fast, with lots of precalculated lookup tables and coding tricks.
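To give a rough idea of what a precalculated lookup table in rendering code can look like, here is a minimal sketch in C. It is not taken from Genesis Plus: the names `expand_lut`, `build_expand_lut` and `draw_tile_row` are made up for illustration, and the pixel layout (two 4-bit pixels per byte, left pixel in the high nibble, palette offset OR-ed in) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: one packed byte (two 4-bit pixels) -> two 8-bit pixels.
   Names and layout are illustrative, not copied from any emulator. */
static uint8_t expand_lut[256][2];

/* Fill the table once at start-up. */
void build_expand_lut(void)
{
    for (int b = 0; b < 256; b++) {
        expand_lut[b][0] = (uint8_t)(b >> 4);   /* left pixel: high nibble */
        expand_lut[b][1] = (uint8_t)(b & 0x0F); /* right pixel: low nibble */
    }
}

/* Expand one 8-pixel tile row (4 packed bytes) into 8 output pixels,
   OR-ing in a palette offset such as 0x10, 0x20 or 0x30. */
void draw_tile_row(const uint8_t src[4], uint8_t palette, uint8_t dst[8])
{
    assert((palette & 0x0F) == 0); /* offset must not clobber the colour index */
    for (size_t i = 0; i < 4; i++) {
        const uint8_t *px = expand_lut[src[i]];
        dst[2 * i]     = (uint8_t)(px[0] | palette);
        dst[2 * i + 1] = (uint8_t)(px[1] | palette);
    }
}
```

Call `build_expand_lut()` once, then `draw_tile_row()` for every tile row. The per-pixel shifts and masks are paid once when the table is built instead of on every frame.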

The only downsides might come from the 68k and YM2612 (and maybe Z80) C cores, which came from MAME; they are not really optimized and could perhaps slow down emulation on some platforms, I don't know...

But on the GameCube (~500 MHz PowerPC CPU), I have every Genesis game running at full speed :wink:


PS: does anyone know about the Radica? According to devster (http://devster.monkeeh.com/sega/radica/) it's something like a Genesis on a single chip... I really wonder how they did this. Is it some kind of multi-chip "emulation" on an FPGA?

evildragon
Very interested
Posts: 326
Joined: Mon Mar 12, 2007 1:53 am

Post by evildragon » Mon Aug 20, 2007 9:48 pm

no, it's exactly like the NES on a chip... (which i want so badly! but don't have the money :( )

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Mon Aug 20, 2007 10:24 pm

The problem with using C/C++ vs assembly is register usage. For best speed, you need to keep certain things in registers as much as possible as the memory is SLOW (as you mentioned yourself). With C/C++ emulators, the emulated state is always in memory. Because of this, I've yet to see a C/C++ emulator even reach half the speed of an emulator done in assembly, despite optimizing compilers. When you choose what to keep in registers correctly, even poor assembly is VASTLY superior to the best C code for emulations.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Tue Aug 21, 2007 5:39 pm

Chilly Willy wrote:The problem with using C/C++ vs assembly is register usage. For best speed, you need to keep certain things in registers as much as possible as the memory is SLOW (as you mentioned yourself). With C/C++ emulators, the emulated state is always in memory. Because of this, I've yet to see a C/C++ emulator even reach half the speed of an emulator done in assembly, despite optimizing compilers. When you choose what to keep in registers correctly, even poor assembly is VASTLY superior to the best C code for emulations.
A good C compiler knows how to optimize register usage. Of course it can't do as well as you can with assembler code, but it is able to keep certain values in registers when you use them often (you need good C code too) =)

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Tue Aug 21, 2007 9:12 pm

Stef wrote:
A good C compiler knows how to optimize register usage. Of course it can't do as well as you can with assembler code, but it is able to keep certain values in registers when you use them often (you need good C code too) =)
The difference is in GLOBAL register usage. That is particularly important in emulations. The emulator writer can pick what parts to keep in registers across the entire app, while compilers are still fairly poor in global optimizations.

If I wanted local optimizations, I'd probably compile the function in C and look at the generated code. It would probably be better than you could do by hand. By the same token, the compiler wouldn't be able to determine what would be best to keep in registers across the entire program. Part of that problem has to do with the design of C/C++ emulators. The authors generally define an emulated machine state table which is allocated as one big memory block. Since some of the things that should be kept in registers are just one small piece of this big state structure, the compiler doesn't realize it has higher significance than the rest of the state structure and therefore leaves it in memory most of the time.

Maybe if the compiler had some way to define a priority of parts of a structure so that the optimization pass can focus on those areas with higher priority than the rest, maybe then C/C++ emulations would approach the speed pure assembly gives. The "register" keyword is too limiting on most compilers - it tends to reserve the register COMPLETELY for the scope of the variable specified. Again, this is something an assembly program can deal with better.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Wed Aug 22, 2007 4:33 pm

Chilly Willy wrote:
The difference is in GLOBAL register usage. That is particularly important in emulations. The emulator writer can pick what parts to keep in registers across the entire app, while compilers are still fairly poor in global optimizations.

If I wanted local optimizations, I'd probably compile the function in C and look at the generated code. It would probably be better than you could do by hand. By the same token, the compiler wouldn't be able to determine what would be best to keep in registers across the entire program. Part of that problem has to do with the design of C/C++ emulators. The authors generally define an emulated machine state table which is allocated as one big memory block. Since some of the things that should be kept in registers are just one small piece of this big state structure, the compiler doesn't realize it has higher significance than the rest of the state structure and therefore leaves it in memory most of the time.

Maybe if the compiler had some way to define a priority of parts of a structure so that the optimization pass can focus on those areas with higher priority than the rest, maybe then C/C++ emulations would approach the speed pure assembly gives. The "register" keyword is too limiting on most compilers - it tends to reserve the register COMPLETELY for the scope of the variable specified. Again, this is something an assembly program can deal with better.
Yep, I understand your problem with global register allocation. You have no choice: you have to localise your variables. That's what I did with C68K; the main function is huge because of that ;)
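As a rough illustration of what "localising" means here, the sketch below copies the hot fields of a state structure into local variables at the top of the execution loop and writes them back once at the end. This is not the C68K source: the structure `cpu_state`, the three toy opcodes and the cycle costs are all invented for the example. The idea is that locals whose address is never taken are much easier for a compiler to keep in registers than fields of a big structure reached through a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical machine state; a real core has many more fields. */
typedef struct {
    uint32_t pc;         /* program counter */
    uint32_t acc;        /* a single accumulator register */
    int32_t  cycles;     /* cycles left in this time slice */
    uint8_t  other[256]; /* stand-in for the rest of the state */
} cpu_state;

/* Toy opcodes, invented for this example. */
enum { OP_INC = 0, OP_DBL = 1, OP_HALT = 2 };

/* Run until the cycle budget is spent, the program ends, or OP_HALT is fetched. */
void cpu_run(cpu_state *cpu, const uint8_t *program, size_t size)
{
    assert(cpu != NULL && program != NULL);

    /* Copy the hot fields into locals so the compiler can hold them in registers. */
    uint32_t pc = cpu->pc;
    uint32_t acc = cpu->acc;
    int32_t cycles = cpu->cycles;

    while (cycles > 0 && pc < size) {
        uint8_t op = program[pc++];
        if (op == OP_HALT)
            break;
        switch (op) {
        case OP_INC: acc += 1; cycles -= 1; break;
        case OP_DBL: acc *= 2; cycles -= 2; break;
        default:     cycles -= 1; break; /* unknown opcode: treat as a no-op */
        }
    }

    /* Write the locals back to the state structure once, on exit. */
    cpu->pc = pc;
    cpu->acc = acc;
    cpu->cycles = cycles;
}
```

With every opcode handled inside the one `switch`, the whole core ends up as a single function, which is why such a function gets huge in a real emulator.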

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Wed Aug 22, 2007 7:56 pm

Stef wrote:
Yep, I understand your problem with global register allocation. You have no choice: you have to localise your variables. That's what I did with C68K; the main function is huge because of that ;)
Ah - make it one big local function and the optimizer SHOULD make it fast. Yes, that would make for a HUGE function for most emulators. :shock: :D

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City

Post by TmEE co.(TM) » Thu Aug 23, 2007 11:59 am

Why not squeeze some ASM into the program? You don't need much to gain an incredible speed increase... 100 lines of ASM in my QB program made it 200% faster....
What are you reading? You won't understand it anyway ;)
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Thu Aug 23, 2007 5:12 pm

ASM isn't portable. I use ASM only for non-portable projects and when it is really needed ;)

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Thu Aug 23, 2007 8:32 pm

Stef wrote:ASM isn't portable. I use ASM only for non-portable projects and when it is really needed ;)
Yes, that's the KEY point for using C. Assembly should be reserved for those times when speed is the factor above all else... or maybe when FUN is the factor above all else. :lol:

I've been dumping small example apps from the PSP SDK (via psp-objdump) to get an idea of how you call the libraries and set up the system stuff from assembly. My goal is to make a SEGA Genesis/CD/32X emulation for the PSP. Such a beast cannot possibly be done in C/C++, no matter how good. It IS on the edge of possibility for pure assembly. In this case, portability is of no consequence - it's strictly a PSP-only project. SEGA Genesis/CD is fairly easy for assembly - it's squeezing in the 32X that is pushing it. Fortunately, the extra hardware the 32X adds is simple, and the extra CPUs are pretty straightforward RISC chips. If I shuffle all the non-CPU stuff to the MediaEngine, the main CPU should just barely squeak by emulating the CPUs.

Eke
Very interested
Posts: 885
Joined: Wed Feb 28, 2007 2:57 pm

Post by Eke » Fri Aug 24, 2007 8:33 am

While doing some googling, I found this page (all in Japanese, I can't really read it): http://psp.nukenin.jp/index_E.html

It mentions a port of DGen for the PSP:
it is called Segadrive, it seems to be different from the other DGen port, and it is not mentioned on any "western" website.

http://psp.nukenin.jp/HTM/index_SEGA_DRIVE.html

By the same guy, there is also a port of Picodrive for PSP (also a Genesis emulator, originally made by Dave and now maintained by Notaz, who made the GP2X version and added Sega CD support).

http://psp.nukenin.jp/HTM/index_PICO_DRIVE.html

The dev blog, http://ameblo.jp/pspdevblog/, also seems to gather a lot of technical info (you can see many code excerpts; it seems this guy has done some work to improve DGen's original code). Too bad I can't read Japanese (there seem to be some thoughts about FM emulation).


Following some links on this Japanese website, I found that there is also a port of Genesis Plus for PSP! As always, I found no western website that mentions these ports.

http://psp.nukenin.jp/HTM/osakana_mirror.html


And there is also a modified version of Gens for Windows:
Gens2.14Souvenir clone few "ASM" to "C" converting feautures, and with the Reality sound.
(the SEGA VDP's real the Triangultic rectangular wave's. but 76496 are Rectangular wave's. not compatibles.)
(do real machine from 315-5313 95pin sampling analysis it things)
This is megadrive/genesis only(without 32X, without MEGA/SEGA CD).(2006 Oct-1st)
I wonder what he means about the PSG waves :idea:

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Fri Aug 24, 2007 6:41 pm

The main page for PSP DGen is: http://syn-k.sakura.ne.jp/dgen_psp/

Thanks for the links - I'll check out some of these other emulators.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sat Aug 25, 2007 10:58 am

Eke wrote:...and there is also a modified version of Gens for windows :
Gens2.14Souvenir clone few "ASM" to "C" converting feautures, and with the Reality sound.
(the SEGA VDP's real the Triangultic rectangular wave's. but 76496 are Rectangular wave's. not compatibles.)
(do real machine from 315-5313 95pin sampling analysis it things)
This is megadrive/genesis only(without 32X, without MEGA/SEGA CD).(2006 Oct-1st)
I wonder what he means about the PSG waves :idea:
Triangle wave?!! That doesn't seem realistic to me... if the VDP's PSG were really generating a triangle wave, the sound would be quite different.
I know the PSG doesn't generate perfect square waves; they look more like this:

 ___                 ___ 
|   ---___          |   ---___
|         |_________|         |___
but that's still far from triangle waves ;)
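For anyone who wants to play with that shape numerically, here is a small C sketch that generates a square wave whose high level sags while the output is high and whose low level stays flat, like the drawing. The exponential `decay` factor is purely an assumption chosen to mimic the picture; it is not a measurement of real hardware, and the function name `drooping_square` is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Fill out[0..n-1] with a square wave of the given half-period (in samples).
   The high half starts at 1.0 and is multiplied by `decay` every sample;
   the low half is a flat 0.0. The decay model is an illustrative assumption. */
void drooping_square(double *out, size_t n, size_t half_period, double decay)
{
    assert(out != NULL && half_period > 0 && decay > 0.0 && decay <= 1.0);
    double level = 1.0;
    for (size_t i = 0; i < n; i++) {
        size_t phase = i % (2 * half_period);
        if (phase == 0)
            level = 1.0;          /* rising edge: back to full height */
        if (phase < half_period) {
            out[i] = level;       /* high half: sagging top */
            level *= decay;
        } else {
            out[i] = 0.0;         /* low half: flat */
        }
    }
}
```

With `decay` set to 1.0 this degenerates to a perfect square wave, so the two shapes are easy to compare. Even with a strong sag the edges stay vertical, which a triangle wave would not have.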

Shiru
Very interested
Posts: 786
Joined: Sat Apr 07, 2007 3:11 am
Location: Russia, Moscow

Post by Shiru » Sat Aug 25, 2007 9:29 pm

Maybe it would be better to split the thread into a discussion about the MDP and a discussion about SMD emulation on handheld devices?

Stef wrote:Have you tried Genesis Plus on your 400 MHz PDA? It is pure C, but the code is good :)
I tried it today, and it works surprisingly well. Not ideal, of course (as I understand it, the author did not optimize the code for PocketPC): there is noticeable but acceptable frameskip, and the sound is good, at least. Unfortunately, this emulator does not show FPS, and I was not able to edit the config file on the device during the test, so I can't do any benchmarks. I can only say that v1.3 lets you choose between different Z80 and M68K cores; the MAME M68K core is much slower (games become unplayable), and changing the Z80 core makes no noticeable difference.


Eke wrote:Charles MacDonald has done really awesome work with this emulator, and even though it's coded in pure C, the rendering code, for example, is damn fast, with lots of precalculated lookup tables and coding tricks.
I believe that, but I must note that lookup tables which work very fast on x86 PCs can slow code down on systems like PocketPC: the tables just do not fit in the smaller cache (on systems with slow memory, real computation is sometimes faster than precalculated tables).
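To make the trade-off concrete, here is a sketch in C of the same operation written both ways: a layer merge that picks the foreground pixel unless it is transparent. The pixel format (low 4 bits are the colour index, 0 means transparent) and all names are assumptions made for this example, not code from any emulator. The table version needs 64 KiB, which may not fit in a small data cache, while the computed version touches no memory at all.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pixel format: low 4 bits = colour index, 0 means transparent. */

/* Computed version: a couple of ALU operations, no table access. */
static inline uint8_t merge_computed(uint8_t bg, uint8_t fg)
{
    return (fg & 0x0F) ? fg : bg;
}

/* Table version: 64 KiB, indexed by (bg << 8) | fg. */
static uint8_t merge_lut[0x10000];

/* Fill the table once at start-up, using the computed version as reference. */
void build_merge_lut(void)
{
    for (uint32_t i = 0; i < 0x10000; i++)
        merge_lut[i] = merge_computed((uint8_t)(i >> 8), (uint8_t)i);
}

static inline uint8_t merge_table(uint8_t bg, uint8_t fg)
{
    return merge_lut[((uint32_t)bg << 8) | fg];
}

/* Sanity check: both versions must agree for every input pair. */
void check_merge_versions(void)
{
    for (unsigned bg = 0; bg < 256; bg++)
        for (unsigned fg = 0; fg < 256; fg++)
            assert(merge_table((uint8_t)bg, (uint8_t)fg) ==
                   merge_computed((uint8_t)bg, (uint8_t)fg));
}
```

Which version is faster depends on the cache size and memory speed of the target, so it has to be measured on the actual device rather than assumed from x86 results.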
Eke wrote:The only downsides might come from the 68k and YM2612 (and maybe Z80) C cores, which came from MAME; they are not really optimized and could perhaps slow down emulation on some platforms, I don't know...
As I said in this post, they slow emulation down a lot. And I would say these cores are sometimes not 'not really optimized' but 'really not optimized'. That's not bad, of course, because these cores are designed to work very precisely, not to work fast.
Eke wrote:But on the GameCube (~500 MHz PowerPC CPU), I have every Genesis game running at full speed
I think the GameCube does not have slow memory like portable devices have, and it also has a bigger cache.
Eke wrote:PS: does anyone know about the Radica? According to devster (http://devster.monkeeh.com/sega/radica/) it's something like a Genesis on a single chip... I really wonder how they did this. Is it some kind of multi-chip "emulation" on an FPGA?
I don't know about the Radica (never seen one), but all SMD clones in my country from the early 2000s are built on one or two big chips. And all NES clones here from the mid-1990s are one-chip; only in the early 1990s were there sometimes multi-chip clones, with external memory in DIP packages, or built entirely from many discrete DIP chips (including CPU and PPU).


Chilly Willy wrote:The problem with using C/C++ vs assembly is register usage. For best speed, you need to keep certain things in registers as much as possible as the memory is SLOW (as you mentioned yourself). With C/C++ emulators, the emulated state is always in memory. Because of this, I've yet to see a C/C++ emulator even reach half the speed of an emulator done in assembly, despite optimizing compilers. When you choose what to keep in registers correctly, even poor assembly is VASTLY superior to the best C code for emulations.
I'm not saying that your point is wrong, I'm just explaining mine. If we have slow memory, then to make code faster we generally must minimize memory accesses. Keeping the most-used variables in registers is just a particular case of that. But we can't fit all the needed variables in registers anyway, and other methods exist. We have the cache, we can temporarily copy global variables into register variables for the duration of a loop, etc. And once we minimize memory access in C code, we hit the same bottleneck as in equivalent assembly code: for loops, it's usually reading the input data stream from slow memory.


Stef wrote:
Gens2.14Souvenir clone few "ASM" to "C" converting feautures, and with the Reality sound.
(the SEGA VDP's real the Triangultic rectangular wave's. but 76496 are Rectangular wave's. not compatibles.)
triangle wave ?!!
It says triangultic rectangular, not just triangle. That could mean something like what you drew: a rectangular wave smoothed by an output filter.
