Active Disassembly
Moderator: Nemesis
Hello all,
I've tried to run Mortal Kombat (rev 01) on the newest version of Exodus, 2.1, and found that it randomly locks up and does not even allow for a crash report. Additionally, I'm curious how to go about getting good, clean asm output from the disassembly. Are there documented steps for this? I'm looking to do some ROM hacking, and finding out where and how specific routines work would be great. However, I can't reliably run Exodus without it freezing, and any dumps I DO end up getting are around 166 MB, which doesn't seem right for a 2 MB ROM file. The system I'm running is a brand new, clean x64 install of Windows 8.1, fully up to date, with 8 GB of system RAM. That seems like plenty for what's needed.
Re: Active Disassembly
You are getting impossible results because code and data are all mixed together, and the plain disassembly you are using can't differentiate between them. Even worse, it can get confused between them.
There are two ways:
- Use an advanced disassembler like IDA Pro, which allows you to emulate and/or disassemble from a certain point. The problem is that, as far as I know, IDA can't emulate the full Megadrive system. Perhaps there are extensions, I don't know.
- Use an emulator with a built-in debugger, like Regen. This can help with the crash too. The problem is that its disassembly abilities are rather limited in scope in the long run.
So, use both.

What you have to end up with is functions/procedures that have a start point and an end point, and these pieces of binary must be disassembled; plus pieces of data that must be defined as data for the assembler.
For example, start with everything as data, then find those procedures and disassemble them one by one.
If you only want to find a piece of code that crashes, perhaps Regen alone is enough.
Edit: there can even be data inside a procedure, any kind of table for example, so you always need this interactive ability to start/stop disassembling.
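The "start with everything as data, then mark procedures one by one" workflow can be sketched as a simple code/data map. This is only an illustration in Python: the `RomMap` class and its method names are hypothetical and do not correspond to IDA Pro, Regen, or Exodus.

```python
class RomMap:
    """Tracks which ROM bytes are code; every byte starts out as data."""

    def __init__(self, size):
        self.size = size
        self.is_code = bytearray(size)  # 0 = data, 1 = code

    def _fill(self, start, end, flag):
        # Half-open range [start, end), checked against the ROM size.
        if not 0 <= start < end <= self.size:
            raise ValueError("range outside ROM")
        self.is_code[start:end] = bytes([flag]) * (end - start)

    def mark_code(self, start, end):
        """Mark [start, end) as a procedure to be disassembled."""
        self._fill(start, end, 1)

    def mark_data(self, start, end):
        """Mark [start, end) as data, e.g. a table inside a procedure."""
        self._fill(start, end, 0)

    def regions(self):
        """Yield (kind, start, end) runs in address order."""
        start = 0
        for addr in range(1, self.size + 1):
            if addr == self.size or self.is_code[addr] != self.is_code[start]:
                yield ("code" if self.is_code[start] else "data", start, addr)
                start = addr


rom_map = RomMap(16)
rom_map.mark_code(4, 8)   # a procedure found by hand
rom_map.mark_data(6, 7)   # a data byte embedded inside that procedure
for kind, start, end in rom_map.regions():
    print(f"{kind:4} {start:#06x}-{end:#06x}")
```

Because marking is interactive and reversible, a table discovered inside a procedure later on can simply be flipped back to data without redoing the rest of the map.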
			
			
									
						
HELP. Spanish TVs are brain washing people to be hostile to me.
			
ryanfaescotland
- Very interested
- Posts: 53
- Joined: Mon Feb 09, 2015 10:46 pm
Re: Active Disassembly
Miquel wrote: Fri Nov 09, 2018 2:37 pm
You are getting impossible results because there is code and data all mixed, and the plain disassembly you are using can't differentiate them, even worst can get confused between them...

Have you used the Active Disassembly feature of Exodus before? Doesn't seem like it...
Re: Active Disassembly
No, I haven't.
- How does it work? Perhaps it just marks the ROM as code or data as it emulates?
- Can the output be integrated into an asm file easily?
				ryanfaescotland
Re: Active Disassembly
That would be the nutshell version, yes, although it also generates the assembly for the entire game using prediction. I just created a disassembly of Fatal Labyrinth using the active disassembler, and with just a little tweaking of the end result I was able to build a bit-perfect copy of the original binary from the source code Exodus generated. This includes the separation of data and code (admittedly, I don't know whether it has done this perfectly or not!).
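For anyone wanting to repeat the bit-perfect check, a byte-for-byte comparison of the original ROM against the reassembled binary is enough. A minimal Python sketch follows; the function name and the file names in the usage note are hypothetical.

```python
def first_mismatch(original: bytes, rebuilt: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, rebuilt)):
        if a != b:
            return offset
    if len(original) != len(rebuilt):
        # One image is a prefix of the other: they differ where the shorter ends.
        return min(len(original), len(rebuilt))
    return None
```

Usage would be along the lines of reading both files with `open("original.bin", "rb").read()` and `open("rebuilt.bin", "rb").read()`, then looking up the reported offset in the generated source to see which region was assembled differently.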
			
			
									
						
										
						
Re: Active Disassembly
That's right, an intelligent disassembly is better than a raw one, even if there are mistakes. And that's why point-to-point disassembly helps a lot, since you can change it while you are reading.

Still, in an automatic disassembly there are occasions where it is not possible to know whether a piece of the ROM is code or data unless you emulate the code that uses it, and to reach certain points you need external events, which is pretty common in a video game and which an emulated disassembly can't reproduce properly.
One more thing: since the 68k uses byte and short (word) relative addressing quite often as a speed optimization, you can't always separate data and code, for example into two files.
Re: Active Disassembly
That's the key to the active disassembly feature in Exodus, Miquel: it does actually emulate the game, and it uses real-world results from executing the actual code to guide the disassembly process. You actually perform your disassembly by playing the game, while active disassembly runs in the background, gathering information on what's occurring at runtime. This includes mapping which areas of the ROM are actually code, beyond what static disassembly tools can do, such as following through interrupt handlers and data-dependent jump tables. It also builds information about how data is accessed, to predict data structures and arrays within the ROM, and to identify values that are used as offsets rather than simply data. It is also able to gather information over repeated lookups into traditionally opaque structures like jump tables and offset arrays, to predict their bounds and explore down branches that were not taken.

This is quite a powerful feature that I haven't seen a parallel for elsewhere. It was born out of my own experiences making a Sonic 2 disassembly years ago, which I did by modifying Gens to write the PC address at each opcode step to a log file. I fed that into IDA Pro to make a fairly comprehensive disassembly. It still took a month of cleanup after that, though, to fix offsets, explore additional code paths through jump tables, correctly identify and format data regions, and so on. Active disassembly nowadays can do that for me in a few hours. While trace logs comparable with what I did in Gens all those years back can be generated from other emulators, that's only one aspect of what active disassembly involves, although it's certainly the most important.
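The trace-log part of this can be illustrated with a small Python sketch. It assumes a log with one hexadecimal PC value per line, which is a made-up format rather than what Gens or Exodus actually write, and it is not how Exodus works internally. Executed instruction starts are grouped into runs; two starts are joined when they are no further apart than the longest 68000 instruction (10 bytes), which is a heuristic, since the log alone does not record instruction lengths.

```python
MAX_INSTRUCTION_BYTES = 10  # longest 68000 instruction encoding


def executed_addresses(log_lines):
    """Parse one hex PC value per line (assumed format) into a sorted list."""
    seen = set()
    for line in log_lines:
        text = line.strip()
        if text:
            seen.add(int(text, 16))
    return sorted(seen)


def code_runs(addresses, max_gap=MAX_INSTRUCTION_BYTES):
    """Group sorted instruction start addresses into (first, last) runs.

    Starts at most max_gap bytes apart are assumed to belong to the same
    stretch of code; a larger gap starts a new run.
    """
    runs = []
    for addr in addresses:
        if runs and addr - runs[-1][1] <= max_gap:
            runs[-1][1] = addr
        else:
            runs.append([addr, addr])
    return [tuple(run) for run in runs]


log = ["000200", "000202", "000206", "000200", "", "001000"]
for first, last in code_runs(executed_addresses(log)):
    print(f"code seen from {first:#08x} to {last:#08x}")
```

The resulting runs are only seeds: they say where execution was observed, and anything outside them still has to be classified by prediction or by hand.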
			
			
									
						
										
				ryanfaescotland
Re: Active Disassembly
Do you think it could be extended to the point of simulating input, Nemesis?
My thinking is that there is a fixed set of values you'd expect each system to give as input at each stage. Sticking with the Megadrive, you've got the typical 6-button pad input. Could there be value in simulating different input on multiple iterations of the code to see what paths are taken?
Re: Active Disassembly
That probably won't yield very much, as the active disassembly feature can already scan down known code execution pathways even if they aren't actually seen being taken while the game is running. Most opcodes have known potential resulting addresses, including your basic conditional branch instructions, which is how I expect input would generally be processed. Those pathways are already scanned down as predicted code addresses. It's really your blind data-dependent jumps that cause problems, i.e. "read a value from memory address X, and jump to code location X+Y". I wouldn't normally expect to see that kind of code in connection with input data, but even if you did, once a few entries in a jump table like that have been explored, active disassembly usually has enough information to start estimating the table bounds and predicting other entries.
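The difference between branches with known destinations and blind jumps can be shown with a toy traversal. The Python sketch below uses a made-up instruction table (address mapped to kind, size and target) instead of real 68k decoding, and it is not the Exodus implementation: it follows both outcomes of every conditional branch and records indirect jumps as places where static prediction has to stop.

```python
def reachable(program, entry):
    """Walk a toy program from entry and return (visited, blind_jumps).

    program maps address -> (kind, size, target), where kind is one of
    "plain", "branch" (conditional), "jump", "indirect" or "return".
    """
    visited = set()
    blind_jumps = set()
    work = [entry]
    while work:
        pc = work.pop()
        if pc in visited or pc not in program:
            continue
        visited.add(pc)
        kind, size, target = program[pc]
        if kind == "plain":
            work.append(pc + size)
        elif kind == "branch":
            # Both outcomes are known without running the code.
            work.append(pc + size)
            work.append(target)
        elif kind == "jump":
            work.append(target)
        elif kind == "indirect":
            # Destination depends on runtime data: a brick wall for prediction.
            blind_jumps.add(pc)
        # "return" ends the path; there is nothing further to follow.
    return visited, blind_jumps


toy = {
    0: ("plain", 2, None),
    2: ("branch", 2, 8),
    4: ("indirect", 2, None),
    6: ("plain", 2, None),   # only reachable through the indirect jump
    8: ("return", 2, None),
}
print(reachable(toy, 0))
```

In this toy program address 6 is never visited, because the only way to reach it is through the indirect jump at address 4.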
			
			
									
						
										
Re: Active Disassembly
I have tested Exodus these past few days and I stand by what I said: predictive disassembly, while much better than plain disassembly, still leaves holes of unknown data here and there. I verified this with my own game, so there are no maybes.
Active disassembly is pretty much perfect, but you need to run all the code, which means playing through to the last detail. Point-to-point disassembly can help here.
Still, when code executes from RAM there are some problems; it seems to lose track sometimes.
				ryanfaescotland
Re: Active Disassembly
Can you show some examples of the holes? It would be interesting and beneficial for everyone to see the potential pitfalls we may face.
Re: Active Disassembly
Branch instructions are well behaved since both destinations are easily known, but that is not necessarily true of jump instructions. A textbook example is a C “switch”, which works more or less like this:
d0 <= switch value
add.w d0, d0
move.w cases(pc,d0.w), d0
jmp cases(pc,d0.w)
cases:
.word case1-cases
.word case2-cases
.word case3-cases
case1:
case2:
case3:
Note that there are plenty of variations; this is only the most standard one.
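To make the lookup in the switch example concrete, here is a small Python sketch of the address arithmetic it performs: the entry is a signed 16-bit big-endian offset stored at `cases + 2 * index`, and the jump lands at `cases` plus that offset. The function name and the tiny ROM image are invented for illustration.

```python
import struct


def table_target(rom: bytes, table_addr: int, index: int) -> int:
    """Resolve one entry of a table of word offsets relative to table_addr."""
    # ">h" is a big-endian signed 16-bit value, matching a sign-extended d0.w.
    (offset,) = struct.unpack_from(">h", rom, table_addr + 2 * index)
    return table_addr + offset


# Hypothetical image: 16 bytes of padding, a 3-entry table at 0x10, then code.
rom = bytes(0x10) + struct.pack(">3h", 6, 10, 14) + bytes(16)
for case in range(3):
    print(f"case {case} -> {table_target(rom, 0x10, case):#06x}")
```

A static disassembler sees only the `jmp` and the bytes of the table; without knowing the table address, the entry size and the entry count, it cannot tell these three words apart from instructions.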
			
			
									
						
Re: Active Disassembly
Those kinds of structures are the ones that Active Disassembly in Exodus is designed to handle. You will need to follow a jump table like that through at least once before Exodus can even attempt to scan past it, but if you have a table with, say, 20 entries, once you've hit around 5 of them I'd expect the full 20 to be successfully mapped out. Larger tables are more favourable, so 200 entries still might only take 5 or 10. It may take more or less (probably less), but Exodus has specific code designed to detect and analyze these kinds of structures, as well as variations and more complex ones, such as raw jump tables where you jump to another jump operation, or cases where you have a data structure and the offset is a field in that structure, so the offsets aren't tightly packed. All that kind of stuff is handled.

It won't be perfect, but it works very well in practice, and the more you let the emulator traverse the various execution pathways, the more accurate it becomes. You can save and load your progress building the active disassembly info, and you can run the emulator without throttling to make it faster. Basically, I just enable active disassembly, turn throttling off, and bash the game/program for a while, doing things I expect to explore as many logic pathways as possible. That'll do in 10 minutes or an hour what would take you weeks without it.

If you hit a raw jump like that though, it's a brick wall unless you've explored at least one output entry. Once you give Exodus that "anchor" of a known resulting target, it has context to start scanning out from, and it will do that. It won't blindly guess though; you need to hit that table at least once.
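As a rough illustration of how a table's bounds might be guessed from a few observed entries, here is a Python sketch of one simple heuristic. It is explicitly not the algorithm Exodus uses, which is not described in this thread. It assumes a tightly packed table of signed 16-bit offsets relative to the table start, with the target code located after the table, so the lowest target seen so far caps how far the table can extend. At least one observed entry is required as the anchor.

```python
import struct


def estimate_table(rom: bytes, table_addr: int, observed_indices):
    """Guess all targets of an offset table from a few observed entries."""
    if not observed_indices:
        raise ValueError("need at least one observed entry as an anchor")

    def target(index):
        (offset,) = struct.unpack_from(">h", rom, table_addr + 2 * index)
        return table_addr + offset

    # The table cannot run past the lowest code address it is known to reach.
    limit = min(min(target(i) for i in observed_indices), len(rom))
    targets = []
    index = 0
    while table_addr + 2 * (index + 1) <= limit:
        candidate = target(index)
        entry_end = table_addr + 2 * (index + 1)
        if candidate % 2 or not entry_end <= candidate < len(rom):
            break  # odd or out-of-range address: assume the table has ended
        limit = min(limit, candidate)
        targets.append(candidate)
        index += 1
    return targets


# Hypothetical image: a 4-entry table at 0x10, followed by 16 bytes of code.
rom = bytes(0x10) + struct.pack(">4h", 8, 12, 16, 20) + bytes.fromhex("4e75") * 8
print([hex(t) for t in estimate_table(rom, 0x10, [2])])
```

With only entry 2 observed, the sketch still recovers all four targets in this image, because the first entry it decodes points at the byte right after the table and fixes the table's end. Structures where the offsets are fields inside larger records, or where targets precede the table, would need a different approach.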
			
			
									
						
										
Re: Active Disassembly
Agreed. I believe I said that the holes I mentioned occur when predictive disassembly kicks in.