Assemblers You Use

Ask anything you want about Megadrive/Genesis programming.

Moderator: BigEvilCorporation

bgvanbur
Interested
Posts: 46
Joined: Fri Jun 22, 2012 11:13 pm

Post by bgvanbur » Fri Apr 25, 2014 9:13 pm

I am working on a 68k/z80 disassembler. The main goal is to make the output assemble. It also has features like code path tracing and can use the address list output from blastem and address pointers in the cartridge ROM header as a list of entry code points. One of the annoying issues is to make it work on several assemblers but I want the output code to be as portable as possible.
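
A rough Python sketch of the ROM-header part of that idea, purely illustrative and not the disassembler's actual code: on the 68000 the first 256 bytes hold 64 longword exception vectors (vector 0 is the initial stack pointer, vector 1 the reset PC), so the remaining vectors can seed the list of code entry points. The function name and the filtering rules (word aligned, inside the ROM, past the vector table) are assumptions made for the example.

```python
import struct

VECTOR_COUNT = 64      # 68000 exception vector table: 64 longwords at address 0
RESET_PC_INDEX = 1     # vector 0 is the initial stack pointer, vector 1 the reset PC


def header_entry_points(rom):
    """Return sorted unique vector targets that could be code inside the ROM."""
    vectors = struct.unpack_from(">%dL" % VECTOR_COUNT, rom, 0)
    points = set()
    for index, target in enumerate(vectors):
        if index < RESET_PC_INDEX:
            continue                          # the initial SP is not a code address
        inside_rom = VECTOR_COUNT * 4 <= target < len(rom)
        if target % 2 == 0 and inside_rom:    # 68k code must be word aligned
            points.add(target)
    return sorted(points)
```

Vectors that point into RAM or are left at zero are simply dropped here; a real tool might want to report them instead.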

Currently I support:
asl (68k and z80)
asmx (68k and z80)
asm68k
SNASM68K
pasmo (z80)
z80asm

Does anyone else use another 68k or z80 assembler with a similar format to these assemblers?

powerofrecall
Very interested
Posts: 237
Joined: Fri Apr 17, 2009 7:35 pm
Location: USA

Post by powerofrecall » Fri Apr 25, 2014 10:21 pm

vasm with the motorola syntax module for 68k, vasm with "old style" syntax for z80

Edit: I think both of these assemblers are fairly generic with those syntax modules. I know the Motorola module accepts all the pretty standard 68k syntax & pseudo-ops, so you might not have to go out of your way to support them.

bgvanbur
Interested
Posts: 46
Joined: Fri Jun 22, 2012 11:13 pm

Post by bgvanbur » Sat Apr 26, 2014 11:09 am

Thanks for the vasm recommendation!

I was able to get vasm's 68k output to match once I figured out -no-opt. It has a lot of cool optimizations, though, for when you don't need to perfectly match unoptimized code, which is making me reconsider my default 68k assembler. The z80 side is being annoying, but hopefully I can get that to work too.

kubilus1
Very interested
Posts: 237
Joined: Thu Aug 16, 2012 2:25 am

Post by kubilus1 » Sat Apr 26, 2014 3:37 pm

I use the GNU assembler (as).

Gigasoft
Very interested
Posts: 95
Joined: Fri Jan 01, 2010 2:24 am

Post by Gigasoft » Sat Apr 26, 2014 8:48 pm

I use TASM (Telemark Assembler) for Z80.

Generating an output that can be modified and then reassembled is extremely challenging, due to the need to locate all pointers. They can be in the immediate operand of an instruction, they could be in a table, they could be structure members in a table of structures, or worse. Then, some pointers may require a displacement to determine what they are really referring to. Furthermore, if they are code pointers, the code they point to should be analyzed. IDA Pro, a popular product that costs thousands of dollars, still isn't very good at guessing and leaves most of the work to the user. I'd really love to have a disassembler that would actually make an effort to save me some time by tracing unknown pointer types across function calls and generating cross references to structure offsets in the correct structure automatically, like IDA Pro already does with absolute references. Then it could use this information to automatically assign correct types to values that are assigned to structure fields.
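
To give a feel for why this is guesswork, here is a minimal Python sketch of the naive first pass: scan the ROM for big-endian longwords whose value lands inside the ROM. Everything it yields is only a candidate, since ordinary data can look like an address too, and pointers that need a displacement are missed entirely. The function name and the optional alignment filter are assumptions made for the example.

```python
import struct


def candidate_pointers(rom, aligned=True):
    """Yield (offset, value) for longwords that could be pointers into the ROM."""
    for offset in range(0, len(rom) - 3, 2):      # 68k longwords sit on even offsets
        (value,) = struct.unpack_from(">L", rom, offset)
        if not 0 < value < len(rom):
            continue                              # points outside the ROM image
        if aligned and value % 2:
            continue                              # odd targets cannot be code
        yield offset, value
```

Passing aligned=False keeps odd values as well, which matters for pointers to byte data.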

bgvanbur
Interested
Posts: 46
Joined: Fri Jun 22, 2012 11:13 pm

Post by bgvanbur » Sun Apr 27, 2014 1:37 am

At some point I will add an option to support the GNU assembler, but its syntax is completely incompatible with all the other 68k assemblers mentioned.

TASM is quite tame and would be easy to support.

But you are right, disassembling is quite rough, with addresses hiding in immediate data or code tables, or An registers being used for data (as an extra register). Then there are assumptions such as a JSR returning and executing the next instruction: I found a game whose initialization has a JSR followed by invalid 68k code, indicating it probably never returns.
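
As an illustration of how the JSR assumption shows up in a tracer, here is a small Python sketch of worklist-based code point discovery. It is not the actual disassembler: the pre-decoded `program` mapping, the kind names and the `calls_return` switch are all hypothetical, chosen so the example stays self-contained.

```python
def trace_code_points(program, entry_points, calls_return=True):
    """Follow control flow from entry_points and return the reached addresses.

    program maps address -> (size, kind, target), where kind is one of
    "seq", "jump", "branch", "call" or "return" (a hypothetical decoded form).
    """
    reached = set()
    worklist = list(entry_points)
    while worklist:
        addr = worklist.pop()
        if addr in reached or addr not in program:
            continue                      # already visited, or not decodable
        reached.add(addr)
        size, kind, target = program[addr]
        if kind in ("jump", "branch", "call") and target is not None:
            worklist.append(target)       # statically known destination
        falls_through = kind in ("seq", "branch") or (
            kind == "call" and calls_return)
        if falls_through:
            worklist.append(addr + size)  # the next instruction is code too
    return reached
```

With calls_return=False the bytes after a call are left alone until something else proves they are code, which avoids decoding garbage after a JSR that never comes back.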

So far my disasm will read in all my labels/equates from my listing file and, if needed, make generic labels for code points. It will also parse ;; CODE comments from the listing and treat those as code points too. It does basic code point determination based on the decoded instruction (sequential, jump, conditional branch, etc.) and makes generic labels based on the address, unless a label was already read in from a list file. It can also run a blind disassemble-everything mode after the smart code point checking. It also does some basic multi-line code analysis, such as:

LEA <addr>,Ax
MOVEA.L d8(Ax,Dy),Az
<JMP|JSR> (Az)

It then places a ;; CODETBL comment at <addr> indicating it is a code table, and I have a few other code table patterns checked too. The start and end of a code table can't be determined automatically, so the user needs to look at the code that references the table and at the table data itself to work them out (many times it starts at index 0 and the first code entry pointed to directly follows the code table).
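
A minimal Python sketch of matching that three-instruction pattern on disassembly text, assuming one instruction per line in a Motorola-style syntax. The regular expressions and the function name are made up for the example and only cover the exact shape shown above.

```python
import re

# LEA <addr>,Ax with an absolute hex address, optionally written as ($addr).l
LEA_RE = re.compile(
    r"^\s*LEA\s+\(?(\$[0-9A-F]+)\)?(?:\.[WL])?,\s*(A[0-7])\s*$", re.I)
# MOVEA.L d8(Ax,Dy),Az with an optional .W/.L size on the index register
MOVEA_RE = re.compile(
    r"^\s*MOVEA\.L\s+[^(]*\((A[0-7]),\s*D[0-7](?:\.[WL])?\),\s*(A[0-7])\s*$",
    re.I)
# JMP (Az) or JSR (Az)
JUMP_RE = re.compile(r"^\s*(?:JMP|JSR)\s+\((A[0-7])\)\s*$", re.I)


def find_code_tables(lines):
    """Return the <addr> operand of each LEA that starts the pattern."""
    tables = []
    for first, second, third in zip(lines, lines[1:], lines[2:]):
        lea = LEA_RE.match(first)
        movea = MOVEA_RE.match(second)
        jump = JUMP_RE.match(third)
        if not (lea and movea and jump):
            continue
        same_base = lea.group(2).upper() == movea.group(1).upper()
        same_dest = movea.group(2).upper() == jump.group(1).upper()
        if same_base and same_dest:       # the registers must chain together
            tables.append(lea.group(1))
    return tables
```

Requiring the three lines to be adjacent is a simplification; real code often has unrelated instructions in between.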

powerofrecall
Very interested
Posts: 237
Joined: Fri Apr 17, 2009 7:35 pm
Location: USA

Post by powerofrecall » Sun Apr 27, 2014 3:43 am

Gigasoft wrote:I use TASM (Telemark Assembler) for Z80.

Generating an output that can be modified and then reassembled is extremely challenging, due to the need to locate all pointers. They can be in the immediate operand of an instruction, they could be in a table, they could be structure members in a table of structures, or worse. Then, some pointers may require a displacement to determine what they are really referring to. Furthermore, if they are code pointers, the code they point to should be analyzed. IDA Pro, a popular product that costs thousands of dollars, still isn't very good at guessing and leaves most of the work to the user. I'd really love to have a disassembler that would actually make an effort to save me some time by tracing unknown pointer types across function calls and generating cross references to structure offsets in the correct structure automatically, like IDA Pro already does with absolute references. Then it could use this information to automatically assign correct types to values that are assigned to structure fields.
You and me both buddy. I've been disassembling 68k konami stuff and they are big fans of pointer tables for anything & everything. I think they deserve a special place in hell. It's strange because once you get rolling it makes chunks of code easy to pick out and there is a certain logical structure to it all but you'll eventually get bogged down in separating data tables from pointer tables and pointers to data tables and pointers to pointers and so on. A ton of jumps are register jumps. IDA doesn't know what to do with any of this stuff and damn sure can't figure out data types. I imagine the 68k isn't a lead target for IDA though.

You'll literally get redundant weird code that looks like this (pointer to... another pointer!)
lea (pointer).l,a6
movea.l 0(a6),a6

on top of the standard pointer + displacement, and some functions even use tables for the displacement offsets. It's nuts and really hard to work through backward. In some of their 68k arcade games they use tables even for flow control (reading from the table, comparing against an immediate constant and using e.g. bpl/bmi to branch) and they're nearly impossible to follow. I would love to see the flowcharts these guys were working off of.
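
For what it's worth, when the first pointer lives in ROM, that lea/movea.l pair boils down to one longword read from the image, which a tool could resolve statically. A tiny Python sketch, with a hypothetical function name and the assumption that anything outside the ROM image (RAM, for instance) is reported as unresolvable:

```python
import struct


def resolve_pointer(rom, pointer, displacement=0):
    """Return the longword stored at pointer + displacement, or None.

    None means the address is outside the ROM image (for example in RAM),
    so the value is only known at runtime.
    """
    addr = pointer + displacement
    if not 0 <= addr <= len(rom) - 4:
        return None
    return struct.unpack_from(">L", rom, addr)[0]
```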

I'm just bitching at this point but I'd love to see a disassembler that can help out with some of this stuff...

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Wed Apr 30, 2014 12:46 pm

bgvanbur wrote:I am working on a 68k/z80 disassembler. The main goal is to make the output assemble. It also has features like code path tracing and can use the address list output from blastem and address pointers in the cartridge ROM header as a list of entry code points. One of the annoying issues is to make it work on several assemblers but I want the output code to be as portable as possible.

Currently I support:
asl (68k and z80)
asmx (68k and z80)
asm68k
SNASM68K
pasmo (z80)
z80asm

Does anyone else use another 68k or z80 assembler with a similar format to these assemblers?
That's a good area to work on as Disassemblers are always necessary.

It makes sense to support ASM68K and SNASM68K, but also consider GNU's GAS. It is still actively worked on, and indeed SNASM68K and ASM68K both have limitations.

I could not comment on Z80.

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Wed Apr 30, 2014 12:51 pm

powerofrecall wrote:
You and me both buddy. I've been disassembling 68k konami stuff and they are big fans of pointer tables for anything & everything. I think they deserve a special place in hell. It's strange because once you get rolling it makes chunks of code easy to pick out and there is a certain logical structure to it all but you'll eventually get bogged down in separating data tables from pointer tables and pointers to data tables and pointers to pointers and so on. A ton of jumps are register jumps. IDA doesn't know what to do with any of this stuff and damn sure can't figure out data types. I imagine the 68k isn't a lead target for IDA though.

You'll literally get redundant weird code that looks like this (pointer to... another pointer!)
lea (pointer).l,a6
movea.l 0(a6),a6

on top of the standard pointer + displacement and some functions even use tables for the displacement offsets. It's nuts and really hard to work through backward. In some of their 68k arcade games they use tables even for flow control (reading from the table, comparing against an immediate constant and using i.e. bpl/bmi to branch) and they're nearly impossible to follow. I would love to see the flowcharts these guys were working off of.

I'm just bitching at this point but I'd love to see a disassembler that can help out with some of this stuff...
What you are describing is actually not that uncommon.

The flow-control aspect could be state machines. The benefit of this approach is that the code structure stays pretty static while the functionality and resulting behaviour can be altered.

Yes, it is a Bitch to follow :D I have a similar issue when I find C++ used and you try to debug that with all its 'on the fly' casting of Objects - that gets pretty nutty rather quickly - I prefer C as things never get too complex :lol:

Mechanical Menace
Newbie
Posts: 7
Joined: Mon Apr 07, 2014 6:00 am

Post by Mechanical Menace » Wed Apr 30, 2014 6:14 pm

I use GAS for everything.

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Wed Apr 30, 2014 6:44 pm

Mechanical Menace wrote:I use GAS for everything.
I was won over to it too due to ASM68K's limitations when I wished to generate Memory Locations for Variables. LD and GAS make life a whole lot easier due to Linker-Scripts :D

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Wed Apr 30, 2014 11:19 pm

powerofrecall wrote:
You and me both buddy. I've been disassembling 68k konami stuff and they are big fans of pointer tables for anything & everything. I think they deserve a special place in hell. It's strange because once you get rolling it makes chunks of code easy to pick out and there is a certain logical structure to it all but you'll eventually get bogged down in separating data tables from pointer tables and pointers to data tables and pointers to pointers and so on. A ton of jumps are register jumps. IDA doesn't know what to do with any of this stuff and damn sure can't figure out data types. I imagine the 68k isn't a lead target for IDA though.

You'll literally get redundant weird code that looks like this (pointer to... another pointer!)
lea (pointer).l,a6
movea.l 0(a6),a6

on top of the standard pointer + displacement and some functions even use tables for the displacement offsets. It's nuts and really hard to work through backward. In some of their 68k arcade games they use tables even for flow control (reading from the table, comparing against an immediate constant and using i.e. bpl/bmi to branch) and they're nearly impossible to follow. I would love to see the flowcharts these guys were working off of.

I'm just bitching at this point but I'd love to see a disassembler that can help out with some of this stuff...
The "Active Disassembly" feature in Exodus (http://www.exodusemulator.com) was written specifically to handle this kind of code. By gathering information on the actual runtime behaviour of the code, and using some reasonably intelligent analysis of the data, structures like jump tables and offset tables can be identified and disassembled. I wrote this feature specifically based on my experience doing this kind of disassembly for "Sonic 2" about 10 years or so back now. Back then I just wrote a quick hack for Gens to log all unique PC locations that were encountered at runtime, and fed them in as disassembly start locations into IDA Pro. That worked well, but it still took me over a month to correct all the offsets and map out unexplored jump table entries to get something to the point where I could rebase or move all the code without the compiled version breaking. I used Sonic 2 as a test case with my active disassembly feature in Exodus, and I was able to get a disassembly that was comparable just by messing around in the game for half an hour.
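
The old "log every unique PC" trick is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not code from Gens or Exodus, and the log format (one hex address per line, optionally prefixed with $) is an assumption.

```python
def unique_entry_points(log_lines):
    """Collapse a runtime PC trace into a sorted list of unique addresses."""
    seen = set()
    for line in log_lines:
        text = line.strip().lstrip("$")   # assumed format: one hex address per line
        if not text:
            continue                      # ignore blank lines
        seen.add(int(text, 16))
    return sorted(seen)
```

The resulting list is what would be fed to the disassembler as start locations; code that never ran during the play session is still missed.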

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Wed Apr 30, 2014 11:23 pm

As for assemblers I use, maybe I'm old fashioned, but I just stick with my hacked version of asm68k (to add support for spaces in paths and longer filenames).

powerofrecall
Very interested
Posts: 237
Joined: Fri Apr 17, 2009 7:35 pm
Location: USA

Post by powerofrecall » Thu May 01, 2014 12:03 am

Nemesis wrote: The "Active Disassembly" feature in Exodus (http://www.exodusemulator.com) was written specifically to handle this kind of code. By gathering information on the actual runtime behaviour of the code, and using some reasonably intelligent analysis of the data, structures like jump tables and offset tables can be identified and disassembled. I wrote this feature specifically based on my experience doing this kind of disassembly for "Sonic 2" about 10 years or so back now. Back then I just wrote a quick hack for Gens to log all unique PC locations that were encountered at runtime, and fed them in as disassembly start locations into IDA Pro. That worked well, but it still took me over a month to correct all the offsets and map out unexplored jump table entries to get something to the point where I could rebase or move all the code without the compiled version breaking. I used Sonic 2 as a test case with my active disassembly feature in Exodus, and I was able to get a disassembly that was comparable just by messing around in the game for half an hour.
I have used Exodus' active disassembly and it is indeed great and ridiculously helpful for things like this but the game I was talking about specifically is a non Mega Drive 68k Konami arcade (unfortunately for me--I've been looking into & interested in their 68000/Z80 TMNT based arcade hardware).

Exodus has been a boon to a disassembly of Rocket Knight Adventures that I have been slowly working on, which seemingly shares a lot of its coding philosophy with other 68k Konami games. I've also learned a lot from it; hell, I wouldn't be nearly as far into some of these games if not for the stuff I learned with Exodus' help.

I will probably end up hacking the code/data logging approach into Final Burn Alpha or something eventually, but I know for a fact I'm not sharp enough or industrious enough to put together anything on the scale of Exodus' features.

Just as an aside, how extensible is Exodus at this point? I know from the MAME source that all the Konami TMNT hardware games are basically 68k/z80/ym2151+pcm and whatever bespoke custom video hardware in different configurations.

bgvanbur
Interested
Posts: 46
Joined: Fri Jun 22, 2012 11:13 pm

Post by bgvanbur » Thu May 01, 2014 1:52 am

I got gas working so I should be able to support it. I think gas -mri and plain gas will be treated as two separate assemblers (since -mri can hopefully be similar to the snasm68k format).
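
To show the kind of per-assembler differences involved, here is a small Python sketch of a dialect table. It only covers three things I'm reasonably sure of (Motorola-style assemblers use $ for hex and dc.l for longword data, while plain gas uses 0x, .long and normally wants a % register prefix on m68k); the table layout and function names are hypothetical, and -mri mode is left out on purpose.

```python
DIALECTS = {
    # name: (hex prefix, register prefix, longword data directive)
    "motorola": ("$", "", "dc.l"),
    "gas": ("0x", "%", ".long"),
}


def format_long_data(dialect, values):
    """Format a line of longword data for the given dialect."""
    hex_prefix, _, directive = DIALECTS[dialect]
    operands = ",".join("%s%08X" % (hex_prefix, value) for value in values)
    return "\t%s\t%s" % (directive, operands)


def format_register(dialect, name):
    """Return a register name with the dialect's prefix, e.g. d0 or %d0."""
    return DIALECTS[dialect][1] + name
```

A real backend would need many more entries (comment characters, addressing mode syntax, local labels), which is where treating each assembler separately starts to pay off.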

I would love to use Exodus but it doesn't support Linux. The active disassembly looks great though! Using blastem's address logging really helps determine the code points.
