68000 instruction timing from source?

BigEvilCorporation
Very interested
Posts: 209
Joined: Sat Sep 08, 2012 10:41 am

68000 instruction timing from source?

Post by BigEvilCorporation » Sat Oct 01, 2016 4:39 pm

I'd like a tool that parses 68000 source code, and writes it back out with the instruction timing in the margin (or as a spreadsheet or whatever). Then I could produce a heat map of the most expensive code.
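As a rough idea of what such a static pass could look like, here is a minimal Python sketch. It assumes a very simple source layout (optional `label:`, mnemonic, operands, `;` comments) and a deliberately tiny, hypothetical lookup table covering only instructions whose cost doesn't depend on their operands. A real tool would need the full timing tables, including effective-address costs.

```python
# Tiny illustrative table: cycle counts for a few 68000 instructions whose
# timing is operand-independent. This is NOT a complete table; anything
# missing is reported as "?" rather than guessed.
FIXED_CYCLES = {"nop": 4, "swap": 4, "moveq": 4, "exg": 6, "rts": 16, "rte": 20}


def mnemonic_of(line):
    """Return the lower-case mnemonic (size suffix stripped), or None."""
    code = line.split(";", 1)[0].strip()   # drop trailing comment
    if not code:
        return None
    tokens = code.split()
    if tokens[0].endswith(":"):            # drop a leading "label:"
        tokens = tokens[1:]
    if not tokens:
        return None
    return tokens[0].lower().split(".", 1)[0]


def annotate(source):
    """Return the source lines with a cycle count in the left margin."""
    out = []
    for line in source.splitlines():
        mnemonic = mnemonic_of(line)
        if mnemonic is None:
            margin = ""
        else:
            cycles = FIXED_CYCLES.get(mnemonic)
            margin = "?" if cycles is None else str(cycles)
        out.append(f"{margin:>4} | {line}")
    return out


if __name__ == "__main__":
    demo = "Start:  moveq #0,d0 ; clear\n        nop\n        mulu.w d1,d0\n        rts"
    print("\n".join(annotate(demo)))
```

The `?` entries are where operand-dependent or data-dependent timing would have to be resolved.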

Does such a thing already exist? If not, I'll write one.

I guess an end goal would be to integrate heat-map-style profiling into an emulator to find real-time bottlenecks (static analysis won't properly account for loops, branches, etc.).
A blog of my Megadrive programming adventures: http://www.bigevilcorporation.co.uk

cero
Very interested
Posts: 338
Joined: Mon Nov 30, 2015 1:55 pm

Re: 68000 instruction timing from source?

Post by cero » Sat Oct 01, 2016 6:06 pm

If you're going to integrate it into an emulator, it would be most useful to target one of the existing setups: gprof, oprofile, or valgrind. Being compatible with those would let you use their tools (GUIs, filters...) and extend into C and C++ code too.

edit: Here's a link to gprof on embedded bare-metal ARM:
https://mcuoneclipse.com/2015/08/23/tut ... -cortex-m/

Miquel
Very interested
Posts: 514
Joined: Sat Jul 30, 2016 12:33 am

Re: 68000 instruction timing from source?

Post by Miquel » Sat Oct 01, 2016 9:45 pm

Unfortunately, you need to simulate, emulate, or execute the code to analyze it: the loops are the hot zones, and you can't detect them with a simple parser. I can't recall a time when it really mattered to optimize non-loop code.

On top of that, some instruction timings depend on context: shifts can go from 6 cycles to well over 60 (a byte/word register shift costs 6+2n cycles, where n is the shift count). As on modern CPUs, whether or not a conditional jump is taken matters too, because already-fetched instruction words can get discarded, changing the cycle count by 2, and so on.
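To make the context dependence concrete, here is a small Python sketch of the two cases mentioned. The function names are made up for illustration, and the figures are the commonly published 68000 timings (register shifts: 6+2n cycles for byte/word, 8+2n for long; Bcc: 10 cycles taken, 8 or 12 not taken depending on displacement size). Treat them as assumptions to check against the M68000 User's Manual.

```python
def shift_cycles(size, count):
    """Cycles for a register shift/rotate (ASL, LSR, ROL, ...) on a data register.

    size  -- "b", "w" or "l"
    count -- number of bit positions actually shifted (0..63)
    """
    if size not in ("b", "w", "l"):
        raise ValueError("size must be 'b', 'w' or 'l'")
    if not 0 <= count <= 63:
        raise ValueError("count must be in 0..63")
    base = 8 if size == "l" else 6
    return base + 2 * count


def bcc_cycles(size, taken):
    """Cycles for a conditional branch (Bcc).

    size  -- "b" for an 8-bit displacement, "w" for a 16-bit displacement
    taken -- whether the branch condition was true
    """
    if size not in ("b", "w"):
        raise ValueError("size must be 'b' or 'w'")
    if taken:
        return 10
    return 8 if size == "b" else 12


if __name__ == "__main__":
    print(shift_cycles("w", 1), shift_cycles("w", 63))
    print(bcc_cycles("b", True), bcc_cycles("b", False))
```

A static tool can resolve the shift cost when the count is an immediate, but when the count comes from a register it only knows a range, which is part of why execution is needed.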
HELP. Spanish TVs are brain washing people to be hostile to me.

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Re: 68000 instruction timing from source?

Post by MintyTheCat » Sun Oct 02, 2016 10:30 am

I began work on a similar tool some time ago. You essentially need to execute the loops and account for the time on every pass. I have not looked into it, but it would make sense to look at Starscream for ideas.
UMDK Fanboy

BigEvilCorporation
Very interested
Posts: 209
Joined: Sat Sep 08, 2012 10:41 am

Re: 68000 instruction timing from source?

Post by BigEvilCorporation » Sun Oct 02, 2016 2:48 pm

I'll get busy then!

Static analysis isn't properly representative, of course, but it's a useful first pass, and it can be used to compare small sections of code during an optimisation refactor rather than to get the whole picture at runtime. Examples of instruction-counted code are a hot topic on this forum and Sega-16, and I see a lot of small snippets demonstrating various optimisations. It would only be a first step towards a proper runtime solution, too.

ehaliewicz
Very interested
Posts: 50
Joined: Tue Dec 24, 2013 1:00 am

Re: 68000 instruction timing from source?

Post by ehaliewicz » Tue Oct 04, 2016 7:05 pm

What you really want is some sort of profiler.

Grind
Very interested
Posts: 69
Joined: Fri Jun 13, 2014 1:26 pm
Location: US

Re: 68000 instruction timing from source?

Post by Grind » Fri Oct 07, 2016 1:44 pm

ehaliewicz wrote: What you really want is some sort of profiler.
As far as I know, the closest thing gennydev has to profiling is random gdb pausing and timing with KDebug.

I hope I'm wrong though.

r57shell
Very interested
Posts: 478
Joined: Sun Dec 23, 2012 1:30 pm
Location: Russia

Re: 68000 instruction timing from source?

Post by r57shell » Fri Oct 07, 2016 5:25 pm

You can grab opcode lengths and opcode timings from any emulator's source,
then get the Gens r57shell mod, put the timings into a Lua script, set a breakpoint on PC covering the whole ROM, and add up each opcode's timing as it executes.

Mask of Destiny
Very interested
Posts: 615
Joined: Thu Nov 30, 2006 6:30 am

Re: 68000 instruction timing from source?

Post by Mask of Destiny » Fri Oct 07, 2016 6:51 pm

I'd be willing to add profiling support to BlastEm, but it would be good to have some input on the best output format. The simplest thing to do would be to have a list of addresses and the cumulative number of cycles spent executing instructions at those addresses. This would not allow any degree of call-stack analysis, though, which many profiling tools offer. There's also the issue of whether it makes sense to differentiate between time spent actually executing an instruction and time spent waiting for DMA or the like to complete.
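For illustration, the "simplest thing" could look roughly like the Python sketch below. The trace format, a stream of `(pc, cycles)` pairs emitted once per executed instruction, is purely an assumption for this example and not something BlastEm is known to produce.

```python
from collections import Counter


def accumulate(trace):
    """Sum cycles per instruction address.

    trace -- iterable of (pc, cycles) pairs, one per executed instruction
             (hypothetical format, assumed for this sketch)
    """
    totals = Counter()
    for pc, cycles in trace:
        totals[pc] += cycles
    return totals


def hottest(totals, n=10):
    """Return the n most expensive addresses as (pc, total_cycles) pairs."""
    return totals.most_common(n)


if __name__ == "__main__":
    # A two-instruction loop executed twice, followed by a return.
    demo = [(0x200, 4), (0x202, 10), (0x200, 4), (0x202, 10), (0x206, 16)]
    for pc, cycles in hottest(accumulate(demo), 3):
        print(f"${pc:06X} {cycles}")
```

A flat table like this loses caller information, which is the call-stack limitation mentioned above.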

BigEvilCorporation
Very interested
Posts: 209
Joined: Sat Sep 08, 2012 10:41 am

Re: 68000 instruction timing from source?

Post by BigEvilCorporation » Fri Oct 07, 2016 8:23 pm

I'm going to start with a simple Exodus plugin. I already have a SNASM68K COFF reader for it, so I can match addresses to source, which makes this the sensible starting point. I'll just make it dump out the cumulative cost per instruction in CSV format for now and go from there.
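A CSV dump along those lines might be shaped like this Python sketch. The column layout and the `addr_to_source` mapping (address to file and line, standing in for whatever the COFF reader provides) are hypothetical, and this is not the Exodus plugin API.

```python
import csv
import io


def write_profile_csv(totals, addr_to_source, out):
    """Write per-instruction cumulative cycles as CSV, most expensive first.

    totals         -- dict mapping address -> cumulative cycles
    addr_to_source -- dict mapping address -> (file, line); hypothetical
                      stand-in for debug info read from the COFF file
    out            -- any writable text stream
    """
    writer = csv.writer(out)
    writer.writerow(["address", "file", "line", "cycles"])
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for addr, cycles in ranked:
        file_name, line_no = addr_to_source.get(addr, ("?", 0))
        writer.writerow([f"${addr:06X}", file_name, line_no, cycles])


def profile_to_csv_text(totals, addr_to_source):
    """Convenience wrapper returning the CSV as a string."""
    buffer = io.StringIO()
    write_profile_csv(totals, addr_to_source, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    demo_totals = {0x200: 8, 0x202: 20}
    demo_map = {0x200: ("main.asm", 11), 0x202: ("main.asm", 12)}
    print(profile_to_csv_text(demo_totals, demo_map))
```

Sorting by cost puts the hot spots at the top, and a spreadsheet can turn the `cycles` column into a heat map with conditional formatting.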

Sik
Very interested
Posts: 939
Joined: Thu Apr 10, 2008 3:03 pm

Re: 68000 instruction timing from source?

Post by Sik » Sat Oct 08, 2016 3:03 pm

Mask of Destiny wrote: There's also the issue of whether it makes sense to differentiate from time spent actually executing an instruction and time spent waiting for DMA or the like to complete.
You normally want to know how long it took for the code to execute, so you probably want to take into account the DMAs.
Sik is pronounced as "seek", not as "sick".

Mask of Destiny
Very interested
Posts: 615
Joined: Thu Nov 30, 2006 6:30 am

Re: 68000 instruction timing from source?

Post by Mask of Destiny » Sun Oct 09, 2016 5:36 am

Oh, certainly, but you might want to be able to separate the actual execution time from the time spent waiting for DMA to complete, e.g. "the move.l at $XXX took N cycles, of which 12 were execution and the rest were DMA".
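One way to keep the two apart is to track them as separate counters per address, as in this Python sketch. The trace format `(pc, exec_cycles, dma_wait_cycles)` is assumed for the example; how an emulator would actually attribute the stall is left open.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Cost:
    """Cumulative cost of one instruction address, split by cause."""
    exec_cycles: int = 0
    dma_wait_cycles: int = 0

    @property
    def total(self):
        return self.exec_cycles + self.dma_wait_cycles


def accumulate_split(trace):
    """Sum execution and DMA-wait cycles separately per address.

    trace -- iterable of (pc, exec_cycles, dma_wait_cycles) tuples
             (hypothetical format, assumed for this sketch)
    """
    costs = defaultdict(Cost)
    for pc, exec_cycles, dma_wait_cycles in trace:
        cost = costs[pc]
        cost.exec_cycles += exec_cycles
        cost.dma_wait_cycles += dma_wait_cycles
    return costs


if __name__ == "__main__":
    # The same move.l executed twice; the first time it stalls on a DMA.
    demo = [(0x300, 12, 400), (0x300, 12, 0)]
    for pc, cost in accumulate_split(demo).items():
        print(f"${pc:06X}: {cost.total} cycles, "
              f"{cost.exec_cycles} execution, {cost.dma_wait_cycles} DMA")
```

Reporting `total` alongside the split serves both views: how long the code really took, and how much of that was the instruction itself.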

r57shell
Very interested
Posts: 478
Joined: Sun Dec 23, 2012 1:30 pm
Location: Russia

Re: 68000 instruction timing from source?

Post by r57shell » Sun Oct 09, 2016 12:54 pm

The time taken by a transfer DMA always lands in the same function that starts it. In other words, it's not spread around.
So I don't see any reason to leave DMA out of the accounting, unless you're doing a DMA copy or fill.

Sik
Very interested
Posts: 939
Joined: Thu Apr 10, 2008 3:03 pm

Re: 68000 instruction timing from source?

Post by Sik » Sun Oct 09, 2016 3:44 pm

If DMA copy/fill is your bottleneck then the profiler would show your 68000 code spending lots of time wherever you put the wait for the DMA flag to clear, wouldn't it?

r57shell
Very interested
Posts: 478
Joined: Sun Dec 23, 2012 1:30 pm
Location: Russia

Re: 68000 instruction timing from source?

Post by r57shell » Mon Oct 10, 2016 7:47 pm

Yes, it would. So? (That is, if you're testing that bit.)
