Posted: Fri Nov 22, 2013 6:28 pm
by Stef
Ti_ wrote:
r57shell wrote: Anyway, there are so many emulators, and no method to get the timing of a code chunk :(
The standard Gens debugger shows how many cycles each instruction takes. ) Hotkey 't' is step into.
O_o ? Did I put that feature in ??
I only remember the execute 1/10/100/1000 instructions shortcuts, but nothing about cycles...

Posted: Fri Nov 22, 2013 6:38 pm
by Ti_
Stef wrote:
Ti_ wrote:
r57shell wrote: Anyway, there are so many emulators, and no method to get the timing of a code chunk :(
The standard Gens debugger shows how many cycles each instruction takes. ) Hotkey 't' is step into.
O_o ? Did I put that feature in ??
I only remember the execute 1/10/100/1000 instructions shortcuts, but nothing about cycles...
Yep, that's it:
Image

Posted: Fri Nov 22, 2013 6:44 pm
by MintyTheCat
r57shell wrote:Oh... god... why? :? So many posts...
The question is totally different.
I want to do the same thing.
OK, here is an example:

Code:

lsl.l #2,d0
and

Code:

add.l d0,d0
add.l d0,d0
Which one is faster? I don't know for sure, but I think the second one.
If you know the timings, you can calculate it yourself.
BUT what if it is not a small bunch of code? Something like this:

Code:

 addq.l  #4,d0
 moveq   #0,d2
 moveq   #$12,d1
@loop:
 move.l  d2,(a6,d0.l)
 addq.l  #4,d0
 subq.w  #1,d1
 bne.s   @loop
and

Code:

 addq.w  #4,d0
 moveq   #0,d1
 move.l  d0,a0
 moveq   #$11,d0
@loop:
 move.l  d1,(a0)+
 dbf     d0,@loop
I think the second one; it is "my optimization", but again, I don't know for sure.

I don't think you can test such a thing with the backdrop color, but you can do over 1000 runs of such code to make the backdrop color line visible :)
Anyway, there are so many emulators, and no method to get the timing of a code chunk :(
Hi,

yes, one can easily run their Routine through a Program that calculates the Weight of their Code in Machine-Cycles but it gets complicated when Loops are involved and especially Loops with an unknown number of Iterations or indeed Code that executes only on certain Conditions.

To be honest, it is much simpler to determine the amount of time the 68K spends executing Code, but it becomes more involved when you want to know exactly how much work the VDP has to deal with - polling is not really going to help all the time either.
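
The kind of cycle-weight calculation described above can be sketched in a few lines of Python for the quoted snippets, where the loop counts are fixed. This is only a sketch: the per-instruction costs in `CYCLES` are assumed from the 68000 timing tables and should be checked against the manual, the helper names are made up for illustration, and everything outside the CPU core (VDP/DMA contention, refresh, interrupts) is ignored.

```python
# Assumed 68000 cycle costs for the instructions in the quoted snippets.
# These values are assumptions taken from the 68000 timing tables; verify
# them against the manual before relying on the totals.
CYCLES = {
    "lsl.l #2,d0": 8 + 2 * 2,      # long immediate shift: 8 + 2n
    "add.l d0,d0": 8,
    "addq.l #4,d0": 8,
    "addq.w #4,d0": 4,
    "subq.w #1,d1": 4,
    "moveq": 4,
    "move.l d0,a0": 4,
    "move.l d2,(a6,d0.l)": 18,
    "move.l d1,(a0)+": 12,
}
BNE_TAKEN, BNE_NOT_TAKEN = 10, 8   # bne.s
DBF_LOOPING, DBF_EXPIRED = 10, 14  # dbf: counter not expired / expired


def straight_line(instructions):
    """Sum the cost of a branch-free instruction sequence."""
    return sum(CYCLES[name] for name in instructions)


def counted_loop(body, iterations, branch_taken, branch_exit):
    """Cost of a loop whose iteration count is known in advance."""
    body_cost = straight_line(body)
    looping = (iterations - 1) * (body_cost + branch_taken)
    return looping + body_cost + branch_exit


shift_version = straight_line(["lsl.l #2,d0"])
add_version = straight_line(["add.l d0,d0"] * 2)

# First loop: $12 iterations, closed by subq.w / bne.s.
loop_a = straight_line(["addq.l #4,d0", "moveq", "moveq"]) + counted_loop(
    ["move.l d2,(a6,d0.l)", "addq.l #4,d0", "subq.w #1,d1"],
    0x12, BNE_TAKEN, BNE_NOT_TAKEN)

# Second loop: dbf with a counter of $11 runs $11 + 1 times.
loop_b = straight_line(
    ["addq.w #4,d0", "moveq", "move.l d0,a0", "moveq"]) + counted_loop(
    ["move.l d1,(a0)+"], 0x11 + 1, DBF_LOOPING, DBF_EXPIRED)

print(shift_version, add_version, loop_a, loop_b)
```

Under those assumed timings the single lsl.l comes to 12 cycles against 16 for the two add.l, and the dbf loop comes to 416 cycles against 734 for the subq/bne one. It only works this neatly because the iteration counts are known up front.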

Posted: Fri Nov 22, 2013 7:48 pm
by MintyTheCat
I think that what's missing would be some Tool to calculate the weight of a piece of 68K Code and let it cover all the Routes through the Code or give some kind of Formula for you to work out how much a Path would weigh up to a limit. The Developer could then set the upper limit to get the actual Execution-Weight.

I have used GCov and GProf a fair amount over the Years but usually on ARM and PPC Projects. I am not sure if one could do the same with gcov and gprof with the GNU Gas or if it is more a 'C' Thing - I shall have to look into it.

Posted: Fri Nov 22, 2013 8:03 pm
by Stef
Ti_ wrote: Yep, that's it:
Image
Oh indeed, I totally forgot about that. It only gives you the number of cycles for the current instruction, but better than nothing :p

Posted: Fri Nov 22, 2013 10:07 pm
by gasega68k
I do not know if there is some other emulator with a debugger that shows the cycle count, but I use the Gens emulator (v2.11) all the time to find the number of cycles for each instruction and thus optimize the code I write. I actually learned most of my 68000 asm using only this debugger.. :wink:
I think it's enough to know the number of cycles for each instruction.

Posted: Fri Nov 22, 2013 11:17 pm
by Stef
Hehe, glad to hear it was that useful after all ;)
I really wonder why I used 10 digits to display the cycle count :D

Posted: Sat Nov 23, 2013 7:36 am
by Nemesis
I meant to add a running cycle counter to the 68000 core in Exodus for just that kind of thing, but it slipped my mind. I'll add it in for the next release, along with a counter telling you how many cycles have elapsed since the last breakpoint was triggered. Would that be sufficient?

Static analysis of the code to determine the execution time for a given code pathway isn't ever really going to work, not only because of the iteration problem mentioned, but also because the time various opcodes take to execute can actually depend on the input data too. The only way to actually figure it out is to run the code and measure the timing.
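
Two cases where the cost depends on the operand value, as a rough illustration of the data dependence mentioned above. The formulas are assumed from the 68000 timing tables (register-count long shift: 8 + 2n, MULU: 38 + 2n with n the number of set bits in the source word), effective-address time is left out, and the function names are hypothetical.

```python
def lsl_l_reg_cycles(count_register):
    """Assumed cost of 'lsl.l d1,d0' for a given value in d1.

    The shift count is taken modulo 64, so the cost follows the data.
    """
    n = count_register % 64
    return 8 + 2 * n


def mulu_cycles(source_word):
    """Assumed cost of 'mulu' for a given 16-bit source operand."""
    ones = bin(source_word & 0xFFFF).count("1")
    return 38 + 2 * ones


# Same instruction, different inputs, different timing.
for count in (0, 1, 31):
    print("lsl.l with count", count, "->", lsl_l_reg_cycles(count), "cycles")
for word in (0x0000, 0x00FF, 0xFFFF):
    print("mulu with source", hex(word), "->", mulu_cycles(word), "cycles")
```

A static tool can only give a range for such instructions unless it also knows the register contents at that point.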

Posted: Sat Nov 23, 2013 11:28 am
by Stef
Yeah, in Gens it actually gives the number of cycles of the last executed instruction, which makes it less convenient to use. A simple and nice feature would be to have a cycle counter and a reset counter button so you can count cycles for a given portion of code when tracing execution.

Posted: Sat Nov 23, 2013 2:14 pm
by Nemesis
The manual reset feature is a great idea, much better than tying it into the last breakpoint or anything like that. I'll provide two counters, one which simply counts up from 0 forever and is never reset, and one which can be manually reset at any time. That'll be trivial to implement, and very useful for situations like this one.
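
A minimal sketch of the two-counter scheme described above, assuming the emulator core calls some hook after every executed instruction. The class and method names are hypothetical and say nothing about how Exodus actually implements it.

```python
class CycleCounters:
    """Two cycle counters: one free-running, one manually resettable."""

    def __init__(self):
        self.total = 0        # counts up from 0 and is never reset
        self.resettable = 0   # cleared whenever the user asks for it

    def advance(self, cycles):
        """Hypothetical hook: called by the core after each instruction."""
        self.total += cycles
        self.resettable += cycles

    def reset(self):
        """Bound to a 'reset counter' button in the debugger UI."""
        self.resettable = 0


counters = CycleCounters()
counters.advance(16)   # setup code executed before the part of interest
counters.reset()       # user hits reset at the start of the loop
counters.advance(22)   # one pass through the traced code
print(counters.total, counters.resettable)
```

Stepping to the start of the code of interest, hitting reset, and stepping to its end then leaves the measured cost in `resettable`, while `total` keeps running.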

Posted: Sat Nov 23, 2013 5:33 pm
by r57shell
Hmm, interesting. (about the Gens standard debugger)
But you need to make a new ROM with such a chunk, and calculate it yourself.
There is no simple way to break at a particular point in the Disassembly.

By the way, I have tried to add a Lua method to Gens rerecording to get the cycle counter, but its M68k emulator (Starscream) updates the odometer only after the end of a cycle.
So, Stef, how can that be improved?
And there is another bug: after lsl.l #$20,d0 the highest bit is wrong (it must always be 0).

Posted: Sat Nov 23, 2013 5:42 pm
by TmEE co.(TM)
ASL has highest bit 0 or 1 depending on if you have negative or positive number not the bit in the next to last position. LSL will send in the one bit next to last, it is not held as zero.

Posted: Sat Nov 23, 2013 5:51 pm
by r57shell

Code:

moveq       #-1,d0
lsl.l       #$20,d0
After that you must get
d0 = 0
Am I wrong?
Calc:
Mode->Programmer
4 bytes, HEX
1, +-, Lsh, 1F, Lsh, 1, =
0
(calc does not support lsh $20 :( )

Posted: Sat Nov 23, 2013 6:06 pm
by TmEE co.(TM)
I did not notice "$" :oops:
$20 is 32, and that indeed should make for a zero.
EDIT: the immediate shift count only goes up to 8. Your assembler should whine on larger values...

Posted: Sat Nov 23, 2013 6:38 pm
by Stef
r57shell wrote:Hmm, interesting. (about the Gens standard debugger)
But you need to make a new ROM with such a chunk, and calculate it yourself.
There is no simple way to break at a particular point in the Disassembly.

By the way, I have tried to add a Lua method to Gens rerecording to get the cycle counter, but its M68k emulator (Starscream) updates the odometer only after the end of a cycle.
So, Stef, how can that be improved?
And there is another bug: after lsl.l #$20,d0 the highest bit is wrong (it must always be 0).
Gens has not been maintained for a long time now, but as far as I remember there are methods which allow you to get the "live" odometer, so you should be able to sort that out (I use them for HV counter or Z80 bus cycle synchronization). About the lsl.l #$20,d0 instruction, I remember I fixed a bug in the original Starscream as the implementation was not perfect. I'm really surprised there is still an issue with it!

Edit:
Just had a look at the Starscream code... my fix is shit, the bug is obvious :p
It is missing one extra:

Code:

emit("%s%c %s[__dreg+ebx*4], %s\n", op, direction[main_dr], sizename[main_size], tmps);
at line 3148 of the emitter, and even that does not cover the shift #63 case! Anyway, the base code was even more buggy, and my fix was enough to get games working; I guess that's why I never noticed it :p
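
For context on why a shift by 32 or more is easy to get wrong in an x86-based core: the 68000 takes a register shift count modulo 64, while x86 masks the count of a 32-bit shl to 5 bits, so a naive one-to-one mapping turns a shift by $20 into a shift by 0. The Python model below illustrates the mismatch and one possible way to handle it. It is not Stef's fix and not Starscream code, and all function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF


def m68k_lsl_l(value, count):
    """Reference behaviour: 68000 lsl.l with a register count (modulo 64)."""
    n = count % 64
    return (value << n) & MASK32   # counts 32..63 shift every bit out


def x86_shl32(value, count):
    """Model of a 32-bit x86 shl: the count is masked to 5 bits."""
    n = count & 31
    return (value << n) & MASK32


def lsl_l_via_x86(value, count):
    """One possible emulation: special-case the counts x86 cannot express."""
    n = count % 64
    if n >= 32:
        return 0
    return x86_shl32(value, n)


d0 = 0xFFFFFFFF   # moveq #-1,d0
print(hex(m68k_lsl_l(d0, 0x20)))     # what the 68000 should produce
print(hex(x86_shl32(d0, 0x20)))      # what a naive mapping produces
print(hex(lsl_l_via_x86(d0, 0x3F)))  # the shift-by-63 case
```

Splitting the shift into two smaller shifts would be another option; the special case above is just the shortest to write down.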