Determining the timing of a piece of code

Ask anything you want about Megadrive/Genesis programming.

Moderator: BigEvilCorporation

walker7
Interested
Posts: 45
Joined: Tue Jul 24, 2012 6:27 am


Post by walker7 » Sun Nov 17, 2013 7:28 am

Is there any routine that you can use that will return the amount of time it takes for a certain piece of code to be run? The returned time could be in one of the data registers, and it can be in HBlanks, VBlanks, seconds, or other units.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sun Nov 17, 2013 10:34 am

SGDK offers some methods for that in the timer unit:
https://code.google.com/p/sgdk/source/b ... de/timer.h

Mainly these methods:

Code:

void startTimer(u16 numTimer);
u32  getTimer(u16 numTimer, u16 restart);

If you don't use SGDK, you can still copy the code to get the idea.

walker7
Interested
Posts: 45
Joined: Tue Jul 24, 2012 6:27 am

Post by walker7 » Sun Nov 17, 2013 12:42 pm

What about a piece of 68000 assembly code as it would be done on the Sega Genesis?

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sun Nov 17, 2013 2:12 pm

SGDK is for the Sega Genesis, and you don't necessarily need to do everything in 68000 assembly. If you do need or want that, you will have to put in more effort to translate the code into assembly.

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Mon Nov 18, 2013 10:08 am

Stef wrote: SGDK is for the Sega Genesis, and you don't necessarily need to do everything in 68000 assembly. If you do need or want that, you will have to put in more effort to translate the code into assembly.

There are a couple of ways of doing this:

1. Write your own 68K Routine(s).
2. Take the Object generated from the timer unit, let's say timer.o, and disassemble it with the objdump from your m68k cross toolchain:

objdump -D timer.o > timer.s

That will get you the code as GNU GAS style 68K Assembly.

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany


Post by MintyTheCat » Mon Nov 18, 2013 10:19 am

walker7 wrote: Is there any routine that you can use that will return the amount of time it takes for a certain piece of code to be run? The returned time could be in one of the data registers, and it can be in HBlanks, VBlanks, seconds, or other units.

You cannot get exact Wall-Clock timing, but you can use something that's periodic. In the MD's case that can be achieved using the VBlank and HBlank Interrupts.

Look in init.asm and there you will find the Interrupt Vector-Table. Look for the Routines assigned to the VBlank and HBlank, such as HBlankInterrupt and VBlankInterrupt.

Then find those Routines. The simplest thing to do will be to have your VBlank Routine increment a Value held somewhere in Work-RAM.

Then set up a Routine to poll the Value of that Counter 'Variable'.
What I would do is keep two samples of the Counter's Value: one is the 'last Counter Value', the other is the 'current Counter Value'. Use a Compare to see whether the Values differ; if they do, increment another 'wait' Variable until it reaches some Value. You can determine what that Value should be for your Region (NTSC/PAL) and your intended Screen-Resolution such as 64x28 Cells.
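As a rough illustration of the VBlank counter idea described above, here is a minimal host-side C sketch. All names (vblank_handler, timer_begin, timer_elapsed_frames, frames_to_ms) are hypothetical and not taken from SGDK or init.asm. On real hardware the handler would be the routine installed in the vector table for the VBlank interrupt. The 60 and 50 Hz refresh rates are nominal approximations.

```c
#include <assert.h>
#include <stdint.h>

/* Counter that would live in Work-RAM; hypothetical name. */
static volatile uint16_t vblank_count = 0;

/* Stand-in for the VBlank interrupt routine: one increment per frame. */
void vblank_handler(void) { vblank_count++; }

/* Take the 'last Counter Value' sample before the code under test. */
uint16_t timer_begin(void) { return vblank_count; }

/* Compare against the 'current Counter Value'. Unsigned 16-bit
   subtraction still gives the right answer if the counter wrapped once. */
uint16_t timer_elapsed_frames(uint16_t start) {
    return (uint16_t)(vblank_count - start);
}

/* Convert frames to milliseconds; refresh_hz is nominally 60 (NTSC)
   or 50 (PAL), which is an approximation of the real refresh rate. */
uint32_t frames_to_ms(uint16_t frames, unsigned refresh_hz) {
    return (uint32_t)frames * 1000u / refresh_hz;
}
```

The resolution is one frame, so this only suits code that runs for many frames, or code repeated often enough to span many frames.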

Another way would be to use the YM2612's internal Timers to also get timing but again this would entail some kind of polling, sadly.

I am in the Office at present but if I have time later then I shall try to put a small Example together for you.

Cheers.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Mon Nov 18, 2013 12:28 pm

If you need something fairly high resolution and precise, but for a relatively short period of time, and you don't need the Z80 for anything, it should be possible to have the Z80 spin around incrementing a counter in Z80 ram, and poll it at the beginning and end of your time block from the 68000. This would be very regular. It would probably give you a much more stable and repeatable result than anything which relies on the VDP hcounter.

Which approach is most suitable really depends on what you want to use the timer to track.
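A minimal C sketch of the arithmetic behind the Z80 counter idea, assuming the Z80 loop increments a 16-bit counter once per pass and that you know how many Z80 T-states one pass takes (tstates_per_loop is a hypothetical parameter you would count from your own loop). The 15/7 ratio comes from the Z80 running at master clock / 15 and the 68000 at master clock / 7. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between two 16-bit counter samples; correct across one wrap. */
uint16_t counter_delta(uint16_t start, uint16_t end) {
    return (uint16_t)(end - start);
}

/* Convert Z80 loop passes to 68000 cycles.
   One Z80 T-state lasts 15/7 of a 68000 cycle. */
uint32_t z80_ticks_to_68k_cycles(uint16_t ticks, uint32_t tstates_per_loop) {
    return (uint32_t)(((uint64_t)ticks * tstates_per_loop * 15u) / 7u);
}
```

One caveat to keep in mind: the 68000 has to request the Z80 bus to read Z80 RAM, which pauses the Z80 while the sample is taken, so the two reads themselves disturb the count slightly.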

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City

Post by TmEE co.(TM) » Mon Nov 18, 2013 1:20 pm

The best way is just to change the VDP backdrop color: you'll see visually where your code starts and where it ends.
Mida sa loed ? Nagunii aru ei saa ;)
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Mon Nov 18, 2013 1:39 pm

TmEE co.(TM) wrote: The best way is just to change the VDP backdrop color: you'll see visually where your code starts and where it ends.

If you are going to do it that way then simply use an Interrupt. Plus, that is only really going to give a qualitative answer when numbers are what you need.

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Mon Nov 18, 2013 1:43 pm

Nemesis wrote:If you need something fairly high resolution and precise, but for a relatively short period of time, and you don't need the Z80 for anything, it should be possible to have the Z80 spin around incrementing a counter in Z80 ram, and poll it at the beginning and end of your time block from the 68000. This would be very regular. It would probably give you a much more stable and repeatable result than anything which relies on the VDP hcounter.

Which approach is most suitable really depends on what you want to use the timer to track.

Yes, that'd make a nice 'nop' :D The thing is, though, and I'd like to be excused for going out on a Limb here: would most Commercial Games not use HBlank and VBlank Interrupts and build their Game's State-Machine around them? Plus, unless you are 'between' Notes, the Z80 is likely to be servicing the Audio Hardware, so this would not work for anything other than a test, correct?

TmEE co.(TM)
Very interested
Posts: 2440
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City

Post by TmEE co.(TM) » Mon Nov 18, 2013 2:03 pm

MintyTheCat wrote: If you are going to do it that way then simply use an Interrupt. Plus, that is only really going to give a qualitative answer when numbers are what you need.

Ints only give line granularity, but the BG color change gives pixel granularity, which is all that really matters in the end. Your frame is 262 or 313 lines tall, with each line containing 384 pixels. That is your 100% figure. Each line is always about 488 68K cycles, so it is not hard to work out actual CPU cycles from the pixels.
Will work only in really nice emulators that attempt pixel-level VDP emulation, or at the very least sub-line level.
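To make the pixel-to-cycle conversion concrete, here is a small C sketch that uses only the figures quoted in the post (488 68K cycles and 384 pixels per line, 262 or 313 lines per frame). The function names are hypothetical. You would measure the colored band on screen as a number of full lines plus some extra pixels.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLES_PER_LINE 488u  /* 68K cycles per line, as quoted in the post */
#define PIXELS_PER_LINE 384u  /* pixels per line, as quoted in the post */

/* Convert a backdrop band of full_lines plus extra_pixels to 68K cycles. */
uint32_t band_to_cycles(uint32_t full_lines, uint32_t extra_pixels) {
    return full_lines * CYCLES_PER_LINE
         + (extra_pixels * CYCLES_PER_LINE) / PIXELS_PER_LINE;
}

/* Share of one frame used, in tenths of a percent.
   lines_per_frame is 262 (NTSC) or 313 (PAL). */
uint32_t band_frame_permille(uint32_t cycles, uint32_t lines_per_frame) {
    return (cycles * 1000u) / (lines_per_frame * CYCLES_PER_LINE);
}
```

For example, a band 10 lines and 192 pixels long works out to 5124 cycles by these figures, which is about 4% of an NTSC frame.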

MintyTheCat
Very interested
Posts: 484
Joined: Sat Mar 05, 2011 11:11 pm
Location: Berlin, Germany

Post by MintyTheCat » Mon Nov 18, 2013 2:14 pm

TmEE co.(TM) wrote:
MintyTheCat wrote: If you are going to do it that way then simply use an Interrupt. Plus, that is only really going to give a qualitative answer when numbers are what you need.
Ints only give line granularity, but the BG color change gives pixel granularity, which is all that really matters in the end. Your frame is 262 or 313 lines tall, with each line containing 384 pixels. That is your 100% figure. Each line is always about 488 68K cycles, so it is not hard to work out actual CPU cycles from the pixels.
Will work only in really nice emulators that attempt pixel-level VDP emulation, or at the very least sub-line level.

Good point - I see what you mean. So unless you had a Block of Code and you'd calculated its execution Time, you'd have to have a way to stall it and pick the exact time that it should be allowed to execute.

What about if you employed some of the Real-Time Executive Ideas and had a Queue of Tasks to execute, each with a number on it - a Counting-Semaphore that gives you Lists of actions to carry out during the current Pixel Line - and, if not executing anything, simply executed some Idle Task?

You could get a Load Level Calculation based on how much time you spend idling per line and simply subtract it from 488 x 68K_Cycles to get your actual Code_Execution_Duration.
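The subtraction described above can be sketched in C as follows. This assumes a hypothetical idle task that counts its own loop passes per line and that you know the cycle cost of one pass. Both parameters and both function names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLES_PER_LINE 488u  /* 68K cycles per line, figure from the thread */

/* Cycles spent on real work in one line: the line budget minus idle time.
   Clamped to zero in case the idle estimate exceeds the budget. */
uint32_t busy_cycles(uint32_t idle_iterations, uint32_t cycles_per_iteration) {
    uint32_t idle = idle_iterations * cycles_per_iteration;
    return idle >= CYCLES_PER_LINE ? 0u : CYCLES_PER_LINE - idle;
}

/* Load level for the line as a whole percentage (rounded down). */
uint32_t load_percent(uint32_t busy) {
    return busy * 100u / CYCLES_PER_LINE;
}
```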

To be honest that was next on the Cards for me to try out but this Question was asked and it fed into my next Questions too - cheers.

Nemesis
Very interested
Posts: 791
Joined: Wed Nov 07, 2007 1:09 am
Location: Sydney, Australia

Post by Nemesis » Tue Nov 19, 2013 4:55 am

MintyTheCat wrote:
Nemesis wrote:If you need something fairly high resolution and precise, but for a relatively short period of time, and you don't need the Z80 for anything, it should be possible to have the Z80 spin around incrementing a counter in Z80 ram, and poll it at the beginning and end of your time block from the 68000. This would be very regular. It would probably give you a much more stable and repeatable result than anything which relies on the VDP hcounter.

Which approach is most suitable really depends on what you want to use the timer to track.
Yes, that'd make a nice 'nop' :D The thing is, though, and I'd like to be excused for going out on a Limb here: would most Commercial Games not use HBlank and VBlank Interrupts and build their Game's State-Machine around them? Plus, unless you are 'between' Notes, the Z80 is likely to be servicing the Audio Hardware, so this would not work for anything other than a test, correct?

All true, but again, it depends on what you're using the timer for. If you wanted to use it to profile 68000 code for development, i.e. to simply test which approach was faster, I think this kind of method would be more suitable than relying on the VDP, but obviously blocking the Z80 makes it unsuitable for actual "release" code.

TmEE co.(TM) wrote:
MintyTheCat wrote:If you are going to do it that way then simply use an Interrupt. Plus, that is only really going to give a qualitative answer when numbers are what you need.
Ints only give line granularity, but the BG color change gives pixel granularity, which is all that really matters in the end. Your frame is 262 or 313 lines tall, with each line containing 384 pixels. That is your 100% figure. Each line is always about 488 68K cycles, so it is not hard to work out actual CPU cycles from the pixels.
Will work only in really nice emulators that attempt pixel-level VDP emulation, or at the very least sub-line level.

That's a really clever approach! That's got me thinking on a similar line too. You could use the same VDP "synchronization" method used for stable CRAM DMA copies, and trigger a fill with just a couple of values, after which, you'll have perfect sync with the VDP HV counter. You can then execute your code to be "profiled" at that point, and sample the HV counter immediately afterwards. Since the HV counter position when the first cycle of your code began executing is fixed, and can be calculated, as can the delay involved in reading the current HV counter value at the end, you can calculate the exact number of counter steps that occurred during that time. Since the HV counter advances at a faster rate than the 68000 clock rate, with a bit of math to account for the HV counter progression, you should be able to get a cycle-accurate measure of execution time, and one that you can test in code too.

Like all HV counter methods, there's one main flaw with this, in that the HV counter is updated "live", out of sync with reads, and it's possible on the real hardware to get cases where the bit values change mid-read, creating incorrect results, but this is a rare occurrence.

r57shell
Very interested
Posts: 478
Joined: Sun Dec 23, 2012 1:30 pm
Location: Russia

Post by r57shell » Fri Nov 22, 2013 1:09 pm

Oh... god... why? :? So many posts...
The question is totally different.
I want to do the same thing.
OK, here is an example:

Code:

lsl.l #2,d0

and

Code:

add.l d0,d0
add.l d0,d0

Which one is faster? I don't know for sure, but I think the second one.
If you know the timings, you can calculate it yourself.
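As an example of calculating it yourself, here is a small C sketch that adds up cycle counts for the two snippets. The counts are the ones I recall from the MC68000 instruction timing tables (immediate shift on a data register: 6 + 2n for word, 8 + 2n for long; register-to-register add: 4 for word, 8 for long), so please check them against the manual before relying on them. They ignore wait states, DMA and memory refresh. The function names are hypothetical.

```c
#include <assert.h>

/* lsl.w / lsl.l #n,Dn with n in 1..8.
   Assumed timing: 6 + 2n cycles (word), 8 + 2n cycles (long). */
unsigned lsl_imm_cycles(int is_long, unsigned n) {
    return (is_long ? 8u : 6u) + 2u * n;
}

/* add.w / add.l Dn,Dn.
   Assumed timing: 4 cycles (word), 8 cycles (long). */
unsigned add_reg_cycles(int is_long) {
    return is_long ? 8u : 4u;
}
```

By these figures, lsl.l #2,d0 costs 12 cycles and the two add.l instructions cost 16, so for long operands the single shift would come out ahead. For word operands it is the other way round: 10 cycles for the shift against 8 for the two adds.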
BUT what if it is not a small piece of code? Something like this:

Code:

 addq.l  #4,d0
 moveq   #0,d2
 moveq   #$12,d1        ; $12 = 18 iterations
@loop:
 move.l  d2,(a6,d0.l)   ; store through the a6 base plus the d0 index
 addq.l  #4,d0
 subq.w  #1,d1
 bne.s   @loop

and

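Code:

 addq.w  #4,d0          ; word-sized add, unlike addq.l in the first version
 moveq   #0,d1
 move.l  d0,a0          ; pointer taken from d0 alone; the a6 base used above is dropped
 moveq   #$11,d0        ; dbf runs count+1 times: $11+1 = $12 iterations
@loop:
 move.l  d1,(a0)+
 dbf     d0,@loop
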
I think the second one, which is "my optimization", is faster, but again, I don't know for sure.
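The same table-lookup approach works for the two loops. The sketch below is a hand count in C, using cycle figures as I recall them from the MC68000 timing tables. Treat the constants as assumptions to verify. They also ignore wait states, DMA and refresh. The function names are hypothetical, and iterations must be at least 1.

```c
#include <assert.h>

/* Assumed 68000 cycle counts; verify against the timing tables. */
enum {
    MOVEQ                = 4,
    ADDQ_L_DN            = 8,
    ADDQ_W_DN            = 4,
    SUBQ_W_DN            = 4,
    MOVEA_L_DN           = 4,   /* move.l d0,a0 */
    MOVE_L_DN_TO_INDEXED = 18,  /* move.l d2,(a6,d0.l) */
    MOVE_L_DN_TO_POSTINC = 12,  /* move.l d1,(a0)+ */
    BCC_S_TAKEN          = 10,
    BCC_S_NOT_TAKEN      = 8,
    DBF_LOOPING          = 10,  /* counter not yet expired, branch taken */
    DBF_EXPIRED          = 14   /* counter expired, falls through */
};

/* First version: indexed store, addq.l, subq.w, bne.s. */
unsigned first_version_cycles(unsigned iterations) {
    unsigned setup = ADDQ_L_DN + MOVEQ + MOVEQ;
    unsigned body  = MOVE_L_DN_TO_INDEXED + ADDQ_L_DN + SUBQ_W_DN;
    return setup + (iterations - 1) * (body + BCC_S_TAKEN)
                 + body + BCC_S_NOT_TAKEN;
}

/* Second version: post-increment store and dbf. */
unsigned second_version_cycles(unsigned iterations) {
    unsigned setup = ADDQ_W_DN + MOVEQ + MOVEA_L_DN + MOVEQ;
    return setup + (iterations - 1) * (MOVE_L_DN_TO_POSTINC + DBF_LOOPING)
                 + MOVE_L_DN_TO_POSTINC + DBF_EXPIRED;
}
```

For the 18 iterations used in both snippets, this gives 734 cycles for the first version and 416 for the second, which would support the guess that the dbf version is faster.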

I don't think you can test such a thing with the backdrop color, but you could run the code 1000 times or more to make the backdrop color band visible :)
Anyway, there are so many emulators, and none offers a way to get the timing of a chunk of code :(

Ti_
Very interested
Posts: 97
Joined: Tue Aug 30, 2011 7:50 am

Post by Ti_ » Fri Nov 22, 2013 6:16 pm

r57shell wrote: Anyway, there are so many emulators, and none offers a way to get the timing of a chunk of code :(

The standard Gens debugger shows how many cycles each instruction takes :) The 't' hotkey is step into.

If you want to compare large functions like unpacking, you can try the EASy68K IDE.
