Efficient C Question: Global variables

Ask anything you want about Megadrive/Genesis programming.

Moderator: BigEvilCorporation

djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

Hi,

I've been looking at the documents suggested in the "Best practices for writing C on 68000?" thread and one thing that the docs don't talk about much is global variables.

Are there any performance penalties with using them? Lots of my functions will be using them and I'm not sure whether having globals be "far" from the functions that use them will have a performance impact.

Are there any performance penalties with having a lot of them?
Are there any performance advantages in the order they're declared?
Does it make any difference, performance wise, if they are declared as static?

Thanks!
DJCC
Chilly Willy
Very interested
Posts: 2993
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy »

Long absolute addressing costs 4 more cycles per instruction that uses it than an offset from an address register or from the PC, or than word absolute addressing.
Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef »

A problem with global variables is that the compiler cannot keep them in a register as easily as it can with local variables. Using the static keyword may help here, though (the compiler then at least knows the variable cannot be modified from outside the file).
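
As a rough illustration of the point about static (a sketch only; the names frame_total, add_samples and get_frame_total are made up here and do not come from the thread): a file-scope static variable cannot be touched from other files, and accumulating in a local gives the compiler the best chance of keeping the running value in a register. Whether a given GCC version actually does so has to be checked in the generated assembly.

```c
/* Hypothetical names for illustration; not from the thread's test program. */

/* File-local: no other translation unit can read or write this. */
static unsigned long frame_total;

/* Accumulate in a local, which the compiler is free to keep in a register,
   then write the result back to the static variable once. */
void add_samples(const unsigned short *samples, unsigned short count)
{
    unsigned long sum = frame_total;
    unsigned short i;

    for (i = 0; i < count; ++i)
        sum += samples[i];

    frame_total = sum;
}

/* Accessor, since the variable itself is not visible outside this file. */
unsigned long get_frame_total(void)
{
    return frame_total;
}
```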
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

Using a global variable incurs a performance penalty, and the docs say to avoid passing parameters as much as possible. I feel like I'm stuck here.

So what's the generally accepted strategy for variables modified by many functions, like player-related data? Position, score, etc, etc.?

Would it be better for a function like, say, UpdatePlayerPosition() to make itself a local copy of the global player position variable, work on it and then copy it back? If I'm working with an array that is global, would simply having a local pointer in the function point to it avoid the performance penalty?

Any ideas?

Thanks!
DJCC
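
The two ideas asked about above could look roughly like this. This is a sketch under assumptions: the globals player_x, player_y and enemy_x, the MAX_ENEMIES constant, the clamping rule and ScrollEnemies() are all invented for illustration; only the name UpdatePlayerPosition() comes from the post. Whether either form is actually faster is an open question that has to be measured.

```c
/* Hypothetical global state; names are illustrative, not from the thread. */
#define MAX_ENEMIES 8

short player_x, player_y;
short enemy_x[MAX_ENEMIES];

/* Copy-in / copy-out: read the globals once, work on locals,
   then write the results back once. */
void UpdatePlayerPosition(short dx, short dy)
{
    short x = player_x;
    short y = player_y;

    x += dx;
    y += dy;
    if (x < 0) x = 0;   /* made-up rule: clamp to the top-left corner */
    if (y < 0) y = 0;

    player_x = x;
    player_y = y;
}

/* Local pointer to a global array: walk the array through a pointer
   instead of indexing the global on every access. */
void ScrollEnemies(short dx)
{
    short *p = enemy_x;
    short *end = enemy_x + MAX_ENEMIES;

    while (p < end)
        *p++ += dx;
}
```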
Shiru
Very interested
Posts: 786
Joined: Sat Apr 07, 2007 3:11 am
Location: Russia, Moscow

Post by Shiru »

Global variables are more efficient on 8-bit platforms. For the M68K and the various compilers available it is less clear. I think it should just be tested; moving a few variables into and out of a function should not be a big problem.
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

I'll do some benchmarking this weekend. Things I'll try:

- a function that uses global variables only
- a function that copies global variables to local ones and uses them
- a function that takes parameters

Run each of these about a thousand or ten thousand times and see what it gives us.

DJCC.
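
A minimal sketch of what such a timing loop might look like. The tick source is passed in as a function pointer so the sketch does not depend on any SDK; on the Mega Drive it would be a small wrapper around whatever timer the SDK provides. RunBenchmark, TickFn, TestFn and the Fake* helpers are hypothetical names, not part of any library.

```c
/* Hypothetical helper types; not part of any SDK. */
typedef unsigned long (*TickFn)(void);   /* returns the current tick count */
typedef void (*TestFn)(void);            /* the function being measured    */

/* Call test() `iterations` times and return the elapsed ticks.
   Loop and call overhead is included, but it is the same for every
   test function, so the results stay comparable with each other. */
unsigned long RunBenchmark(TestFn test, unsigned long iterations, TickFn now)
{
    unsigned long start, i;

    start = now();
    for (i = 0; i < iterations; ++i)
        test();
    return now() - start;
}

/* Host-side stand-ins so the sketch can be tried off-target. */
static unsigned long fake_ticks;
static unsigned long FakeNow(void) { return fake_ticks; }
static void FakeWork(void) { fake_ticks += 3; }   /* pretends a call costs 3 ticks */
```

Calling RunBenchmark(FakeWork, 1000, FakeNow) with the stand-ins above reports 1000 calls at 3 ticks each.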
Charles MacDonald
Very interested
Posts: 292
Joined: Sat Apr 21, 2007 1:14 am

Post by Charles MacDonald »

Check out the GCC calling convention for the 68K:

http://www.makestuff.eu/wordpress/?p=1544

Frankly I'm surprised all parameters go on the stack and nothing is passed in the registers. I always thought the first four parameters were in the data registers and anything beyond that went onto the stack?

If this is true I guess globals would be better, but that seems unusual.
Chilly Willy
Very interested
Posts: 2993
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy »

When in doubt, compile to source and check the source. I would think that putting the variables into a struct and then passing the struct pointer would use address register offset addressing to access the fields of the struct. That would be faster than globals.
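
A sketch of the struct-pointer approach, assuming a made-up Player layout (the type, field names and functions below are illustrative only). With the base address held in an address register, each field can be reached at a small fixed offset from it; whether the compiler really emits that addressing mode is what checking the generated assembly would confirm.

```c
/* Hypothetical player record; field names are illustrative. */
typedef struct {
    short x;
    short y;
    unsigned long score;
} Player;

/* Still a single global object, but functions reach it through a pointer. */
Player g_player;

/* Every access goes through p, so the fields can be addressed
   relative to one base pointer. */
void Player_Move(Player *p, short dx, short dy)
{
    p->x += dx;
    p->y += dy;
}

void Player_AddScore(Player *p, unsigned long points)
{
    p->score += points;
}
```

For the assembly check, GCC's -S option stops after compilation and writes the generated assembly to a .s file; the name of the cross-compiler binary depends on the toolchain in use.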
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

Chilly Willy wrote: I would think that putting the variables into a struct and then passing the struct pointer would use address register offset addressing to access the fields of the struct. That would be faster than globals.
I'll add that to the benchmarking tests.
Chilly Willy wrote: When in doubt, compile to source and check the source.
As I'm no assembler expert, I'll run the tests and post the results with compiled source this weekend.
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

So here's what I have. Feedback on the testing methodology is very welcome since I'm probably forgetting something.

If you can think of other cases to add, that'd be cool too.

Using SGDK 0.9 unmodified, using its default settings for gcc and everything.

The test program runs six scenarios, each a hundred thousand times:
1 - a function that modifies global variables directly
2 - a function that makes a local copy of global variables, modifies them and copies them back to the global variables
3 - a function that takes a pointer to a global struct and modifies it
4 - a function that modifies a global struct directly
5 - a function that takes a pointer to a struct as a parameter
6 - a function that takes a pointer to a struct that is static as a parameter

Using SGDK's getSubTick(), these are the numbers I've gotten:
EDIT: Whoops again. I was running in an emulator, not on real hardware. D'oh. Updated with numbers from real hardware.
1 - 127610 (was 123185 in Gens + Rewind 1.0)
2 - 201790 (was 187845 in Gens + Rewind 1.0)
3 - 127610 (was 123205 in Gens + Rewind 1.0)
4 - 127610 (was 123205 in Gens + Rewind 1.0)
5 - 143005 (was 139360 in Gens + Rewind 1.0)
6 - 143035 (was 139360 in Gens + Rewind 1.0)

EDIT: It now looks like modifying globals directly, whether they are standalone variables, structs or reached through a pointer to a struct, takes the same amount of time.
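
To put the real-hardware numbers above in relative terms, here is a small integer-only helper (OverheadPerMille is a hypothetical name, not from the test program). With scenario 1 as the baseline, scenario 2 (local copies) comes out roughly 58% slower and scenarios 5 and 6 (pointer passed as a parameter) roughly 12% slower.

```c
/* Overhead of `measured` relative to `baseline`, in tenths of a percent,
   rounded down. Assumes baseline > 0 and measured >= baseline.
   Integer-only, so it would also work without floating point support. */
unsigned long OverheadPerMille(unsigned long baseline, unsigned long measured)
{
    return ((measured - baseline) * 1000UL) / baseline;
}

/* Applied to the real-hardware figures posted above:
   OverheadPerMille(127610, 201790) -> 581  (about 58.1% slower)
   OverheadPerMille(127610, 143005) -> 120  (about 12.0% slower) */
```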

You can find the program code at http://dl.dropbox.com/u/17303735/performancetest001.c

I don't know how to output the C code as assembly, though. Sorry about that.

So what do you think?

DJCC
Last edited by djcouchycouch on Sat Mar 24, 2012 6:28 pm, edited 2 times in total.
TmEE co.(TM)
Very interested
Posts: 2452
Joined: Tue Dec 05, 2006 1:37 pm
Location: Estonia, Rapla City

Post by TmEE co.(TM) »

...but modifying globals directly number is smallest ...?
What are you reading? You won't understand it anyway ;)
http://www.tmeeco.eu
Files of all broken links and images of mine are found here : http://www.tmeeco.eu/FileDen
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

TmEE co.(TM) wrote: ...but modifying globals directly number is smallest ...?
Whoops. I'll correct the post.
Shiru
Very interested
Posts: 786
Joined: Sat Apr 07, 2007 3:11 am
Location: Russia, Moscow

Post by Shiru »

It would be interesting to test a simple time-wasting loop that uses a few variables as counters, and check whether there is a difference when these variables are local versus global.

I mean, something like

Code: Select all

unsigned short cnt1;
unsigned long cnt2,cnt3=0; /* start from a known value */

for(cnt1=0;cnt1<50000;++cnt1)
{
  for(cnt2=0;cnt2<10000;++cnt2)
  {
    cnt3+=10;
  }
}
The reason I'm not testing it myself is that I don't have a setup for it, as I don't use Stef's SDK. It would take me time to prepare the test; maybe you can test it much faster, having the test code ready.
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

Hi Shiru,

I've updated what I'll call PerfTest with your suggestions, although I reduced the number of iterations because the runs were really long :)

I've also renamed functions to make them more clear, and added descriptions on screen when running the tests.

Same place:

http://dl.dropbox.com/u/17303735/performancetest001.c

I've added a compiled PerfTests.bin for people to try out.

http://dl.dropbox.com/u/17303735/PerfTests.bin


I've also just noticed a fatal flaw in my numbers. I was running it in Gens and not real hardware. DOH!

Here it is running on hardware:

http://dl.dropbox.com/u/17303735/IMG_1853.JPG

I've also updated the numbers I posted above.
djcouchycouch
Very interested
Posts: 710
Joined: Sat Feb 18, 2012 2:44 am

Post by djcouchycouch »

These are the tests based on Shiru's suggestions:

Code: Select all

void TestFunction_InnerLoopWithLocalVariables()
{
    unsigned short cnt1; 
    unsigned long cnt2, cnt3 = 0; /* initialized: reading an uninitialized local is undefined behavior */

    for(cnt1=0;cnt1<500;++cnt1) 
    { 
        for(cnt2=0;cnt2<100;++cnt2) 
        { 
          cnt3+=10; 
        } 
    }
}

unsigned short global_cnt1; 
unsigned long global_cnt2;
unsigned long global_cnt3; 


void TestFunction_InnerLoopWithGlobalVariables()
{

    for(global_cnt1 = 0; global_cnt1 < 500; ++global_cnt1) 
    { 
        for(global_cnt2 = 0; global_cnt2 < 100;++global_cnt2) 
        { 
          global_cnt3+=10; 
        } 
    }
}


void TestFunction_InnerLoopWithLocalVariables2()
{
    unsigned short cnt1; 

    for(cnt1=0;cnt1<500;++cnt1) 
    { 
        unsigned long cnt2;
        for(cnt2=0;cnt2<100;++cnt2) 
        { 
          static unsigned long cnt3 = 0; 

          cnt3+=10; 
        } 
    }
}


void TestFunction_InnerLoopWithLocalStaticVariables()
{
    static unsigned short cnt1; 
    static unsigned long cnt2,cnt3; 

    for(cnt1=0;cnt1<500;++cnt1) 
    { 
        for(cnt2=0;cnt2<100;++cnt2) 
        { 
          cnt3+=10; 
        } 
    }
}

Benchmark times on a Genesis II were
1 - 14530 // local variables
2 - 18560 // global variables
3 - 18630 // local counters declared just before use, with cnt3 static
4 - 18630 // all local variables are static