Efficient C Question: Global variables
Posted: Fri Mar 23, 2012 1:57 am
by djcouchycouch
Hi,
I've been looking at the documents suggested in the "Best practices for writing C on 68000?" thread and one thing that the docs don't talk about much is global variables.
Are there any performance penalties with using them? Lots of my functions will be using them, and I'm not sure whether having globals "far" from the functions that use them will have a performance impact.
Are there any performance penalties with having a lot of them?
Are there any performance advantages in the order they're declared?
Does it make any difference, performance wise, if they are declared as static?
Thanks!
DJCC
Posted: Fri Mar 23, 2012 3:25 am
by Chilly Willy
Long absolute addressing uses 4 cycles more for each instruction that uses it compared to an offset from an address register or from the pc, or compared to word absolute addressing.
Posted: Fri Mar 23, 2012 8:56 am
by Stef
A problem with global variables is that the compiler cannot keep them in a register as easily as it can with local variables. Using the static keyword may help here, though (the compiler then at least knows the variable cannot be modified from outside the file).
Posted: Fri Mar 23, 2012 10:51 am
by djcouchycouch
Using a global variable incurs a performance penalty, and the docs say to avoid passing parameters as much as possible. I feel like I'm stuck here.
So what's the generally accepted strategy for variables modified by many functions, like player-related data? Position, score, etc, etc.?
Would it be better for a function like, say, UpdatePlayerPosition() to make itself a local copy of the global player position variable, work on it and then copy it back? If I'm working with an array that is global, would simply having a local pointer in the function point to it avoid the performance penalty?
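The copy-in/copy-out idea asked about here could look roughly like the sketch below. Apart from the UpdatePlayerPosition name, everything in it (player_x, player_y, the _Direct and _LocalCopy suffixes, the steps parameter) is made up for illustration. Whether variant B wins depends on the compiler and on how often the values are touched between the copy in and the copy out, so treat it as something to measure rather than a rule.

```c
/* Hypothetical globals standing in for "player-related data". */
short player_x;
short player_y;

/* Variant A: every update reads and writes the globals directly. */
void UpdatePlayerPosition_Direct(short dx, short dy, int steps)
{
    int i;
    for (i = 0; i < steps; ++i) {
        player_x += dx;
        player_y += dy;
    }
}

/* Variant B: copy the globals into locals, work on the locals
   (which the compiler can keep in registers), then copy back once. */
void UpdatePlayerPosition_LocalCopy(short dx, short dy, int steps)
{
    short x = player_x;
    short y = player_y;
    int i;
    for (i = 0; i < steps; ++i) {
        x += dx;
        y += dy;
    }
    player_x = x;
    player_y = y;
}
```

Both variants leave the globals in the same state. If a function only touches each global once, the extra copies in variant B are pure overhead.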
Any ideas?
Thanks!
DJCC
Posted: Fri Mar 23, 2012 12:06 pm
by Shiru
Global variables are more efficient on 8-bit platforms. For the M68K and the various compilers available for it, it's unclear. I think it should just be tested; moving a few variables out of and into a function should not be a big problem.
Posted: Fri Mar 23, 2012 1:57 pm
by djcouchycouch
I'll do some benchmarking this weekend. Things I'll try:
- a function that uses global variables only
- a function that copies global variables to local ones and uses them
- a function that takes parameters
Run each of these about a thousand or ten thousand times and see what it gives us.
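The cases listed above could be timed with a small harness along these lines. This is only a portable sketch built on the standard clock() function; time_function, bump_counter and bench_counter are made-up names, and on the Mega Drive a platform timer would take the place of clock().

```c
#include <time.h>

/* Hypothetical helper: call fn `iterations` times and return the
   elapsed time in clock() ticks. */
clock_t time_function(void (*fn)(void), unsigned long iterations)
{
    unsigned long i;
    clock_t start = clock();
    for (i = 0; i < iterations; ++i) {
        fn();
    }
    return clock() - start;
}

/* Example workload: a global counter, declared volatile so the
   compiler cannot optimize the increments away. */
volatile unsigned long bench_counter;

void bump_counter(void)
{
    bench_counter += 1;
}
```

A call such as time_function(bump_counter, 10000UL) returns the tick count for ten thousand runs. The loop and call overhead is included in every measurement, so the numbers are mainly useful for comparing one variant against another.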
DJCC.
Posted: Fri Mar 23, 2012 4:11 pm
by Charles MacDonald
Check out the GCC calling convention for the 68K:
http://www.makestuff.eu/wordpress/?p=1544
Frankly I'm surprised all parameters go on the stack and nothing is passed in the registers. I always thought the first four parameters were in the data registers and anything beyond that went onto the stack?
If this is true I guess globals would be better, but that seems unusual.
Posted: Fri Mar 23, 2012 5:57 pm
by Chilly Willy
When in doubt, compile to source and check the source. I would think that putting the variables into a struct and then passing the struct pointer would use address register offset addressing to access the fields of the struct. That would be faster than globals.
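A minimal sketch of the struct-pointer approach, assuming made-up names (Player, g_player, Player_Move, Player_AddScore). The idea is that once the struct's address is in an address register, each field is a fixed offset from it. Whether the compiler really emits register-plus-displacement addressing for these accesses is an assumption to verify in the generated assembly.

```c
/* Hypothetical player state grouped into one struct. */
typedef struct {
    short x;
    short y;
    unsigned long score;
} Player;

/* A single global instance; functions receive its address. */
Player g_player;

/* Each field access is a fixed offset from the pointer p. */
void Player_Move(Player *p, short dx, short dy)
{
    p->x += dx;
    p->y += dy;
}

void Player_AddScore(Player *p, unsigned long points)
{
    p->score += points;
}
```

Callers would write Player_Move(&g_player, 1, 0). The pointer costs one parameter per call, which has to be weighed against the cheaper field accesses inside the function.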
Posted: Fri Mar 23, 2012 6:07 pm
by djcouchycouch
Chilly Willy wrote:I would think that putting the variables into a struct and then passing the struct pointer would use address register offset addressing to access the fields of the struct. That would be faster than globals.
I'll add that to the benchmarking tests.
When in doubt, compile to source and check the source.
As I'm no assembler expert, I'll run the tests and post the results with compiled source this weekend.
Posted: Sat Mar 24, 2012 2:49 pm
by djcouchycouch
So here's what I have. Feedback on the testing methodology is very welcome since I'm probably forgetting something.
If you can think of other cases to add, that'd be cool too.
Using SGDK 0.9 unmodified, using its default settings for gcc and everything.
The test program runs six scenarios, each a hundred thousand times:
1 - a function that modifies global variables directly
2 - a function that makes a local copy of global variables, modifies them and copies them back to the global variables
3 - a function that takes a pointer to a global struct and modifies it
4 - a function that modifies a global struct directly
5 - a function that takes a pointer to a struct as a parameter
6 - a function that takes a pointer to a struct that is static as a parameter
Using SGDK's getSubTick(), these are the numbers I've gotten:
EDIT: Whoops again. I was running in an emulator, not on real hardware. D'oh. Updated with numbers from real hardware.
1 - 127610 (was 123185 in Gens + Rewind 1.0)
2 - 201790 (was 187845 in Gens + Rewind 1.0)
3 - 127610 (was 123205 in Gens + Rewind 1.0)
4 - 127610 (was 123205 in Gens + Rewind 1.0)
5 - 143005 (was 139360 in Gens + Rewind 1.0)
6 - 143035 (was 139360 in Gens + Rewind 1.0)
EDIT: It now looks like modifying globals takes the same amount of time whether they are standalone variables, structs, or structs accessed through a pointer.
You can find the program code at
http://dl.dropbox.com/u/17303735/performancetest001.c
I don't know how to output the C code as assembly, though. Sorry about that.
So what do you think?
DJCC
Posted: Sat Mar 24, 2012 3:28 pm
by TmEE co.(TM)
...but modifying globals directly number is smallest ...?
Posted: Sat Mar 24, 2012 3:30 pm
by djcouchycouch
TmEE co.(TM) wrote:...but modifying globals directly number is smallest ...?
Whoops. I'll correct the post.
Posted: Sat Mar 24, 2012 3:46 pm
by Shiru
It would be interesting to test a simple time-wasting loop that uses a few variables as counters, and check whether there is a difference when these variables are local versus global.
I mean, something like
Code:
unsigned short cnt1;
unsigned long cnt2, cnt3 = 0;   /* start the accumulator at a known value */

for (cnt1 = 0; cnt1 < 50000; ++cnt1)
{
    for (cnt2 = 0; cnt2 < 10000; ++cnt2)
    {
        cnt3 += 10;
    }
}
The reason I'm not testing it myself is that I don't have a setup for it, as I don't use Stef's SDK. It would take me a while to prepare the test; you can probably do it much faster, since you already have the test code ready.
Posted: Sat Mar 24, 2012 6:29 pm
by djcouchycouch
Hi Shiru,
I've updated what I'll call PerfTest with your suggestions, although I reduced the number of iterations because the runs were really long.
I've also renamed functions to make them more clear, and added descriptions on screen when running the tests.
Same place:
http://dl.dropbox.com/u/17303735/performancetest001.c
I've added a compiled PerfTests.bin for people to try out.
http://dl.dropbox.com/u/17303735/PerfTests.bin
I've also just noticed a fatal flaw in my numbers. I was running it in Gens and not real hardware. DOH!
Here it is running on hardware:
http://dl.dropbox.com/u/17303735/IMG_1853.JPG
I've also updated the numbers I posted above.
Posted: Sat Mar 24, 2012 6:35 pm
by djcouchycouch
These are the tests based on Shiru's suggestions:
Code:
/* Test 1: all counters are ordinary locals. */
void TestFunction_InnerLoopWithLocalVariables(void)
{
    unsigned short cnt1;
    unsigned long cnt2, cnt3 = 0;   /* initialized to avoid reading an indeterminate value */

    for (cnt1 = 0; cnt1 < 500; ++cnt1)
    {
        for (cnt2 = 0; cnt2 < 100; ++cnt2)
        {
            cnt3 += 10;
        }
    }
}

unsigned short global_cnt1;
unsigned long global_cnt2;
unsigned long global_cnt3;

/* Test 2: all counters are globals. */
void TestFunction_InnerLoopWithGlobalVariables(void)
{
    for (global_cnt1 = 0; global_cnt1 < 500; ++global_cnt1)
    {
        for (global_cnt2 = 0; global_cnt2 < 100; ++global_cnt2)
        {
            global_cnt3 += 10;
        }
    }
}

/* Test 3: counters declared in the innermost scope that uses them;
   note that cnt3 is static here, so it lives in memory like a global. */
void TestFunction_InnerLoopWithLocalVariables2(void)
{
    unsigned short cnt1;

    for (cnt1 = 0; cnt1 < 500; ++cnt1)
    {
        unsigned long cnt2;

        for (cnt2 = 0; cnt2 < 100; ++cnt2)
        {
            static unsigned long cnt3 = 0;
            cnt3 += 10;
        }
    }
}

/* Test 4: all counters are function-local statics. */
void TestFunction_InnerLoopWithLocalStaticVariables(void)
{
    static unsigned short cnt1;
    static unsigned long cnt2, cnt3;

    for (cnt1 = 0; cnt1 < 500; ++cnt1)
    {
        for (cnt2 = 0; cnt2 < 100; ++cnt2)
        {
            cnt3 += 10;
        }
    }
}
Benchmark times on a Genesis II were:
1 - 14530 // local variables
2 - 18560 // global variables
3 - 18630 // local variables declared only before getting used
4 - 18630 // local variables are static
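Put in relative terms, tests 2 to 4 take roughly 28% longer than test 1. The small helper below (slowdown_tenths_percent is a made-up name) shows the arithmetic using integers only, returning the slowdown in tenths of a percent.

```c
/* Slowdown of `ticks` relative to `baseline`, in tenths of a percent,
   rounded to the nearest tenth. Assumes baseline > 0 and ticks >= baseline. */
long slowdown_tenths_percent(long baseline, long ticks)
{
    return ((ticks - baseline) * 1000L + baseline / 2) / baseline;
}
```

With the numbers above, slowdown_tenths_percent(14530, 18560) gives 277 (27.7%) and slowdown_tenths_percent(14530, 18630) gives 282 (28.2%). One caveat: in the local-variable test the result of the loop is never read, so an optimizing compiler is free to simplify it, and part of the gap may come from that rather than from addressing modes alone. The assembly output would settle it.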