Use a pointer multiple times, or assign pointer to local var

SGDK only sub forum

Moderator: Stef

BroOfTheSun
Interested
Posts: 33
Joined: Fri Dec 12, 2014 2:41 am
Location: USA - Chicago, IL

Post by BroOfTheSun » Sun Jan 25, 2015 1:09 am

I was wondering about the difference between dereferencing a pointer multiple times and copying the pointed-to value into a local variable, then using that local variable throughout the code. Would performance be better, worse, or about the same with the local variable?

Here is an example. Which would be faster?

Code:

if(obj->speed > FIX32(0)) {
    obj->speed -= DECELERATION;

    if(obj->speed <= FIX32(0))
        obj->speed = FIX32(0);
}
else if(obj->speed < FIX32(0)) {
    obj->speed += DECELERATION;

    if(obj->speed >= FIX32(0))
        obj->speed = FIX32(0);
}
else
    obj->speed = FIX32(0);

OR

Code:

fix32 speed = obj->speed;

if(speed > FIX32(0)) {
    speed -= DECELERATION;

    if(speed <= FIX32(0))
        speed = FIX32(0);
}
else if(speed < FIX32(0)) {
    speed += DECELERATION;

    if(speed >= FIX32(0))
        speed = FIX32(0);
}
else
    speed = FIX32(0);

obj->speed = speed;

r57shell
Very interested
Posts: 478
Joined: Sun Dec 23, 2012 1:30 pm
Location: Russia

Post by r57shell » Sun Jan 25, 2015 8:21 am

Check out the generated assembly.

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sun Jan 25, 2015 11:54 am

Generally the second form is better.
If the compiler is smart enough it can optimize the first form by itself, but it may also assume that the object can be modified externally and therefore not optimize the accesses, so definitely go with the second one ;)
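
To illustrate the point about external modification, here is a minimal sketch that is not taken from anyone's project in this thread. It assumes a stand-in `fix32` typedef and `FIX32` macro (SGDK provides its own), plus a hypothetical `Object` struct, `DECELERATION` value and `counter` parameter. Because `counter` points to the same type as `obj->speed`, the compiler has to allow for both referring to the same memory, so in the first function it cannot simply keep `obj->speed` in a register across the store to `*counter`.

```c
#include <stdint.h>

/* Stand-ins for illustration; SGDK defines its own fix32 and FIX32. */
typedef int32_t fix32;
#define FIX32(v)     ((fix32)((v) * 1024))
#define DECELERATION FIX32(0.25)   /* hypothetical value */

typedef struct {
    fix32 speed;
} Object;                          /* hypothetical struct */

/* Field version: the store to *counter might alias obj->speed,
   so obj->speed has to be read from memory again afterwards. */
void slow_down_field(Object *obj, fix32 *counter)
{
    if (obj->speed > FIX32(0)) {
        *counter += 1;
        obj->speed -= DECELERATION;
        if (obj->speed <= FIX32(0))
            obj->speed = FIX32(0);
    }
}

/* Local version: speed is read once and written back once; the store
   to *counter cannot change the local copy. */
void slow_down_local(Object *obj, fix32 *counter)
{
    fix32 speed = obj->speed;

    if (speed > FIX32(0)) {
        *counter += 1;
        speed -= DECELERATION;
        if (speed <= FIX32(0))
            speed = FIX32(0);
    }
    obj->speed = speed;
}
```

The two functions behave identically as long as `counter` does not point at `obj->speed`. Whether the second one is actually faster still depends on the compiler and its settings.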

Manveru
Very interested
Posts: 85
Joined: Wed Sep 05, 2012 3:30 pm

Post by Manveru » Sun Jan 25, 2015 2:39 pm

I remember getting better results with the first version. After reading some coding tips, I found out that the first time obj->speed is accessed it is placed in the cache, so subsequent accesses (as long as the cache has not been overwritten) are a lot faster than the first one, so creating a new variable on each vsync should be slower. Run a stress test to check which performs best.

I also think that if the speed field is the first member of the obj struct, accessing it will be as fast as accessing a regular variable.
The man who moves a mountain begins by carrying away small stones. Confucius, 551-479 BC

r57shell
Very interested
Posts: 478
Joined: Sun Dec 23, 2012 1:30 pm
Location: Russia

Post by r57shell » Sun Jan 25, 2015 2:53 pm

r57shell wrote:check out generated assembly
:x

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Sun Jan 25, 2015 10:49 pm

r57shell wrote:check out generated assembly
This. The result will vary with the version of GCC and the optimization level (-O) used, among other things. You cannot really tell without using the switch that saves the generated assembly (-S) and checking what the compiler actually produced.
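
As a rough sketch of how that check could be set up, the two variants from the first post can be wrapped in functions in one small file and compiled with GCC's `-S` option, which stops after compilation and writes the assembly to a file. The file name, function names, `Object` struct, `DECELERATION` value and the stand-in `fix32`/`FIX32` definitions below are assumptions for illustration; with SGDK you would include its headers and use its cross compiler instead.

```c
/* decel.c: the two variants from the first post, wrapped in functions.
 *
 * To keep the generated assembly, pass -S to gcc, for example:
 *   gcc -O1 -S decel.c -o decel.s
 * With a cross toolchain, substitute its gcc (for instance m68k-elf-gcc;
 * that name is an assumption about your setup) and repeat with the
 * optimization level your project really uses.
 */
#include <stdint.h>

/* Stand-ins for illustration; SGDK defines its own fix32 and FIX32. */
typedef int32_t fix32;
#define FIX32(v)     ((fix32)((v) * 1024))
#define DECELERATION FIX32(0.5)    /* hypothetical value */

typedef struct {
    fix32 speed;
} Object;                          /* hypothetical struct */

/* Variant 1: every access goes through the pointer. */
void decel_field(Object *obj)
{
    if (obj->speed > FIX32(0)) {
        obj->speed -= DECELERATION;
        if (obj->speed <= FIX32(0))
            obj->speed = FIX32(0);
    } else if (obj->speed < FIX32(0)) {
        obj->speed += DECELERATION;
        if (obj->speed >= FIX32(0))
            obj->speed = FIX32(0);
    } else {
        obj->speed = FIX32(0);
    }
}

/* Variant 2: work on a local copy and store it back once. */
void decel_local(Object *obj)
{
    fix32 speed = obj->speed;

    if (speed > FIX32(0)) {
        speed -= DECELERATION;
        if (speed <= FIX32(0))
            speed = FIX32(0);
    } else if (speed < FIX32(0)) {
        speed += DECELERATION;
        if (speed >= FIX32(0))
            speed = FIX32(0);
    } else {
        speed = FIX32(0);
    }
    obj->speed = speed;
}
```

Comparing the bodies of `decel_field` and `decel_local` in the resulting `decel.s` shows how many memory accesses each variant really performs with that compiler and optimization level.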

Manveru
Very interested
Posts: 85
Joined: Wed Sep 05, 2012 3:30 pm

Post by Manveru » Mon Jan 26, 2015 1:03 am

That's right, this is the best way, but you can also run some tests to find out which code performs better, especially if you know nothing about ASM and for that reason work in C instead of ASM to develop Mega Drive games, like me :oops:

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Jan 26, 2015 9:17 am

Manveru wrote: I remember getting better results with the first version. After reading some coding tips, I found out that the first time obj->speed is accessed it is placed in the cache, so subsequent accesses (as long as the cache has not been overwritten) are a lot faster than the first one, so creating a new variable on each vsync should be slower. Run a stress test to check which performs best.

I also think that if the speed field is the first member of the obj struct, accessing it will be as fast as accessing a regular variable.
The problem is all about that "cache": the compiler cannot always determine whether speed has changed, so a local variable effectively acts as a temporary cache that the compiler can keep in a register.
At least with the GCC I'm using with SGDK, using a local variable helps a lot to speed up the code.

Manveru
Very interested
Posts: 85
Joined: Wed Sep 05, 2012 3:30 pm

Post by Manveru » Mon Jan 26, 2015 10:15 am

Stef wrote: At least with the GCC I'm using with SGDK, using a local variable helps a lot to speed up the code.
So it is true that performance and the generated code can change a lot between different versions of GCC.

I didn't know much about this, but after reading up on sites like Stack Overflow, I ran some tests to compare performance in a few situations, and in this case, with my GCC version, I got better results re-accessing the fields than creating extra variables. So we need to check what our compiler does, either with tests or by looking at the assembly if we understand it :P

BroOfTheSun
Interested
Posts: 33
Joined: Fri Dec 12, 2014 2:41 am
Location: USA - Chicago, IL

Post by BroOfTheSun » Mon Jan 26, 2015 1:50 pm

Thanks for the replies. I figured I could test this to see which version performs best, but thought there might be a more general answer. I will check out the generated assembly. I am using SGDK, so it looks like option 2 has the best performance.

Chilly Willy
Very interested
Posts: 2984
Joined: Fri Aug 17, 2007 9:33 pm

Post by Chilly Willy » Mon Jan 26, 2015 8:05 pm

Manveru wrote:
Stef wrote: At least with the GCC I'm using with SGDK, using a local variable helps a lot to speed up the code.
So it is true that performance and the generated code can change a lot between different versions of GCC.

I didn't know much about this, but after reading up on sites like Stack Overflow, I ran some tests to compare performance in a few situations, and in this case, with my GCC version, I got better results re-accessing the fields than creating extra variables. So we need to check what our compiler does, either with tests or by looking at the assembly if we understand it :P
My guess is that it's leaving the pointer in an address register and using address-register-relative addressing to access the variable, which needs only a word to reach the variable rather than a long absolute pointer. Of course, you could also try "register fix32 speed = obj->speed;" for even better speed.
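
A minimal sketch of that suggestion follows, again with stand-in `fix32`/`FIX32` definitions and a hypothetical `Object` struct, `DECELERATION` value and function name. In C, `register` is only a hint that the compiler is free to ignore; the one guaranteed effect is that the address of such a variable cannot be taken.

```c
#include <stdint.h>

/* Stand-ins for illustration; SGDK defines its own fix32 and FIX32. */
typedef int32_t fix32;
#define FIX32(v)     ((fix32)((v) * 1024))
#define DECELERATION FIX32(0.5)    /* hypothetical value */

typedef struct {
    fix32 speed;
} Object;                          /* hypothetical struct */

void decel_register(Object *obj)
{
    /* Hint only: the compiler decides whether speed really lives in a
       register. Writing &speed here would be a compile error. */
    register fix32 speed = obj->speed;

    if (speed > FIX32(0)) {
        speed -= DECELERATION;
        if (speed <= FIX32(0))
            speed = FIX32(0);
    } else if (speed < FIX32(0)) {
        speed += DECELERATION;
        if (speed >= FIX32(0))
            speed = FIX32(0);
    }
    obj->speed = speed;
}
```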

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Mon Jan 26, 2015 8:18 pm

Generally, when you have few local variables, GCC keeps them in registers as soon as you enable any level of optimization, so you never really need the register keyword. When you have many local variables, adding it helps the compiler decide which ones to cache in registers ;)

Manveru
Very interested
Posts: 85
Joined: Wed Sep 05, 2012 3:30 pm

Post by Manveru » Mon Jan 26, 2015 11:12 pm

Chilly Willy wrote: My guess is that it's leaving the pointer in an address register and using address-register-relative addressing to access the variable, which needs only a word to reach the variable rather than a long absolute pointer. Of course, you could also try "register fix32 speed = obj->speed;" for even better speed.
I think I have read that the register keyword is not recommended because "newer" compilers either apply it automatically or ignore it. Anyway, when you access a variable or pointer it is kept in cache, so accessing it again is a lot faster. So if the code reuses this variable or pointer it will be accessed very quickly, at least with the GCC version I use.

For this and some other situations I ran stress tests, and in the case we are talking about, I got a higher number of sprites on screen without slowdown. I admit, of course, that it depends on the compiler and GCC version; that's why I am simply saying that you should run some tests to compare performance in your own code (if you can't read the assembly, of course, which is always the best way).
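
One possible shape for such a comparison is sketched below. It is not the test used in this post: it runs on a host machine with the standard `clock()` function, whereas on the console itself you would need a timing source from your SDK or a visible measure such as the sprite count mentioned above. The `Object` struct, `DECELERATION` value, `NUM_OBJECTS`, the function names and the stand-in `fix32`/`FIX32` definitions are all assumptions, and the update functions are condensed versions of the two variants from the first post.

```c
#include <stdint.h>
#include <time.h>

/* Stand-ins for illustration; SGDK defines its own fix32 and FIX32. */
typedef int32_t fix32;
#define FIX32(v)     ((fix32)((v) * 1024))
#define DECELERATION FIX32(0.5)    /* hypothetical value */
#define NUM_OBJECTS  80            /* hypothetical object count */

typedef struct {
    fix32 speed;
} Object;                          /* hypothetical struct */

typedef void (*UpdateFn)(Object *obj);

/* Variant 1: access the field through the pointer each time. */
void update_field(Object *obj)
{
    if (obj->speed > FIX32(0)) {
        obj->speed -= DECELERATION;
        if (obj->speed <= FIX32(0))
            obj->speed = FIX32(0);
    } else if (obj->speed < FIX32(0)) {
        obj->speed += DECELERATION;
        if (obj->speed >= FIX32(0))
            obj->speed = FIX32(0);
    }
}

/* Variant 2: copy to a local, work on it, store it back once. */
void update_local(Object *obj)
{
    fix32 speed = obj->speed;

    if (speed > FIX32(0)) {
        speed -= DECELERATION;
        if (speed <= FIX32(0))
            speed = FIX32(0);
    } else if (speed < FIX32(0)) {
        speed += DECELERATION;
        if (speed >= FIX32(0))
            speed = FIX32(0);
    }
    obj->speed = speed;
}

/* Runs `update` on every object for `frames` simulated frames and
   returns the elapsed processor time in clock() ticks. */
clock_t time_updates(UpdateFn update, Object *objs, int count, int frames)
{
    clock_t start = clock();
    int f, i;

    for (f = 0; f < frames; f++) {
        for (i = 0; i < count; i++) {
            objs[i].speed = FIX32(8);   /* reset so each call does work */
            update(&objs[i]);
        }
    }
    return clock() - start;
}
```

Calling `time_updates(update_field, objs, NUM_OBJECTS, frames)` and then the same with `update_local` gives two tick counts to compare. The call through a function pointer adds the same overhead to both, and `frames` has to be large enough for the difference to rise above timer resolution.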
