Cache Coherency Sanity Check

Ask anything you want about 32X Mushroom programming.

Moderator: BigEvilCorporation

Mask of Destiny
Very interested
Posts: 616
Joined: Thu Nov 30, 2006 6:30 am

Cache Coherency Sanity Check

Post by Mask of Destiny » Fri Feb 16, 2007 6:01 pm

So I'm thinking about porting a multi-threaded program I'm working on (an interpreter for a dataflow language, if you're curious) to the 32X (or perhaps the Saturn), and I was trying to think of a sane way to use both processors reasonably efficiently. I think I have a reasonable solution, but I'd like some feedback to make sure I haven't missed anything obvious. So here goes:

Threads locked to the processor they started on
Code and stack accessed through cached memory region
Globals and heap accessed through non-cached memory region

The logic behind this is that code is read-only, so we don't have any coherency problems there. Each stack will only be touched by the single thread it belongs to, and since threads can't move between processors, only one processor will ever look at a given stack. Globals and heap, on the other hand, are potentially shared by all threads, and manually flushing everything is going to be difficult to do properly and probably not very performant (apart from some special cases where the data is mostly read-only, but those I can handle as exceptions if there's enough performance to be gained).
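As a rough sketch of what the cached/non-cached split could look like in C: on the SH-2 (SH7604), the same physical memory is visible through a cached region and through a cache-through mirror selected by bit 29 of the address. The SDRAM base and size below are my assumptions for the 32X memory map, and the helper names are made up for illustration, so treat this as a sketch rather than a tested implementation.

```c
#include <assert.h>
#include <stdint.h>

/* SH7604 convention: setting bit 29 selects the cache-through
 * mirror of the same physical memory. */
#define SH2_CACHE_THROUGH_BIT 0x20000000u

/* Assumption: 32X SDRAM as seen through the cached region. */
#define SDRAM_CACHED_BASE 0x06000000u
#define SDRAM_SIZE        0x00040000u /* 256 KB */

/* Hypothetical helper: map a cached SDRAM address to its
 * cache-through alias (for globals and heap). */
static inline uint32_t sh2_uncached_addr(uint32_t cached_addr)
{
    /* Only addresses inside the assumed SDRAM window are accepted. */
    assert(cached_addr >= SDRAM_CACHED_BASE &&
           cached_addr - SDRAM_CACHED_BASE < SDRAM_SIZE);
    return cached_addr | SH2_CACHE_THROUGH_BIT;
}

/* Hypothetical helper: map a cache-through alias back to the
 * cached address (for code and stacks). */
static inline uint32_t sh2_cached_addr(uint32_t addr)
{
    return addr & ~SH2_CACHE_THROUGH_BIT;
}
```

On the real hardware the allocator would hand out heap pointers with the cache-through bit already set, and shared globals would be reached through `volatile` pointers to the aliased address, while code and stacks keep using the plain cached addresses.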

Will this approach work? Does it sound like a good compromise between performance and complexity?

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Re: Cache Coherency Sanity Check

Post by Stef » Fri Feb 16, 2007 11:00 pm

Mask of Destiny wrote: So I'm thinking about porting a multi-threaded program I'm working on (an interpreter for a dataflow language, if you're curious) to the 32X (or perhaps the Saturn), and I was trying to think of a sane way to use both processors reasonably efficiently. I think I have a reasonable solution, but I'd like some feedback to make sure I haven't missed anything obvious. So here goes:

Threads locked to the processor they started on
Code and stack accessed through cached memory region
Globals and heap accessed through non-cached memory region

The logic behind this is that code is read-only, so we don't have any coherency problems there. Each stack will only be touched by the single thread it belongs to, and since threads can't move between processors, only one processor will ever look at a given stack. Globals and heap, on the other hand, are potentially shared by all threads, and manually flushing everything is going to be difficult to do properly and probably not very performant (apart from some special cases where the data is mostly read-only, but those I can handle as exceptions if there's enough performance to be gained).

Will this approach work? Does it sound like a good compromise between performance and complexity?
In fact, because of the cache problem, it's really recommended to avoid sharing data between the main and slave SH2 CPUs as much as possible. So having them work on different threads sounds like the (only? best?) way to use them to their full potential :) Avoid globals and heap access as much as you can :) I've never coded on the 32X, but I've often heard that cache coherency causes a lot of trouble when you want to use both CPUs at the same time!

Fonzie
Genny lover
Posts: 323
Joined: Tue Aug 29, 2006 11:17 am

Post by Fonzie » Sat Feb 17, 2007 9:25 am

Code and stack accessed through cached memory region
Isn't it dangerous to read/write the stack from cache? Or maybe the SH updates the cache after each write?

Stef
Very interested
Posts: 3131
Joined: Thu Nov 30, 2006 9:46 pm
Location: France - Sevres

Post by Stef » Sat Feb 17, 2007 10:21 am

Fonzie wrote:
Code and stack accessed through cached memory region
Isn't it dangerous to read/write the stack from cache? Or maybe the SH updates the cache after each write?
I guess each CPU uses its own stack pointer, so no shared data = no problems :)
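To make the "no shared data = no problems" point concrete, here is a toy model in C of two CPUs with private write-through caches and no hardware coherency between them. As far as I know the SH-2 cache is write-through, but everything else here (one cache entry per word, the function names, the tiny memory) is invented for illustration and is not the real cache geometry.

```c
#include <stdbool.h>
#include <stdint.h>

#define MEM_WORDS 16

/* Toy cache: one entry per memory word (not the real SH-2 layout). */
typedef struct {
    bool     valid[MEM_WORDS];
    uint32_t data[MEM_WORDS];
} ToyCache;

/* Memory shared by both toy CPUs. */
static uint32_t toy_memory[MEM_WORDS];

/* Cached read: a hit returns the cached copy, a miss fills from memory. */
static uint32_t cached_read(ToyCache *c, unsigned addr)
{
    if (!c->valid[addr]) {
        c->data[addr]  = toy_memory[addr];
        c->valid[addr] = true;
    }
    return c->data[addr];
}

/* Write-through: memory is always updated, and the writer's own
 * cached copy is refreshed on a hit. The other CPU's cache is
 * never touched, which is where stale data comes from. */
static void cached_write(ToyCache *c, unsigned addr, uint32_t value)
{
    toy_memory[addr] = value;
    if (c->valid[addr])
        c->data[addr] = value;
}

/* Uncached read: bypasses the cache and always sees memory. */
static uint32_t uncached_read(unsigned addr)
{
    return toy_memory[addr];
}

/* Manual purge of a single entry, forcing the next read to refill. */
static void purge_line(ToyCache *c, unsigned addr)
{
    c->valid[addr] = false;
}
```

In this model a CPU reading and writing its own stack through the cache always sees its own latest writes, so a private stack is safe. The trouble only appears when the second CPU has an older copy of a shared word in its cache: it keeps returning the old value until the entry is purged or the word is read through the uncached path.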

Mask of Destiny
Very interested
Posts: 616
Joined: Thu Nov 30, 2006 6:30 am

Post by Mask of Destiny » Sat Feb 17, 2007 12:12 pm

Each thread has to have its own stack even on systems where there are no cache coherency problems; otherwise each thread will overwrite the others' data.

ob1
Very interested
Posts: 463
Joined: Wed Dec 06, 2006 9:01 am
Location: Aix-en-Provence, France

Post by ob1 » Tue Feb 20, 2007 2:27 pm

Seems right to me.
Expect more answers from my holidays next week.
BTW, I will have received a book on operating systems by then. Maybe it will help.
Stay tuned.

The dinosaur book:
Operating System Concepts, 7th edition
Silberschatz, John Wiley
ISBN 978-0-471-69466-3 (Jan. 2005)
