
Posted: Tue Mar 22, 2011 11:32 pm
by notaz
Chilly Willy wrote:Just a reminder of something I ran into before that may concern this: don't forget that any C code that directly uses hardware MUST be compiled at -O1! If you compile at or above -O2, the compiler reorders the code for better speed regardless of whether hardware may go nuts due to the reordering. Also, casting the access as volatile WILL NOT cure the problem. So stick all hardware related code in its own file and compile that file with -O1 to avoid trouble. Note, this doesn't matter with assembly files, just C/C++ files.

I can personally vouch this behavior exists in at least gcc 4.4 and 4.5 for MIPS, SH, and M68K, so it's probably all platforms unless specifically stated otherwise.
I've been using -O2 with '-fno-schedule-insns -fno-schedule-insns2' in another project to avoid reordering, but I haven't tested if this works here.
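For comparison, here is a minimal sketch of another commonly used approach, not tested on the toolchains in this thread: keep the register accesses volatile and put a GCC compiler barrier between them and the ordinary memory accesses around them. The names fake_ctrl_port, fake_data_port, plain_buffer and hw_write_sequence are hypothetical, and plain variables stand in for real hardware addresses so the snippet can run on a host machine.

```c
#include <stdint.h>

/* Hypothetical stand-ins for two memory-mapped registers. On real hardware
   these would be fixed addresses accessed through volatile pointers. */
static volatile uint16_t fake_ctrl_port;
static volatile uint16_t fake_data_port;

/* Ordinary (non-volatile) memory that is prepared before the hardware write. */
static uint16_t plain_buffer[2];

/* GCC-specific compiler barrier: an empty asm with a "memory" clobber.
   It emits no instructions, but GCC will not move memory accesses across it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static void hw_write_sequence(uint16_t command, uint16_t value)
{
    /* Non-volatile stores: the compiler may reorder these freely... */
    plain_buffer[0] = command;
    plain_buffer[1] = value;

    /* ...but not past this point. */
    COMPILER_BARRIER();

    /* Volatile stores: kept in program order relative to each other. */
    fake_ctrl_port = plain_buffer[0];
    fake_data_port = plain_buffer[1];

    COMPILER_BARRIER();
}

/* Small read-back helpers so the result can be checked on a host. */
static uint16_t read_ctrl(void) { return fake_ctrl_port; }
static uint16_t read_data(void) { return fake_data_port; }
```

The C standard only orders volatile accesses relative to each other, so the barrier is there for the non-volatile accesses around them. Whether this is enough for the problem described above is an assumption, not something verified here.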

Posted: Wed Mar 23, 2011 1:33 am
by antime
notaz wrote:I've been using -O2 with '-fno-schedule-insns -fno-schedule-insns2' in another project to avoid reordering, but I haven't tested if this works here.
-fschedule-insns attempts to hide the latency of slow instructions by pipeline scheduling, it has no effect on in-order CPUs like 68000 or SH2.

Posted: Wed Mar 23, 2011 2:31 am
by Chilly Willy
antime wrote:
notaz wrote:I've been using -O2 with '-fno-schedule-insns -fno-schedule-insns2' in another project to avoid reordering, but I haven't tested if this works here.
-fschedule-insns attempts to hide the latency of slow instructions by pipeline scheduling, it has no effect on in-order CPUs like 68000 or SH2.
Uhh - wouldn't in-order execution CPUs be MORE likely to use scheduling? You want to reorder the instructions manually (by the compiler) for best speed since the CPU doesn't do it in hardware. Scheduling makes no sense for out-of-order CPUs since the CPU is just going to ignore the order anyway and schedule the instructions internally for best speed.

Posted: Sat Mar 26, 2011 4:31 pm
by antime
Chilly Willy wrote:Uhh - wouldn't in-order execution CPUs be MORE likely to use scheduling? You want to reorder the instructions manually (by the compiler) for best speed since the CPU doesn't do it in hardware. Scheduling makes no sense for out-of-order CPUs since the CPU is just going to ignore the order anyway and schedule the instructions internally for best speed.
Well, most pipelined CPUs benefit from scheduling, but the description of the flag makes it sound like you'd need at least a superscalar CPU. OOO do certainly benefit from scheduling as they don't have infinite resources.
In any case, if I'm reading the source code right schedule-insns is disabled for all the m68k CPUs and for all SuperH CPUs below SH4 to circumvent a bug.

BTW, GCC 4.6.0 was released yesterday. I'll try building my default set of compilers and see if I run into any problems.

Posted: Sat Mar 26, 2011 7:30 pm
by Chilly Willy
Superscalar has nothing to do with pipelining... the SH2 is not superscalar, but it's certainly pipelined. Reordering is particularly important for delay slots, and after multiplies.

On non-pipeline processors, reordering is more about bus and register usage than anything else. For example, with the 68000, you could almost do EVERYTHING with memory operands, but that makes the code slower and increases bus usage tremendously. You want to put things into registers and leave them as much as possible to speed things up and decrease bus usage.

That also applies to pipelined processors, but the focus is different - with one you're focusing on the pipeline, and on the other you're focusing on bus utilization.

Posted: Sat Mar 26, 2011 9:44 pm
by antime
Chilly Willy wrote:Reordering is particularly important for delay slots, and after multiplies.
You're right, I had forgotten about the async multiplier.
On non-pipeline processors, reordering is more about bus and register usage than anything else. For example, with the 68000, you could almost do EVERYTHING with memory operands, but that makes the code slower and increases bus usage tremendously. You want to put things into registers and leave them as much as possible to speed things up and decrease bus usage.
That's not reordering though. As only one instruction can be processed at a time, there are no dependencies between them.

Anyhoo, built the 4.6.0 compilers, no issues there. For 68000, 4.6.0 seems to produce better code than 4.5.2 for my testcase, but still not quite as good as 4.4.0.

Posted: Sat Mar 26, 2011 10:08 pm
by Chilly Willy
antime wrote:That's not reordering though. As only one instruction can be processed at a time, there are no dependencies between them.
Reordering means the operations are done in a different order than the original code. It doesn't require that instructions have dependencies. For example:

Code:

for (i=0; i<4; i++)
    x[i] = i;
Could be done forwards, as the code above does,

x[0] = 0
x[1] = 1
x[2] = 2
x[3] = 3

or it could be done backwards

x[3] = 3
x[2] = 2
x[1] = 1
x[0] = 0

I've seen loop unrolling that did things like that. The sequence of the instructions is "reordered" from the original code (which goes forwards) into code that goes backwards.
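To make the point concrete, here is a small host-runnable sketch. The write_log array and the store helper are hypothetical, added only to record the order of the stores the way a device watching the bus would see them. Both fill orders leave the same values in memory, which is why a compiler may pick either one for ordinary memory, but the sequence of writes is different.

```c
#include <string.h>

#define N 4

/* Hypothetical write log: records the order in which indices are stored. */
static int write_log[N];
static int write_count;

/* Store one element and remember which index was written. */
static void store(int *x, int i)
{
    x[i] = i;
    write_log[write_count++] = i;
}

/* Same order as the original loop: x[0], x[1], x[2], x[3]. */
static void fill_forwards(int *x)
{
    int i;
    write_count = 0;
    for (i = 0; i < N; i++)
        store(x, i);
}

/* Reversed order: x[3], x[2], x[1], x[0]. */
static void fill_backwards(int *x)
{
    int i;
    write_count = 0;
    for (i = N - 1; i >= 0; i--)
        store(x, i);
}
```

After either function the array holds 0, 1, 2, 3. Only the log differs, and that difference is what matters if x is really a hardware port rather than RAM.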

Posted: Sun Mar 27, 2011 4:02 am
by antime
Chilly Willy wrote:I've seen loop unrolling that did things like that. The sequence of the instructions is "reordered" from the original code (which goes forwards) into code that goes backwards.
That's not what the optimization under question is about.

Posted: Sun Mar 27, 2011 4:48 am
by Chilly Willy
antime wrote:
Chilly Willy wrote:I've seen loop unrolling that did things like that. The sequence of the instructions is "reordered" from the original code (which goes forwards) into code that goes backwards.
That's not what the optimization under question is about.
That's why I said it was an EXAMPLE of reordering that had nothing to do with pipelining or delay slots or any other CPU related dependency. It's merely about swapping around instructions because the compiler thinks it will run better/faster. There are other ways the compiler reorders code that has nothing to do with the CPU, but can affect MMIO. I was just showing a very simple example that would make the point that reordering can be independent of the CPU.

Posted: Mon Mar 28, 2011 1:04 pm
by antime
Reorder != reorganize.

Posted: Mon Mar 28, 2011 10:21 pm
by Chilly Willy
antime wrote:Reorder == reorganize.
FTFY

Now you're just arguing to hear your own voice. :lol:

Posted: Sat Apr 02, 2011 2:54 pm
by Tomy
powerofrecall wrote:Windows GCC 4.5.2 toolchain (about 55mb compressed, 800+ uncompressed)
Thank you for your files. They work fine. But I want to ask, why is m68k-elf-g++.exe missing?

And about the command "dd", dd if=temp.bin of=$@ bs=8K conv=sync
Can this be done in Windows?

Thanks.

Posted: Sat Apr 02, 2011 6:40 pm
by Chilly Willy
Tomy wrote:
powerofrecall wrote:Windows GCC 4.5.2 toolchain (about 55mb compressed, 800+ uncompressed)
Thank you for your files. They work fine. But I want to ask, why is m68k-elf-g++.exe missing?

And about the command "dd", dd if=temp.bin of=$@ bs=8K conv=sync
Can this be done in Windows?

Thanks.
It was probably done with my first rev, which used gcc-core, which won't compile g++ for the 68000. He needs to redo that using the current instructions and makefile to get g++ for the 68000.

The command dd.exe should be part of Cygwin (you may need to install it if it's not in the base), and can be added to MinGW using UnxUtils from SourceForge:

http://unxutils.sourceforge.net/
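
For anyone wondering what that dd line does to the ROM image: with conv=sync, each input block is padded with zero bytes up to the block size, so the output ends up as a whole number of blocks. A rough sketch of the size calculation in C follows. The function name padded_size is hypothetical, and it assumes a regular input file read in full blocks and a non-zero block size.

```c
#include <stddef.h>

/* Size of the output of "dd bs=<block_size> conv=sync" for an input of
   input_size bytes: the input size rounded up to a whole number of blocks.
   Assumes block_size is not zero. */
static size_t padded_size(size_t input_size, size_t block_size)
{
    size_t blocks = (input_size + block_size - 1) / block_size;
    return blocks * block_size;
}
```

With bs=8K, an 8193-byte temp.bin would come out as 16384 bytes, and anything from 1 to 8192 bytes comes out as exactly 8192.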

Posted: Mon Apr 04, 2011 2:34 pm
by Tomy
Thanks Chilly Willy,

I used MinGW to build m68k-elf-g++.exe.
But when I compile your example, I get:

Code:

C:\gen\project\Tic\C++\MD>make
m68k-elf-as -m68000 --register-prefix-optional crt0.s -o crt0.o
m68k-elf-gcc  -m68000 -Wall -O2 -c -fomit-frame-pointer  -c crtstuff.c -o crtstuff.o
m68k-elf-g++  -m68000 -Wall -O2 -c -fno-exceptions -nostartfiles -ffreestanding -fno-rtti  -c main.cpp -o main.o
m68k-elf-as -m68000 --register-prefix-optional hw_md.s -o hw_md.o
m68k-elf-gcc  -m68000 -Wall -O2 -c -fomit-frame-pointer  -c font.c -o font.o
m68k-elf-g++ -T c:\gen/ldscripts/md.ld -Wl,-Map=output.map -nostdlib -ffreestanding -fno-rtti crt0.o crtstuff.o main.o hw_md.o font.o -Lc:\gen/m68k-elf/lib -Lc:\gen/m68k-elf/m68k-elf/lib -lstdc++ -lc -lgcc -lnosys -o TicTacToe.elf
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-closer.o): In function `_close_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/closer.c:53: warning: _close is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-fstatr.o): In function `_fstat_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/fstatr.c:62: warning: _fstat is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-signalr.o): In function `_getpid_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/signalr.c:96: warning: _getpid is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-isattyr.o): In function `_isatty_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/isattyr.c:58: warning: _isatty is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-signalr.o): In function `_kill_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/signalr.c:61: warning: _kill is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libstdc++.a(basic_file.o): In function `std::__basic_file<char>::seekoff(long long, std::_Ios_Seekdir)':
d:\gen\build-gcc-m68k-elf-4.5.2\m68k-elf\libstdc++-v3\src/basic_file.cc:326: warning: _lseek is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-openr.o): In function `_open_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/openr.c:59: warning: _open is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libstdc++.a(basic_file.o): In function `std::__basic_file<char>::xsgetn(char*, long)':
d:\gen\build-gcc-m68k-elf-4.5.2\m68k-elf\libstdc++-v3\src/basic_file.cc:288: warning: _read is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libstdc++.a(basic_file.o): In function `(anonymous namespace)::xwrite(int, char const*, long)':
d:\gen\build-gcc-m68k-elf-4.5.2\m68k-elf\libstdc++-v3\src/basic_file.cc:117: warning: _write is not implemented and will always fail
m68k-elf-objcopy -O binary TicTacToe.elf temp.bin
dd if=temp.bin of=TicTacToe.bin bs=512K conv=sync
0+1 records in
1+0 records out
524288 bytes (524 kB) copied, 0.00722271 s, 72.6 MB/s
Why does it point to my d: build directory? I don't think that is right.
And the compiled BIN does not run.

Posted: Mon Apr 04, 2011 5:45 pm
by Chilly Willy
The warnings are from libnosys and merely tell you that certain functions (like open/close/read/write/etc) aren't implemented by said library. They can be safely ignored as the functions AREN'T used.

The makefile uses an export... remember this from the OP?

Code:

export GENDEV=/opt/toolchains/gen
export PATH=$GENDEV/m68k-elf/bin:$GENDEV/bin:$PATH
You need to be using the MinGW terminal, not the Windows command shell, and you need to set those environment vars before running make.

How are you trying to "run" the TicTacToe.bin file? It ran in Gens/GS and on real hardware for me. It might be the issue with the exports causing trouble with making the binary as well.