Chilly Willy wrote: Just a reminder of something I ran into before that may concern this: don't forget that any C code that directly uses hardware MUST be compiled at -O1! If you compile at or above -O2, the compiler reorders the code for better speed regardless of whether hardware may go nuts due to the reordering. Also, casting the access as volatile WILL NOT cure the problem. So stick all hardware related code in its own file and compile that file with -O1 to avoid trouble. Note, this doesn't matter with assembly files, just C/C++ files.
I can personally vouch this behavior exists in at least gcc 4.4 and 4.5 for MIPS, SH, and M68K, so it's probably all platforms unless specifically stated otherwise.

I've been using -O2 with '-fno-schedule-insns -fno-schedule-insns2' in another project to avoid reordering, but I haven't tested if this works here.
Update your Genesis/32X Toolchain!
Moderator: BigEvilCorporation
notaz wrote: I've been using -O2 with '-fno-schedule-insns -fno-schedule-insns2' in another project to avoid reordering, but I haven't tested if this works here.

-fschedule-insns attempts to hide the latency of slow instructions by pipeline scheduling; it has no effect on in-order CPUs like 68000 or SH2.
-
- Very interested
- Posts: 2984
- Joined: Fri Aug 17, 2007 9:33 pm
notaz wrote: I've been using -O2 with '-fno-schedule-insns -fno-schedule-insns2' in another project to avoid reordering, but I haven't tested if this works here.
antime wrote: -fschedule-insns attempts to hide the latency of slow instructions by pipeline scheduling; it has no effect on in-order CPUs like 68000 or SH2.

Uhh - wouldn't in-order execution CPUs be MORE likely to use scheduling? You want to reorder the instructions manually (by the compiler) for best speed since the CPU doesn't do it in hardware. Scheduling makes no sense for out-of-order CPUs since the CPU is just going to ignore the order anyway and schedule the instructions internally for best speed.
Chilly Willy wrote: Uhh - wouldn't in-order execution CPUs be MORE likely to use scheduling? You want to reorder the instructions manually (by the compiler) for best speed since the CPU doesn't do it in hardware. Scheduling makes no sense for out-of-order CPUs since the CPU is just going to ignore the order anyway and schedule the instructions internally for best speed.

Well, most pipelined CPUs benefit from scheduling, but the description of the flag makes it sound like you'd need at least a superscalar CPU. Out-of-order CPUs certainly benefit from scheduling too, as they don't have infinite resources.
In any case, if I'm reading the source code right, schedule-insns is disabled for all the m68k CPUs and for all SuperH CPUs below SH4 to circumvent a bug.
BTW, GCC 4.6.0 was released yesterday. I'll try building my default set of compilers and see if I run into any problems.
Superscalar has nothing to do with pipelining... the SH2 is not superscalar, but it's certainly pipelined. Reordering is particularly important for delay slots, and after multiplies.
On non-pipeline processors, reordering is more about bus and register usage than anything else. For example, with the 68000, you could almost do EVERYTHING with memory operands, but that makes the code slower and increases bus usage tremendously. You want to put things into registers and leave them as much as possible to speed things up and decrease bus usage.
That also applies to pipelined processors, but the focus is different - with one you're focusing on the pipeline, and on the other you're focusing on bus utilization.
Chilly Willy wrote: Reordering is particularly important for delay slots, and after multiplies.

You're right, I had forgotten about the async multiplier.

Chilly Willy wrote: On non-pipeline processors, reordering is more about bus and register usage than anything else. For example, with the 68000, you could almost do EVERYTHING with memory operands, but that makes the code slower and increases bus usage tremendously. You want to put things into registers and leave them as much as possible to speed things up and decrease bus usage.

That's not reordering though. As only one instruction can be processed at a time, there are no dependencies between them.
Anyhoo, built the 4.6.0 compilers, no issues there. For 68000, 4.6.0 seems to produce better code than 4.5.2 for my testcase, but still not quite as good as 4.4.0.
antime wrote: That's not reordering though. As only one instruction can be processed at a time, there are no dependencies between them.

Reordering means the operations are done in a different order than the original code. It doesn't require that instructions have dependencies. For example:
Code: Select all
for (i=0; i<4; i++)
    x[i] = i;
could be unrolled forwards as
x[0] = 0;
x[1] = 1;
x[2] = 2;
x[3] = 3;
or it could be done backwards
x[3] = 3;
x[2] = 2;
x[1] = 1;
x[0] = 0;
I've seen loop unrolling that did things like that. The sequence of the instructions is "reordered" from the original code (which goes forwards) into code that goes backwards.
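To make the point concrete, here is a minimal sketch (not taken from any real compiler output) that writes both unrolled orders by hand. The `write_log` array and `logged_store` helper are hypothetical names made up for this illustration; they stand in for anything that can observe the order of stores, such as a hardware port. The final array contents match, but the observed store order differs.

```c
#include <assert.h>
#include <stddef.h>

#define N 4

/* Hypothetical observer: records the index of every store in the order
 * the stores actually happen. */
static int write_log[2 * N];
static size_t log_len;

static void logged_store(int *x, int i)
{
    assert(log_len < 2 * N);   /* keep the illustration within its log */
    x[i] = i;
    write_log[log_len++] = i;
}

/* The loop unrolled in its original (forward) order. */
static void fill_forward(int *x)
{
    logged_store(x, 0);
    logged_store(x, 1);
    logged_store(x, 2);
    logged_store(x, 3);
}

/* The same loop unrolled backwards: same final contents, different order. */
static void fill_backward(int *x)
{
    logged_store(x, 3);
    logged_store(x, 2);
    logged_store(x, 1);
    logged_store(x, 0);
}
```

For plain RAM the two versions are indistinguishable afterwards, which is why a compiler is free to pick either. Only something that watches the individual stores can tell them apart.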
Chilly Willy wrote: I've seen loop unrolling that did things like that. The sequence of the instructions is "reordered" from the original code (which goes forwards) into code that goes backwards.
antime wrote: That's not what the optimization in question is about.

That's why I said it was an EXAMPLE of reordering that had nothing to do with pipelining or delay slots or any other CPU related dependency. It's merely about swapping around instructions because the compiler thinks it will run better/faster. There are other ways the compiler reorders code that have nothing to do with the CPU, but can affect MMIO. I was just showing a very simple example that would make the point that reordering can be independent of the CPU.
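As a rough sketch of how this kind of reordering is usually constrained in GCC: the C standard keeps volatile accesses in order relative to each other, but ordinary (non-volatile) loads and stores may still be moved around them. A GCC-style empty asm with a "memory" clobber is a common compiler barrier for that. Everything named here is an assumption for illustration: `fake_ctrl_port` and `fake_data_port` are plain variables standing in for real hardware addresses, and `send_last_word` is a made-up function. Whether this is enough on the gcc 4.4/4.5 builds discussed in this thread has not been tested here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for memory-mapped registers. On real hardware
 * these would be fixed addresses accessed through volatile pointers. */
static volatile uint16_t fake_ctrl_port;
static volatile uint16_t fake_data_port;

/* GCC/Clang-specific compiler barrier: emits no instructions, but the
 * compiler may not move memory accesses across it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static uint16_t buffer[4];

static void send_last_word(unsigned count)
{
    unsigned i;

    assert(count >= 1 && count <= 4);

    /* Ordinary stores: the compiler may reorder or unroll these freely. */
    for (i = 0; i < count; i++)
        buffer[i] = (uint16_t)(i + 1);

    /* Make sure the buffer is fully written before touching the "ports". */
    COMPILER_BARRIER();

    /* Volatile stores: kept in this order relative to each other. */
    fake_ctrl_port = 0x8000u;
    fake_data_port = buffer[count - 1];

    /* Keep later ordinary accesses from moving above the port writes. */
    COMPILER_BARRIER();
}
```

The barrier only restricts the compiler; it does nothing about bus-level or CPU-level ordering, which is a separate question on other platforms.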
powerofrecall wrote: Windows GCC 4.5.2 toolchain (about 55mb compressed, 800+ uncompressed)

Thank you for your files. They work fine. But I want to ask, why is m68k-elf-g++.exe missing?
And about the command "dd", dd if=temp.bin of=$@ bs=8K conv=sync
Can this be done in Windows?
Thanks.
powerofrecall wrote: Windows GCC 4.5.2 toolchain (about 55mb compressed, 800+ uncompressed)
Tomy wrote: Thank you for your files. They work fine. But I want to ask, why is m68k-elf-g++.exe missing?
And about the command "dd", dd if=temp.bin of=$@ bs=8K conv=sync
Can this be done in Windows?
Thanks.

It was probably done with my first rev, which used gcc-core, which won't compile g++ for the 68000. He needs to redo that using the current instructions and makefile to get g++ for the 68000.
The command dd.exe should be part of CygWin (you may need to install it if it's not in the base), and can be added to MinGW using unxutils from SourceForge:
http://unxutils.sourceforge.net/
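For anyone wondering what that dd line is for: with conv=sync, dd pads each input block with NUL bytes up to the block size, so the output ROM ends up a whole multiple of bs. A small sketch of the size arithmetic, assuming a regular file read in full blocks (`padded_size` is a made-up helper, not part of any toolchain):

```c
#include <assert.h>
#include <stddef.h>

/* Size of the output after padding `size` bytes up to a whole number of
 * `block`-byte blocks, as dd does with conv=sync on a regular file. */
static size_t padded_size(size_t size, size_t block)
{
    assert(block > 0);
    return ((size + block - 1) / block) * block;
}
```

With bs=8K (8192 bytes), a hypothetical 20000-byte temp.bin would come out as 24576 bytes, i.e. three full blocks.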
Thanks Chilly Willy,
I used MinGW to build m68k-elf-g++.exe.
But when I compile your example, I get the output below.
Why does it point to my d: build directory? I don't think that is right.
Code: Select all
C:\gen\project\Tic\C++\MD>make
m68k-elf-as -m68000 --register-prefix-optional crt0.s -o crt0.o
m68k-elf-gcc -m68000 -Wall -O2 -c -fomit-frame-pointer -c crtstuff.c -o crtstuff.o
m68k-elf-g++ -m68000 -Wall -O2 -c -fno-exceptions -nostartfiles -ffreestanding -fno-rtti -c main.cpp -o main.o
m68k-elf-as -m68000 --register-prefix-optional hw_md.s -o hw_md.o
m68k-elf-gcc -m68000 -Wall -O2 -c -fomit-frame-pointer -c font.c -o font.o
m68k-elf-g++ -T c:\gen/ldscripts/md.ld -Wl,-Map=output.map -nostdlib -ffreestanding -fno-rtti crt0.o crtstuff.o main.o hw_md.o font.o -Lc:\gen/m68k-elf/lib -Lc:\gen/m68k-elf/m68k-elf/lib -lstdc++ -lc -lgcc -lnosys -o TicTacToe.elf
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-closer.o): In function `_close_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/closer.c:53: warning: _close is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-fstatr.o): In function `_fstat_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/fstatr.c:62: warning: _fstat is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-signalr.o): In function `_getpid_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/signalr.c:96: warning: _getpid is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-isattyr.o): In function `_isatty_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/isattyr.c:58: warning: _isatty is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-signalr.o): In function `_kill_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/signalr.c:61: warning: _kill is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libstdc++.a(basic_file.o): In function `std::__basic_file<char>::seekoff(long long, std::_Ios_Seekdir)':
d:\gen\build-gcc-m68k-elf-4.5.2\m68k-elf\libstdc++-v3\src/basic_file.cc:326: warning: _lseek is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libc.a(lib_a-openr.o): In function `_open_r':
d:\gen\build-newlib-m68k-elf-1.19.0\m68k-elf\newlib\libc\reent/../../../../../newlib-1.19.0/newlib/libc/reent/openr.c:59: warning: _open is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libstdc++.a(basic_file.o): In function `std::__basic_file<char>::xsgetn(char*, long)':
d:\gen\build-gcc-m68k-elf-4.5.2\m68k-elf\libstdc++-v3\src/basic_file.cc:288: warning: _read is not implemented and will always fail
c:\gen/m68k-elf/m68k-elf/lib\libstdc++.a(basic_file.o): In function `(anonymous namespace)::xwrite(int, char const*, long)':
d:\gen\build-gcc-m68k-elf-4.5.2\m68k-elf\libstdc++-v3\src/basic_file.cc:117: warning: _write is not implemented and will always fail
m68k-elf-objcopy -O binary TicTacToe.elf temp.bin
dd if=temp.bin of=TicTacToe.bin bs=512K conv=sync
0+1 records in
1+0 records out
524288 bytes (524 kB) copied, 0.00722271 s, 72.6 MB/s
And the compiled BIN cannot run.
The warnings are from libnosys and merely tell you that certain functions (like open/close/read/write/etc) aren't implemented by said library. They can be safely ignored as the functions AREN'T used.
The makefile uses an export... remember this from the OP?
Code: Select all
export GENDEV=/opt/toolchains/gen
export PATH=$GENDEV/m68k-elf/bin:$GENDEV/bin:$PATH
You need to be using the MinGW terminal, not the Windows command shell, and you need to set those environment vars before running make.
How are you trying to "run" the TicTacToe.bin file? It ran in Gens/GS and on real hardware for me. It might be the issue with the exports causing trouble with making the binary as well.