I wanted to make it easier to load code/data into the 2kB RAM at 0xC0000000 by creating a special section for it and having the crt0 code copy it during startup.
In my linker script I've got something like this (I'm leaving out everything irrelevant, there's obviously more inside SECTIONS for example):
OUTPUT_ARCH(sh)
SEARCH_DIR(.)
/*
* Setup the memory map of the SEGA 32X.
* stack grows down from high memory.
*
* The memory map looks like this:
* +--------------------+ <- low memory
* | .text |
* | _etext |
* | ctor list | the ctor and dtor lists are for
* | dtor list | C++ support
* +--------------------+
* | .data | initialized data goes here
* | _data |
* | _edata |
* +--------------------+
* | .bss |
* | _bstart | start of bss, cleared by crt0
* | _bend | start of heap, used by sbrk()
* +--------------------+
* . .
* . .
* . .
* | __stack | top of stack
* +--------------------+
*/
MEMORY
{
rom : ORIGIN = 0x02000000, LENGTH = 0x00400000
ram : ORIGIN = 0x06000000, LENGTH = 0x0003F000
}
/*
* allocate the stack to be at the top of memory, since the stack
* grows down
*/
PROVIDE (__stack = 0x06040000);
SECTIONS
{
.text 0x02000000:
{
*(.text)
. = ALIGN(0x4);
__CTOR_LIST__ = .;
LONG((__CTOR_END__ - __CTOR_LIST__) / 4 - 2)
*(.ctors)
LONG(0)
__CTOR_END__ = .;
__DTOR_LIST__ = .;
LONG((__DTOR_END__ - __DTOR_LIST__) / 4 - 2)
*(.dtors)
LONG(0)
__DTOR_END__ = .;
*(.rodata*)
__INIT_SECTION__ = . ;
*(.init)
LONG (0x000B0009) /* rts ; nop */
__FINI_SECTION__ = . ;
*(.fini)
LONG (0x000B0009) /* rts ; nop */
*(.lit)
. = ALIGN(0x4);
_etext = .;
} > rom
.data 0x06000000 :
AT ( ADDR (.text) + SIZEOF (.text) )
{
_data = . ;
*(.shdata)
*(.data)
. = ALIGN(0x4);
*(.data.align4)
*(.data.align2)
*(.data.align1)
. = ALIGN(0x4);
_edata = . ;
} > ram
_sdata = SIZEOF (.data);
.bss 0x06000000 + SIZEOF (.data) :
{
_bstart = . ;
*(.shbss)
*(.bss)
*(COMMON)
*(.eh_fram)
*(.eh_frame)
_bend = . ;
__bend = _bend;
} > ram
. = ALIGN(0x4);
_end = .; PROVIDE (end = .);
}
I don't include the cache in it - but the cache is small enough you should really only be copying assembly to it. You can check the sh2_crt0.s file in Wolf32X to see how I do that.
Note that the data sections are stored in rom but linked to run at a ram address: the section address is the ram address, and the AT syntax sets the load address in rom.
the cache is small enough you should really only be copying assembly to it
That's what I'm doing. It's for my GB emulator.
I know how to copy the stuff anyway, I just think it'd be much more convenient to have a special section dedicated to it, especially when there are thousands of lines of code divided into multiple source files. The GNU linker for ARM targets can do this, but I've had no luck with the SH version so far.
Then do it just the way I did for the .data section: set it to be stored right behind the code (or code and data) in rom, but linked at the cache address.
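As a rough illustration of that suggestion, the fragment below sketches what such an output section might look like, placed inside SECTIONS after .data. The section name .cache and the symbols __cache_start, __cache_end and __cache_load are hypothetical names chosen for this sketch, not something from the Wolf32X sources. It also assumes the input section is declared allocatable and executable in the assembly source (for example `.section .cache, "ax"`), since GNU as gives an unrecognized section name no flags by default and an unallocated section is not written to a binary image.

```
    /* Hypothetical cache section: runs at 0xC0000000, but its image is
     * stored in rom directly after the .data image. */
    .cache 0xC0000000 :
    AT ( ADDR (.text) + SIZEOF (.text) + SIZEOF (.data) )
    {
        __cache_start = . ;         /* run address of first byte (assumed name) */
        *(.cache)                   /* input sections named .cache (assumed name) */
        . = ALIGN(0x4);
        __cache_end = . ;           /* run address past last byte (assumed name) */
    }
    /* rom address the crt0 code would copy from (assumed name) */
    __cache_load = LOADADDR (.cache);
```

With symbols like these, the startup code would copy `__cache_end - __cache_start` bytes from `__cache_load` to `__cache_start`. Whether the SH build of the linker actually emits the section in the binary is exactly the open question in this thread, so treat this as a starting point to test rather than a known fix.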
The thing is, if you make the code PC relative (and in assembly, that's easy enough to do), then it doesn't matter what section it's compiled in. As long as you copy it and purge the cache, it'll work. That's why I didn't make a special cache section - it wasn't needed.
If you need a jump table, make the table in rom offsets from the start of the table/code, then when you move the code to cache, "fix" the table by adding the address of the table/code.
The problem isn't what address it gets, the problem is that the linker ignores it when generating my binary. It shows up in the map file, and based on the addresses there it should've been right after the data section in the binary, but the file just ends there.
The thing is, if you make the code PC relative (and in assembly, that's easy enough to do), then it doesn't matter what section it's compiled in. As long as you copy it and purge the cache, it'll work.
It is all PC relative, and it does work when I copy it to cache. I just think this gets much uglier than it should need to be. A ".section .whatever" should've been sufficient.
Looking at the link script again, it appears to be okay. That leaves two possibilities: either you aren't specifying the section correctly in the source, or the linker won't generate data for anything other than .text or .data. If it's the latter, you're stuck putting it in .text or .data and moving it from there.
Seriously, there's only ONE REASON to have a separate section for the cache code - to avoid relocating non pc relative code yourself. You still have to copy it yourself. That doesn't change. If it's pc relative, then the code is the same no matter where it is. If you're concerned that the code might be too big, add a check for it.
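One way to add such a size check is at link time, using the linker script's ASSERT command. The sketch below assumes the cache code is bracketed by two symbols, here given the hypothetical names __cache_start and __cache_end, and uses the 2kB (0x800) size mentioned at the top of the thread.

```
/* Fail the link if the code destined for the cache RAM is too big.
 * __cache_start and __cache_end are assumed names for symbols that
 * bracket the cache code elsewhere in the script. */
ASSERT (__cache_end - __cache_start <= 0x800,
        "cache code exceeds the 2kB cache RAM")
```

The same comparison could instead be done at run time in the crt0 copy loop; doing it in the script just moves the failure to build time.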
the linker won't generate data for anything other than .text or .data. If it's the latter ...
It seems like that's the case for --oformat binary and --oformat elf32-big. --oformat elf32-sh doesn't work at all, it just complains about some relocation that makes no sense at all.
Seriously, there's only ONE REASON to have a separate section for the cache code - to avoid relocating non pc relative code yourself.
For me the main reason is that I have maybe 25-30 functions that I want to copy to cache. Few or none of these are adjacent, and they are located in 2 source files that are assembled separately. Unless I break the logical code order that I prefer and lump all the stuff that I want to copy to cache together (at least internally within each source file) the address calculations become a mess. Not to mention that some of these functions are referenced in maybe 10-15 places each, so I have to change all those as well.
Oh... I guess a separate section DOES make sense in that case.
Okay, try a text sub-section... like ".text.cache" and add an entry in the .text section at the end for .text.cache. You'll notice my .data section has a number of sub-sections like that (I didn't make them - the SH2 gcc toolchain did).
That won't give you the virtual address, but it should group all the different parts together.
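A minimal sketch of that sub-section idea, applied to the tail of the .text output section from the script above, might look like the following. The symbols __cache_code_start and __cache_code_end are hypothetical names added here so the startup code can find the grouped block; the rest of .text is assumed to stay as it is.

```
    .text 0x02000000:
    {
        *(.text)
        /* existing ctor/dtor lists, .rodata, .init, .fini unchanged */
        *(.lit)
        . = ALIGN(0x4);
        __cache_code_start = . ;    /* rom address of grouped cache code (assumed name) */
        *(.text.cache)              /* collected from every object file */
        . = ALIGN(0x4);
        __cache_code_end = . ;      /* assumed name */
        _etext = .;
    } > rom
```

Because the `*(.text)` pattern only matches input sections named exactly .text, functions placed in .text.cache are not pulled in earlier and end up contiguous between the two symbols. As noted above, they are still linked at rom addresses, so this only works as-is for PC-relative code.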