Last updated: 14 June 2019
First published: 6 April 2015

Designing Linker Scripts with GNU Linker

Whether you are writing a program for an embedded system or for a PC, the linker always uses a linker script, even if only implicitly. The default linker script generated by most IDE tools takes care of the different memory types and places input sections in the appropriate memory regions, based on the memory map of the selected microcontroller.
So why should we bother writing our own? There are situations where customizing a linker script is the only option, or where it simply makes things easier, for example:

Memory types: Embedded systems usually have ROM, Flash and SRAM memories. These memories may be further distinguished by execute, read, write, latency and shareable attributes. A default linker script may not be able to address every requirement of an application.
Programming Flash: Flash memory usually can't be programmed by code running from that same Flash memory. In that case part of the code has to be copied on the fly to another memory, so its virtual and load memory addresses will differ.
Functions with fixed addresses: In some applications low-level driver functions may be part of a ROM and/or Flash memory region and exported to higher level RTOS applications. These functions have fixed addresses, and these addresses need to be preserved across firmware upgrades.

Before we get into the details, let us review some basics of linker scripts, assuming a GNU ARM toolchain with the input object files and the final executable in ELF format.

Useful terms and definitions

Object files, Executable and output Binary

C and assembly source files are compiled into object files by the compiler and assembler, and these become the input to the linker along with any additional libraries. With a linker script we can finely control how sections from the input object files are placed in the output object or executable file, and in which memory regions. From the executable file, the final binary can be extracted in various formats such as Intel hex, Motorola S-record (mot) or plain binary using the objcopy utility.

Section

Sections contain code, data or any other information inside object files. Each section has a name, contents and a size. By convention, executable instructions are stored in the .text section, initialized global variables in the .data section, uninitialized global variables in the .bss section and the vector table in a .vectors section. You will find variations on these conventions, and the constraints on section names depend on the object file format used. Each section also has two addresses: a Virtual Memory Address (VMA) and a Load Memory Address (LMA).
Sections are described with the SECTIONS command and the output section descriptions inside it. An output section description controls the location, type, alignment, order of placement, target memory region, program headers and fill pattern of an output section. Most of these parameters are optional; a simple form is shown below:

Listing-1: Section description command example
SECTIONS
{
    output_section_name :
    {
        input_section_descriptions
        symbol_assignments
        
    } >memory_region AT>lma_region
}

/* Comment syntax in linker script is similar to C programming language.
SECTIONS command example:
*/
SECTIONS
{
    .text :
    {
        vector_table.o (.rodata)
        * (.text)
        _text_end_addr = . ;
    
    } >FLASH
}

A linker script normally contains a single SECTIONS command with as many output section definitions as required. An input section is described by an object file name followed by an optional list of section names in parentheses, separated by spaces. The wildcard '*' can be used in place of the file name to match all input object files given on the command line. Linker script symbols are accessible from C and vice versa, so their names should follow C symbol naming rules. The dot '.' is a linker variable that holds the current location counter, i.e. the current VMA. Symbol assignments must be terminated by a semicolon. In the example above, the output section is placed in the FLASH memory region. The linker calculates its starting VMA from the end of the previous section or, if there is no previous section, from the beginning of the FLASH memory region, and it also takes any section alignment requirement into account. As no LMA is given in the example, the LMA is equal to the VMA.
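
To make the start address calculation concrete, here is a minimal C sketch of the arithmetic the linker performs when it places one section after another. The function names align_up and next_section_vma are hypothetical and exist only for this illustration; the real linker does this internally.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align.
   Assumption: align is a non-zero power of two, as section alignments are. */
uint32_t align_up(uint32_t addr, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Start VMA of the next section: the end of the previous section,
   rounded up to the alignment the next section requires. */
uint32_t next_section_vma(uint32_t prev_vma, uint32_t prev_size, uint32_t align)
{
    return align_up(prev_vma + prev_size, align);
}
```

For instance, if a previous section starts at 0x0 and is 0x1A2 bytes long, a following section that needs 8-byte alignment would start at 0x1A8.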

VMA

The Virtual Memory Address of a section is the address at which it resides when the program runs. For example, let the VMA of function foo() inside the .text section be 0x1000; when foo() is called, the program counter jumps to address 0x1000 and executes the first instruction of foo().

LMA

The Load Memory Address of a section is the address in memory where it is loaded, programmed or flashed. Continuing the example above, let the LMA of the .text section be 0x200 and the LMA of function foo() inside it be 0x240. Before this function can be called, some other piece of the program needs to copy the code of foo() from 0x240 to 0x1000.
If the LMA of a section is equal to its VMA, the section is used directly from the memory where it is loaded, which for code means that memory must be both readable and executable, as ROM and on-chip Flash memories are.
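
The relationship between the two addresses is a fixed offset within a section. The C sketch below works through the foo() example. The struct and function names are hypothetical. The .text VMA of 0xFC0 is derived from the numbers above (0x1000 minus the 0x40 offset of foo() within the section), and the section size of 0x100 is an assumption made only so the bounds check has something to check against.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses of one output section, as a linker script would assign them. */
struct section_addrs {
    uint32_t vma;   /* run address of the section start  */
    uint32_t lma;   /* load address of the section start */
    uint32_t size;  /* section size in bytes             */
};

/* Translate a run address inside the section to the matching load address:
   the offset from the start of the section is the same in both views. */
uint32_t vma_to_lma(const struct section_addrs *s, uint32_t vma)
{
    assert(vma >= s->vma && vma - s->vma < s->size);
    return s->lma + (vma - s->vma);
}
```

With a section described as { 0xFC0, 0x200, 0x100 }, vma_to_lma() maps the run address 0x1000 of foo() back to its load address 0x240.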

Symbols

Every symbol defined in a C program has a name, an address and allocated memory to hold the corresponding value. Symbols defined in C source code can be accessed in a linker script and vice versa. A symbol defined in a linker script has only a name and an address; no memory is allocated to hold a value, so only the address can be obtained from C source code (by applying the & operator to the symbol). Attempting to read the value of a linker script symbol in C code may lead to incorrect results.
Symbols are stored in the symbol table inside object files. You can use objdump -t object_file to view the symbol table.

Memory Layout

The MEMORY command is used to specify the size, location and attributes of all memory blocks available in the target. Each memory block can have read-only ('r'), read/write ('w') and executable ('x') attributes. The ORIGIN and LENGTH keywords specify the start address and length of the memory region respectively; they can be expressions but must evaluate to constants. The suffixes 'K' and 'M' on a numeric constant specify kilobytes (1024 bytes) and megabytes (1024 * 1024 bytes) respectively. While designing linker scripts it is good practice to write the output section descriptions in a target independent way and to keep the mapping of output sections to memory regions in a separate, target dependent part. This is achieved with the REGION_ALIAS function: its first parameter is an alias name and its second parameter is the memory region the alias refers to. Just by specifying a different set of memory aliases we can completely change the memory mapping of output sections without modifying the section descriptions, as shown below.

Listing-2: Memory and Region Alias command example
/* Linker script code for memory layout description */
MEMORY
{
    FLASH (rx)    : ORIGIN = 0x0, LENGTH = 512K
    SRAM1 (wx)    : ORIGIN = 0x20000000, LENGTH = 128K
    SRAM2 (wx)    : ORIGIN = 0x40000000, LENGTH = 64K
}

/* One possible configuration */
REGION_ALIAS("TEXT_REGION", FLASH);
REGION_ALIAS("RODATA_REGION", FLASH);
REGION_ALIAS("RWDATA_REGION", SRAM1);
REGION_ALIAS("BSS_REGION", SRAM1);

/* Another possible configuration, say for development purposes.
   Use it instead of the one above, not together with it. */
REGION_ALIAS("TEXT_REGION", SRAM1);
REGION_ALIAS("RODATA_REGION", SRAM2);
REGION_ALIAS("RWDATA_REGION", SRAM2);
REGION_ALIAS("BSS_REGION", SRAM2);
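
It can help to see the same memory map expressed as plain numbers. The following C sketch mirrors the FLASH and SRAM1 regions of the listing above and checks whether an address range fits inside a region. The struct, macro and variable names are hypothetical; nothing here is generated by the linker, it is only a way to reason about ORIGIN, LENGTH and the 'K' suffix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 'K' in a linker script means a multiple of 1024 bytes. */
#define KIB(n) ((uint32_t)(n) * 1024u)

/* One MEMORY entry: ORIGIN and LENGTH. */
struct mem_region {
    uint32_t origin;
    uint32_t length;
};

/* Values copied from the MEMORY command in the listing above. */
const struct mem_region flash_region = { 0x00000000u, KIB(512) };
const struct mem_region sram1_region = { 0x20000000u, KIB(128) };

/* True if [addr, addr + size) lies entirely inside the region.
   Written to avoid overflow in the address arithmetic. */
bool region_contains(const struct mem_region *r, uint32_t addr, uint32_t size)
{
    return addr >= r->origin
        && size <= r->length
        && addr - r->origin <= r->length - size;
}
```

With these values the last byte of FLASH is at 0x7FFFF and the last aligned 32-bit word of SRAM1 is at 0x2001FFFC.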

Placing Sections

Listing 3 shows a sample linker script that describes the memory layout and places sections independently of the target memory layout. It is very generic and designed for bare metal programs, i.e. we can use it without including any standard library, which also means we have to write our own code to initialize the .data and .bss sections. You can download the sample source code to see how to write a simple bare metal program with a linker script. You can compile the source code for ARM Cortex-M0 or M3 and simulate it in GDB to better understand the program flow. To keep things simple there is no assembly startup code (it is not required), the vector table is written in C, and GCC's attribute feature is not used to define any sections.
Now, let us look at the script in more detail:

The ENTRY command points the linker to the first instruction in the program. Starting from the entry point, the linker can walk through the reference graph and, when section garbage collection is enabled (--gc-sections), discard unreferenced sections from the output file. The KEEP keyword retains sections that would otherwise be discarded in this way. In the example, the vector_table.o (.rodata) section is not referenced anywhere in the code, but it contains the addresses of the interrupt service handlers, which are read by the CPU itself.
Notice that for the .data section both the >mem_region and AT>lma_region options are specified, as we want to store the initialized data after the end of the .text section and use it to initialize the SRAM addresses at runtime, before calling the application's main() function.
The PROVIDE keyword defines a symbol in the linker script only when it is referenced and not defined in any other linked library or object file.
Uninitialized non-static global variables may be stored in the COMMON section, hence it is included in the output .bss section.

Listing-3: Placing Sections
MEMORY
{
    FLASH (rx)    : ORIGIN = 0x0, LENGTH = 512K
    SRAM1 (wx)    : ORIGIN = 0x20000000, LENGTH = 128K
}

/* One possible configuration */
REGION_ALIAS("TEXT_REGION", FLASH);
REGION_ALIAS("RODATA_REGION", FLASH);
REGION_ALIAS("RWDATA_REGION", SRAM1);
REGION_ALIAS("BSS_REGION", SRAM1);


/* Linker script example to place sections independent of target memory map */

ENTRY(reset_handler)

SECTIONS
{
    .text :
    {
        KEEP(vector_table.o (.rodata))
        *(.text)
    } >TEXT_REGION

    .data :
    {
        *(.data)
    } >RWDATA_REGION AT>TEXT_REGION

    .rodata :
    {
        *(.rodata)
    } >RODATA_REGION

    .bss :
    {
        *(.bss) *(COMMON)

    } >BSS_REGION

    PROVIDE(_data_lma_start = LOADADDR(.data));
    PROVIDE(_data_lma_end = LOADADDR(.data) + SIZEOF(.data) - 1);
    PROVIDE(_data_vma_start = ADDR(.data));

    PROVIDE(_bss_vma_start = ADDR(.bss));
    PROVIDE(_bss_vma_end = ADDR(.bss) + SIZEOF(.bss) - 1);

    PROVIDE(_heap_start = ADDR(.bss) + SIZEOF(.bss));
    PROVIDE(_stack_top = ORIGIN(SRAM1) + LENGTH(SRAM1) - 4);
}


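/* ------------ Accessing linker script symbols from C -------------- */

/* Linker script symbols carry only an address: always use them as &symbol. */
extern unsigned char _data_lma_start;
extern unsigned char _data_lma_end;
extern unsigned char _data_vma_start;

void data_copy(unsigned char *dst, const unsigned char *src_start, const unsigned char *src_end);

/* Initialize .data section (done in the reset handler, before main() is called) */
data_copy(&_data_vma_start, &_data_lma_start, &_data_lma_end);

/* ... */

/* Copy the bytes in [src_start, src_end] to dst; src_end is the address of the
   last byte, matching the "- 1" in the definition of _data_lma_end above. */
void data_copy(unsigned char *dst, const unsigned char *src_start, const unsigned char *src_end)
{
    while (src_start <= src_end)
    {
        *dst++ = *src_start++;
    }
}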
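
The listing above copies .data, but the text also mentions that .bss has to be initialized by hand. A minimal sketch of that step follows. The function name bss_zero is hypothetical. It takes plain pointers so that it can be built and tried on a host; on the target it would be called with the addresses of the _bss_vma_start and _bss_vma_end symbols from the script, as shown in the comment.

```c
#include <assert.h>
#include <stddef.h>

/* Zero every byte in [start, end]. end is the address of the LAST byte,
   matching _bss_vma_end = ADDR(.bss) + SIZEOF(.bss) - 1 in the script.
   If .bss is empty, end is start - 1 and the loop body never runs. */
void bss_zero(unsigned char *start, unsigned char *end)
{
    assert(start != NULL && end != NULL);
    while (start <= end)
    {
        *start++ = 0;
    }
}

/* On the target (assumed usage, not compiled here):
 *
 *   extern unsigned char _bss_vma_start;
 *   extern unsigned char _bss_vma_end;
 *   bss_zero(&_bss_vma_start, &_bss_vma_end);
 */
```

Like data_copy(), this would run in the reset handler before main(), so that uninitialized globals start out as zero.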

Overlaying Sections

Section overlaying may be used in applications where code is stored in an external non-executable memory, or where there is a fast SRAM. In either case an overlay manager is required to copy sections into and out of the overlay memory region at runtime. The OVERLAY command is used inside the SECTIONS command and its syntax is similar to an output section description. Sections inside an OVERLAY command have the same virtual memory address and consecutive load memory addresses. Specifying the overlay start address is optional; if it is not provided, the start address defaults to the current location counter. In Listing 4, it is assumed that some sort of overlay manager is in ROM and that application code is overlaid into OVERLAY_REGION, an alias. A dummy SRAM2 region is defined to hold the load images of the overlay sections, but in an actual application this would be an on-chip or off-chip non-volatile memory. With the NOCROSSREFS keyword, the linker reports an error if there are any symbol references between sections that share the same VMA. For each section within an OVERLAY command the linker automatically provides two symbols that mark its load address range: __load_start_secname and __load_stop_secname, where secname is the section name with characters that are not legal in C identifiers (such as the leading dot) removed.

Listing-4: Overlaying Sections
MEMORY
{
    ROM (rx)    : ORIGIN = 0x0, LENGTH = 32K
    SRAM1 (wx)    : ORIGIN = 0x20000000, LENGTH = 128K
    SRAM2 (wx)    : ORIGIN = 0x40000000, LENGTH = 128K
}

REGION_ALIAS("TEXT_REGION", ROM);
REGION_ALIAS("RODATA_REGION", ROM);
REGION_ALIAS("RWDATA_REGION", SRAM1);
REGION_ALIAS("BSS_REGION", SRAM1);

REGION_ALIAS("OVERLAY_REGION", SRAM1);
REGION_ALIAS("OVERLAY_LOAD_REGION", SRAM2);


ENTRY(reset_handler)

SECTIONS
{
    .text :
    {
        KEEP(non-overlayed/vector_table.o (.rodata))
        non-overlayed/*.o(.text)
    } >TEXT_REGION

    .data :
    {
        non-overlayed/*.o(.data)
    } >RWDATA_REGION AT>TEXT_REGION

    .rodata :
    {
        non-overlayed/*.o(.rodata)
    } >RODATA_REGION

    .bss :
    {
        non-overlayed/*.o(.bss COMMON)
    } >BSS_REGION
    
    /* Should be a fixed/constant value */
    _overlay_start = ADDR(.bss) + SIZEOF(.bss);
    
    /*
    Linker automatically defines:
    __load_start_text_o1, __load_stop_text_o1
    __load_start_text_o2, __load_stop_text_o2
    */
    OVERLAY (_overlay_start): NOCROSSREFS
    {    
        .text_o1 { o1/*.o(.text) }
        .text_o2 { o2/*.o(.text) }
    } >OVERLAY_REGION AT>OVERLAY_LOAD_REGION
    
    /*
    Linker automatically defines:
    __load_start_data_o1, __load_stop_data_o1
    __load_start_data_o2, __load_stop_data_o2
    */
    OVERLAY :
    {
        .data_o1 { o1/*.o(.data) }
        .data_o2 { o2/*.o(.data) }
    } >OVERLAY_REGION AT>OVERLAY_LOAD_REGION
    
    /*
    Linker automatically defines:
    __load_start_bss_o1, __load_stop_bss_o1
    __load_start_bss_o2, __load_stop_bss_o2
    */
    OVERLAY :
    {    
        .bss_o1 { o1/*.o(.bss COMMON) }
        .bss_o2 { o2/*.o(.bss COMMON) }
    } >OVERLAY_REGION
    
    /* ... */
} 
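
The overlay manager itself is not part of the linker script. As a rough idea of what its core might look like, the C sketch below copies one overlay from its load address range to the shared run address. The struct and function names are hypothetical, and the sketch uses plain pointers so it can be tried on a host. On the target, load_start and load_stop would be the addresses of the linker-provided symbols, for example __load_start_text_o1 and __load_stop_text_o1, and run_addr would be the address of _overlay_start.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Load address range of one overlay section. load_stop points one byte
   past the end, as __load_stop_<secname> does. */
struct overlay {
    const unsigned char *load_start;
    const unsigned char *load_stop;
};

/* Copy an overlay into the run region. Returns the number of bytes copied,
   or 0 if the overlay is larger than the run region (nothing is copied). */
size_t overlay_load(unsigned char *run_addr, size_t run_capacity,
                    const struct overlay *ov)
{
    assert(run_addr != NULL && ov != NULL);
    assert(ov->load_stop >= ov->load_start);

    size_t size = (size_t)(ov->load_stop - ov->load_start);
    if (size > run_capacity)
    {
        return 0;
    }
    memcpy(run_addr, ov->load_start, size);
    return size;
}
```

Loading a second overlay simply overwrites the first one in the run region, which is why NOCROSSREFS is useful: code in one overlay must not call into another overlay that occupies the same addresses.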

When customizing a linker script, it is always good practice to generate a linker map file (for example with the linker's -Map option) to cross-check section placement in the output ELF file.
Linker scripts support many more optional keywords and commands to meet all sorts of application requirements; please refer to the GNU linker reference manual.

References

Using ld, the GNU linker