ELF stands for Executable and Linkable Format, and even if you're a programmer it is quite possible you've never heard of it. Files in this format are produced by the compiler toolchain (compiler, assembler and linker) and, most importantly, the format itself is cross-platform: the same layout is used across different operating systems and processor architectures. Toolchains on other platforms produce other formats, but ELF has been the standard binary format for Unix-like systems on x86 since 1999, when the 86open project chose it.

Windows uses the PE (Portable Executable) format, which is different; .exe files are one example of PE files. macOS uses the Mach-O format, so no, you cannot run Linux executables directly under macOS.

There are different types of files that could be in the ELF format:

  • executable files
  • object files
  • shared libraries
  • core dumps

Object files

Object files contain object code: machine code that has not yet been linked. Linking is one phase of the build process.

There are several stages of the build process:

  • preprocessing
  • compilation
  • assembly
  • linking

Preprocessing will take care of your macros, join lines, and generally prepare your program for compilation.

Compilation will translate your code to assembly instructions specific to the target architecture. I feel it's important to note that while the ELF file format is cross-platform, the machine code that ends up inside an ELF file is not: it only runs on the architecture it was built for.

During the assembly stage, an assembler will translate these assembly instructions to machine code (not to be confused with bytecode, which runs on a virtual machine). These are the actual instructions that will run on the target processor, but references to symbols whose addresses are not yet known, such as jump and call targets, are left as placeholders, with relocation entries that tell the linker how to fill them in.

A linker will then combine your object files into an executable or a shared library, using either static or dynamic linking. With dynamic linking, some symbols remain unresolved until the program is run, when they are resolved against shared libraries. With static linking, the library routines your program uses are copied into the final program.

Executable files

Executable files are the product of the final stage of the build process. Depending on the type of linking, either the dynamic loader will do some additional linking when the program starts, or the OS will just run the code you have built.

Shared libraries

Shared libraries are files that can be shared by several executables. They are loaded into memory when a dynamically linked executable is run, rather than being copied into the executable by the linker at build time.

On Unix-like systems, shared libraries have the extension .so (files ending in .a are static libraries, that is, archives of object files), are usually stored in /usr/lib, /usr/local/lib, or /lib, and by convention have filenames starting with lib. On macOS, shared libraries use the extension .dylib and are often wrapped in framework bundles together with headers and metadata. Windows libraries usually have the extension .dll, and many of them live in C:\Windows\System32.

Core dumps

Memory or core dumps are particularly useful for debugging programs. They contain the state of a program's memory at a specific time, usually upon abnormal termination or, put simply, a program crash. Along with memory, the registers, stack pointer, memory management information and other processor flags are stored. Core dumps can be analyzed by tools like gdb (the GNU Project debugger).

ELF for object files

In the following examples I am using a machine running Ubuntu.
The ELF format contains a header, zero or more sections and zero or more segments. We will take a look at how the files differ between an object file and an executable.

First, let's create a minimal C program (called main.c) that does nothing but return 0:

int main(void) {
        return 0;
}

Let’s build the program, skipping the linking phase:

$ gcc -c main.c

Header

Using a helpful shell program readelf, let’s analyze the object file’s header:

$ readelf -h main.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          528 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         11
  Section header string table index: 8

In short, the output tells us we have a relocatable ELF file. A relocatable file holds code and data suitable for linking with other object files to create an executable or a shared object file. As you can see, the entry point address is 0x0 and there are no program headers: final memory addresses have not been assigned yet, and everything in the file is addressed relative to the start of its own section.

Sections

You see that in the object file, there are no segments, only sections, because segments are only present in files meant to be loaded into memory (executables and shared libraries), where they are described by the program headers. The linker places one or more sections from a relocatable file into each segment of its output. Sections are important in the linking and relocation phase.

With the next command, we will take a look at section headers:

$ readelf -S main.o
There are 11 section headers, starting at offset 0x210:
 
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00000040
       000000000000000b  0000000000000000  AX       0     0     1
  [ 2] .data             PROGBITS         0000000000000000  0000004b
       0000000000000000  0000000000000000  WA       0     0     1
  [ 3] .bss              NOBITS           0000000000000000  0000004b
       0000000000000000  0000000000000000  WA       0     0     1
  [ 4] .comment          PROGBITS         0000000000000000  0000004b
       0000000000000035  0000000000000001  MS       0     0     1
  [ 5] .note.GNU-stack   PROGBITS         0000000000000000  00000080
       0000000000000000  0000000000000000           0     0     1
  [ 6] .eh_frame         PROGBITS         0000000000000000  00000080
       0000000000000038  0000000000000000   A       0     0     8
  [ 7] .rela.eh_frame    RELA             0000000000000000  000001a0
       0000000000000018  0000000000000018   I       9     6     8
  [ 8] .shstrtab         STRTAB           0000000000000000  000001b8
       0000000000000054  0000000000000000           0     0     1
  [ 9] .symtab           SYMTAB           0000000000000000  000000b8
       00000000000000d8  0000000000000018          10     8     8
  [10] .strtab           STRTAB           0000000000000000  00000190
       000000000000000d  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

As you can see, no memory addresses have been assigned yet (the Address column is all zeroes); we only have offsets into the ELF file.

Some of the most common sections are (rodata does not show up in the listing above, because our program has no constants):

  • text – where your code is (goes to flash memory in embedded systems)
  • data – initialized variables and arrays (starts in flash memory, lives in RAM in embedded systems)
  • rodata – constants, read-only data
  • bss – uninitialized (zero-initialized) variables and arrays; takes up no space in the file (ends up in RAM in embedded systems)

ELF for executables

Header

Next we can try to see the difference in headers by also linking the file:

$ gcc main.c

The executable file created will by default be named a.out, after an old file format for object files and executables. Although the name remained, the file is in ELF format.

$ readelf -h a.out
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4003e0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6568 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         9
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 28

As you can see, the file now has the type of EXEC, meaning it is an executable file, and the memory addresses have been filled in by the linker. Also, a number of C runtime startup components are statically linked into the executable, which we can also see in the difference in file sizes:

 12K a.out
4.0K main.c
4.0K main.o

Segments

As already mentioned, segments are not present in relocatable object files; they only appear in loadable files such as our executable. The section headers, which are optional in an executable file, are in this case also present in it, which is why readelf can provide the section-to-segment mapping.

Let’s read the program headers (segment headers) in the executable file:

$ readelf -l a.out
Elf file type is EXEC (Executable file)
Entry point 0x4003e0
There are 9 program headers, starting at offset 64
 
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001f8 0x00000000000001f8  R E    8
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                 0x000000000000001c 0x000000000000001c  R      1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x000000000000069c 0x000000000000069c  R E    200000
  LOAD           0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x0000000000000220 0x0000000000000228  RW     200000
  DYNAMIC        0x0000000000000e28 0x0000000000600e28 0x0000000000600e28
                 0x00000000000001d0 0x00000000000001d0  RW     8
  NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                 0x0000000000000044 0x0000000000000044  R      4
  GNU_EH_FRAME   0x0000000000000574 0x0000000000400574 0x0000000000400574
                 0x0000000000000034 0x0000000000000034  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10
  GNU_RELRO      0x0000000000000e10 0x0000000000600e10 0x0000000000600e10
                 0x00000000000001f0 0x00000000000001f0  R      1
 
 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   04     .dynamic
   05     .note.ABI-tag .note.gnu.build-id
   06     .eh_frame_hdr
   07
   08     .init_array .fini_array .jcr .dynamic .got

As you can see in the output, there are many types of segments, for example:

LOAD
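The segment's content is loaded from the executable file, starting at the file offset specified in the output, and mapped at the given virtual address. FileSiz tells us how many bytes to read from the file and MemSiz how much memory the segment occupies; when MemSiz is larger (as in the second LOAD segment, which contains .bss), the remainder is filled with zeroes.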

GNU_STACK
This segment does not hold the stack itself; it only carries the permissions the stack should get (RW here, so not executable). As you can see, the offset, addresses and sizes are all zero, because the kernel is the one that decides where the stack will start and how big it will be.

ELF for shared libraries

Let’s create a shared library first:

$ nano test.c
#include <stdio.h>
void test(void) {
    printf("Shared library test\n");
}
$ nano test.h
#ifndef test_h__
#define test_h__
extern void test(void);
#endif  // test_h__
$ nano main.c
#include <stdio.h>
#include "test.h"
 
int main(void)
{
    printf("Main.c:\n");
    test();
    return 0;
}

Let’s build our code into PIC (position independent code). This will create a file called test.o.

$ gcc -c -fpic test.c

Next, let’s create a shared library from our object code. This will create a file called libtest.so.

$ gcc -shared  -o libtest.so test.o

Now we build main.c and link it dynamically against the shared library, telling gcc in which directory to look for libtest.so:

$ gcc -L /home/trinch/elftest/sharedlibrary main.c -ltest

For runtime, we need to make the shared library available to the loader:

$ export LD_LIBRARY_PATH=/home/trinch/elftest/sharedlibrary:$LD_LIBRARY_PATH

Now we can run a.out, and we should see 'Main.c:' and 'Shared library test' printed to the console on two separate lines.

$ ./a.out

For the purposes of exploring the ELF format, we will only take a look at libtest.so.

Header

$ readelf -h libtest.so
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x5a0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6264 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         29
  Section header string table index: 26

As you can see, the type is now DYN (Shared object file) and both the sections and segments are present.

Credits:
Icon made by Freepik from www.flaticon.com
