Home PE file structure
Post
Cancel

PE file structure

PE file format

My next objective is to explore how DLL and Reflective DLL injections work. However, before getting there, it is important to take a step back and understand what a PE file is and how it works.

A PE (Portable Executable) file is a standard file format used in Windows to store executable code, such as programs (.exe) and libraries (.dll). It contains a structured layout with headers, code, data, and resources that the operating system uses to load and execute the file.

In this post, I will explore the different sections that compose a PE file and their uses.

PE File Structure

This image represents the structure in memory of a PE:

PE-Structure.png

It may be a bit overwhelming, but lets go one by one.

DOS Header (IMAGE_DOS_HEADER):

DOS stands for Disk Operating System, an operating system that was widely used in the 1980s and early 1990s. This header offers backward compatibility with DOS environments.

If the file is run in a DOS environment, the DOS Header ensures that a message or a small piece of code is executed, typically displaying a message like “This program cannot be run in DOS mode.”

This header is basically this struct:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
typedef struct _IMAGE_DOS_HEADER {      // DOS .EXE header
    WORD   e_magic;                     // Magic number
    WORD   e_cblp;                      // Bytes on last page of file
    WORD   e_cp;                        // Pages in file
    WORD   e_crlc;                      // Relocations
    WORD   e_cparhdr;                   // Size of header in paragraphs
    WORD   e_minalloc;                  // Minimum extra paragraphs needed
    WORD   e_maxalloc;                  // Maximum extra paragraphs needed
    WORD   e_ss;                        // Initial (relative) SS value
    WORD   e_sp;                        // Initial SP value
    WORD   e_csum;                      // Checksum
    WORD   e_ip;                        // Initial IP value
    WORD   e_cs;                        // Initial (relative) CS value
    WORD   e_lfarlc;                    // File address of relocation table
    WORD   e_ovno;                      // Overlay number
    WORD   e_res[4];                    // Reserved words
    WORD   e_oemid;                     // OEM identifier (for e_oeminfo)
    WORD   e_oeminfo;                   // OEM information; e_oemid specific
    WORD   e_res2[10];                  // Reserved words
    LONG   e_lfanew;                    // Offset to the NT header
  } IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The e_magic is the magic number and in a executable file this two bytes will always be MZ (0x5A4D).

Another important value here is the e_lfanew , which is a pointer that indicates the offset in memory where we can find the start of NT header. We will always find this value at the 0x3C offset.

DOS Stub (MS-DOS Stub Program)

In modern executable files, this header just has an error message that prints “This program cannot be run in DOS mode” in case the program is loaded in DOS mode.

1_rELYvjYD9HZmHmhEkZQabw.webp

PE Header/NT Headers (IMAGE_NT_HEADERS)

This header is composed of 3 other structures. This is the structure of the PE Header:

1
2
3
4
5
typedef struct _IMAGE_NT_HEADERS64 {
    DWORD                   Signature;
    IMAGE_FILE_HEADER       FileHeader;
    IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;
  • Signature Header: The NT header starts with a signature that identifies the file as a valid PE (like the magic number). This signature is “PE\0\0” (0x50450000 ), being \0 a null byte.
  • File Header: This header is the IMAGE_FILE_HEADER struct:
1
2
3
4
5
6
7
8
9
typedef struct _IMAGE_FILE_HEADER {
  WORD  Machine;
  WORD  NumberOfSections;
  DWORD TimeDateStamp;
  DWORD PointerToSymbolTable;
  DWORD NumberOfSymbols;
  WORD  SizeOfOptionalHeader;
  WORD  Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;

This header contains basic information about the the PE itself. The most important members for us are:

  1. NumberOfSections : Indicates the number of Sections that the PE will have. In a Portable Executable (PE) file, a section is a distinct region that contains a specific type of data or code.
  2. Characteristics : It indicates some characteristics about the file, like if it is a DLL or a system file. More info here.

    image.png

  3. SizeOfOptionalHeader: The size of the next header (inside the PE Header), which is the Optional Header
  • Optional Header: This header is the IMAGE_OPTIONAL_HEADER struct. It contains a lot of information about the PE file.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
typedef struct _IMAGE_OPTIONAL_HEADER64 {
  WORD                 Magic;
  BYTE                 MajorLinkerVersion;
  BYTE                 MinorLinkerVersion;
  DWORD                SizeOfCode;
  DWORD                SizeOfInitializedData;
  DWORD                SizeOfUninitializedData;
  DWORD                AddressOfEntryPoint;
  DWORD                BaseOfCode;
  ULONGLONG            ImageBase;
  DWORD                SectionAlignment;
  DWORD                FileAlignment;
  WORD                 MajorOperatingSystemVersion;
  WORD                 MinorOperatingSystemVersion;
  WORD                 MajorImageVersion;
  WORD                 MinorImageVersion;
  WORD                 MajorSubsystemVersion;
  WORD                 MinorSubsystemVersion;
  DWORD                Win32VersionValue;
  DWORD                SizeOfImage;
  DWORD                SizeOfHeaders;
  DWORD                CheckSum;
  WORD                 Subsystem;
  WORD                 DllCharacteristics;
  ULONGLONG            SizeOfStackReserve;
  ULONGLONG            SizeOfStackCommit;
  ULONGLONG            SizeOfHeapReserve;
  ULONGLONG            SizeOfHeapCommit;
  DWORD                LoaderFlags;
  DWORD                NumberOfRvaAndSizes;
  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;

Some of the most important members for us are:

  1. Magic - Describes the state of the image file (32 or 64-bit image)
  2. SizeOfCode - The size of the .text section.
  3. AddressOfEntryPoint - Offset to the entry point of the file (Typically the main function)
  4. BaseOfCode - Offset to the start of the .text section
  5. SizeOfImage - The size of the image file in bytes
  6. ImageBase - It specifies the preferred address at which the application is to be loaded into memory when it is executed. However, due to Window’s memory protection mechanisms like Address Space Layout Randomization (ASLR), it’s rare to see an image mapped to its preferred address.
  7. DataDirectory - One of the most important members in the optional header. This is an array of IMAGE_DATA_DIRECTORY, which contains the directories in a PE file. It has a size of 16 and each element in the array is the different Data directory. The IMAGE_DATA_DIRECTORY struct is:

    1
    2
    3
    4
    
     typedef struct _IMAGE_DATA_DIRECTORY {
         DWORD   VirtualAddress;
         DWORD   Size;
     } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
    

    The different directories are:

    image.png

    The most important ones for us are the Export Table and Import Table.

    The Export Table is a data directory that contains information about functions and variables that are exported from the executable so other executable files can use them . It contains the addresses of the exported functions and variables.

    The Import Table is another data directory that contains information about the addresses of functions that need to be imported from other executable files. These addresses are used to access the functions and their data in the other executable.

PE Sections

PE Sections contains all the data and code necessary for the executable program to run. Each section has a name and contain a specific type of content, such as code, data, etc.

There is not a specific number of sections in a PE file. That’s why the NumberOfSections member inside the IMAGE_FILE_HEADER helps to determine the number of them.

These are the most improtant types of PE sections that exist in almost every PE:

  • .text - Contains the executable code which is the written code.
  • .data - Contains initialized data which are variables initialized in the code.
  • .rdata - Contains read-only data. These are constant variables prefixed with const.
  • .idata - Contains the import tables. These are tables of information related to the functions called using the code. This is used by the Windows PE Loader to determine which DLL files to load to the process, along with what functions are being used from each DLL.

The IMAGE_DATA_DIRECTORY that points to the Import Table will probably point here. However, you can’t assume the import section is called “.idata”. You should locate the imports using IMAGE_OPTIONAL_HEADER64.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].

  • .reloc - Contains information on how to fix up memory addresses so that the program can be loaded into memory without any errors.
  • .rsrc - Used to store resources such as icons and bitmaps

Each section has a header data structure:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
typedef struct _IMAGE_SECTION_HEADER {
  BYTE  Name[IMAGE_SIZEOF_SHORT_NAME];
  union {
    DWORD PhysicalAddress;
    DWORD VirtualSize;
  } Misc;
  DWORD VirtualAddress;
  DWORD SizeOfRawData;
  DWORD PointerToRawData;
  DWORD PointerToRelocations;
  DWORD PointerToLinenumbers;
  WORD  NumberOfRelocations;
  WORD  NumberOfLinenumbers;
  DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;

These headers are located just after the PE Header/NET Header and allow to locate the actual data in the memory.

This post is licensed under CC BY 4.0 by the author.