PE file format
My next objective is to explore how DLL and Reflective DLL injections work. However, before getting there, it is important to take a step back and understand what a PE file is and how it works.
A PE (Portable Executable) file is a standard file format used in Windows to store executable code, such as programs (.exe) and libraries (.dll). It contains a structured layout with headers, code, data, and resources that the operating system uses to load and execute the file.
In this post, I will explore the different sections that compose a PE file and their uses.
PE File Structure
This image represents the structure in memory of a PE:
It may be a bit overwhelming, but lets go one by one.
DOS Header (IMAGE_DOS_HEADER
):
DOS stands for Disk Operating System, an operating system that was widely used in the 1980s and early 1990s. This header offers backward compatibility with DOS environments.
If the file is run in a DOS environment, the DOS Header ensures that a message or a small piece of code is executed, typically displaying a message like “This program cannot be run in DOS mode.”
This header is basically this struct:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
typedef struct _IMAGE_DOS_HEADER { // DOS .EXE header
WORD e_magic; // Magic number
WORD e_cblp; // Bytes on last page of file
WORD e_cp; // Pages in file
WORD e_crlc; // Relocations
WORD e_cparhdr; // Size of header in paragraphs
WORD e_minalloc; // Minimum extra paragraphs needed
WORD e_maxalloc; // Maximum extra paragraphs needed
WORD e_ss; // Initial (relative) SS value
WORD e_sp; // Initial SP value
WORD e_csum; // Checksum
WORD e_ip; // Initial IP value
WORD e_cs; // Initial (relative) CS value
WORD e_lfarlc; // File address of relocation table
WORD e_ovno; // Overlay number
WORD e_res[4]; // Reserved words
WORD e_oemid; // OEM identifier (for e_oeminfo)
WORD e_oeminfo; // OEM information; e_oemid specific
WORD e_res2[10]; // Reserved words
LONG e_lfanew; // Offset to the NT header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;
The e_magic
is the magic number and in a executable file this two bytes will always be MZ
(0x5A4D
).
Another important value here is the e_lfanew
, which is a pointer that indicates the offset in memory where we can find the start of NT header. We will always find this value at the 0x3C
offset.
DOS Stub (MS-DOS Stub Program
)
In modern executable files, this header just has an error message that prints “This program cannot be run in DOS mode” in case the program is loaded in DOS mode.
PE Header/NT Headers (IMAGE_NT_HEADERS
)
This header is composed of 3 other structures. This is the structure of the PE Header:
1
2
3
4
5
typedef struct _IMAGE_NT_HEADERS64 {
DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER64 OptionalHeader;
} IMAGE_NT_HEADERS64, *PIMAGE_NT_HEADERS64;
- Signature Header: The NT header starts with a signature that identifies the file as a valid PE (like the magic number). This signature is “PE\0\0” (
0x50450000
), being \0 a null byte. - File Header: This header is the
IMAGE_FILE_HEADER
struct:
1
2
3
4
5
6
7
8
9
typedef struct _IMAGE_FILE_HEADER {
WORD Machine;
WORD NumberOfSections;
DWORD TimeDateStamp;
DWORD PointerToSymbolTable;
DWORD NumberOfSymbols;
WORD SizeOfOptionalHeader;
WORD Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;
This header contains basic information about the the PE itself. The most important members for us are:
NumberOfSections
: Indicates the number of Sections that the PE will have. In a Portable Executable (PE) file, a section is a distinct region that contains a specific type of data or code.Characteristics
: It indicates some characteristics about the file, like if it is a DLL or a system file. More info here.SizeOfOptionalHeader
: The size of the next header (inside the PE Header), which is the Optional Header
- Optional Header: This header is the
IMAGE_OPTIONAL_HEADER
struct. It contains a lot of information about the PE file.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
typedef struct _IMAGE_OPTIONAL_HEADER64 {
WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint;
DWORD BaseOfCode;
ULONGLONG ImageBase;
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage;
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
ULONGLONG SizeOfStackReserve;
ULONGLONG SizeOfStackCommit;
ULONGLONG SizeOfHeapReserve;
ULONGLONG SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER64, *PIMAGE_OPTIONAL_HEADER64;
Some of the most important members for us are:
Magic
- Describes the state of the image file (32 or 64-bit image)SizeOfCode
- The size of the.text
section.AddressOfEntryPoint
- Offset to the entry point of the file (Typically the main function)BaseOfCode
- Offset to the start of the.text
sectionSizeOfImage
- The size of the image file in bytesImageBase
- It specifies the preferred address at which the application is to be loaded into memory when it is executed. However, due to Window’s memory protection mechanisms like Address Space Layout Randomization (ASLR), it’s rare to see an image mapped to its preferred address.DataDirectory
- One of the most important members in the optional header. This is an array of IMAGE_DATA_DIRECTORY, which contains the directories in a PE file. It has a size of 16 and each element in the array is the different Data directory. TheIMAGE_DATA_DIRECTORY
struct is:1 2 3 4
typedef struct _IMAGE_DATA_DIRECTORY { DWORD VirtualAddress; DWORD Size; } IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
The different directories are:
The most important ones for us are the
Export Table
andImport Table
.The
Export Table
is a data directory that contains information about functions and variables that are exported from the executable so other executable files can use them . It contains the addresses of the exported functions and variables.The
Import Table
is another data directory that contains information about the addresses of functions that need to be imported from other executable files. These addresses are used to access the functions and their data in the other executable.
PE Sections
PE Sections contains all the data and code necessary for the executable program to run. Each section has a name and contain a specific type of content, such as code, data, etc.
There is not a specific number of sections in a PE file. That’s why the NumberOfSections
member inside the IMAGE_FILE_HEADER
helps to determine the number of them.
These are the most improtant types of PE sections that exist in almost every PE:
.text
- Contains the executable code which is the written code..data
- Contains initialized data which are variables initialized in the code..rdata
- Contains read-only data. These are constant variables prefixed withconst
..idata
- Contains the import tables. These are tables of information related to the functions called using the code. This is used by the Windows PE Loader to determine which DLL files to load to the process, along with what functions are being used from each DLL.
The
IMAGE_DATA_DIRECTORY
that points to theImport Table
will probably point here. However, you can’t assume the import section is called “.idata”. You should locate the imports usingIMAGE_OPTIONAL_HEADER64.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT]
.
.reloc
- Contains information on how to fix up memory addresses so that the program can be loaded into memory without any errors..rsrc
- Used to store resources such as icons and bitmaps
Each section has a header data structure:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
typedef struct _IMAGE_SECTION_HEADER {
BYTE Name[IMAGE_SIZEOF_SHORT_NAME];
union {
DWORD PhysicalAddress;
DWORD VirtualSize;
} Misc;
DWORD VirtualAddress;
DWORD SizeOfRawData;
DWORD PointerToRawData;
DWORD PointerToRelocations;
DWORD PointerToLinenumbers;
WORD NumberOfRelocations;
WORD NumberOfLinenumbers;
DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
These headers are located just after the PE Header/NET Header and allow to locate the actual data in the memory.