Windows Address Translation Deep Dive - Part 1

x BlueRobot · Mar 16, 2021

This is one of the most fundamental topics to understand when you're debugging. It is far too long to cover in one post, so I'm planning to split it into at least two posts, possibly three if needed. Please note that this is a technical topic and I'll be covering as much technical content as possible; even then, we'll only be scratching the surface of memory address translation. We'll be looking at segmentation, its role in x86 address translation and how it is largely ignored on x64. Afterwards, we'll be prepared to take a look at paging and why it was introduced.

First of all, we need to go back to the past - the 16-bit era - and take a look at memory segmentation, a feature which still exists on modern processors but is thankfully mostly ignored on x64 processors operating in long mode. Before we take a look at that, it's important to recognise that there are three fundamental memory models: physical, flat (sometimes called linear) and segmented. Along with this, there are three modes of operation which the processor can be in: real mode, protected mode and system management mode (SMM).

Differences in Memory Models:

As the name suggests, the physical memory model is just that: physical memory. We're able to write directly to RAM and there are absolutely no protection mechanisms in place. Once we've addressed all of the available physical memory, then the system will simply crash and we'll be unable to continue.

The flat memory model adds the concept of virtual memory. Now the address space is a single contiguous region of bytes, with each byte directly corresponding to a byte in physical memory. Each address in the flat memory model is known as a linear memory address, and the address space is commonly referred to as the linear address space.

Now, the segmented memory model builds upon the flat memory model by dividing the address space into distinct regions called segments. Each address is determined by combining the base address of the segment with the effective address. The pair of segment and effective address is known as the logical address and, in some circumstances, it is also referred to as a far pointer.
The segment is identified by a segment selector, which is held in a special register called a segment register. The effective address is an offset into the segment referenced by the segment selector.

Now, you could consider the paging model as a separate memory model; however, since it builds upon the flat and segmented memory models, I think it is easier to refer to it as an extension of those memory models. It is also important to note that, despite Windows using a segmented memory model, the details are abstracted away from the developer and effectively the address space of a process is flat. This is especially true when the processor is operating in long mode. We'll discuss the different operating modes of the processor in the next section.

Differences in Modes:

Real Mode

Real Mode is the first mode which the processor is in when the computer boots, before the operating system has been loaded. It implements the segmented memory model and there are very limited memory protections in place, so it is very easy to corrupt the system in this mode. The address space is limited to 20 bits and thus only 1MB of RAM is addressable. Now, each logical address is split into two distinct parts: a 16-bit segment selector and a 16-bit offset. As mentioned previously, the segment selector refers to the base address of the segment and the offset is the effective address which is used to index into that segment. The processor shifts the segment value left by four bits (multiplies it by 16) to obtain the base address and adds the offset to produce the physical memory address in RAM.
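To make the Real Mode calculation concrete, here is a minimal Python sketch. The function name is my own, and the 20-bit mask assumes 8086-style wrap-around (the A20 line disabled); it is only meant to illustrate the arithmetic.

```python
def real_mode_physical(segment: int, offset: int) -> int:
    """Translate a real-mode segment:offset pair into a 20-bit physical address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must both fit in 16 bits")
    # Base = segment * 16. The mask is an assumption: 8086-style wrap at 1MB.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_physical(0xB800, 0x0010)))  # 0xb8010
# Different segment:offset pairs can alias the same physical byte.
print(real_mode_physical(0x1000, 0x0000) == real_mode_physical(0x0FFF, 0x0010))  # True
```

The second line of output shows why segments overlap in Real Mode: each segment starts only 16 bytes after the previous one.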

While operating in Real Mode, the size of each register is limited to 16 bits. The registers are mostly shortened versions of their x64 counterparts, but please note that many registers which are available on its successors are not available in Real Mode. There are a few key registers which we'll need to keep in mind; the segment registers are especially important, since they are key when using the segmented memory model.

The four segment registers are as follows:

CS - known as the code segment - contains the segment selector for the currently executing code.

DS - known as the data segment - contains the segment selector for the program's data.

SS - known as the stack segment - contains the segment selector for the program's stack.

ES - known as the extra segment - acts as an additional data segment if the program requires it.

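There are also the FS and GS segments, which have no special meaning to the processor and are used by the operating system as it wishes. On x64 Windows, the GS segment is used in user mode to locate the thread environment block (TEB) and in kernel mode to locate the processor control region (PCR); on 32-bit Windows, the FS segment serves this purpose.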

These segment registers can be found in WinDbg using the standard r command.

Rich (BB code):

0: kd> r
rax=fffff8041cdd4e80 rbx=ffff9888c9996000 rcx=ffff9888c9996000
rdx=ffff9888d4332010 rsi=0000000000000000 rdi=ffff9888d2cd0000
rip=fffff8041cdd4e80 rsp=fffff80417474518 rbp=fffff80417474800
r8=0000000000000000  r9=0000000000000000 r10=0000000000000000
r11=0000000000000000 r12=ffff9888d2cd0000 r13=fffff80417474cf8
r14=0000000000000000 r15=ffff9888d2018000
iopl=0         nv up ei pl nz na po nc
cs=0010  ss=0000  ds=0000  es=0000  fs=0000  gs=0000             efl=00000206
igdkmd64+0x334e80:
fffff804`1cdd4e80 48895c2408      mov     qword ptr [rsp+8],rbx ss:fffff804`17474520=ffff9888c9996000

In addition to the segment registers which store the segment selectors, there are the pointer registers, which are used to store the effective address (offset) for a particular segment. These are as follows:

IP - the instruction pointer - contains the offset of the next instruction to be executed and is used in conjunction with the code segment.

SP - the stack pointer - contains the offset of the top of the stack and is thus used with the stack segment.

BP - the base pointer - points to the base of the current stack frame and is used with the stack segment.

Protected Mode

Now, Protected Mode is where things tend to become slightly more complicated. The segmented memory model is still used; however, paging has been added on top of it to provide additional, more granular memory protection. On x64 systems, Protected Mode is extended further with the introduction of Long Mode. The key difference to note at this point is that Long Mode treats the base address of all the segment registers (excluding FS and GS) as 0, which effectively means that the address space is a flat memory model with the addition of paging. For now, we'll simply focus on the segmented aspect of Protected Mode, and will pretend that paging has not been enabled.

The same segment registers exist as stated beforehand; however, their selectors are no longer used to map directly to a physical address. Instead, Protected Mode introduces two registers: the global descriptor table register (GDTR) and the local descriptor table register (LDTR). The local descriptor table is not used on Windows and therefore we'll ignore it and pretend it does not exist. The GDTR contains the base address of the global descriptor table (GDT). Each entry in the table is known as a segment descriptor and describes a particular segment in the linear address space. The segment selectors held in the segment registers are used to index into the GDT.

The segment selector is still 16 bits, although the effective address has now been extended to 32 bits. The linear address is constructed in the same manner as before, with one key difference: the base address is taken from the segment descriptor and then combined with the effective address. Of course, in this case, the linear address is now 32 bits, which is the reason why x86 operating systems are able to address up to 4GB of RAM.
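The lookup can be sketched in Python as follows. This is a simplified model rather than how the hardware is implemented: the SegmentDescriptor type, the logical_to_linear function and the example table are all hypothetical, only the base and limit fields are modelled, and paging is ignored. The selector index is taken from the upper 13 bits of the selector, a layout covered later in this post.

```python
from typing import NamedTuple, Sequence


class SegmentDescriptor(NamedTuple):
    """Hypothetical, simplified descriptor: only the fields needed here."""
    base: int   # linear address where the segment starts
    limit: int  # highest valid offset within the segment


def logical_to_linear(gdt: Sequence[SegmentDescriptor], selector: int, offset: int) -> int:
    """Resolve selector:offset to a 32-bit linear address (paging ignored)."""
    index = selector >> 3  # upper 13 bits of the selector index the table
    descriptor = gdt[index]
    if offset > descriptor.limit:
        raise ValueError("offset exceeds the segment limit")
    return (descriptor.base + offset) & 0xFFFFFFFF


# Made-up table: a null entry followed by one segment based at 0x00400000.
example_gdt = [SegmentDescriptor(0, 0), SegmentDescriptor(0x00400000, 0xFFFF)]
print(hex(logical_to_linear(example_gdt, 0x08, 0x1234)))  # 0x401234
```

With a base of 0 and a limit covering the whole address space, the linear address simply equals the offset, which is the flat arrangement described earlier.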

The following diagram describes the translation process from a logical address into the linear address space. With paging, the linear address would then be translated through the use of page tables to a physical address. We'll cover this process in a later part of this tutorial series.

Before we continue, let's explore some of the concepts which we've described using WinDbg. The GDTR can be dumped with the r command like any other register.

Rich (BB code):

0: kd> r @gdtr
gdtr=fffff80417462fb0

Alternatively, the base address of the GDT can be found within the PCR structure for each processor.

Rich (BB code):

0: kd> dt _KPCR -y GdtBase fffff8040defa000
nt!_KPCR
   +0x000 GdtBase : 0xfffff804`17462fb0 _KGDTENTRY64

The limit of the table, which is its size in bytes minus one, can be found by dumping the GDTL register as follows:

Rich (BB code):

0: kd> r @gdtl
gdtl=0057

To dump a particular segment descriptor, use the dg command along with the segment selector which you wish to dump. You can provide either the selector value itself or a segment register which holds it.

Rich (BB code):

2: kd> dg 0x33
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0033 00000000`00000000 00000000`00000000 Code RE Ac 3 Nb By P  Lo 000002fb

Let's examine some of the fields shown by the command. Firstly, the Base address is 0, which is what we would expect on x64 systems running in Long Mode. The base address is like this for almost all segments, as described previously. The Type field describes the type of segment descriptor and, in this example, the descriptor is for a code segment, i.e. one which would be loaded into the cs register. The other two important fields are the Privilege Level (Pl) and the Present (Pres) field. The Privilege Level is the Descriptor Privilege Level (DPL), which corresponds to the ring level at which the segment can be accessed. The Present field indicates whether the segment is resident within physical memory or not.

The segment selector itself can be broken down in WinDbg by using the .formats command along with the value of the segment selector. A segment selector follows a particular format which we'll describe in a moment.

Rich (BB code):

2: kd> .formats 0x33
Evaluate expression:
  Hex:     00000000`00000033
  Decimal: 51
  Octal:   0000000000000000000063
  Binary:  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00110011
  Chars:   .......3
  Time:    Thu Jan  1 00:00:51 1970
  Float:   low 7.14662e-044 high 0
  Double:  2.51973e-322

The two least significant bits indicate the requested privilege level (RPL); in this case, the requested privilege level is ring 3, or user-mode. The third bit is a flag which indicates if the selector corresponds to the GDT (0) or the LDT (1). Lastly, the remaining 13 bits are the index. The index is taken by the processor and multiplied by 8, the size of a descriptor table slot, and this is then added to the base address of the GDT or LDT, depending on whether the table indicator bit is set. On x64, system descriptors such as the TSS are 16 bytes and occupy two consecutive slots, but the index is still scaled by 8. For 0x33, the RPL is 3, the table indicator is 0 (GDT) and the index is 6.

The following expression will provide the address of the corresponding descriptor for that selector.

Code:

? (index * 0n8) + base
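The same breakdown can be sketched in Python. The function names are my own and the sketch assumes 8-byte descriptor slots, as described in the Intel manuals; the table base is the GDTR value from the earlier dump.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit segment selector into its three fields."""
    return {
        "rpl": selector & 0b11,                         # bits 0-1: requested privilege level
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2: table indicator
        "index": selector >> 3,                         # bits 3-15: descriptor table index
    }


def descriptor_address(table_base: int, selector: int) -> int:
    """Address of the descriptor slot which the selector refers to."""
    return table_base + (selector >> 3) * 8  # assumes 8-byte slots


print(decode_selector(0x33))  # {'rpl': 3, 'table': 'GDT', 'index': 6}
print(hex(descriptor_address(0xFFFFF80417462FB0, 0x33)))  # 0xfffff80417462fe0
```

Note that the descriptor offset (0x30) is simply the selector with its low three bits cleared.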

The segment selector can be visualised in the following diagram:

Privilege Levels:

While this is not strictly related to address translation, I thought it would be worth mentioning that the DPL and RPL are compared against the current privilege level (CPL) when a segment is accessed, for example by far CALL and JMP instructions. This is to prevent user-mode applications from directly accessing hardware and/or executing instructions which are considered "privileged". The CPL is stored in the low two bits of the code segment selector; therefore, instead of an RPL, we have a CPL for the cs register. There are more nuances and other things to be aware of; however, they don't merit discussion within this tutorial.

Global Descriptor Table:

We've now described some of the differences between protected mode and real mode, in addition to the differences between the various memory models. Let's have a look at the GDT and how we're able to view it within WinDbg.

The first entry of the GDT is always a null segment descriptor. This is a special entry, and an exception is raised if a segment register is set to it and then subsequently used to access a region of memory. There is a good discussion thread which describes this behavior.

Rich (BB code):

0: kd> dg 0x00
                                                    P Si Gr Pr Lo
Sel        Base              Limit          Type    l ze an es ng Flags
---- ----------------- ----------------- ---------- - -- -- -- -- --------
0000 00000000`00000000 00000000`00000000 <Reserved> 0 Nb By Np Nl 00000000

As previously mentioned, each entry within the GDT corresponds to a segment descriptor. Each descriptor is represented by a union type called _KGDTENTRY64. This is what the dg command uses to parse the descriptor referenced by the segment selector passed to it. The structure is used by the !ms_gdt command as well.

Rich (BB code):

0: kd> dt _KGDTENTRY64
nt!_KGDTENTRY64
   +0x000 LimitLow         : Uint2B
   +0x002 BaseLow          : Uint2B
   +0x004 Bytes            : <anonymous-tag>
   +0x004 Bits             : <anonymous-tag>
   +0x008 BaseUpper        : Uint4B
   +0x00c MustBeZero       : Uint4B
   +0x000 DataLow          : Int8B
   +0x008 DataHigh         : Int8B
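To show how the first eight bytes of an entry map onto fields such as the base, limit and DPL, here is a rough Python sketch which follows the architectural descriptor layout. The function name and the example bytes are my own: the bytes are reconstructed from the dg 0x33 output shown earlier, on the assumption that its Flags value (0x2fb) holds the access byte in its low eight bits. The sketch ignores BaseUpper, which only applies to 16-byte system descriptors.

```python
import struct


def parse_descriptor(raw: bytes) -> dict:
    """Decode the first 8 bytes of a GDT entry (architectural layout)."""
    limit_low, base_low, base_mid, access, flags, base_high = struct.unpack(
        "<HHBBBB", raw[:8]
    )
    return {
        "base": base_low | (base_mid << 16) | (base_high << 24),
        "limit": limit_low | ((flags & 0x0F) << 16),
        "type": access & 0x0F,        # segment type, e.g. 0xB = code, readable, accessed
        "dpl": (access >> 5) & 0b11,  # descriptor privilege level
        "present": bool(access & 0x80),
        "long": bool(flags & 0x20),   # 64-bit code segment flag
    }


# Assumed raw bytes for selector 0x33: base 0, limit 0, access 0xFB, long bit set.
user_code = bytes.fromhex("0000000000fb2000")
print(parse_descriptor(user_code))
```

For these bytes the sketch reports a DPL of 3 with the present and long flags set, which agrees with the Pl, Pres and Long columns of the dg output.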

Each processor will have its own GDT and the easiest method of dumping the GDT for each processor is by using the !ms_gdt command, which is available by loading the third-party SwishDbgExt extension library. Here is a sample of the output.

Rich (BB code):

3: kd> !ms_gdt
    |-----|-----|--------------------|--------------------------------------------------------|
    | Cre | Idx | Type                             | Address            | Name                                                   |
    |-----|-----|--------------------|--------------------------------------------------------|
    |   0 |   0 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   1 | TSS32 Busy                       | 0x0000000000000000 | None                                                   |
    |   0 |   2 | TSS32 Busy                       | 0x0000FFFF00000000 | None                                                   |
    |   0 |   3 | TSS32 Busy                       | 0x0000000000000000 | None                                                   |
    |   0 |   4 | Code RE Ac                       | 0xFFFFF80417461000 | None                                                   |
    |   0 |   5 | TSS16 Busy                       | 0x0000000000000000 | None                                                   |
    |   0 |   6 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   7 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   8 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   9 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   a | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   b | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   c | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   d | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   e | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |   f | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |  10 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |  11 | Data RO                          | 0x0000000000000000 | None                                                   |
    |   0 |  12 | Data RO                          | 0x0000000000000000 | None                                                   |

The first column indicates the processor number (core index), with the second column referring to the index into the GDT. It's also important to note that, on x64 systems, any attempt to modify or "hook" the GDT will result in a Stop 0x109 bugcheck. This is because the GDT is a protected structure on x64 systems.

Please note that the GDT can have up to 8,192 entries; the !ms_gdt command stops at 256 entries. Therefore, if you wish to see the entire table, I suggest writing a simple script which will loop through the entire table and dump each descriptor.
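As a rough illustration of such a loop, the Python sketch below only generates the text of one dg command per table slot; running them in the debugger is left to you. The function names are my own, the 8-byte slot size is an assumption based on the Intel manuals, and the limit is the gdtl value from the earlier dump.

```python
def gdt_selectors(limit: int, slot_size: int = 8):
    """Yield a selector (RPL 0, GDT) for every slot within the table limit."""
    for byte_offset in range(0, limit + 1, slot_size):
        yield byte_offset  # equals index << 3, with the TI and RPL bits clear


def dg_commands(limit: int) -> list:
    """Build one WinDbg dg command line per slot."""
    return [f"dg 0x{selector:x}" for selector in gdt_selectors(limit)]


commands = dg_commands(0x57)  # limit taken from the gdtl value shown earlier
print(len(commands))  # 11
print(commands[6])    # dg 0x30
```

Bear in mind that a 16-byte system descriptor occupies two slots, so the command generated for its second half will not describe a real segment.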

Paging and Segmentation:

Now, we've discussed the use of segmentation within Windows and how it is used in address translation. Please refer to the following diagram which will illustrate how paging and segmentation are utilised together. In the next part, we'll be looking at paging and how it is managed on Windows.

References:

Translating Virtual to Physical Address on Windows: Segmentation - Infosec Resources
CPU Rings, Privilege, and Protection
Operating Systems Development Series
The 0x33 Segment Selector (Heavens Gate) - MalwareTech
Intel Processor Architecture Manual - Volume 3, Chapter 3
The Rootkit Arsenal 2nd Edition
