Thursday, 23 June 2016

Segmentation in Linux

Linux segmentation (protected mode): Segmentation is the first stage in converting a virtual (logical) address to a physical address. The conversion goes through segmentation and then paging.
In the segmentation stage, the segment base address is fetched from the segment descriptor and added to the offset in the virtual address. The result is a linear address, which paging then maps to a physical address.

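The segment base address for the code, data and stack segments is 0 in Linux. This is called the “flat model”. For these segments, the flat model is equivalent to disabling segmentation when it comes to translating memory addresses. FS and GS, which Linux uses for thread-local and per-CPU data, are the exception and may have a non-zero base.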
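The effect of the flat model can be sketched in a few lines of C. This is only an illustration of the arithmetic: the helper names linear_address and flat_linear_address are made up for this example and are not kernel functions.

```c
#include <stdint.h>

/* Hypothetical helper: the segmentation step adds the segment base
 * to the offset part of the logical address. */
static uint64_t linear_address(uint64_t segment_base, uint64_t offset)
{
    return segment_base + offset;
}

/* In the flat model the segment base is 0, so the linear address
 * equals the offset and segmentation changes nothing. */
static uint64_t flat_linear_address(uint64_t offset)
{
    return linear_address(0, offset);
}
```

With a base of 0, flat_linear_address(0x08048000) returns 0x08048000 unchanged, which is why only paging matters for address translation in this model.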
An executing Linux process is divided into segments, for example the code segment and the data segment. Each segment has a corresponding segment selector, held in a segment register (CS, DS and so on). The selector points to an entry in a segment descriptor table.
Two segment descriptor tables are available to executing code:

Local Descriptor Table (LDT): in the general x86 design the LDT is meant for per-process segments, typically one LDT per user process, describing privately held memory. Linux relies on paging to separate address spaces, so most processes do not need an LDT of their own.

Global Descriptor Table (GDT): shared memory and kernel memory are described by the GDT.
The locations of these tables are held in the Global Descriptor Table Register (GDTR) and the Local Descriptor Table Register (LDTR).
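How a selector picks an entry in one of the two tables can be shown with a small decoder. It follows the x86 selector layout (index in bits 15..3, table indicator in bit 2, RPL in bits 1..0); the struct and function names are invented for this sketch.

```c
#include <stdint.h>

/* Fields of a 16-bit x86 segment selector. */
struct selector_fields {
    unsigned index; /* bits 15..3: entry number in the descriptor table */
    unsigned ti;    /* bit 2: table indicator, 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Hypothetical helper: split a selector into its three fields. */
static struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = sel >> 3;
    f.ti = (sel >> 2) & 1;
    f.rpl = sel & 3;
    return f;
}

/* Descriptor slots are 8 bytes apart, so the byte offset of the
 * selected entry from the start of the table is index * 8. */
static unsigned descriptor_offset(uint16_t sel)
{
    return (unsigned)(sel >> 3) * 8;
}
```

For example, the selector 0x33 decodes to index 6 in the GDT with RPL 3, and 0x0f decodes to index 1 in the LDT with RPL 3.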
The LDT is reloaded on a context switch only when the incoming process uses a different LDT from the outgoing one:
/* Load the LDT, if the LDT is different: */
if (unlikely(prev->context.ldt != next->context.ldt))

Privilege levels (CPL, DPL, RPL):

x86 systems have privilege levels. These restrict memory access, I/O port access and the ability to execute certain machine instructions. The kernel runs at privilege level 0 and user space programs run at level 3. Running code cannot simply change its own privilege level; the change happens through the lcall, int, lret and iret instructions. Privilege is raised by lcall and int, and lowered by lret and iret. This explains the use of int 0x80 when making a system call: it is this int instruction that switches the privilege level from user (3) to kernel (0).

CPL is the current privilege level (found in the lower 2 bits of the CS register), RPL is the requested privilege level from the segment selector, and DPL is the descriptor privilege level of the segment (found in the descriptor). All privilege levels are integers in the range 0–3, where the lowest number corresponds to the highest privilege.
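The processor privilege level changes (and CS is reloaded) only through control-transfer instructions such as lcall, int, lret and iret, or through the fast system call instructions sysenter/sysexit and syscall/sysret.
In addition, when a data segment selector is loaded, the CPU checks that
max(CPL, RPL) ≤ DPL
otherwise a General Protection Fault is raised.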
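The check can be written out as a tiny predicate. This is a sketch of the rule as stated here, not kernel or CPU microcode; the names max_u and data_access_allowed are made up for the example.

```c
#include <stdbool.h>

/* Larger of two privilege numbers, i.e. the LESS privileged of the two. */
static unsigned max_u(unsigned a, unsigned b)
{
    return a > b ? a : b;
}

/* Hypothetical model of the data segment check. Returns true when
 * max(CPL, RPL) <= DPL; a false result corresponds to the case in
 * which the CPU would raise a General Protection Fault. */
static bool data_access_allowed(unsigned cpl, unsigned rpl, unsigned dpl)
{
    return max_u(cpl, rpl) <= dpl;
}
```

User code (CPL 3) asking for a kernel data segment (DPL 0) fails the check, while kernel code (CPL 0) may load a user segment (DPL 3). A kernel-mode access made through a selector with RPL 3 is treated as if it came from level 3, which is the purpose of the RPL field.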

Let's look at segmentation and the per-CPU GDT under Linux:
From arch/x86/include/asm/segment.h (linux-4.0.5):
The 64-bit architecture uses 16 GDT entries.
#define GDT_ENTRY_KERNEL32_CS 1
#define __KERNEL32_CS   (GDT_ENTRY_KERNEL32_CS * 8)

/*
 * we cannot use the same code segment descriptor for user and kernel
 * -- not even in the long flat mode, because of different DPL /kkeil
 * The segment offset needs to contain a RPL. Grr. -AK
 * GDT layout to get 64bit syscall right (sysret hardcodes gdt offsets)
 */
#define __USER32_CS   (GDT_ENTRY_DEFAULT_USER32_CS*8+3)
#define __USER32_DS     __USER_DS

#define GDT_ENTRY_TSS 8           /* needs two entries */
#define GDT_ENTRY_LDT 10          /* needs two entries */
#define GDT_ENTRY_TLS_MIN 12
#define GDT_ENTRY_TLS_MAX 14

#define GDT_ENTRY_PER_CPU 15      /* Abused to load per CPU data from limit */
#define __PER_CPU_SEG   (GDT_ENTRY_PER_CPU * 8 + 3)

/* TLS indexes for 64bit - hardcoded in arch_prctl */
#define FS_TLS 0
#define GS_TLS 1

#define GS_TLS_SEL ((GDT_ENTRY_TLS_MIN+GS_TLS)*8 + 3)

#define GDT_ENTRIES 16
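The "* 8" and "+ 3" in these macros turn a GDT index into a selector: the index is shifted past the three low bits, and 3 is the RPL for user mode. A minimal sketch of that arithmetic follows; make_selector is a made-up helper, and the index values are the ones listed in the excerpt.

```c
/* Hypothetical helper: build a GDT selector (table indicator = 0)
 * from an entry index and a requested privilege level in 0..3.
 * index * 8 is the same as index << 3, leaving bits 2..0 free. */
static unsigned make_selector(unsigned gdt_index, unsigned rpl)
{
    return gdt_index * 8 + rpl;
}

/* Index values copied from the excerpt. */
enum {
    KERNEL32_CS_INDEX = 1,   /* GDT_ENTRY_KERNEL32_CS */
    TLS_MIN_INDEX     = 12,  /* GDT_ENTRY_TLS_MIN */
    PER_CPU_INDEX     = 15   /* GDT_ENTRY_PER_CPU */
};
```

With these values, make_selector(KERNEL32_CS_INDEX, 0) gives 0x08 for __KERNEL32_CS, make_selector(PER_CPU_INDEX, 3) gives 0x7b for __PER_CPU_SEG, and make_selector(TLS_MIN_INDEX + 1, 3) gives 0x6b for GS_TLS_SEL.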

What is TLS?
TLS stands for thread-local storage. Linux dedicates three global descriptor table (GDT) entries to it: GDT_ENTRY_TLS_MIN through GDT_ENTRY_TLS_MAX, entries 12 to 14.

Linux GDT instantiation:
/*
 * We need valid kernel segments for data and code in long mode too
 * IRET will check the segment types  kkeil 2000/10/28
 * Also sysret mandates a special GDT layout
 *
 * TLS descriptors are currently at a different place compared to i386.
 * Hopefully nobody expects them at a fixed place (Wine?)
 */
[GDT_ENTRY_KERNEL32_CS]         = GDT_ENTRY_INIT(0xc09b, 0, 0xfffff),
[GDT_ENTRY_KERNEL_CS]           = GDT_ENTRY_INIT(0xa09b, 0, 0xfffff),
[GDT_ENTRY_KERNEL_DS]           = GDT_ENTRY_INIT(0xc093, 0, 0xfffff),
[GDT_ENTRY_DEFAULT_USER32_CS]   = GDT_ENTRY_INIT(0xc0fb, 0, 0xfffff),
[GDT_ENTRY_DEFAULT_USER_DS]     = GDT_ENTRY_INIT(0xc0f3, 0, 0xfffff),
[GDT_ENTRY_DEFAULT_USER_CS]     = GDT_ENTRY_INIT(0xa0fb, 0, 0xfffff),
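The first argument of GDT_ENTRY_INIT packs the descriptor's attribute bits; the second and third are the base (0, the flat model) and the limit. The decoder below is a sketch that assumes the usual x86 descriptor layout, with the access byte in bits 7..0 of the flags word and the AVL, L, D/B and G bits in bits 12..15. The struct and function names are invented for illustration.

```c
#include <stdint.h>

/* Attribute bits of a code/data segment descriptor, as packed into
 * the 16-bit flags argument (layout assumed as described above). */
struct gdt_flags {
    unsigned type;        /* bits 3..0: 0xb = code, execute/read; 0x3 = data, read/write */
    unsigned s;           /* bit 4: 1 = code or data segment, 0 = system segment */
    unsigned dpl;         /* bits 6..5: descriptor privilege level */
    unsigned present;     /* bit 7: segment present */
    unsigned long_mode;   /* bit 13: L, 64-bit code segment */
    unsigned db;          /* bit 14: D/B, 32-bit default operand size */
    unsigned granularity; /* bit 15: G, limit counted in 4 KiB units */
};

/* Hypothetical helper: unpack the flags word into named fields. */
static struct gdt_flags decode_gdt_flags(uint16_t flags)
{
    struct gdt_flags f;
    f.type = flags & 0xf;
    f.s = (flags >> 4) & 1;
    f.dpl = (flags >> 5) & 3;
    f.present = (flags >> 7) & 1;
    f.long_mode = (flags >> 13) & 1;
    f.db = (flags >> 14) & 1;
    f.granularity = (flags >> 15) & 1;
    return f;
}
```

Read this way, 0xa09b (kernel CS) is a present 64-bit code segment with DPL 0, while 0xa0fb (user CS) differs only in having DPL 3. The 0xc0.. entries have D/B set instead of L, which marks them as 32-bit segments. With G set, a limit of 0xfffff pages covers the full 4 GiB.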

