Thursday, 23 June 2016

Segmentation in Linux

Linux segmentation (protected mode): Segmentation is the first of two steps in converting a virtual (logical) address to a physical address; the second step is paging.
In the segmentation step, the base address of the segment is fetched and added to the virtual address (the offset). The result is a linear address, which paging then translates to a physical address.

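The segment base address for each code and data segment is 0 in Linux. This model is called the “flat model”. As far as translating memory addresses is concerned, the flat model is equivalent to disabling segmentation. (The exceptions are the FS and GS segments, whose bases are used for thread-local and per-CPU data.)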
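To make the flat model concrete, here is a minimal sketch of the segmentation step in C. It is a simplified 32-bit model, not kernel code: the function name seg_to_linear and its parameters are made up for illustration, and the limit check uses assert where real hardware would raise a fault.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Segmentation step of address translation (simplified 32-bit model):
 * linear address = segment base + offset.
 * base and limit would come from the segment descriptor; this helper
 * is hypothetical and only illustrates the arithmetic.
 */
static uint32_t seg_to_linear(uint32_t base, uint32_t limit, uint32_t offset)
{
    /* Real hardware raises a fault when the offset exceeds the limit. */
    assert(offset <= limit);
    return base + offset;
}
```

With a flat segment (base 0, limit 0xffffffff), seg_to_linear(0, 0xffffffff, addr) returns addr unchanged, which is why the flat model behaves as if segmentation were switched off.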
An executing Linux process is divided into segments, for example the code segment and the data segment. Each segment has a corresponding segment selector, held in a segment register (CS, DS and so on). The selector points to an entry in a segment descriptor table.
There are two segment descriptor tables involved in a process's code execution.

Local descriptor table (LDT): On x86 in general, an LDT can give each process its own set of segments describing privately held memory. Linux does not depend on this: it separates address spaces with paging, and a process has an LDT of its own only if it sets one up, for example through the modify_ldt() system call.

Global descriptor table (GDT): describes the segments shared by all processes, including the kernel segments.
The starting addresses of these tables are held in the global descriptor table register (GDTR) and the local descriptor table register (LDTR).
The LDT is reloaded on a context switch only when the next process uses a different LDT from the previous one:
switch_mm()
..
..
        /* Load the LDT, if the LDT is different: */
        if (unlikely(prev->context.ldt != next->context.ldt))
                load_LDT_nolock(&next->context);



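Interrupt descriptor table (IDT): holds the gate descriptors through which interrupts and exceptions, including int 0x80, reach their handlers. Its address is held in the IDTR.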

Privilege levels (CPL, DPL, RPL):

x86 systems have privilege levels. They restrict memory access, I/O port access and the ability to execute certain machine instructions. The kernel runs at privilege level 0 and user-space programs run at level 3. Executing code cannot simply change its own privilege level; the change has to go through the lcall, int, lret and iret instructions. lcall and int raise the privilege level, and lret and iret lower it. This explains the int 0x80 executed for a system call: it is this int instruction that elevates the privilege level from user (3) to kernel (0) space.

CPL is the current privilege level (found in the lower 2 bits of the CS register), RPL is the requested privilege level from the segment selector, and DPL is the descriptor privilege level of the segment (found in the descriptor). All privilege levels are integers in the range 0–3, where the lowest number corresponds to the highest privilege.
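The processor privilege level changes only when CS is reloaded through a control transfer such as lcall, int, lret or iret (newer CPUs add the dedicated sysenter/sysexit and syscall/sysret instructions for system calls).
When a data segment is accessed, the processor also checks that the following holds:
max(CPL, RPL) ≤ DPL
Otherwise a General Protection Fault is raised.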
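The check above can be sketched in a few lines of C. This is only a model of the rule as stated here: the function name data_access_allowed is made up, and a false return stands in for the General Protection Fault that real hardware would raise.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the data-segment privilege check: max(CPL, RPL) <= DPL.
 * Privilege levels run from 0 (most privileged) to 3 (least privileged).
 * Hypothetical helper for illustration, not a kernel function.
 */
static bool data_access_allowed(unsigned cpl, unsigned rpl, unsigned dpl)
{
    assert(cpl <= 3 && rpl <= 3 && dpl <= 3);

    /* The numerically larger (weaker) of CPL and RPL is what counts. */
    unsigned effective = cpl > rpl ? cpl : rpl;

    /* false here corresponds to a General Protection Fault. */
    return effective <= dpl;
}
```

For example, user code (CPL 3) asking for a kernel data segment (DPL 0) is refused, and so is kernel code (CPL 0) that uses a selector with RPL 3 on the same segment. That second case is the point of RPL: the kernel can deliberately weaken its own access when acting on a selector handed in by user space.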

Let's look at segmentation and the per-CPU GDT under Linux:
arch/x86/include/asm/segment.h (linux-4.0.5)
The 64-bit architecture uses 16 GDT entries:
#define GDT_ENTRY_KERNEL32_CS 1
#define GDT_ENTRY_KERNEL_CS 2
#define GDT_ENTRY_KERNEL_DS 3

#define __KERNEL32_CS   (GDT_ENTRY_KERNEL32_CS * 8)

/*
 * we cannot use the same code segment descriptor for user and kernel
 * -- not even in the long flat mode, because of different DPL /kkeil
 * The segment offset needs to contain a RPL. Grr. -AK
 * GDT layout to get 64bit syscall right (sysret hardcodes gdt offsets)
 */
#define GDT_ENTRY_DEFAULT_USER32_CS 4
#define GDT_ENTRY_DEFAULT_USER_DS 5
#define GDT_ENTRY_DEFAULT_USER_CS 6
#define __USER32_CS   (GDT_ENTRY_DEFAULT_USER32_CS*8+3)
#define __USER32_DS   __USER_DS

#define GDT_ENTRY_TSS 8         /* needs two entries */
#define GDT_ENTRY_LDT 10        /* needs two entries */
#define GDT_ENTRY_TLS_MIN 12
#define GDT_ENTRY_TLS_MAX 14

#define GDT_ENTRY_PER_CPU 15    /* Abused to load per CPU data from limit */
#define __PER_CPU_SEG   (GDT_ENTRY_PER_CPU * 8 + 3)

/* TLS indexes for 64bit - hardcoded in arch_prctl */
#define FS_TLS 0
#define GS_TLS 1

#define GS_TLS_SEL ((GDT_ENTRY_TLS_MIN+GS_TLS)*8 + 3)
#define FS_TLS_SEL ((GDT_ENTRY_TLS_MIN+FS_TLS)*8 + 3)

#define GDT_ENTRIES 16
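The "* 8 + 3" pattern in these defines follows from the layout of a segment selector: the table index sits in bits 3 to 15, the table indicator (0 = GDT, 1 = LDT) in bit 2 and the RPL in bits 0 to 1. Here is a small sketch that builds and takes apart selectors; the helper names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Build a selector: index in bits 3..15, TI in bit 2, RPL in bits 0..1. */
static uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Index of the descriptor inside the GDT or LDT. */
static unsigned selector_index(uint16_t sel) { return sel >> 3; }

/* Table indicator: 0 selects the GDT, 1 selects the LDT. */
static unsigned selector_ti(uint16_t sel) { return (sel >> 2) & 1u; }

/* Requested privilege level. */
static unsigned selector_rpl(uint16_t sel) { return sel & 3u; }
```

With the values from the listing, make_selector(4, 0, 3) gives 0x23, the same as __USER32_CS (4*8+3), and make_selector(1, 0, 0) gives 0x08, the same as __KERNEL32_CS. The trailing + 3 is simply RPL 3 for selectors meant to be used from user space.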


What is TLS?
TLS is thread-local storage. Linux dedicates three global descriptor table (GDT) entries to it: GDT_ENTRY_TLS_MIN to GDT_ENTRY_TLS_MAX, that is, entries 12 to 14.


Linux GDT instantiation:
arch/x86/kernel/cpu/common.c
        /*
         * We need valid kernel segments for data and code in long mode too
         * IRET will check the segment types  kkeil 2000/10/28
         * Also sysret mandates a special GDT layout
         *
         * TLS descriptors are currently at a different place compared to i386.
         * Hopefully nobody expects them at a fixed place (Wine?)
         */
        [GDT_ENTRY_KERNEL32_CS]         = GDT_ENTRY_INIT(0xc09b, 0, 0xfffff),
        [GDT_ENTRY_KERNEL_CS]           = GDT_ENTRY_INIT(0xa09b, 0, 0xfffff),
        [GDT_ENTRY_KERNEL_DS]           = GDT_ENTRY_INIT(0xc093, 0, 0xfffff),
        [GDT_ENTRY_DEFAULT_USER32_CS]   = GDT_ENTRY_INIT(0xc0fb, 0, 0xfffff),
        [GDT_ENTRY_DEFAULT_USER_DS]     = GDT_ENTRY_INIT(0xc0f3, 0, 0xfffff),
        [GDT_ENTRY_DEFAULT_USER_CS]     = GDT_ENTRY_INIT(0xa0fb, 0, 0xfffff),
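The first argument of GDT_ENTRY_INIT is easier to read once it is split into fields. The sketch below assumes the usual x86 descriptor layout as I read it from the macro: bits 0 to 7 of the flags are the access byte (type, S, DPL, P) and bits 12 to 15 are AVL, L, D/B and G. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields packed into the flags argument (assumed layout, see above). */
struct gdt_flags {
    unsigned type; /* bits 0..3: segment type (0xb = code, 0x3 = data) */
    unsigned s;    /* bit 4: 1 = code/data segment, 0 = system segment */
    unsigned dpl;  /* bits 5..6: descriptor privilege level */
    unsigned p;    /* bit 7: segment present */
    unsigned avl;  /* bit 12: available for software use */
    unsigned l;    /* bit 13: 64-bit code segment */
    unsigned db;   /* bit 14: default operand size (1 = 32-bit) */
    unsigned g;    /* bit 15: granularity (1 = limit in 4 KiB units) */
};

static struct gdt_flags decode_gdt_flags(unsigned flags)
{
    struct gdt_flags f;

    assert(flags <= 0xffffu); /* only 16 bits are meaningful */
    f.type = flags & 0xfu;
    f.s    = (flags >> 4) & 1u;
    f.dpl  = (flags >> 5) & 3u;
    f.p    = (flags >> 7) & 1u;
    f.avl  = (flags >> 12) & 1u;
    f.l    = (flags >> 13) & 1u;
    f.db   = (flags >> 14) & 1u;
    f.g    = (flags >> 15) & 1u;
    return f;
}
```

Decoding 0xa09b (GDT_ENTRY_KERNEL_CS) gives a present code segment with DPL 0 and L = 1, that is, 64-bit kernel code. 0xa0fb (GDT_ENTRY_DEFAULT_USER_CS) differs only in DPL 3, and the 0xc0.. entries have D/B = 1 and L = 0, so they are 32-bit segments. The DPL difference is what the kernel comment means when it says user and kernel cannot share one code segment descriptor.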


From the Intel architecture manual:
2.1.1 Global and Local Descriptor Tables
When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or an optional local descriptor table (LDT) as shown in Figure 2-1. These tables contain entries called segment descriptors. Segment descriptors provide the base address of segments as well as access rights, type, and usage information.
Each segment descriptor has an associated segment selector. A segment selector provides the software that uses it with an index into the GDT or LDT (the offset of its associated segment descriptor), a global/local flag (determines whether the selector points to the GDT or the LDT), and access rights information.

To access a byte in a segment, a segment selector and an offset must be supplied. The segment selector provides access to the segment descriptor for the segment (in the GDT or LDT). From the segment descriptor, the processor obtains the base address of the segment in the linear address space. The offset then provides the location of the byte relative to the base address. This mechanism can be used to access any valid code, data, or stack segment, provided the segment is accessible from the current privilege level (CPL) at which the processor is operating. The CPL is defined as the protection level of the currently executing code segment. 

However, the actual path from a segment selector to its associated segment is always through a GDT or LDT. The linear address of the base of the GDT is contained in the GDT register (GDTR); the linear address of the LDT is contained in the LDT register (LDTR).
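The path the manual describes (selector, then descriptor table, then descriptor, then base plus offset) can be sketched as a small lookup in C. This is a simplified model with made-up names: the tables are plain arrays, a descriptor carries only base and limit, and a return value of -1 stands in for the fault the processor would raise. Privilege checks are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down descriptor: base and limit only. */
struct desc_model {
    uint32_t base;   /* linear address where the segment starts */
    uint32_t limit;  /* highest valid offset inside the segment */
};

/*
 * Translate selector:offset to a linear address.
 * Returns 0 on success, -1 where real hardware would fault.
 */
static int logical_to_linear(const struct desc_model *gdt, size_t gdt_len,
                             const struct desc_model *ldt, size_t ldt_len,
                             uint16_t selector, uint32_t offset,
                             uint32_t *linear)
{
    assert(linear != NULL);

    unsigned use_ldt = (selector >> 2) & 1u;  /* TI bit picks the table */
    size_t index = selector >> 3;             /* entry within that table */
    const struct desc_model *table = use_ldt ? ldt : gdt;
    size_t len = use_ldt ? ldt_len : gdt_len;

    if (!use_ldt && index == 0)
        return -1;                /* the null selector cannot be used */
    if (table == NULL || index >= len)
        return -1;                /* no such table or entry */
    if (offset > table[index].limit)
        return -1;                /* offset outside the segment */

    *linear = table[index].base + offset;
    return 0;
}
```

With a flat descriptor (base 0, limit 0xffffffff) in the table, the linear address that comes back is the offset itself, so only the selector checks remain visible.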

2.1.2 System Segments, Segment Descriptors, and Gates
Besides code, data, and stack segments that make up the execution environment of a program or procedure, the architecture defines two system segments: the task-state segment (TSS) and the LDT. The GDT is not considered a segment because it is not accessed by means of a segment selector and segment descriptor. TSSs and LDTs have segment descriptors defined for them.
The architecture also defines a set of special descriptors called gates (call gates, interrupt gates, trap gates, and task gates). These provide protected gateways to system procedures and handlers that may operate at a different privilege level than application programs and most procedures. For example, a CALL to a call gate can provide access to a procedure in a code segment that is at the same or a numerically lower privilege level (more privileged) than the current code segment. To access a procedure through a call gate, the calling procedure supplies the selector for the call gate. The processor then performs an access rights check on the call gate, comparing the CPL with the privilege level of the call gate and the destination code segment pointed to by the call gate.
If access to the destination code segment is allowed, the processor gets the segment selector for the destination code segment and an offset into that code segment from the call gate. If the call requires a change in privilege level, the processor also switches to the stack for the targeted privilege level. The segment selector for the new stack is obtained from the TSS for the currently running task. Gates also facilitate transitions between 16-bit and 32-bit code segments, and vice versa.
