Monday, 11 June 2018

Linux process creation internals

fork, vfork and clone are the system calls that create a process in Linux.
Let's go through how each of them is handled inside the Linux kernel.

Implementation of vfork :
return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, 0,
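Seen from user space, the CLONE_VFORK | CLONE_VM combination means the child borrows the parent's address space and the parent is suspended until the child exits or execs. Here is a minimal sketch of that, assuming glibc on Linux; the helper name vfork_child_status is made up for this illustration. The child only calls _exit, which is one of the few things that is safe to do after vfork.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child with vfork() and return the
 * child's exit status, or -1 on error. */
int vfork_child_status(int code)
{
    assert(code >= 0 && code <= 255);   /* exit statuses are 8 bits wide */

    pid_t pid = vfork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);    /* child: exiting releases the suspended parent */

    /* Parent: only resumes once the child has called _exit() or exec. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling vfork_child_status(7) should hand back 7 once the child has gone away.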

Implementation of clone (the argument order differs between architecture configurations, hence the several variants) :
SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
int __user *, parent_tidptr,
int, tls_val,
int __user *, child_tidptr)
SYSCALL_DEFINE5(clone, unsigned long, newsp, unsigned long, clone_flags,
int __user *, parent_tidptr,
int __user *, child_tidptr,
int, tls_val)
SYSCALL_DEFINE6(clone, unsigned long, clone_flags, unsigned long, newsp,
int, stack_size,
int __user *, parent_tidptr,
int __user *, child_tidptr,
int, tls_val)
SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
int __user *, parent_tidptr,
int __user *, child_tidptr,
int, tls_val)
return do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr);
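To see the clone_flags argument doing something from user space, here is a small sketch that uses the glibc clone() wrapper (not the raw system call shown above) with CLONE_VM | SIGCHLD. Because of CLONE_VM the child runs in the parent's address space, so a value written by the child is visible to the parent. The names child_fn, clone_shared_value and CHILD_STACK_SIZE are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* arbitrary size for the demo */

/* Runs in the child; writes through a pointer into the parent's memory. */
static int child_fn(void *arg)
{
    assert(arg != NULL);
    *(int *)arg = 42;   /* visible to the parent because of CLONE_VM */
    return 0;
}

/* Hypothetical helper: returns the value stored by the child, or -1. */
int clone_shared_value(void)
{
    int shared = 0;
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards on most architectures, so pass its top. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      CLONE_VM | SIGCHLD, &shared);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);        /* safe: the child has finished with it */
    if (waited != pid)
        return -1;
    return shared;
}
```

Without CLONE_VM the child would get its own copy of the address space and the parent would still read 0.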

Implementation of fork :
return do_fork(SIGCHLD, 0, 0, NULL, NULL);
/* can not support in nommu mode */

A kernel thread is also created through do_fork :
 * Create a kernel thread.
pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
return do_fork(flags|CLONE_VM|CLONE_UNTRACED, (unsigned long)fn,
(unsigned long)arg, NULL, NULL);

So we can see that every one of these process-creation paths ends up in the do_fork call in the kernel. Now let's look at the do_fork implementation.

do_fork is called with different flags depending on the caller. It calls copy_process to do most of the work.

do_fork --> copy_process
copy the process info :

Now let's see the main work done in these calls :
dup_task_struct : 
1. Allocates task_struct from kmem_cache : task_struct_cachep
2. Allocates thread_info from kmem_cache : thread_info_cache
3. arch_dup_task_struct copies the parent task_struct exactly to this child.
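The "allocate, then copy the parent exactly" pattern from the steps above can be modelled in plain C. This is only a toy: struct toy_task and toy_dup_task are hypothetical names, malloc stands in for the kmem_cache allocation, and the real task_struct has far more fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for task_struct; the fields are picked for illustration. */
struct toy_task {
    int pid;
    int prio;
    char comm[16];
};

/* Allocate a new task and copy the parent into it with a plain
 * structure assignment. Returns NULL if the allocation fails. */
struct toy_task *toy_dup_task(const struct toy_task *parent)
{
    assert(parent != NULL);

    struct toy_task *child = malloc(sizeof *child);
    if (child == NULL)
        return NULL;

    *child = *parent;   /* exact copy; later steps adjust the fields */
    return child;
}
```

After the copy the child is an independent object, so the later steps (sched_fork and the copy_* functions) can change it without touching the parent.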

sched_fork :
1. Initialises the sched_entity of the process, including vruntime and exec_start
2. Sets the task state to TASK_RUNNING
3. Initialises the priority of the task, taking the nice value into account
4. Sets the scheduler class of the process (real-time or fair scheduling) according to its priority
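One visible effect of the priority handling is that a forked child normally starts with the same nice value as its parent. A small sketch to check this from user space, assuming the default case where the scheduling parameters are not reset on fork; child_inherits_nice is a made-up helper name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a forked child reports the same
 * nice value as the parent, 0 if not, -1 on error. */
int child_inherits_nice(void)
{
    errno = 0;          /* -1 is a legal nice value, so check errno */
    int parent_nice = getpriority(PRIO_PROCESS, 0);
    if (parent_nice == -1 && errno != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Nice values run from -20 to 19; shift into 0..39 so the
         * value fits in an exit status. */
        _exit(getpriority(PRIO_PROCESS, 0) + 20);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status));          /* the child always calls _exit() */
    return WEXITSTATUS(status) - 20 == parent_nice;
}
```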

copy_files :
1. Copies the file descriptor table "fdtable"
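Since a plain fork gives the child its own copy of the descriptor table, closing a descriptor in the child leaves the parent's descriptor open. A minimal sketch of that; parent_fds_survive_child_close is a name invented for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the parent's descriptors are still
 * open after the child closed its copies, 0 if not, -1 on error. */
int parent_fds_survive_child_close(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);  /* affects only the child's descriptor table */
        close(fds[1]);
        _exit(0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);

    /* F_GETFD fails with EBADF if the descriptor is not open. */
    int still_open = fcntl(fds[0], F_GETFD) != -1 &&
                     fcntl(fds[1], F_GETFD) != -1;
    close(fds[0]);
    close(fds[1]);
    if (waited != pid)
        return -1;
    assert(WIFEXITED(status));          /* the child always calls _exit() */
    return still_open;
}
```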

copy_sighand :
1. Copies the signal handlers of the parent process

copy_signal :
1. Allocates and initialises the signal_struct, which holds the rest of the per-process signal state

copy_mm :
1. Creates a copy of the page tables. It does not copy the actual contents of the pages; a page is newly allocated only when a write hits it (COW, copy-on-write).
It calls dup_mm :
--> Allocates mm_struct
--> Calls mm_init
--> Calls pgd_alloc to allocate the page global directory (pgd)
It calls pgd_ctor to copy the kernel pgd entries :
clone_pgd_range(pgd + KERNEL_PGD_BOUNDARY,
swapper_pg_dir + KERNEL_PGD_BOUNDARY,
--> Also calls dup_mmap to copy the VMAs
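Copy-on-write itself is not directly visible from user space, but its effect is: after fork, a write in the child lands in the child's own page and the parent keeps seeing the old value. A minimal sketch of that observable behaviour; parent_value_after_child_write is a name made up for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's view of a variable after
 * the child has overwritten its own copy, or -1 on error. */
int parent_value_after_child_write(void)
{
    volatile int value = 1;     /* volatile: force real loads and stores */

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        value = 2;              /* the write triggers a private page copy */
        _exit(value == 2 ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    assert(WIFEXITED(status));  /* the child always calls _exit() */
    if (WEXITSTATUS(status) != 0)
        return -1;
    return value;               /* the parent's page was never modified */
}
```

The parent gets 1 back even though the child saw 2 in its own copy.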

Once all of this has been copied, the new process is finally woken using:
