Monday 11 June 2018

Linux process creation internals

fork, vfork and clone are the system calls that create a process in Linux.
Let's go through how each of them is handled inside the kernel.


Implementation of vfork :
#ifdef __ARCH_WANT_SYS_VFORK
SYSCALL_DEFINE0(vfork)
{
    return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, 0,
                   0, NULL, NULL);
}
#endif

Implementation of the clone variants :
#ifdef CONFIG_CLONE_BACKWARDS
SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
                int __user *, parent_tidptr,
                int, tls_val,
                int __user *, child_tidptr)
#elif defined(CONFIG_CLONE_BACKWARDS2)
SYSCALL_DEFINE5(clone, unsigned long, newsp, unsigned long, clone_flags,
                int __user *, parent_tidptr,
                int __user *, child_tidptr,
                int, tls_val)
#elif defined(CONFIG_CLONE_BACKWARDS3)
SYSCALL_DEFINE6(clone, unsigned long, clone_flags, unsigned long, newsp,
                int, stack_size,
                int __user *, parent_tidptr,
                int __user *, child_tidptr,
                int, tls_val)
#else
SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp,
                int __user *, parent_tidptr,
                int __user *, child_tidptr,
                int, tls_val)
#endif
{
    return do_fork(clone_flags, newsp, 0, parent_tidptr, child_tidptr);
}


Implementation of fork :
SYSCALL_DEFINE0(fork)
{
#ifdef CONFIG_MMU
    return do_fork(SIGCHLD, 0, 0, NULL, NULL);
#else
    /* can not support in nommu mode */
    return -EINVAL;
#endif
}
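
Seen from user space, the SIGCHLD passed to do_fork above is the signal the parent receives when the child exits, and it is what lets the parent reap the child with waitpid. The following is a minimal sketch of that round trip. The helper name fork_and_wait is made up for this post, and the C library's fork() wrapper may be implemented on top of clone rather than the fork system call, but both end up in the same kernel path.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: creates a child that exits immediately with
 * exit_code, waits for it, and returns the exit status seen by the parent.
 * Returns -1 if fork or waitpid fails or the child did not exit normally.
 */
int fork_and_wait(int exit_code)
{
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: leave right away without running atexit handlers. */
        _exit(exit_code);
    }

    /* Parent: block until the child has terminated, then reap it. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_and_wait(7) should hand back 7 in the parent, since the exit status travels from the child to the parent through waitpid.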


Kernel threads are also created through do_fork :
/*
 * Create a kernel thread.
 */
pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
{
    return do_fork(flags|CLONE_VM|CLONE_UNTRACED, (unsigned long)fn,
                   (unsigned long)arg, NULL, NULL);
}

So every path that creates a process (or a kernel thread) in Linux ends up in the do_fork call inside the kernel. Now let's look at the do_fork implementation.

do_fork is called with different flags depending on the caller. It calls copy_process to do most of the work.

do_fork --> copy_process
    --> dup_task_struct
    --> copy_creds
    --> sched_fork
    copy the process info :
    --> copy_semundo
    --> copy_files
    --> copy_fs
    --> copy_sighand
    --> copy_signal
    --> copy_mm
    --> copy_namespaces
    --> copy_io
    --> copy_thread


Now let's see the main work done by these calls :
dup_task_struct :
1. Allocates the child's task_struct from the kmem_cache task_struct_cachep
2. Allocates the child's thread_info from the kmem_cache thread_info_cache
3. arch_dup_task_struct copies the parent's task_struct into the child as-is.

sched_fork :
1. Initialises the sched_entity of the process, including vruntime and exec_start
2. Sets the task state to TASK_RUNNING
3. Initialises the priority of the task, taking the nice value into account.
4. Sets the scheduler class of the process (real-time or fair) according to its priority

copy_files :
1. Copies the parent's file descriptor table ("fdtable") for the child. If CLONE_FILES is set, the table is shared instead of copied.
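
The effect of copying the descriptor table is easy to observe from user space. In the sketch below (the helper name read_byte_from_child is made up for this post), the child never opens the pipe itself. Its descriptors are usable only because fork gave it a copy of the parent's table, and closing its copy of the read end does not affect the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: the parent creates a pipe, forks, and reads the
 * single byte that the child writes through its inherited descriptor.
 * Returns the byte read by the parent, or -1 on any failure.
 */
int read_byte_from_child(char byte)
{
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: fds[] came from the copied descriptor table. */
        close(fds[0]); /* closes only the child's copy of the read end */
        ssize_t written = write(fds[1], &byte, 1);
        _exit(written == 1 ? 0 : 1);
    }

    /* Parent: its own read end is still open despite the child's close. */
    close(fds[1]);
    char got = 0;
    ssize_t n = read(fds[0], &got, 1);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    return n == 1 ? (unsigned char)got : -1;
}
```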

copy_sighand :
1. Copies the signal handlers of the parent process. If CLONE_SIGHAND is set, the handler table is shared instead of copied.

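copy_signal :
1. Allocates and initialises the signal_struct of the new process (for example the shared pending signals and the resource limits). Nothing is allocated if CLONE_THREAD is set, since threads of one group share it.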

copy_mm :
1. Creates a copy of the parent's page tables. The contents of the pages are not copied; a page is newly allocated only when a write hits it (COW, copy on write). If CLONE_VM is set, the parent's mm is shared and nothing is copied.
It calls dup_mm :
    --> Allocates the mm_struct
    --> Calls mm_init
    --> Calls pgd_alloc to allocate the page global directory (PGD)
    On x86, pgd_alloc calls pgd_ctor to copy the kernel PGD entries from swapper_pg_dir :
    if (PAGETABLE_LEVELS == 2 ||
        (PAGETABLE_LEVELS == 3 && SHARED_KERNEL_PMD) ||
        PAGETABLE_LEVELS == 4) {
        clone_pgd_range(pgd + KERNEL_PGD_BOUNDARY,
                        swapper_pg_dir + KERNEL_PGD_BOUNDARY,
                        KERNEL_PGD_PTRS);
    }
    --> Also calls dup_mmap to copy the VMAs
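
The visible result of copy on write is that parent and child start with identical memory but stop seeing each other's writes. A minimal user-space sketch of that, where the helper name cow_after_fork is made up for this post:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: forks, lets the child overwrite a variable, and
 * reports both views. Returns the parent's value after the child has
 * exited; the child's value comes back through its exit status and is
 * stored in *child_value. Returns -1 on failure.
 */
int cow_after_fork(int *child_value)
{
    volatile int value = 1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: this write lands in the child's private copy of the page. */
        value = 2;
        _exit(value);
    }

    /* Parent: wait for the child, then look at our own copy. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;

    *child_value = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return value;
}
```

The parent should still see 1 while the child reports 2. With CLONE_VM (as in vfork or threads) both would be looking at the same memory instead.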


After all of this has been copied, the new process is finally woken up using:
wake_up_new_task(p);
