Lesson 6: fork() — Creating a Child Process
fork() is the syscall that doubles the number of processes: it creates an exact copy of the current process. Understanding fork() is essential for every systems engineer — nvidia-docker and the Slurm GPU allocator use it to create isolated environments for each GPU job. When something goes wrong wit
fork() is like photocopying all the contents of your brain and putting them into another body — suddenly there are two people who remember everything up to the moment of the copy. The only difference is that one has a different ID number from the other.
- fork()
- A syscall that creates a child process that is a copy of the parent process. fork() returns 0 to the child, the child's PID to the parent, and -1 if it fails.
- Copy-on-Write (COW)
- An optimization of fork(): the kernel does not physically copy all memory immediately. Memory pages are shared with the parent until one of them writes — then only that page is copied.
- Child Process
- A process created by fork(). The child gets a new PID, but its PPID is the parent's PID. The child has an independent copy of the address space.
- getpid() / getppid()
- Syscalls that return the PID of the current process and its parent's PID respectively. Useful after fork() to distinguish parent from child.
- Address Space Duplication
- fork() creates for the child an independent virtual address space with the same addresses as the parent. Changing a variable in the child does not affect the parent and vice versa (thanks to Copy-on-Write).