Chapter 5: Processes and Job Control
A process is Linux's fundamental unit of execution. Its lifecycle runs from fork copying the parent, through exec loading a new program, to exit and the parent's wait cleanup, and understanding that lifecycle is central to understanding how the kernel manages running programs. This chapter covers the PID identity system, the internals of the fork/exec/wait syscalls, the process state machine, monitoring with ps/top/htop, signal mechanics, job control, scheduling priorities, and the /proc virtual filesystem.
## 1. Process Basics: PID / PPID / PGID / SID
Every process has an identity
Every Linux process carries four core IDs:
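- PID (Process ID): Kernel-assigned identifier, unique among live processes in a PID namespace. PIDs are recycled after a process has been reaped, so a PID is not unique over time.
- PPID (Parent PID): PID of the process that created this one. An orphaned process is reparented, by default to PID 1 (init/systemd), or to the nearest "child subreaper" ancestor if one is set.
- PGID (Process Group ID): Identifies a process group. The shell puts all processes of one pipeline (one job) in the same group so it can signal them as a unit.
- SID (Session ID): Equals the PID of the session leader, usually the login shell. When the terminal disconnects, the kernel sends SIGHUP to the session leader, and an interactive shell normally forwards it to its jobs.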
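As a minimal sketch, the four IDs can be read for any process from /proc/PID/stat, where fields 4 to 6 hold the PPID, PGID, and SID. The helper name show_ids is made up for this illustration and is not a standard command.

```shell
# show_ids is a hypothetical helper, not a standard command.
show_ids() {
    pid=$1
    stat=$(cat "/proc/$pid/stat") || return 1
    # Drop the leading "pid (comm) " part; comm may contain spaces or ')'.
    rest=${stat##*') '}
    # The remaining fields start with: state ppid pgrp session ...
    set -- $rest
    echo "pid=$pid ppid=$2 pgid=$3 sid=$4"
}

show_ids $$    # IDs of the current shell
```

With procps installed, `ps -o pid,ppid,pgid,sid -p $$` should report the same values.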
Process tree and pstree
All userspace processes form a tree rooted at PID 1 (kernel threads hang off PID 2, kthreadd, instead). On most modern Linux distributions, PID 1 is systemd, which bootstraps all userspace services. Use pstree to visualize parent-child relationships:
```bash
pstree -p             # show PIDs
pstree -u             # show usernames
pstree -p $$          # tree rooted at current shell
pstree -p 1 | head    # expand from systemd downward
```
Why is init/systemd PID 1? During boot the kernel creates the kernel thread kernel_init as its first process, so it receives PID 1. Once kernel initialization finishes, kernel_init exec's the first userspace program (init, usually systemd), which keeps that PID. Orphaned processes are adopted by PID 1 by default, which ensures their zombies can always be reaped.
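The adoption of orphans can be observed directly. The sketch below assumes a Linux /proc filesystem; ppid_of is a hypothetical helper name. An inner shell starts a background sleep and exits at once, and the script then reads the orphan's new PPID.

```shell
# ppid_of is a hypothetical helper: field 4 of /proc/PID/stat is the PPID.
ppid_of() {
    stat=$(cat "/proc/$1/stat") || return 1
    set -- ${stat##*') '}     # fields after "pid (comm) ": state ppid ...
    echo "$2"
}

# The inner shell starts sleep in the background, prints its PID, and exits,
# which orphans the sleep process.
orphan=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!')
new_parent=$(ppid_of "$orphan")
echo "orphan $orphan now has PPID $new_parent"
kill "$orphan"    # clean up the demo process
```

On a plain system the reported PPID is typically 1. Under a systemd user session or a container runtime it may instead be the PID of a subreaper.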
## 2. fork / exec / wait Internals
fork: copying yourself
When the shell runs a command, it calls the fork() syscall. Conceptually the child receives a complete copy of the parent's address space, but modern kernels implement this with Copy-On-Write (COW): parent and child share the same physical pages, and a page is physically copied only when one of them writes to it. This makes fork far cheaper than a full copy, although the page tables themselves must still be duplicated.
fork returns twice: in the parent it returns the child's PID (>0), in the child it returns 0, and on error returns -1.
exec: replacing the process image
After fork, the child calls execve() (the actual syscall behind the exec family). The kernel replaces the current process's code, data, and stack segments entirely with the new program, keeping the same PID. exec never returns on success — the original code no longer exists.
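The shell's exec builtin is a thin wrapper around this behavior, so the "same PID" property is easy to check. In this small sketch (same_pid_demo is a hypothetical name), a child shell prints its PID and then exec's a second shell that prints its own.

```shell
# same_pid_demo is a hypothetical helper name.
same_pid_demo() {
    # The outer sh prints its PID, then replaces itself with a second sh.
    # No new process is created, so the second sh reports the same PID.
    sh -c 'echo "before: $$"; exec sh -c "echo after: \$\$"'
}

same_pid_demo
```

Both output lines carry the same number, because exec replaced the program image without creating a new process.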
wait: reaping zombies
After a child exits, the kernel keeps its exit status in the process table until the parent calls wait() / waitpid(). Until the parent reads it, the child is in the Z (Zombie) state: it holds a process table slot (and its PID) but uses no CPU and almost no memory. The slot is released only after the parent calls wait.
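The same three steps are visible from a shell script, which is a rough stand-in for the C calls: `&` makes the shell fork a child that exec's the command, `$!` holds the PID that fork returned to the parent, and the wait builtin collects the exit status. The exit code 7 is an arbitrary value chosen for the illustration.

```shell
# Fork a child that exec's a tiny program exiting with status 7.
sh -c 'exit 7' &
child=$!                       # PID of the child, as returned by fork()

# wait blocks until the child exits and returns its exit status.
status=0
wait "$child" || status=$?

echo "child $child exited with status $status"
```

After wait returns, the child's process table entry is gone, so it never lingers as a zombie.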
fork / exec / wait call flow
```
───────────────────────────────────────────────────────
Shell (parent)              Child process
──────────────              ──────────────
pid = fork()  ──────▶       pid == 0
                            execve("/bin/ls", ...)
                            [address space replaced]
                            ls runs ...
                            exit(0)
                               │
wait(&status) ◀── kernel notifies via SIGCHLD
read exit status
release process table entry
───────────────────────────────────────────────────────
Orphan: parent exits first
  Child PPID → 1 (systemd), adopted, will be waited on
Zombie: child exited, parent hasn't called wait
  Holds process table slot, shown as Z in ps
  Fix: send SIGCHLD to the parent (helps only if its handler
       calls wait), or kill the parent so PID 1 adopts and
       reaps the zombie
```
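A zombie can be produced on purpose to see the Z state. This is a sketch that assumes a Linux /proc filesystem and default SIGCHLD handling; zombie_demo is a hypothetical name. The inner shell starts a short-lived child and then exec's into a longer sleep, which never calls wait.

```shell
# zombie_demo is a hypothetical helper name.
zombie_demo() {
    # The inner sh starts a 1-second child, then exec's into a 3-second sleep.
    # That sleep never calls wait(), so the finished child lingers as a zombie.
    sh -c 'sleep 1 & exec sleep 3' &
    parent=$!
    sleep 2
    for f in /proc/[0-9]*/stat; do
        stat=$(cat "$f" 2>/dev/null) || continue
        rest=${stat##*') '}        # "state ppid pgrp ..."
        set -- $rest
        if [ "$2" = "$parent" ]; then
            pid=${f#/proc/}
            echo "child ${pid%/stat} of $parent is in state $1"
        fi
    done
    # Once the parent exits, the zombie is reparented (to PID 1 or a
    # subreaper) and reaped there.
    wait "$parent"
}

zombie_demo
```

The demo takes about three seconds, after which the zombie disappears without any further action.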
## 3. Process State Machine
ps and top report each process's state as a single letter, in the STAT column of ps and the S column of top:
| State | Letter | Meaning | Typical scenario |
|---|---|---|---|
| Running / Runnable | R | On CPU or waiting in the run queue | CPU-intensive programs |
| Interruptible Sleep | S | Waiting for event (I/O, signal, lock); can be woken by signal | Waiting for keyboard input, network data |
| Uninterruptible Sleep | D | Blocked in the kernel, usually on disk or NFS I/O; signals (including SIGKILL) are not acted on until the wait ends | Disk I/O block, NFS timeout |
| Zombie | Z | Exited but parent hasn't called wait | Parent bug, missing waitpid call |
| Stopped | T | Suspended by a job-control signal (SIGSTOP/SIGTSTP); a process stopped by a debugger shows as lowercase t | Ctrl+Z, debugger breakpoint |
| Idle Kernel Thread | I | Idle kernel thread (kernel 4.14+) | kworker when idle |
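To see how these states are distributed on a live system, the state letter can be taken from the State: line of each /proc/PID/status file. This is a sketch with a made-up helper name, count_states.

```shell
# count_states is a hypothetical helper name.
count_states() {
    # Each status file has a line such as "State:  S (sleeping)".
    # Processes that exit mid-scan cause read errors, which are discarded.
    cat /proc/[0-9]*/status 2>/dev/null |
        awk '/^State:/ { print $2 }' |
        sort | uniq -c
}

count_states
```

Each output line is a count followed by a state letter. On most systems S and I dominate, with only a handful of processes in R.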
Process state transition diagram
```
─────────────────────────────────────────────────────────
        fork()
           │
           ▼
  ┌─────────────────┐
  │  R (Runnable)   │◀─── scheduled ──▶ CPU executing
  └────────┬────────┘
           │ wait for I/O / event
    ┌──────▼──────┐      ┌─────────────────┐
    │  S (Sleep)  │      │ D (Uninterrupt) │
    └──────┬──────┘      └────────┬────────┘
           │ event ready          │ I/O done
           └──────────┐ ┌─────────┘
                      ▼ ▼
               ┌───────────────┐
               │ R (Runnable)  │
               └───────┬───────┘
                       │ SIGSTOP / Ctrl+Z
               ┌───────▼───────┐
               │  T (Stopped)  │
               └───────┬───────┘
                       │ SIGCONT
               ┌───────▼───────┐
               │ R (Runnable)  │
               └───────┬───────┘
                       │ exit()
               ┌───────▼───────┐
               │  Z (Zombie)   │──── parent wait() ────▶ gone
               └───────────────┘
─────────────────────────────────────────────────────────
```
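D-state processes cannot be killed with kill -9. A signal, SIGKILL included, is acted on only when the target process is woken and heads back toward userspace, and a D-state process is blocked inside the kernel in a sleep that signals do not interrupt. The practical options are to wait for the I/O to complete, fix the underlying device or mount, or reboot. (Some kernel waits use a "killable" variant of this state and do respond to SIGKILL.) Large numbers of D-state processes usually indicate a disk or NFS failure.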
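When diagnosing such a situation, it helps to list the processes currently in D state together with the kernel function they are waiting in. The sketch below reads that from /proc/PID/wchan, which may show only 0 to unprivileged users; list_dstate is a hypothetical helper name.

```shell
# list_dstate is a hypothetical helper name.
list_dstate() {
    for d in /proc/[0-9]*; do
        # Processes can vanish mid-scan, so unreadable entries are skipped.
        state=$(awk '/^State:/ { print $2; exit }' "$d/status" 2>/dev/null) || continue
        [ "$state" = "D" ] || continue
        echo "pid=${d#/proc/} comm=$(cat "$d/comm" 2>/dev/null) wchan=$(cat "$d/wchan" 2>/dev/null)"
    done
}

list_dstate    # often prints nothing on a healthy system
```

A few short-lived entries are normal under heavy I/O. The same PIDs showing up across repeated runs is the warning sign.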
## 4. ps Command Deep Dive
```bash
# View all processes (BSD style)
ps aux

# View all processes (System V style, includes PPID)
ps -ef

# Filter for specific process
ps aux | grep nginx
ps -ef | grep -v grep | grep sshd

# ASCII process tree
ps axjf
ps -ejH

# Sort by CPU usage (descending)
ps aux --sort=-%cpu | head -15

# Sort by memory usage
ps aux --sort=-%mem | head -10

# Detailed info for specific PID
ps -p 1234 -o pid,ppid,user,stat,cmd

# All processes for a specific user
ps -u www-data

# Show process threads
ps -eLf | grep nginx
```
STAT column suffix characters: s = session leader; l = multi-threaded; + = foreground process group; N = low priority (nice > 0); < = high priority (nice < 0).

… mylog.log 2>&1 &

```bash
# disown: remove job from shell's job table
sleep 999 &
disown %1             # the shell won't send SIGHUP to this job when the terminal closes

# tmux: modern terminal multiplexer (recommended)
tmux new -s work      # new session named work
tmux ls               # list sessions
tmux attach -t work   # reconnect
# Ctrl+b d            # detach; the session continues in the background
```
## 8. Process Priority: nice / renice / ionice
```bash
# nice: set priority at launch (-20 highest, 19 lowest)
nice -n 10 ./cpu-heavy-task # lower priority
nice -n -5 ./important-daemon # higher priority (requires root)
# renice: change priority of running process
renice -n 15 -p 1234 # set PID 1234's nice value to 15
renice -n 5 -u worker # set all worker user processes to nice 5
# ionice: I/O scheduling priority
# -c 1 = Realtime (highest)
# -c 2 = Best-effort (default; -n 0-7, 0 highest)
# -c 3 = Idle (only uses I/O when system is idle)
ionice -c 3 tar czf backup.tar.gz /data  # backup yields to other I/O (needs an I/O scheduler that honors priorities, e.g. BFQ)
ionice -c 2 -n 0 -p 1234                  # boost I/O priority of PID 1234
```
## 9. pmap / lsof: Process Resource Inspection
```bash
# pmap: view process memory mappings
pmap 1234                  # basic output (address, size, permissions, name)
pmap -x 1234               # extended format with RSS/Dirty columns
pmap -x 1234 | tail -1     # total memory summary

# lsof: list open files (includes network sockets)
lsof -p 1234               # all files opened by PID 1234
lsof -u www-data           # files opened by www-data user
lsof /var/log/nginx.log    # who is using this file
lsof -i :80                # which processes have port 80 open (listeners and connections)
lsof -i TCP                # all TCP connections
lsof -i TCP:1-1024         # TCP connections on privileged ports
lsof +D /var/log           # all open files under directory (slow)

# Find files deleted but not released (disk space not freed)
lsof | grep deleted
```
## 10. /proc/PID Virtual Filesystem
```bash
# Process command line (args separated by \0)
cat /proc/1234/cmdline | tr '\0' ' '

# Process environment variables
cat /proc/1234/environ | tr '\0' '\n'

# Memory mappings (same source as pmap)
cat /proc/1234/maps

# Process status (Name, Pid, PPid, Threads, VmRSS, VmSize...)
cat /proc/1234/status

# Open file descriptors
ls -la /proc/1234/fd
# 0=stdin, 1=stdout, 2=stderr, 3+=others
readlink /proc/1234/fd/3        # what fd 3 points to

# OOM score (higher = more likely to be killed by kernel OOM)
cat /proc/1234/oom_score
cat /proc/1234/oom_score_adj    # manual adjustment (-1000 to 1000)
```
Debug tip: Use readlink /proc/PID/cwd to find a process's working directory, and readlink /proc/PID/exe to find its actual binary path, even if the binary was deleted after startup (the link target is then shown with a "(deleted)" suffix).
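A short sketch of the tip, applied to the current shell so that it runs without knowing another process's PID (the variable names are arbitrary):

```shell
# Resolve the working directory and executable of the current shell
# through its /proc symlinks.
pid=$$
cwd=$(readlink "/proc/$pid/cwd")
exe=$(readlink "/proc/$pid/exe")
echo "PID $pid: cwd=$cwd exe=$exe"
```

Inspecting another user's process this way generally requires root, because the symlinks under /proc/PID are readable only by the process owner.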