Chapter 16

Processes: Programs Come Alive

A musical score is just ink on paper. When an orchestra performs it, it comes alive: there is sound, rhythm, memory, and progression through time. A program is the static file sitting on your disk. A process is that score being performed: dynamic, consuming resources, changing state moment by moment.

The same score can be performed by two orchestras simultaneously without interference. Likewise, you can launch the same program twice, for example a shell in each of two terminal windows: both stem from the same executable file, yet they run as independent processes. If you think in object-oriented terms: a program is the class, a process is an instance. One class, many possible instances.

Core Concepts

ELF: What a Program Actually Looks Like

On Linux, executable programs use the ELF (Executable and Linkable Format) file format. You can verify this instantly:

$ file /bin/ls
/bin/ls: ELF 64-bit LSB pie executable, x86-64, ...
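
The same check can be done by hand, because every ELF file begins with a fixed identification block. The following is a minimal sketch, not a full ELF parser: `parse_elf_ident` is a hypothetical helper name, and it decodes only the magic number, the word-size byte, and the byte-order byte.

```python
ELF_MAGIC = b"\x7fELF"  # the four magic bytes at the start of every ELF file


def parse_elf_ident(header: bytes) -> dict:
    """Decode the magic, class, and byte order from the start of an ELF file."""
    if len(header) < 6 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # Byte 4 (EI_CLASS): 1 = 32-bit objects, 2 = 64-bit objects
    ei_class = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")
    # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian
    ei_data = {1: "little-endian (LSB)", 2: "big-endian (MSB)"}.get(
        header[5], "unknown"
    )
    return {"class": ei_class, "data": ei_data}
```

Passing it the first 16 bytes of /bin/ls (read in binary mode) on an x86-64 Linux machine should report a 64-bit, little-endian file, matching the "ELF 64-bit LSB" text printed by file.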

Inside every ELF file are several named sections. The three most important:

ELF File Layout
┌──────────────────────────────┐
│  ELF Header (magic + arch)   │
├──────────────────────────────┤
│  .text section               │
│  Machine instructions (R/O)  │
│  This is where your code is  │
├──────────────────────────────┤
│  .data section               │
│  Initialized global vars     │
│  e.g.: int x = 42;           │
├──────────────────────────────┤
│  .bss section                │
│  Uninitialized global vars   │
│  e.g.: int y;                │
│  (zero bytes on disk!)       │
├──────────────────────────────┤
│  Other sections (symtab...)  │
└──────────────────────────────┘

The .bss name comes from a 1950s assembler directive, "Block Started by Symbol", and has stuck ever since. It takes up almost no disk space: the file just records "I need X bytes of zeroed memory here." When the OS loads the program, it allocates and zeroes that region on demand.

Process Address Space: From 0 to Max

When a program is loaded, the OS creates an isolated virtual address space for it. On 64-bit Linux (x86-64 with 48-bit virtual addresses) a simplified layout looks like this (low addresses at the bottom):

High address
┌──────────────────────────────┐ 0xFFFFFFFFFFFFFFFF
│   Kernel space (invisible)   │
├──────────────────────────────┤ 0x00007FFFFFFFFFFF (top of user space)
│   Stack                      │  ← grows downward
│   Function call frames       │
│   Local variables            │
│   ...                        │
│   ↓                          │
│                              │
│   ↑                          │
│   Heap                       │  ← grows upward
│   malloc / new allocations   │
├──────────────────────────────┤
│   .bss (zeroed globals)      │
├──────────────────────────────┤
│   .data (initialized globals)│
├──────────────────────────────┤
│   .text (code)               │
└──────────────────────────────┘
Low address  ~0x0000000000400000 (typical for non-PIE executables)

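Stack and heap grow toward each other with a very large gap between them. In practice they never literally meet. The stack is capped by a resource limit (commonly 8 MiB on Linux, see ulimit -s), so recursing too deeply hits that limit and crashes with a stack overflow. If you malloc endlessly without freeing, allocations eventually fail or the kernel's out-of-memory killer terminates the process. These are the Stack Overflow and Out of Memory errors you have almost certainly encountered.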
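
On Linux, the actual layout of any process is exposed as text in /proc/<pid>/maps, one mapping per line. The sketch below parses that format and measures the heap and stack regions. The names `Mapping`, `parse_maps_line`, and `region_sizes` are hypothetical helpers invented for illustration, and the parser assumes the usual "address perms offset dev inode pathname" line layout.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the region
    end: int     # one past the last address
    perms: str   # e.g. "r-xp" for read + execute, private
    path: str    # backing file, "[heap]", "[stack]", or "" if anonymous


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/<pid>/maps."""
    fields = line.split(None, 5)  # the pathname (6th field) is optional
    start_s, end_s = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_s, 16), int(end_s, 16), fields[1], path)


def region_sizes(maps_text: str) -> dict:
    """Return the size in bytes of the [heap] and [stack] mappings, if present."""
    sizes = {}
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        m = parse_maps_line(line)
        if m.path in ("[heap]", "[stack]"):
            sizes[m.path] = m.end - m.start
    return sizes
```

Calling `region_sizes(open("/proc/self/maps").read())` inside a Python session shows how small both regions are compared with the gap between them.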

Creating Processes: fork and exec

Linux creates new processes with the classic fork + exec two-step:

Parent process (shell)
    │
    │  fork()
    │─────────────────────────────►  Child (exact copy of parent)
    │                                    │
    │                                    │  exec("ls", ...)
    │                                    │  Replace self with new program
    │                                    │
    │  wait()                            │  ls starts executing
    │◄────────────────────────────────── │  ls finishes, exit()
    │
  shell resumes

fork() creates a copy-on-write snapshot of the parent: no memory pages are copied up front. Only when one of the two processes later writes to a shared page is that single page duplicated. exec() then replaces the entire address space with the new program's image and jumps to its entry point.
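
The fork, exec, wait sequence can be sketched in a few lines of Python, whose os module wraps the underlying system calls. This is a simplified illustration of what a shell does, not how any particular shell is implemented; `run` is a hypothetical helper name, and the exit code 127 for "command could not be executed" follows the common shell convention.

```python
import os


def run(argv: list) -> int:
    """Run a command the way a shell does: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)  # only returns (by raising) on failure
        finally:
            os._exit(127)             # exec failed: leave without running parent code
    # Parent: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)       # child was killed by a signal
```

For example, `run(["ls", "-l"])` lists the current directory and returns the exit code of ls once it finishes.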

The Process State Machine

In the classic textbook model, a process moves among five states during its lifetime:

          fork()
            │
            ▼
        ┌───────┐
        │  New  │
        └───┬───┘
            │  Enters ready queue
            ▼
        ┌───────┐   CPU frees up
        │ Ready │◄──────────────────┐
        └───┬───┘                   │
            │ Scheduler picks it    │
            ▼                       │
        ┌───────┐   Time slice ends │
        │Running│───────────────────┘
        └┬─────┬┘
         │     │
 wait    │     │ exit()
 for I/O ▼     ▼
    ┌───────┐ ┌───────────┐
    │Blocked│ │ Terminated│
    └───┬───┘ └───────────┘
        │ I/O completes
        └──────► Ready

"Blocked" does not mean the process is crashed. It simply means it's waiting for something โ€” a disk read to finish, a network packet to arrive, keyboard input. Once that event occurs, the OS moves it back to the ready queue.

PCB: The Process's Identity Card

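The OS tracks every process using a data structure called the PCB (Process Control Block). A PCB typically holds the process ID, the current state, saved CPU registers, memory-mapping information, open file descriptors, and scheduling data.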

In the Linux kernel the PCB is the task_struct defined in include/linux/sched.h. It has hundreds of fields: everything the kernel needs to know about a process is in there.
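
User programs cannot read task_struct directly, but Linux exports a human-readable summary of each process in /proc/<pid>/status, as "Key: value" lines such as Name, State, Pid, and PPid. The sketch below turns that text into a dictionary; `parse_status` is a hypothetical helper name, and the parser assumes nothing beyond that "Key: value" layout.

```python
def parse_status(text: str) -> dict:
    """Parse the 'Key:\\tvalue' lines of /proc/<pid>/status into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no colon
            fields[key.strip()] = value.strip()
    return fields
```

On a Linux system, `parse_status(open("/proc/self/status").read())["State"]` returns the state of the calling process as the kernel sees it at that moment.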

Hands-On Verification

# View the process tree with PIDs
pstree -p

# See the address space of the current shell
cat /proc/$$/maps

# Inspect ELF sections of a binary
readelf -S /bin/ls | grep -E "\.text|\.data|\.bss"

# Quick size summary of each section
size /bin/ls
# Example output (numbers vary by system and version):
#    text    data     bss     dec     hex filename
#  143416    4824    4664  152904   25548 /bin/ls

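# Demonstrate fork in Python
# (flush=True matters: os._exit() skips Python's normal buffer flush,
#  so the child's output could be lost when stdout is a pipe or a file)
python3 -c "
import os
pid = os.fork()
if pid == 0:
    print(f'Child: PID={os.getpid()}, parent={os.getppid()}', flush=True)
    os._exit(0)
else:
    print(f'Parent: PID={os.getpid()}, child={pid}', flush=True)
    os.wait()
"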

🔬 Going Deeper

How Clever Is Copy-on-Write?

If fork() actually duplicated all 500 MB of a parent's memory, the copy alone could take tens to hundreds of milliseconds, far too slow for a shell that spawns commands constantly. Copy-on-Write solves this: after fork(), parent and child share every memory page, with writable pages marked read-only. The moment either side tries to write to a page, the CPU raises a page fault, the kernel copies just that one page, and execution continues. In the extremely common fork() + exec() pattern (spawning a new program), the child barely touches the old pages at all, and its view of them is discarded as soon as exec() replaces the address space.
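
The visible effect of Copy-on-Write is that a write in one process never shows up in the other. The sketch below demonstrates only that observable behavior; it cannot show which physical pages the kernel copies, and because CPython updates reference counts in object headers, a Python child dirties more pages than a C program would. `fork_and_modify` is a hypothetical name, and the pipe is there just to carry the child's view back to the parent.

```python
import os


def fork_and_modify() -> tuple:
    """Fork, let the child overwrite a buffer, and return (parent_view, child_view)."""
    data = bytearray(b"original")
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's private copy of the page.
        try:
            os.close(read_fd)
            data[:] = b"modified"
            os.write(write_fd, bytes(data))
        finally:
            os._exit(0)
    # Parent: its own buffer is untouched by the child's write.
    os.close(write_fd)
    child_view = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return bytes(data), child_view
```

The function returns `(b"original", b"modified")`: the child saw its own change, while the parent's buffer kept the value it had at the moment of fork().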

Process Isolation Is the Foundation of Security

Every process lives in its own virtual address space, enforced by the CPU's memory-management unit through per-process page tables. Process A's virtual address 0x401000 and Process B's virtual address 0x401000 normally map to different physical pages. The OS can also intentionally map the same physical page into multiple processes: this is how shared memory (via mmap or shmget) works, enabling high-speed inter-process communication without copying.
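
Shared memory is the deliberate exception to isolation, and it can be sketched with Python's mmap module. On Unix, `mmap.mmap(-1, size)` creates an anonymous mapping that is shared by default, so a child created by fork() writes to the same physical page the parent reads. `shared_write_demo` is a hypothetical name, and the value 42 is arbitrary.

```python
import mmap
import os
import struct


def shared_write_demo() -> int:
    """Child writes an int into a shared anonymous mapping; parent reads it back."""
    # fileno=-1 requests anonymous memory; on Unix the default flag is MAP_SHARED.
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child: write directly into the shared page, then exit.
        try:
            struct.pack_into("i", shared, 0, 42)
        finally:
            os._exit(0)
    # Parent: wait for the child, then read the same page.
    os.waitpid(pid, 0)
    value = struct.unpack_from("i", shared, 0)[0]
    shared.close()
    return value
```

The parent gets back 42 even though it never wrote it. Compare this with an ordinary variable after fork(), where the child's write would stay private to the child.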
