How Linux Processes Work: Fork, Exec, Signals, and Zombies
Understand Linux process fundamentals: fork() and exec() for process creation, the parent-child tree, signals like SIGTERM and SIGKILL, zombie processes, the /proc filesystem, and essential debugging tools.

Every Program Is a Process
Every command you run, every daemon in the background, every container workload -- they're all Linux processes. Understanding how processes are created with fork() and exec(), how they communicate through signals, and why zombie processes exist is fundamental knowledge for anyone debugging production systems or writing software that runs on Linux.
The process model in Linux is elegant and decades old, but it still catches experienced engineers off guard. A stuck process that won't die, a zombie filling up the process table, a child process that inherits a file descriptor it shouldn't have -- these are the kinds of problems you can't fix without understanding the underlying mechanics.
What Is a Linux Process?
Definition: A Linux process is an instance of a running program. Each process has a unique process ID (PID), its own virtual memory space, file descriptors, environment variables, and a parent process. The kernel's scheduler allocates CPU time to processes based on priorities and scheduling policies.
When you type ls in a shell, the shell creates a new process, loads the ls binary into it, and waits for it to finish. That creation happens through two system calls that form the backbone of process management on Linux.
fork() and exec(): How Processes Are Born
Process creation in Linux is a two-step operation:
- fork() -- the parent process creates a near-exact copy of itself. The child gets a new PID but inherits copies of almost everything else: memory, open file descriptors, environment variables, signal handlers
- exec() -- the child process replaces its memory with a new program. The PID and open file descriptors stay the same, but the code, data, and stack are overwritten with the new binary
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        // Child process
        printf("Child PID: %d\n", getpid());
        fflush(stdout);  // exec discards any unflushed stdio buffer
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        // Reached only if execl fails
        perror("execl failed");
        _exit(127);  // don't fall through into the parent's code path
    } else if (pid > 0) {
        // Parent process
        printf("Parent PID: %d, Child PID: %d\n", getpid(), pid);
        int status;
        if (waitpid(pid, &status, 0) == -1) {  // Wait for child to finish
            perror("waitpid failed");
            return 1;
        }
        if (WIFEXITED(status))
            printf("Child exited with status %d\n", WEXITSTATUS(status));
    } else {
        perror("fork failed");
        return 1;
    }
    return 0;
}
This fork-then-exec pattern is how almost every userspace process on a Linux system starts (PID 1 is the exception: the kernel launches it directly). Your shell does it every time you run a command. systemd does it for every service. Even container runtimes use it internally.
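Because the child inherits the parent's file descriptors, the same pattern is also how a shell wires up pipes. The sketch below is one possible way to capture a command's output, not how any particular shell implements it. The helper name run_and_capture and its buffer-based interface are made up for illustration, and it does not retry reads interrupted by signals.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0], copies up to cap-1 bytes of its stdout
 * into buf (NUL-terminated), and returns the command's exit status.
 * Returns -1 on error or if the command did not exit normally. */
int run_and_capture(char *const argv[], char *buf, size_t cap) {
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: point stdout at the pipe's write end, then exec. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(126);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);  /* reached only if exec failed */
    }

    /* Parent: close the write end so read() sees EOF once the child is done. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';

    /* Drain anything that did not fit so the child is never blocked on write. */
    char scratch[256];
    while (read(fds[0], scratch, sizeof scratch) > 0) {
    }
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector such as {"/bin/sh", "-c", "echo hello", NULL} should leave the command's output in buf. Note that the parent closes its copy of the write end: if it forgot, read() would never see end-of-file, which is exactly the kind of inherited-descriptor bug mentioned earlier.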
Pro tip: Modern Linux uses copy-on-write (COW) for fork(). The child doesn't actually copy the parent's memory -- it shares the same pages until one of them writes. This makes fork() cheap even for processes using gigabytes of RAM. Redis relies on this for background saves.
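You can't observe the page sharing itself from user space, but you can observe the guarantee it preserves: a write in the child never shows up in the parent. The following is a minimal sketch of that check. The function name child_write_is_private is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical check: returns 1 if a write made by the child stayed
 * invisible to the parent, 0 if it leaked, -1 on error. */
int child_write_is_private(void) {
    volatile int counter = 42;

    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* The first write to a shared page makes the kernel give the
         * child its own private copy of that page. */
        counter = 99;
        _exit(counter == 99 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    /* The parent's copy is untouched by the child's write. */
    return counter == 42;
}
```

The same isolation is what lets a forked snapshot process keep a consistent view of memory while the parent carries on modifying its own copy.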
The Process Tree: Parents and Children
Every process has a parent. The root of the tree is PID 1 -- init or systemd. You can see the tree with pstree:
$ pstree -p
systemd(1)-+-sshd(892)-+-sshd(1204)---bash(1205)---vim(1340)
           |-nginx(901)-+-nginx(902)
           |            |-nginx(903)
           |-postgres(845)-+-postgres(850)
                           |-postgres(851)
Key rules of the parent-child relationship:
- A child's parent PID (PPID) points to the process that called fork()
- If a parent dies, the child is reparented to PID 1 (systemd/init), or to a closer ancestor that has registered itself as a child subreaper
- The parent is responsible for calling wait() to collect the child's exit status
- If the parent doesn't call wait(), the dead child becomes a zombie
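The reparenting rule can be watched in action. The sketch below deliberately orphans a grandchild and reports which PID adopted it. The function name adopted_parent_of_orphan is hypothetical, and the result is usually 1 but may be a subreaper's PID (for example under a user-session systemd instance or inside some containers).

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: orphans a grandchild and returns the PID it was
 * reparented to, or -1 on error. */
pid_t adopted_parent_of_orphan(void) {
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        close(fds[0]);
        pid_t original_parent = getpid();  /* the intermediate process */
        pid_t grandchild = fork();
        if (grandchild == 0) {
            /* Poll until the kernel has reparented us. */
            struct timespec pause = {0, 1000000L};  /* 1 ms */
            while (getppid() == original_parent)
                nanosleep(&pause, NULL);
            pid_t new_parent = getppid();
            ssize_t n = write(fds[1], &new_parent, sizeof new_parent);
            _exit(n == (ssize_t)sizeof new_parent ? 0 : 1);
        }
        /* The intermediate exits right away, orphaning the grandchild. */
        _exit(grandchild == -1 ? 1 : 0);
    }

    close(fds[1]);
    waitpid(child, NULL, 0);  /* reap the intermediate so it is no zombie */

    pid_t new_parent = -1;
    ssize_t n = read(fds[0], &new_parent, sizeof new_parent);
    close(fds[0]);
    return n == (ssize_t)sizeof new_parent ? new_parent : -1;
}
```

The grandchild keeps running after its parent is gone. Only its PPID changes, and whichever process adopted it becomes responsible for calling wait() on it.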
Process States
A process is always in one of these states, visible in the STAT column of ps aux:
| State | Code | Meaning |
|---|---|---|
| Running | R | Currently executing or in the run queue |
| Sleeping (interruptible) | S | Waiting for an event (most common state) |
| Sleeping (uninterruptible) | D | Waiting for I/O -- cannot be killed with SIGKILL |
| Stopped | T | Stopped by a signal (SIGSTOP or Ctrl+Z) |
| Zombie | Z | Terminated but parent hasn't called wait() |
Watch out: Processes in the D (uninterruptible sleep) state cannot be killed, not even with kill -9. They're waiting for I/O (usually disk or NFS). If you see processes stuck in D state, the problem is the I/O subsystem -- a dead NFS mount, a failing disk, or a saturated block device. Fix the I/O problem, and the processes will unstick themselves.
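The state code that ps prints comes from /proc. As a rough sketch, a program can read it directly from the State: line of /proc/PID/status. The helper name process_state is an assumption for illustration, and the code assumes a Linux system with /proc mounted.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: returns the one-letter state code (R, S, D, T, Z, ...)
 * for pid, or '?' if it cannot be read. */
char process_state(pid_t pid) {
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return '?';

    char line[256];
    char state = '?';
    while (fgets(line, sizeof line, f)) {
        /* The line of interest looks like "State:\tS (sleeping)". */
        if (sscanf(line, "State: %c", &state) == 1)
            break;
    }
    fclose(f);
    return state;
}
```

A process that inspects itself this way always sees R, because it has to be running in order to do the read.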
Signals: Process Communication
Signals are the primary way to communicate with processes. They're software interrupts delivered by the kernel.
What are the most important Linux signals?
The most critical signals for system administration are SIGTERM (15) for graceful shutdown, SIGKILL (9) for forced termination, SIGHUP (1) for reloading configuration, and SIGCHLD which the kernel sends to a parent when a child process exits. SIGTERM should always be tried before SIGKILL because it allows the process to clean up resources.
| Signal | Number | Default Action | Can Be Caught? | Common Use |
|---|---|---|---|---|
| SIGTERM | 15 | Terminate | Yes | Graceful shutdown (default of kill) |
| SIGKILL | 9 | Terminate | No | Force kill -- last resort |
| SIGHUP | 1 | Terminate | Yes | Reload config (Nginx, Apache) |
| SIGINT | 2 | Terminate | Yes | Ctrl+C in terminal |
| SIGSTOP | 19 | Stop | No | Pause process (like SIGKILL, can't be caught) |
| SIGCONT | 18 | Continue | Yes | Resume a stopped process |
| SIGCHLD | 17 | Ignore | Yes | Child process exited -- parent should call wait() |
| SIGUSR1 | 10 | Terminate | Yes | Application-defined (log rotation, etc.) |
# Send SIGTERM (graceful)
kill 1234
kill -SIGTERM 1234
kill -15 1234
# Send SIGKILL (force -- use only when SIGTERM doesn't work)
kill -9 1234
# Send SIGHUP (reload config)
kill -HUP 1234
# Kill all processes by name
killall nginx
pkill -f "node server.js"
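On the receiving side, "can be caught" means the process may install a handler with sigaction(). Here is a minimal sketch of a graceful-shutdown handler for SIGTERM, plus a probe showing that the kernel refuses handlers for SIGKILL and SIGSTOP. The names shutdown_requested, install_sigterm_handler, and signal_can_be_caught are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set from the handler; sig_atomic_t is safe to write inside a handler. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_sigterm(int signo) {
    (void)signo;
    shutdown_requested = 1;  /* do the real cleanup in the main loop */
}

/* Installs the SIGTERM handler. Returns 0 on success, -1 on failure. */
int install_sigterm_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;  /* restart interrupted system calls */
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns 1 if a handler can be installed for signo, 0 otherwise.
 * The previous disposition is restored before returning. */
int signal_can_be_caught(int signo) {
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old) == -1)
        return 0;  /* sigaction fails with EINVAL for SIGKILL and SIGSTOP */
    sigaction(signo, &old, NULL);
    return 1;
}
```

A typical main loop would check shutdown_requested between units of work, finish what it is doing, close its resources, and exit. That is the cleanup window SIGTERM provides and SIGKILL does not.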
Zombie Processes
A zombie process is a process that has finished executing but still has an entry in the process table. It exists because the kernel needs to keep the exit status around until the parent process collects it with wait().
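To see a zombie first-hand, you can create one on purpose. The sketch below forks a child that exits immediately, waits for the exit without reaping it (waitid() with WNOWAIT), reads the child's state from /proc/PID/stat, and only then reaps it. The function names stat_state and observe_zombie are hypothetical, and a Linux /proc is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reads the state letter from /proc/<pid>/stat, or '?' on failure. */
static char stat_state(pid_t pid) {
    char path[64], line[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return '?';

    char state = '?';
    if (fgets(line, sizeof line, f)) {
        /* Format: "pid (comm) state ..."; comm itself may contain ')'. */
        char *close_paren = strrchr(line, ')');
        if (close_paren && close_paren[1] == ' ' && close_paren[2] != '\0')
            state = close_paren[2];
    }
    fclose(f);
    return state;
}

/* Hypothetical demo: returns the child's state before it is reaped and
 * stores its state after reaping in *after. */
char observe_zombie(char *after) {
    pid_t pid = fork();
    if (pid == -1)
        return '?';
    if (pid == 0)
        _exit(0);  /* child exits immediately */

    /* WNOWAIT blocks until the child has exited but leaves it unreaped. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
        return '?';

    char before = stat_state(pid);  /* exited, not yet reaped */
    waitpid(pid, NULL, 0);          /* reap: the process table entry goes away */
    *after = stat_state(pid);
    return before;
}
```

Before the waitpid() call the child should show up as Z. Afterwards its /proc entry no longer exists, so the helper falls back to '?'.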
How to identify and fix zombie processes
- Identify zombies -- run ps aux | grep Z or look for processes with state Z in top
- Find the parent -- run ps -o ppid= -p ZOMBIE_PID to get the parent PID
- Signal the parent -- send SIGCHLD to the parent: kill -SIGCHLD PARENT_PID. A well-written parent will call wait() in response
- Kill the parent -- if SIGCHLD doesn't work, the parent has a bug. Killing the parent causes the zombie to be reparented to PID 1, which will reap it
- Fix the code -- if you control the parent, ensure it handles SIGCHLD or calls waitpid() to reap children
# Find zombies (STAT may carry modifiers such as Z+ or Zs, so match the prefix)
ps aux | awk '$8 ~ /^Z/ {print $0}'
# Count zombies
ps aux | awk '$8 ~ /^Z/' | wc -l
# Find parent of a zombie
ps -o ppid= -p 12345
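If you control the parent, one common fix is a SIGCHLD handler that reaps every child that has exited so far. The sketch below shows that approach under the assumption that the program has no other code calling wait(). The names reaped_children and install_sigchld_reaper are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Number of children reaped by the handler so far. */
static volatile sig_atomic_t reaped_children = 0;

/* Standard signals are not queued, so one SIGCHLD may stand for several
 * exits: loop until waitpid() reports nothing left to collect. */
static void on_sigchld(int signo) {
    (void)signo;
    int saved_errno = errno;  /* waitpid() may overwrite errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Installs the reaping handler. Returns 0 on success, -1 on failure. */
int install_sigchld_reaper(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    /* SA_NOCLDSTOP: only notify on exit, not when a child is stopped. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

The loop with WNOHANG is the important design choice: a handler that calls wait() only once per signal will leak zombies whenever several children exit at nearly the same moment.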
Pro tip: A few zombies are harmless -- they consume no CPU and almost no memory, only a process table entry. The concern is a process that continuously creates children without reaping them, which can exhaust the PID limit (the kernel default is 32768, though many modern 64-bit distributions raise it; check or change it via /proc/sys/kernel/pid_max).
The /proc Filesystem
The /proc filesystem is a virtual filesystem that exposes kernel and process information as files. Every process gets a directory at /proc/PID/:
# Process command line
cat /proc/1234/cmdline | tr '\0' ' '
# Process environment variables
cat /proc/1234/environ | tr '\0' '\n'
# Open file descriptors
ls -l /proc/1234/fd/
# Memory map
cat /proc/1234/maps
# Process status (state, memory, threads)
cat /proc/1234/status
# Current working directory
readlink /proc/1234/cwd
# Binary being executed
readlink /proc/1234/exe
System-wide information lives directly under /proc:
# Number of CPUs
nproc
grep -c ^processor /proc/cpuinfo
# Memory stats
cat /proc/meminfo
# Load average
cat /proc/loadavg
# Kernel version
cat /proc/version
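The system-wide files are just as easy to parse. This small sketch reads the 1, 5, and 15 minute load averages from the first three fields of /proc/loadavg. The function name read_load_average is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: fills out[0..2] with the 1, 5, and 15 minute load
 * averages. Returns 0 on success, -1 on failure. */
int read_load_average(double out[3]) {
    FILE *f = fopen("/proc/loadavg", "r");
    if (!f)
        return -1;

    /* The file starts with three decimal numbers separated by spaces. */
    int matched = fscanf(f, "%lf %lf %lf", &out[0], &out[1], &out[2]);
    fclose(f);
    return matched == 3 ? 0 : -1;
}
```

Tools such as uptime and top report these same three numbers.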
Process Management Tools
| Tool | Purpose | Key Flags |
|---|---|---|
| ps | Snapshot of current processes | ps aux, ps -ef, ps -o pid,ppid,stat,cmd |
| top/htop | Real-time process monitor | Sort by CPU (P), memory (M) |
| pstree | Process tree visualization | pstree -p shows PIDs |
| strace | Trace system calls | strace -p PID -f (follow forks) |
| lsof | List open files by process | lsof -p PID, lsof -i :8080 |
| pgrep | Find PIDs by name | pgrep -f "pattern" |
Server and Monitoring Costs
Process monitoring becomes critical at scale. Here's what common monitoring solutions cost:
| Tool | Type | Starting Cost | Notes |
|---|---|---|---|
| Datadog | SaaS APM | $15/host/month | Full process monitoring, infrastructure maps |
| New Relic | SaaS APM | Free tier, then $0.30/GB | 100 GB/month free |
| Prometheus + Grafana | Self-hosted | Free (OSS) | process-exporter for per-process metrics |
| htop/btop | CLI | Free (OSS) | Interactive, no historical data |
| Netdata | Self-hosted | Free (OSS) | Per-process CPU, memory, I/O out of the box |
Frequently Asked Questions
What is the difference between fork() and exec()?
fork() creates a copy of the current process with a new PID. exec() replaces the current process's program with a new one without changing the PID. They're almost always used together: fork() to create a child, then exec() in the child to run a different program. The parent continues running its original code.
Why can't I kill a process with kill -9?
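If kill -9 doesn't work, the process is usually in uninterruptible sleep (D state), waiting for I/O that hasn't completed. The kernel won't deliver SIGKILL until the I/O finishes. Common causes include dead NFS mounts, failing disks, or hung kernel drivers. Fix the underlying I/O issue to unstick the process. The other common case is a zombie (Z state): it is already dead, so there is nothing left to kill, and it disappears only when its parent reaps it.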
What causes zombie processes and are they harmful?
Zombies occur when a child process exits but its parent hasn't called wait() to collect the exit status. Individual zombies are harmless -- they use no CPU or memory. The risk is a buggy parent that creates thousands of zombies, exhausting the PID table. Fix the parent's code to properly reap children.
What is PID 1 and why is it special?
PID 1 is the init process (systemd on modern systems). The kernel starts it first, and it becomes the adoptive parent of any orphaned process. PID 1 has a special property: the kernel won't send it signals it hasn't explicitly registered handlers for, so you can't accidentally kill it with SIGTERM or SIGKILL.
How do I find which process is using a specific port?
Use ss -tlnp | grep :8080 or lsof -i :8080. Both show the PID and process name. ss is faster and available on all modern Linux systems. lsof gives more detail but may need to be installed separately. On older systems, netstat -tlnp works too.
What happens to child processes when the parent dies?
Orphaned children are reparented to PID 1 (systemd/init), which adopts them and will call wait() when they exit. The children keep running normally. This is one reason traditional daemons fork twice and have the intermediate parent exit -- the daemon ends up as a child of PID 1, which handles cleanup correctly. The second fork also keeps the daemon from reacquiring a controlling terminal.
How do I run a process that survives SSH disconnection?
Use nohup command &, tmux, or screen. nohup makes the process ignore SIGHUP (which is sent when the terminal closes). tmux and screen are better because they let you reattach to the session later. For persistent services, use a systemd unit file instead.
Conclusion
The Linux process model is built on a small set of primitives -- fork(), exec(), wait(), and signals -- that compose into everything from shell pipelines to container runtimes. Knowing these primitives lets you reason about why a process is stuck, where zombies come from, and how to debug misbehaving services. Start by exploring /proc on a running system, use strace to watch system calls, and build intuition about parent-child relationships with pstree. These tools will save you hours the next time a process refuses to behave.
Written by
Abhishek Patel
Infrastructure engineer with 10+ years building production systems on AWS, GCP, and bare metal. Writes practical guides on cloud architecture, containers, networking, and Linux for developers who want to understand how things actually work under the hood.