
How Linux Processes Work: Fork, Exec, Signals, and Zombies

Understand Linux process fundamentals: fork() and exec() for process creation, the parent-child tree, signals like SIGTERM and SIGKILL, zombie processes, the /proc filesystem, and essential debugging tools.

Abhishek Patel · 10 min read

Infrastructure engineer with 10+ years building production systems on AWS, GCP,…


fork() vs vfork() vs posix_spawn() vs clone(): A Side-by-Side

Linux gives you four ways to spawn a new process. They differ in how much of the parent they copy, how fast they return, how they interact with the scheduler, and which one you should actually reach for. Picking the wrong primitive is how "spawn a worker" becomes a 40ms hot-path call that limits throughput to 25 workers/second.

| Primitive | What happens | Typical latency (Linux 6.x on a modern x86 box) | Use when |
|---|---|---|---|
| fork() | Duplicate the parent with copy-on-write. Child shares pages until write. | 30-80 μs for a small process; 200-800 μs for a 2 GB heap | You actually need a copy of the parent's memory (Redis RDB save, shell forking for a pipeline). |
| vfork() | Child shares parent's address space until exec() or _exit(). Parent blocks. | ~10 μs -- fastest available | Almost never directly. Mostly a historical primitive; libc uses it internally. |
| posix_spawn() | Library call that bundles fork+exec. Not a syscall on Linux: glibc implements it with clone(CLONE_VM \| CLONE_VFORK), so the parent's page tables are not copied. | ~40-120 μs regardless of parent size | You want exec() after fork() -- i.e., you are running another program. This is the right default for most system()-style code. |
| clone() | Low-level primitive that lets you choose what to share: memory, fds, pid namespace, etc. | Varies with flags; ~20-60 μs for a thread-like share | You are building a container runtime, a thread library, or doing something unusual (e.g., starting a child in a new network namespace with CLONE_NEWNET). |

The non-obvious one is posix_spawn(). In 2026 it is the primitive you want 90% of the time for "run another program." Language runtimes lean the same way: Python's subprocess module uses posix_spawn() when its internal _USE_POSIX_SPAWN check allows it, and Go's os/exec creates children on Linux with a vfork-style clone() (CLONE_VM | CLONE_VFORK) rather than a plain fork(). The reason is the same in both cases: it avoids the page-table duplication that fork() performs even with COW. On a server with a 4 GB RSS parent, this is the difference between roughly 700 μs and 90 μs per spawn.
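For a feel of what "reach for posix_spawn()" looks like in practice, here is a minimal sketch. The helper name spawn_and_wait and its return-value convention (exit status, 128+signal, or -1) are assumptions made for this example, not part of any library. It passes NULL for the file actions and attributes, which means "inherit everything from the parent."

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;  /* the caller's environment, passed through to the child */

/* Hypothetical helper: run argv[0] (looked up on PATH) and wait for it.
 * Returns the child's exit status, 128+signal if a signal killed it,
 * or -1 if the spawn or the wait itself failed. */
int spawn_and_wait(char *const argv[]) {
    pid_t pid;

    /* posix_spawnp() returns an error number directly; it does not set errno. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawnp(%s): %s\n", argv[0], strerror(err));
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Calling it with an argv such as {"sh", "-c", "exit 3", NULL} returns 3. One caveat: how a failed exec is reported (an error from posix_spawnp() itself versus a child that exits with 127) depends on the libc, so treat both as "could not run."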

The Mental Model: Everything Is a Process, Most Processes Are Waiting

The Linux process model is decades old and survives because its primitives compose. A shell pipeline is fork() + pipe() + dup2() + exec(). A web server is socket() + bind() + listen() + fork()/clone(). A container is clone(CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWNET) + pivot_root() + exec(). Same primitives, different combinations. Learn them once and the entire OS stops being magic.
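To make the first of those recipes concrete, here is a rough sketch of `left | right` built from exactly fork(), pipe(), dup2(), and exec(). The function names spawn_stage and run_pipeline are invented for this example, and error handling is kept deliberately thin.

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, rewire its stdin/stdout to the given descriptors
 * (-1 leaves one untouched), close the pipe end it does not use,
 * then exec argv. Returns the child's PID, or -1 if fork() failed. */
static pid_t spawn_stage(char *const argv[], int in_fd, int out_fd, int close_fd) {
    pid_t pid = fork();
    if (pid == 0) {
        if (in_fd != -1)  { dup2(in_fd, STDIN_FILENO);   close(in_fd); }
        if (out_fd != -1) { dup2(out_fd, STDOUT_FILENO); close(out_fd); }
        if (close_fd != -1) close(close_fd);
        execvp(argv[0], argv);
        perror("execvp");   /* only reached if exec failed */
        _exit(127);
    }
    return pid;
}

/* Rough equivalent of `left | right`.
 * Returns right's exit status, or -1 on error. */
int run_pipeline(char *const left[], char *const right[]) {
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t lp = spawn_stage(left,  -1,     fds[1], fds[0]);
    pid_t rp = spawn_stage(right, fds[0], -1,     fds[1]);

    /* The parent must close both ends, or `right` never sees EOF. */
    close(fds[0]);
    close(fds[1]);

    int status = 0, right_status = -1;
    if (lp > 0)
        waitpid(lp, &status, 0);
    if (rp > 0 && waitpid(rp, &status, 0) == rp && WIFEXITED(status))
        right_status = WEXITSTATUS(status);
    return right_status;
}
```

The line that bites people is the pair of close() calls in the parent. Every process that still holds the write end keeps the pipe open, so a forgotten descriptor shows up as a reader that hangs forever.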

A running Linux system has hundreds to thousands of processes, but at any given instant almost none of them are actually running. Run ps -eo pid,stat,comm | awk '$2 ~ /^[RD]/' on a production box -- you will typically see 5-20 processes in R (runnable) or D (uninterruptible sleep), out of 600+ total. Everything else is sleeping, waiting on I/O, a timer, a signal, or a lock. Understanding where a given process is waiting is most of what debugging a "stuck" system looks like in practice.

The rest of this guide walks the primitives -- fork/exec, signals, states, zombies, /proc -- but always in the context of "what problem does this solve, and what does it look like when it breaks at 3 AM?" That is the only framing that makes the classical Linux process model stick.

fork() and exec(): How Processes Are Born

Process creation in Linux is a two-step operation:

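  1. fork() -- the parent process creates a near-exact copy of itself. The child gets a new PID but inherits almost everything else: memory (copy-on-write), open file descriptors, environment variables, signal handlers. Only the calling thread is duplicated, and pending signals and timers are not inherited
  2. exec() -- the child process replaces its memory with a new program. The PID stays the same, but the code, data, and stack are overwritten with the new binary

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();

    if (pid == 0) {
        // Child process
        printf("Child PID: %d\n", (int)getpid());
        fflush(stdout);  // exec() throws away unflushed stdio buffers
        execl("/bin/ls", "ls", "-l", (char *)NULL);
        // Only reached if execl() failed
        perror("execl failed");
        _exit(127);  // exit here so the child never runs the parent's code
    } else if (pid > 0) {
        // Parent process
        printf("Parent PID: %d, Child PID: %d\n", (int)getpid(), (int)pid);
        int status;
        if (waitpid(pid, &status, 0) < 0) {  // Wait for child to finish
            perror("waitpid failed");
            return 1;
        }
        if (WIFEXITED(status))
            printf("Child exited with status %d\n", WEXITSTATUS(status));
        else if (WIFSIGNALED(status))
            printf("Child killed by signal %d\n", WTERMSIG(status));
    } else {
        perror("fork failed");
        return 1;
    }
    return 0;
}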

This fork-then-exec pattern (or a close variant such as posix_spawn(), or clone() followed by exec()) is how nearly every userspace process on a Linux system starts. Your shell does it every time you run a command. systemd does it for every service. Even container runtimes use it internally.

Pro tip: Modern Linux uses copy-on-write (COW) for fork(). The child doesn't actually copy the parent's memory -- it shares the same pages until one of them writes. This keeps fork() far cheaper than a full copy even for processes using gigabytes of RAM, although the page tables themselves are still duplicated, so the cost grows with the parent's size. Redis relies on this for background saves.

The Process Tree: Parents and Children

Every process has a parent. The root of the tree is PID 1 -- init or systemd. You can see the tree with pstree:

$ pstree -p
systemd(1)-+-sshd(892)-+-sshd(1204)---bash(1205)---vim(1340)
           |-nginx(901)-+-nginx(902)
           |            |-nginx(903)
           |-postgres(845)-+-postgres(850)
                           |-postgres(851)

Key rules of the parent-child relationship:

  • A child's parent PID (PPID) points to the process that called fork()
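  • If a parent dies, the child is reparented to PID 1 (systemd/init), or to the nearest ancestor that registered itself as a subreaper with prctl(PR_SET_CHILD_SUBREAPER)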
  • The parent is responsible for calling wait() to collect the child's exit status
  • If the parent doesn't call wait(), the dead child becomes a zombie

Process States

A process is always in one of these states, visible in the STAT column of ps aux:

| State | Code | Meaning |
|---|---|---|
| Running | R | Currently executing or in the run queue |
| Sleeping (interruptible) | S | Waiting for an event (most common state) |
| Sleeping (uninterruptible) | D | Waiting for I/O -- cannot be killed with SIGKILL |
| Stopped | T | Stopped by a signal (SIGSTOP or Ctrl+Z) |
| Zombie | Z | Terminated but parent hasn't called wait() |

Watch out: Processes in the D (uninterruptible sleep) state cannot be killed, not even with kill -9. They're waiting for I/O (usually disk or NFS). If you see processes stuck in D state, the problem is the I/O subsystem -- a dead NFS mount, a failing disk, or a saturated block device. Fix the I/O problem, and the processes will unstick themselves.

Signals: Process Communication

Signals are the primary way to communicate with processes. They're software interrupts delivered by the kernel.

What are the most important Linux signals?

The most critical signals for system administration are SIGTERM (15) for graceful shutdown, SIGKILL (9) for forced termination, SIGHUP (1) for reloading configuration, and SIGCHLD which the kernel sends to a parent when a child process exits. SIGTERM should always be tried before SIGKILL because it allows the process to clean up resources.
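"Allows the process to clean up" only holds if the process actually catches SIGTERM. A minimal sketch of the usual pattern follows: the handler only sets a flag, and the main loop polls it and does the real cleanup. The names install_shutdown_handler and should_shut_down are made up for this example.

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <signal.h>
#include <string.h>

/* Set from the handler, polled from the main loop. volatile sig_atomic_t
 * is the portable type for a flag shared with a signal handler. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_term(int signo) {
    (void)signo;
    shutdown_requested = 1;  /* nothing else here: no printf, no malloc */
}

/* Route SIGTERM and SIGINT to on_term. Returns 0 on success, -1 on failure. */
int install_shutdown_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_term;
    sigemptyset(&sa.sa_mask);
    /* No SA_RESTART: blocking calls fail with EINTR, so the main loop
     * gets a chance to notice the flag. */
    sa.sa_flags = 0;

    if (sigaction(SIGTERM, &sa, NULL) < 0)
        return -1;
    if (sigaction(SIGINT, &sa, NULL) < 0)
        return -1;
    return 0;
}

/* Non-zero once a shutdown signal has arrived. */
int should_shut_down(void) {
    return shutdown_requested != 0;
}
```

A server loop would then look like `while (!should_shut_down()) { handle_one_request(); }` followed by its cleanup code, where handle_one_request stands in for whatever the program does. Leaving SA_RESTART off is a design choice: it trades a few EINTR checks for a prompt shutdown.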

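| Signal | Number | Default Action | Can Be Caught? | Common Use |
|---|---|---|---|---|
| SIGTERM | 15 | Terminate | Yes | Graceful shutdown (default of kill) |
| SIGKILL | 9 | Terminate | No | Force kill -- last resort |
| SIGHUP | 1 | Terminate | Yes | Reload config (Nginx, Apache) |
| SIGINT | 2 | Terminate | Yes | Ctrl+C in terminal |
| SIGSTOP | 19 | Stop | No | Pause process (like SIGKILL, can't be caught) |
| SIGCONT | 18 | Continue | Yes | Resume a stopped process |
| SIGCHLD | 17 | Ignore | Yes | Child process exited -- the parent's cue to call wait() |
| SIGUSR1 | 10 | Terminate | Yes | Application-defined (log rotation, etc.) |

Numbers shown are for x86 and ARM. SIGSTOP, SIGCONT, SIGCHLD, and SIGUSR1 use different numbers on some other architectures, so prefer names over numbers in scripts (kill -l lists them).

# Send SIGTERM (graceful)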
kill 1234
kill -SIGTERM 1234
kill -15 1234

# Send SIGKILL (force -- use only when SIGTERM doesn't work)
kill -9 1234

# Send SIGHUP (reload config)
kill -HUP 1234

# Kill all processes by name
killall nginx
pkill -f "node server.js"
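The "SIGTERM first, SIGKILL only if that fails" rule is easy to automate for your own child processes. This is a sketch under assumptions: terminate_child is a hypothetical helper, the 10 ms poll interval is arbitrary, and it only works for direct children because it relies on waitpid().

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Hypothetical helper: ask a child to exit with SIGTERM, then escalate to
 * SIGKILL once grace_ms has passed. Returns the signal that ended the
 * child, 0 if it exited on its own (for example from a SIGTERM handler),
 * or -1 on error. */
int terminate_child(pid_t pid, int grace_ms) {
    int status;
    const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    if (kill(pid, SIGTERM) < 0)
        return -1;

    /* Poll without blocking until the grace period runs out. */
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
        if (r < 0 && errno != EINTR)
            return -1;
        nanosleep(&tick, NULL);
    }

    /* Grace period over. SIGKILL cannot be caught or ignored. */
    if (kill(pid, SIGKILL) < 0)
        return -1;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A return value of SIGKILL (9) in your logs is worth alerting on: it means the child ignored or outlasted the polite request, and whatever cleanup it was supposed to do did not happen.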

Zombie Processes

A zombie process is a process that has finished executing but still has an entry in the process table. It exists because the kernel needs to keep the exit status around until the parent process collects it with wait().

How to identify and fix zombie processes

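  1. Identify zombies -- run ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/' or look for processes with state Z in top (a plain grep Z also matches any command or user name that contains a Z)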
  2. Find the parent -- run ps -o ppid= -p ZOMBIE_PID to get the parent PID
  3. Signal the parent -- send SIGCHLD to the parent: kill -SIGCHLD PARENT_PID. A well-written parent will call wait() in response
  4. Kill the parent -- if SIGCHLD doesn't work, the parent has a bug. Killing the parent causes the zombie to be reparented to PID 1, which will reap it
  5. Fix the code -- if you control the parent, ensure it handles SIGCHLD or calls waitpid() to reap children

# Find zombies (STAT can be Z, Zs, Z+ and so on, so match on the first letter)
ps aux | awk '$8 ~ /^Z/'

# Count zombies
ps aux | awk '$8 ~ /^Z/' | wc -l

# Find parent of a zombie
ps -o ppid= -p 12345

Pro tip: A few zombies are harmless -- they consume no CPU and almost no memory, only a PID table entry. The concern is a process that continuously creates children without reaping them, which can exhaust the PID limit (kernel default 32768, often raised to 4194304 by systemd on 64-bit distributions, configurable via /proc/sys/kernel/pid_max).
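For the "fix the code" step, the classic remedy is a SIGCHLD handler that reaps in a loop. A minimal sketch, where install_child_reaper and reaped_count are names invented for this example and the counter exists only so you can observe the handler working:

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler: reap every child that has exited so far. Standard
 * signals are not queued, so one SIGCHLD can stand for several dead
 * children. That is why this is a loop and not a single waitpid(). */
static void on_sigchld(int signo) {
    (void)signo;
    int saved_errno = errno;  /* waitpid() may overwrite errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Install the reaper. Returns 0 on success, -1 on failure. */
int install_child_reaper(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART: do not break the program's blocking calls.
     * SA_NOCLDSTOP: ignore children that merely stop or continue. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Number of children reaped by the handler so far. */
int reaped_count(void) {
    return reaped_children;
}
```

This version discards exit statuses. If the parent needs them, the handler should only record that something died and let the main loop call waitpid() for the PIDs it is tracking.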

The /proc Filesystem

The /proc filesystem is a virtual filesystem that exposes kernel and process information as files. Every process gets a directory at /proc/PID/:

# Process command line
cat /proc/1234/cmdline | tr '\0' ' '

# Process environment variables
cat /proc/1234/environ | tr '\0' '\n'

# Open file descriptors
ls -l /proc/1234/fd/

# Memory map
cat /proc/1234/maps

# Process status (state, memory, threads)
cat /proc/1234/status

# Current working directory
readlink /proc/1234/cwd

# Binary being executed
readlink /proc/1234/exe
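Everything above is plain text, so reading it from a program is just file I/O. As an illustration, here is a small sketch that pulls one numeric field out of /proc/PID/status. The helper name proc_status_long is an assumption for this example; the field names (Pid, PPid, Threads, VmRSS) are the ones the kernel prints in that file.

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: read a numeric field such as "PPid", "Threads" or
 * "VmRSS" from /proc/<pid>/status. Returns -1 if the file or the field
 * is missing. Units, where present (VmRSS is in kB), are dropped. */
long proc_status_long(pid_t pid, const char *key) {
    char path[64], line[256];
    snprintf(path, sizeof path, "/proc/%d/status", (int)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    size_t klen = strlen(key);
    long value = -1;
    while (fgets(line, sizeof line, f)) {
        /* Lines look like "PPid:\t4191". Requiring the ':' right after
         * the key keeps "Pid" from matching "PPid". */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            value = strtol(line + klen + 1, NULL, 10);
            break;
        }
    }
    fclose(f);
    return value;
}
```

For example, proc_status_long(getpid(), "PPid") gives the same answer as getppid(), and calling it with "VmRSS" on a worker's PID is a cheap way to log resident memory without shelling out to ps.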

System-wide information lives directly under /proc:

# Number of CPUs
nproc
grep -c '^processor' /proc/cpuinfo

# Memory stats
cat /proc/meminfo

# Load average
cat /proc/loadavg

# Kernel version
cat /proc/version

Process Management Tools

| Tool | Purpose | Key Flags |
|---|---|---|
| ps | Snapshot of current processes | ps aux, ps -ef, ps -o pid,ppid,stat,cmd |
| top/htop | Real-time process monitor | Sort by CPU (P), memory (M) |
| pstree | Process tree visualization | pstree -p shows PIDs |
| strace | Trace system calls | strace -p PID -f (follow forks) |
| lsof | List open files by process | lsof -p PID, lsof -i :8080 |
| pgrep | Find PIDs by name | pgrep -f "pattern" |

Server and Monitoring Costs

Process monitoring becomes critical at scale. Here's what common monitoring solutions cost:

| Tool | Type | Starting Cost | Notes |
|---|---|---|---|
| Datadog | SaaS APM | $15/host/month | Full process monitoring, infrastructure maps |
| New Relic | SaaS APM | Free tier, then $0.30/GB | 100 GB/month free |
| Prometheus + Grafana | Self-hosted | Free (OSS) | process-exporter for per-process metrics |
| htop/btop | CLI | Free (OSS) | Interactive, no historical data |
| Netdata | Self-hosted | Free (OSS) | Per-process CPU, memory, I/O out of the box |

A Real Debugging Session: "The App Is Frozen But Not Dead"

Theory is fine. Here is the actual sequence you run when a production process is stuck and nobody knows why. This is the playbook I use on my third cup of coffee at 3 AM.

# Step 1 -- is it alive? What state is it in?
$ ps -o pid,ppid,stat,wchan,cmd -p 4213
  PID  PPID STAT WCHAN           CMD
 4213  4191 Dl   io_schedule     /usr/bin/app --config /etc/app.yaml

# 'D' means uninterruptible sleep, 'l' means multi-threaded.
# WCHAN 'io_schedule' = it is blocked on I/O, waiting for the block layer.

# Step 2 -- what is it waiting on?
$ sudo cat /proc/4213/stack
[<0>] io_schedule+0x46/0x70
[<0>] folio_wait_bit_common+0x129/0x300
[<0>] filemap_read+0x217/0x380
[<0>] nfs_file_read+0x76/0x120   # <-- an NFS read is stuck
[<0>] vfs_read+0x9a/0x190
[<0>] ksys_read+0x5f/0xe0

# Step 3 -- confirm by looking at open files
$ sudo ls -l /proc/4213/fd/ | head
lr-x------ 1 app app 64 Apr 21 03:12 7 -> /mnt/nfs/reports/2026-04/data.parquet
# Bingo -- an NFS mount has gone away.

# Step 4 -- why is it stuck? Check the NFS mount.
$ sudo stat /mnt/nfs/reports
stat: cannot stat '/mnt/nfs/reports': No such device or address

# Confirmed: dead NFS mount. SIGKILL will not help -- the process is in D state.
# Fix the network/NFS issue, or 'umount -l' the stale mount. The process unsticks itself.

This is the single most important debugging skill in production Linux: reading /proc/<pid>/stack (or /proc/<pid>/wchan on older kernels) to see where in the kernel a process is waiting. It turns "the app is frozen" into "the app is in nfs_file_read, which means the NFS server is the problem, not the app." The same technique catches hung mysql clients waiting on a lock, Postgres clients waiting on a slow query, and Node.js processes stuck in epoll_wait when their event loop has a bad descriptor.

Benchmarks: What Process Operations Actually Cost

Order-of-magnitude numbers on a modern Linux 6.8 kernel running on a recent x86-64 server. Use these to sanity-check performance assumptions.

| Operation | Typical latency | Notes |
|---|---|---|
| Context switch between processes | 1-3 μs | Cache-hot. Cold cache can hit 10 μs. |
| Context switch between threads of one process | 0.5-1.5 μs | Shared address space skips TLB flush. |
| fork() on a 10 MB process | 30-80 μs | Dominated by page-table copy. |
| fork() on a 4 GB process | 400 μs - 1.5 ms | Redis RDB saves live here. |
| posix_spawn() to run a new binary | 40-120 μs | Avoids parent's page-table copy entirely. |
| Signal delivery (kill()) | 1-5 μs | Cheap. Fine to send thousands per second. |
| exec() on a small static binary | ~300 μs | Dominated by ELF loading + linker. |
| exec() on a Python 3.12 interpreter | 20-60 ms | Interpreter startup dominates. Reuse processes. |
| clone() to create a new thread | 10-30 μs | glibc's pthread_create adds stack allocation. |

Perf rule of thumb: If your app spawns a subprocess per request, you are paying at least 40-120 μs per request just for the spawn, plus the exec'd binary's startup. With interpreter startup at 20-60 ms, a Python subprocess per request caps throughput at roughly 15-50 requests/second/core -- not because of your code, but because of interpreter startup. Use a persistent worker pool or a library binding instead.
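Numbers like these vary a lot between machines, so it is worth measuring your own. A rough sketch follows, assuming a helper named avg_fork_wait_us (my name). Note what it measures: a full fork() + _exit() + waitpid() round trip, which is an upper bound on fork() itself rather than the exact figure in the table.

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average cost in microseconds of one
 * fork() + _exit() + waitpid() round trip.
 * Returns a negative value if fork() or the clock fails. */
double avg_fork_wait_us(int iterations) {
    struct timespec start, end;
    if (iterations <= 0 || clock_gettime(CLOCK_MONOTONIC, &start) < 0)
        return -1.0;

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0);           /* child: leave at once, skip stdio flushing */
        waitpid(pid, NULL, 0);  /* parent: reap, so no zombies pile up */
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) < 0)
        return -1.0;
    double ns = (end.tv_sec - start.tv_sec) * 1e9
              + (end.tv_nsec - start.tv_nsec);
    return ns / 1e3 / iterations;
}
```

Run it once in a tiny test program and once from inside your real service after it has warmed up and allocated its heap. The gap between the two results is the page-table copy the table is talking about.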

Failure Modes: Five Real Incidents and Their Root Causes

The PID exhaustion outage. A poorly-written Go worker forked children without reaping, hit the kernel's default pid_max = 32768, and the whole host stopped being able to create processes. Symptoms: fork: Resource temporarily unavailable in ssh, cron, and every service. Fix: kill the parent, raise kernel.pid_max to 4,194,304 via sysctl, add a SIGCHLD handler to the worker.

The zombie that survived PID 1 adoption. In a Docker container, the app process was PID 1. It spawned children but had no SIGCHLD handler. Zombies piled up inside the container until it hit the cgroup PID limit and refused new subprocesses. Fix: run Docker with --init, or use tini/dumb-init as PID 1 inside the container. An application that runs as PID 1 takes on init's job of reaping children, along with PID 1's special signal handling, and most applications are written for neither. That is a real footgun.

The "won't die" Node.js process. Sent SIGTERM. Nothing happened. Sent SIGKILL. Nothing happened. The process was stuck in D state on a stalled fsync() to a failing NVMe drive. After the drive was replaced and the stalled I/O finally returned, the pending SIGKILL was delivered and the process exited.

The runaway fork bomb from a cron job. A misconfigured shell script had while true; do ... & done. Single-user ulimit on processes was unlimited. Fixed by setting LimitNPROC=512 in the systemd unit and running the job as a dedicated user with ulimit -u.
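LimitNPROC= in a unit file and ulimit -u in a shell both set the same kernel knob, RLIMIT_NPROC, and a launcher can set it directly before starting risky work. A minimal sketch, assuming a helper named cap_process_count (my name). Two caveats apply: the limit counts all processes owned by the same user, not just this process's descendants, and the kernel does not enforce it for root.

```c
#define _GNU_SOURCE  /* expose POSIX/Linux APIs under strict -std= modes */
#include <sys/resource.h>

/* Hypothetical helper: lower this process's soft RLIMIT_NPROC to at most
 * max_procs. Children inherit the limit, so a runaway loop starts getting
 * EAGAIN from fork() instead of taking the host down.
 * Returns the soft limit now in effect, or -1 on error. */
long cap_process_count(rlim_t max_procs) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_NPROC, &rl) < 0)
        return -1;

    /* Only ever lower the limit. RLIM_INFINITY is the largest rlim_t
     * value on Linux, so "unlimited" also takes this branch. */
    if (rl.rlim_cur > max_procs)
        rl.rlim_cur = max_procs;

    if (setrlimit(RLIMIT_NPROC, &rl) < 0)
        return -1;
    return (long)rl.rlim_cur;
}
```

Calling cap_process_count(512) before exec'ing the job mirrors the LimitNPROC=512 setting from the incident. The hard limit is left alone here, which means the job could raise its soft limit again; lower rl.rlim_max as well if that matters.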

The OOM killer's wrong victim. The machine was OOM-killing Postgres instead of the Java app with a 38 GB heap. Cause: the JVM had oom_score_adj biased negative by a platform script. Fix: explicitly set OOMScoreAdjust=+500 in the JVM's systemd unit so it becomes the preferred victim during OOM events.

Frequently Asked Questions

What is the difference between fork() and exec()?

fork() creates a copy of the current process with a new PID. exec() replaces the current process's program with a new one without changing the PID. They're almost always used together: fork() to create a child, then exec() in the child to run a different program. The parent continues running its original code.

Why can't I kill a process with kill -9?

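If kill -9 has no visible effect, the process is usually in uninterruptible sleep (D state), waiting for I/O that hasn't completed. The kernel won't deliver SIGKILL until the I/O finishes. Common causes include dead NFS mounts, failing disks, or hung kernel drivers. Fix the underlying I/O issue to unstick the process. The other common case is a zombie (Z state): it is already dead, so there is nothing left to kill, and it disappears only when its parent calls wait().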

What causes zombie processes and are they harmful?

Zombies occur when a child process exits but its parent hasn't called wait() to collect the exit status. Individual zombies are harmless -- they use no CPU or memory. The risk is a buggy parent that creates thousands of zombies, exhausting the PID table. Fix the parent's code to properly reap children.

What is PID 1 and why is it special?

PID 1 is the init process (systemd on modern systems). The kernel starts it first, and it becomes the adoptive parent of any orphaned process. PID 1 has a special property: the kernel won't send it signals it hasn't explicitly registered handlers for, so you can't accidentally kill it with SIGTERM or SIGKILL.

How do I find which process is using a specific port?

Use ss -tlnp | grep :8080 or lsof -i :8080. Both show the PID and process name. ss is faster and available on all modern Linux systems. lsof gives more detail but may need to be installed separately. On older systems, netstat -tlnp works too.

What happens to child processes when the parent dies?

Orphaned children are reparented to PID 1 (systemd/init), which adopts them and will call wait() when they exit. The children keep running normally. This is why background daemons often fork twice and have the intermediate parent exit -- it makes PID 1 the parent, which handles cleanup correctly.

How do I run a process that survives SSH disconnection?

Use nohup command &, tmux, or screen. nohup makes the process ignore SIGHUP (which is sent when the terminal closes). tmux and screen are better because they let you reattach to the session later. For persistent services, use a systemd unit file instead.

Conclusion

The Linux process model is built on a small set of primitives -- fork(), exec(), wait(), and signals -- that compose into everything from shell pipelines to container runtimes. Knowing these primitives lets you reason about why a process is stuck, where zombies come from, and how to debug misbehaving services. Start by exploring /proc on a running system, use strace to watch system calls, and build intuition about parent-child relationships with pstree. These tools will save you hours the next time a process refuses to behave.

Written by

Abhishek Patel

Infrastructure engineer with 10+ years building production systems on AWS, GCP, and bare metal. Writes practical guides on cloud architecture, containers, networking, and Linux for developers who want to understand how things actually work under the hood.
