Linux Process Management Explained: ps, top, kill, and Beyond
Learn how Linux processes work, how to inspect them with ps and top, and how to control them with signals so your servers stay responsive.
What you'll learn
- What a Linux process really is
- How PIDs and parent-child relationships work
- How to list and filter processes with ps
- How to monitor live load with top and htop
- How to send signals safely with kill
Prerequisites
- A terminal you can run commands in
What and Why
A process is a running instance of a program. Every command you type in a shell becomes a process: it gets a numeric Process ID (PID), an owner, a working directory, open files, and a small slice of memory. The kernel schedules processes onto CPUs, suspends them when they wait for I/O, and reaps them when they exit.
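Most of these attributes can be read straight from /proc. The following is a minimal sketch, assuming a Linux system with /proc mounted, that prints a few of them for the shell running the snippet:

```shell
# Inspect the shell that runs this snippet; $$ expands to its PID.
pid=$$
owner_uid=$(id -u)                       # numeric ID of the owning user
cwd=$(readlink "/proc/$pid/cwd")         # working directory as the kernel sees it
fd_count=$(ls "/proc/$pid/fd" | wc -l)   # number of open file descriptors

echo "pid=$pid uid=$owner_uid cwd=$cwd open_fds=$fd_count"
```

Replacing $$ with another PID you own shows the same details for that process.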
Knowing how to inspect and control processes is the difference between guessing why a server is slow and actually fixing it. When a Node app pegs a CPU, when a stuck rsync blocks a deploy, or when a zombie process clutters your tree, you reach for the tools below.
Mental Model
Linux starts a single user-space process at boot called init (today usually systemd), which gets PID 1. Every other user-space process is a descendant of PID 1; kernel threads descend from kthreadd (PID 2) instead. When process A starts process B, A is the parent and B is the child. Children inherit environment variables, open file descriptors, and the current directory.
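Inheritance is easy to observe from a shell. In this small sketch, APP_MODE and SCRATCH are hypothetical variable names chosen for the demonstration; only the exported one should reach the child:

```shell
# APP_MODE and SCRATCH are made-up names used only for this demonstration.
export APP_MODE="staging"     # exported: part of the environment
SCRATCH="not exported"        # plain shell variable: stays in this shell

# Each sh -c starts a child process of the current shell.
child_mode=$(sh -c 'printf "%s" "$APP_MODE"')
child_scratch=$(sh -c 'printf "%s" "${SCRATCH:-<unset>}"')
child_cwd=$(sh -c 'pwd -P')

echo "child saw APP_MODE=$child_mode SCRATCH=$child_scratch cwd=$child_cwd"
```

The child sees the exported variable and the same working directory, while the unexported shell variable never leaves the parent.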
systemd (PID 1)
|-- sshd
|   '-- bash (your login shell)
|       '-- vim
|-- nginx
|   |-- nginx worker
|   '-- nginx worker
'-- cron
    '-- backup.sh

When a child exits, the kernel keeps a tiny record (the exit status) until the parent reads it with wait(). If the parent never reads it, the child becomes a zombie: dead but still in the table. If the parent dies first, the orphan is adopted by PID 1.
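The shell exposes this collection step through its wait builtin. A minimal sketch, using an arbitrary exit status of 7 so the value is easy to recognise:

```shell
# Start a child that exits with a known status (7 is arbitrary).
sh -c 'exit 7' &
child=$!

# wait blocks until the child has exited and hands back its exit status.
# Once the status has been collected, the child no longer lingers as a zombie.
status=0
wait "$child" || status=$?

echo "child $child exited with status $status"
```

A parent that never performs this step is what leaves zombies behind.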
Hands-on Example
Open a terminal and start a long-running process in the background:
sleep 300 &
The shell prints something like [1] 48211: 1 is the job number and 48211 is the PID. Substitute your own PID in the commands below. Now inspect it:
ps -p 48211 -o pid,ppid,user,stat,cmd
STAT is the state code: R running, S sleeping, D uninterruptible sleep (often disk I/O), Z zombie, T stopped. The + suffix means foreground in a terminal.
List every process on the machine with a forest view:
ps -ef --forest | less
ps auxf | head -40
Both work; aux is the BSD style and -ef is the System V style. Filter by name:
pgrep -a sshd
pgrep -fl node
For a live view, use top (installed by default on most distributions) or htop (friendlier, usually a separate package). Press P in top to sort by CPU, M to sort by memory, 1 to toggle per-core stats, and k to send a signal to a process.
Now send a signal to your sleep:
kill -TERM 48211 # polite request to terminate
kill -KILL 48211 # forced; cannot be caught or ignored
kill -HUP "$(pgrep -o nginx)" # reload config; -o picks the oldest match, normally the master process
SIGTERM (15) asks nicely. SIGKILL (9) is the hammer: the kernel terminates the process without giving it a chance to flush buffers. Always try TERM first. SIGHUP (1) is conventionally used to ask daemons to reload their configuration.
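The difference shows up clearly when a process installs a handler. The sketch below uses a hypothetical worker function that writes a line to a temporary log when it is asked to terminate; it is an illustration under those assumptions, not a template for real daemons:

```shell
log=$(mktemp)

# Hypothetical worker: loops forever, but cleans up when asked to terminate.
worker() {
    trap 'echo "cleanup ran" >> "$log"; exit 0' TERM
    while :; do sleep 0.1; done
}

# Case 1: SIGTERM reaches the handler, so the cleanup line is written.
worker &
pid=$!
sleep 0.5                      # give the worker time to install its trap
kill -TERM "$pid"
term_status=0
wait "$pid" || term_status=$?
after_term=$(wc -l < "$log")

# Case 2: SIGKILL never reaches the handler, so the log does not grow.
worker &
pid=$!
sleep 0.5
kill -KILL "$pid"
kill_status=0
wait "$pid" || kill_status=$?
after_kill=$(wc -l < "$log")

echo "TERM: exit status $term_status, cleanup lines so far: $after_term"
echo "KILL: exit status $kill_status, cleanup lines so far: $after_kill"
rm -f "$log"
```

The second worker reports an exit status of 128 plus the signal number, which is how the shell indicates death by signal.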
To send a signal to every matching process:
pkill -TERM -f "node server.js"
killall -USR1 nginx # nginx reopens its log files on USR1
You can also pause and resume processes from a terminal: Ctrl+Z sends SIGTSTP (stop), bg resumes in the background, fg brings it back, and jobs lists jobs in the current shell.
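Ctrl+Z only works interactively, but the same stop and resume cycle can be driven with kill. This sketch assumes /proc is available and uses SIGSTOP, the uncatchable sibling of SIGTSTP, because it behaves the same inside scripts; proc_state is a helper name introduced here:

```shell
# Read the one-letter state code for a PID from /proc/PID/stat.
proc_state() {
    sed 's/^.*) //' "/proc/$1/stat" | cut -c1
}

sleep 60 &
pid=$!
sleep 0.2
state_before=$(proc_state "$pid")     # normally S: sleeping on a timer

kill -STOP "$pid"                     # pause the process
sleep 0.2
state_stopped=$(proc_state "$pid")    # T: stopped

kill -CONT "$pid"                     # the signal bg and fg use to resume a job
sleep 0.2
state_resumed=$(proc_state "$pid")

echo "before=$state_before stopped=$state_stopped resumed=$state_resumed"

kill -TERM "$pid"                     # clean up the demonstration process
wait "$pid" 2>/dev/null || true
```

The state letters are the same codes that ps shows in its STAT column.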
Common Pitfalls
Reaching for kill -9 first. It robs the process of any chance to clean up: open files may be left corrupt, database connections leak, and child processes can be orphaned. Try TERM and wait a few seconds before escalating.
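One way to make that habit automatic is a small helper that escalates only after a grace period. The function names is_alive and stop_gracefully are introduced here for illustration, and the sketch assumes a Linux /proc filesystem:

```shell
# True while the PID exists and is not a zombie (a zombie still answers kill -0).
is_alive() {
    kill -0 "$1" 2>/dev/null || return 1
    state=$(sed 's/^.*) //' "/proc/$1/stat" 2>/dev/null | cut -c1)
    [ "$state" != "Z" ]
}

# Send TERM, poll for up to $2 seconds (default 5), then fall back to KILL.
stop_gracefully() {
    target=$1
    grace=${2:-5}
    kill -TERM "$target" 2>/dev/null || return 0   # nothing to stop, or not permitted
    waited=0
    while is_alive "$target"; do
        if [ "$waited" -ge "$grace" ]; then
            echo "PID $target ignored TERM for ${grace}s, sending KILL" >&2
            kill -KILL "$target" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Demonstration on a throwaway process.
sleep 60 &
demo_pid=$!
stop_gracefully "$demo_pid" 3
echo "PID $demo_pid stopped"
```

Polling once per second keeps the helper simple; a shorter interval would return sooner at the cost of more wakeups.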
Confusing high load average with high CPU. Load includes processes in D state waiting on disk. A box with load 20 and idle CPUs is usually I/O bound, not CPU bound. Check iostat -xz 1 or vmstat 1.
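A quick way to test that hypothesis is to put the load average next to the CPU count and the number of processes in D state. This is a rough sketch that reads /proc directly; it takes a single snapshot, so run it a few times before drawing conclusions:

```shell
# /proc/loadavg starts with the 1, 5 and 15 minute load averages.
read -r load1 load5 load15 rest < /proc/loadavg
cpus=$(nproc)

# Count processes currently in uninterruptible sleep (state D).
d_count=0
for stat_file in /proc/[0-9]*/stat; do
    state=$(sed 's/^.*) //' "$stat_file" 2>/dev/null | cut -c1)
    if [ "$state" = "D" ]; then
        d_count=$((d_count + 1))
    fi
done

echo "load1=$load1 cpus=$cpus processes_in_D=$d_count"
```

A load well above the CPU count together with a persistently non-zero D count points toward I/O rather than CPU.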
Misreading ps memory columns. VSZ is virtual size (address space reserved), not actual usage. RSS is resident set size, the real RAM in use right now. Sum of RSS across processes overcounts shared libraries.
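The two figures can be compared for a single process by reading /proc/PID/status, where VmSize and VmRSS should correspond to the VSZ and RSS columns of ps. A minimal sketch for the current shell:

```shell
pid=$$
vsz_kb=$(awk '/^VmSize:/ {print $2}' "/proc/$pid/status")   # reserved address space
rss_kb=$(awk '/^VmRSS:/ {print $2}' "/proc/$pid/status")    # pages resident in RAM

echo "pid=$pid VSZ=${vsz_kb} kB RSS=${rss_kb} kB"
```

For most processes the virtual size is several times larger than the resident size.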
Forgetting that child processes survive a closed SSH session only if they were detached. Use nohup, disown, tmux, or a proper systemd unit for anything long-lived.
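The effect of nohup can be checked without closing a terminal by sending the hangup signal by hand. In this sketch, sleep 30 stands in for a real long-running job:

```shell
# sleep 30 stands in for a long-running job; nohup starts it with SIGHUP ignored.
nohup sleep 30 >/dev/null 2>&1 &
job=$!
sleep 0.5                      # let nohup exec the command

kill -HUP "$job"               # simulate the hangup sent when a terminal closes
sleep 0.2
if kill -0 "$job" 2>/dev/null; then
    survived=yes
else
    survived=no
fi
echo "survived SIGHUP: $survived"

kill -TERM "$job"              # clean up the demonstration job
wait "$job" 2>/dev/null || true
```

Without nohup, the same kill -HUP would end the sleep immediately.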
Killing zombies directly. You cannot kill a zombie; it is already dead. Fix the parent so it calls wait(), or restart the parent.
Practical Tips
Use ps -eo pid,etime,user,cmd --sort=-etime | head to find the oldest processes; long-lived runaway scripts often hide there.
Combine pgrep with xargs for batch actions: pgrep -f stuck-worker | xargs -r renice +10.
pidstat -urd 1 (from sysstat) shows per-process CPU, memory, and I/O over time without the flicker of top; plain pidstat 1 reports CPU only.
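For a tree view without ps, run pstree -p to show the whole tree with PIDs, pstree -ap PID to show a process and its descendants with their arguments, or pstree -sp PID to show the ancestors that spawned it.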
When something is wedged in D state, check cat /proc/PID/wchan and, as root, cat /proc/PID/stack to see which kernel function it is stuck in.
Wrap-up
Linux process management boils down to three skills: listing what is running, understanding the parent-child tree, and sending the right signal at the right time. Start with ps, pgrep, and top, escalate to htop and pidstat, and reserve kill -9 for genuine emergencies. With those habits you can diagnose load spikes and stuck jobs without rebooting.