Chapter 2: Linux Filesystem Deep Dive

Linux's "everything is a file" philosophy means that understanding the filesystem gives you insight into the behavior of the entire system. This chapter starts with the FHS directory structure, digs into inode internals, explains the essential difference between hard and symbolic links, and then covers advanced find usage and disk usage analysis: the knowledge base every shell expert needs.

2.1 FHS: Filesystem Hierarchy Standard

The FHS (Filesystem Hierarchy Standard) defines the standard directory structure for Linux systems, ensuring consistency across different distributions. Understanding each directory's purpose is fundamental to system administration.

| Directory | Full Name / Purpose | Typical Contents |
| --- | --- | --- |
| / | Root | Starting point for all file paths |
| /bin | Essential User Binaries | ls, cp, mv, cat, bash (often a symlink to /usr/bin on modern systems) |
| /sbin | System Admin Binaries | fdisk, ifconfig, fsck, init (commands that typically require root) |
| /usr | Unix System Resources | /usr/bin (user programs), /usr/lib (libraries), /usr/share (shared data) |
| /etc | System Configuration | /etc/passwd, /etc/fstab, /etc/nginx/, /etc/ssh/ (all system-level configs) |
| /var | Variable Data | /var/log (logs), /var/spool (queues), /var/cache (cache), /var/www (web root) |
| /tmp | Temporary Files (typically cleared on reboot) | World-writable, often mounted as tmpfs (RAM) |
| /proc | Process Info Virtual FS | /proc/cpuinfo, /proc/meminfo, /proc/[PID]/ (kernel data interface) |
| /sys | System Device Virtual FS | Kernel-exported device/driver/power interfaces (sysfs) |
| /dev | Device Files | /dev/sda (disk), /dev/null, /dev/zero, /dev/tty (terminal) |
| /home | User Home Directories | /home/alice/, /home/bob/ (each user's personal space) |
| /root | root User's Home | Only accessible by root, not under /home |
| /opt | Optional Third-party Software | Manually installed large packages (e.g., /opt/google/chrome) |
| /srv | Service Data | Web/FTP service data directories |
| /boot | Boot Loader Files | vmlinuz, initrd.img, grub/ (kernel and GRUB config) |
| /lib | Essential Shared Libraries | /lib/x86_64-linux-gnu/libc.so.6 (C stdlib etc.) |
| /mnt | Temporary Mount Points | Convention for admins to temporarily mount devices |
| /media | Removable Media Mount Points | Auto-mounted USB/DVD drives |

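# Inspect the top-level directory layout
ls -la /

# Show the size of each top-level directory (largest first)
du -sh /* 2>/dev/null | sort -hr | head -20

# Show currently mounted filesystems
findmnt
# or
mount | column -t

# Show block devices, partitions, and filesystem types
lsblk -f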
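
The table notes that /bin is often a symlink to /usr/bin on modern systems. The following is a minimal sketch for checking this on your own machine; the helper name describe_dir is hypothetical and exists only for this illustration.

```shell
# describe_dir is a hypothetical helper, not a standard command.
# It reports whether a path is a symlink (as in a merged /usr layout),
# a real directory, or missing.
describe_dir() {
    if [ -L "$1" ]; then
        printf '%s -> %s\n' "$1" "$(readlink "$1")"
    elif [ -d "$1" ]; then
        printf '%s is a real directory\n' "$1"
    else
        printf '%s does not exist\n' "$1"
    fi
}

for d in /bin /sbin /lib; do
    describe_dir "$d"
done
```

The symlink test comes first because [ -d ] follows symlinks and would report a linked /bin as an ordinary directory.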

2.2 inode Internals: A File's True Identity

In Linux filesystems, a filename is just a label pointing to an inode. The inode (index node) stores the file's metadata and the location of its data blocks. Every file, including every directory, has an inode number that is unique within its filesystem.

What an inode Stores (ASCII Structure Diagram)

┌─────────────────────────────────────────────┐
│                   inode                     │
├─────────────────────────────────────────────┤
│  inode number    : 2097152                  │
│  file type       : regular file (-)         │
│  permissions     : 0644 (rw-r--r--)         │
│  link count      : 1  (hard link count)     │
│  owner UID       : 1000 (alice)             │
│  group GID       : 1000 (alice)             │
│  file size       : 4096 bytes               │
│  atime           : 2026-04-25 10:00:00      │  ← last access time
│  mtime           : 2026-04-20 14:30:00      │  ← last modification time (content)
│  ctime           : 2026-04-21 09:00:00      │  ← last change time (metadata)
│  block size      : 4096                     │
│  block count     : 8                        │
│  direct blocks   : [ptr0][ptr1]...[ptr11]   │  ← direct block pointers (12)
│  single indirect : [ptr → block table]      │  ← single indirect block
│  double indirect : [ptr → ptr → blocks]     │  ← double indirect block
│  triple indirect : [ptr → ptr → ptr → ...]  │  ← triple indirect block
└─────────────────────────────────────────────┘
         ↑
         A directory entry (dentry) stores: filename → inode number
         The filename itself is not stored in the inode!
         (The block pointers drawn here follow the classic ext2/ext3 layout; ext4 uses extents by default.)

Key Insight: The inode does not store the filename — the filename lives in directory entries (dentries), which map names to inode numbers. This is why a file can have multiple names (hard links), and why moving a file (mv) within the same filesystem is instant — only the directory entry changes, no data moves.

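# Show a file's inode number
ls -i filename.txt
# 2097152 filename.txt

# Show inode numbers for everything in a directory
ls -li /etc/

# Show inode usage (total and used inodes per filesystem)
df -i
# Filesystem       Inodes  IUsed  IFree IUse% Mounted on
# /dev/sda1      6553600  87234 6466366    2% /

# Detect inode exhaustion (inodes are full while disk space remains)
df -i | awk '$5 == "100%"'

# Find files with a given inode number
# (inode numbers are unique only per filesystem; add -xdev to stay on one)
find / -inum 2097152 2>/dev/null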
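
The claim that mv within one filesystem only rewrites the directory entry can be checked directly. This is a small sketch that assumes GNU stat (for the -c option) and works in a throwaway directory created by mktemp; the variable names are illustrative.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
echo "hello" > "$workdir/a.txt"

inode_before=$(stat -c '%i' "$workdir/a.txt")
mv "$workdir/a.txt" "$workdir/b.txt"     # rename within the same filesystem
inode_after=$(stat -c '%i' "$workdir/b.txt")

echo "before: $inode_before  after: $inode_after"
if [ "$inode_before" = "$inode_after" ]; then
    echo "same inode: only the directory entry changed"
fi
rm -rf "$workdir"
```

Moving the file to a different filesystem would instead copy the data into a new inode and delete the old one, so the numbers would generally differ.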

2.3 The stat Command: Reading File Metadata

The stat command reads and displays a file's inode information directly — the best tool for understanding a file's true state.

$ stat /etc/passwd
  File: /etc/passwd
  Size: 2847            Blocks: 8          IO Block: 4096   regular file
Device: fd01h/64769d    Inode: 524299      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
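Access: 2026-04-25 08:12:33.421076800 +0800   ← atime: last read
Modify: 2026-04-20 14:22:10.123456789 +0800   ← mtime: last content modification
Change: 2026-04-20 14:22:10.123456789 +0800   ← ctime: last inode (metadata) change

# How the three timestamps differ:
# atime (access time)  → updated when the file is read
#                        Note: most systems mount with relatime by default, so atime is
#                        not updated on every read; the noatime option disables updates
# mtime (modify time)  → updated when file content changes (data is written)
# ctime (change time)  → updated when the inode changes (permissions, owner, link count),
#                        and also whenever the content changes
#                        Note: ctime cannot be set to an arbitrary value; touch itself
#                        updates ctime to the current time as a side effect

# Show only specific fields
stat -c "%n: inode=%i, size=%s, mtime=%y" /etc/passwd

# Custom formatted output
stat --format="File: %n | Size: %s bytes | Permissions: %A | Inode: %i" /etc/passwd

# stat a directory
stat /var/log/

# Change atime and mtime with touch
touch -a file.txt          # update atime only
touch -m file.txt          # update mtime only
touch -t 202601010000 file.txt  # set atime and mtime to 2026-01-01 00:00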
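
The difference between mtime and ctime is easiest to see with a metadata-only change. The sketch below assumes GNU stat, where %Y and %Z print mtime and ctime as seconds since the epoch; the file and variable names are illustrative.

```shell
workdir=$(mktemp -d)
f="$workdir/demo.txt"
echo "data" > "$f"

mtime_before=$(stat -c '%Y' "$f")   # mtime, seconds since the epoch
ctime_before=$(stat -c '%Z' "$f")   # ctime, seconds since the epoch

sleep 2                              # make the timestamp difference visible
chmod 600 "$f"                       # metadata-only change: content is untouched

mtime_after=$(stat -c '%Y' "$f")
ctime_after=$(stat -c '%Z' "$f")

echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"
rm -rf "$workdir"
```

chmod modifies the inode but not the data, so mtime stays the same while ctime advances.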

2.4 Linux File Types Explained

The first character of ls -l output indicates the file type. Linux has 7 file types:

# File type indicators (first character of ls -l output):
# -  regular file: text, binaries, images, etc.
# d  directory
# l  symbolic link
# c  character device: /dev/tty, /dev/null (byte-stream I/O)
# b  block device: /dev/sda, /dev/loop0 (block I/O, e.g. disks)
# p  named pipe (FIFO): inter-process communication
# s  Unix domain socket: /run/docker.sock

# Examples (-d lists /tmp itself rather than its contents):
ls -ld /dev/null /dev/sda /tmp /run/docker.sock
# crw-rw-rw-  1 root root 1, 3 ... /dev/null     ← c: character device
# brw-rw----  1 root disk 8, 0 ... /dev/sda      ← b: block device
# drwxrwxrwt 20 root root ...    /tmp             ← d: directory
# srw-rw----  1 root docker ...  /run/docker.sock ← s: socket

# Identify a file's real type with file (does not rely on the extension)
file /bin/ls
# /bin/ls: ELF 64-bit LSB pie executable, x86-64 ...

file /etc/passwd
# /etc/passwd: ASCII text

file /dev/sda
# /dev/sda: block special (8/0)

# Search by file type with find
find /dev -type c     # all character devices
find /tmp -type p     # all named pipes
find /run -type s     # all sockets

# Create a named pipe
mkfifo /tmp/mypipe
ls -l /tmp/mypipe
# prw-r--r-- 1 user user 0 ... /tmp/mypipe      ← p: pipe
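
Inside scripts, the same seven types can be detected with the shell's test operators instead of parsing ls output. This is a minimal sketch; file_type_char is a hypothetical helper name, not a standard command.

```shell
# file_type_char is a hypothetical helper for illustration.
# It prints the type character that ls -l would show in its first column.
file_type_char() {
    if   [ -L "$1" ]; then echo "l"   # test symlinks first: the other tests follow links
    elif [ -d "$1" ]; then echo "d"
    elif [ -c "$1" ]; then echo "c"
    elif [ -b "$1" ]; then echo "b"
    elif [ -p "$1" ]; then echo "p"
    elif [ -S "$1" ]; then echo "s"
    elif [ -f "$1" ]; then echo "-"
    else echo "?"                      # missing path or unknown type
    fi
}

file_type_char /dev/null    # prints c
file_type_char /            # prints d
```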

2.5 Hard Links vs Symbolic Links

Links are a core Linux filesystem concept. Understanding the difference between hard links and symbolic links is the best practical application of inode principles.

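# === Hard links ===
# Create a hard link: two filenames pointing to the same inode
ln original.txt hardlink.txt

# Verify: both names share one inode number
ls -li original.txt hardlink.txt
# 2097152 -rw-r--r-- 2 user user 100 ... original.txt
# 2097152 -rw-r--r-- 2 user user 100 ... hardlink.txt
#   ↑ same inode         ↑ link count is 2

# Delete the original name: the data survives (the link count drops by 1,
# and the data is freed only when the count reaches 0)
rm original.txt
cat hardlink.txt  # still readable!

# Hard link limitations:
# 1. Cannot cross filesystems (inode numbers are unique only within one filesystem)
# 2. Cannot link to directories (prevents loops in the directory tree)

# === Symbolic links ===
# Create a symbolic link: it stores the target path as a string
ln -s /path/to/original.txt symlink.txt
ln -s /usr/bin/python3 /usr/local/bin/python  # common usage (writing to /usr/local/bin needs root)

# Verify: it has its own inode, type l, and shows its target
ls -li symlink.txt
# 2097153 lrwxrwxrwx 1 user user 21 ... symlink.txt -> /path/to/original.txt
# ↑ different inode ↑ type l       ↑ size = length of the stored path

# Symbolic links can cross filesystems and can point to directories
ln -s /mnt/data /home/user/data-link

# Things to watch for:
# - If the target is deleted, the symlink becomes a dangling link
# - A relative target is resolved relative to the directory containing the link,
#   not the current working directory

# === Related commands ===
# Show the path stored in a symlink
readlink symlink.txt
# /path/to/original.txt

# Resolve to a full absolute path (following all symlinks)
realpath symlink.txt

# Find all dangling symlinks
find /path -xtype l  # with GNU find, -xtype l matches symlinks whose target does not exist

# Show a file's hard link count
stat -c "%h" /bin/ls  # usually 1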

| Feature | Hard Link | Symbolic Link |
| --- | --- | --- |
| inode relationship | Shares same inode | Own inode, stores path string |
| Cross-filesystem | Not supported | Supported |
| Point to directory | Not supported (prevents loops) | Supported |
| After original deleted | Data still accessible | Becomes dangling link |
| ls -l display | Same as regular file (link count > 1) | l type, shows -> target |
| Common Use | Space-efficient backups (no extra inode or data blocks) | Version management, path aliases |
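
The "After original deleted" row can be reproduced in a few lines. This sketch assumes GNU stat and runs in a temporary directory; all file names are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "payload" > original.txt
ln original.txt hard.txt        # second name for the same inode
ln -s original.txt soft.txt     # separate inode storing the path "original.txt"

links_before=$(stat -c '%h' hard.txt)   # link count while both names exist
rm original.txt
links_after=$(stat -c '%h' hard.txt)    # link count after removing one name

hard_content=$(cat hard.txt)            # the data is still reachable
# [ -e ] follows the symlink, so it fails once the target is gone
if [ -e soft.txt ]; then soft_state="ok"; else soft_state="dangling"; fi

echo "link count: $links_before -> $links_after"
echo "hard link content: $hard_content"
echo "symlink state: $soft_state"

cd / && rm -rf "$workdir"
```

The link count drops from 2 to 1, the hard link still returns the payload, and the symbolic link is reported as dangling.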

2.6 Advanced find Usage

find is the standard Linux tool for searching a directory tree. It supports multi-condition searches by name, type, size, time, permissions, and more, and it can execute arbitrary commands on each file it finds.

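# === Search by name ===
find /home -name "*.log"           # all .log files
find /etc -name "*.conf" -type f   # regular files only
find / -name "passwd" 2>/dev/null  # suppress permission errors
find . -iname "*.TXT"              # case-insensitive

# === Search by type ===
find /var -type d -name "log*"     # directories
find /dev -type b                  # block devices
find /tmp -type l                  # all symbolic links
find /run -type s                  # Unix sockets

# === Search by size (GNU find rounds sizes up to whole units) ===
find /var/log -size +100M          # larger than 100 MiB
find /home -size -1k               # empty files only: any non-empty file rounds up to 1k
find /home -size -1024c            # smaller than 1 KiB (c = bytes)
find /tmp -size +10M -size -1024M  # between 10 MiB and 1 GiB (-1G would match only empty files)
find / -empty                      # empty files or empty directories

# === Search by time (unit: days) ===
find /var/log -mtime -7            # modified within the last 7 days
find /home -atime +30              # not accessed for more than 30 days
find /tmp -mtime +3 -type f        # temporary files older than 3 days
find /etc -newer /etc/passwd       # modified more recently than /etc/passwd
find /var/log -mmin -60            # modified within the last 60 minutes (-mmin uses minutes)

# === Search by permissions ===
find / -perm 0777 -type f 2>/dev/null  # mode exactly 777 (security risk!)
find / -perm -u+s -type f 2>/dev/null  # files with the SUID bit
find / -perm -g+s -type f 2>/dev/null  # files with the SGID bit
find /home -perm /o+w              # writable by others

# === Search by owner ===
find /home -user alice             # owned by alice
find /tmp -nouser                  # owner UID has no matching user (orphaned files)
find / -group docker 2>/dev/null   # owned by group docker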

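# === -exec: run a command on each result ===
# {} stands for the current file; \; terminates the -exec command
find /tmp -name "*.tmp" -mtime +7 -exec rm {} \;

# A safer variant of -exec: -execdir (runs the command in the file's own directory)
find /home -name "*.bak" -execdir rm {} \;

# Pipe results to xargs (more efficient, processes files in batches)
# -print0 with xargs -0 keeps filenames containing spaces or newlines intact
find /var/log -name "*.log" -mtime +30 -print0 | xargs -0 rm -f
find /home -name "*.jpg" -print0 | xargs -0 -I{} cp {} /backup/photos/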

# === Combining conditions ===
# -a (AND, the default), -o (OR), ! (NOT)
find /home -name "*.txt" -size +1M        # AND: .txt and larger than 1 MiB
find /home \( -name "*.txt" -o -name "*.md" \)  # OR: .txt or .md
find /home ! -name "*.log"                # NOT: anything except .log

# === Depth control ===
find /etc -maxdepth 1 -name "*.conf"      # search one level deep only
find /usr -mindepth 2 -maxdepth 3         # search levels 2 to 3 only

# === Practical combinations ===
# List files modified in the last 24 hours
find /var/www -mtime -1 -type f -ls

# Find files larger than 500 MiB, sorted by size
find / -type f -size +500M -printf "%s\t%p\n" 2>/dev/null | sort -rn | head -20

# Find SUID/SGID files (security audit)
find / -type f \( -perm -4000 -o -perm -2000 \) -exec ls -la {} \; 2>/dev/null

# Count files and directories
find /etc -type f | wc -l    # number of regular files
find /etc -type d | wc -l    # number of directories
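
When find output is piped to xargs, filenames containing spaces are a common source of bugs. The sketch below assumes GNU find and xargs and shows NUL-delimited hand-off in a temporary directory; the file names are made up for the demonstration.

```shell
workdir=$(mktemp -d)
touch "$workdir/plain.tmp" "$workdir/with space.tmp" "$workdir/keep.txt"

# -print0 and -0 delimit names with NUL bytes, so a space or newline
# inside a filename cannot split one name into several arguments.
find "$workdir" -type f -name '*.tmp' -print0 | xargs -0 rm -f

kept=$(ls "$workdir")
remaining=$(find "$workdir" -type f | wc -l | tr -d ' ')
echo "files left: $remaining ($kept)"
rm -rf "$workdir"
```

Only keep.txt survives. An alternative that avoids the pipe entirely is find ... -exec rm -f {} +, which also passes many files to each rm invocation.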

2.7 ls in Depth: Understanding Every Column

$ ls -la /etc/passwd
-rw-r--r-- 1 root root 2847 Apr 20 14:22 /etc/passwd
^ ^^^^^^^   ^ ^^^^ ^^^^ ^^^^ ^^^^^^^^^^^^ ^^^^^^^^^^^^
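│ │         │ │    │    │    │             └── filename
│ │         │ │    │    │    └── last modification time (mtime)
│ │         │ │    │    └── file size (bytes)
│ │         │ │    └── group
│ │         │ └── owner
│ │         └── hard link count
│ └── permission bits (rwxrwxrwx: owner/group/others)
└── file type (- d l c b p s)

# Common ls options
ls -l                  # long format
ls -la                 # include hidden files (names starting with .)
ls -lh                 # human-readable sizes (K, M, G)
ls -lS                 # sort by file size
ls -lt                 # sort by modification time (newest first)
ls -ltr                # reverse time sort (oldest first)
ls -li                 # show inode numbers
ls -lR                 # list subdirectories recursively
ls -ld /etc            # show the directory itself (not its contents)
ls --color=auto        # color by file type (enabled by default on most distributions)

# Custom color configuration
# dircolors -p > ~/.dircolors
# Edit ~/.dircolors, then add this to ~/.bashrc:
# eval "$(dircolors ~/.dircolors)"

# tree (a more visual view of directory structure)
sudo apt install -y tree   # Debian/Ubuntu; use your distribution's package manager otherwise
tree /etc/nginx -L 2   # show 2 levels deep
tree -sh               # show sizes in human-readable form
tree -d                # directories only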
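
Each column of ls -l comes from the inode, so the same fields can be requested individually from stat, which is more robust in scripts than parsing ls output. This is a small sketch assuming GNU stat; type_char is an illustrative variable name.

```shell
# Each ls -l column maps to a stat format sequence:
# %A type+permissions, %h hard links, %U owner, %G group, %s size in bytes, %n name
stat -c '%A %h %U %G %s %n' /etc/passwd

# Extract only the file type character (the first character of %A)
type_char=$(stat -c '%A' /etc/passwd | cut -c1)
echo "type: $type_char"
```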

2.8 Filesystem Types: ext4 vs xfs vs Virtual Filesystems

ext4 vs xfs Comparison

| Feature | ext4 | XFS |
| --- | --- | --- |
| Design Goal | General purpose, backward compatible with ext2/ext3 | High-performance large files, massive parallel I/O |
| Max Filesystem Size | 1 EiB | 8 EiB |
| Max Single File Size | 16 TiB (with 4 KiB blocks) | 8 EiB |
| Journaling | Metadata journaling by default (data=ordered); optional full data journaling (data=journal) | Metadata journaling only |
| Small File Performance | Good (inline small data in inode) | Slightly lower (optimized for large files) |
| Large File Performance | Good | Excellent (delayed allocation + striping) |
| fsck Speed | Slow (very slow on large filesystems) | Fast (repairs use xfs_repair) |
| Online Shrink | Not supported (offline shrink only, via resize2fs) | Not supported |
| Default On | Ubuntu, Debian | RHEL, CentOS, Fedora Server |
| Best For | General desktop/server, OS partition | Databases, video streaming, big data storage |

Virtual Filesystems

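# === tmpfs (in-memory filesystem) ===
# Data lives in RAM (and swap), vanishes on reboot, and is very fast
# /tmp is mounted as tmpfs on many distributions
mount | grep tmpfs
# tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=8192M)

# Create a tmpfs mount manually
sudo mkdir /mnt/ramdisk
sudo mount -t tmpfs -o size=512M tmpfs /mnt/ramdisk
df -h /mnt/ramdisk

# === /proc virtual filesystem (procfs) ===
# Files are generated by the kernel when read and take no disk space
cat /proc/cpuinfo            # CPU information
cat /proc/meminfo            # memory information
cat /proc/version            # kernel version
cat /proc/mounts             # current mounts
cat /proc/net/dev            # network interface statistics
cat /proc/loadavg            # system load
cat /proc/$$/status          # status of the current shell ($$ is its PID)
ls /proc/1/                  # everything about PID 1 (init/systemd)

# === /sys virtual filesystem (sysfs) ===
# Interface to the kernel device model; some files are writable to control hardware
cat /sys/class/net/eth0/speed          # NIC link speed (Mbit/s)
cat /sys/class/thermal/thermal_zone0/temp  # temperature in millidegrees Celsius (often the CPU)
cat /sys/block/sda/queue/scheduler    # disk I/O scheduler
# Change the scheduler (needs root; "sudo echo ... > file" fails because the
# redirection runs as the unprivileged user, so use tee)
echo mq-deadline | sudo tee /sys/block/sda/queue/scheduler

# Inspect filesystem types
cat /proc/filesystems        # filesystem types the kernel supports
mount -t ext4                # list only ext4 mounts
findmnt -t ext4,xfs          # filter findmnt by type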
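
Because /proc entries are plain text, scripts can read kernel state with ordinary text tools. The following is a minimal sketch, assuming a Linux kernel that reports MemTotal and MemAvailable in /proc/meminfo; the variable names are illustrative.

```shell
# MemTotal and MemAvailable are reported in kB in /proc/meminfo.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_avail_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)

# The 1-minute load average is the first field of /proc/loadavg.
load_1min=$(cut -d' ' -f1 /proc/loadavg)

# ${var:-0} guards against a field that is missing on unusual kernels.
echo "total: $(( ${mem_total_kb:-0} / 1024 )) MiB, available: $(( ${mem_avail_kb:-0} / 1024 )) MiB"
echo "load (1 min): $load_1min"
```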

2.9 du and df: Disk Usage Analysis

df (disk free) reports filesystem-level disk usage, while du (disk usage) reports actual space consumed by directories or files. Used together, they quickly locate disk space problems.

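# === df ===
df -h              # human-readable sizes (K/M/G, powers of 1024)
df -H              # human-readable sizes using powers of 1000
df -T              # show filesystem type
df -i              # show inode usage instead of disk space
df -h /var         # show only the filesystem containing /var

# Reading df output:
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1        50G   32G   16G  67% /
# tmpfs           3.9G  1.2M  3.9G   1% /tmp

# === du ===
du -sh /var/log         # total size of /var/log
du -sh /*               # size of each top-level directory (quickly spot large ones)
du -sh /home/*          # space used per user
du -h --max-depth=2 /   # recurse at most 2 levels
du -h --max-depth=1 /var | sort -hr  # sorted, largest first

# Find the 10 largest files/directories under the current directory
# (-a includes files; without it du lists directories only)
du -ah . | sort -hr | head -10

# Find the largest files on the system (top 20)
# (-F'\t' keeps paths containing spaces intact)
find / -type f -printf "%s\t%p\n" 2>/dev/null | sort -rn | head -20 | \
    awk -F'\t' '{printf "%.1fMB\t%s\n", $1/1024/1024, $2}'

# === ncdu: interactive disk usage analyzer ===
sudo apt install -y ncdu   # install (Debian/Ubuntu)
ncdu /                     # scan the root directory (interactive; navigate into subdirectories)
ncdu /var                  # scan a specific directory

# === Troubleshooting a full disk ===
# 1. Confirm which filesystem is full
df -h

# 2. Locate large directories
du -h --max-depth=1 / 2>/dev/null | sort -hr | head -10

# 3. Drill into the large directory
du -h --max-depth=1 /var | sort -hr

# 4. Find large files
find /var/log -type f -size +100M -ls 2>/dev/null

# 5. Check for files deleted but still held open by a process (common after log rotation)
lsof +L1 | grep deleted

# 6. Clean up old journal logs (journalctl)
journalctl --vacuum-size=200M    # keep the most recent 200 MB
journalctl --vacuum-time=7d      # keep the most recent 7 days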

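Disk Full But df Shows Space? Two common causes explain a mismatch between what df reports and what you can find. (1) Inode exhaustion: writes fail with "No space left on device" even though df -h shows free space, because too many small files have used up every inode; check with df -i. (2) Deleted files still held open by a process: df counts the space as used, but du cannot find the files; locate them with lsof +L1. Restarting the process (or rebooting) releases the space.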
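
The second cause can be reproduced without lsof. This sketch is Linux-specific and relies on /proc/<pid>/fd, where the kernel marks unlinked files with "(deleted)"; file descriptor 9 and the file names are arbitrary choices for the demonstration.

```shell
workdir=$(mktemp -d)
echo "log data" > "$workdir/app.log"

exec 9< "$workdir/app.log"      # this shell now holds the file open on fd 9
rm "$workdir/app.log"           # the name is gone, but the inode is not

# /proc/<pid>/fd/<n> still points at the file, marked "(deleted)"
fd_target=$(readlink "/proc/$$/fd/9")
still_readable=$(cat <&9)       # the data can still be read through the descriptor

exec 9<&-                       # closing the descriptor lets the kernel free the blocks
echo "$fd_target"
echo "content: $still_readable"
rm -rf "$workdir"
```

This is why restarting a process that holds a rotated log open frees the space: the kernel releases the blocks only when the last open descriptor closes.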

Chapter Summary: You now deeply understand Linux filesystem core mechanics: FHS directory standards define each directory's role, inode principles reveal that "filenames are just labels," the hard/soft link distinction lets you use both tools wisely, find's powerful combinations make any file search possible, and du/df keep you in control of disk state at all times. The next chapter focuses on mastering file operation commands.
