Chapter 8: Linux Storage and Disk Management
Storage is the foundation of server reliability. This chapter covers the Linux storage management stack, starting from block device basics: fdisk/parted for partitioning, mkfs for filesystem creation, mount/fstab for persistent mounting, LVM for online volume expansion, fsck/dd for repair and cloning, smartctl for predictive health monitoring, mdadm for software RAID, and iostat for performance analysis. Together these skills cover the most common disk capacity and storage failure scenarios.
8.1 Storage Basics: Block Devices and Inspection
In Linux everything is a file, including disks. Disks are exposed as block devices under /dev/. SATA/SAS disks are named /dev/sda, /dev/sdb; NVMe SSDs are named /dev/nvme0n1, /dev/nvme1n1. Partitions append a number suffix, e.g. /dev/sda1, /dev/nvme0n1p1.
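The naming rule above can be sketched as a small shell helper. The function name `part_name` is hypothetical and exists only for illustration; the rule it encodes (a `p` separator when the disk name ends in a digit) matches the NVMe example and also applies to other digit-terminated device names such as `mmcblk0`.

```shell
#!/usr/bin/env bash
# part_name DISK N: print the device node of partition N on DISK.
# Hypothetical helper for illustration only.
part_name() {
    local disk=$1 num=$2
    case $disk in
        *[0-9]) printf '%sp%s\n' "$disk" "$num" ;;  # nvme0n1 -> nvme0n1p1
        *)      printf '%s%s\n'  "$disk" "$num" ;;  # sda     -> sda1
    esac
}

part_name /dev/sda 1       # prints /dev/sda1
part_name /dev/nvme0n1 1   # prints /dev/nvme0n1p1
```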
# === lsblk: tree view of block devices ===
lsblk                                             # basic tree view
lsblk -f                                          # show filesystem type and UUID
lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT,UUID    # custom columns
# Sample output:
# NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
# sda           8:0    0  500G  0 disk
# ├─sda1        8:1    0    1G  0 part /boot
# ├─sda2        8:2    0   50G  0 part /
# └─sda3        8:3    0  449G  0 part /data
# nvme0n1     259:0    0  1.8T  0 disk
# └─nvme0n1p1 259:1    0  1.8T  0 part /fast
# === blkid: show UUIDs and filesystem types ===
sudo blkid              # all block devices
sudo blkid /dev/sda1    # a specific partition
# Sample output:
# /dev/sda1: UUID="a1b2c3d4-..." TYPE="ext4" PARTUUID="..."
# === df: disk space usage of mounted filesystems ===
df -h          # human-readable units
df -hT         # also show the filesystem type
df -h /home    # only the filesystem that contains /home
df -i          # inode usage (running out of inodes also blocks writes)
# === du: space used by directories and files ===
du -sh /var/log                   # total size of a directory
du -sh /var/log/*                 # size of each entry inside it
du -h --max-depth=1 /             # one level below the root directory
du -sh * | sort -rh | head -20    # 20 largest files/directories in the current directory
# === hdparm: detailed disk information ===
sudo hdparm -I /dev/sda    # model, firmware version, supported features
sudo hdparm -t /dev/sda    # sequential read timing (without prior caching)
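As a sketch of how the `df` output above can feed a simple capacity alert, the hypothetical function `over_threshold` below filters `df -P` output by usage percentage. It assumes the POSIX output format (usage in column 5, mount point in column 6) and that device names and mount points contain no spaces.

```shell
#!/usr/bin/env bash
# over_threshold LIMIT: read `df -P` output on stdin and print
# "mountpoint usage%" for every filesystem at or above LIMIT percent.
# Hypothetical helper; assumes no spaces in device names or mount points.
over_threshold() {
    local limit=$1
    awk -v limit="$limit" 'NR > 1 {
        use = $5; sub(/%/, "", use)          # "87%" -> 87
        if (use + 0 >= limit) print $6, $5
    }'
}

# Live usage: report filesystems that are at least 90% full.
df -P | over_threshold 90
```

The same filter could be pointed at `df -Pi` to watch inode usage instead of block usage.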
8.2 Partitioning: fdisk / parted / gdisk
Backup before partitioning: Partitioning is destructive. Wrong fdisk/parted commands can permanently destroy data. Before touching any production disk, verify the device name with `lsblk` and back up your data. Partitioning an active system disk requires booting from a Live CD/USB.
fdisk — Interactive MBR/GPT Partitioning
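# Start the interactive fdisk session
sudo fdisk /dev/sdb
# Common interactive commands:
# m   show help (all commands)
# p   print the current partition table
# n   create a new partition
#       choose p (primary) or e (extended); asked on MBR disks only
#       enter the partition number (1-4)
#       enter the first sector (press Enter for the default)
#       enter the size: +20G (20 GiB), +500M (500 MiB), or Enter (all remaining space)
# d   delete a partition
# t   change the partition type (MBR codes: 8e = Linux LVM, 82 = swap, 83 = Linux)
# w   write the partition table and exit (dangerous: cannot be undone)
# q   quit without saving
# Ask the kernel to re-read the partition table after partitioning
sudo partprobe /dev/sdb
# Print the partition table without entering interactive mode
sudo fdisk -l /dev/sdb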
parted — GPT Support, Scriptable
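# Create a GPT partition table (MBR cannot address more than 2 TiB with 512-byte sectors)
sudo parted /dev/sdb mklabel gpt
# Create partitions non-interactively (suitable for scripts; add -s to suppress prompts)
sudo parted /dev/sdb mkpart primary ext4 1MiB 21GiB    # start at 1 MiB for alignment (performance)
sudo parted /dev/sdb mkpart primary xfs 21GiB 100%     # all remaining space
# Show partition information
sudo parted /dev/sdb print
# Set a partition flag (for example, the LVM flag)
sudo parted /dev/sdb set 1 lvm on
# gdisk: GPT-only tool with an fdisk-like interactive interface
sudo gdisk /dev/sdb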
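The 2 TB limit that makes GPT necessary follows from arithmetic: MBR stores sector addresses in 32-bit fields, so the largest addressable size is 2^32 sectors times the sector size. A minimal sketch of that calculation, using a hypothetical helper named `mbr_limit_tib`:

```shell
#!/usr/bin/env bash
# mbr_limit_tib SECTOR_BYTES: largest disk size (in TiB) addressable by MBR,
# which stores sector addresses in 32-bit fields.
# Hypothetical helper for illustration only.
mbr_limit_tib() {
    local sector_bytes=$1
    echo $(( (1 << 32) * sector_bytes / 1024**4 ))
}

mbr_limit_tib 512    # prints 2
mbr_limit_tib 4096   # prints 16
```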
8.3 Filesystems: mkfs / tune2fs / xfs_info
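# === mkfs: format a partition (create a filesystem) ===
# ext4 (most common, best compatibility)
sudo mkfs.ext4 /dev/sdb1
sudo mkfs.ext4 -L "data-disk" /dev/sdb1    # -L sets the volume label
sudo mkfs.ext4 -m 1 /dev/sdb1              # -m sets the reserved-block percentage (default 5%; 1% suits data disks)
# XFS (high performance, large files; the RHEL/CentOS default)
sudo mkfs.xfs /dev/sdb2
sudo mkfs.xfs -L "fast-data" /dev/sdb2
# Btrfs (snapshots, compression, RAID)
sudo mkfs.btrfs /dev/sdb3
sudo mkfs.btrfs -L "btrfs-pool" -d raid1 /dev/sdb3 /dev/sdc3    # RAID1 for data across two devices
# FAT32 (USB drives, cross-platform compatibility)
sudo mkfs.fat -F32 /dev/sdb4
sudo mkfs.vfat /dev/sdb4    # alias for mkfs.fat; without -F the FAT size is chosen automatically
# === tune2fs: adjust ext4 filesystem parameters ===
sudo tune2fs -l /dev/sda1                # show filesystem information
sudo tune2fs -L "new-label" /dev/sda1    # change the volume label
sudo tune2fs -m 1 /dev/sda1              # reduce reserved blocks to 1% (frees space)
sudo tune2fs -i 0 -c 0 /dev/sda1         # disable periodic automatic fsck
# === xfs_info: show XFS filesystem information ===
sudo xfs_info /dev/sdb2
sudo xfs_info /mnt/data    # a mount point also works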
| Filesystem | Strengths | Best For | Max File/Volume |
|---|---|---|---|
| ext4 | Mature, best compatibility, rich tooling | General purpose, system partition | 16TB / 1EB |
| XFS | High-concurrency writes, large files, online grow | Databases, media, high throughput | 8EB / 8EB |
| Btrfs | Snapshots, transparent compression, built-in RAID, checksums | Desktop, container storage, NAS | 16EB / 16EB |
| FAT32 | Cross-platform (Windows/macOS/Linux) | USB drives, UEFI partition, embedded | 4GB / 2TB |
8.4 Mounting and /etc/fstab
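# === mount / umount ===
sudo mkdir -p /mnt/data /mnt/fast
sudo mount /dev/sdb1 /mnt/data                          # basic mount
sudo mount -t xfs /dev/sdb2 /mnt/fast                   # specify the filesystem type
sudo mount -o ro /dev/sdb1 /mnt/data                    # mount read-only
sudo mount -o remount,rw /mnt/data                      # remount read-write
sudo mount -o noatime,nodiratime /dev/sdb1 /mnt/data    # disable access-time updates (performance; noatime already implies nodiratime)
# List mounted filesystems
mount | grep sdb
findmnt              # tree view of mounts
findmnt /mnt/data    # a specific mount point
# Unmount
sudo umount /mnt/data
sudo umount /dev/sdb1
sudo umount -l /mnt/data    # -l lazy unmount (when the target is busy)
# === /etc/fstab: persistent mount configuration ===
# Format: device  mountpoint  fstype  options  dump  pass
# dump: 0 = skip, 1 = back up (largely obsolete)
# pass: 0 = no check, 1 = root filesystem, 2 = other filesystems (fsck order)
# Prefer UUIDs over names like /dev/sdb1 (device names can change when hardware changes)
sudo blkid /dev/sdb1    # look up the UUID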
/etc/fstab Complete Example
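# /etc/fstab: filesystem mount configuration
#
# Root filesystem (system disk, by UUID, ext4)
UUID=a1b2c3d4-1234-5678-abcd-ef0123456789 / ext4 errors=remount-ro 0 1
# EFI system partition
UUID=ABCD-EF01 /boot/efi vfat umask=0077 0 1
# Data disk (XFS; noatime skips access-time updates for better performance)
UUID=b2c3d4e5-2345-6789-bcde-f01234567890 /data xfs defaults,noatime 0 2
# Swap partition
UUID=c3d4e5f6-3456-789a-cdef-012345678901 none swap sw 0 0
# NFS remote mount (network filesystem)
192.168.1.100:/exports/shared /mnt/shared nfs defaults,_netdev 0 0
# _netdev tells the system to wait for the network before mounting
# tmpfs (in-memory filesystem, cleared on reboot, good for temporary files)
tmpfs /tmp tmpfs defaults,size=2G 0 0
# Validate fstab before rebooting (a bad entry can prevent the system from booting)
sudo findmnt --verify    # syntax and sanity check (recent util-linux versions)
sudo mount -a --fake     # dry run: does everything except the actual mount call
sudo mount -a            # mount every fstab entry not marked noauto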
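The six-field layout described above can also be checked mechanically before a reboot. The sketch below uses a hypothetical function `check_fstab`; it is deliberately stricter than `mount` itself, which tolerates omitted dump/pass fields, and it only accepts the pass values 0, 1, and 2 listed in this section.

```shell
#!/usr/bin/env bash
# check_fstab FILE: report non-comment lines that do not have exactly six
# fields or whose pass field is not 0, 1, or 2. Returns 1 if any are found.
# Hypothetical helper; stricter than mount(8), which allows omitted fields.
check_fstab() {
    awk '
        /^[ \t]*(#|$)/  { next }    # skip comments and blank lines
        NF != 6         { printf "line %d: expected 6 fields, got %d\n", NR, NF; bad = 1; next }
        $6 !~ /^[012]$/ { printf "line %d: unexpected pass value %s\n", NR, $6; bad = 1 }
        END             { exit bad }
    ' "$1"
}

# Usage (run against a copy first if you prefer):
#   check_fstab /etc/fstab && echo "fstab looks well-formed"
```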
8.5 LVM Logical Volume Management
LVM (Logical Volume Manager) inserts an abstraction layer between physical disks and filesystems, enabling flexible disk management: online resizing without downtime, spanning multiple disks as one logical volume, and snapshot backups. It is a sensible default for production data disks that may need to grow later.
LVM Three-Layer Architecture (ASCII Diagram)
# LVM architecture layers:
#
# Filesystem layer      ┌──────────────────────────────────┐
#                       │ /dev/vg_data/lv_app (ext4)       │  Logical Volumes (LV)
#                       │ /dev/vg_data/lv_db (xfs)         │
#                       └────────────────┬─────────────────┘
#                                        │ belong to the same volume group
# Volume group layer    ┌────────────────▼─────────────────┐
#                       │ vg_data (VG)                     │  Volume Group (VG)
#                       │ total capacity = PV1 + PV2       │
#                       └─────┬──────────────────────┬─────┘
#                             │                      │
# Physical volume layer ┌─────▼─────┐          ┌─────▼─────┐
#                       │ /dev/sdb  │          │ /dev/sdc  │  Physical Volumes (PV)
#                       │ (500 GB)  │          │ (500 GB)  │
#                       └───────────┘          └───────────┘
#                             ↑                      ↑
#                       disk/partition        disk/partition
PV / VG / LV Complete Operations
# === Step 1: create physical volumes (PV) ===
sudo pvcreate /dev/sdb /dev/sdc    # initialize disks as PVs (whole disks or partitions both work)
sudo pvs                           # brief PV listing
sudo pvdisplay /dev/sdb            # detailed PV information
# === Step 2: create a volume group (VG) ===
sudo vgcreate vg_data /dev/sdb /dev/sdc    # create the VG vg_data from both disks
sudo vgs                                   # brief VG listing
sudo vgdisplay vg_data                     # detailed VG information
# Add a new disk to an existing VG (extend the VG)
sudo pvcreate /dev/sdd
sudo vgextend vg_data /dev/sdd
# === Step 3: create logical volumes (LV) ===
sudo lvcreate -L 100G -n lv_app vg_data           # 100 GiB LV named lv_app
sudo lvcreate -L 200G -n lv_db vg_data            # 200 GiB LV named lv_db
sudo lvcreate -l 100%FREE -n lv_backup vg_data    # use all remaining VG space (leaves none for later lvextend or snapshots)
sudo lvs                                          # brief LV listing
sudo lvdisplay /dev/vg_data/lv_app                # detailed LV information
# Format and mount
sudo mkfs.ext4 /dev/vg_data/lv_app
sudo mkfs.xfs /dev/vg_data/lv_db
sudo mkdir -p /app /db
sudo mount /dev/vg_data/lv_app /app
sudo mount /dev/vg_data/lv_db /db
# === Grow an LV online (no unmount needed) ===
# 1. Extend the logical volume (pick one form; the VG must have free space)
sudo lvextend -L +50G /dev/vg_data/lv_app         # add 50 GiB
sudo lvextend -L 200G /dev/vg_data/lv_app         # grow to 200 GiB total
sudo lvextend -l +100%FREE /dev/vg_data/lv_app    # use all free space in the VG
# 2a. Grow the filesystem (ext4)
sudo resize2fs /dev/vg_data/lv_app                # ext4 supports online growth
# 2b. Grow the filesystem (XFS; XFS can only grow, never shrink)
sudo xfs_growfs /db                               # XFS takes the mount point
# Or do both in one step (-r resizes the filesystem together with the LV)
sudo lvextend -L +50G -r /dev/vg_data/lv_app
# === LV snapshots (for backups) ===
sudo lvcreate -L 10G -s -n lv_app_snap /dev/vg_data/lv_app    # create a snapshot with 10 GiB of copy-on-write space
sudo mkdir -p /mnt/snap
sudo mount -o ro /dev/vg_data/lv_app_snap /mnt/snap           # mount the snapshot read-only
sudo umount /mnt/snap                                         # unmount before removing
sudo lvremove /dev/vg_data/lv_app_snap                        # remove the snapshot
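The online-grow procedure differs only in its second step: ext4 is resized through the device, XFS through the mount point. A minimal dry-run sketch of that decision follows. The function `grow_plan` is hypothetical and only prints the commands from this section; it never executes them.

```shell
#!/usr/bin/env bash
# grow_plan LV MOUNTPOINT FSTYPE SIZE: print (do not run) the two commands
# for an online grow. Hypothetical helper; review the output before use.
grow_plan() {
    local lv=$1 mnt=$2 fstype=$3 size=$4 fs_cmd
    case $fstype in
        ext4) fs_cmd="resize2fs $lv" ;;      # ext4 is resized via the device
        xfs)  fs_cmd="xfs_growfs $mnt" ;;    # XFS is resized via the mount point
        *)    echo "unsupported filesystem: $fstype" >&2; return 1 ;;
    esac
    printf '%s\n' "lvextend -L +$size $lv" "$fs_cmd"
}

grow_plan /dev/vg_data/lv_db /db xfs 50G
```

Printing the plan first makes it easy to confirm the LV path and filesystem type before anything is changed.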
8.6 fsck: Filesystem Check and Repair
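# === Important: run fsck only on unmounted filesystems ===
# ext4 filesystem check
sudo umount /dev/sdb1
sudo e2fsck -f /dev/sdb1       # -f force a check even if the filesystem is marked clean
sudo e2fsck -f -y /dev/sdb1    # -y answer yes to every prompt (non-interactive)
sudo e2fsck -f -n /dev/sdb1    # -n read-only check, no changes (safe)
# Generic fsck (picks the backend automatically)
sudo fsck /dev/sdb1
sudo fsck -t ext4 /dev/sdb1    # specify the filesystem type
sudo fsck -a /dev/sdb1         # repair without prompting (for ext4 this maps to e2fsck -p: safe fixes only, not the same as -y)
# XFS repair
sudo xfs_repair /dev/sdb2       # repair XFS (run on the unmounted device, e.g. when mounting fails)
sudo xfs_repair -n /dev/sdb2    # check only, no changes
sudo xfs_repair -L /dev/sdb2    # -L zero the log (last resort, may lose data)
# When is fsck needed?
# 1. After a crash or unexpected power loss
# 2. When mount reports filesystem errors
# 3. Automatically at boot, when the fstab pass field is 1 or 2
# 4. As periodic manual maintenance (tune2fs -c sets the check interval)
# Force fsck of the root filesystem at the next boot
sudo touch /forcefsck          # some distributions only
sudo tune2fs -c 1 /dev/sda1    # max mount count 1, so e2fsck runs at the next boot (restore with -c 0 afterwards)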
How to fsck the root partition? The root partition (`/`) cannot be unmounted while the system is running. You must operate from single-user mode or boot from a Live CD/USB. To enter single-user mode: press `e` at the GRUB menu to edit the boot entry, append `single` or `init=/bin/bash` to the `linux` line, then press `Ctrl+X` to boot.
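Because fsck must never run on a mounted filesystem, a wrapper can refuse early. The sketch below uses two hypothetical functions, `is_mounted` and `safe_check`. It assumes the device appears in `/proc/mounts` under the exact name you pass (LVM volumes, for instance, may be listed as `/dev/mapper/...`), and it only prints the read-only check command rather than running it.

```shell
#!/usr/bin/env bash
# is_mounted DEVICE [MOUNTS_FILE]: succeed if DEVICE appears as a mount
# source in MOUNTS_FILE (default /proc/mounts). Hypothetical helper.
is_mounted() {
    local dev=$1 mounts=${2:-/proc/mounts}
    awk -v dev="$dev" '$1 == dev { found = 1 } END { exit !found }' "$mounts"
}

# safe_check DEVICE [MOUNTS_FILE]: print the read-only e2fsck command,
# or refuse with an error if DEVICE is currently mounted.
safe_check() {
    if is_mounted "$1" "${2:-/proc/mounts}"; then
        echo "refusing: $1 is mounted, unmount it first" >&2
        return 1
    fi
    echo "e2fsck -f -n $1"
}
```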
8.7 dd: Disk Cloning and Imaging
dd is the low-level data copy tool on Linux, operating directly on byte streams. It can clone disks, create and restore disk images, write ISO images to USB drives, test disk speed, and wipe data. Use with extreme care: if you swap if (input) and of (output), you will overwrite the source disk you meant to copy.
# === Basic syntax ===
# dd if=INPUT of=OUTPUT bs=BLOCK_SIZE count=BLOCKS
# === Clone a whole disk ===
sudo dd if=/dev/sda of=/dev/sdb bs=4M status=progress    # clone sda onto sdb (sdb must be the same size or larger)
# status=progress shows progress (GNU coreutils 8.24 or later)
# === Create a disk image file ===
sudo dd if=/dev/sda of=/backup/sda.img bs=4M status=progress
# Compressed backup (saves storage space)
sudo dd if=/dev/sda bs=4M | gzip -c > /backup/sda.img.gz
# Parallel compression with pigz (faster)
sudo dd if=/dev/sda bs=4M | pigz > /backup/sda.img.gz
# === Restore an image to a disk ===
sudo dd if=/backup/sda.img of=/dev/sdb bs=4M status=progress
gunzip -c /backup/sda.img.gz | sudo dd of=/dev/sdb bs=4M status=progress
# === Write an ISO to a USB drive ===
sudo dd if=ubuntu-22.04.iso of=/dev/sdc bs=4M status=progress conv=fdatasync
# conv=fdatasync flushes all data to the device before dd exits (avoids data loss on removal)
# === Test write speed ===
dd if=/dev/zero of=/tmp/test bs=1G count=1 oflag=direct
# oflag=direct bypasses the page cache to measure real write speed
# (point of= at a path on the disk under test; /tmp may be a tmpfs, i.e. RAM)
# === Test read speed ===
sudo dd if=/dev/sda of=/dev/null bs=4M count=1000 status=progress
# === Securely wipe a disk (overwrite with random data) ===
sudo dd if=/dev/urandom of=/dev/sdb bs=4M status=progress
# Note: very slow on large disks; for SSDs prefer hdparm --security-erase
# === Back up a single partition instead of the whole disk ===
sudo dd if=/dev/sda1 of=/backup/boot.img bs=4M status=progress
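Since a mistyped `of=` is unrecoverable, it can help to rehearse the image-and-restore round trip on ordinary files first. The sketch below needs no root and touches no block devices; the function name `roundtrip` and the file names are illustrative, and it assumes GNU dd (for `status=none` and `iflag=fullblock`).

```shell
#!/usr/bin/env bash
# roundtrip DIR: image a stand-in "disk" file, restore it, and verify the
# result. Succeeds only if the restored copy is byte-identical.
# Illustrative helper; operates on regular files only.
roundtrip() {
    local dir=$1
    # Stand-in for a source disk: 4 MiB of random data.
    dd if=/dev/urandom of="$dir/source.bin" bs=1M count=4 iflag=fullblock status=none
    # "Image" the source, then "restore" the image to a new target.
    dd if="$dir/source.bin" of="$dir/backup.img" bs=1M status=none
    dd if="$dir/backup.img" of="$dir/target.bin" bs=1M status=none
    # Verify the restored copy against the source.
    cmp -s "$dir/source.bin" "$dir/target.bin"
}

work=$(mktemp -d)
roundtrip "$work" && echo "verified: target matches source"
rm -rf "$work"
```

The same `cmp` check (or a checksum comparison) is worth running after a real clone, while the source disk is still available.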
8.8 smartctl: Predictive Disk Health
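# Install (Debian/Ubuntu)
sudo apt install smartmontools
# === Basic health checks ===
sudo smartctl -H /dev/sda    # quick overall health verdict (PASSED/FAILED)
sudo smartctl -a /dev/sda    # full SMART report
sudo smartctl -i /dev/sda    # basic drive information
# === Run self-tests ===
sudo smartctl -t short /dev/sda         # short test (1-2 minutes, runs in the background)
sudo smartctl -t long /dev/sda          # long test (several hours, full surface scan)
sudo smartctl -t conveyance /dev/sda    # conveyance test (checks for damage incurred during shipping)
# View test results
sudo smartctl -l selftest /dev/sda
# === Key SMART attributes ===
# Reallocated_Sector_Ct  (ID 5)    remapped sectors; non-zero means bad sectors exist, and risk grows with the count
# Spin_Retry_Count       (ID 10)   spin-up retries; non-zero suggests a mechanical problem
# UDMA_CRC_Error_Count   (ID 199)  interface transfer errors; non-zero often points to a bad cable
# Current_Pending_Sector (ID 197)  unstable sectors awaiting remap; non-zero needs urgent attention
# Offline_Uncorrectable  (ID 198)  uncorrectable sectors; non-zero is critical, back up immediately
# Power_On_Hours         (ID 9)    total powered-on hours; indicates drive age
# Temperature_Celsius    (ID 194)  drive temperature; above 55°C deserves attention
# NVMe SSD health check
sudo smartctl -a /dev/nvme0n1
# Enable automatic monitoring (smartd daemon)
sudo systemctl enable smartd
sudo systemctl start smartd
# Configuration file: /etc/smartd.conf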
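The attribute rules above can be turned into a quick filter. The sketch below assumes the ATA attribute table printed by `smartctl -A`, with the attribute ID in column 1, the name in column 2, and the raw value in column 10; NVMe drives report health in a different format and are not covered. The function name `smart_alerts` is hypothetical.

```shell
#!/usr/bin/env bash
# smart_alerts: read `smartctl -A` output on stdin and print NAME=RAW for
# each critical attribute (IDs 5, 10, 197, 198) with a non-zero raw value.
# Hypothetical helper; assumes the ATA attribute table layout.
smart_alerts() {
    awk '$1 ~ /^(5|10|197|198)$/ && $10 + 0 > 0 { print $2 "=" $10 }'
}

# Live usage (requires root and smartmontools):
#   sudo smartctl -A /dev/sda | smart_alerts
```

Empty output means none of the four attributes has a non-zero raw value; it does not replace the `smartctl -H` verdict or regular self-tests.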
8.9 Software RAID: mdadm Overview
| RAID Level | Min Disks | Capacity | Fault Tolerance | Performance | Best For |
|---|---|---|---|---|---|
| RAID 0 | 2 | 100% | None (any disk = total loss) | Fastest read/write | Temp data, cache |
| RAID 1 | 2 | 50% | 1 disk | Read faster, write same | System disk, high reliability |
| RAID 5 | 3 | (N-1)/N | 1 disk | Fast read, parity write overhead | Balanced capacity/reliability |
| RAID 6 | 4 | (N-2)/N | 2 disks | Fast read, higher write overhead | Large storage, high reliability |
| RAID 10 | 4 | 50% | 1 disk per mirror pair | Fast read/write (RAID 1+0: striped mirrors) | Databases, high IOPS |
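The capacity column can be worked through with a few lines of shell arithmetic. The sketch below assumes identical member disks and, for RAID 10, two-way mirrors; the function name `raid_usable` is hypothetical.

```shell
#!/usr/bin/env bash
# raid_usable LEVEL DISKS SIZE: usable capacity in the same unit as SIZE,
# assuming DISKS identical members of SIZE each. Hypothetical helper.
raid_usable() {
    local level=$1 n=$2 size=$3
    case $level in
        0)  echo $(( n * size )) ;;          # striping, no redundancy
        1)  echo "$size" ;;                  # every member holds a full copy
        5)  echo $(( (n - 1) * size )) ;;    # one disk's worth of parity
        6)  echo $(( (n - 2) * size )) ;;    # two disks' worth of parity
        10) echo $(( n / 2 * size )) ;;      # striped two-way mirrors
        *)  echo "unknown RAID level: $level" >&2; return 1 ;;
    esac
}

raid_usable 5 4 4     # four 4 TB disks in RAID 5: prints 12
raid_usable 6 6 8     # six 8 TB disks in RAID 6: prints 32
raid_usable 10 4 2    # four 2 TB disks in RAID 10: prints 4
```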
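# Install mdadm (Debian/Ubuntu)
sudo apt install mdadm
# Create RAID 1 (mirror, two disks)
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
# Or create RAID 5 (three disks); run only one of the two, or pick another md device name
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
# Check RAID status
cat /proc/mdstat
sudo mdadm --detail /dev/md0
# Save the RAID configuration (tee is used because a plain >> redirect would not run as root)
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
# Format and mount the RAID device (same as a regular disk)
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /mnt/raid
sudo mount /dev/md0 /mnt/raid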
8.10 Disk Performance Analysis
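# === hdparm: quick speed tests ===
sudo hdparm -t /dev/sda    # buffered disk read test (excludes the OS cache)
sudo hdparm -T /dev/sda    # cached read test (measures cache/memory throughput only)
# === iostat: I/O statistics (sysstat package) ===
sudo apt install sysstat
iostat                   # basic I/O statistics
iostat -dx 2             # -d devices, -x extended stats, refresh every 2 seconds
iostat -dx /dev/sda 2    # sda only
# Key metrics:
# %util: device busy percentage (near 100% suggests the disk is a bottleneck, especially on single HDDs)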
# await: average I/O wait time (ms; normal for mechanical disks: …)
**Chapter Summary:** This chapter covered the complete Linux storage management stack: understanding the disk landscape with `lsblk/blkid`, precise partitioning with `fdisk/parted`, choosing the right filesystem with `mkfs`, persistent mounting with `fstab`, flexible online expansion with LVM, emergency repair with `fsck/e2fsck`, cloning and backup with `dd`, predictive failure detection with `smartctl`, and redundancy protection with RAID. These skills form the core competency of server storage operations. The next chapter turns to shell scripting: variables and control flow.