Testing Disk Performance

Checking the disk type

$ lsblk -d -o name,rota,type,size,model
NAME ROTA TYPE  SIZE MODEL
sda     1 disk  1.8T PERC H740P Mini

ROTA=1: the kernel reports this device as rotational, i.e. a spinning disk (HDD).
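
The ROTA column reflects the kernel's per-device rotational flag, which Linux also exposes under /sys/block. The following is a minimal sketch of reading it directly, assuming a Linux system with sysfs mounted; the helper name `rota_to_kind` is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: map the kernel's rotational flag to a readable label.
rota_to_kind() {
  case "$1" in
    1) echo "HDD (rotational)" ;;
    0) echo "SSD/NVMe (non-rotational)" ;;
    *) echo "unknown" ;;
  esac
}

# Walk every block device that exposes the flag (prints nothing if sysfs is absent).
for f in /sys/block/*/queue/rotational; do
  [ -r "$f" ] || continue
  dev=${f#/sys/block/}   # strip the leading path
  dev=${dev%%/*}         # keep only the device name
  printf '%-10s %s\n' "$dev" "$(rota_to_kind "$(cat "$f")")"
done
```

Behind a hardware RAID controller the flag describes the logical volume the controller presents, so it may not match the physical drives.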

Test methods

Sequential write throughput test (approaches the maximum write speed)

fio --name=seqwrite --rw=write --bs=1M --size=5G --numjobs=4 --iodepth=32 --direct=1 --runtime=60 --group_reporting

Random read IOPS test (approaches the maximum concurrent request rate)

fio --name=randread --rw=randread --bs=4k --size=5G --numjobs=4 --iodepth=64 --direct=1 --runtime=60 --group_reporting

Mixed read/write test (simulates a database workload)

fio --name=mixrw --rw=randrw --rwmixread=70 --bs=4k --size=5G --numjobs=4 --iodepth=32 --direct=1 --runtime=60 --group_reporting

Disk test results

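These runs use the synchronous psync engine, so the effective iodepth is always 1 (setting --iodepth to another value has no effect). The cause is the engine, not the fact that the disk is rotational.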

Single-threaded file read/write:

fio_bs_test.sh:
#!/bin/bash
# Requires fio and gawk (the three-argument match() used below is a gawk extension).

# Test parameters
DEVICE="./testfile"   # path of the file or device to test
RUNTIME=30            # run time of each test, in seconds
# BLOCK_SIZES=("4k" "16k" "64k" "256k" "1M" "4M" "16M" "32M" "64M" "128M")  # block sizes to test
BLOCK_SIZES=("128M" "64M" "32M" "16M" "4M" "1M" "256k" "64k" "16k" "4k")  # block sizes to test
LOG_FILE="fio_bs_output.log"              # raw fio output
PERFORMANCE_LOG="fio_bs_performance.log"  # summarized results

# Print the table header
printf "%-8s | %-10s | %-8s | %-10s | %-10s\n" "RW" "BlockSize" "IOPS" "BW(MiB/s)" "AvgLat(ms)" | tee "$PERFORMANCE_LOG"
echo "-------------------------------------------------------------" | tee -a "$PERFORMANCE_LOG"

: > "$LOG_FILE"  # truncate the raw log

# Loop over access pattern and block size
for RW in read write; do
  for BS in "${BLOCK_SIZES[@]}"; do
    OUTPUT=$(fio --name=bs_test \
                 --filename="$DEVICE" \
                 --rw="$RW" \
                 --bs="$BS" \
                 --size=1G \
                 --time_based \
                 --runtime="$RUNTIME" \
                 --numjobs=1 \
                 --direct=1 \
                 --ioengine=psync \
                 --group_reporting)
    echo "----------------------------------------------" >> "$LOG_FILE"
    echo "$OUTPUT" >> "$LOG_FILE"
    echo "" >> "$LOG_FILE"

    # Extract the key metrics: IOPS, bandwidth and average completion latency
    read -r IOPS BW BWUNIT LAT LAT_UNIT <<< "$(echo "$OUTPUT" | gawk '
        /IOPS=/ {match($0, /IOPS= *([0-9.]+)(k?)/, iops)}
        /BW=/ {
            match($0, /BW= *([0-9.]+)([KMG]iB)\/s/, bwinfo)
            bwval=bwinfo[1]; bwunit=bwinfo[2]
        }
        /clat \(/ {match($0, /avg= *([0-9.]+),/, lat); match($0, /\(([^)]+)\)/, lat_unit)}
        END {
            n = iops[1]
            if (iops[2] == "k") n *= 1000   # fio abbreviates large values, e.g. 10.2k
            print n, bwval, bwunit, lat[1], lat_unit[1]
        }
    ')"

    # Convert latency to milliseconds
    if [ "$LAT_UNIT" = "nsec" ]; then
        LAT_MS=$(awk "BEGIN {printf \"%.4f\", $LAT/1000000}")
    elif [ "$LAT_UNIT" = "usec" ]; then
        LAT_MS=$(awk "BEGIN {printf \"%.2f\", $LAT/1000}")
    elif [ "$LAT_UNIT" = "msec" ]; then
        LAT_MS=$LAT
    else
        LAT_MS="Unknown"
    fi

    # Convert bandwidth to MiB/s
    case "$BWUNIT" in
        "KiB") BW_MIB=$(awk "BEGIN {printf \"%.2f\", $BW/1024}") ;;
        "MiB") BW_MIB=$BW ;;
        "GiB") BW_MIB=$(awk "BEGIN {printf \"%.2f\", $BW*1024}") ;;
        *) BW_MIB="Unknown" ;;
    esac

    # Print the result row
    printf "%-8s | %-10s | %-8s | %-10s | %-10s\n" "$RW" "$BS" "$IOPS" "$BW_MIB" "$LAT_MS" | tee -a "$PERFORMANCE_LOG"
  done
done

# Remove the test file created by fio (only a regular file, never a block device)
[ -f "$DEVICE" ] && rm -f "$DEVICE"
echo "Done. Results saved to $LOG_FILE and $PERFORMANCE_LOG"

RW       | BlockSize  | IOPS     | BW(MiB/s)  | AvgLat(ms)
-----------------------------------------------------------
read     | 4k         | 194      | 0.76       | 5.14
read     | 16k        | 239      | 3.74       | 4.17
read     | 64k        | 347      | 21.7       | 2.88
read     | 256k       | 271      | 67.9       | 3.68
read     | 1M         | 94       | 94.9       | 10.53
read     | 4M         | 14       | 57.2       | 69.74
read     | 16M        | 6        | 97.6       | 163.96
read     | 32M        | 3        | 111        | 288.99
read     | 64M        | 1        | 112        | 573.70
read     | 128M       | 0        | 112        | 1147.00

RW       | BlockSize  | IOPS     | BW(MiB/s)  | AvgLat(ms)
-----------------------------------------------------------
write    | 4k         | 1884     | 7.36       | 0.53
write    | 16k        | 1452     | 22.7       | 0.69
write    | 64k        | 793      | 49.6       | 1.26
write    | 256k       | 307      | 76.8       | 3.24
write    | 1M         | 100      | 100        | 9.96
write    | 4M         | 18       | 73.7       | 53.99
write    | 16M        | 5        | 93.8       | 169.43
write    | 32M        | 3        | 101        | 313.24
write    | 64M        | 1        | 107        | 594.46
write    | 128M       | 0        | 102        | 1249.35

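From BlockSize=32M upward, throughput has essentially plateaued (about 110 MiB/s for reads and 100 to 107 MiB/s for writes), which is broadly in line with the rated speed of a rotational disk.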

Multiple threads reading/writing the same file, BS=64KiB:

fio_mt_test.sh:
#!/bin/bash
# Requires fio and gawk (the three-argument match() used below is a gawk extension).

DEVICE="./mt_testfile"   # path of the test file
RUNTIME=30            # run time of each test, in seconds
THREADS=(1 2 4 8 16 32 64 72 120)  # job counts to test (fio forks one process per job unless --thread is given)
BLOCK_SIZE="64k"      # block size; change as needed
LOG_FILE="fio_thread_output.log"
PERFORMANCE_LOG="fio_thread_performance.log"

# Print the table header
printf "%-8s | %-8s | %-10s | %-10s | %-10s\n" "RW" "Threads" "IOPS" "BW(MiB/s)" "AvgLat(ms)" | tee "$PERFORMANCE_LOG"
echo "---------------------------------------------------------------" | tee -a "$PERFORMANCE_LOG"

: > "$LOG_FILE"  # truncate the raw log

for RW in read write; do
  for THREAD in "${THREADS[@]}"; do
    OUTPUT=$(fio --name=thread_test \
                 --filename="$DEVICE" \
                 --rw="$RW" \
                 --bs="$BLOCK_SIZE" \
                 --size=5G \
                 --time_based \
                 --runtime="$RUNTIME" \
                 --numjobs="$THREAD" \
                 --direct=1 \
                 --ioengine=psync \
                 --group_reporting)
    echo "---------------------------------------------------------------" >> "$LOG_FILE"
    echo "$OUTPUT" >> "$LOG_FILE"
    echo "" >> "$LOG_FILE"

    # Extract the key metrics: IOPS, bandwidth and average completion latency
    read -r IOPS BW BWUNIT LAT LAT_UNIT <<< "$(echo "$OUTPUT" | gawk '
        /IOPS=/ {match($0, /IOPS= *([0-9.]+)(k?)/, iops)}
        /BW=/ {
            match($0, /BW= *([0-9.]+)([KMG]iB)\/s/, bwinfo)
            bwval=bwinfo[1]; bwunit=bwinfo[2]
        }
        /clat \(/ {match($0, /avg= *([0-9.]+),/, lat); match($0, /\(([^)]+)\)/, lat_unit)}
        END {
            n = iops[1]
            if (iops[2] == "k") n *= 1000   # fio abbreviates large values, e.g. 10.2k
            print n, bwval, bwunit, lat[1], lat_unit[1]
        }
    ')"

    # Convert latency to milliseconds
    if [ "$LAT_UNIT" = "nsec" ]; then
        LAT_MS=$(awk "BEGIN {printf \"%.4f\", $LAT/1000000}")
    elif [ "$LAT_UNIT" = "usec" ]; then
        LAT_MS=$(awk "BEGIN {printf \"%.2f\", $LAT/1000}")
    elif [ "$LAT_UNIT" = "msec" ]; then
        LAT_MS=$LAT
    else
        LAT_MS="Unknown"
    fi

    # Convert bandwidth to MiB/s
    case "$BWUNIT" in
        "KiB") BW_MIB=$(awk "BEGIN {printf \"%.2f\", $BW/1024}") ;;
        "MiB") BW_MIB=$BW ;;
        "GiB") BW_MIB=$(awk "BEGIN {printf \"%.2f\", $BW*1024}") ;;
        *) BW_MIB="Unknown" ;;
    esac

    # Print the result row
    printf "%-8s | %-8s | %-10s | %-10s | %-10s\n" "$RW" "$THREAD" "$IOPS" "$BW_MIB" "$LAT_MS" | tee -a "$PERFORMANCE_LOG"
  done
done

# Remove the test file created by fio (only a regular file, never a block device)
[ -f "$DEVICE" ] && rm -f "$DEVICE"
echo "Done. Results saved to $LOG_FILE and $PERFORMANCE_LOG"

RW       | Threads  | IOPS       | BW(MiB/s)  | AvgLat(ms)
------------------------------------------------------------
read     | 1        | 826        | 51.6       | 1.21
read     | 2        | 1300       | 81.3       | 1.54
read     | 4        | 1681       | 105        | 2.38
read     | 8        | 1778       | 111        | 4.49
read     | 16       | 1789       | 112        | 8.93
read     | 32       | 1790       | 112        | 17.86
read     | 64       | 1789       | 112        | 35.73
read     | 72       | 1790       | 112        | 40.18
read     | 120      | 1789       | 112        | 66.98

RW       | Threads  | IOPS       | BW(MiB/s)  | AvgLat(ms)
------------------------------------------------------------
write    | 1        | 847        | 52.9       | 1.18
write    | 2        | 1367       | 85.5       | 1.46
write    | 4        | 1757       | 110        | 2.27
write    | 8        | 1786       | 112        | 4.47
write    | 16       | 1788       | 112        | 8.92
write    | 32       | 1788       | 112        | 17.79
write    | 64       | 1788       | 112        | 35.09
write    | 72       | 1788       | 112        | 39.54
write    | 120      | 1784       | 112        | 64.50

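As the thread count grows, I/O performance improves. A likely reason is that with many 64KiB requests queued at once, the operating system can optimize the read/write path better, for example by merging and reordering requests.
By 8 threads, however, throughput has essentially peaked at about 112 MiB/s; beyond that only latency increases.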

Note: fio does not lock the file when several threads write to it, so writes larger than one memory page (typically 4 KiB) may interleave or land out of order.
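
The latency column in the thread tables can be cross-checked with Little's law. With a synchronous engine each job has exactly one request in flight, so average latency should be roughly jobs / IOPS. The following is a small sketch of that arithmetic; the function name `expected_lat_ms` is only illustrative.

```shell
#!/bin/bash
# Estimate average latency (ms) from concurrency and IOPS via Little's law:
#   requests in flight = IOPS x latency   =>   latency = jobs / IOPS
expected_lat_ms() {
  local jobs=$1 iops=$2
  LC_ALL=C awk -v j="$jobs" -v i="$iops" 'BEGIN { printf "%.2f\n", j / i * 1000 }'
}

# Rows taken from the read table above
expected_lat_ms 16 1789    # measured AvgLat: 8.93 ms
expected_lat_ms 120 1789   # measured AvgLat: 66.98 ms
```

The estimates come out at 8.94 ms and 67.08 ms, within about 0.1 ms of the measured values. That is consistent with the extra jobs simply queuing behind a disk that is already saturated.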

Concepts (using fio as an example)

ioengine

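The ioengine (I/O engine) is the low-level interface fio uses to carry out read and write jobs. Different engines correspond to different I/O models, such as synchronous, asynchronous, memory-mapped, and zero-copy.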

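Engine	Type	Characteristics and use
sync	Synchronous	Uses read/write; every I/O waits for completion; suitable for simple tests
psync	Synchronous	Uses pread/pwrite with an explicit offset; slightly faster; the default on Linux
libaio	Asynchronous	Linux asynchronous I/O; suited to high-performance SSD/NVMe
io_uring	Asynchronous	Newer Linux asynchronous interface; low latency, high concurrency
mmap	Memory-mapped	Maps the file into memory; suited to sequential access of large files
splice	Zero-copy	For efficient data transfer with less CPU and memory overhead
windowsaio	Asynchronous	Windows native asynchronous I/O
net	Network	For network I/O tests such as socket transfers
sg	SCSI	For direct access to SCSI devices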

Every ioengine depends on a low-level I/O interface provided by the operating system. For example:

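ioengine	OS requirement	Asynchronous?	Notes
sync / psync	All systems	❌	Standard blocking I/O; almost always available
libaio	Linux, with the libaio library installed	✅	Depends on the Linux asynchronous I/O interface
io_uring	Linux ≥ 5.1, ≥ 5.4 recommended	✅	Depends on io_uring support in the running kernel
windowsaio	Windows	✅	Uses Windows native asynchronous I/O
mmap	All mainstream systems	❌	Uses memory mapping; suited to sequential reads and writes
posixaio	POSIX-compatible systems	✅	Uses the aio_read / aio_write interface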

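You can request any ioengine (the default on Linux is psync), but it only runs if the operating system supports it, which covers the kernel version, system interfaces, and libraries. If the system does not support the engine, fio reports an error.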

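List the supported engines:

fio --enghelp

This lists the ioengines available in this fio build, but note that being listed does not mean an engine is usable: you still have to check whether it fails at run time.

Try it for real:

fio --name=test --ioengine=io_uring --rw=write --size=1G --bs=1M

If the engine is not supported, fio reports an error such as:

fio: pid=132756, err=38/file:engines/io_uring.c:1351, func=io_queue_init, error=Function not implemented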
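
Since the listing alone is not conclusive, one option is to probe each engine with a tiny throwaway job and look at the exit status. This sketch assumes fio exits non-zero when the job fails; `probe_engine` and the `FIO_BIN` override are hypothetical names, not part of fio.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if a tiny job with the given ioengine runs,
# non-zero otherwise. FIO_BIN is an assumed override for the fio binary.
probe_engine() {
  local engine=$1 tmp rc
  tmp=$(mktemp) || return 2
  "${FIO_BIN:-fio}" --name=probe --ioengine="$engine" --rw=read \
      --bs=4k --size=1M --filename="$tmp" >/dev/null 2>&1
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Report which of a few engines actually run on this machine (skipped if fio is missing)
if command -v "${FIO_BIN:-fio}" >/dev/null 2>&1; then
  for e in psync libaio io_uring; do
    if probe_engine "$e"; then echo "$e: ok"; else echo "$e: failed"; fi
  done
fi
```

The probe writes only a 1 MiB temporary file, so it is cheap enough to run before a long benchmark.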

iodepth

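--iodepth sets how many I/O requests each fio job tries to keep in flight. With asynchronous engines the value is used to size the kernel-side submission context (for libaio, the count passed to io_setup()).

If you do not set --iodepth explicitly, the default is 1 for every engine; what depends on the chosen I/O engine (--ioengine) is whether a larger value can take effect.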

I/O engine	Default iodepth
sync / psync / vsync	1 (synchronous I/O, requests are handled one at a time)
libaio / io_uring	1, but can be raised to enable asynchronous concurrency
mmap / pvsync / pvsync2	1
windowsaio (Windows)	1
sg (SCSI generic)	1

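fio does not implement the queue itself; it only tracks how many requests it has outstanding. The queue is a feature of the kernel interface behind the engine.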

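I/O engine	Where the queue lives	Asynchronous?	Notes
psync / sync	No queue (direct call)	❌ No	Every write is one write()/pwrite() call; there is no queuing mechanism
libaio	Kernel space	✅ Yes	Uses Linux AIO; the queue is in the kernel and requests are submitted with io_submit()
io_uring	Shared between user and kernel space	✅ Yes	Uses ring buffers; user space submits, the kernel processes
mmap / null	User space	❌ No	mmap goes through a memory mapping and null skips real I/O; neither uses a kernel request queue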

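For engines that support asynchronous I/O (such as libaio or io_uring), you can set a higher iodepth (for example 32, 64, or 128) to simulate a highly concurrent load.
On SSD or NVMe devices, a high iodepth can raise IOPS and throughput significantly.
On mechanical disks the gain is limited, but a higher iodepth is still useful for testing scheduler policy and queue behavior.

If your ioengine is sync or psync, however, I/O is synchronous and blocking, with no concurrency within a job, so iodepth has no real effect.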
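
The rule above can be written down as a small lookup. This is only a sketch of the reasoning, limited to the engines named in this article; `effective_iodepth` is a hypothetical helper, not something fio provides.

```shell
#!/bin/bash
# Hypothetical helper: the queue depth a job can actually reach for a given engine.
effective_iodepth() {
  local engine=$1 requested=$2
  case "$engine" in
    sync|psync|vsync|mmap)
      echo 1 ;;               # synchronous: one request at a time, --iodepth is ignored
    libaio|io_uring|posixaio|windowsaio)
      echo "$requested" ;;    # asynchronous: --iodepth is honoured
    *)
      echo "unknown" ;;       # engine not covered by this sketch
  esac
}

effective_iodepth psync 32    # prints 1
effective_iodepth libaio 32   # prints 32
```

To see a depth above 1 in the "IO depths" line of the fio report, the commands in this article would need an asynchronous engine, for example by adding --ioengine=libaio alongside --direct=1.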

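Parameter	Description
--name=seqwrite	Names the test job seqwrite; the name labels the output
--rw=write	Sequential write mode: data is written to disk in order
--bs=1M	Each I/O operation uses a 1 MiB block, which suits throughput tests
--size=5G	Each job writes 5 GiB in total (per job, not overall)
--numjobs=4	Starts 4 concurrent jobs to simulate multi-threaded writing
--iodepth=32	I/O queue depth of 32 per job, i.e. up to 32 I/O requests outstanding at once (only effective with an asynchronous ioengine). In this example each job issues 5G/1M = 5120 I/O requests
--direct=1	Bypasses the system cache and reads/writes the disk directly, which reflects device performance more faithfully
--runtime=60	Limits the test to 60 seconds. 1. If the data has not all been written when the time is up, the job stops. 2. If the data is written sooner, the job ends early unless --time_based is also given, in which case fio keeps writing until the time is up
--group_reporting	Aggregates the results of all jobs into one overall report instead of printing each job separately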
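
The per-job versus total figures in the table are easy to mix up, so here is the arithmetic for the seqwrite command as a short sketch. The 112 MiB/s figure is the plateau measured earlier in this article and is used only as a rough assumption.

```shell
#!/bin/bash
size_mib=$((5 * 1024))   # --size=5G per job, expressed in MiB
bs_mib=1                 # --bs=1M
numjobs=4                # --numjobs=4
disk_mib_s=112           # assumed sustained throughput of this disk

reqs_per_job=$((size_mib / bs_mib))             # I/O requests issued by each job
total_gib=$((size_mib * numjobs / 1024))        # data written by all jobs together
est_secs=$((size_mib * numjobs / disk_mib_s))   # rough time to write everything

echo "requests per job: $reqs_per_job"
echo "total data:       ${total_gib} GiB"
echo "estimated time:   ${est_secs} s"
```

At roughly 182 seconds for the full 20 GiB, the 60-second --runtime limit is reached long before --size is, so on this disk the job is time-bound.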

A possible implementation in code:

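// Build with the libaio development package installed, linking with -laio
#define _GNU_SOURCE  // needed for O_DIRECT
#include <libaio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define FILE_PATH "testfile.bin"
#define BLOCK_SIZE 4096
#define IODEPTH 4  // number of requests kept in flight

int main(void) {
    int fd = open(FILE_PATH, O_CREAT | O_WRONLY | O_DIRECT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    // io_setup comes from libaio; libaio calls return a negative errno
    // value on failure instead of setting errno
    io_context_t ctx = 0;
    int ret = io_setup(IODEPTH, &ctx);
    if (ret < 0) {
        fprintf(stderr, "io_setup: %s\n", strerror(-ret));
        return 1;
    }

    struct iocb *iocbs[IODEPTH];
    struct iocb iocb_array[IODEPTH];
    char *buffers[IODEPTH];

    for (int i = 0; i < IODEPTH; i++) {
        // Allocate aligned memory (required by O_DIRECT)
        if (posix_memalign((void **)&buffers[i], BLOCK_SIZE, BLOCK_SIZE) != 0) {
            fprintf(stderr, "posix_memalign failed\n");
            return 1;
        }
        memset(buffers[i], 'A' + i, BLOCK_SIZE);

        // Prepare one write per block, each at its own offset
        io_prep_pwrite(&iocb_array[i], fd, buffers[i], BLOCK_SIZE, (long long)i * BLOCK_SIZE);
        iocbs[i] = &iocb_array[i];
    }

    // Submit all requests at once
    ret = io_submit(ctx, IODEPTH, iocbs);
    if (ret != IODEPTH) {
        fprintf(stderr, "io_submit: %s\n", ret < 0 ? strerror(-ret) : "partial submit");
        return 1;
    }

    // Wait until all requests have completed
    struct io_event events[IODEPTH];
    ret = io_getevents(ctx, IODEPTH, IODEPTH, events, NULL);
    if (ret != IODEPTH) {
        fprintf(stderr, "io_getevents: %s\n", ret < 0 ? strerror(-ret) : "missing events");
        return 1;
    }
    for (int i = 0; i < IODEPTH; i++) {
        // res holds the bytes written, or a negative errno on failure
        if ((long)events[i].res != BLOCK_SIZE) {
            fprintf(stderr, "write %d failed or was short: %ld\n", i, (long)events[i].res);
            return 1;
        }
    }

    // Clean up
    for (int i = 0; i < IODEPTH; i++) {
        free(buffers[i]);
    }
    io_destroy(ctx);
    close(fd);

    printf("All %d I/O requests completed.\n", IODEPTH);
    return 0;
}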

Latency

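Metric	Meaning	Description
slat	Submission Latency	Time from fio issuing the I/O request until the kernel accepts it. Usually very short, measured in microseconds (usec). The synchronous psync run shown below prints no slat line.
clat	Completion Latency	Time from the kernel accepting the request until the I/O operation completes. This is the part that best reflects storage device performance.
lat	Total Latency	Total latency, i.e. slat + clat: the whole span from fio issuing the request to the I/O completing.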
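
To make the units concrete, the sketch below pulls the average out of fio's latency lines and converts it to milliseconds. The two sample lines are copied from the seqwrite run in this article; `avg_ms` is a made-up helper name, and the parsing assumes fio's usual "name (unit): min=…, max=…, avg=…" layout.

```shell
#!/bin/bash
# Two lines copied from the seqwrite output in this article
sample='    clat (usec): min=9606, max=74763, avg=17922.82, stdev=1446.89
     lat (usec): min=9628, max=74791, avg=17951.49, stdev=1446.80'

# Print "<metric> <average in ms>" for each slat/clat/lat line (plain POSIX awk)
avg_ms() {
  LC_ALL=C awk '
    $1 ~ /^(slat|clat|lat)$/ && $2 ~ /^\((nsec|usec|msec)\):$/ {
      unit = substr($2, 2, 4)                     # "(usec):" -> "usec"
      for (i = 3; i <= NF; i++)
        if ($i ~ /^avg=/) { v = $i; sub(/^avg=/, "", v); sub(/,$/, "", v) }
      div = (unit == "nsec") ? 1000000 : (unit == "usec") ? 1000 : 1
      printf "%s %.2f\n", $1, v / div
    }'
}

printf '%s\n' "$sample" | avg_ms
```

For this run the helper prints `clat 17.92` and `lat 17.95`, so total latency exceeds completion latency by only about 0.03 ms.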

Analyzing the results

$ fio --name=seqwrite --rw=write --bs=1M --size=5G --numjobs=2 --direct=1 --runtime=60 --group_reporting
seqwrite: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
...
fio-3.41
Starting 2 processes
seqwrite: Laying out IO file (1 file / 5120MiB)
seqwrite: Laying out IO file (1 file / 5120MiB)
Jobs: 2 (f=2): [W(2)][100.0%][w=112MiB/s][w=112 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=2): err= 0: pid=70688: Sun Sep  7 22:37:47 2025
  write: IOPS=111, BW=111MiB/s (117MB/s)(6685MiB/60016msec); 0 zone resets
    clat (usec): min=9606, max=74763, avg=17922.82, stdev=1446.89
     lat (usec): min=9628, max=74791, avg=17951.49, stdev=1446.80
    clat percentiles (usec):
     |  1.00th=[15008],  5.00th=[16319], 10.00th=[16909], 20.00th=[17433],
     | 30.00th=[17695], 40.00th=[17695], 50.00th=[17957], 60.00th=[17957],
     | 70.00th=[18220], 80.00th=[18482], 90.00th=[19006], 95.00th=[19268],
     | 99.00th=[21365], 99.50th=[22414], 99.90th=[32375], 99.95th=[40109],
     | 99.99th=[74974]
   bw (KiB/s): min=96062, max=116736, per=100.00%, avg=114150.20, stdev=1061.35, samples=238
   iops        : min=   92, max=  114, avg=110.77, stdev= 1.18, samples=238
  lat (msec)   : 10=0.03%, 20=97.43%, 50=2.53%, 100=0.01%
  cpu          : usr=0.22%, sys=0.75%, ctx=6717, majf=0, minf=67
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,6685,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=111MiB/s (117MB/s), 111MiB/s-111MiB/s (117MB/s-117MB/s), io=6685MiB (7010MB), run=60016-60016mse
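
The summary line reports each figure in two units, binary (MiB) and decimal (MB). As a sanity check, the numbers can be recomputed from the io= and run= fields; the sketch below uses only values printed in the output above, and `bw_check` is an illustrative name.

```shell
#!/bin/bash
# Recompute the summary figures from io=6685MiB and run=60016msec
bw_check() {
  local io_mib=$1 run_ms=$2
  LC_ALL=C awk -v io="$io_mib" -v ms="$run_ms" 'BEGIN {
    mib_s = io / (ms / 1000)                  # MiB written per second
    printf "%.1f MiB/s\n", mib_s
    printf "%.0f MB/s\n", mib_s * 1.048576    # 1 MiB = 1.048576 MB
    printf "%.0f MB\n", io * 1.048576
  }'
}

bw_check 6685 60016
```

This prints 111.4 MiB/s, 117 MB/s and 7010 MB, matching the bw=111MiB/s (117MB/s) and io=6685MiB (7010MB) values that fio reported.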

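🧾 Test configuration
The analysis below is for a separate single-job run (numjobs=1, size=10G), not the two-job output shown above:
fio --name=seqwrite --rw=write --bs=1M --size=10G --numjobs=1 --direct=1 --runtime=60 --group_reporting
Parameter	Meaning
rw=write	Sequential write
bs=1M	Each write uses a 1 MiB block
numjobs=1	Single-threaded write
direct=1	Uses direct I/O, bypassing the page cache
ioengine=psync	Synchronous I/O (one pwrite() per request)
iodepth=1	Only one I/O request outstanding at a time (the default in synchronous mode)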
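📊 Performance overview
Metric	Value	Notes
IOPS	96	96 write operations per second
Bandwidth	96.2 MiB/s (101 MB/s)	About 96 MiB written per second
Total written	5772 MiB	Amount written within 60 seconds
Latency (avg clat)	10.36 ms	Average completion time per write
CPU usage	usr=0.46%, sys=1.23%	CPU load is very low; the bottleneck is not the CPU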
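⏱ Latency distribution
50% of writes complete in under 9.9 ms

95% of writes complete in under 12.5 ms

99.95% of writes complete within 22.9 ms

The slowest write took 28.2 ms

Tail latency is somewhat higher, which suggests the disk occasionally responds more slowly, possibly because of internal cache flushes or seeks.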

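📈 Bandwidth variation
Average bandwidth: about 96 MiB/s

Minimum bandwidth: 60 MiB/s

Maximum bandwidth: 104 MiB/s

Standard deviation: 6.2 MiB/s → bandwidth is fairly stable, with minor fluctuation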

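🧠 A closer look
✅ Why is IOPS ≈ bandwidth (MiB/s)?
Because bs=1M, every write moves 1 MiB of data, so:

IOPS × Block Size = Bandwidth
96 IOPS × 1 MiB = 96 MiB/s
✅ Why direct I/O?
It bypasses the page cache, so the test measures the disk's real physical performance rather than being "fooled" by memory acceleration.

✅ Why psync?
psync writes synchronously, one pwrite() call per request, which suits simulating the write behavior of databases or logging systems. It cannot keep several requests outstanding at once, though, and that limits throughput.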
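
The IOPS × block size relation holds for every row in the earlier tables, not just the 1 MiB case. Here is a minimal sketch of the calculation, with `bw_mib` as an illustrative helper that takes the block size in KiB.

```shell
#!/bin/bash
# Bandwidth in MiB/s from IOPS and block size in KiB
bw_mib() {
  LC_ALL=C awk -v iops="$1" -v kib="$2" 'BEGIN { printf "%.2f\n", iops * kib / 1024 }'
}

bw_mib 96 1024   # the run analyzed here: 96 IOPS at bs=1M
bw_mib 1884 4    # single-threaded write at bs=4k from the block-size table
```

These print 96.00 and 7.36, which agree with the roughly 96 MiB/s of this run and the 7.36 MiB/s in the write table.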

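📌 Bottleneck analysis
Disk type: for an HDD this result (96 MiB/s) is entirely reasonable; for an SSD it would be low, possibly limited by synchronous I/O or the single thread.

I/O engine limit: psync is blocking and cannot exploit the disk's concurrency.

Thread count limit: with only one thread, the disk may not be fully utilized.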