Monitoring InnoDB Dirty Pages and Page Flushing
InnoDB dirty pages are buffer pool pages that have been modified in memory but not yet written back to their tablespace files on disk. Every write operation (INSERT, UPDATE, DELETE) first modifies the relevant buffer pool pages and logs the change to the redo log; the actual write to the tablespace files is performed asynchronously by the InnoDB page cleaner threads. The number of dirty pages directly determines how much work crash recovery would require (the changes held in lost dirty pages must be replayed from the redo log) and how long a sharp checkpoint stall would last. The BUFFER POOL AND MEMORY section of SHOW ENGINE INNODB STATUS reports "Modified db pages", the current dirty page count. This guide explains how to monitor dirty page accumulation, control the flush rate to prevent I/O spikes, and diagnose write performance issues caused by checkpoint pressure.
What Dirty Pages Are and Why They Matter
Modified db pages vs Total Buffer Pool Pages
The "Modified db pages" count in the BUFFER POOL AND MEMORY section shows the current number of dirty pages in the buffer pool. Dividing this by the total "Database pages" count gives the dirty page ratio — the fraction of cached data awaiting flush. A dirty ratio below 25% is typically healthy; above 75%, the flush system is struggling to keep pace with writes and checkpoint stalls become likely. Track the "Pages flushed up to" LSN value alongside the "Modified db pages" count: if flushed-up-to LSN is advancing slowly relative to the current LSN, the page cleaner is not keeping pace. The SHOW STATUS variables Innodb_buffer_pool_pages_dirty and Innodb_buffer_pool_bytes_dirty provide the same information and can be monitored continuously with Prometheus mysqld_exporter.
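The dirty-ratio calculation above can be sketched as a small monitoring helper. This is a minimal sketch assuming the SHOW GLOBAL STATUS counters have already been fetched into a dict; the sample values are illustrative, not from a real server, and the 25%/75% thresholds are the rough guidelines stated above.

```python
# Sketch: compute the buffer pool dirty-page ratio from SHOW GLOBAL STATUS
# counters (Innodb_buffer_pool_pages_dirty / Innodb_buffer_pool_pages_total).

def dirty_ratio(status: dict) -> float:
    """Fraction of buffer pool pages currently dirty."""
    dirty = int(status["Innodb_buffer_pool_pages_dirty"])
    total = int(status["Innodb_buffer_pool_pages_total"])
    return dirty / total

# Illustrative values: an 8 GiB pool of 16 KiB pages, one quarter dirty.
status = {
    "Innodb_buffer_pool_pages_dirty": "131072",
    "Innodb_buffer_pool_pages_total": "524288",
}

ratio = dirty_ratio(status)
print(f"dirty ratio: {ratio:.1%}")  # 25.0% here: at the healthy/watch boundary
if ratio > 0.75:
    print("WARNING: flush system falling behind, checkpoint stalls likely")
```

In production the same two counters would come from mysqld_exporter, making the ratio a single PromQL expression rather than a script.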
Pages Flushed Up To: Tracking the Flush Frontier
"Pages flushed up to" in the LOG section of SHOW ENGINE INNODB STATUS is the LSN up to which all page modifications have been flushed to disk: every page still dirty in the buffer pool was first modified at an LSN newer than this value. The difference between "Log sequence number" and "Pages flushed up to", the flush frontier lag, measures the volume of redo log whose corresponding dirty pages are still outstanding. A growing flush frontier lag indicates that the page cleaner is falling behind the write rate, which will eventually trigger adaptive flushing at elevated intensity and, if left unaddressed, synchronous checkpoint stalls. Monitor the flush frontier lag alongside innodb_io_capacity utilisation; if utilisation is near 100% and the lag is still growing, the storage subsystem itself is the bottleneck.
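The flush frontier lag can be extracted directly from the SHOW ENGINE INNODB STATUS text. A minimal sketch, assuming the status output has already been captured as a string; the LSN values in the sample are fabricated for illustration.

```python
import re

# Sketch: pull "Log sequence number" and "Pages flushed up to" out of the
# LOG section of SHOW ENGINE INNODB STATUS and compute the flush frontier
# lag in bytes of redo log.

SAMPLE = """
---
LOG
---
Log sequence number          58473219840
Log buffer assigned up to    58473219840
Pages flushed up to          58471122944
Last checkpoint at           58470973440
"""

def flush_frontier_lag(status_text: str) -> int:
    lsn = int(re.search(r"Log sequence number\s+(\d+)", status_text).group(1))
    flushed = int(re.search(r"Pages flushed up to\s+(\d+)", status_text).group(1))
    return lsn - flushed

lag = flush_frontier_lag(SAMPLE)
print(f"flush frontier lag: {lag} bytes ({lag / 1024**2:.1f} MiB)")
```

Graphed over time, a steadily rising value of this lag is the earliest warning that the page cleaner is losing ground.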
Controlling the Flush Rate
innodb_io_capacity and innodb_io_capacity_max Settings
innodb_io_capacity (default 200 IOPS) tells the InnoDB page cleaner how many I/O operations per second it may use for background flushing. The default of 200 is calibrated for spinning hard disks from 2010 and is severely undersized for modern SSD and NVMe storage. Set innodb_io_capacity to 2000–5000 for SATA SSDs and 10000–20000 for NVMe drives, based on the drive's rated random write IOPS. innodb_io_capacity_max (default 2000) caps emergency flushing during checkpoint pressure — set it to 2–4× innodb_io_capacity. Undersizing innodb_io_capacity causes dirty page accumulation and periodic checkpoint stalls; oversizing wastes I/O bandwidth on unnecessary background flushing that competes with foreground queries.
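The sizing guidance above can be expressed as a small helper. This is a sketch under stated assumptions: the 25% headroom factor (reserving most of the device for foreground I/O) and the 2x emergency multiplier are starting points to tune, not InnoDB constants.

```python
# Sketch: derive innodb_io_capacity / innodb_io_capacity_max from a drive's
# rated random write IOPS. Both tuning factors below are assumptions.

def io_capacity_settings(rated_write_iops: int, headroom: float = 0.25,
                         max_multiplier: int = 2) -> dict:
    # Give the page cleaner a fraction of the rated IOPS as its
    # steady-state budget; cap emergency flushing at a multiple of that.
    capacity = int(rated_write_iops * headroom)
    return {
        "innodb_io_capacity": capacity,
        "innodb_io_capacity_max": capacity * max_multiplier,
    }

# e.g. a SATA SSD rated around 16,000 random write IOPS:
print(io_capacity_settings(16000))
# {'innodb_io_capacity': 4000, 'innodb_io_capacity_max': 8000}
```

The resulting values would then be applied with SET PERSIST or in my.cnf; benchmark with your own workload before committing to them.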
innodb_max_dirty_pages_pct and Flush Trigger Thresholds
innodb_max_dirty_pages_pct (default 90%) sets the target maximum dirty page ratio in the buffer pool. When the dirty page percentage exceeds this threshold, InnoDB flushes aggressively until the ratio drops back below the target. innodb_max_dirty_pages_pct_lwm (default 10%) is the low watermark below which percentage-driven flushing is disabled entirely; between the two thresholds, adaptive flushing paces background I/O. The gap between these two values defines the normal operating range for dirty pages. Reducing innodb_max_dirty_pages_pct to 75% on write-heavy servers triggers flushing earlier, reducing the probability of checkpoint stalls at the cost of slightly more background I/O. On read-heavy servers, keeping the threshold at 90% is fine, since writes are infrequent enough that the checkpoint age never approaches its critical limit.
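The three flushing regimes implied by these two thresholds can be sketched as a classifier, useful for annotating monitoring dashboards. The defaults shown are the MySQL 8.0 values described above; substitute your configured settings.

```python
# Sketch: classify the current dirty-page percentage against
# innodb_max_dirty_pages_pct_lwm and innodb_max_dirty_pages_pct.

def flush_regime(dirty_pct: float, lwm: float = 10.0, max_pct: float = 90.0) -> str:
    if dirty_pct < lwm:
        return "idle"        # below low watermark: pct-based flushing off
    if dirty_pct < max_pct:
        return "adaptive"    # normal range: adaptive flushing paces I/O
    return "aggressive"      # above target: flush hard until back below max_pct

print(flush_regime(5.0))    # idle
print(flush_regime(40.0))   # adaptive
print(flush_regime(93.0))   # aggressive
```

A server that spends most of its time in "aggressive" needs either a lower write rate, more io_capacity, or a larger redo log.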
Diagnosing Write Performance Issues
I/O Spike Patterns from Checkpoint Stalls
Checkpoint stall-induced I/O spikes appear in application monitoring as sudden latency increases affecting all write queries simultaneously, typically lasting 100–500 ms. The pattern is distinctive: latency is normal for minutes, spikes sharply, then returns to normal, repeating every few minutes. This sawtooth indicates that InnoDB runs at its maximum innodb_io_capacity during each spike (flushing aggressively to advance the checkpoint) and then idles until the checkpoint age builds up again. The fix is to set innodb_io_capacity and innodb_io_capacity_max high enough that continuous background flushing keeps the checkpoint age below roughly 50% of the redo log capacity, eliminating the cycle of idle accumulation and emergency flushing. Run iostat -x 1 during a spike to confirm that disk write throughput saturates while the stall lasts.
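Spotting this sawtooth in latency telemetry can be automated crudely. A minimal sketch, assuming per-second write latency samples in milliseconds; the 5x-over-median threshold is an assumption for illustration, not an InnoDB constant.

```python
import statistics

# Sketch: flag checkpoint-stall-style latency spikes in a series of write
# latency samples (ms): samples far above the median baseline.

def find_spikes(latencies_ms: list, threshold: float = 5.0) -> list:
    baseline = statistics.median(latencies_ms)
    return [i for i, v in enumerate(latencies_ms) if v > baseline * threshold]

# Steady ~2 ms writes with periodic ~250-400 ms stalls:
samples = [2, 2, 3, 2, 250, 2, 2, 2, 3, 400, 2, 2]
print(find_spikes(samples))  # [4, 9]
```

If the flagged indices recur at a near-constant interval, that periodicity is the signature of the idle-accumulate-flush cycle rather than a one-off storage hiccup.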
Double Write Buffer and Its Interaction with Flushing
The InnoDB doublewrite buffer protects against partial page writes (torn pages) caused by a crash mid-write by first writing dirty pages to the doublewrite buffer area, then writing them to their final tablespace locations. This doubles the write I/O for dirty page flushing, which can be significant on write-heavy workloads. In MySQL 8.0.20 and later, the doublewrite buffer is stored in dedicated doublewrite files rather than the system tablespace; their location is controlled by innodb_doublewrite_dir (the data directory by default), so they can be placed on a separate fast device to reduce contention with tablespace writes. On storage that natively guarantees atomic page writes (e.g., ZFS, or certain RAID controllers), innodb_doublewrite=OFF can be set safely to eliminate the 2× write amplification. Never disable doublewrite on ext4 or XFS without confirming atomic write guarantees with your storage vendor.
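The bandwidth cost of the 2x write amplification is easy to estimate. A sketch with illustrative inputs: the flush rate is hypothetical, and the page size is the default innodb_page_size of 16 KiB.

```python
# Sketch: estimate page-flush write bandwidth with and without the
# doublewrite buffer. Inputs are illustrative, not measured values.

PAGE_SIZE = 16 * 1024  # default innodb_page_size, bytes

def flush_bandwidth(pages_per_sec: int, doublewrite: bool) -> int:
    """Bytes/sec of page-flush I/O; doubled when doublewrite is on."""
    factor = 2 if doublewrite else 1
    return pages_per_sec * PAGE_SIZE * factor

# Flushing 5,000 pages/sec:
print(flush_bandwidth(5000, doublewrite=True) // 1024**2, "MiB/s")   # 156 MiB/s
print(flush_bandwidth(5000, doublewrite=False) // 1024**2, "MiB/s")  # 78 MiB/s
```

On a device whose sequential write budget is close to the doublewrite-inclusive figure, moving the doublewrite files to a separate device (or disabling doublewrite where genuinely safe) reclaims roughly half the flush bandwidth.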