You don’t pick a database because the logo looks friendly. You pick it because at 02:13, under a thundering batch job,
you need it to keep answering “yes” faster than your pager can vibrate.
The uncomfortable truth: both PostgreSQL and Percona Server (MySQL) can be fast. The difference is
how quickly you get to fast, and how many levers you must pull before reality stops punishing you.
This is a tuning-centric, production-first comparison: which engine needs more knobs for good speed, which knobs matter,
and how to diagnose bottlenecks without turning your config into a haunted house.
The blunt thesis: who needs more knobs
If you define “good speed” as respectable performance without heroic tuning, PostgreSQL usually gets there
with fewer “must-touch” settings—for a single primary, sane schema, and non-chaotic query patterns. It has knobs, sure,
but many are about predictability and guardrails rather than “flip these 17 switches or suffer.”
Percona Server (a hardened MySQL distribution, primarily InnoDB-based) can be blisteringly fast, but it more often asks you to
make explicit choices earlier: buffer pool sizing, redo log sizing, durability tradeoffs, flushing behavior, replication settings,
instrumentation overhead. Some of those defaults are safe; some are conservative; some are perfectly fine until your workload changes
and then they quietly become a tax.
So who needs more knobs? In practice:
- Percona Server tends to need more explicit tuning to reach “good speed” for write-heavy OLTP, high concurrency, and replication-heavy topologies, especially when latency matters and the dataset no longer fits in memory.
- PostgreSQL tends to need fewer knobs for “good speed” on mixed workloads and complex queries, but it demands ongoing hygiene (autovacuum, bloat control, stats) or it will age like milk in a warm server room.
The hidden kicker: the question isn’t “which has more knobs,” it’s “which knobs are unavoidable and how expensive is getting them wrong.”
Wrong InnoDB flushing settings can turn a fast system into a stuttering one. Wrong autovacuum behavior can make PostgreSQL look like it’s
“randomly slow” when it’s actually drowning in dead tuples.
First joke (short, and earned): Tuning databases is like tuning guitars—most people twist the pegs until it sounds different, not better.
A few interesting facts (and why they matter)
These aren’t trivia for a pub quiz. They explain why the tuning surface area feels different.
- PostgreSQL’s MVCC design stores old row versions in the heap until vacuum reclaims them. That’s why autovacuum tuning is performance tuning, not housekeeping.
- InnoDB’s buffer pool is the performance center of gravity for MySQL/Percona. If it’s undersized, everything becomes I/O-bound; if it’s oversized without headroom, the OS page cache and filesystem start fighting back.
- Percona Server exists because “stock MySQL” wasn’t enough for many operators: it historically emphasized observability, extra diagnostics, and performance improvements while remaining compatible. That culture shows in how many tunables and metrics you can expose.
- PostgreSQL got parallel query in major increments starting with 9.6, and it’s still a place where a few knobs (like max_parallel_workers_per_gather) can make a report query go from “coffee break” to “blink.”
- MySQL’s replication lineage pushed a lot of operational tuning: row vs statement, binlog format, sync settings, group replication variants. Topology is part of performance.
- PostgreSQL WAL and checkpoints behave differently than InnoDB redo/undo. Postgres checkpoint tuning is often about smoothing write bursts; InnoDB redo tuning is often about sustaining write throughput without thrashing.
- Percona introduced/advanced features like slow query log enhancements and extra status variables that make “measure first” more practical. The flip side: you can measure yourself into overhead if you enable everything.
- Both ecosystems learned the hard way that “bigger caches” aren’t always faster. Memory pressure, NUMA effects, and allocator behavior can turn “more RAM for DB” into “more latency for everyone.”
Baseline tuning philosophy: default behavior vs production reality
What “good speed” actually means in ops
In production, speed isn’t a single number. It’s:
- p95 and p99 latency under real concurrency
- throughput stability during compaction/vacuum/checkpoints
- predictable tail behavior during backups, replication catch-up, and failover
- time-to-diagnose when it slows down (observability as performance)
PostgreSQL defaults: polite, conservative, and sometimes naive
PostgreSQL defaults are designed for “runs on a laptop” and “doesn’t brick the machine.” They won’t automatically grab most of your RAM.
That’s good manners. It’s also why new installations can be underwhelming until you set a few key parameters.
The bright side: once you set memory and checkpoint behavior appropriately, PostgreSQL often behaves predictably. You’ll still tune
for specific query patterns and maintenance load, but the system is less likely to require a long list of “or else” toggles.
Percona Server defaults: capable, but expecting you to choose a stance
Percona Server inherits a lot from MySQL defaults, and many are safe-but-not-fast. But Percona’s audience often runs serious OLTP,
so the operational assumption is different: you will configure buffer pool, redo logs, flush behavior, and replication. If you don’t,
the database will still run—but you’ll be leaving speed on the table, and sometimes the table catches fire under load.
PostgreSQL: the knobs that actually move the needle
1) Memory: shared_buffers, work_mem, maintenance_work_mem
PostgreSQL has multiple memory domains. This confuses people who grew up on “set buffer pool and walk away.”
Your job is to avoid death by a thousand work_mem allocations while keeping enough cache and maintenance headroom.
- shared_buffers: the primary shared cache used by PostgreSQL. Common starting point: 20–30% of RAM on Linux. More is not always better; the OS page cache still matters.
- work_mem: per sort/hash node, per query, per worker. Too high and concurrency turns into swap roulette. Too low and you spill to disk. Use it intentionally.
- maintenance_work_mem: vacuum and index builds. Too low makes maintenance slow; too high can starve other work.
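A minimal postgresql.conf sketch, assuming a hypothetical dedicated 64 GB Linux host. The numbers are starting points to measure against, not recommendations:

# postgresql.conf (assumed dedicated 64 GB host; adjust after measuring)
shared_buffers = 16GB              # roughly 25% of RAM as a first pass
work_mem = 32MB                    # per sort/hash node; raise per-session for heavy reports
maintenance_work_mem = 2GB         # give vacuum and index builds their own budget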
2) WAL and checkpoints: stop the write cliff
PostgreSQL performance complaints often read like: “It’s fast, then it’s not, then it’s fast again.”
That’s usually checkpoints, background writer behavior, or storage saturation.
- checkpoint_timeout and max_wal_size: increase to reduce checkpoint frequency.
- checkpoint_completion_target: aim to spread checkpoint I/O over time.
- wal_compression: can help when WAL volume is high and CPU is available.
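A checkpoint-smoothing sketch in the same spirit, assuming you have disk headroom for a larger WAL; the sizes are illustrative:

# postgresql.conf (illustrative; size max_wal_size to your real WAL volume)
max_wal_size = 16GB                    # fewer forced checkpoints during write bursts
checkpoint_timeout = 15min             # less frequent timed checkpoints
checkpoint_completion_target = 0.9     # spread checkpoint writes across the interval
wal_compression = on                   # trade some CPU for smaller WAL when volume is high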
3) Autovacuum: either you tune it, or it tunes you
Autovacuum is a background maintenance system that prevents bloat and transaction ID wraparound. If it falls behind, you get:
bloated tables, slow queries, index inefficiency, and “mysterious” I/O.
Key levers:
- autovacuum_max_workers: more workers helps on many tables; too many can spike I/O.
- autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor: lower them for hot tables.
- autovacuum_vacuum_cost_limit and …_delay: control how aggressive it is.
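For a hot table, per-table settings usually beat global ones. A hedged example; the table name and thresholds are illustrative:

-- Make autovacuum trigger earlier and work harder on one hot table
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.02,    -- vacuum at ~2% dead tuples instead of the 20% default
    autovacuum_analyze_scale_factor = 0.01,   -- keep planner stats fresh on a busy table
    autovacuum_vacuum_cost_limit = 2000       -- allow more vacuum work per cost cycle
);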
4) Planner realism: statistics and “effective_cache_size”
PostgreSQL’s planner is often excellent, but it’s not psychic. If stats are stale or too coarse, it chooses the wrong plan.
If it believes the cache is smaller than reality, it may favor plans that do too much I/O.
- effective_cache_size: tell the planner how much cache it can assume (shared buffers + OS cache).
- default_statistics_target and per-column stats targets: raise for skewed distributions.
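A sketch of the same levers, using a hypothetical orders table; both server settings take effect on reload, no restart needed:

-- Planner inputs: values and object names are illustrative
ALTER SYSTEM SET effective_cache_size = '48GB';             -- shared_buffers plus expected OS cache
ALTER SYSTEM SET default_statistics_target = 200;           -- richer histograms, slightly slower ANALYZE
SELECT pg_reload_conf();
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;  -- per-column target for a skewed column
ANALYZE orders;                                             -- rebuild stats so the planner actually sees them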
5) Connection handling: the “too many backends” classic
PostgreSQL uses a process-per-connection model (with variations and improvements over time). High connection counts increase memory usage
and context switching. You don’t fix this with wishful thinking; you fix it with pooling.
- max_connections: don’t set it to 5000 because you can.
- Use a pooler (like pgBouncer) when your app opens many short-lived connections.
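A minimal pgBouncer sketch showing the idea: the app sees many connections, PostgreSQL sees few. Hostnames, credentials, and sizes are assumptions:

; pgbouncer.ini (illustrative)
[databases]
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; release server connections between transactions
max_client_conn = 2000         ; app-facing limit can be generous
default_pool_size = 40         ; actual PostgreSQL backends stay bounded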
Percona Server: the knobs you can’t ignore
1) InnoDB buffer pool: the big one
If you tune only one thing in Percona Server/MySQL, tune this. The buffer pool caches data and indexes; it’s where your read IOPS go to retire.
Typical guidance: 60–75% of RAM on a dedicated DB host, leaving room for the OS, connections, replication, and filesystem cache (how much page cache you still need depends on your innodb_flush_method/O_DIRECT choice).
- innodb_buffer_pool_size: set it intentionally.
- innodb_buffer_pool_instances: helps concurrency on large pools; too many instances adds overhead.
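A buffer pool sketch, assuming a dedicated 64 GB host; the exact split is a judgment call, not a formula:

# my.cnf (illustrative sizing for an assumed 64 GB dedicated host)
[mysqld]
innodb_buffer_pool_size = 44G        # ~70% of RAM; leave real headroom for OS, connections, replication
innodb_buffer_pool_instances = 8     # relevant only for large pools; more instances is not automatically faster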
2) Redo logs and flushing: durability vs throughput is not a moral debate
Write performance in InnoDB is heavily shaped by redo log sizing and flushing behavior.
You can make it fast by taking durability risks. Sometimes that’s acceptable; often it’s not.
- innodb_redo_log_capacity (or older log file sizing): too small causes frequent checkpoints and churn.
- innodb_flush_log_at_trx_commit: 1 is safest; 2 is common for performance with some risk; 0 is spicy.
- sync_binlog: binlog durability also matters if you rely on replication or point-in-time recovery.
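A durability-first sketch; the redo size is an assumption, and loosening either flush setting should be a documented business decision, not a tuning whim:

# my.cnf (innodb_redo_log_capacity needs MySQL/Percona 8.0.30+; older versions size log files instead)
[mysqld]
innodb_redo_log_capacity = 8G          # enough runway to avoid constant checkpoint churn
innodb_flush_log_at_trx_commit = 1     # fsync on commit; storage fsync latency sits in the transaction path
sync_binlog = 1                        # durable binlog if replication or PITR depends on it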
3) I/O capacity and dirty pages: teach the engine your storage
InnoDB tries to adapt, but it still needs hints. Modern NVMe behaves differently than networked SSD or cloud block devices with burst credits.
- innodb_io_capacity and innodb_io_capacity_max: set them to realistic IOPS capabilities.
- innodb_max_dirty_pages_pct and related settings: control how much dirty data can accumulate before flushing storms begin.
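An I/O-hints sketch; the numbers are placeholders and should come from measuring the actual volume, not the datasheet:

# my.cnf (placeholder numbers; benchmark the device or cloud volume first)
[mysqld]
innodb_io_capacity = 4000              # sustained IOPS that background flushing may use
innodb_io_capacity_max = 8000          # ceiling for emergency flushing
innodb_max_dirty_pages_pct = 75        # cap dirty pages before flushing gets aggressive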
4) Threading and concurrency: less “magic,” more choices
MySQL has a long history of concurrency tuning. Many workloads are fine with defaults, until they’re not.
Then you discover you’re CPU-bound on mutexes or suffering from thread scheduling overhead.
- thread_cache_size: reduces thread creation overhead.
- max_connections: too high creates memory pressure and lock contention.
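A small concurrency sketch; the limits are illustrative and only work alongside application-side pooling:

# my.cnf (illustrative limits; pair with pooling at the app tier)
[mysqld]
max_connections = 500            # a bound the host can actually serve, not a wish
thread_cache_size = 64           # avoid constant thread creation under connection churn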
5) Replication and binlog: performance is topology-aware
PostgreSQL replication tuning exists too, but MySQL/Percona operational life often revolves around replication.
Binlog format, commit order, and replica apply parallelism affect both speed and correctness expectations.
- binlog_format: ROW is common for correctness; STATEMENT can be smaller but riskier; MIXED is a compromise.
- replica_parallel_workers (or older slave_*): helps replicas catch up, but watch for contention.
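A replica-apply sketch using the newer variable names (MySQL/Percona 8.0.26+; older releases use the slave_* equivalents). The worker count is an assumption to validate against lag graphs:

# my.cnf (replica; values illustrative)
[mysqld]
binlog_format = ROW                        # correctness-first default for most fleets
replica_parallel_workers = 4               # parallel apply; raise only while watching contention
replica_preserve_commit_order = ON         # keep commit ordering while applying in parallel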
Workload patterns: where each engine “just works” (and where it doesn’t)
High-concurrency OLTP with simple queries
Percona Server (InnoDB) is in its natural habitat here. With a right-sized buffer pool and sane flush settings, it can sustain heavy write rates.
PostgreSQL can also do OLTP well, but you’ll pay attention to autovacuum and index design earlier because dead tuples accumulate under update-heavy patterns.
If your workload is “updates everywhere, all the time,” PostgreSQL needs maintenance as a first-class concern. If you treat it as set-and-forget,
your performance story will degrade over months.
Complex queries, analytics, joins, and “why is this report slow?”
PostgreSQL generally has the edge in query planner sophistication, extensibility, and the ease of expressing complex SQL.
It’s not that MySQL can’t do joins; it’s that you tend to encounter more sharp edges sooner in query planning and indexing choices.
This is where PostgreSQL tuning is less about “more knobs” and more about stats quality, proper indexes, and
avoiding plan regressions after upgrades or schema changes.
Write spikes and batch jobs
PostgreSQL checkpoint/WAL behavior can create periodic I/O bursts if poorly tuned.
Percona can also suffer from flushing storms when dirty page management is mis-set, or when storage lies about its performance.
Under batch loads, your goal is the same: smooth the write pattern, keep latency predictable, and prevent the storage device from becoming
the single point of truth and suffering.
Operational simplicity vs operational control
PostgreSQL can feel simpler because fewer settings are “mandatory.” But its operational correctness depends on vacuuming and sensible connection management.
Percona feels more “knob-rich” because it offers many explicit control points, especially around durability and I/O.
Pick based on who will run it at 02:13. If you have a team that likes explicit controls and strong observability, Percona’s ecosystem is comfortable.
If you value sane defaults and powerful SQL with less tuning surface to reach acceptable performance, PostgreSQL tends to be friendlier—assuming you commit to vacuum hygiene.
Storage and OS: the part you wanted to ignore
You can’t out-tune bad storage. You can only make it fail in more interesting ways.
Both PostgreSQL and Percona are brutally honest: if your storage latency is inconsistent, your p99 will look like a horror movie.
Latency beats throughput for most OLTP
Most OLTP workloads care more about consistent sub-millisecond to low-millisecond latency than raw MB/s.
A device that does 3 GB/s sequential but occasionally stalls for 200 ms is a prankster, not a database disk.
Filesystem and mount choices matter (but less than you fear)
Modern Linux filesystems are fine. What matters is aligning your DB’s I/O pattern with the storage stack:
- Watch out for double caching (DB cache + OS cache) when memory is tight.
- Be careful with aggressive writeback settings that create periodic stalls.
- Understand whether you’re on local NVMe, network block storage, or “cloud SSD” with burst behavior.
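A quick, read-only way to see the kernel writeback and swap knobs behind those periodic stalls; the output below shows common distro defaults, yours may differ:

cr0x@server:~$ sysctl vm.dirty_ratio vm.dirty_background_ratio vm.swappiness
vm.dirty_ratio = 20
vm.dirty_background_ratio = 10
vm.swappiness = 60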
One operational quote to keep you honest
Hope is not a strategy.
— General Gordon R. Sullivan
It’s a management quote, but SREs adopted it because it hurts in a useful way. Don’t “hope” your defaults are fine. Measure and decide.
Practical tasks: commands, outputs, and decisions (12+)
These are the tasks you run when a system is slow and you need to decide what to change without starting a configuration folklore club.
Each task includes: a command, typical output, what it means, and the decision you make.
Task 1: Confirm whether you’re CPU-bound or I/O-bound (Linux)
cr0x@server:~$ iostat -xz 1 3
Linux 6.5.0 (db1) 12/30/2025 _x86_64_ (16 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
32.10 0.00 7.90 18.50 0.00 41.50
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
nvme0n1 420.0 28500.0 10.0 2.3 3.10 67.9 780.0 112000.0 50.0 6.0 18.20 143.6 15.2 98.0
Meaning: High %iowait and near-100% device utilization with high w_await points to a write-latency problem.
This is not a “more indexes” moment.
Decision: Prioritize write path tuning (Postgres checkpoints/WAL, InnoDB flushing/redo) and storage investigation before query tweaks.
Task 2: Check memory pressure and swapping (Linux)
cr0x@server:~$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
3 0 0 120432 98200 8123456 0 0 120 2400 920 2100 35 8 40 17 0
6 1 8192 20480 11000 7012000 0 256 1024 8900 1200 4800 42 10 25 23 0
Meaning: Swap-out (so) activity and dropping free memory under load suggest memory overcommit.
For PostgreSQL this often means work_mem multiplied by concurrency. For MySQL, oversized buffer pool plus high connections can do it.
Decision: Reduce per-connection memory risk (pooling, lower work_mem, lower max_connections) before “adding RAM” as a reflex.
Task 3: PostgreSQL — see top waits (what are backends stuck on?)
cr0x@server:~$ psql -X -c "SELECT wait_event_type, wait_event, count(*) FROM pg_stat_activity WHERE state='active' GROUP BY 1,2 ORDER BY 3 DESC LIMIT 10;"
wait_event_type | wait_event | count
-----------------+---------------------+-------
IO | DataFileRead | 18
Lock | transactionid | 7
LWLock | WALWrite | 5
Meaning: Lots of DataFileRead means cache misses / I/O-bound reads. WALWrite indicates WAL write pressure.
transactionid lock waits can signal contention or long transactions.
Decision: If IO dominates, inspect cache hit rates and storage latency. If WALWrite dominates, tune WAL/checkpoints and confirm fsync latency.
Task 4: PostgreSQL — check buffer cache hit ratio (directional, not a religion)
cr0x@server:~$ psql -X -c "SELECT datname, blks_hit, blks_read, round(100.0*blks_hit/nullif(blks_hit+blks_read,0),2) AS hit_pct FROM pg_stat_database ORDER BY hit_pct ASC;"
datname | blks_hit | blks_read | hit_pct
-----------+----------+-----------+---------
appdb | 93210012 | 8200455 | 91.93
postgres | 1023012 | 12045 | 98.84
Meaning: 92% can be fine or awful depending on workload. The point is trend and correlation with latency.
A sudden drop suggests working set growth, bad query patterns, or insufficient memory.
Decision: If hit rate drops and reads spike, consider index improvements or more memory (shared_buffers + OS cache), but validate first with query plans.
Task 5: PostgreSQL — identify autovacuum lag and bloat risk
cr0x@server:~$ psql -X -c "SELECT relname, n_dead_tup, n_live_tup, last_autovacuum, last_autoanalyze FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 10;"
relname | n_dead_tup | n_live_tup | last_autovacuum | last_autoanalyze
----------------+------------+------------+-------------------------+-------------------------
orders | 8923012 | 40211234 | | 2025-12-30 00:41:13+00
order_items | 6112001 | 88012210 | 2025-12-29 22:10:02+00 | 2025-12-30 00:40:59+00
sessions | 4100122 | 1802210 | | 2025-12-30 00:40:15+00
Meaning: High dead tuples and missing last_autovacuum on hot tables is a red flag. Queries slow down because indexes and heap scans get fatter,
and visibility checks get more expensive.
Decision: Tune autovacuum per-table (scale factors, cost limits) and fix long transactions that prevent cleanup.
Task 6: PostgreSQL — find long transactions blocking vacuum
cr0x@server:~$ psql -X -c "SELECT pid, usename, now()-xact_start AS xact_age, state, left(query,80) AS q FROM pg_stat_activity WHERE xact_start IS NOT NULL ORDER BY xact_age DESC LIMIT 5;"
pid | usename | xact_age | state | q
------+---------+-------------+--------+--------------------------------------------------------------------------------
 8421 | app     | 03:12:44    | idle in transaction | BEGIN; SELECT * FROM customers WHERE id=...
9122 | app | 00:18:02 | active | UPDATE orders SET status='paid' WHERE id=...
Meaning: An “idle in transaction” session holding snapshots can prevent vacuum from removing dead tuples,
leading to bloat and performance decay.
Decision: Fix the application transaction scope. Kill sessions only as an emergency measure, and then fix the code.
Task 7: PostgreSQL — check checkpoint behavior
cr0x@server:~$ psql -X -c "SELECT checkpoints_timed, checkpoints_req, checkpoint_write_time, checkpoint_sync_time, buffers_checkpoint FROM pg_stat_bgwriter;"
checkpoints_timed | checkpoints_req | checkpoint_write_time | checkpoint_sync_time | buffers_checkpoint
-------------------+-----------------+-----------------------+----------------------+--------------------
120 | 95 | 18234012 | 3100221 | 90211234
Meaning: High checkpoints_req means you’re hitting WAL size limits and forcing checkpoints.
Big checkpoint write/sync times correlate with latency spikes.
Decision: Increase max_wal_size, tune checkpoint_timeout, and smooth with checkpoint_completion_target.
If storage can’t keep up, fix storage or lower write amplification (batching, fewer indexes, schema changes).
Task 8: MySQL/Percona — see what threads are waiting on (quick view)
cr0x@server:~$ mysql -e "SHOW PROCESSLIST\G" | head -n 40
*************************** 1. row ***************************
Id: 12091
User: app
Host: 10.0.1.25:52144
db: appdb
Command: Query
Time: 12
State: Waiting for table metadata lock
Info: ALTER TABLE orders ADD COLUMN note TEXT
*************************** 2. row ***************************
Id: 12102
User: app
Host: 10.0.1.18:51902
db: appdb
Command: Query
Time: 10
State: update
Info: UPDATE orders SET status='paid' WHERE id=...
Meaning: Metadata lock waits can freeze traffic during schema changes. This can look like “the database is slow,” but it’s really “DDL is blocking.”
Decision: Use online schema change approaches where required; schedule DDL; reduce long-running transactions that hold locks.
Task 9: MySQL/Percona — check InnoDB buffer pool health
cr0x@server:~$ mysql -e "SHOW ENGINE INNODB STATUS\G" | sed -n '1,120p'
*************************** 1. row ***************************
Type: InnoDB
Name:
Status:
=====================================
2025-12-30 01:12:09 INNODB MONITOR OUTPUT
=====================================
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 274877906944
Buffer pool size 16777216
Free buffers 1024
Database pages 16776190
Old database pages 6192000
Modified db pages 210400
Pages read 98200123, created 2210021, written 81230111
...
Meaning: Very low free buffers and high read rates can be normal under load, but if reads are high and latency is high,
the pool may be too small for the working set or the query patterns are forcing scans.
Decision: If memory allows, increase innodb_buffer_pool_size. If not, fix queries and indexes first; don’t just starve the OS.
Task 10: MySQL/Percona — confirm redo log pressure and flushing
cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_os_log_fsyncs'; SHOW GLOBAL STATUS LIKE 'Innodb_log_waits'; SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';"
+---------------------+----------+
| Variable_name | Value |
+---------------------+----------+
| Innodb_os_log_fsyncs| 18220031 |
+---------------------+----------+
+------------------+-------+
| Variable_name | Value |
+------------------+-------+
| Innodb_log_waits | 12044 |
+------------------+-------+
+--------------------------------+-------+
| Variable_name | Value |
+--------------------------------+-------+
| innodb_flush_log_at_trx_commit | 1 |
+--------------------------------+-------+
Meaning: Innodb_log_waits indicates sessions waiting for redo log space / flush behavior.
With innodb_flush_log_at_trx_commit=1, fsync latency is directly in the transaction path.
Decision: If waits are significant, increase redo capacity and validate storage fsync latency. Consider durability tradeoffs only with a clear RPO/RTO agreement.
Task 11: MySQL/Percona — find top statements quickly (Performance Schema summary)
cr0x@server:~$ mysql -e "SELECT DIGEST_TEXT, COUNT_STAR, ROUND(SUM_TIMER_WAIT/1e12,2) AS total_s FROM performance_schema.events_statements_summary_by_digest ORDER BY SUM_TIMER_WAIT DESC LIMIT 3\G"
*************************** 1. row ***************************
DIGEST_TEXT: SELECT * FROM orders WHERE customer_id = ? ORDER BY created_at DESC LIMIT ?
COUNT_STAR: 1822011
total_s: 9120.55
*************************** 2. row ***************************
DIGEST_TEXT: UPDATE order_items SET status = ? WHERE order_id = ?
COUNT_STAR: 801122
total_s: 6122.17
Meaning: High total time can be “called often” or “each call is slow.” Digest summaries help you pick battles.
Decision: Take the worst digests and run EXPLAIN, check indexes, and measure rows examined vs rows returned.
Task 12: PostgreSQL — find top total-time queries (pg_stat_statements)
cr0x@server:~$ psql -X -c "SELECT query, calls, round(total_exec_time::numeric,1) AS total_ms, round(mean_exec_time::numeric,2) AS mean_ms FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;"
query | calls | total_ms | mean_ms
-----------------------------------------------------------+--------+----------+---------
SELECT * FROM orders WHERE customer_id=$1 ORDER BY ... | 1822011| 812034.2 | 0.45
UPDATE order_items SET status=$1 WHERE order_id=$2 | 801122| 610221.7 | 0.76
Meaning: Same story as MySQL digests: find the real time sinks. PostgreSQL gives you mean time and total time; both matter.
Decision: For high-call queries, look for index support and avoid unnecessary row fetch. For slow mean time, inspect plans and I/O.
Task 13: PostgreSQL — inspect a query plan for “lies” and misestimates
cr0x@server:~$ psql -X -c "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id=123 ORDER BY created_at DESC LIMIT 50;"
Limit (cost=0.43..12.86 rows=50 width=128) (actual time=0.210..4.882 rows=50 loops=1)
Buffers: shared hit=120 read=320
-> Index Scan Backward using idx_orders_customer_created on orders (cost=0.43..9012.12 rows=35210 width=128) (actual time=0.207..4.865 rows=50 loops=1)
Index Cond: (customer_id = 123)
Buffers: shared hit=120 read=320
Planning Time: 0.220 ms
Execution Time: 4.910 ms
Meaning: The read=320 buffers indicate physical reads during execution. If this query is hot and that number grows, you’re I/O-bound.
Misestimates show up when “rows=” is wildly off from “actual rows=”.
Decision: If misestimates are severe, increase stats target or add extended statistics. If reads dominate, increase cache or reduce working set with better indexes.
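When the misestimate comes from correlated columns, extended statistics are the usual next step. A hedged sketch reusing the illustrative orders table:

-- Teach the planner that these columns are correlated (object names are illustrative)
CREATE STATISTICS orders_customer_created_stats (ndistinct, dependencies)
    ON customer_id, created_at FROM orders;
ANALYZE orders;   -- extended statistics are only populated by ANALYZE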
Task 14: MySQL/Percona — check replication lag and apply pressure
cr0x@server:~$ mysql -e "SHOW REPLICA STATUS\G" | egrep 'Seconds_Behind_Source|Replica_SQL_Running|Replica_IO_Running|Last_SQL_Error'
Replica_IO_Running: Yes
Replica_SQL_Running: Yes
Seconds_Behind_Source: 87
Last_SQL_Error:
Meaning: Lag without errors means the replica can’t apply fast enough. This can be I/O, CPU, or single-threaded apply constraints.
Decision: Increase replica apply parallelism where safe, reduce write amplification, and ensure replica storage isn’t slower than primary.
Task 15: Check filesystem latency symptoms (Linux dmesg)
cr0x@server:~$ dmesg -T | tail -n 8
[Mon Dec 30 01:11:44 2025] nvme nvme0: I/O 123 QID 6 timeout, completion polled
[Mon Dec 30 01:11:44 2025] blk_update_request: I/O error, dev nvme0n1, sector 182291234 op 0x1:(WRITE) flags 0x800 phys_seg 32 prio class 0
Meaning: If the kernel logs timeouts or errors, stop blaming the database.
Decision: Treat as an infrastructure incident: replace device, fix controller/firmware, validate cloud volume health, and only then revisit DB knobs.
Fast diagnosis playbook
When things are slow, you do not start by editing configs. You start by proving where time goes.
This playbook is tuned for “production is on fire, but we still want to be adults about it.”
First: Is it the database, or is the database waiting on something else?
- Check storage latency and saturation (iostat -xz, cloud volume metrics). If %util is pegged and awaits are high, your “slow query” may be a “slow disk” story.
- Check memory pressure (vmstat, swapping). Swapping turns both engines into tragic performance art.
- Check CPU saturation (load average vs CPU count, top, pidstat). If CPU is pegged, ask: is it query execution, background maintenance, or compression/encryption overhead?
Second: Are sessions waiting on locks or on I/O?
- PostgreSQL: pg_stat_activity waits (IO vs Lock vs LWLock). Lots of Lock waits means contention; lots of IO waits means cache/storage.
- Percona: SHOW PROCESSLIST state + InnoDB status. Metadata locks, row locks, and log waits tell different stories.
Third: Identify top queries by total time, not by vibes
- PostgreSQL: pg_stat_statements, then EXPLAIN (ANALYZE, BUFFERS) on the worst offenders.
- Percona: Performance Schema digest summary, then EXPLAIN and index review.
Fourth: Check the engine-specific “silent killers”
- PostgreSQL: autovacuum lag, dead tuples, long transactions, checkpoint frequency.
- Percona: redo log waits, buffer pool misses, flushing storms, replication lag causing load on primary.
Second joke (and we’re done): If your database is slow and you haven’t checked disk latency, you’re basically debugging with interpretive dance.
Three corporate-world mini-stories (anonymized, technically real)
Mini-story 1: The incident caused by a wrong assumption
A mid-size SaaS company ran PostgreSQL for transactional traffic and had recently added a “customer activity timeline” feature.
It was a classic append-heavy table with periodic updates for status changes. Everything looked fine in staging.
In production, p99 latency started creeping up over weeks. The team assumed it was “just growth” and planned a vertical scale.
They bumped CPU and RAM, patted themselves on the back, and waited for the graphs to behave.
They didn’t. The extra RAM helped for a while, then the system returned to its slow drift. During the next incident, someone finally ran
pg_stat_user_tables and saw a mountain of dead tuples on a handful of hot tables. Autovacuum wasn’t keeping up.
Worse: there were long-lived transactions from a background worker that held snapshots open for hours while it processed a queue.
The wrong assumption was simple: “Autovacuum is automatic, so it’s handled.” It is automatic the way laundry is automatic if you own a washing machine.
You still have to put clothes in it, and you really should not leave wet towels in there for a week.
The fix was boring and surgical: shorten transaction scope in the worker, lower vacuum scale factors on the hot tables, raise autovacuum worker count,
and schedule a controlled VACUUM (FULL) only where absolutely necessary. Performance stabilized. No heroic hardware purchase required.
Mini-story 2: The optimization that backfired
An e-commerce platform ran Percona Server with heavy write traffic: orders, payments, inventory reservations. They were chasing higher throughput,
and someone proposed a “safe” change: increase the buffer pool from “large” to “nearly all RAM.” The idea was to reduce disk reads.
The change went out during a low-traffic window. Nothing exploded. The next day, under peak load, the system started stalling.
Not slowly. Stalling in a way that made application timeouts look like network issues. CPU wasn’t pegged. Disk throughput looked fine.
Everyone stared at dashboards and learned nothing.
The culprit: memory pressure. By giving InnoDB almost all RAM, they starved the OS and left insufficient headroom for per-connection memory,
replication buffers, and filesystem behavior. The host started swapping intermittently, which turned fsync and page flush into unpredictable latency spikes.
The database wasn’t “slow.” It was occasionally frozen.
Rolling back the buffer pool size fixed the immediate symptoms. The longer-term fix was more nuanced:
tune buffer pool to leave real headroom, cap max connections, and introduce connection pooling at the application tier.
They also stopped doing DDL at noon, which, while unrelated to the buffer pool, improved everyone’s happiness.
The lesson: “more cache” is not a universal good. It’s a trade. The OS is part of the system, not an annoying detail you can evict.
Mini-story 3: The boring but correct practice that saved the day
A financial services team ran both engines: PostgreSQL for reporting and MySQL/Percona for a legacy OLTP system.
They weren’t the flashiest team. They were, however, quietly effective.
Their secret weapon was a strict, dull routine: weekly review of top SQL by total time, a monthly check for bloat and vacuum health on Postgres,
and a quarterly validation of recovery time on both systems. They kept configs under version control and required a change note explaining
the intended effect and rollback plan for every tuning modification.
One day, a storage firmware issue increased write latency. Both databases slowed down, but the team identified it quickly because they had baseline
measurements and knew what “normal” fsync and checkpoint behavior looked like. They didn’t waste hours tweaking database settings to compensate for a dying device.
The operational win wasn’t heroism. It was discipline: a known-good baseline, a habit of measuring before changing, and rehearsed recovery.
The incident report was short. The weekend was mostly saved. That’s what success looks like in production—quiet.
Common mistakes: symptom → root cause → fix
1) PostgreSQL p99 spikes every few minutes
Symptom: Latency spikes correlate with bursts of write I/O.
Root cause: Checkpoints too frequent or too “spiky” due to low max_wal_size or poor checkpoint smoothing.
Fix: Increase max_wal_size, increase checkpoint_timeout within reason, set checkpoint_completion_target (often 0.7–0.9), and verify storage fsync latency.
2) PostgreSQL slowly gets worse over weeks
Symptom: Queries that were fast become consistently slower; disk usage grows unexpectedly.
Root cause: Autovacuum can’t keep up; dead tuples accumulate; long transactions prevent cleanup; stats get stale.
Fix: Tune autovacuum for hot tables (lower scale factors), fix long transactions, raise autovacuum workers carefully, and reanalyze critical tables.
3) Percona/MySQL stalls under write load with “log waits”
Symptom: Throughput collapses during peak writes; status shows log waits increasing.
Root cause: Redo log too small or storage fsync is slow; flush settings push fsync into the commit path.
Fix: Increase redo capacity, confirm NVMe/cloud volume latency, and revisit durability knobs only with an explicit business decision.
4) MySQL traffic pauses during schema change
Symptom: Many threads show “Waiting for table metadata lock.”
Root cause: DDL blocks or is blocked by long-running transactions; metadata locks serialize access.
Fix: Use online schema change methods, kill/avoid long transactions, schedule DDL in low-traffic windows, and design migrations to be lock-aware.
5) Both engines: “we increased max_connections and it got worse”
Symptom: More timeouts, higher latency, increased context switching, possible swapping.
Root cause: Concurrency overload: too many active sessions saturate CPU, memory, or lock manager; queues move from the app to the database.
Fix: Add pooling, set sane connection limits, implement backpressure, and scale out reads where possible instead of scaling connections into chaos.
6) Queries become slower after an “index optimization”
Symptom: Write latency increases; replication lag increases; CPU and I/O rise.
Root cause: Too many indexes increase write amplification; the “optimized read” isn’t worth the global cost.
Fix: Remove unused indexes, consolidate multi-column indexes, and validate with real workload stats (not guesses).
Checklists / step-by-step plan
Step-by-step: getting PostgreSQL to “good speed” with minimal knob-twisting
- Set shared_buffers to a reasonable fraction (start around 20–30% RAM on a dedicated host).
- Set effective_cache_size to reflect shared buffers + OS cache (commonly 50–75% RAM).
- Set work_mem conservatively, then selectively raise per role/session for heavy queries; don’t set it globally high.
- Enable and use pg_stat_statements to rank queries by total time.
- Tune checkpoints (max_wal_size, checkpoint_timeout, checkpoint_completion_target) to avoid I/O cliffs.
- Audit autovacuum health weekly; apply per-table settings for hot tables.
- Use connection pooling if you have many short connections or high fan-out services.
- Baseline storage latency and watch it like it’s part of your application (it is).
Step-by-step: getting Percona Server to “good speed” without magical thinking
- Size innodb_buffer_pool_size intentionally; leave headroom for OS and concurrency.
- Set redo capacity so you’re not checkpointing constantly; validate log waits.
- Decide durability explicitly: innodb_flush_log_at_trx_commit and sync_binlog aligned with business RPO.
- Set innodb_io_capacity to match storage reality, not marketing specs.
- Enable digest-based statement summaries and use them regularly.
- Keep max_connections sane; fix app connection behavior instead of inflating the limit.
- Plan replication tuning (parallel apply, binlog format) as part of performance, not an afterthought.
- Practice schema changes with lock awareness; metadata locks are not a surprise feature.
A decision checklist: “Do I need more knobs, or fewer problems?”
- If you can’t measure query hot spots, tuning is gambling. Enable the right stats first.
- If you can’t describe your durability requirements, don’t touch flush knobs.
- If you can’t bound concurrency, your database will do it for you—with latency.
- If your storage latency is inconsistent, your tuning results will be inconsistent too.
FAQ
1) Which one is faster out of the box?
Depends on the workload, but PostgreSQL often feels “fine” sooner because fewer settings are required to avoid obvious traps.
Percona/MySQL often needs buffer pool and redo/flush choices to reach its best behavior under heavy OLTP.
2) Is PostgreSQL “self-tuning” because of autovacuum?
No. Autovacuum is automatic, not omniscient. Hot tables frequently need per-table tuning, and long transactions can sabotage it.
Treat vacuum health as part of performance management.
3) Is Percona Server just MySQL with extra knobs?
It’s MySQL-compatible and historically focused on performance and observability enhancements. The “extra knobs” are often there because operators asked for control.
More control is good—until it becomes ungoverned complexity.
4) What’s the single most important PostgreSQL tuning lever?
If forced to pick one: keeping maintenance healthy—autovacuum effectiveness plus preventing checkpoint spikes. Memory matters, but bloat and write cliffs ruin everything.
5) What’s the single most important Percona/MySQL tuning lever?
innodb_buffer_pool_size. If it’s wrong, you’ll either thrash disk or starve the OS and stall. Close second: redo/flush configuration aligned with storage.
6) Should I set PostgreSQL shared_buffers to 80% RAM like a buffer pool?
Usually no. PostgreSQL benefits from OS cache and needs headroom for work_mem, maintenance, and the rest of the system.
Start smaller and measure; “all RAM to shared_buffers” is a common self-inflicted wound.
7) Can I fix slow queries by adding indexes until it’s fast?
You can, briefly, until writes and replication get slower and the buffer cache becomes a junk drawer.
Indexes are a cost. Use query stats to justify each one, and remove the dead weight.
8) Why do my PostgreSQL queries get slower even though CPU is idle?
Often I/O waits, locks, or bloat. Check waits in pg_stat_activity, look at dead tuples, and confirm storage latency.
CPU idle is not a sign of health; it’s sometimes a sign of waiting.
9) Why do my MySQL queries get “stuck” during migrations?
Metadata locks. DDL needs locks, and long transactions can hold them. Use lock-aware migration methods and keep transactions short.
10) Which needs more knobs for good speed on cloud block storage?
Percona/MySQL tends to demand more explicit I/O and durability tuning because fsync behavior is so central to commit latency.
PostgreSQL also cares deeply about fsync latency (WAL), but it’s often fewer knobs to get to stable behavior—assuming checkpoint tuning is done.
Next steps that won’t embarrass you
If you’re choosing between PostgreSQL and Percona Server strictly on “who needs more knobs,” you’re asking the right question in the wrong way.
Ask instead: what knobs are mandatory for my workload, and how quickly can my team diagnose issues when the workload shifts.
Practical next steps:
- Pick one representative workload (read-heavy OLTP, write-heavy OLTP, mixed, reporting) and benchmark with production-like concurrency.
- Enable query statistics (pg_stat_statements or Performance Schema digests) and create a weekly “top SQL” ritual.
- Establish baselines for storage latency, checkpoint/flush behavior, and replication lag. Write them down.
- Limit concurrency deliberately with pooling and sane connection caps. Don’t let the app DDoS the database politely.
- Change one thing at a time, measure, and keep a rollback plan. Version-control your config like it matters—because it does.
PostgreSQL rewards steady maintenance and good SQL hygiene. Percona Server rewards explicit choices and tight operational control.
Neither is “faster by nature.” The faster one is the one you can operate without guessing.