You copy a SELECT from MySQL to PostgreSQL (or the other way around), run it against “the same data,” and suddenly you’re staring at a progress bar that feels like it’s powered by interpretive dance. Everyone swears nothing changed. The app “only” switched drivers. The query is “simple.” Meanwhile your p95 is on fire and the CEO has rediscovered the refresh button.
This is normal. Not acceptable, but normal. MySQL and PostgreSQL can both be excellent, but they make different bets: on statistics, on concurrency control, on when to use indexes, on join choices, and on what “same” means when your schema and data distribution are slightly different. If you operate production systems, the right question isn’t “which database is faster?” It’s “why did the optimizer choose violence today?”
Fast diagnosis playbook
When a query is fast in MySQL and brutal in PostgreSQL (or vice versa), resist the temptation to start “tuning” knobs. First find which of three buckets you’re in: bad plan, blocked, or IO-bound. Then you can fix the actual problem instead of rearranging the server’s feng shui.
First: prove it’s a plan problem (or not)
- Capture the plan: Postgres with EXPLAIN (ANALYZE, BUFFERS); MySQL with EXPLAIN ANALYZE (8.0+) plus SHOW WARNINGS if needed.
- Look for cardinality lies: estimated rows vs actual rows. If it’s off by 10x, you’re already in trouble. If it’s off by 100x, you’re doing incident response.
- Identify the dominating node: the step with the highest actual time (Postgres) or highest “actual time”/loops (MySQL 8.0 analyze). That’s where you focus.
Second: eliminate blocking and long transactions
- Postgres: check pg_stat_activity for wait_event and long transactions.
- MySQL: check SHOW PROCESSLIST and InnoDB lock waits.
- In both: if there’s a writer holding locks or a long transaction preventing cleanup, you can optimize forever and still lose.
Third: decide if you’re IO-bound, CPU-bound, or memory-bound
- If the plan shows lots of buffer misses / reads, you’re IO-bound.
- If CPU is pegged and the plan shows large sorts/hashes, you’re CPU/memory-bound.
- If latency spikes correlate with checkpointing, flushing, or fsync pressure, you’re storage-bound.
Paraphrased idea from Werner Vogels: “Design for failure as a normal condition.” Apply that to query performance too—design and operate assuming plans will occasionally go wrong.
Why “the same query” is rarely the same
In production, “same query” usually means “same SQL string.” Databases care about the rest: schema details, collation rules, data types, indexes, statistics, configuration, and concurrency. A query can be logically equivalent while being physically very different.
Data types and implicit casts: the silent plan-killers
PostgreSQL is strict about types and will happily introduce an implicit cast that prevents index usage. MySQL is more permissive and sometimes “helps” by converting values; sometimes that “help” becomes a full scan with a smile.
Classic example: comparing a uuid column to a text parameter in Postgres can force a cast on the column side, making the index unusable. MySQL might compare strings and use an index if collation and prefix rules line up. Or it might not. Consistency is a feature you have to earn.
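A minimal sketch of the cast trap in Postgres, assuming a hypothetical sessions table with a uuid primary key (the literal value is made up):

-- Hypothetical table: sessions(id uuid PRIMARY KEY, ...)
-- Literal resolved to uuid: the index on id stays usable.
EXPLAIN SELECT * FROM sessions
WHERE id = '11111111-1111-1111-1111-111111111111'::uuid;

-- Cast lands on the column side (what a text-typed bind parameter can cause):
-- the plain uuid index can no longer be used.
EXPLAIN SELECT * FROM sessions
WHERE id::text = '11111111-1111-1111-1111-111111111111';

If you can’t fix the driver’s bind type, an expression index on the cast is a workaround; fixing the types is the better habit.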
Collations and text comparisons
MySQL’s collations (especially case-insensitive defaults) can change indexability and comparisons. PostgreSQL’s collation behavior depends on libc/ICU settings and can affect sort order, index operator classes, and performance. If you migrated data and didn’t migrate collations intentionally, you didn’t migrate the workload—you relocated it and hoped it would behave.
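Before blaming the optimizer, check what collations you actually ended up with; a quick sanity check (your names and defaults may differ):

-- PostgreSQL: database-level collation and ctype
SELECT datname, datcollate, datctype
FROM pg_database
WHERE datname = current_database();

-- MySQL: database and connection collations
SELECT @@collation_database, @@collation_connection;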
Parameterization and plan caching
PostgreSQL can choose a generic plan for prepared statements that is “okay on average” but bad for a common parameter value. MySQL’s behavior differs depending on the connector and whether you’re using server-side prepares. If your app changed how it binds parameters, your plan may change dramatically even if the SQL string looks identical.
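A sketch of how to watch this happen in Postgres, reusing the orders table from the tasks below; the statement name and parameter value are made up:

-- Prepare once, then execute a few times and watch the plan.
PREPARE recent_by_customer (bigint) AS
  SELECT count(*) FROM orders WHERE customer_id = $1;

EXPLAIN (ANALYZE) EXECUTE recent_by_customer(42);

-- After several executions Postgres may switch to a generic plan.
-- If a skewed parameter makes that generic plan bad, force per-call planning:
SET plan_cache_mode = force_custom_plan;  -- PostgreSQL 12+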
Joke #1: Query optimizers are like toddlers: they can be brilliant, but if you surprise them with new data distribution, they throw the plan on the floor.
Historical context: why these differences exist
- Fact 1: PostgreSQL descends from the POSTGRES project at UC Berkeley (late 1980s), which treated query planning and extensibility as first-class design goals.
- Fact 2: MySQL started in the mid-1990s with a focus on speed and simplicity for web workloads; early versions had limited subquery support and a simpler optimizer.
- Fact 3: InnoDB became the default MySQL storage engine in 5.5; before that, many deployments used MyISAM with different locking and durability characteristics.
- Fact 4: PostgreSQL’s MVCC has long been central, but it requires vacuuming to reclaim space—performance can degrade if you ignore that lifecycle.
- Fact 5: MySQL 8.0 introduced a more capable data dictionary and improved optimizer features; advice from 5.6-era blogs can be actively harmful.
- Fact 6: PostgreSQL 12+ improved partitioning and planner behavior for partitions; older versions could plan partitioned queries poorly.
- Fact 7: MySQL’s replication has historically been statement-based or row-based with different tradeoffs; read replicas can behave differently under load due to apply lag and metadata locking.
- Fact 8: PostgreSQL’s JIT (LLVM) can accelerate some queries but also add overhead; it’s workload-dependent and can surprise you in latency-sensitive systems.
Optimizer behavior that changes everything
Cardinality estimation: the database is guessing, and it guesses differently
Most “fast in DB A, slow in DB B” stories boil down to row-count estimation. The optimizer chooses join order and join algorithms based on estimated rows. If it guesses wrong, it picks the wrong weapon.
PostgreSQL: statistics come from ANALYZE and include per-column stats, most-common values, histograms, correlation, and (optionally) extended statistics across columns. It’s good, but it’s not psychic. If your predicates involve multiple correlated columns, the default stats can under-estimate badly.
MySQL: statistics come from the storage engine; InnoDB stats can be persistent or recalculated on the fly, and 8.0 adds optional histograms. Without histograms, InnoDB may assume uniform distribution and make tragic choices when data is skewed.
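Two sketches of giving the optimizer better information, using the orders table from the tasks below (column names assumed):

-- PostgreSQL: extended statistics for correlated columns, then refresh stats.
CREATE STATISTICS orders_cust_created_stats (dependencies, ndistinct)
  ON customer_id, created_at FROM orders;
ANALYZE orders;

-- MySQL 8.0: build a histogram so skew becomes visible to the optimizer.
ANALYZE TABLE orders UPDATE HISTOGRAM ON created_at WITH 100 BUCKETS;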
Join algorithms and when each engine prefers them
PostgreSQL has a strong toolkit: nested loop, hash join, merge join. It can also do bitmap index scans to combine multiple indexes. MySQL historically leaned heavily on nested loops (including block nested loop); hash joins arrived in 8.0.18, but the toolkit and the cost decisions behind it still differ.
This matters because “one big join” can be fast if the planner chooses hash join over large sets, or catastrophic if it chooses nested loop with a misestimated inner result.
CTEs, subqueries, and optimization fences
Older PostgreSQL versions treated CTEs (WITH) as optimization fences: they were materialized and prevented predicate pushdown. Newer versions can inline CTEs, but not always. MySQL treats derived tables differently and has its own rules for materialization.
If your “same SQL” uses CTEs, the planner’s interpretation can be the difference between milliseconds and “cancelled by on-call.”
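In PostgreSQL 12+ you can state your intent instead of relying on version-specific defaults; a minimal sketch using the orders table (predicate values are illustrative):

-- Let the planner inline the CTE and push predicates into it:
WITH recent AS NOT MATERIALIZED (
  SELECT id, customer_id
  FROM orders
  WHERE created_at > now() - interval '7 days'
)
SELECT count(*) FROM recent WHERE customer_id = 42;

-- AS MATERIALIZED restores the old "fence" behavior; compare both with EXPLAIN.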
Sorting, hashing, and memory knobs
Postgres uses work_mem per sort/hash operation, per query node, per parallel worker. One query can spawn multiple sorts; set it too high and you’ll memory-spike. Set it too low and you spill to disk and cry.
MySQL uses different memory structures (join buffer, sort buffer, tmp tables). It will create on-disk temporary tables depending on size and data types. Both engines can appear “CPU slow” when they’re actually “spilling to disk slowly.”
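If one reporting query genuinely needs more sort memory, scope the change instead of raising the global setting; a Postgres sketch with an illustrative value:

BEGIN;
SET LOCAL work_mem = '128MB';   -- applies only inside this transaction
EXPLAIN (ANALYZE, BUFFERS)
  SELECT customer_id, count(*) FROM orders GROUP BY customer_id;
COMMIT;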
Indexes: same idea, different sharp edges
B-tree is not a promise, it’s a contract with fine print
Both engines use B-tree indexes for most things. But how they use them differs: index-only scans, covering indexes, visibility maps, clustered-ish behavior, and whether the engine can apply conditions at the index layer.
PostgreSQL: index-only scans depend on visibility
Postgres can do an index-only scan if it can prove tuples are visible without hitting the heap. That depends on the visibility map, which depends on vacuuming. If your table churns and vacuum can’t keep up, your “covering index” isn’t covering at runtime. The plan may still say index-only, but the heap fetches will reveal the truth.
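A quick way to check whether a “covering” index is actually covering at runtime (the range is illustrative; the index on customer_id mirrors Task 3 below):

EXPLAIN (ANALYZE)
  SELECT customer_id FROM orders WHERE customer_id BETWEEN 100 AND 200;
-- Look for "Index Only Scan ... Heap Fetches: <n>" in the output.
-- A large Heap Fetches count means the visibility map is stale; vacuum fixes it:
VACUUM (ANALYZE) orders;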
MySQL/InnoDB: the primary key is the table
InnoDB stores rows clustered by the primary key. Secondary indexes point to the primary key, not directly to the row location. That means secondary index lookups may require extra hops. It also means primary key choice affects locality, page splits, and cache behavior. A UUID primary key can be fine with the right format and insertion strategy, but it’s not free.
Multi-column indexes: order matters more than people admit
MySQL and Postgres both care about the leading columns. But their optimizers differ in how they combine indexes (bitmap scans in Postgres can be very effective; MySQL has index merge but it’s not a universal substitute). If a query is fast in Postgres because it bitmap-ANDs two single-column indexes, the same schema in MySQL may require a composite index to avoid scanning.
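The same schema decision plays out differently per engine; a sketch with illustrative index names (the composite mirrors Task 4 below):

-- Postgres can bitmap-AND these two for a combined predicate:
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_created  ON orders (created_at);

-- MySQL can index-merge them, but often stays index-driven only with the composite:
CREATE INDEX idx_orders_customer_created ON orders (customer_id, created_at);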
Functional indexes and expression support
Postgres is comfortable with expression indexes (like lower(email)) and partial indexes (like WHERE deleted_at IS NULL). MySQL has functional indexes (via generated columns) and can do similar things, but the ergonomics differ. If your query relies on one engine’s “natural” pattern, porting it can be deceptively slow until you re-express the index strategy.
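A minimal sketch of each pattern, assuming a hypothetical users table with email and deleted_at columns:

-- PostgreSQL: expression index plus a partial index for the common filter.
CREATE INDEX idx_users_email_lower ON users (lower(email));
CREATE INDEX idx_users_live_created ON users (created_at) WHERE deleted_at IS NULL;

-- MySQL 8.0.13+: functional index (note the extra parentheses around the expression).
CREATE INDEX idx_users_email_lower ON users ((lower(email)));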
MVCC, vacuuming, and the hidden tax of dead rows
Both MySQL (InnoDB) and PostgreSQL use MVCC. They pay the bill differently.
PostgreSQL bloat: when “deleted” isn’t gone
In Postgres, updates create new row versions. Deletes mark rows dead. Space is reclaimed later. If autovacuum can’t keep up—because of long-running transactions, bad settings, or just too much churn—your table and indexes bloat. Plans that used to be index-friendly become IO-bound. Vacuum debt is like credit card debt: the minimum payments feel fine until they don’t.
MySQL purge and history list length
InnoDB also keeps old versions in undo logs and purges them when safe. Long transactions can grow the history list length, creating extra work for purge and increasing read overhead for some workloads. It’s a different shape of pain, but still pain.
Visibility and “why is it reading so much?”
If a Postgres query starts doing a lot of heap fetches, you often have a vacuum/visibility problem. If an InnoDB workload starts stalling on purge or shows growing undo pressure, you have transaction lifecycle problems. Neither is fixed by adding yet another index “just to be safe.”
Locks and isolation: when correctness costs latency
Different defaults, different surprises
PostgreSQL’s default isolation level is READ COMMITTED. MySQL’s default (often) is REPEATABLE READ for InnoDB. That can change what reads see and how long certain metadata and gap locks are held. The result: the same “read query” can be blocked differently depending on concurrent writes and transaction boundaries.
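Worth verifying rather than assuming, since defaults can be overridden at the server, session, or connection-pool level:

-- PostgreSQL
SHOW default_transaction_isolation;   -- typically "read committed"

-- MySQL 8.0
SELECT @@transaction_isolation;       -- typically "REPEATABLE-READ"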
Metadata locks and DDL
MySQL’s metadata locking can cause surprising stalls when DDL runs. PostgreSQL has its own locking behaviors around schema changes. The operational rule is the same: treat DDL like you treat deployments—planned, tested, and timed. If you run “quick ALTER” at noon, you’re auditioning for the on-call rotation.
Joke #2: Nothing ages you faster than an “online schema change” that turns out to be “online, as in: everyone is online watching it block.”
Caching and IO: your storage is part of the database
At scale, query time is often storage time with better branding.
Buffer pools vs shared buffers: different math, same physics
MySQL/InnoDB has the buffer pool, typically sized large (60–80% of RAM on dedicated boxes, adjusted for workload and other memory needs). PostgreSQL has shared_buffers plus the OS page cache doing heavy lifting. This affects warm vs cold behavior. A query that is “fast” on a warm MySQL buffer pool might be “brutal” on a Postgres instance that relies more on OS cache and is competing with other processes or has different memory tuning.
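When comparing warm-cache behavior across engines, at least confirm what each side was given; the MySQL arithmetic just converts bytes to MiB:

-- PostgreSQL
SHOW shared_buffers;

-- MySQL
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mib;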
Checkpoints, fsync, and write amplification
Both engines are durable by default, but their write patterns differ. PostgreSQL checkpoints can produce bursts of IO unless tuned; MySQL’s redo log and doublewrite buffer have their own IO footprints. If your storage has limited IOPS or high latency variance, the database will faithfully convert that into user-facing misery.
Storage gotchas SREs actually hit
- Cloud block storage with burst credits: “fast until it isn’t.”
- Network-attached volumes with noisy neighbors: great for backups, questionable for hot data.
- Misaligned RAID / filesystem settings: you can benchmark your way into self-deception.
- Compression and encryption: excellent features, not free.
Practical tasks: commands, outputs, and decisions (12+)
These are the things you do at 02:00 when the dashboard is red and theory is taking a smoke break. Each task includes a command, a realistic snippet of output, what it means, and what decision you make next.
Task 1: PostgreSQL — get the real plan with timing and buffers
cr0x@server:~$ psql -d appdb -c "EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT o.id FROM orders o JOIN customers c ON c.id=o.customer_id WHERE c.segment='enterprise' AND o.created_at > now()-interval '7 days';"
Hash Join (cost=1250.11..9821.44 rows=42000 width=8) (actual time=38.221..412.903 rows=39812 loops=1)
Output: o.id
Hash Cond: (o.customer_id = c.id)
Buffers: shared hit=10231 read=821
-> Seq Scan on public.orders o (cost=0.00..7260.00 rows=500000 width=16) (actual time=0.020..220.111 rows=480321 loops=1)
Output: o.id, o.customer_id
Filter: (o.created_at > (now() - '7 days'::interval))
Rows Removed by Filter: 120034
Buffers: shared hit=8121 read=701
-> Hash (cost=1025.00..1025.00 rows=18000 width=8) (actual time=37.902..37.904 rows=18210 loops=1)
Buffers: shared hit=2110 read=120
Planning Time: 1.102 ms
Execution Time: 418.771 ms
What it means: The sequential scan on orders dominates and reads 701 buffers from disk. Your filter on created_at isn’t using an index (or the planner decided not to).
Decision: Check whether an index on (created_at) or (created_at, customer_id) exists and is selective. Also check statistics and whether the last 7 days is a large fraction of the table.
Task 2: MySQL — get the plan and actual timing
cr0x@server:~$ mysql -e "EXPLAIN ANALYZE SELECT o.id FROM orders o JOIN customers c ON c.id=o.customer_id WHERE c.segment='enterprise' AND o.created_at > NOW() - INTERVAL 7 DAY\G"
*************************** 1. row ***************************
EXPLAIN: -> Nested loop inner join (cost=12453.20 rows=40000) (actual time=0.112..52.883 rows=39812 loops=1)
-> Index lookup on c using idx_segment (segment='enterprise') (cost=210.12 rows=18000) (actual time=0.041..6.233 rows=18210 loops=1)
-> Index lookup on o using idx_customer_created (customer_id=c.id, created_at>(now() - interval 7 day)) (cost=0.62 rows=3) (actual time=0.002..0.002 rows=2 loops=18210)
What it means: MySQL is driving from customers using idx_segment and then probing orders via a composite index. That’s why it’s fast.
Decision: If Postgres is scanning orders, you likely need the composite index (or better stats) to encourage an index-driven plan.
Task 3: PostgreSQL — check existing indexes on a table
cr0x@server:~$ psql -d appdb -c "\d+ orders"
Table "public.orders"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
-------------+--------------------------+-----------+----------+---------+---------+--------------+-------------
id | bigint | | not null | | plain | |
customer_id | bigint | | not null | | plain | |
created_at | timestamp with time zone | | not null | now() | plain | |
Indexes:
"orders_pkey" PRIMARY KEY, btree (id)
"idx_orders_customer_id" btree (customer_id)
What it means: There’s no index on created_at and no composite on (customer_id, created_at).
Decision: Add a composite index that matches your join + filter pattern, or a partial index if you only query recent data.
Task 4: PostgreSQL — create a composite index without taking the world down
cr0x@server:~$ psql -d appdb -c "CREATE INDEX CONCURRENTLY idx_orders_customer_created ON orders (customer_id, created_at);"
CREATE INDEX
What it means: Index build completed without blocking reads/writes (it still takes time and IO).
Decision: Re-run EXPLAIN (ANALYZE, BUFFERS). If the plan changes and latency improves, ship it. If not, you likely have a stats/selectivity issue or a different bottleneck.
Task 5: PostgreSQL — refresh statistics now (when it’s safe)
cr0x@server:~$ psql -d appdb -c "ANALYZE orders; ANALYZE customers;"
ANALYZE
ANALYZE
What it means: Planner stats were refreshed. If you loaded data recently or distribution changed, this can fix bad plans.
Decision: If estimates were wildly wrong, re-check the plan. If correlation exists across columns, consider extended statistics.
Task 6: PostgreSQL — find estimate vs actual mismatch quickly
cr0x@server:~$ psql -d appdb -c "EXPLAIN (ANALYZE) SELECT * FROM orders WHERE created_at > now()-interval '7 days';"
Seq Scan on orders (cost=0.00..7260.00 rows=5000 width=64) (actual time=0.012..189.442 rows=480321 loops=1)
Planning Time: 0.211 ms
Execution Time: 205.118 ms
What it means: The estimate (5,000 rows) is off from actual (480,321 rows) by ~96x. That’s the smoking gun.
Decision: Increase stats target for created_at, add extended stats, or redesign the predicate/indexing (e.g., partition by time).
Task 7: PostgreSQL — raise stats target for a skewed column
cr0x@server:~$ psql -d appdb -c "ALTER TABLE orders ALTER COLUMN created_at SET STATISTICS 1000; ANALYZE orders;"
ALTER TABLE
ANALYZE
What it means: Postgres will keep richer stats for created_at. Planning gets slower, execution often gets faster (worth it for hot queries).
Decision: If the query is high-frequency and sensitive, this is a reasonable trade. If it’s ad-hoc analytics, maybe not.
Task 8: MySQL — confirm index usage and access type
cr0x@server:~$ mysql -e "EXPLAIN SELECT o.id FROM orders o JOIN customers c ON c.id=o.customer_id WHERE c.segment='enterprise' AND o.created_at > NOW() - INTERVAL 7 DAY;"
id select_type table type possible_keys key key_len rows Extra
1 SIMPLE c ref idx_segment idx_segment 1022 18210 Using where
1 SIMPLE o range idx_customer_created idx_customer_created 16 39812 Using where; Using index
What it means: type=ref/range and a specific key indicate index access. “Using index” suggests a covering index for the selected columns.
Decision: If you see type=ALL unexpectedly, you’re scanning. Then you inspect index definitions and predicate sargability.
Task 9: MySQL — check InnoDB buffer pool hit rate behavior
cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_reads'; SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read_requests';"
Variable_name Value
Innodb_buffer_pool_reads 1842231
Variable_name Value
Innodb_buffer_pool_read_requests 987654321
What it means: reads are physical reads; read_requests are logical. If physical reads rise rapidly during the slow query window, you’re going to disk.
Decision: If hit rate is poor, increase buffer pool (if RAM allows), fix working set (indexes, query), or address IO latency.
Task 10: PostgreSQL — check for blocking and long transactions
cr0x@server:~$ psql -d appdb -c "SELECT pid, usename, state, wait_event_type, wait_event, now()-xact_start AS xact_age, query FROM pg_stat_activity WHERE datname='appdb' ORDER BY xact_age DESC LIMIT 5;"
pid | usename | state | wait_event_type | wait_event | xact_age | query
------+--------+--------+-----------------+---------------+----------+--------------------------------------------
9123 | app | active | Lock | transactionid | 00:41:22 | SELECT ... FROM orders ...
8877 | app | idle in transaction | Client | ClientRead | 02:13:10 | UPDATE customers SET ...
What it means: An “idle in transaction” session has been open for hours. That can block vacuum and cause bloat, and it can indirectly wreck read performance.
Decision: Fix the app transaction handling. In the short term, terminate the session if it’s safe and you understand the blast radius.
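If you do decide to intervene, prefer the gentler option first; pid 8877 is the idle-in-transaction session from the output above:

-- Cancel the current statement but keep the connection:
SELECT pg_cancel_backend(8877);

-- If the session is idle in transaction, there is nothing to cancel; terminate it:
SELECT pg_terminate_backend(8877);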
Task 11: MySQL — find lock waits and who’s blocking
cr0x@server:~$ mysql -e "SHOW PROCESSLIST;"
Id User Host db Command Time State Info
221 app 10.0.1.10:42218 appdb Query 28 Waiting for table metadata lock ALTER TABLE orders ADD COLUMN foo INT
305 app 10.0.1.11:53122 appdb Query 3 Sending data SELECT ... FROM orders ...
What it means: A DDL is waiting on a metadata lock; it can also cause other sessions to stack up depending on timing and lock ordering.
Decision: Pause/kill the DDL if it’s impacting production and re-plan it. For the future, use online schema change methods appropriate to your version.
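If the DDL has to go right now, you can stop just the statement or the whole session; id 221 is the ALTER from the processlist above:

-- Stop only the running statement, keep the connection:
KILL QUERY 221;

-- Or drop the session entirely:
KILL 221;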
Task 12: PostgreSQL — check table and index bloat indicators
cr0x@server:~$ psql -d appdb -c "SELECT relname, n_live_tup, n_dead_tup, last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 5;"
relname | n_live_tup | n_dead_tup | last_autovacuum
-------------+------------+------------+----------------------------
orders | 5200000 | 2100000 | 2025-12-29 01:12:03+00
events | 22000000 | 9800000 |
What it means: Lots of dead tuples, and one table hasn’t been autovacuumed recently (or ever). That can inflate scans and degrade index-only scans.
Decision: Investigate autovacuum settings and long transactions; consider a manual VACUUM (ANALYZE) during a maintenance window and fix the root cause.
Task 13: Linux — confirm whether you’re IO-latency bound right now
cr0x@server:~$ iostat -xz 1 3
avg-cpu: %user %nice %system %iowait %steal %idle
18.22 0.00 5.11 24.33 0.00 52.34
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s w_await aqu-sz %util
nvme0n1 820.0 65536.0 0.0 0.00 18.40 79.90 210.0 18432.0 25.12 12.33 98.70
What it means: High %util and high r_await/w_await with noticeable %iowait. Storage is saturated or slow.
Decision: Stop blaming the optimizer for physics. Reduce reads (better indexes, avoid scans), add cache, or upgrade the storage class / provisioned IOPS.
Task 14: PostgreSQL — see whether sorts/hashes spill to disk
cr0x@server:~$ psql -d appdb -c "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders ORDER BY created_at DESC LIMIT 100000;"
Sort (cost=980000.00..1030000.00 rows=5200000 width=64) (actual time=2100.551..2320.112 rows=100000 loops=1)
Sort Key: created_at DESC
Sort Method: external merge Disk: 512000kB
Buffers: shared hit=12000 read=88000
Execution Time: 2355.991 ms
What it means: “external merge Disk” means it spilled. This is often why a query is brutal in one engine: the memory setting and working set don’t match.
Decision: Fix by adding an index that matches the order (created_at DESC), reducing the result set, or carefully increasing work_mem for that workload.
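A sketch of the index-first fix; Postgres syntax shown, and MySQL 8.0 honors DESC index ordering too (it just has no CONCURRENTLY keyword):

-- Let the index hand back rows already in the requested order, newest first.
CREATE INDEX CONCURRENTLY idx_orders_created_desc ON orders (created_at DESC);

For a single-column ORDER BY, Postgres can also scan a plain ascending index backward; the DESC form mainly matters for mixed-direction, multi-column orderings.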
Three corporate mini-stories from the trenches
1) Incident caused by a wrong assumption: “An index is an index”
The company wanted to move a read-heavy service from MySQL to PostgreSQL to simplify operations. The SQL layer was “portable.” The schema was “the same.” The rollout plan was to dual-write and switch reads.
In staging, it looked fine. In production, a single endpoint started timing out. Same query. Same WHERE clause. Different reality. PostgreSQL picked a sequential scan and a hash join that looked reasonable on paper. Actual rows were nowhere near the estimate.
The root cause was mundane: in MySQL, the team had a composite index that matched the query shape. In PostgreSQL, the migration created single-column indexes but missed the composite. MySQL’s index merge and nested-loop pattern was masking the mistake; PostgreSQL wanted an index that supported both the join key and the time filter. The Postgres planner also underestimated the selectivity of the time predicate because the distribution had a cliff (a retention job had recently backfilled a week of data).
The fix wasn’t a heroic knob-twiddle. It was building the right composite index concurrently, running ANALYZE, and adding a guardrail: any query that joins and filters must have an index review. They also added extended statistics for correlated columns. After that, plans stabilized and the endpoint stopped cosplaying as a denial-of-service vector.
2) Optimization that backfired: “Let’s force the planner”
A different team had a slow analytics query in PostgreSQL. Someone discovered that disabling nested loops made the query faster on their laptop. They put SET enable_nestloop=off into the application session setup. It felt clever. It was also a trap.
For a week it looked like a win. Then traffic shifted: one customer’s dataset was much smaller than average, and nested loops were actually the right choice. With nested loops disabled, Postgres chose hash joins and big sorts, and memory usage spiked. The database started spilling to disk, IO latency climbed, and suddenly unrelated queries slowed down. The “fix” had turned one query’s performance preference into a global policy for that connection pool.
They rolled it back and did the boring work: improved stats, added an index, and rewrote the query to be more selective earlier. The lesson stuck: plan hints (or pseudo-hints via session settings) are a last resort. They age poorly. Data changes. Workloads change. Your cleverness will not outlive the next quarter.
3) Boring but correct practice that saved the day: “Measure, then change one thing”
A payments system ran both MySQL and PostgreSQL for different services. After a deployment, MySQL latency jumped for a set of queries that were normally sub-50ms. The immediate reaction was to tune buffer pool and blame the network.
The on-call followed a runbook: confirm blocking, confirm IO saturation, capture plans, compare before/after. They pulled the slow query log and found the same query now doing “Using temporary; Using filesort.” Then they checked schema diff: a migration had dropped a composite index and replaced it with two single-column indexes, under the belief that “the optimizer can combine them.” Sometimes it can. This time it didn’t.
They restored the composite index, confirmed with EXPLAIN that the plan reverted, and watched latency recover. No dramatic heroics. No parameter roulette. Just evidence, one change, and a rollback path. It’s not glamorous, but neither is missing payroll because your database is busy sorting.
Common mistakes: symptom → root cause → fix
- Symptom: Postgres uses Seq Scan, MySQL uses an index and is fast.
  Root cause: Missing composite index in Postgres, or predicate not sargable due to casts/functions.
  Fix: Create the composite index that matches join + filter; ensure predicates compare the same types; consider an expression index if needed.
- Symptom: Postgres estimates 5k rows, actually reads 500k; plan is awful.
  Root cause: Stale/insufficient statistics; correlated columns; skewed distribution.
  Fix: Run ANALYZE; raise the stats target; add extended stats; consider partitioning on time or tenant.
- Symptom: “Index-only scan” but still slow; heap fetches are huge.
  Root cause: Visibility map not set due to vacuum lag; frequent updates/deletes.
  Fix: Fix autovacuum and transaction lifetimes; run maintenance vacuum; reduce churn or redesign updates.
- Symptom: MySQL plan flips between good and bad randomly.
  Root cause: InnoDB stats recalculation/persistence differences; parameter sensitivity; missing histograms.
  Fix: Enable persistent stats; add histograms for skewed columns; stabilize with correct composite indexes.
- Symptom: Query is slow only sometimes; otherwise fine.
  Root cause: IO bursts during checkpoint/flush; cache eviction; noisy-neighbor storage latency.
  Fix: Smooth write IO (checkpoint tuning, redo sizing), improve the storage class, reduce the working set, add cache.
- Symptom: Reads blocked or timing out during deploy.
  Root cause: DDL locking / metadata locks; long transactions holding locks.
  Fix: Plan DDL windows; use online schema techniques; set timeouts; fix app transactions.
- Symptom: Postgres gets slower over weeks without code changes.
  Root cause: Table/index bloat and vacuum debt; long-lived transactions prevent cleanup.
  Fix: Audit autovacuum and transaction management; reindex periodically where needed; remove “idle in transaction.”
- Symptom: MySQL CPU spikes with “Sending data,” low disk activity.
  Root cause: Large join buffer usage, inefficient join order, missing covering index, or row-by-row lookups.
  Fix: Add covering/composite indexes; rewrite the query; verify join order with EXPLAIN ANALYZE; avoid functions on indexed columns.
Checklists / step-by-step plan
When a query is fast in MySQL and slow in PostgreSQL
- Get the plan with real execution: Postgres EXPLAIN (ANALYZE, BUFFERS). Identify the dominant node.
- Compare estimates vs actual: if off by >10x, fix stats before anything else.
- Check indexes and their shape: verify composite indexes exist for join + filter. Postgres often wants the composite where MySQL got away with index merge.
- Confirm predicate sargability: no casts on indexed columns, no WHERE date(created_at)=... without expression indexes.
- Investigate vacuum/visibility: dead tuples, heap fetches, long transactions.
- Check memory spills: sorts/hashes spilling to disk; adjust indexing or query design before raising memory settings.
- Only then tune knobs: work_mem, parallelism, autovacuum thresholds—based on measured bottlenecks.
When a query is fast in PostgreSQL and slow in MySQL
- EXPLAIN in MySQL: look for type=ALL, “Using temporary,” “Using filesort.”
- Confirm composite indexes: Postgres bitmap scans can combine indexes well; MySQL often needs a single composite index to avoid temp tables.
- Check collations and data types: string comparisons and implicit conversions can disable indexes in MySQL too.
- Inspect InnoDB stats and histograms: skewed columns without histograms can mislead the optimizer.
- Validate buffer pool health: if buffer pool reads climb during slowness, it’s IO-bound, not “bad SQL.”
- Check metadata locks: DDL or long transactions can stall innocently written queries.
Plan changes safely
- Change one thing at a time (index, stats, query rewrite, or config).
- Measure before/after with the same parameters and representative data.
- Have rollback: drop index concurrently (Postgres) or drop index (MySQL) after verifying alternatives.
- Document the reason in the migration. Future-you deserves evidence, not vibes.
FAQ
1) Is PostgreSQL slower than MySQL?
No. Both can be extremely fast. PostgreSQL often wins on complex queries and rich indexing options; MySQL often shines with straightforward OLTP patterns. The deciding factor is whether the planner has the right indexes and stats for your data distribution.
2) Why does PostgreSQL do a sequential scan when an index exists?
Because it believes the scan is cheaper. Common reasons: the predicate is not selective, the index is not aligned with the filter, statistics are wrong, or the query applies a function/cast that prevents index usage.
3) Why does MySQL choose a terrible plan even with indexes?
InnoDB statistics can be misleading on skewed data without histograms; some query shapes push MySQL into temp tables and filesorts; and sometimes the “best” index is a composite that matches the WHERE + JOIN, not two single-column indexes.
4) Should I use hints to force plans?
Avoid it unless you’re cornered. Forced plans age badly as data changes. Fix indexes, stats, and query shape first. If you must hint, isolate it, document it, and add monitoring so you know when it becomes wrong.
5) Does PostgreSQL need more maintenance than MySQL?
Different maintenance. PostgreSQL needs vacuum discipline and attention to long transactions. MySQL needs attention to InnoDB undo/purge behavior, buffer pool sizing, and operational practices around DDL and metadata locks. Neither is “set and forget” in production.
6) Why do prepared statements change performance in PostgreSQL?
Postgres may use a generic plan for prepared statements that doesn’t adapt to parameter values. If one parameter value is common and highly selective (or not), a generic plan can be consistently wrong. Sometimes using custom plans (or avoiding premature prepares) matters.
7) Why does the same query behave differently on a replica?
Replication lag, different cache warmth, different IO contention, and replay/apply overhead can all change latency. Also, replicas often have different settings (intentionally or accidentally), especially around durability and memory.
8) Should I partition tables to fix slow queries?
Partitioning can help if you’re filtering by partition key and your engine can prune partitions effectively. It also adds operational complexity. If your problem is missing indexes or bad stats, partitioning is a fancy way to avoid doing the basics.
9) Which database is more predictable for query performance?
Predictability comes from discipline: stable schemas, correct indexes, refreshed stats, controlled transaction lifetimes, and consistent configuration. Either engine can be predictable if you operate it like you mean it.
Conclusion: what to do next week
If you’re dealing with a cross-engine performance shock, do three things with intent:
- Make plans visible: capture EXPLAIN ANALYZE (and buffers in Postgres) for the slow cases, not the happy path.
- Fix the shape, not the superstition: add the composite index that matches join + filter; remove implicit casts; refresh stats; address vacuum/transaction debt.
- Operate the database like a system: check IO latency, cache health, blocking, and maintenance. The optimizer is not your enemy; it’s your employee. Give it accurate data and proper tools.
Then do the boring thing that keeps systems alive: write down what happened, why it happened, and what guardrails prevent it next time. Production doesn’t reward memory. It rewards repeatability.