WordPress doesn’t “randomly get slow.” It gets slow in very specific, repeatable ways, usually inside the database. One day your homepage loads in 400 ms, the next day it’s 8 seconds and your PHP workers are piling up like luggage at a canceled flight.
The hard part isn’t picking MySQL versus MariaDB. The hard part is catching the query killers that WordPress (and its plugins) quietly introduce: missing indexes, autoload bloat, pathological LIKEs, lock storms, tmp table thrash, and cache assumptions that were never true. You can fix most of this without rewriting the site, and without turning your database into an artisanal science project.
MySQL vs MariaDB: what actually matters for WordPress
WordPress is not a benchmark suite. It’s a collection of predictable query patterns (core) plus unpredictable query patterns (plugins) plus the occasional self-inflicted wound (custom code, bad imports, “helpful” optimizations).
Picking MySQL or MariaDB matters less than people want it to. Both run InnoDB, both can do very good things, and both can do very bad things if you let a few tables grow without guardrails.
Quick differences that show up in production
- Defaults and version behavior differ. MySQL 8 defaults (and optimizer changes) can behave differently than MariaDB 10.x/11.x in edge cases. For WordPress, most pain is still schema and workload, not optimizer genius.
- Observability tools differ. MySQL 8’s Performance Schema and sys schema are solid and widely documented. MariaDB has equivalents, but you’ll sometimes reach for different views or enable different plugins.
- Query cache nostalgia is a trap. If you’re choosing MariaDB because you heard “query cache helps WordPress,” stop. You don’t fix a modern dynamic site by resurrecting a global mutex from 2009.
- Thread pool can matter. MariaDB’s thread pool (depending on edition/build) can help under high connection churn. But if you’re spawning 800 PHP-FPM children and letting each open a connection, you’ve built a connection factory, not a website.
- GTID/replication semantics differ. This affects failover/HA more than raw query latency, but WordPress outages are often “the database” even when the trigger is replication lag.
What you should decide on (instead of brand loyalty)
For WordPress, your real decision points are:
- Can you get a modern major version? Old MySQL 5.6/5.7 and old MariaDB builds have fewer diagnostic tools and more footguns. Run something current and supported.
- Do you have on-call familiarity? The best database is the one your team can debug at 02:00 without performing interpretive dance on a live replica.
- Do you have a plan for schema/index drift? WordPress installs are snowflakes. Your query killers are not theoretical—they’re in that snowflake.
Interesting facts & historical context (the short, useful kind)
- MySQL was acquired by Sun in 2008, and Sun was acquired by Oracle in 2010—this is the corporate genealogy behind many “MySQL vs MariaDB” arguments.
- MariaDB was started by MySQL’s original creators as a fork to keep a community-driven path alive after the Oracle acquisition.
- WordPress began life in the MySQL 4 era, which explains some long-lived schema choices (like heavy reliance on meta tables) that weren’t designed for modern analytics-style queries.
- InnoDB became the default MySQL storage engine in MySQL 5.5; before that, MyISAM was common. Some legacy “optimization tips” still assume MyISAM behavior and are wrong for WordPress today.
- MySQL 8 removed the query cache entirely; MariaDB kept a form of it longer. That divergence fueled myths about “MariaDB is faster for WordPress,” even when the bottleneck is PHP or missing indexes.
- Performance Schema matured dramatically in MySQL 5.7 and 8; if you’ve only used it once and hated it, you probably used it in its awkward adolescence.
- WordPress’s wp_postmeta table design is intentionally generic (key/value pairs), which makes plugins easy to write and databases harder to keep fast at scale.
- UTF8MB4 became the practical default for modern WordPress to support full Unicode (including emojis, which your marketing team will use whether you like it or not).
The WordPress query killers (and why they hurt)
Most “WordPress database performance” incidents reduce to a small set of failure modes. The trick is recognizing them fast and applying boring fixes.
1) Autoload bloat in wp_options
WordPress loads autoload='yes' options on every request. Every request. If that set grows to megabytes of serialized arrays—thanks, plugins—you’ve added a tax to every page view before you even run a single “interesting” query.
Symptoms: TTFB creeps up, CPU rises, queries look small but frequent, object cache misses become expensive, and admin pages feel like they’re wading through syrup.
2) The meta-table tax: wp_postmeta and friends
Meta tables are flexible, but they turn “find posts with X” into “scan lots of rows with repeated joins.” If a plugin builds a query with multiple meta conditions and no supportive indexes, it can pin CPU and churn buffer pool pages.
3) Missing or wrong indexes (especially on meta_key + meta_value)
WordPress core indexes are minimal. Plugins often query on columns that are not indexed (or are indexed in the wrong order), creating full table scans. As data grows, you go from “fine” to “fire” with no warning besides the slow log you didn’t enable.
4) ORDER BY RAND() and other joyless crimes
This one deserves a special callout: it looks innocent and feels like a shortcut. It is also a performance time bomb on any table with more than “a cute amount” of rows.
Joke #1: ORDER BY RAND() is like asking the database to shuffle a warehouse one box at a time.
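For contrast, here is one common workaround sketch. The wp_posts names are real WordPress schema; the approach assumes post IDs are reasonably dense, and the result is biased toward dense ID ranges rather than being uniformly random — good enough for “show me some random posts,” not for statistics:

```sql
-- Anti-pattern: forces the server to materialize and sort the whole candidate set.
--   SELECT ID FROM wp_posts WHERE post_status = 'publish' ORDER BY RAND() LIMIT 5;

-- Cheaper sketch: evaluate RAND() once in a derived table, jump to that point
-- on the primary key, and read forward.
SELECT p.ID
FROM wp_posts p
JOIN (SELECT FLOOR(RAND() * (SELECT MAX(ID) FROM wp_posts)) AS start_id) r
  ON p.ID >= r.start_id
WHERE p.post_status = 'publish'
ORDER BY p.ID
LIMIT 5;
```

The derived table matters: putting RAND() directly in the WHERE clause would re-evaluate it per row, which defeats the point.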
5) Locking and contention: updates, transients, cron, and admin behavior
WordPress has write patterns that can cause contention: transient cleanup, scheduled tasks (WP-Cron), plugin migrations, and bulk operations in admin. If you have traffic and an admin running a bulk edit, you can get lock waits and a request pile-up.
6) Tmp tables and disk spills
When MySQL/MariaDB can’t keep intermediate results in memory, it spills to disk. If your tmp dir is on slow storage or you’re saturating IOPS, you’ll see latency explode during sorts, grouping, and complex joins. WordPress itself can trigger this via search, reporting plugins, and “filter everything” admin screens.
7) Cache confusion: object cache vs page cache vs database cache
WordPress caching is a layered system, and teams regularly confuse them:
- Page cache (CDN, reverse proxy) reduces PHP and DB work for anonymous traffic.
- Object cache (Redis/Memcached) reduces repeated reads within and across requests.
- Database cache (buffer pool) keeps frequently accessed pages in memory.
When you “turn on caching” and nothing improves, it’s usually because you turned on the wrong layer or your cache hit rate is bad because autoload/meta patterns generate too many unique keys.
Fast diagnosis playbook (first/second/third)
This is the playbook for when the site is slow and you’re trying to avoid making it worse. The goal is to identify whether you’re CPU-bound, I/O-bound, lock-bound, or connection-bound—fast.
First: confirm the bottleneck class (CPU, I/O, locks, or connections)
- CPU-bound: high database CPU, lots of “Sending data,” many full scans, buffer pool not helping because queries are inefficient.
- I/O-bound: high read/write latency, buffer pool misses, tmp table disk spills, slow storage or saturated volume.
- Lock-bound: many threads in “Waiting for table metadata lock” or “Waiting for row lock,” long-running transactions.
- Connection-bound: too many connections/threads, context switching, max connections reached, PHP-FPM stampede.
Second: capture the top offenders (don’t guess)
- Enable/inspect slow query log (short window) or use Performance Schema views.
- Pull SHOW PROCESSLIST / information_schema activity snapshots.
- Run EXPLAIN on the worst query and look for “Using temporary; Using filesort” and full scans.
Third: apply the lowest-risk fix that removes the most pain
- If it’s autoload bloat: reduce autoload, clean transients, confirm the size drop.
- If it’s missing indexes: add targeted composite indexes (with a rollback plan).
- If it’s locking: find the long transaction and stop/avoid it; tune timeouts cautiously.
- If it’s I/O: increase buffer pool (if memory allows), fix tmp table sizing, move tmpdir to fast storage.
- If it’s connections: add pooling strategy (or at least reduce PHP-FPM concurrency) and set sane limits.
Practical tasks: commands, output, decisions (12+)
These are real tasks you can run today on a typical Linux host. Each includes: the command, what typical output means, and what decision to make next.
Task 1: Identify server flavor and version (you can’t tune what you can’t name)
cr0x@server:~$ mysql -e "SELECT VERSION() AS version, @@version_comment AS comment, @@version_compile_machine AS arch;"
+---------+------------------------------+--------+
| version | comment                      | arch   |
+---------+------------------------------+--------+
| 8.0.36  | MySQL Community Server - GPL | x86_64 |
+---------+------------------------------+--------+
What it means: MySQL 8.0.36. If this were MariaDB, comment would mention MariaDB and version formats often include -MariaDB.
Decision: Use MySQL 8 tooling (Performance Schema, sys schema). If you’re on something old, budget time for an upgrade because many “mystery slowness” issues are easier to see on modern versions.
Task 2: Check current connection pressure
cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'; SHOW GLOBAL STATUS LIKE 'Max_used_connections'; SHOW VARIABLES LIKE 'max_connections';"
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Threads_connected | 186 |
+-------------------+-------+
+----------------------+-------+
| Variable_name | Value |
+----------------------+-------+
| Max_used_connections | 412 |
+----------------------+-------+
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 500   |
+-----------------+-------+
What it means: You’re operating close to the limit at peak. If PHP-FPM has 300 workers and each uses its own connection, you can hit the ceiling under a traffic spike.
Decision: If Threads_connected is regularly > 60–70% of max_connections, reduce app concurrency and/or add a proxy/pooler strategy (or at least persistent connections tuned safely). Also check thread cache next.
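As a quick sanity check, the counters above can be turned into a headroom percentage. This is a sketch with the Task 2 numbers hardcoded, not a monitoring tool:

```shell
#!/bin/sh
# Connection headroom check using the counters from the output above.
max_used=412          # Max_used_connections (peak since restart)
max_connections=500   # configured ceiling

# Integer percentage of the ceiling reached at peak.
peak_pct=$(( max_used * 100 / max_connections ))
echo "peak usage: ${peak_pct}% of max_connections"

# Past roughly 70%, cap PHP-FPM workers or add a pooling strategy
# before a traffic spike does the capacity planning for you.
if [ "$peak_pct" -gt 70 ]; then
  echo "WARN: connection headroom is thin"
fi
```

In production you would feed these values from SHOW GLOBAL STATUS instead of hardcoding them.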
Task 3: Confirm whether slow query logging is on (and where)
cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'slow_query_log%'; SHOW VARIABLES LIKE 'long_query_time'; SHOW VARIABLES LIKE 'log_output';"
+---------------------+-------------------------+
| Variable_name       | Value                   |
+---------------------+-------------------------+
| slow_query_log      | OFF                     |
| slow_query_log_file | /var/log/mysql/slow.log |
+---------------------+-------------------------+
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| long_query_time | 10.000000 |
+-----------------+-----------+
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_output    | FILE  |
+---------------+-------+
What it means: Slow log is off and threshold is too high for WordPress diagnostics.
Decision: Turn it on temporarily with a low threshold (0.5–1s) during peak, then turn it back down. Capture evidence instead of vibes.
Task 4: Enable slow query log for a short diagnostic window
cr0x@server:~$ mysql -e "SET GLOBAL slow_query_log = 'ON'; SET GLOBAL long_query_time = 0.5; SET GLOBAL log_queries_not_using_indexes = 'ON';"
...output omitted...
What it means: New slow queries will be recorded. log_queries_not_using_indexes can be noisy; you’re using it as a flashlight, not as a lifestyle.
Decision: Let it run 10–30 minutes under typical traffic, then analyze. Disable or raise thresholds after you have enough samples.
Task 5: Summarize slow log offenders quickly
cr0x@server:~$ sudo mysqldumpslow -s t -t 10 /var/log/mysql/slow.log
Reading mysql slow query log from /var/log/mysql/slow.log
Count: 212 Time=3.12s (661s) Lock=0.00s (0s) Rows=10234.0 (2169608), root[root]@localhost
SELECT SQL_NO_CACHE * FROM wp_postmeta WHERE meta_key = 'S' AND meta_value LIKE '%S%' ORDER BY meta_id DESC LIMIT N
Count: 88 Time=1.44s (126s) Lock=0.03s (2s) Rows=1.0 (88), wp[wp]@10.0.2.10
SELECT option_value FROM wp_options WHERE autoload = 'yes'
What it means: You have a meta search pattern doing a LIKE '%...%' on meta_value (pain) and a frequent autoload query that may be pulling a huge payload.
Decision: For the meta search: fix the feature or constrain it; indexes won’t save a leading-wildcard LIKE on a long text column. For autoload: measure autoload size and prune.
Task 6: Check top current queries and lock states
cr0x@server:~$ mysql -e "SHOW FULL PROCESSLIST;"
+-----+------+----------------+------+---------+------+------------------------------+-----------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-----+------+----------------+------+---------+------+------------------------------+-----------------------+
| 821 | wp | 10.0.2.10:53214| wpdb | Query | 12 | Waiting for table metadata lock | ALTER TABLE wp_postmeta ... |
| 844 | wp | 10.0.2.11:53302| wpdb | Query | 8 | Sending data | SELECT ... FROM wp_posts ...|
| 865 | wp | 10.0.2.12:54100| wpdb | Query | 15 | Waiting for row lock | UPDATE wp_options SET ... |
+-----+------+----------------+------+---------+------+------------------------------+-----------------------+
What it means: There’s an ALTER TABLE blocking metadata and an UPDATE waiting on row locks. That’s a contention incident, not a “needs more RAM” incident.
Decision: Stop doing DDL during peak. If a plugin is migrating schema live, pause/rollback or do it in a maintenance window. Find the blocking transaction next.
Task 7: Find InnoDB lock waits and the blocker
cr0x@server:~$ mysql -e "SHOW ENGINE INNODB STATUS\G" | sed -n '1,160p'
...
LATEST DETECTED DEADLOCK
------------------------
*** (1) TRANSACTION:
TRANSACTION 18392019, ACTIVE 14 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 5 lock struct(s), heap size 1136, 3 row lock(s)
MySQL thread id 865, OS thread handle 140055, query id 998201 10.0.2.12 wpdb updating
UPDATE wp_options SET option_value = '...' WHERE option_name = '_transient_timeout_wc_count_comments'
*** (2) TRANSACTION:
TRANSACTION 18392011, ACTIVE 22 sec fetching rows
...
What it means: Transients are involved; a plugin (here, WooCommerce-like behavior) is hammering wp_options and colliding.
Decision: Move transients to Redis/Memcached (persistent object cache) to reduce database writes, and/or reduce cron concurrency. Also ensure wp_options.option_name is indexed (it should be).
Task 8: Measure buffer pool pressure (are you I/O bound?)
cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'; SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages%';"
+----------------------------------+----------+
| Variable_name | Value |
+----------------------------------+----------+
| Innodb_buffer_pool_read_requests | 94811234 |
| Innodb_buffer_pool_reads | 2211903 |
+----------------------------------+----------+
+----------------------------------+--------+
| Variable_name | Value |
+----------------------------------+--------+
| Innodb_buffer_pool_pages_total | 524288 |
| Innodb_buffer_pool_pages_free | 120 |
| Innodb_buffer_pool_pages_dirty | 18240 |
+----------------------------------+--------+
What it means: Reads vs read requests gives you a rough hit ratio: 1 - reads/read_requests. Here it’s decent, but pages_free is tiny (buffer pool fully used).
Decision: If your working set doesn’t fit and disk latency is high, increase innodb_buffer_pool_size (within safe RAM limits). If hit ratio is already good, focus on query efficiency and locking instead.
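The ratio described above can be computed directly from the Task 8 counters (values hardcoded here from the example output; a sketch, and note the usual caveat that hit ratio alone doesn’t prove health):

```shell
#!/bin/sh
# Buffer pool hit ratio from the counters above:
#   hit_ratio = 1 - Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests
read_requests=94811234   # logical read requests (memory or disk)
disk_reads=2211903       # reads that had to go to disk

hit_pct=$(awk -v r="$disk_reads" -v q="$read_requests" \
  'BEGIN { printf "%.2f", (1 - r / q) * 100 }')
echo "buffer pool hit ratio: ${hit_pct}%"
```

Here that comes out to roughly 97.7% — decent, which is exactly why the pages_free number matters more in this example.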
Task 9: Confirm tmp table spills (classic hidden latency)
cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%tables'; SHOW VARIABLES LIKE 'tmp_table_size'; SHOW VARIABLES LIKE 'max_heap_table_size';"
+-------------------------+--------+
| Variable_name | Value |
+-------------------------+--------+
| Created_tmp_disk_tables | 92341 |
| Created_tmp_tables | 145220 |
+-------------------------+--------+
+-------------------+----------+
| Variable_name | Value |
+-------------------+----------+
| tmp_table_size | 16777216 |
+-------------------+----------+
+-------------------+----------+
| Variable_name | Value |
+-------------------+----------+
| max_heap_table_size | 16777216 |
+-------------------+----------+
What it means: A high ratio of disk tmp tables often correlates with slow sorts/grouping. 16MB limits are conservative.
Decision: If you have RAM headroom, raise both sizes together (they effectively cap memory tmp tables). Also locate what queries are causing it—don’t just crank knobs and hope.
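The spill ratio is worth computing explicitly (again a sketch, with the Task 9 counters hardcoded; the 25% threshold is a rule of thumb, not a standard):

```shell
#!/bin/sh
# Disk-spill ratio from the Task 9 counters above.
disk_tmp=92341     # Created_tmp_disk_tables
total_tmp=145220   # Created_tmp_tables

spill_pct=$(( disk_tmp * 100 / total_tmp ))
echo "tmp tables spilling to disk: ${spill_pct}%"
# Beyond roughly 25%, go find the queries responsible before touching knobs.
```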
Task 10: Inspect the worst query plan (EXPLAIN, not prayer)
cr0x@server:~$ mysql -e "EXPLAIN SELECT p.ID FROM wp_posts p JOIN wp_postmeta pm ON pm.post_id=p.ID WHERE pm.meta_key='color' AND pm.meta_value='blue' AND p.post_type='product' AND p.post_status='publish' ORDER BY p.post_date DESC LIMIT 20;"
+----+-------------+-------+------+---------------+------+---------+-------------------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-------------------+--------+-----------------------------+
| 1 | SIMPLE | pm | ref | meta_key | meta_key | 767 | const | 980000 | Using where; Using filesort |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 8 | wpdb.pm.post_id | 1 | Using where |
+----+-------------+-------+------+---------------+------+---------+-------------------+--------+-----------------------------+
What it means: It’s scanning nearly a million rows in wp_postmeta for that meta_key and then filtering by meta_value. The filesort suggests extra work for ordering.
Decision: Add a composite index that matches the filter pattern (commonly (meta_key, meta_value(…prefix…), post_id) depending on data types and selectivity). Then re-run EXPLAIN and validate reduced rows.
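A sketch of such an index. The name idx_color_filter is hypothetical, and the 191-character prefix is a common utf8mb4 index-length compromise, not a universal rule — pick prefixes from your actual value lengths and re-run EXPLAIN afterward:

```sql
-- Matches the predicate shape above: equality on meta_key, then meta_value,
-- then post_id for the join back to wp_posts.
ALTER TABLE wp_postmeta
  ADD INDEX idx_color_filter (meta_key(191), meta_value(191), post_id);
```

Every index like this taxes writes to wp_postmeta, so add one at a time and watch write latency — a theme the mini-stories below return to.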
Task 11: Measure autoload payload size in wp_options
cr0x@server:~$ mysql -e "SELECT ROUND(SUM(LENGTH(option_value))/1024/1024,2) AS autoload_mb, COUNT(*) AS autoload_rows FROM wp_options WHERE autoload='yes';"
+-------------+---------------+
| autoload_mb | autoload_rows |
+-------------+---------------+
|       14.73 |          3129 |
+-------------+---------------+
What it means: ~15MB of options are loaded on every request. That’s not “a little overhead.” That’s a recurring invoice.
Decision: Identify the largest autoload rows and disable autoload for non-essential options, or remove stale options from dead plugins. Target < 1–2MB for most sites; bigger sites can justify more, but 15MB is usually accidental.
Task 12: Find the biggest autoload offenders (safely)
cr0x@server:~$ mysql -e "SELECT option_name, ROUND(LENGTH(option_value)/1024,1) AS kib FROM wp_options WHERE autoload='yes' ORDER BY LENGTH(option_value) DESC LIMIT 15;"
+-----------------------------------+--------+
| option_name | kib |
+-----------------------------------+--------+
| myplugin_big_cache_blob | 2048.4 |
| theme_mods_mytheme | 612.7 |
| rewrite_rules | 512.2 |
| widget_custom_html | 498.9 |
+-----------------------------------+--------+
What it means: One plugin is stuffing multi-megabyte “cache” into autoload. Also, some core options (like rewrite_rules) can be large but usually not multi-megabyte.
Decision: For plugin blobs: set autoload to ‘no’ (if safe), migrate to object cache, or stop the plugin from persisting huge computed data. For core options: consider flushing and regenerating rewrite rules (carefully) if corrupted.
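The “set autoload to ‘no’” step looks like this in practice (myplugin_big_cache_blob is the hypothetical offender from the example output above; verify the plugin still fetches the option on demand before running this in production):

```sql
-- Stop loading the oversized blob on every request.
UPDATE wp_options
SET autoload = 'no'
WHERE option_name = 'myplugin_big_cache_blob';
```

Re-run the autoload size query from Task 11 afterward to confirm the drop actually happened.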
Task 13: Check table sizes and growth (spot the real monster)
cr0x@server:~$ mysql -e "SELECT table_name, ROUND((data_length+index_length)/1024/1024,1) AS total_mb FROM information_schema.tables WHERE table_schema='wpdb' ORDER BY (data_length+index_length) DESC LIMIT 10;"
+--------------+----------+
| table_name | total_mb |
+--------------+----------+
| wp_postmeta | 8420.5 |
| wp_posts | 2150.2 |
| wp_options | 390.4 |
| wp_comments | 280.7 |
+--------------+----------+
What it means: wp_postmeta dominates. That’s normal at scale—but it means index choices matter and “meta queries everywhere” will punish you.
Decision: Focus optimization on wp_postmeta access patterns and reduce meta-heavy queries in high-traffic endpoints (search, filters, product listings).
Task 14: Check for long-running transactions (hidden lock bombs)
cr0x@server:~$ mysql -e "SELECT trx_id, trx_state, trx_started, TIMESTAMPDIFF(SECOND,trx_started,NOW()) AS trx_age_s, trx_mysql_thread_id FROM information_schema.innodb_trx ORDER BY trx_started ASC LIMIT 10;"
+--------+-----------+---------------------+-----------+---------------------+
| trx_id | trx_state | trx_started         | trx_age_s | trx_mysql_thread_id |
+--------+-----------+---------------------+-----------+---------------------+
| 183910 | RUNNING   | 2025-12-31 10:21:02 |      1843 |                 512 |
+--------+-----------+---------------------+-----------+---------------------+
What it means: A transaction has been open for ~30 minutes. In WordPress land, that’s suspicious. It can block purge, cause history list growth, and amplify lock contention.
Decision: Identify the query/user holding it (processlist), and fix the application behavior (often a report/export, a plugin migration, or an admin job).
Task 15: Check disk latency and saturation (because physics still runs the datacenter)
cr0x@server:~$ iostat -xz 1 5
Linux 6.8.0 (server) 12/31/2025 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
22.1 0.0 6.4 18.8 0.0 52.7
Device r/s w/s rkB/s wkB/s await svctm %util
nvme0n1 980.0 420.0 52000.0 18000.0 12.4 0.8 98.7
What it means: Very high utilization and meaningful await. If tmp tables spill or buffer pool misses spike, you’ll feel this immediately as site latency.
Decision: Reduce disk work first (better indexes, buffer pool sizing, avoid tmp spills) before you buy faster disks. If you’re already on NVMe and still pegged, you’re probably doing silly queries at scale.
Schema hotspots: wp_options, wp_postmeta, wp_posts
wp_options: autoload and transients
What kills you: bloated autoload + transient churn.
- Autoload: treat it like a startup config, not a cache store.
- Transients: if you don’t have a persistent object cache, transients land in wp_options and turn into writes under traffic.
Fix strategy:
- Measure autoload size, cut it down, and keep it down.
- Move transients to Redis/Memcached (persistent object cache plugin + backend).
- Stop plugins from writing large blobs as autoloaded options.
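A common cleanup sketch for expired transients (back up wp_options first and run off-peak; expired timeout rows store a Unix timestamp in option_value):

```sql
-- Remove expired transient timeout rows.
DELETE FROM wp_options
WHERE option_name LIKE '\_transient\_timeout\_%'
  AND option_value < UNIX_TIMESTAMP();
-- The paired '_transient_...' value rows are now orphaned; WordPress recreates
-- transients on demand, and WP-CLI's `wp transient delete --expired` is the
-- tidier tool for routine cleanup.
```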
wp_postmeta: make meta queries pay rent
What kills you: repeated joins and unindexed filters.
WordPress core indexes typically include post_id and meta_key. Plugins often filter on meta_value, sort by it, or do range queries against numeric values stored as strings. That’s how you end up with a database that’s “fine until it isn’t.”
Fix strategy:
- Add selective composite indexes matching your real workload (not a generic “index everything” spree).
- For numeric ranges: ensure queries cast consistently, and consider storing numeric values in dedicated tables if you ever get a chance. If not, index a prefix or rethink the feature.
- For text searches on meta_value: stop doing leading-wildcard LIKE. Use proper search solutions or limit scope.
wp_posts: predictable patterns, still easy to hurt
Queries around post_type, post_status, post_date, and taxonomy joins are common. Core indexes often help, but custom queries that sort on unindexed expressions or join excessive meta conditions can still cause tmp table spills and filesorts.
Joke #2: A database “optimization” that adds five joins to avoid one query is like taking three flights to avoid a layover.
Tuning that works on both (and what differs)
Here’s the opinionated rule: tune from evidence, not from blog-post vibes. For WordPress, most wins come from schema and workload shaping. After that, memory sizing and concurrency limits.
Start with these knobs (because they actually matter)
1) innodb_buffer_pool_size
If your DB server is dedicated to MySQL/MariaDB, a common baseline is 60–75% of RAM for the buffer pool. On mixed-use boxes, be conservative. WordPress performance often improves more from fitting hot pages in memory than from any other single change.
2) Redo log sizing (innodb_log_file_size; innodb_redo_log_capacity on MySQL 8.0.30+)
Redo logs that are too small cause checkpoint pressure during write bursts (imports, cron storms); too large, and crash recovery takes longer. Size them intentionally based on write patterns; don’t set them to “whatever someone tweeted.”
3) tmp table sizing and tmpdir placement
If disk tmp tables are high, raising tmp_table_size and max_heap_table_size (together) can help—up to a point. Also ensure your tmpdir is on fast local storage, not a slow network mount.
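A my.cnf fragment sketch for that paragraph. The values are illustrations for a host with spare RAM, not recommendations — re-check the Created_tmp_disk_tables ratio after changing them:

```ini
[mysqld]
# Raise both together; the smaller of the two caps in-memory tmp tables.
tmp_table_size      = 64M
max_heap_table_size = 64M
# Fast local storage for spills; never a network mount.
tmpdir              = /var/tmp/mysql
```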
4) Concurrency: keep connections sane
WordPress doesn’t benefit from unlimited concurrency. If you saturate the database, latency rises and throughput often drops. Cap PHP-FPM workers based on measured DB capacity, not on “CPU cores times 20.”
5) Observability settings (so you can prove improvements)
Turn on what you need when you need it: slow logs, Performance Schema consumers (where appropriate), and basic OS metrics. Leave yourself breadcrumbs.
Where MySQL and MariaDB differ in practice
- Optimizer differences: occasionally you’ll see a query plan diverge. When that happens, the fix is usually still: add the right index, rewrite the worst query (often inside a plugin), or change the access pattern.
- Instrumentation: on MySQL 8, lean on Performance Schema and sys schema. On MariaDB, you may prefer information_schema and engine status, plus MariaDB’s own views depending on build.
- Thread pool availability: can change how well the server behaves under connection storms. It doesn’t excuse connection storms.
One reliability idea worth keeping in your head, paraphrased from John Allspaw: “Reliability is what you do before the incident, not during it.”
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
They ran a busy content site with WordPress and a popular commerce plugin bolted on. Traffic wasn’t massive, but it was spiky: campaigns, email blasts, and the occasional mention on a big forum.
A new engineering manager pushed a database “standardization” effort. The plan: replace MariaDB with MySQL 8 across the fleet because “MySQL 8 is faster and more modern.” That part wasn’t crazy. The wrong assumption was: the database is interchangeable as long as the schema is the same.
They migrated cleanly, tests passed, and the first week looked fine. Then a sale started. Checkout latency went from sub-second to several seconds, and they began seeing intermittent “Error establishing a database connection.” The team blamed the new engine and started rolling back in a panic.
The real issue: during the migration, they also changed SQL modes and defaults. A plugin query that had been limping along with implicit conversions now got a different query plan and started using a terrible index path. Under load, it caused tmp table spills and pegged I/O. The same query on MariaDB had been “lucky” enough to pick a less awful plan.
The fix was unglamorous: add a composite index that matched the filter, and adjust the plugin’s query to avoid a computed sort. After that, MySQL 8 was fine. The lesson stuck: version changes expose latent schema problems. The database didn’t betray them; it simply stopped tolerating the mess.
Mini-story 2: The optimization that backfired
A different company had the classic problem: wp-admin search and product filtering were slow. Someone proposed “just add indexes on everything in wp_postmeta.” They created multiple wide composite indexes, including long prefixes of meta_value, and deployed them during business hours because “online DDL is safe now.”
For a short moment, searches got faster. Then write latency went up. Imports and cron jobs started timing out. The database CPU climbed and stayed there. Disk usage ballooned because every new index inflated the table, and the buffer pool could no longer keep the working set hot.
Worse: some queries didn’t use the new indexes at all, because the predicate shapes didn’t match the index order. Others used them but paid a heavy maintenance cost on writes. They had optimized a small slice of reads by taxing every write and every cache line.
They recovered by doing what they should have done first: identify the top 5 query patterns and index only those, with minimal width. They dropped the deadweight indexes, scheduled schema changes properly, and moved transient-heavy features to Redis.
The durable lesson: “more indexes” is not a strategy. It’s a way to convert read pain into write pain while running out of RAM.
Mini-story 3: The boring but correct practice that saved the day
A media org had a WordPress multisite deployment and a small SRE team that insisted on two “boring” practices: (1) slow query log enabled with a reasonable threshold, rotated and archived, and (2) a weekly review of the top offenders with plugin owners.
People rolled their eyes because everything seemed fine most of the time. Then a new plugin version shipped that changed how it handled related posts. It introduced a query that joined wp_posts to wp_postmeta multiple times and added a sort on an unindexed expression.
Because the slow log was already in place, the team saw the new offender within hours. They didn’t need a war room or a detective novel. They had the query fingerprint and a before/after baseline.
The fix was surgical: one index, one plugin configuration change to limit the related-posts pool, and a rollback option if the index caused write pain. Users never noticed. Leadership never heard about it, which is the highest compliment production gives you.
That’s what boring correctness buys: fewer heroic weekends.
Common mistakes: symptom → root cause → fix
1) Symptom: homepage slow, but database CPU is low
Root cause: connection pile-ups and lock waits; requests are waiting, not computing.
Fix: check SHOW PROCESSLIST for lock states; find and stop long transactions; schedule DDL off-peak; reduce PHP-FPM concurrency.
2) Symptom: admin pages (especially “Products” or “Orders”) are painfully slow
Root cause: meta-heavy filtering and sorting causing tmp table spills or full scans.
Fix: slow log + EXPLAIN the admin query; add one composite index for the dominant predicate; raise tmp table limits if needed; disable expensive columns/filters in admin list views.
3) Symptom: sudden spikes in write I/O and lock waits every few minutes
Root cause: WP-Cron bursts, transient cleanup, or plugin scheduled tasks.
Fix: move cron to system cron; ensure only one runner; move transients to persistent object cache; reduce scheduled task frequency where safe.
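The usual shape of that fix, sketched (the site path and schedule are examples; `wp` is WP-CLI, assumed installed and configured for the site):

```
# wp-config.php — stop firing WP-Cron on page loads:
define('DISABLE_WP_CRON', true);

# System crontab — exactly one runner, every 5 minutes:
*/5 * * * * cd /var/www/html && wp cron event run --due-now >/dev/null 2>&1
```

One runner is the point: multiple app servers all triggering WP-Cron is how you get the write bursts this symptom describes.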
4) Symptom: search is slow and gets worse with content growth
Root cause: LIKE '%term%' queries on large text fields, often in postmeta.
Fix: stop doing leading-wildcard LIKE at scale. Use proper full-text search on content where applicable, limit search scope, or offload search. Indexes won’t rescue this pattern reliably.
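One alternative sketch using InnoDB FULLTEXT (supported since MySQL 5.6 / MariaDB 10.0; ft_title_content is a hypothetical index name). Test on a copy first — FULLTEXT has its own semantics around stopwords and minimum token length:

```sql
ALTER TABLE wp_posts
  ADD FULLTEXT INDEX ft_title_content (post_title, post_content);

SELECT ID
FROM wp_posts
WHERE MATCH(post_title, post_content) AGAINST ('blue widgets' IN NATURAL LANGUAGE MODE)
  AND post_status = 'publish'
LIMIT 20;
```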
5) Symptom: “Error establishing a database connection” under load
Root cause: max connections reached, thread scheduling overhead, or DB stalled on I/O/locks so connections time out.
Fix: measure Threads_connected, Max_used_connections; reduce PHP-FPM workers; add backpressure; ensure DB isn’t I/O pegged; avoid long transactions.
6) Symptom: adding an index made performance worse
Root cause: index maintenance overhead, inflated working set, or optimizer choosing the wrong index.
Fix: validate with EXPLAIN and query stats; drop unused/low-value indexes; keep indexes narrow and aligned to predicates; consider histograms (MySQL) only after basics.
7) Symptom: database is fast, but replication lag explodes during traffic
Root cause: write-heavy workload (options/transients), large transactions, or single-threaded apply limitations depending on configuration.
Fix: reduce writes (persistent object cache), avoid huge transactions, tune replication settings appropriate to engine/version; don’t use replicas for reads if they’re always behind.
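Checking lag and parallel apply looks roughly like this; statement and variable names have shifted across versions (SHOW SLAVE STATUS and slave_parallel_* on older MySQL, slave_parallel_threads on MariaDB):

```sql
-- MySQL 8.0.22+; watch Seconds_Behind_Source and the applier queue.
SHOW REPLICA STATUS\G

-- Parallel apply can help write-heavy workloads (MySQL 8.0.26+ naming):
SET GLOBAL replica_parallel_workers = 4;
```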
8) Symptom: intermittent stalls during backups or analytics jobs
Root cause: backup/ETL job causing I/O contention or long-running consistent reads impacting purge/undo.
Fix: schedule backups off-peak; use non-blocking backup methods; throttle; ensure undo/redo sizing fits workload; monitor I/O and transaction age.
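For InnoDB, a consistent logical backup without table locks, throttled so it doesn't starve production, can look like this (paths and database name are assumptions):

```shell
# --single-transaction takes a consistent snapshot without locking InnoDB
# tables; ionice/nice keep the dump from competing with the live workload.
ionice -c2 -n7 nice -n19 \
  mysqldump --single-transaction --quick --routines wordpress \
  | gzip > /backups/wp-$(date +%F).sql.gz
```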
Checklists / step-by-step plan
Step-by-step: stabilize a slow WordPress database without rewriting
- Baseline what “slow” means. Capture p95/p99 page latency, DB CPU, disk await, and active connections during a bad window.
- Turn on slow query logging briefly. Use 0.5–1s threshold for 10–30 minutes under real traffic.
- Identify top 3 query fingerprints. Don’t chase the long tail.
- EXPLAIN the worst query. Look for full scans, filesort, and tmp table usage.
- Fix the biggest “always-on” tax first. Usually autoload bloat or a hot meta query on a high-traffic endpoint.
- Add one targeted index at a time. Validate improvement and watch write latency.
- Move transients to a persistent object cache. This reduces write churn and lock contention in wp_options.
- Control concurrency. Cap PHP-FPM workers to what the DB can handle; avoid connection storms.
- Re-check tmp table spills and buffer pool stats. Tune memory only after query shape is reasonable.
- Turn slow log thresholds back to normal. Keep it enabled at a sane value and rotate logs.
- Write down what changed. Not for bureaucracy—so the next incident isn’t archaeology.
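The slow-log steps of the plan are two SET statements each; a sketch for a bounded capture window:

```sql
-- Open the window:
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0.5;   -- seconds; catch the real offenders

-- ...capture 10-30 minutes of production traffic, then close it:
SET GLOBAL long_query_time = 2;     -- back to a sane always-on threshold
```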
Checklist: before you add an index
- Do you have the exact query text (or normalized fingerprint)?
- Did you run EXPLAIN and capture "rows" estimates?
- Can you test in staging with production-like data volume?
- Do you know the write rate impact (imports, cron, checkouts)?
- Do you have a rollback plan (drop index) and a maintenance window if needed?
Checklist: autoload cleanup safely
- Measure current autoload MB and top offenders.
- Confirm which options belong to decommissioned plugins/themes.
- Change autoload flags cautiously (prefer plugin settings if available).
- Re-measure autoload MB and watch frontend latency.
- Keep a snapshot/backup before mass changes.
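Measuring autoload size and its top offenders takes two queries; the IN list below hedges for the autoload values newer WordPress releases use alongside the classic 'yes'/'no':

```sql
-- Total autoloaded megabytes (loaded on every request):
SELECT ROUND(SUM(LENGTH(option_value)) / 1024 / 1024, 2) AS autoload_mb
FROM wp_options
WHERE autoload IN ('yes', 'on', 'auto', 'auto-on');

-- Top offenders, largest first:
SELECT option_name, LENGTH(option_value) AS bytes
FROM wp_options
WHERE autoload IN ('yes', 'on', 'auto', 'auto-on')
ORDER BY bytes DESC
LIMIT 20;
```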
FAQ
1) Should I choose MySQL or MariaDB for WordPress performance?
Choose the one you can run in a current supported version with strong on-call familiarity. For WordPress, schema/query hygiene beats engine choice most days.
2) Is MariaDB “faster for WordPress” because of query cache?
No. Query cache is not the modern fix people think it is, and MySQL 8 removed it for good reasons. Use page/object caching and fix query patterns instead.
3) What’s the single biggest database fix for many WordPress sites?
Autoload bloat cleanup in wp_options, plus a persistent object cache to reduce transient churn. It’s not glamorous, but it’s often the highest ROI.
4) I enabled Redis object cache; why is MySQL still slow?
Because Redis doesn’t fix unindexed meta queries, tmp table spills, or lock contention from long transactions. Confirm cache hit rate and then keep diagnosing the top slow queries.
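Confirming the hit rate takes one command:

```shell
# hit rate = hits / (hits + misses); a low rate means the object cache
# isn't actually absorbing the load you think it is.
redis-cli info stats | grep -E 'keyspace_(hits|misses)'
```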
5) Can I fix slow meta queries without rewriting the plugin?
Often yes: add one or two targeted composite indexes and reduce the scope of filters/sorts via configuration. If the plugin does LIKE '%term%' on meta_value, you may need to change the feature, not the index.
6) Is increasing innodb_buffer_pool_size always safe?
It’s safe only if the host has RAM headroom for the OS, filesystem cache, and other services. Oversizing causes swapping, and swapping turns databases into performance art.
7) Why do I see “Waiting for table metadata lock”?
Usually because someone is running DDL (ALTER TABLE) while traffic continues, or a long transaction is blocking metadata changes. Schedule DDL off-peak and avoid long-running transactions.
8) How do I know if I’m I/O bound versus CPU bound?
Check OS-level disk metrics (await, %util) and InnoDB buffer pool reads/misses. High await and high buffer misses point to I/O. High CPU with lots of full scans points to query inefficiency.
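Two quick reads, one per side of the question (iostat ships in the sysstat package):

```shell
# Disk side: sustained high await with %util near 100 means I/O bound.
iostat -dx 5

# InnoDB side: Innodb_buffer_pool_reads (physical) climbing fast against
# Innodb_buffer_pool_read_requests (logical) means the buffer pool is missing.
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'"
```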
9) Should I run WordPress reads on replicas?
Only if you can tolerate replica lag and you’ve confirmed your workload is read-heavy enough to benefit. WordPress has surprising writes (options, sessions/transients via plugins), which can keep replicas behind during spikes.
10) What if the slow query is inside WordPress core?
It’s rare that core is the primary offender at scale; it’s usually a plugin or a specific data shape (massive meta/options). If it truly is core, you still start by indexing and caching rather than forking core.
Conclusion: next steps you can do this week
MySQL versus MariaDB is not your first-order problem. Your first-order problem is that WordPress workloads punish sloppy schemas and sloppy assumptions, and plugins are professional at generating both.
Do this, in this order:
- Enable slow query logging for a controlled window and extract the top offenders.
- Measure wp_options autoload size and cut it down aggressively but safely.
- EXPLAIN the worst meta queries and add targeted composite indexes—one at a time.
- Move transients to a persistent object cache and take cron seriously (one runner, predictable schedule).
- Put guardrails on concurrency so the database doesn’t get stampeded into timeouts.
- Re-check disk and tmp table metrics. If you’re still spilling to disk, fix that next.
If you do nothing else: stop guessing. Turn the lights on (slow log + basic metrics) and make the database tell you the truth. It usually does.