You add “the right indexes,” run the migration, and watch latency get worse. CPU climbs. IO melts. Writes slow to a crawl. The dashboard says you improved “read performance,” yet customers are now learning patience.
This is what happens when indexing advice is treated like a universal law instead of a workload-specific trade. MariaDB and PostgreSQL can both run blisteringly fast—or politely set your hardware budget on fire—depending on which rules you follow and which you ignore.
Why “best practices” fail in production
Most indexing advice is written for a fictional database: stable schema, predictable queries, polite concurrency, and a storage layer that behaves like the brochure promised. Your database is not that database.
“Add an index for every WHERE clause” is a classic. It’s also how you turn a working OLTP system into a write-amplified compost heap. Every extra index is another structure to update, another set of pages to dirty, another reason replication lag exists, and another chance for the optimizer to pick the wrong path.
Production indexing is about constraints. How many writes can you afford? How much memory is reserved for caching index pages? How often does your data change shape (hot keys, skew, bursts)? What’s your tolerance for maintenance windows? PostgreSQL and MariaDB answer these questions differently because their internals and operational tooling differ.
Also: “best practice” often assumes the planner has accurate stats. When stats drift—or your data distribution changes—your perfect index becomes a trap. It’s not that the database is dumb. It’s that you changed the rules while it was playing.
One quote that should be taped to the monitor: “Hope is not a strategy.” —Gene Kranz. Indexing without measurement is hope with extra steps.
Joke #1: Adding indexes without measuring is like buying more shelves because you can’t find anything—eventually you own a library and still lose your keys.
Facts and history that still matter
Some context points that seem academic until they explain your outage:
- PostgreSQL inherited MVCC early (multi-version concurrency control), which means dead tuples and vacuum are part of the indexing story, not a footnote.
- InnoDB became the default MySQL engine long ago, and MariaDB largely follows the same InnoDB lineage—so secondary index design inherits its “clustered primary key” reality.
- PostgreSQL’s GIN and GiST indexes were built for complex data types and search patterns; they’re powerful, but they introduce maintenance and bloat dynamics that a plain B-tree crowd forgets.
- MariaDB/MySQL historically leaned on “index condition pushdown” and optimizer heuristics; they’re good until they’re not, and then the only truth is the actual plan.
- PostgreSQL gained INCLUDE columns for B-tree indexes (covering-ish behavior) relatively late compared to some engines; it changed how people should build “covering” indexes.
- InnoDB secondary indexes store the primary key in leaf pages. That one detail explains a shocking amount of space growth and cache churn when your primary key is wide.
- PostgreSQL uses visibility maps to make index-only scans possible; if vacuum can’t keep up, your “index-only” scan becomes “index plus table anyway.”
- MySQL has supported “invisible indexes” since 8.0, and MariaDB added “ignored indexes” in 10.6: the optimizer skips the index while the engine keeps maintaining it. Being able to test index removal safely is operational gold (see the sketch after this list).
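A minimal sketch of testing removal before committing to it; table and index names here are hypothetical:
-- MariaDB 10.6+: the optimizer stops using the index, but it is still maintained
ALTER TABLE orders ALTER INDEX idx_orders_status IGNORED;
-- MySQL 8.0+: equivalent concept, different keyword
ALTER TABLE orders ALTER INDEX idx_orders_status INVISIBLE;
-- If nothing regresses over a representative traffic window, drop it for real
DROP INDEX idx_orders_status ON orders;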
What indexes really cost (and who pays)
1) Write amplification: every insert/update pays rent
In OLTP, the bill comes due on writes. Inserts touch the table plus every secondary index. Updates can be worse: update a column that appears in multiple indexes and you’ve multiplied the work. Delete is not a freebie either—both engines must record the deletion in their own way, and both can leave cleanup for later.
MariaDB (InnoDB) updates B-tree structures and writes redo/undo logs. PostgreSQL writes new tuple versions (MVCC) and updates indexes for the new tuple; old versions hang around until vacuum cleans up. Different mechanisms, same moral: more indexes mean more churn.
2) Cache pressure: indexes compete with what you actually need
Indexes are not “free reads.” They’re data structures living in memory and on disk. If your buffer pool (MariaDB) or shared_buffers + OS cache (PostgreSQL) can’t hold the working set, you get IO churn. IO churn turns “best practice index” into “why are we at 4000 IOPS at idle?”
3) Planning risk: the optimizer can pick the wrong “good” plan
The more indexes you have, the more choices the planner has. This sounds good until the planner’s cardinality estimates are wrong. Then it confidently picks a plan that looks cheap in its model and is expensive in reality.
4) Maintenance: bloat, fragmentation, and the lie of “set and forget”
PostgreSQL bloat often comes from dead tuples and index entries that aren’t immediately reclaimed. Vacuum mitigates it, but vacuum needs resources and configuration. MariaDB/InnoDB can suffer fragmentation and page split issues as well, and “OPTIMIZE TABLE” is not something you want to run casually at scale.
Joke #2: Vacuum is the only janitor people yell at for doing their job too slowly, and also for doing it at all.
Where MariaDB and PostgreSQL differ in indexing behavior
Clustered primary key vs heap table reality
InnoDB (MariaDB) stores table data clustered by the primary key. Secondary index leaf entries include the primary key, used as the “row pointer.” That makes primary key width and randomness matter a lot. A wide PK bloats every secondary index. A random PK increases page splits and reduces locality.
PostgreSQL uses heap tables; indexes point to tuple locations (TIDs). Your primary key isn’t physically clustering the table unless you explicitly CLUSTER (and accept its maintenance implications). This makes some “PK locality” arguments weaker in Postgres, but introduces other issues like HOT updates and heap fragmentation.
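If you do want physical ordering in Postgres, it’s an explicit, one-off operation; a minimal sketch, with hypothetical table and index names:
-- PostgreSQL: rewrite the table in index order; takes an ACCESS EXCLUSIVE lock for the duration
CLUSTER orders USING idx_orders_account_created;
-- The ordering is not maintained afterwards; new and updated rows land wherever the heap has room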
Covering indexes: “INCLUDE” vs “just add columns”
In MariaDB, a “covering index” is typically achieved by placing the needed columns in the index key. That increases index size and can worsen write cost. In PostgreSQL, you can use INCLUDE to add non-key columns, allowing index-only scans while keeping the index order defined by fewer keys. It’s not magic; included columns still occupy space and require maintenance, but you have more precise control.
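A minimal sketch of the two styles for the same read path (table, index, and column names are illustrative):
-- MariaDB/InnoDB: "covering" means every needed column is part of the index key
CREATE INDEX idx_orders_acct_status_total ON orders (account_id, status, total_amount);
-- PostgreSQL 11+: keep the key narrow and carry the extra column as payload only
CREATE INDEX idx_orders_acct_status_inc ON orders (account_id, status) INCLUDE (total_amount);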
Index-only scans are conditional, not guaranteed
PostgreSQL’s index-only scans require visibility map bits to be set, which depends on vacuum. If your table is frequently updated, the visibility map can lag, and Postgres will hit the heap anyway. In MariaDB, “covering” means the engine can avoid table lookups if all required columns are in the index, but you still pay with larger indexes and more cache pressure.
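A quick, hedged way to see whether index-only scans even have a chance on a given table (table name hypothetical):
-- Fraction of heap pages marked all-visible; low values mean "index-only" scans will still hit the heap
SELECT relname, relpages, relallvisible,
       round(100.0 * relallvisible / greatest(relpages, 1), 1) AS visible_pct
FROM pg_class
WHERE relname = 'orders';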
Statistics and misestimation failure modes look different
Both engines can misestimate. PostgreSQL exposes more knobs (default_statistics_target, per-column stats targets, extended statistics). MariaDB leans on persistent stats and histograms in newer versions, but the path to “fixing estimates” often looks like “ANALYZE and pray,” plus careful index design and query shaping.
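A minimal sketch of the usual levers, assuming a skewed status column correlated with account_id (names and targets are illustrative, not recommendations):
-- PostgreSQL: more detail for one skewed column, plus extended statistics for correlated columns
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;
CREATE STATISTICS orders_acct_status (dependencies) ON account_id, status FROM orders;
ANALYZE orders;
-- MariaDB: refresh engine-independent persistent statistics for the table
ANALYZE TABLE orders PERSISTENT FOR ALL;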
Concurrency and maintenance rhythms differ
In Postgres, vacuum is part of steady-state operations. In InnoDB, you’re often more focused on buffer pool sizing, redo log, and avoiding pathological page churn. Both require routine maintenance; they just fail differently when you neglect it.
Workload patterns that make advice backfire
Pattern A: “Read-heavy” systems that are secretly write-heavy
Telemetry, audit logs, event streams, “append-only” tables—these feel read-heavy because the dashboard queries are what you notice. But the database spends most of its time ingesting. Adding indexes for every dashboard filter can multiply write costs and trigger replication lag.
Pattern B: High-cardinality vs low-cardinality columns
Indexing a boolean or tiny enum can be useful in Postgres with partial indexes, but in MariaDB a low-cardinality index might get ignored or cause wasted work. The “index everything in WHERE” advice ignores selectivity.
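A minimal Postgres sketch, assuming “open” rows are a small, stable fraction of the table (names hypothetical):
-- Index only the rows the hot query actually asks for
CREATE INDEX CONCURRENTLY idx_orders_open
    ON orders (account_id, created_at DESC)
    WHERE status = 'open';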
Pattern C: The composite-index order myth
Yes, composite index order matters. No, there isn’t a universal order. “Put the most selective column first” is often wrong when your query uses range conditions, ORDER BY, or needs index-only coverage. In MariaDB, leftmost prefix rules are a big deal. In Postgres, the planner is flexible, but still constrained by how the index can be used.
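A minimal sketch of why order follows the query rather than a selectivity rule, for WHERE account_id=? AND status=? ORDER BY created_at DESC (names hypothetical):
-- Equality columns first, range/sort column last: the index serves both the filter and the ORDER BY
CREATE INDEX idx_orders_acct_status_created ON orders (account_id, status, created_at);
-- The "most selective first" version: created_at leads, so the equality filters can no longer seek
CREATE INDEX idx_orders_created_acct_status ON orders (created_at, account_id, status);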
Pattern D: Over-indexing to cover ORMs
ORMs generate queries with inconsistent predicates and lots of optional filters. Teams respond by adding “supporting” indexes for every permutation. That’s how you end up with 25 indexes on a table with 10 columns.
Pattern E: Sorting and pagination that punishes you
OFFSET/LIMIT pagination plus ORDER BY on non-indexed columns is a slow-motion disaster. “Add an index on the sort column” helps until you add a WHERE clause that changes the best index. Then you need a compound index that matches filter + order, or you need keyset pagination. The “best practice” here is not “add an index”; it’s “change the query.”
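A minimal sketch of the keyset alternative, assuming a unique tiebreaker column id and illustrative literal values:
-- OFFSET pagination: the engine still walks and discards everything before the requested page
SELECT * FROM orders WHERE account_id = 42
ORDER BY created_at DESC, id DESC LIMIT 50 OFFSET 5000;
-- Keyset pagination: remember the last row shown and seek past it using the index
SELECT * FROM orders
WHERE account_id = 42
  AND (created_at, id) < ('2024-05-01 12:00:00', 987654)
ORDER BY created_at DESC, id DESC
LIMIT 50;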
Fast diagnosis playbook
When latency is up and people are refreshing Slack like it’s a status page, this is the order that usually finds the culprit fastest.
First: confirm whether you’re CPU-bound, IO-bound, or lock-bound
- CPU-bound: high CPU, low IO wait, queries consuming cycles; plans often show expensive joins, sorts, or functions.
- IO-bound: high read/write IOPS, high await; buffer cache misses; index/table working set doesn’t fit.
- Lock-bound: threads waiting; lock graphs; long transactions; autovacuum blocked (Postgres) or metadata locks (MariaDB).
Second: identify the top 1–3 queries by total time, not just per-call time
The “slowest query” by per-call time is often a red herring. The top query by total time is the one actually running up the bill.
Third: validate plans against reality
In Postgres, compare EXPLAIN (ANALYZE, BUFFERS) to estimates. In MariaDB, inspect EXPLAIN, handler read counts, and index usage. Look for scans you didn’t expect, filesorts, temp tables, and nested loops that should have been hash joins (Postgres) or should have been limited earlier.
Fourth: check index health and maintenance state
In Postgres: vacuum activity, dead tuples, bloat indicators, visibility map. In MariaDB: buffer pool hit rate, change buffer behavior, index size growth, and whether your PK choice is bloating every secondary index.
Fifth: only then create or drop indexes
If you create indexes before you know your bottleneck class, you’re just adding weight to a ship you haven’t inspected for holes.
Hands-on: 14 tasks with commands, outputs, and decisions
These are production-real tasks: run a command, interpret the output, and make a concrete decision. Commands are shown as if you’re on a box with client tools installed.
Task 1 (PostgreSQL): Find the biggest time consumers by total time
cr0x@server:~$ psql -d appdb -c "SELECT query, calls, total_exec_time::numeric(12,1) AS total_ms, mean_exec_time::numeric(10,2) AS mean_ms FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 5;"
query | calls | total_ms | mean_ms
--------------------------------------------------------------------+-------+------------+---------
SELECT ... FROM orders WHERE account_id=$1 AND status=$2 ORDER BY... | 98234 | 8543210.5 | 86.97
UPDATE inventory SET qty=qty-$1 WHERE sku=$2 | 45012 | 3011220.2 | 66.90
SELECT ... FROM events WHERE created_at >= $1 AND tenant_id=$2 | 12033 | 2210098.7 | 183.67
What it means: The first query is burning the most wall-clock time overall; even “moderate” mean latency adds up with high call volume.
Decision: Optimize that top query first. Don’t chase the rare 30-second report query unless it’s impacting SLA.
Task 2 (PostgreSQL): Compare estimates vs reality with buffers
cr0x@server:~$ psql -d appdb -c "EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE account_id=42 AND status='open' ORDER BY created_at DESC LIMIT 50;"
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.42..102.55 rows=50 width=128) (actual time=1.212..12.844 rows=50 loops=1)
Buffers: shared hit=120 read=480
-> Index Scan Backward using idx_orders_account_created on orders (cost=0.42..10523.31 rows=5200 width=128) (actual time=1.210..12.832 rows=50 loops=1)
Index Cond: (account_id = 42)
Filter: (status = 'open'::text)
Rows Removed by Filter: 940
Buffers: shared hit=120 read=480
Planning Time: 0.290 ms
Execution Time: 12.930 ms
What it means: The index helps the ORDER BY/LIMIT, but the filter on status is removing lots of rows. Buffers show real reads: you’re touching many pages.
Decision: Consider a composite index on (account_id, status, created_at DESC) or a partial index on open status if that’s stable. Also check whether “open” is a small fraction; partial indexes shine here.
Task 3 (PostgreSQL): Identify unused indexes (with caution)
cr0x@server:~$ psql -d appdb -c "SELECT relname AS table, indexrelname AS index, idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) AS size FROM pg_stat_user_indexes WHERE idx_scan=0 ORDER BY pg_relation_size(indexrelid) DESC LIMIT 10;"
table | index | idx_scan | size
----------+------------------------+----------+--------
events | idx_events_payload_gin | 0 | 2048 MB
orders | idx_orders_status | 0 | 512 MB
What it means: These indexes have not been scanned since stats reset. That does not guarantee they’re useless; they could be used rarely, or only for constraints, or stats were reset recently.
Decision: Validate with query logs, app patterns, and a safe test window. For Postgres, consider dropping truly unused indexes to reduce write and vacuum overhead.
Task 4 (PostgreSQL): Find index bloat signals via dead tuples
cr0x@server:~$ psql -d appdb -c "SELECT relname, n_live_tup, n_dead_tup, (n_dead_tup::numeric/(n_live_tup+1))*100 AS dead_pct FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 5;"
relname | n_live_tup | n_dead_tup | dead_pct
-----------+------------+------------+----------
events | 80000000 | 22000000 | 27.50
orders | 1200000 | 210000 | 17.50
What it means: Lots of dead tuples means vacuum pressure and potential index bloat. This can wreck index-only scans and inflate IO.
Decision: Tune autovacuum for these tables (per-table settings), and consider partitioning or write pattern changes if it’s chronic.
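A minimal sketch of per-table autovacuum settings for a hot table; the values are illustrative, not recommendations:
-- Vacuum the events table earlier (2% dead tuples instead of the default 20%) and with less throttling
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.02,
    autovacuum_vacuum_cost_delay = 2
);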
Task 5 (PostgreSQL): Check autovacuum activity and blockers
cr0x@server:~$ psql -d appdb -c "SELECT pid, now()-xact_start AS xact_age, wait_event_type, wait_event, query FROM pg_stat_activity WHERE state<>'idle' ORDER BY xact_start NULLS LAST LIMIT 8;"
pid | xact_age | wait_event_type | wait_event | query
------+------------+-----------------+--------------------+---------------------------------------------------
9123 | 02:41:10 | Lock | relation | VACUUM (ANALYZE) events
7331 | 02:42:55 | | | BEGIN; SELECT ... FROM events FOR UPDATE;
What it means: A long-running transaction can make vacuum wait on locks, and even when vacuum does run it cannot remove dead tuples still visible to that transaction's snapshot. Either way, dead tuples accumulate, indexes bloat, and performance declines.
Decision: Fix the app transaction pattern. Add statement timeouts, shorten transactions, and avoid idle-in-transaction sessions.
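A minimal sketch of the guardrails, assuming a hypothetical app_user role; the values are illustrative:
-- Kill runaway statements and idle-in-transaction sessions before they hold back vacuum for hours
ALTER ROLE app_user SET statement_timeout = '30s';
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';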
Task 6 (PostgreSQL): Confirm whether index-only scans are actually happening
cr0x@server:~$ psql -d appdb -c "SELECT relname, idx_scan, idx_tup_read, idx_tup_fetch FROM pg_stat_user_indexes WHERE relname='orders' ORDER BY idx_scan DESC;"
relname | idx_scan | idx_tup_read | idx_tup_fetch
---------+----------+--------------+--------------
orders | 1200000 | 98000000 | 97000000
What it means: idx_tup_fetch nearly equals idx_tup_read, implying many heap fetches; EXPLAIN (ANALYZE) on the hot query will confirm it as a high “Heap Fetches” count. Index-only scans aren’t saving you much.
Decision: Improve vacuum effectiveness and consider whether the index can be made covering with INCLUDE (Postgres) or whether the query shape should change.
Task 7 (MariaDB): Find top statements by total time (Performance Schema)
cr0x@server:~$ mariadb -e "SELECT DIGEST_TEXT, COUNT_STAR, ROUND(SUM_TIMER_WAIT/1000000000000,2) AS total_s, ROUND(AVG_TIMER_WAIT/1000000000000,4) AS avg_s FROM performance_schema.events_statements_summary_by_digest ORDER BY SUM_TIMER_WAIT DESC LIMIT 3\G"
*************************** 1. row ***************************
DIGEST_TEXT: SELECT * FROM orders WHERE account_id = ? AND status = ? ORDER BY created_at DESC LIMIT ?
COUNT_STAR: 98234
total_s: 8543.21
avg_s: 0.0869
What it means: Same principle: total time identifies the real bully query.
Decision: Focus indexing and query changes on these few statements first. Don’t index for a query that runs twice a day unless it blocks the world.
Task 8 (MariaDB): Explain a query and watch for filesort/temp
cr0x@server:~$ mariadb -e "EXPLAIN SELECT * FROM orders WHERE account_id=42 AND status='open' ORDER BY created_at DESC LIMIT 50\G"
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: orders
type: ref
possible_keys: idx_orders_account_created,idx_orders_status
key: idx_orders_account_created
key_len: 8
ref: const
rows: 5200
Extra: Using where; Using filesort
What it means: “Using filesort” means the engine expects to sort rather than read rows in the desired order from the index. That often implies your index doesn’t match the ORDER BY after filtering.
Decision: Consider an index that matches both filter and sort order: (account_id, status, created_at). Validate cardinality; don’t blindly pile on indexes.
Task 9 (MariaDB): Measure index usage by Handler counters (quick smell test)
cr0x@server:~$ mariadb -e "SHOW GLOBAL STATUS LIKE 'Handler_read%';"
+-----------------------+-----------+
| Variable_name | Value |
+-----------------------+-----------+
| Handler_read_first | 182349 |
| Handler_read_key | 982341234 |
| Handler_read_next | 774123998 |
| Handler_read_rnd_next | 889120044 |
+-----------------------+-----------+
What it means: High Handler_read_rnd_next usually indicates lots of table scans (or index scans behaving like scans). High Handler_read_next suggests range scans. This is crude, but fast.
Decision: If scans are skyrocketing, identify the queries and missing/unused indexes—or accept that you’re IO-bound and need query changes or partitioning.
Task 10 (MariaDB): See index sizes and spot “PK too wide” collateral damage
cr0x@server:~$ mariadb -e "SELECT table_name, index_name, ROUND(stat_value*@@innodb_page_size/1024/1024,1) AS index_mb FROM mysql.innodb_index_stats WHERE database_name='appdb' AND stat_name='size' AND table_name='orders' ORDER BY index_mb DESC;"
+------------+---------------------------+----------+
| table_name | index_name | index_mb |
+------------+---------------------------+----------+
| orders | PRIMARY | 820.0 |
| orders | idx_orders_account_status | 740.0 |
| orders | idx_orders_created | 610.0 |
+------------+---------------------------+----------+
What it means: Secondary indexes can be nearly as large as PRIMARY because they carry the PK in their leaf entries. If your PK is a big UUID string, you’re paying for it everywhere.
Decision: Consider compact PKs (integer surrogate keys) or binary UUID storage where appropriate. Revisit whether all secondary indexes are needed.
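A minimal sketch of the compact-PK pattern in MariaDB/InnoDB; table and column names are hypothetical:
-- Narrow, monotonic PK keeps every secondary index small; the UUID stays as a unique secondary key
CREATE TABLE orders_v2 (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    public_id  BINARY(16) NOT NULL,   -- e.g. UNHEX(REPLACE(UUID(), '-', ''))
    account_id BIGINT UNSIGNED NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id),
    UNIQUE KEY uq_orders_v2_public_id (public_id)
) ENGINE=InnoDB;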
Task 11 (PostgreSQL): Test a new index without blocking writes (concurrently)
cr0x@server:~$ psql -d appdb -c "CREATE INDEX CONCURRENTLY idx_orders_account_status_created ON orders (account_id, status, created_at DESC) INCLUDE (total_amount);"
CREATE INDEX
What it means: CONCURRENTLY avoids long write blocks, but it takes longer and can fail if there are conflicting operations.
Decision: Use concurrently in production unless you’re in a controlled maintenance window and accept blocking. Then re-run EXPLAIN (ANALYZE) to confirm plan changes.
Task 12 (MariaDB): Add an index online (InnoDB) and validate the algorithm
cr0x@server:~$ mariadb -e "ALTER TABLE orders ADD INDEX idx_orders_account_status_created (account_id, status, created_at), ALGORITHM=INPLACE, LOCK=NONE;"
Query OK, 0 rows affected (12 min 31.44 sec)
What it means: In-place, lock-none generally means less disruptive DDL, but the operation still consumes IO and can hurt latency while building.
Decision: Schedule during low traffic, monitor replication lag, and confirm with EXPLAIN that it’s actually used. If it’s not used, you just bought write overhead for nothing.
Task 13 (PostgreSQL): Find duplicate or redundant indexes
cr0x@server:~$ psql -d appdb -c "SELECT i1.indexrelid::regclass AS index1, i2.indexrelid::regclass AS index2, pg_get_indexdef(i1.indexrelid) AS def1, pg_get_indexdef(i2.indexrelid) AS def2 FROM pg_index i1 JOIN pg_index i2 ON i1.indrelid=i2.indrelid AND i1.indexrelid<i2.indexrelid AND i1.indkey=i2.indkey LIMIT 3;"
index1 | index2 | def1 | def2
----------------------+------------------------+----------------------------------------+----------------------------------------
idx_orders_account | idx_orders_account_v2 | CREATE INDEX ... (account_id) | CREATE INDEX ... (account_id)
What it means: Two indexes with identical keys are usually redundant unless one is partial, different opclass, or different predicate.
Decision: Drop the redundant one after validating dependencies and usage. Less index count reduces write cost and vacuum work.
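A minimal sketch of the non-blocking removal, using the redundant index from the output above:
-- PostgreSQL: drop without blocking writers; cannot run inside a transaction block
DROP INDEX CONCURRENTLY idx_orders_account_v2;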
Task 14 (MariaDB): Verify optimizer actually chooses your intended index
cr0x@server:~$ mariadb -e "EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE account_id=42 AND status='open' ORDER BY created_at DESC LIMIT 50\G"
*************************** 1. row ***************************
EXPLAIN: {
"query_block": {
"select_id": 1,
"table": {
"table_name": "orders",
"access_type": "range",
"possible_keys": ["idx_orders_account_status_created"],
"key": "idx_orders_account_status_created",
"rows_examined_per_scan": 60,
"filtered": 100,
"using_filesort": false
}
}
}
What it means: The engine chose the composite index, expects low rows per scan, and no filesort. That’s what you wanted.
Decision: Ship it, then watch write latency, replication lag, and buffer pool misses. A “read win” that causes a “write loss” is still a loss.
Three corporate mini-stories from the indexing trenches
Mini-story 1: The incident caused by a wrong assumption (the “UUID PK doesn’t matter” fallacy)
The company ran a multi-tenant SaaS on MariaDB with InnoDB. A new service was rolled out with a table design that used a string UUID as the primary key. The reasoning was standard: globally unique IDs simplify merges and avoid coordination. They had read the blog posts. They had the stickers.
Within a week, write latency drifted upward. Not a spike, a slope. The on-call saw increasing IO and a buffer pool hit rate that looked like it was slowly giving up on life. Replication lag started to appear during peak traffic, then stuck around after peaks ended.
The wrong assumption: “Primary key choice mostly affects the table, not the indexes.” In InnoDB, every secondary index leaf stores the primary key. Their “harmless” PK turned into a tax on every other index. They had also added a handful of secondary indexes to support analytics filters, multiplying the tax.
They fixed it the boring way: introduced an integer surrogate primary key and kept the UUID as a unique secondary key for external references. The migration wasn’t fun, but the system stopped bleeding. The team learned a rule that should be printed on onboarding docs: in InnoDB, your primary key is a storage decision, not just an identity decision.
Mini-story 2: The optimization that backfired (covering index mania)
A different org ran PostgreSQL for an event ingestion platform. A staff engineer went on an index-only scan campaign. The logic was solid: reduce heap fetches by making indexes “cover” common queries. They added a few multi-column B-tree indexes with INCLUDE columns, and query latency improved in staging.
In production, things were fine for a month. Then ingestion volume increased, updates became more frequent (status changes and enrichment), and autovacuum started falling behind. The visibility map bits weren’t getting set quickly enough, and index-only scans quietly turned into index scans plus heap fetches. Meanwhile, the bigger indexes increased vacuum and checkpoint pressure.
Symptoms appeared as periodic latency spikes and IO storms. The irony was painful: the “index-only” project increased the amount of data that had to be vacuumed and cached, which made it harder for the very precondition of index-only scans (visibility) to stay true.
The fix was not “add more indexes.” They rolled back the widest INCLUDE columns, split workloads by partitioning the hottest event table, and tuned autovacuum thresholds per partition. Index-only scans returned where they made sense, and the system stopped oscillating between “fast” and “on fire.”
Mini-story 3: The boring but correct practice that saved the day (index changes as reversible deployments)
A fintech team ran both MariaDB and PostgreSQL across different services. They had a habit that looked tedious: every schema/index change had a rollback plan, and they treated index deployment like an application deployment. The DBA wasn’t trying to be difficult; they were trying to keep weekends intact.
When a sudden query regression hit after a minor release, the root cause was a planner flip: Postgres started preferring a new index that looked cheaper in estimates but was worse in reality due to skew. Because the team had deployed the index in a separate change window and tagged it, they could isolate the issue quickly.
They didn’t scramble to rewrite queries under pressure. They simply disabled the path by dropping the new index concurrently; planner hints never entered the picture, because the team preferred structural fixes. Then they gathered real stats and redesigned the index to match the dominant access pattern.
The “boring practice” was not heroics. It was change hygiene: deploy one performance-affecting change at a time, measure, and keep rollback cheap. That practice doesn’t trend on social media, but it prevents incident bridges.
Common mistakes: symptom → root cause → fix
1) Writes suddenly slow after “read optimization”
- Symptom: Insert/update latency rises, replication lag increases, CPU and IO both climb.
- Root cause: Too many secondary indexes, or a new wide composite/covering index added to a hot table.
- Fix: Remove redundant indexes; keep only those supporting top queries. In Postgres, prefer targeted indexes and partial indexes; in MariaDB, keep PK compact and avoid indexing “everything.”
2) “Index exists” but query still scans or sorts
- Symptom: MariaDB EXPLAIN shows “Using filesort” or “Using temporary”; Postgres plan shows Seq Scan or Sort.
- Root cause: Index doesn’t match the predicate order (composite order wrong), or predicate uses functions/casts preventing index use, or low selectivity makes scan cheaper.
- Fix: Align index with filter + order; rewrite predicates to be sargable (avoid wrapping column in function); consider partial index (Postgres) or computed column strategies (MariaDB) where applicable.
3) Planner chooses a terrible plan after data grows
- Symptom: A query that was fine last week now spikes; EXPLAIN shows nested loop on huge row counts, wrong join order, or wrong index.
- Root cause: Stale stats, skewed distribution, parameter-sensitive plans, or missing extended statistics (Postgres).
- Fix: Run ANALYZE; increase statistics targets for skewed columns; use extended statistics in Postgres; in MariaDB, refresh persistent stats/histograms where available and validate with JSON EXPLAIN.
4) Postgres index-only scans don’t stay index-only
- Symptom: Plans show Index Only Scan but heap fetches are high; performance regresses during write bursts.
- Root cause: Visibility map not set due to vacuum lag; frequent updates invalidate all-visible bits.
- Fix: Tune autovacuum (scale factors, cost delay) per hot table/partition; reduce update churn; reconsider INCLUDE bloat; partition hot data.
5) MariaDB buffer pool hit rate collapses “for no reason”
- Symptom: More disk reads, higher latency, even though query mix looks similar.
- Root cause: Index working set grew (new indexes, PK widened, or index cardinality explosion) and no longer fits in memory.
- Fix: Drop unused indexes; keep PK compact; adjust buffer pool sizing; ensure your hot indexes fit in memory, not your cold ones.
6) Lock waits spike after adding/removing indexes
- Symptom: Lock timeouts, DDL waits, blocked writers.
- Root cause: Non-online DDL, long transactions, or schema changes competing with hot traffic.
- Fix: In Postgres use CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY; in MariaDB use ALGORITHM=INPLACE, LOCK=NONE where supported; fix long transactions first.
Checklists / step-by-step plan
A pragmatic indexing workflow (works for both engines)
- Pick the target: choose top queries by total time, not “slowest once.”
- Get a real plan: Postgres: EXPLAIN (ANALYZE, BUFFERS). MariaDB: EXPLAIN FORMAT=JSON plus status counters.
- Classify the pain: CPU vs IO vs locks. If you don’t know which, you’re not ready to index.
- Propose the smallest index that helps: don’t make it covering “just in case.”
- Check write impact: is the table hot? how many writes/sec? how many existing indexes?
- Deploy safely: Postgres concurrently; MariaDB in-place/online; watch replication lag.
- Verify usage: confirm the index is chosen and reduces rows/IO. If not used, remove it.
- Measure side effects: buffer cache, vacuum, checkpoint/redo, IO wait, lock waits.
- Reassess quarterly: indexes are not evergreen; workloads rot.
Index design checklist: do and don’t
- Do: match composite indexes to your most common predicate + order. In OLTP, optimize the main path.
- Do: in Postgres, use partial indexes for “status=active” style patterns when the predicate is stable and selective.
- Do: in InnoDB, keep the primary key compact and avoid random, wide PKs if you care about secondary index size.
- Don’t: index low-selectivity columns alone and expect miracles.
- Don’t: add INCLUDE columns until you’ve proven heap fetches are the bottleneck and vacuum can keep up.
- Don’t: “fix” a bad query by adding three indexes if rewriting the query removes the need for sorting or scanning.
When to choose MariaDB-style tactics vs Postgres-style tactics
- If PK bloat is killing you: MariaDB/InnoDB often benefits from PK redesign; Postgres less so, but still benefits from index width discipline.
- If skewed predicates mislead the planner: Postgres gives you more statistics tools; use them before adding redundant indexes.
- If you need partial indexes and richer index types: Postgres is usually the better toolbox.
- If you need operational predictability for simple B-tree workloads: MariaDB can be straightforward—until you over-index and the buffer pool becomes a battleground.
FAQ
1) Should I “index every foreign key” in MariaDB and PostgreSQL?
Usually yes for OLTP join performance, but not blindly. In MariaDB, missing FK indexes can cause lock amplification and slow deletes/updates. In Postgres, joins can still work without them, but you’ll often pay in scans. Verify with top queries and actual plans.
2) Is a composite index always better than multiple single-column indexes?
No. Composite is better when your query uses those columns together in a way the index can exploit (filter + order, or selective prefix). Multiple single-column indexes can still be useful, and Postgres can sometimes combine them (bitmap index scans), but that can turn into extra heap work.
3) Why does adding an index sometimes make a query slower?
Because the planner chooses it and gets it wrong, or because the index causes random IO that’s worse than a sequential scan. Also, a new index can change plan shape (join order, nested loops) in ways that look cheaper in estimates but aren’t.
4) Do covering indexes always help?
No. In MariaDB, making an index “cover” by adding columns increases index size and write cost. In Postgres, INCLUDE can reduce heap fetches, but only if vacuum keeps visibility maps healthy. Covering is a tool, not a lifestyle.
5) How do I know if my Postgres stats are lying?
When estimated rows are wildly different from actual rows in EXPLAIN (ANALYZE), or when the plan flips after minor changes. Fix with ANALYZE, per-column stats targets, and extended statistics for correlated columns.
6) Can I just force index usage with hints?
In MariaDB you can, and people do. In Postgres, hints aren’t built-in (extensions exist). Forcing plans is operational debt: it locks you into a plan that may become wrong as data changes. Prefer making the correct plan naturally cheapest via schema, stats, and query shape.
7) What’s the fastest safe way to remove an index?
Postgres: DROP INDEX CONCURRENTLY to avoid blocking writers. MariaDB: dropping indexes is usually fast but still can lock metadata; do it during calm periods and watch for replication lag or DDL lock waits.
8) Why does a boolean/status index sometimes help in Postgres but not in MariaDB?
Postgres can exploit partial indexes like WHERE status='open' and avoid indexing the rest, which increases selectivity. MariaDB doesn’t have the same partial-index feature; a low-cardinality full index is often not selective enough to beat a scan.
9) Is partitioning an indexing solution?
Partitioning is a data management solution that can make indexing cheaper by reducing per-partition index size and vacuum/maintenance cost. It can also complicate queries and plans. Do it when a single table’s hot set and cold set have different operational needs.
10) How many indexes is “too many”?
When your write latency and maintenance cost dominate, or when your cache can’t hold the working set. The number isn’t universal. Count indexes per hot table, measure write amplification, and ruthlessly delete the ones that don’t pay rent.
Next steps you can actually do
If you’re running MariaDB or PostgreSQL in production and indexing is drifting into folklore, do this in the next week:
- Pull the top queries by total time (pg_stat_statements or performance_schema digests). Make a short list.
- For each query, capture a real plan (Postgres: ANALYZE+BUFFERS; MariaDB: JSON explain). Save it as an artifact.
- Classify the bottleneck (CPU/IO/locks). If you can’t, add observability before you add indexes.
- Delete one truly unused or redundant index on a hot table, with a rollback plan. Measure write latency before/after.
- Add one targeted index using online/concurrent methods, then prove it’s used and improves the right metric (not just the benchmark).
- Schedule maintenance sanity checks: vacuum health in Postgres, buffer pool and index size growth in MariaDB.
The point isn’t to become an indexing philosopher. It’s to keep latency predictable, costs bounded, and your on-call rotation boring. Boring is the premium plan.