MySQL vs MariaDB: the “default” choice that secretly slows your VPS

The first time you notice it, it feels like the network. Pages “randomly” hang, API calls time out, and your VPS load average does a convincing impression of modern art.
Then you look closer and realize the database is doing what databases do: quietly turning minor configuration sins into major latency.

On a small VPS, “default” is not neutral. Default is a set of assumptions made for someone else’s machine: different RAM, different storage, different workload, different tolerance for risk.
Choose MySQL or MariaDB based on package manager inertia and you can end up paying for it in IOPS, memory pressure, and replication surprises.

The thesis: defaults hurt small boxes

On a VPS, the database isn’t just another process. It’s the process that turns CPU cycles into user-visible waiting.
If you’re on 1–4 vCPUs with modest RAM and a shared storage backend, the difference between “fine” and “pain” is often
a single default: buffer pool sizing, durability settings, fsync behavior, or a background thread that’s correct for a beefy server
but punishing on a tiny one.

MySQL and MariaDB are close enough that people treat the choice like picking a brand of bottled water. But the internals have diverged
for years, and the packaging defaults (especially on distro builds) can put you in slow-motion trouble:

  • Too-small InnoDB buffer pool → constant reads, elevated disk IOPS, and tail latency.
  • Over-eager flushing → storage queue depth spikes, stalls, and “mystery” timeouts.
  • Wrong SQL modes or optimizer behavior → plans that look fine until the table grows.
  • Assuming “drop-in replacement” for replication or tooling → painful migrations at the worst time.

The right way to choose is boring: decide based on operational compatibility, predictable performance on your workload, and upgrade path.
If you’re on a typical VPS running a web app with InnoDB tables, I’ll be blunt: pick the engine you can upgrade confidently and tune correctly,
then make the defaults yours. The enemy is not MySQL or MariaDB. The enemy is letting the package decide your production posture.
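
If you want a concrete starting point, here is a minimal config sketch for a small VPS. The values are assumptions for a roughly 4GiB box that also runs an app tier, and the include path is hypothetical; adjust both to your distro and workload before trusting them.

# /etc/mysql/conf.d/90-vps.cnf   (hypothetical path; e.g. Debian/Ubuntu MariaDB uses /etc/mysql/mariadb.conf.d/)
[mysqld]
innodb_buffer_pool_size        = 1G    # assumption: ~4GiB RAM shared with an app; size to your working set
innodb_flush_log_at_trx_commit = 1     # keep full durability until you consciously decide otherwise
max_connections                = 100   # pool at the app layer instead of piling up server threads
slow_query_log                 = ON    # cheap visibility; rotate the log
long_query_time                = 1     # seconds; the default of 10 is far too forgiving for web latency budgets

None of these are magic numbers. They are just explicit decisions instead of inherited ones.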

Facts and history that matter in production

A few concrete facts and context points that explain why “they’re basically the same” stopped being true:

  1. MariaDB forked from MySQL in 2009 after concerns about Oracle’s acquisition of Sun. The intent was continuity; the reality became divergence.
  2. MySQL 8.0 removed the query cache (it was a global mutex magnet). MariaDB kept a variant longer, and many distros still ship configs that mention it.
  3. MySQL 8.0 uses a transactional data dictionary inside InnoDB. MariaDB implemented metadata differently. This affects upgrade behavior, crash recovery feel, and tooling assumptions.
  4. MariaDB introduced Aria as a MyISAM replacement for internal temporary tables and crash safety. It can be useful, but it’s another moving part.
  5. MySQL added “instant ADD COLUMN” (in many cases) and improved online DDL behavior over time; MariaDB has its own online DDL story and edge cases.
  6. GTID implementations differ between MySQL and MariaDB. They share the acronym, not full compatibility. This matters when you try to fail over at 2 a.m.
  7. Default authentication plugins diverged (notably around MySQL 8’s defaults). Your app drivers and legacy clients may care more than you do.
  8. Both optimize for throughput, not for your VPS: background flushing and thread-concurrency defaults that look great in benchmarks can make latency spiky on constrained storage.

One idea, paraphrased from Werner Vogels, is still the best summary of why this stuff matters:
everything fails eventually, so you design and operate assuming failure, not hoping it won’t happen.

What actually slows a VPS database

“Database is slow” is not a diagnosis. On a VPS it’s usually one (or more) of these categories:

1) Storage latency and fsync behavior (the silent killer)

VPS storage is often network-backed, oversubscribed, or aggressively cached. Your benchmark might show 20k IOPS, but your database needs
consistent latency, not bursty hero numbers. InnoDB is a durability machine: it writes redo, flushes, and expects stable fsync.
If fsync latency spikes, transactions pile up, threads stall, and your app starts blaming “the database” like it’s a sentient entity.

The worst part: your CPU may be idle while everything waits on storage. You’ll see low CPU, high load average, and a queue of threads in “waiting for handler commit”.
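
One way to sanity-check the fsync path itself, independent of the database, is a short fio run with per-write fdatasync. This is a sketch that assumes fio is installed and that the test directory sits on the same filesystem as your datadir; it generates real writes, so run it off-peak and delete the files afterwards.

cr0x@server:~$ mkdir -p /var/tmp/fsync-test
cr0x@server:~$ fio --name=fsync-lat --directory=/var/tmp/fsync-test --size=64m --bs=4k \
    --rw=write --fdatasync=1 --ioengine=psync --runtime=30 --time_based
cr0x@server:~$ rm -rf /var/tmp/fsync-test

Look at the fsync/fdatasync latency percentiles in the output: single-digit milliseconds is workable, tens of milliseconds is exactly how commit waits pile up.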

2) Memory pressure: tiny buffer pool + Linux page cache fights

On a small VPS, you don’t have enough RAM to waste. If your InnoDB buffer pool is too small, you churn pages and hit disk constantly.
If it’s too big, the OS starts reclaiming aggressively, your box swaps, and now everything is slow, including the database logs, because the kernel is in a bad mood.

The “correct” buffer pool is not a mantra like “set it to 80%”. It depends on what else runs on the box, how big your working set is,
and whether you value stable latency over peak cache hit rate.
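
A rough way to compare total data size against the buffer pool (an upper bound, not the working set; the hot subset is usually smaller):

cr0x@server:~$ mysql -e "SELECT ROUND(SUM(data_length + index_length)/1048576) AS innodb_data_mb FROM information_schema.tables WHERE engine = 'InnoDB';"

If that number is several times your buffer pool and your queries touch most of it, no amount of knob-turning will hide the disk.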

3) Concurrency defaults that don’t match your vCPU count

Both MySQL and MariaDB can spin up background threads and concurrency behaviors that are fine on 16 cores but silly on 2.
If you see mutex contention, high context switching, or thread scheduling overhead, the bottleneck isn’t SQL. It’s resource choreography.

4) Query plans that “worked yesterday”

On small datasets, bad indexing is hidden by cache. On a VPS, the moment the dataset stops fitting in RAM,
a missing composite index turns into random I/O and a slow query log full of shame.
Differences in optimizer behavior between MySQL and MariaDB can change which queries fall off a cliff first.
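
As an illustration (the orders table and its columns are hypothetical), this is the shape of the fix: check the plan, then give the query a composite index that matches its filter.

cr0x@server:~$ mysql -e "EXPLAIN SELECT id, total FROM prod.orders WHERE customer_id = 42 AND created_at >= '2025-12-01'\G"
cr0x@server:~$ mysql -e "ALTER TABLE prod.orders ADD INDEX idx_customer_created (customer_id, created_at);"

If EXPLAIN shows a full table scan or a filesort on a growing table, that query is the one that falls off the cliff first.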

Joke #1: If you can’t reproduce the slowdown in staging, congratulations—you’ve built a staging environment that’s great at lying.

Practical differences: MySQL 8 vs MariaDB today

Let’s talk about choices that change outcomes on a VPS, not ideological debates.

Packaging and “default” config is half the battle

On Debian/Ubuntu families, installing mysql-server may give you MariaDB depending on the distro and version.
On some distros, MariaDB ships with config snippets that include legacy knobs or conservative defaults.
MySQL community packages often have their own defaults that assume you’ll tune for your hardware.

The dangerous pattern: you think you chose “MySQL”, but you chose “whatever the distro maintainers tested for general compatibility”.
That can be correct for “it starts”, and wrong for “it stays fast”.

Replication and operational tooling compatibility

If you replicate, the choice matters more.
MariaDB replication has features MySQL doesn’t, and vice versa. But the big gotcha is that GTID sets and semantics aren’t interchangeable.
If you plan to switch engines later, you’re signing up for a migration project, not a package upgrade.

Even on a single-node VPS, replication can still enter the picture (read replicas, migration replicas, DR).
If you ever want to create a new replica quickly, “compatible GTIDs and binlog formats” is not a cute detail.

Performance isn’t “which is faster”, it’s “which is predictable”

Either can be fast. Both can be slow. The practical question is: which one gives you the least surprising behavior under pressure,
with the least confusing upgrade and configuration semantics, on the OS you run?

  • If you need MySQL 8 features (modern JSON behavior, data dictionary changes, ecosystem compatibility), pick MySQL and tune it properly.
  • If you’re on a distro that treats MariaDB as the default and your stack is already compatible, MariaDB is fine—just audit the defaults and remove legacy cruft.
  • If you need to move between managed services and self-hosting, MySQL compatibility can reduce friction. But don’t assume; verify with your drivers and SQL modes.

Fast diagnosis playbook (first/second/third)

This is what you do when the VPS is slow and you need answers before your users start writing poetry in your support inbox.

First: prove whether you’re CPU-bound, I/O-bound, or memory-bound

  • Check storage wait and run queue: iostat, vmstat.
  • Check memory pressure and swap: free, sar (if available).
  • Check MySQL/MariaDB thread states: processlist / performance schema / status counters.

Second: identify the top wait inside the database

  • InnoDB metrics: buffer pool hit rate, log waits, checkpoint age behavior.
  • Slow queries: enable slow query log briefly if safe, or use existing logs.
  • Locking: look for transactions waiting on row locks or metadata locks.

Third: decide if the fix is query/index, config, or storage

  • If disk await is high and buffer pool is tiny: tune memory and flushing first.
  • If disk is fine but query time spikes: look at plans and indexes.
  • If everything is “fine” but latency is jagged: look at fsync behavior, background flushing, and burst credits on your provider.

Hands-on tasks: commands, outputs, decisions

These are real tasks you can run on a Linux VPS. Each includes the command, example output, what it means, and the decision you make.
Run them as a checklist during an incident, or as a baseline audit when you inherit a server that “has always been like this”.

Task 1: Confirm what database you’re actually running

cr0x@server:~$ mysql --version
mysql  Ver 15.1 Distrib 10.11.6-MariaDB, for debian-linux-gnu (x86_64) using  EditLine wrapper

What it means: That’s MariaDB 10.11, not Oracle MySQL. People confuse the client and server, but this output is usually a solid hint.

Decision: Stop saying “MySQL” in tickets. If you plan MySQL-specific tuning or features, re-check server package origin and capabilities.
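
To confirm where the server package actually came from (Debian/Ubuntu family shown; use rpm -qi or dnf info on RHEL-like systems):

cr0x@server:~$ dpkg -l | grep -Ei 'mysql-server|mariadb-server'
cr0x@server:~$ apt-cache policy mariadb-server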

Task 2: Confirm the server daemon and package source

cr0x@server:~$ systemctl status mariadb --no-pager
● mariadb.service - MariaDB 10.11.6 database server
     Loaded: loaded (/lib/systemd/system/mariadb.service; enabled; preset: enabled)
     Active: active (running) since Mon 2025-12-29 08:12:31 UTC; 2h 14min ago
       Docs: man:mariadbd(8)
             https://mariadb.com/kb/en/library/systemd/
   Main PID: 742 (mariadbd)
     Status: "Taking your SQL requests now..."
      Tasks: 41 (limit: 2275)
     Memory: 1.2G
        CPU: 38min 12.331s

What it means: The service name and main binary identify the fork. Also note memory usage: on a small VPS, 1.2G might already be the whole box.

Decision: If you’re memory-tight, move immediately to buffer pool sizing and connection/thread settings before touching query micro-optimizations.

Task 3: Check RAM, swap, and whether you’re already losing

cr0x@server:~$ free -h
               total        used        free      shared  buff/cache   available
Mem:           3.8Gi       3.1Gi       110Mi        62Mi       640Mi       210Mi
Swap:          2.0Gi       1.4Gi       600Mi

What it means: You are swapping. “available” is tiny. The kernel is reclaiming and the DB will stall unpredictably.

Decision: Reduce memory footprint (buffer pool too big, too many connections, large tmp tables) or upgrade RAM. Tuning without fixing swap is performance cosplay.
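
If you must coexist with swap for now, lowering swappiness tells the kernel to prefer dropping page cache over swapping out the database. It does not fix being short on RAM; it just changes which failure mode you get first. A sketch:

cr0x@server:~$ sudo sysctl vm.swappiness=1
cr0x@server:~$ echo 'vm.swappiness = 1' | sudo tee /etc/sysctl.d/90-db.conf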

Task 4: Determine if you’re I/O-bound right now

cr0x@server:~$ iostat -x 1 3
Linux 6.1.0-18-amd64 (server) 	12/29/2025 	_x86_64_	(2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          12.00    0.00    6.00   28.00    0.50   53.50

Device            r/s     w/s   rkB/s   wkB/s  rrqm/s  wrqm/s  %util  await
vda             110.0   240.0  4200.0  9200.0     0.0     2.0   96.0   18.7

What it means: %util near 100 and await ~19ms: storage is saturated and slow. That’s how you get “random” timeouts.

Decision: Reduce writes (flush settings, binlog sync, transaction batching), increase cache hit rate, or move to faster storage. Also check provider burst limits.

Task 5: Check whether your filesystem is full or near-full

cr0x@server:~$ df -h /var/lib/mysql
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        50G   46G  2.0G  96% /

What it means: 96% used is not “fine”. InnoDB needs room for temporary files, logs, and internal operations.

Decision: Free space now (logs, backups, old binlogs) or expand disk. Do not wait for “disk full” corruption-adjacent drama.
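
To see what is actually eating the space and to reclaim binlogs safely (only purge binlogs that no replica or point-in-time-recovery plan still needs; the date below is just an example):

cr0x@server:~$ sudo du -xh --max-depth=1 /var/lib/mysql | sort -h | tail -n 10
cr0x@server:~$ mysql -e "SHOW BINARY LOGS;"
cr0x@server:~$ mysql -e "PURGE BINARY LOGS BEFORE '2025-12-26 00:00:00';"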

Task 6: Identify the data directory and check doublewrite/redo pressure context

cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'datadir'; SHOW VARIABLES LIKE 'innodb_doublewrite'; SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';"
+---------------+---------------------+
| Variable_name | Value               |
+---------------+---------------------+
| datadir       | /var/lib/mysql/     |
+---------------+---------------------+
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| innodb_doublewrite| ON    |
+-------------------+-------+
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| innodb_flush_log_at_trx_commit | 1     |
+--------------------------------+-------+

What it means: Durability is set to safest defaults. On cheap VPS storage, this can translate into a lot of fsync waits.

Decision: For truly critical data, keep it. For “we can replay from a queue” systems, consider changing innodb_flush_log_at_trx_commit to 2, but only with explicit risk acceptance.
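
If that risk is accepted and documented, the change itself is small; innodb_flush_log_at_trx_commit is dynamic in both MySQL and MariaDB. A sketch (the config path is illustrative, use whatever include directory your distro reads):

cr0x@server:~$ mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit = 2;"
cr0x@server:~$ printf '[mysqld]\ninnodb_flush_log_at_trx_commit = 2\n' | sudo tee /etc/mysql/conf.d/91-durability.cnf

With a value of 2, a mysqld-only crash loses nothing, but an OS crash or power loss can lose up to roughly a second of acknowledged commits.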

Task 7: Check buffer pool size vs RAM

cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
+-------------------------+------------+
| Variable_name           | Value      |
+-------------------------+------------+
| innodb_buffer_pool_size | 134217728  |
+-------------------------+------------+

What it means: 128MiB buffer pool on a 4GiB server is almost certainly too small unless the dataset is tiny.

Decision: Increase it thoughtfully (often 1–2GiB on a 4GiB VPS, depending on co-located services). Then watch swap and OOM risk.
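
The buffer pool is resizable online in MySQL 5.7+ and MariaDB 10.2+, so you can raise it without a restart and watch the effect. A sketch for 1GiB (confirm the box can afford it first):

cr0x@server:~$ mysql -e "SET GLOBAL innodb_buffer_pool_size = 1073741824;"
cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_resize_status';"
cr0x@server:~$ free -h

Persist the new value in your config so a restart doesn’t silently revert it, and keep an eye on free -h for swap over the next hour.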

Task 8: Check buffer pool efficiency and reads-from-disk rate

cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';"
+---------------------------------------+-----------+
| Variable_name                         | Value     |
+---------------------------------------+-----------+
| Innodb_buffer_pool_read_requests      | 982334112 |
| Innodb_buffer_pool_reads              | 18399244  |
+---------------------------------------+-----------+

What it means: Some disk reads are normal. But if Innodb_buffer_pool_reads climbs rapidly during peak, you’re missing cache or indexing.

Decision: If reads from disk correlate with latency spikes, increase buffer pool (if RAM allows) and/or fix indexes. Don’t guess: correlate with time windows.
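
A low-effort way to correlate: sample the counters as deltas during a peak window. mysqladmin with -r prints per-interval differences, so you see the rate instead of lifetime totals.

cr0x@server:~$ mysqladmin -i 10 -r extended-status | grep --line-buffered -E 'Innodb_buffer_pool_read|Innodb_data_fsyncs'

The first block is absolute values; every following block is the delta for that 10-second window.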

Task 9: See what the server thinks it’s waiting on (quick triage)

cr0x@server:~$ mysql -e "SHOW FULL PROCESSLIST;" | head -n 15
Id	User	Host	db	Command	Time	State	Info
112	app	10.0.1.25:53142	prod	Query	12	Waiting for handler commit	UPDATE orders SET status='paid' WHERE id=?
118	app	10.0.1.25:53159	prod	Query	11	Waiting for handler commit	INSERT INTO payments(order_id, ...) VALUES (...)
121	app	10.0.1.25:53168	prod	Query	9	Waiting for handler commit	UPDATE inventory SET ...
130	app	10.0.1.25:53201	prod	Sleep	55			NULL

What it means: “Waiting for handler commit” often points to redo log flush/fsync pressure (durability + storage latency).

Decision: Focus on disk/fsync path: redo log sizing, flush policy, storage performance, binlog sync settings.

Task 10: Check slow query log configuration (and enable briefly if needed)

cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'slow_query_log'; SHOW VARIABLES LIKE 'slow_query_log_file'; SHOW VARIABLES LIKE 'long_query_time';"
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| slow_query_log | OFF   |
+----------------+-------+
+---------------------+--------------------------+
| Variable_name       | Value                    |
+---------------------+--------------------------+
| slow_query_log_file | /var/log/mysql/slow.log  |
+---------------------+--------------------------+
+-----------------+-----------+
| Variable_name   | Value     |
+-----------------+-----------+
| long_query_time | 10.000000 |
+-----------------+-----------+

What it means: It’s off and long_query_time is high. On a VPS, a 1–2s query can already ruin tail latency.

Decision: Temporarily enable with a lower threshold during a controlled window, then turn it back off or rotate aggressively to avoid I/O overhead.
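
Both settings are dynamic, so a controlled sampling window needs no restart. One caveat: a global long_query_time change applies to new connections, so long-lived pooled connections keep their old session value until they reconnect. A sketch:

cr0x@server:~$ mysql -e "SET GLOBAL long_query_time = 0.5; SET GLOBAL slow_query_log = ON;"
cr0x@server:~$ sleep 1800   # let it capture a representative window
cr0x@server:~$ mysql -e "SET GLOBAL slow_query_log = OFF;"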

Task 11: Verify binary logging and sync settings (write amplification audit)

cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'log_bin'; SHOW VARIABLES LIKE 'sync_binlog'; SHOW VARIABLES LIKE 'binlog_format';"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| sync_binlog   | 1     |
+---------------+-------+
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+

What it means: This is safe but expensive. sync_binlog=1 adds fsync pressure. On fragile storage, this can dominate.

Decision: If you don’t replicate and don’t need point-in-time recovery from binlogs, consider disabling binlog. If you do need it, consider sync_binlog=100 (risk tradeoff) and document the blast radius.
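
sync_binlog can be changed without a restart; turning the binlog itself off needs a config change and a restart (on MySQL 8 the binlog is on by default, on MariaDB only if something enabled it). A sketch of the relaxed-but-documented option, with an illustrative config path:

cr0x@server:~$ mysql -e "SET GLOBAL sync_binlog = 100;"
cr0x@server:~$ printf '[mysqld]\nsync_binlog = 100\n' | sudo tee /etc/mysql/conf.d/92-binlog.cnf

With sync_binlog=100, a crash can leave the binlog missing up to roughly the last 100 commit groups, which matters for replicas and point-in-time recovery. Write that number down where the next on-call can find it.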

Task 12: Check temporary table behavior (hidden disk usage)

cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%'; SHOW VARIABLES LIKE 'tmp_table_size'; SHOW VARIABLES LIKE 'max_heap_table_size';"
+-------------------------+---------+
| Variable_name           | Value   |
+-------------------------+---------+
| Created_tmp_disk_tables | 182331  |
| Created_tmp_files       | 44219   |
| Created_tmp_tables      | 903112  |
+-------------------------+---------+
+----------------+----------+
| Variable_name  | Value    |
+----------------+----------+
| tmp_table_size | 16777216 |
+----------------+----------+
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| max_heap_table_size | 16777216 |
+---------------------+----------+

What it means: Many temp tables spill to disk, and the size limits are small (16MiB). Sorting/grouping workloads will hammer storage.

Decision: Increase these carefully (they’re per-connection potential), and fix queries that create huge temp tables. On VPS, disk temp tables are latency landmines.
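
Both limits are dynamic, and they should be kept equal because the effective in-memory limit is the smaller of the two (this applies to the MEMORY/Aria temp-table path MariaDB uses; MySQL 8’s TempTable engine has its own cap, temptable_max_ram). A sketch raising them to 64MiB:

cr0x@server:~$ mysql -e "SET GLOBAL tmp_table_size = 67108864; SET GLOBAL max_heap_table_size = 67108864;"

Remember this is per temporary table per connection, so 64MiB times a few hundred connections is how small servers meet the OOM killer.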

Task 13: Check connection count and whether you’re thread-thrashing

cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Threads_connected'; SHOW GLOBAL STATUS LIKE 'Threads_running'; SHOW VARIABLES LIKE 'max_connections';"
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_connected | 312   |
+-------------------+-------+
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| Threads_running | 46    |
+-----------------+-------+
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 500   |
+-----------------+-------+

What it means: 312 connections on a small VPS is often self-harm. Even if they’re sleeping, memory overhead and contention add up.

Decision: Fix pooling at the app layer, lower max_connections, and size thread cache appropriately. If you need hundreds of connections, you probably need a bigger box or a proxy layer.
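
A sketch of capping it on the server side while the app team fixes pooling; check the high-water mark first so you don’t cut below real demand:

cr0x@server:~$ mysql -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections';"
cr0x@server:~$ mysql -e "SET GLOBAL max_connections = 150;"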

Task 14: Check InnoDB log configuration (checkpoint pressure vs recovery time)

cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'innodb_log_file_size'; SHOW VARIABLES LIKE 'innodb_log_files_in_group';"
+---------------------+----------+
| Variable_name       | Value    |
+---------------------+----------+
| innodb_log_file_size| 50331648 |
+---------------------+----------+
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_log_files_in_group| 2     |
+--------------------------+-------+

What it means: Redo logs totaling ~96MiB is small for write-heavy workloads. That can force frequent checkpoints and more background flushing.

Decision: Increase log size if you’re write-heavy and storage is jittery, balancing recovery time. Plan the change correctly (requires restart and file handling).
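
The variable names differ by flavor and version: recent MariaDB sizes its single redo file with innodb_log_file_size, while MySQL 8.0.30+ replaced the old variables with innodb_redo_log_capacity. A config sketch, where 512M is an assumption for a moderately write-heavy small box; plan a restart unless your version supports changing it online:

[mysqld]
# MariaDB (and older MySQL):
innodb_log_file_size = 512M
# MySQL 8.0.30+:
# innodb_redo_log_capacity = 512M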

Task 15: Confirm actual config files in play (stop tuning the wrong file)

cr0x@server:~$ mysql -e "SHOW VARIABLES LIKE 'pid_file'; SHOW VARIABLES LIKE 'log_error';"
+---------------+------------------------+
| Variable_name | Value                  |
+---------------+------------------------+
| pid_file      | /run/mysqld/mysqld.pid |
+---------------+------------------------+
+---------------+--------------------------+
| Variable_name | Value                    |
+---------------+--------------------------+
| log_error     | /var/log/mysql/error.log |
+---------------+--------------------------+

What it means: You now know where to look for startup diagnostics and can confirm which instance is running.

Decision: Open the error log and confirm no warnings about ignored variables (common when copying configs between MySQL and MariaDB).

Task 16: Inspect error log for ignored settings or crash recovery hints

cr0x@server:~$ sudo tail -n 40 /var/log/mysql/error.log
2025-12-29 08:12:30 0 [Note] mariadbd: ready for connections.
2025-12-29 09:41:02 0 [Warning] 'query_cache_size' is deprecated and will be removed in a future release.
2025-12-29 10:07:11 0 [Warning] Aborted connection 118 to db: 'prod' user: 'app' host: '10.0.1.25' (Got timeout reading communication packets)

What it means: Deprecated knobs may still work today but indicate config drift. Aborted connections can be app timeouts or server stalls.

Decision: Remove deprecated settings and treat aborted connections as a symptom: correlate with storage await and query stalls.
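
A quick way to find leftover knobs before they bite (Debian/Ubuntu config layout shown; RHEL-family systems usually keep configs in /etc/my.cnf and /etc/my.cnf.d/):

cr0x@server:~$ grep -RniE 'query_cache|key_buffer|thread_concurrency' /etc/mysql/

Anything that matches deserves either a deliberate decision or deletion.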

Three corporate mini-stories (the kind you don’t post in Slack)

Mini-story 1: The incident caused by a wrong assumption

A company ran a set of small regional VPS nodes for latency-sensitive workloads. The database tier was “MySQL” in everyone’s vocabulary.
New nodes were provisioned from a distro image, and nobody questioned it. Why would they? The app connected, tables existed, replication “worked”.

During a maintenance window, they introduced a replica in a different region and planned a quick failover drill.
The drill failed in a boring but catastrophic way: the GTID-based tooling expected MySQL semantics and reported conflicting state.
The on-call engineer tried to “repair” it with commands that were correct for one engine and subtly wrong for the other.

The result wasn’t immediate data loss. It was worse: a partial failover with an application that believed it was writing to the primary,
while a subset of traffic hit a node that was not replicating cleanly. The incident lasted long enough for customer-facing inconsistencies to show up.

The postmortem had one line that mattered: the assumption that “MariaDB is a drop-in MySQL replacement” was treated as fact,
so no one documented which engine was actually deployed. That assumption propagated into automation, monitoring thresholds, and runbooks.

The fix was dull and effective: inventory every node, standardize engine/version per environment, and add a startup check in the deployment pipeline
that asserts the expected server flavor and critical variables. They stopped relying on vibes and started relying on facts.

Mini-story 2: The optimization that backfired

Another team chased write latency on a budget VPS platform. They read about durability knobs and decided to “speed things up”
by relaxing commit flush behavior. The change was small, the graphs looked better, and the team celebrated quietly—because loud celebrations tempt fate.

Two weeks later, the host node had an unplanned reboot. The database recovered, the app came back, and everyone thought they’d dodged it.
Then the support queue started filling with reports of missing recent updates. Not huge volumes, but enough to be real.

The system was technically behaving as configured: transactions acknowledged to clients were not necessarily on stable storage at the moment of acknowledgement.
That’s exactly what they had asked for. The problem was not the knob; the problem was they changed the durability contract without changing the product’s expectations.

The rollback was immediate. The long-term “optimization” turned into a real engineering plan:
keep strict durability for the core tables, push less-critical writes through a queue with idempotency,
and stop using database durability knobs as a substitute for architecture.

Mini-story 3: The boring but correct practice that saved the day

A SaaS team ran MariaDB on modest VPS instances. Nothing fancy. But they had one habit: weekly restore drills.
Not theoretical restores. Actual restores to a scratch VPS, with application smoke tests and a basic checksum validation.

One day, a developer deployed a migration that introduced a query pattern that created massive on-disk temporary tables.
The database didn’t “crash”. It just got slower and slower as disk filled. Eventually it hit 100% and the service fell over.

The immediate response was messy: free space, kill the worst queries, roll back the app. But a few tables were partially updated
by the time the disk filled and the rollback logic wasn’t clean. They needed to restore to a point before the migration.

The restore drill practice paid off. The team already knew the restore steps, the time-to-restore envelope, and the one configuration file
that always got missed. They restored, validated, replayed only the safe subset of changes, and came back online with minimal drama.

The final action item wasn’t “be more careful”. It was concrete: add query plan checks for the migration path, monitor temp table spill rates,
and keep doing the boring restore drills that everyone complains about until they save your weekend.

Common mistakes: symptom → root cause → fix

1) Symptom: low CPU, high load average, random timeouts

Root cause: I/O wait and storage saturation; threads blocked on fsync/flush or slow reads.

Fix: Use iostat -x to confirm high await/%util. Increase buffer pool if undersized, reduce write amplification (binlog sync, flush policy), and consider moving to better storage.

2) Symptom: performance tanks after “small” traffic increase

Root cause: Working set no longer fits in buffer pool; now you’re doing random reads from disk.

Fix: Measure buffer pool reads vs requests. Add missing indexes, increase buffer pool, and verify you’re not swapping afterward.

3) Symptom: sudden stalls during heavy write bursts

Root cause: Checkpoint pressure (redo logs too small) or aggressive flushing interacting with slow storage.

Fix: Increase redo log capacity where appropriate, and tune flushing behavior. Validate with processlist states like “Waiting for handler commit”.

4) Symptom: memory usage grows until the box swaps

Root cause: Too many connections, large per-connection buffers, temp table growth, or buffer pool set too high for the machine.

Fix: Cap connections, implement pooling, reduce per-connection buffers, and right-size buffer pool. Verify with free -h and Threads_connected.

5) Symptom: replication/failover tooling behaves strangely

Root cause: Assuming MySQL and MariaDB GTID/replication semantics match; mixed versions/flavors.

Fix: Standardize engine across topology or use explicit compatibility plans. Don’t mix without a tested runbook.

6) Symptom: “It got slower after we upgraded”

Root cause: Changed optimizer behavior, SQL modes, or config variables no longer valid; also possible regression due to different defaults.

Fix: Diff key variables pre/post upgrade, inspect error log for ignored settings, and re-baseline with slow query samples and EXPLAIN plans.

Joke #2: The database didn’t “randomly” get slow. It’s just expressing its feelings about your defaults.

Checklists / step-by-step plan

Plan A: You have a slow VPS right now

  1. Stabilize: confirm disk space (df -h), stop runaway queries, and ensure you’re not swapping heavily.
  2. Classify: run iostat -x and free -h. Decide: I/O-bound vs memory-bound vs CPU-bound.
  3. Top waits: processlist for commit waits/locks; quick slow log sampling if safe.
  4. Quick win tuning: fix buffer pool if tiny; reduce connections; address temp table spills.
  5. High-risk knobs: only change durability settings with explicit sign-off and rollback steps.
  6. Validate: measure p95 latency and storage await after changes, not just average QPS.

Plan B: You’re choosing between MySQL and MariaDB for a new VPS

  1. Decide on ecosystem needs: client drivers, ORM quirks, managed service alignment, replication needs.
  2. Pick one engine per environment: don’t mix “because it worked once”.
  3. Baseline configs: set buffer pool, log sizes, connection limits, and slow log policy intentionally.
  4. Plan upgrades: test minor upgrades and config validity in staging that matches VPS storage characteristics as closely as possible.
  5. Build observability: capture slow query logs (rotated), basic OS metrics, and DB status counters.

Plan C: You already run one and want to switch

  1. Don’t treat it as a package swap: treat it as a migration with compatibility testing.
  2. Inventory features: SQL modes, authentication plugins, replication format, stored routines, triggers, and charset/collation behavior.
  3. Dry run restore/migrate: practice export/import or logical replication approaches on a scratch node.
  4. Cutover with rollback: define exact rollback criteria (error rates, replication lag, data checks) and keep the old primary read-only for a window.

FAQ

1) Is MariaDB faster than MySQL on a VPS?

Sometimes, for specific workloads and versions. But on VPSes the bigger determinant is whether your defaults match your RAM and storage latency.
Either engine can be fast; either can be a stalling mess.

2) Why does “default install” often feel slow on small servers?

Defaults aim for broad compatibility and durability. On constrained storage, safe fsync behavior and small caches can translate into frequent stalls.
You need to right-size memory and reduce unnecessary write amplification.

3) Can I tune my way out of slow VPS storage?

You can reduce the damage (bigger buffer pool, fewer fsyncs, fewer disk temp tables). But you can’t tune away terrible storage latency.
If iostat shows high await under modest load, upgrading storage or changing provider might be the real fix.

4) Should I change innodb_flush_log_at_trx_commit to 2?

Only if you understand and accept the durability tradeoff: with a value of 2, an OS crash or power loss can lose up to about a second of acknowledged transactions (a mysqld-only crash typically loses nothing).
For many web apps with queues and idempotency, that’s acceptable. For financial or state-of-record systems, it usually isn’t.

5) Is disabling binary logging safe?

If you don’t replicate and don’t use binlogs for point-in-time recovery, disabling can reduce write load.
If you need PITR or replicas, keep it on and tune sync_binlog with documented risk.

6) Why are there so many “Sleeping” connections in processlist?

That’s typically application connection pooling or leaked connections. Sleeping isn’t free: memory and thread management overhead add up.
Fix pooling behavior and cap max_connections to force discipline.

7) Can I mix MySQL primary with MariaDB replicas (or the reverse)?

Sometimes in limited configurations, but it’s fragile and version-dependent. For production, assume “no” unless you have a tested, documented,
and continuously validated replication compatibility plan. This is not where you want surprises.

8) What’s the single best knob for VPS performance?

innodb_buffer_pool_size sized correctly for your RAM and workload, combined with keeping the box out of swap.
After that, the best knob is usually “add the right index”.

9) How do I know if my problem is queries vs configuration?

If iostat shows high await and InnoDB disk reads climb during peak, you’re likely cache/index bound.
If processlist shows commit waits and binlog sync is strict, you’re likely fsync/write-amplification bound.

10) Should I use MyISAM/Aria for speed on a VPS?

For application tables: almost always no. InnoDB’s crash safety and concurrency behavior are what you want.
Engine-switching for “speed” usually just relocates pain into recovery and data integrity.

Next steps you can do this week

If your VPS database is slow, don’t start with engine religion. Start with measurement, then remove the obvious self-inflicted wounds.

  1. Identify the engine and version you’re actually running and document it in your runbook.
  2. Baseline OS reality: capture free -h, df -h, and iostat -x during normal and peak periods.
  3. Right-size InnoDB buffer pool and cap connection counts so you stop fighting the kernel.
  4. Audit write amplification (binlog, sync settings, commit flush) and explicitly decide your durability contract.
  5. Turn on slow query logging briefly (with rotation) and fix the top offenders with indexes and query changes.
  6. Practice a restore to a scratch VPS. Not because it’s fun. Because someday it will be necessary and you’ll want muscle memory.

Choose MySQL or MariaDB based on compatibility and your upgrade path. Then make the defaults yours.
Your VPS will stop “randomly” slowing down. It will become predictably fast, which is the only kind of fast that matters in production.
