You ran a quick benchmark on your laptop and SQLite looked like a rocket. Then you shipped it,
put it behind a web server, added a few workers, and performance turned into a slow-motion incident report.
That “SQLite is fast” line wasn’t wrong—just incomplete.
Production is not your laptop. Production has concurrency, noisy neighbors, real disks, real latency,
backups, failovers, and the charming habit of punishing any assumption you didn’t write down.
This is why “fast locally” becomes “slow in prod,” and why MariaDB often behaves better under pressure
even when it looks heavier on day one.
The real difference: architecture, not SQL syntax
SQLite is a library you link into your application. MariaDB is a server you connect to.
That single sentence explains most of the performance surprises.
With SQLite, your “database” is a file and your DB engine runs inside your process. Reads can be absurdly fast
because you skip network hops, skip authentication handshakes, skip context switching between client and server,
and often hit the OS page cache. On a quiet developer machine with one user, it’s almost unfair.
MariaDB is a separate process (often on a separate host). It pays overhead you can measure: TCP, connection handling,
query parsing, thread scheduling. But it buys you properties you also measure: robust concurrency control, mature
buffering and background flushing, tuned I/O patterns, instrumentation, and operational knobs designed for multi-tenant
reality.
If your workload is “a small app, one writer, a lot of reads, minimal ops staff,” SQLite can be the correct choice.
If your workload is “web traffic, bursts, multiple workers, background jobs, migrations, and someone asking for HA,”
MariaDB stops being optional and becomes basic hygiene.
Two performance metrics people confuse (and then suffer)
- Latency: how long one request takes. “My query feels slow.”
- Throughput: how many requests you complete per second under concurrency. “The system collapses at 20 RPS.”
SQLite can look great on latency tests and then fall over on throughput as soon as you add concurrent writers.
MariaDB can look slower in a single-thread test and then keep its dignity when the traffic graph spikes.
One quote, because it’s still true
Everything fails, all the time.
— Werner Vogels
If you design for the file to be always available, the lock to be always uncontended, and the disk to be always fast,
you’ve designed for a demo.
Interesting facts and history (the stuff that explains today)
- SQLite’s “serverless” design was a deliberate choice: it’s an embedded database engine, not a daemon, making it ideal for apps and appliances.
- SQLite stores the entire database in a single file (plus sidecar files like the WAL or journal), which is operationally convenient but makes performance sensitive to the filesystem underneath.
- SQLite uses database-level write locking (with some nuance under WAL), which simplifies correctness but limits concurrent writes.
- WAL mode (Write-Ahead Logging) in SQLite dramatically improves read/write concurrency by separating readers from the writer—up to a point.
- MariaDB is a fork of MySQL, created when Oracle acquired Sun; it kept the MySQL ecosystem alive with a different governance model.
- InnoDB became the default MySQL storage engine because it provided ACID transactions and crash recovery in a way that scaled better than older engines.
- InnoDB’s buffer pool is one of the biggest reasons MariaDB can outperform on real workloads: it’s a purpose-built cache with smarter behavior than “whatever the OS cached.”
- SQLite is everywhere—mobile devices, browsers, embedded systems—because the deployment story (one library, one file) is unbeatable when the workload fits.
Why SQLite is often blazing fast locally
Local tests are a best-case scenario for SQLite. Your app and database share memory and CPU, and the file lives on a
fast local SSD. The OS page cache does most of the work after warm-up. And because you’re probably running one process,
you’re avoiding the most expensive part of SQLite’s design: coordination.
1) No network
MariaDB needs a socket (TCP or Unix), a protocol, and a server thread to respond. Even when it’s efficient, it’s not free.
SQLite is function calls. That’s why it feels snappy.
2) “It’s cached” (you just didn’t know it)
When you run a benchmark twice and the second run is faster, you’re likely measuring the OS page cache, not the database.
SQLite reads the file. The kernel caches pages. Your second run is memory-speed. Congratulations: you benchmarked RAM.
3) Single writer, no contention
Many local setups run a single process. SQLite shines there. A single writer can do a lot of work per second, especially
when you batch transactions and don’t fsync on every tiny write.
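To make “batch transactions” concrete, here is a minimal sketch using Python’s standard sqlite3 module; the jobs table and the data are invented for illustration:

import sqlite3

# Made-up payload: 1,000 small rows that would otherwise be 1,000 commits.
rows = [(f"job-{i}", "queued") for i in range(1000)]

conn = sqlite3.connect("/var/lib/app/app.db")
# One enclosing transaction means one journal sync for the whole batch
# instead of one per INSERT.
with conn:  # commits on success, rolls back on exception
    conn.executemany("INSERT INTO jobs (name, status) VALUES (?, ?)", rows)
conn.close()

Committing per row is the classic way to turn a fast local benchmark into an fsync storm in production.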
4) Simpler query overhead
SQLite’s query planner and execution engine are lean. It’s optimized for low overhead. That can beat a server database in
microbenchmarks, especially when the dataset fits in cache and there’s no concurrency stress.
Joke #1: SQLite in dev is like a shopping cart with perfect wheels—quiet, smooth, and only moving one person’s groceries.
What production changes: concurrency, storage, and failure modes
Production workloads are not polite. They are multi-threaded, bursty, and full of accidental synchronization points.
They also run on infrastructure with quirks: network filesystems, container overlay layers, throttled IOPS, noisy
neighbors, and backup jobs that show up like clockwork at the worst time.
Concurrency turns “fast” into “blocked”
SQLite concurrency is where most “works fine locally” stories go to die. The common failure mode isn’t “SQLite is slow.”
It’s “SQLite is waiting.”
- Multiple writers contend for the same database-level lock.
- Long-running read transactions can prevent WAL checkpoints from completing, growing the WAL and causing stalls.
- Busy timeouts are misconfigured, leading to immediate “database is locked” errors or long hangs.
Storage is not just “disk speed”
SQLite’s durability depends on filesystem semantics. If the filesystem is local, POSIX-ish, and behaves, you’re fine.
If the file lives on NFS, SMB, a distributed filesystem with “close enough” locking, or a container storage driver with
surprising fsync behavior, you can get anything from performance cliff to corruption risk.
MariaDB also depends on the filesystem, but its I/O patterns and configuration knobs are designed to handle ugly realities:
group commit, background flushing, doublewrite, adaptive flushing, and sane defaults for crash recovery.
Durability settings: you always pay somewhere
The “fast locally” benchmark often quietly disables durability. SQLite can run with synchronous=OFF or
NORMAL and look amazing. But if production requires not losing data during power loss or node crash,
you’ll turn durability back on and discover where the time went: fsync.
MariaDB faces the same truth, but it amortizes the cost differently via InnoDB redo logs and group commit. SQLite often pays more
directly per transaction unless you batch.
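If you make that durability choice, make it explicit in code rather than relying on defaults. A minimal sketch with Python’s standard sqlite3 module; the path is illustrative:

import sqlite3

conn = sqlite3.connect("/var/lib/app/app.db")
# WAL lets readers proceed alongside the single writer.
conn.execute("PRAGMA journal_mode=WAL;")
# FULL survives power loss at the cost of slower commits.
# NORMAL (under WAL) can lose the most recent commits on power loss; write that risk down.
conn.execute("PRAGMA synchronous=FULL;")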
Observability is a performance feature
When something is slow at 2 a.m., “how do we see what’s happening?” becomes the only question. MariaDB has mature
instrumentation: slow query log, the Performance Schema (MariaDB ships it too, though it’s disabled by default), engine status, buffer pool
stats, and connection/thread info. SQLite can be instrumented, but it’s mostly on you: application-level metrics,
tracing, and careful logging around transactions.
Why MariaDB (InnoDB) usually wins in production
MariaDB’s advantage isn’t that it’s magically faster at SQL. It’s that it’s built for concurrent access, controlled
durability, and predictable behavior under load. SQLite is built for being embedded and correct in a small footprint.
Different goals, different results.
InnoDB handles concurrency like it expects humans to make bad choices
InnoDB uses row-level locking (and MVCC) for many workloads, allowing concurrent writers to proceed without blocking
each other as often. SQLite’s write lock is coarser. WAL helps reads coexist with writes, but not multiple writers in the
way an OLTP server engine does.
Buffer pool vs OS cache: similar idea, different control
SQLite leans heavily on the OS page cache. MariaDB has the InnoDB buffer pool, a managed cache with internal heuristics
and visibility. You can size it, monitor hit rates, and reason about it.
The OS cache is still good, but in production you want the database to behave consistently even when the kernel decides
it prefers caching something else (like your log files during a burst of error spam).
Log-structured durability and group commit
InnoDB turns random writes into more sequential-ish log writes, flushes strategically, and groups commits across sessions.
That matters on real disks and cloud volumes where IOPS and latency are billed realities, not theoretical numbers.
Operational features that affect performance indirectly
- Connection handling: pooling, thread cache, max connections. SQLite doesn’t have connections in the same way, but your app’s process model becomes the bottleneck.
- Replication: not free, but it enables read scaling and safer maintenance windows.
- Online schema changes: still tricky, but possible with the right tooling and patterns. SQLite migrations can lock the world if you’re not careful.
- Backups: MariaDB supports consistent snapshots and streaming strategies; SQLite backups are possible but require careful handling of WAL/journal and file copies.
Joke #2: Benchmarking SQLite on your laptop and declaring victory is like load-testing a bridge with a bicycle and calling it “truck-ready.”
Practical tasks: commands, outputs, and decisions (12+)
These are the checks I run when someone says “SQLite was fast in dev” or “MariaDB is slow” and expects me to guess.
Each task includes: a command, what the output means, and the decision you make from it.
Task 1: Identify your filesystem and mount options (SQLite cares a lot)
cr0x@server:~$ findmnt -no SOURCE,FSTYPE,OPTIONS /var/lib/app
/dev/nvme0n1p2 ext4 rw,relatime,errors=remount-ro
Meaning: Local ext4 with typical options. Good baseline for SQLite. If you see nfs, cifs, or something exotic, treat it as a red flag.
Decision: If the DB file is on networked storage, move it to local disk or switch to MariaDB/Postgres. SQLite on NFS is how you earn an incident.
Task 2: Check disk latency under real load (your “slow DB” might be slow storage)
cr0x@server:~$ iostat -xz 1 3
Linux 6.5.0 (server) 12/30/25 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
12.31 0.00 4.22 8.55 0.00 74.92
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
nvme0n1 55.0 4200.0 1.0 1.79 1.20 76.36 90.0 8600.0 3.0 3.23 18.50 95.56 1.80 92.00
Meaning: High w_await (18.5ms) with high %util suggests the device is saturated on writes. SQLite and MariaDB both suffer, but SQLite can stall harder when fsync frequency is high.
Decision: Reduce sync frequency via batching, tune durability settings (carefully), or provision faster storage / more IOPS. If you can’t control writes, move to MariaDB and let it amortize commits.
Task 3: Confirm SQLite journal mode and sync settings (the usual “fast local” trick)
cr0x@server:~$ sqlite3 /var/lib/app/app.db "PRAGMA journal_mode; PRAGMA synchronous;"
wal
2
Meaning: wal is good for read/write concurrency. synchronous=2 means FULL. Durable, slower.
Decision: If you’re seeing stalls on writes, try batching transactions before touching synchronous. If the business can tolerate losing the last few seconds of writes on a crash, consider synchronous=NORMAL with WAL, and document that risk.
Task 4: Check whether WAL is growing (often caused by long readers)
cr0x@server:~$ ls -lh /var/lib/app/app.db*
-rw-r----- 1 app app 1.2G Dec 30 01:58 /var/lib/app/app.db
-rw-r----- 1 app app 3.8G Dec 30 02:10 /var/lib/app/app.db-wal
-rw-r----- 1 app app 32K Dec 30 01:59 /var/lib/app/app.db-shm
Meaning: WAL is bigger than the DB file. That’s not automatically wrong, but it screams “checkpoint not happening” or “long-running read transactions.”
Decision: Identify long readers; add checkpointing strategy; ensure connections close transactions promptly. If your app holds read transactions open for minutes, fix that first.
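If “checkpoint intentionally” becomes part of the fix, here is a minimal sketch of a manual checkpoint call with Python’s standard sqlite3 module; the path and the scheduling around it are assumptions:

import sqlite3

conn = sqlite3.connect("/var/lib/app/app.db")
# TRUNCATE checkpoints the WAL and resets it to zero length, but it only
# fully succeeds when no reader is still pinning old WAL frames.
busy, wal_frames, checkpointed = conn.execute(
    "PRAGMA wal_checkpoint(TRUNCATE);"
).fetchone()
print(f"busy={busy} wal_frames={wal_frames} checkpointed={checkpointed}")
conn.close()

If busy stays at 1 or checkpointed keeps lagging wal_frames, a long-lived reader is still in the way.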
Task 5: Find long-lived SQLite transactions (application-level, but you can infer)
cr0x@server:~$ lsof /var/lib/app/app.db | head
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
api 1221 app 9u REG 259,2 1288490188 77 /var/lib/app/app.db
worker 1304 app 11u REG 259,2 1288490188 77 /var/lib/app/app.db
Meaning: Processes holding the DB open. Not proof of open transactions, but it tells you who’s in the room.
Decision: Instrument the app: log begin/commit durations, add tracing around transaction scope, and verify no request handler leaves a transaction open across network calls.
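A minimal sketch of that instrumentation, using only the Python stdlib (sqlite3, logging, contextlib); the label, threshold, and jobs table are arbitrary examples:

import logging
import sqlite3
import time
from contextlib import contextmanager

log = logging.getLogger("db")

@contextmanager
def timed_transaction(conn: sqlite3.Connection, label: str, warn_after_s: float = 0.5):
    # Wraps a transaction scope and logs slow holders, so they show up in your
    # logs instead of only as lock waits somewhere else.
    start = time.monotonic()
    try:
        with conn:  # commit on success, rollback on exception
            yield conn
    finally:
        elapsed = time.monotonic() - start
        if elapsed > warn_after_s:
            log.warning("slow transaction %s took %.3fs", label, elapsed)

conn = sqlite3.connect("/var/lib/app/app.db")
with timed_transaction(conn, "enqueue-job"):
    conn.execute("INSERT INTO jobs (name, status) VALUES (?, ?)", ("nightly", "queued"))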
Task 6: Check for lock contention symptoms in logs
cr0x@server:~$ journalctl -u app-api -n 30 --no-pager | sed -n '1,8p'
Dec 30 02:12:01 server app-api[1221]: ERROR db: sqlite busy: database is locked
Dec 30 02:12:01 server app-api[1221]: WARN db: retrying transaction attempt=1
Dec 30 02:12:02 server app-api[1221]: ERROR db: sqlite busy: database is locked
Dec 30 02:12:02 server app-api[1221]: WARN db: retrying transaction attempt=2
Meaning: Busy/locked errors. This is throughput collapse: workers spend time retrying instead of doing work.
Decision: Reduce concurrent writers, introduce a write queue, or move the write workload to MariaDB. SQLite is telling you it’s not a write-concurrency engine.
Task 7: Validate SQLite busy timeout (avoid instant failure, but don’t hide the problem)
cr0x@server:~$ sqlite3 /var/lib/app/app.db "PRAGMA busy_timeout;"
0
Meaning: No busy timeout; locks fail immediately.
Decision: Set a sane busy timeout in the application (e.g., 2000–5000ms) and add retry with jitter. But treat this as a bandage; still fix contention.
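A hedged sketch of “busy timeout plus jittered retries” with Python’s standard sqlite3, random, and time modules; the numbers are illustrative, not recommendations:

import random
import sqlite3
import time

def write_with_retry(conn, sql, params=(), attempts=5):
    # Wait up to 3 seconds for the lock instead of failing instantly.
    conn.execute("PRAGMA busy_timeout=3000;")
    for attempt in range(attempts):
        try:
            with conn:  # commit on success, rollback on exception
                return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc) or attempt == attempts - 1:
                raise
            # Jitter so retrying workers don't stampede in lockstep.
            time.sleep(random.uniform(0.05, 0.25) * (attempt + 1))

If the retries are busy all day, the write concurrency is the real problem; this only buys you time.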
Task 8: Check MariaDB is actually using InnoDB and not doing something weird
cr0x@server:~$ mariadb -e "SHOW VARIABLES LIKE 'default_storage_engine';"
+------------------------+--------+
| Variable_name | Value |
+------------------------+--------+
| default_storage_engine | InnoDB |
+------------------------+--------+
Meaning: InnoDB is the default, as it should be for production OLTP.
Decision: If you see something else, stop and fix that first. Performance tuning a wrong engine is performance theater.
Task 9: Check MariaDB buffer pool sizing (common production misconfig)
cr0x@server:~$ mariadb -e "SHOW VARIABLES LIKE 'innodb_buffer_pool_size';"
+-------------------------+-----------+
| Variable_name | Value |
+-------------------------+-----------+
| innodb_buffer_pool_size | 134217728 |
+-------------------------+-----------+
Meaning: 128MB buffer pool. Fine for tiny DBs, tragic for anything real.
Decision: Increase it (often 60–75% of RAM on a dedicated DB host). Then verify hit rate and memory pressure. Don’t starve the OS and other services.
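One way to verify the hit rate is to compute it from InnoDB status counters. A minimal sketch, assuming the PyMySQL driver and made-up credentials:

import pymysql  # assumption: PyMySQL is installed; any DB-API driver works similarly

conn = pymysql.connect(host="127.0.0.1", user="monitor", password="...", database="prod")
with conn.cursor() as cur:
    cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'")
    status = {name: int(value) for name, value in cur.fetchall()}
conn.close()

# read_requests are logical reads; reads are the ones that had to hit disk.
requests = status["Innodb_buffer_pool_read_requests"]
disk_reads = status["Innodb_buffer_pool_reads"]
print(f"buffer pool hit rate: {1 - disk_reads / max(requests, 1):.4f}")

Judge the ratio on a warmed-up instance; right after a restart it will look worse than it is.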
Task 10: Detect MariaDB disk flushing pressure (fsync tax)
cr0x@server:~$ mariadb -e "SHOW GLOBAL STATUS LIKE 'Innodb_data_fsyncs'; SHOW GLOBAL STATUS LIKE 'Innodb_os_log_fsyncs';"
+-------------------+-------+
| Variable_name | Value |
+-------------------+-------+
| Innodb_data_fsyncs| 98321 |
+-------------------+-------+
+---------------------+-------+
| Variable_name | Value |
+---------------------+-------+
| Innodb_os_log_fsyncs| 22110 |
+---------------------+-------+
Meaning: fsync counts can indicate how hard you’re hitting storage. On their own they’re not “bad,” but spikes correlated with latency are suspicious.
Decision: If latency aligns with fsync spikes, check innodb_flush_log_at_trx_commit and storage latency. Don’t change durability casually; prefer fixing I/O and batching writes.
Task 11: Find slow queries on MariaDB (don’t guess)
cr0x@server:~$ mariadb -e "SHOW VARIABLES LIKE 'slow_query_log'; SHOW VARIABLES LIKE 'long_query_time';"
+----------------+-------+
| Variable_name | Value |
+----------------+-------+
| slow_query_log | ON |
+----------------+-------+
+-----------------+-------+
| Variable_name | Value |
+-----------------+-------+
| long_query_time | 1.000 |
+-----------------+-------+
Meaning: Slow query log enabled; threshold is 1s.
Decision: Use the slow log to target the real offenders. If you don’t have this on in production, you’re debugging blind.
Task 12: Inspect MariaDB current waits and locks (spot contention)
cr0x@server:~$ mariadb -e "SHOW FULL PROCESSLIST;" | head
Id User Host db Command Time State Info
48 app 10.0.2.41:51244 prod Query 12 Waiting for table metadata lock ALTER TABLE events ADD COLUMN x INT
51 app 10.0.2.40:49810 prod Query 11 Sending data SELECT * FROM events WHERE created_at > NOW() - INTERVAL 1 DAY
Meaning: Metadata lock from an ALTER blocks queries. This is a classic production slowdown that has nothing to do with “DB is slow” and everything to do with online DDL strategy.
Decision: Stop doing blocking schema changes at peak. Use online schema change methods or schedule maintenance windows.
Task 13: Validate you aren’t accidentally running SQLite on an overlay filesystem
cr0x@server:~$ stat -f -c "%T" /var/lib/app/app.db
ext2/ext3
Meaning: It’s an ext-family filesystem (GNU stat reports ext4 as ext2/ext3 because they share the same magic number). If you see overlayfs, you might be inside a container layer with unpleasant fsync semantics.
Decision: Put the DB file on a real persistent volume mount, not the container writable layer.
Task 14: Check open file descriptor limits (SQLite + many workers = surprise)
cr0x@server:~$ ulimit -n
1024
Meaning: Low FD limit. Under load, you might exhaust FDs and misdiagnose as “DB slow” because everything retries.
Decision: Raise limits for the service and verify with systemd unit settings. Then re-test under load.
Task 15: Quick reality check on CPU throttling (cloud can do that)
cr0x@server:~$ mpstat 1 3
Linux 6.5.0 (server) 12/30/25 _x86_64_ (8 CPU)
01:20:11 PM CPU %usr %nice %sys %iowait %irq %soft %steal %idle
01:20:12 PM all 18.0 0.0 6.0 2.0 0.0 1.0 0.0 73.0
01:20:13 PM all 19.0 0.0 6.0 3.0 0.0 1.0 0.0 71.0
Meaning: No obvious steal time here. If %steal is high, your “database problem” might be your hypervisor stealing CPU.
Decision: If steal is high, move instance types or hosts, or isolate the DB. Tuning queries won’t fix a CPU you don’t own.
Task 16: Confirm MariaDB durability settings (don’t accidentally run “benchmark mode”)
cr0x@server:~$ mariadb -e "SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'; SHOW VARIABLES LIKE 'sync_binlog';"
+--------------------------------+-------+
| Variable_name | Value |
+--------------------------------+-------+
| innodb_flush_log_at_trx_commit | 1 |
+--------------------------------+-------+
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| sync_binlog | 1 |
+---------------+-------+
Meaning: Fully durable settings. This costs write latency, but it’s what many production environments actually need.
Decision: If you change these, do it with explicit risk sign-off. Otherwise, buy IOPS or redesign write patterns.
Fast diagnosis playbook
When performance tanks, you don’t have time for philosophy. You need a 10-minute triage that narrows the field.
Here’s the playbook I use when the question is “SQLite vs MariaDB—what’s slow and why?”
First: Is it waiting on locks or waiting on I/O?
- SQLite: search logs for “database is locked”; check WAL size; check busy timeout; identify how many writers exist.
- MariaDB: check SHOW FULL PROCESSLIST for lock waits; check InnoDB status if needed; check slow query log for outliers.
- Host: run iostat -xz and look for high await and high %util.
Second: Is the system saturating one resource?
- CPU: high %usr/%sys, low iowait. Query planning, JSON parsing, or encryption overhead can dominate.
- Disk: high iowait, high disk await/util. You’re IOPS-limited or latency-limited.
- Network: if MariaDB is on a separate host, check RTT and packet drops. A “fast query” over a slow network is still slow.
- Threads/workers: too many app workers can destroy SQLite via contention; too many MariaDB connections can add context-switching and memory pressure.
Third: Validate workload shape (the question that decides the database)
- How many concurrent writers at peak?
- Are writes tiny and frequent (worst case), or batched (best case)?
- Do you run long read transactions (reports, exports, analytics) against the same DB as OLTP?
- Do you need HA/replication/backups with minimal downtime?
If you have multiple writers and you can’t serialize them cheaply, stop arguing with SQLite. Use MariaDB (or Postgres).
If you’re mostly reads and occasional writes and you can batch, SQLite can still win.
Common mistakes: symptom → root cause → fix
1) Symptom: “Random spikes of latency, then everything times out”
Root cause: SQLite writer lock contention; app retries amplify load (thundering herd).
Fix: Reduce writer concurrency; implement a single writer queue; batch writes; enable WAL; set busy timeout with jittered retries. If write concurrency is inherent, migrate to MariaDB.
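A minimal sketch of the single-writer-queue idea, using only the Python stdlib (queue, threading, sqlite3); it’s illustrative, not a drop-in job system:

import queue
import sqlite3
import threading

write_queue = queue.Queue()

def writer_loop(db_path):
    # The only code path that ever writes: contention disappears by construction.
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL;")
    while True:
        sql, params = write_queue.get()
        with conn:
            conn.execute(sql, params)
        write_queue.task_done()

threading.Thread(target=writer_loop, args=("/var/lib/app/app.db",), daemon=True).start()

# Workers enqueue writes instead of opening their own write transactions.
write_queue.put(("INSERT INTO jobs (name, status) VALUES (?, ?)", ("report", "queued")))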
2) Symptom: “SQLite WAL file grows without bound”
Root cause: Long-lived read transactions prevent checkpointing; or checkpoints are disabled/misconfigured.
Fix: Ensure readers commit quickly; avoid wrapping long report jobs in a single transaction; run periodic checkpoints; separate analytics from OLTP.
3) Symptom: “It was fast until we put the DB on shared storage”
Root cause: Network filesystem locking and fsync semantics; latency and lock coherence kill SQLite.
Fix: Put SQLite on local disk only; otherwise switch to MariaDB with proper storage. Don’t “tune around” NFS file locking with hope.
4) Symptom: “MariaDB is slow, but CPU is low and queries look simple”
Root cause: InnoDB buffer pool too small; everything is disk reads.
Fix: Increase innodb_buffer_pool_size; verify working set; add indexes. Measure again.
5) Symptom: “MariaDB slows down during schema changes”
Root cause: Metadata locks or blocking ALTER; long-running transactions prevent DDL completion.
Fix: Use online schema change strategy; shorten transactions; run DDL off-peak; ensure your migration tooling is production-safe.
6) Symptom: “SQLite ‘works’ but we lose data after crashes”
Root cause: Durability settings relaxed (synchronous=OFF), or underlying storage lies about flushing.
Fix: Restore durable settings; verify filesystem and device cache behavior; if you need strong guarantees, move to MariaDB with durable config and proper hardware.
7) Symptom: “High p99 latency only in production”
Root cause: Production has burst concurrency, cold caches, background jobs, backups, noisy neighbors, and real I/O contention.
Fix: Run load tests with concurrency; add realistic background workloads; measure p99, not averages; add observability and alerts before you scale.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A mid-size company built an internal job scheduler: queue tables, worker pool, retries, the usual.
It started as a single binary with an embedded SQLite file. It was fast, cheap, and deployable anywhere.
The team loved it because it required zero coordination with the database group.
Then the product got popular internally, and the team scaled the workers horizontally. Same shared volume, same SQLite file,
now mounted into multiple containers. The assumption was “it’s just a file; multiple pods can read and write it.”
For a week it even seemed fine—until Monday morning.
Monday brought a traffic burst plus a backlog of scheduled jobs. Lock contention turned into a cascade:
workers retried instantly, which increased lock pressure, which increased retry volume. The system wasn’t “slow”; it was
mostly waiting, and when it did work it hammered the storage layer with fsync-heavy tiny transactions.
The fix wasn’t a clever pragma. They moved the job queue to MariaDB, kept SQLite for local caching where it belonged,
and added a simple rule: embedded databases are single-node components unless proven otherwise.
The incident report ended with a sentence every SRE recognizes: “We assumed the filesystem behaved like a local disk.”
Mini-story 2: The optimization that backfired
Another org had a write-heavy service collecting events from edge devices. They used MariaDB and had a painful write latency
problem at peak. Someone looked at durability settings and found easy wins: relax fsync, increase throughput, ship it.
The benchmark graphs looked gorgeous.
Two months later, a host crashed. Not the whole cluster—just one machine. The service failed over fine, but they discovered
a gap of missing events. It wasn’t huge, but it was enough to break downstream reconciliation and trigger customer-facing
alerts. Suddenly everyone cared about “only a few seconds of data.”
The postmortem was awkward because the optimization worked exactly as designed. The failure wasn’t mysterious.
It was the classic trade: fewer fsyncs, higher performance, weaker guarantees. The real mistake was treating that trade
as an engineering-only decision instead of a product decision with explicit risk acceptance.
The recovery plan was boring: restore durable settings, batch inserts properly, use multi-row inserts, and provision
storage that could handle the write rate without playing games. Performance got better again—this time without gambling.
Mini-story 3: The boring but correct practice that saved the day
A small payments-adjacent service ran MariaDB with a strict operational routine: slow query log enabled, weekly review of
top offenders, and a habit of running migrations in a staging environment with production-like data volume.
It wasn’t glamorous. It also didn’t make slide decks.
A developer introduced a new endpoint that joined two tables on an unindexed column. In dev it was instant because the
dataset was tiny and everything lived in cache. In staging with a real snapshot, it was obviously bad: the query plan was
a full scan and the row estimates were ugly.
Because the team had the habit of checking plans and watching the slow log, they caught it before release.
They added the index, rewrote the query to be sargable, and the incident never happened.
The most effective performance tooling in production is still: “we look at the logs like adults.”
Checklists / step-by-step plan
Decision checklist: should this workload be SQLite or MariaDB?
- Choose SQLite when:
- Single node, local disk, no shared filesystem.
- Mostly reads, low write rate, or writes can be batched.
- You want zero operational overhead and can tolerate limited concurrency.
- You can accept “scale up” more than “scale out.”
- Choose MariaDB when:
- Multiple concurrent writers are normal, not exceptional.
- You need HA/replication, online maintenance, and predictable behavior under load.
- You need operational tooling and visibility without instrumenting everything yourself.
- You expect growth, multiple services, or shared access patterns.
Step-by-step: making SQLite behave (when it’s the right tool)
- Keep the DB file on local disk, not NFS/SMB/overlay layers.
- Enable WAL mode for read-heavy workloads with occasional writes.
- Batch writes: wrap multiple inserts/updates in a single transaction.
- Set busy timeout and implement jittered retries to avoid herd effects.
- Keep transactions short; don’t hold them across network calls or long computations.
- Watch WAL growth; checkpoint intentionally and avoid long readers.
- Be explicit about durability and document what you can lose on crash.
Step-by-step: making MariaDB fast (without turning off safety)
- Size the buffer pool to fit the working set, within memory constraints.
- Enable and use the slow query log; fix queries, not vibes.
- Add proper indexes; check query plans before and after.
- Batch writes (multi-row inserts, prepared statements); see the sketch after this list.
- Control connection counts with pooling; avoid thousands of concurrent connections unless you enjoy context switching.
- Provision IOPS and measure disk latency; don’t pretend gp2-like volumes are magic.
- Plan schema changes to avoid metadata lock storms.
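A hedged sketch of the batching step from the list above, assuming the PyMySQL driver; the events table, columns, and credentials are invented:

import pymysql  # assumption: PyMySQL is installed

rows = [("device-1", '{"temp": 21}'), ("device-2", '{"temp": 19}')]  # example events

conn = pymysql.connect(host="db.internal", user="app", password="...", database="prod")
try:
    with conn.cursor() as cur:
        # One multi-row INSERT per batch instead of one round trip per event.
        cur.executemany(
            "INSERT INTO events (device_id, payload) VALUES (%s, %s)",
            rows,
        )
    conn.commit()  # one commit per batch keeps group commit effective
finally:
    conn.close()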
FAQ
1) Is SQLite “slower” than MariaDB?
Not universally. SQLite can be faster for single-process, local workloads with small-to-medium datasets and low write contention.
MariaDB is usually faster (and more stable) under concurrent writers and multi-user access.
2) Why does SQLite feel instant on my laptop?
You’re measuring function-call overhead plus OS caching on a local SSD with minimal contention. It’s the best-case environment.
Production adds concurrency, I/O pressure, and real durability costs.
3) Does WAL mode make SQLite “production-ready”?
WAL improves read/write concurrency, especially many readers with a writer. It does not turn SQLite into a multi-writer OLTP server.
If you need many concurrent writers, WAL won’t save you.
4) Can I run SQLite on NFS if I’m careful?
You can, but you’re betting your uptime on filesystem locking and flush semantics you don’t control. If you need shared storage,
use a server database. SQLite’s file-based locking is not a fun distributed systems project.
5) Why is MariaDB slower in my single-thread benchmark?
Because you’re including client/server overhead and likely not saturating what MariaDB is designed to handle: concurrency.
A fair test uses realistic concurrency, realistic data, and measures p95/p99 latency plus throughput.
6) What’s the most common SQLite performance killer in production?
Too many concurrent writers, usually from scaling app workers without a write coordination strategy.
The symptom is lock errors or long stalls, not always high CPU.
7) What’s the most common MariaDB performance killer in production?
Bad indexing and undersized buffer pool, followed closely by disk latency and poorly planned schema changes.
MariaDB is forgiving, but it’s not psychic.
8) Can I keep SQLite and still scale the app horizontally?
Yes, if you avoid shared writes. Common patterns: each node has its own SQLite for caching; or you funnel writes through a single writer service.
If every node needs to write the same DB file, that’s a server database problem.
9) Should I turn off fsync/durability to make it fast?
Only with explicit acceptance of data-loss risk, and only after you’ve fixed batching and I/O provisioning. Disabling durability as a default
is how you convert performance issues into integrity issues.
10) If I migrate from SQLite to MariaDB, what usually hurts?
Query compatibility edges, transaction isolation assumptions, and operational tasks (backups, users, schema changes).
The payoff is predictable concurrency and tooling. Budget time for proper migrations and query plan tuning.
Next steps you can actually do this week
If you’re running SQLite in production and seeing slowness:
- Confirm the DB file is on local disk and not a shared/network/overlay filesystem.
- Enable WAL mode (if appropriate), set a busy timeout, and batch writes.
- Measure lock contention explicitly (log retries, track transaction durations).
- Decide whether you’re trying to do multi-writer OLTP with a single-file DB. If yes, stop and migrate.
If you’re running MariaDB and seeing slowness:
- Turn on slow query logging (or verify it’s on), then fix the top offenders.
- Right-size the InnoDB buffer pool and verify you’re not disk-reading your way through every request.
- Check storage latency and fsync pressure; buy IOPS before you buy superstition.
- Audit schema change practices so you don’t self-inflict metadata lock outages.
The core lesson: local speed proves your SQL works. Production speed proves your system design works. Treat those as separate milestones,
and you’ll sleep more.