You’re on a VPS. You want “a database.” Not a weekend project, not a yak farm. Something that won’t wake you up at 03:00 because a single file got stuck, or because your app suddenly has real traffic and your “simple” choice turns into a migration with teeth.
The fastest way to pick between PostgreSQL and SQLite is to stop arguing about features and start asking one brutal question: where is your concurrency and failure boundary? If it’s inside one process, SQLite is a scalpel. If it’s across many processes, users, jobs, and connections, PostgreSQL is your boring, battle-tested wrench.
The one-minute decision
If you read only this section, you’ll still make a respectable choice.
Pick SQLite if all of these are true
- Your app is mostly single-writer and modest traffic (think: one web process or a queue worker doing writes, not a swarm).
- You can live with file-based locking semantics and the occasional “database is locked” if you misuse it.
- You want zero ops overhead: no daemon, no background vacuum tuning, no connection pooling drama.
- Your failure domain is “this VPS and this disk” and you’re okay with that.
- You want easy local dev parity: shipping a single DB file is a power move.
Pick PostgreSQL if any of these are true
- You have multiple writers, multiple app instances, cron jobs, workers, analytics queries, admin tooling… anything that behaves like a small crowd.
- You need strong concurrency without turning your app into a lock coordinator.
- You care about isolation, durability guarantees, and recoverability under messy real-world failure modes.
- You want online schema changes, richer indexing, and query plans that scale beyond “cute.”
- You foresee growth and prefer to scale by adding CPU/RAM now and replicas later, rather than doing a high-stakes migration later.
Dry rule of thumb: if your database has to mediate human impatience (web traffic) and machine impatience (jobs), PostgreSQL is the adult in the room.
Joke #1: SQLite is like a bicycle: fast, elegant, and perfect until you try to move a couch with it.
A mental model that prevents regrets
Most “Postgres vs SQLite” debates die because people compare SQL syntax or feature checklists. The choice is really about operational shape: who talks to the database, how often, and what happens when things go wrong.
SQLite: a library with a file, not a server
SQLite runs in-process. There’s no database server daemon accepting connections. Your app links a library; the “database” is a file (plus optional journaling/WAL files). That means:
- Latency can be great because there’s no network hop. Calls are function calls.
- Concurrency is limited by file locking. Reads are fine. Writes require coordination; WAL improves this but doesn’t make it a free-for-all.
- Durability depends on filesystem semantics, mount options, and your use of synchronous settings. It’s not “unsafe,” it’s “you own the sharp edges.”
- Backups are file backups, which can be wonderfully simple—until you take one at the wrong time without using SQLite’s backup APIs.
PostgreSQL: a server with processes, memory, and opinions
PostgreSQL runs as a database server with its own processes, caches, write-ahead log (WAL), background vacuum, and well-defined transactional semantics. That means:
- High concurrency with MVCC (multi-version concurrency control): readers don’t block writers in the way you’d expect from file locks.
- Durability and crash recovery are core. You still need to configure and test, but the system is built for bad days.
- Operational overhead exists: upgrades, backups, monitoring, vacuum, and connection management.
- Scaling paths are clearer: replication, read replicas, partitioning, connection poolers, and mature tooling.
The boundary question
Ask: “Is the database a shared service boundary?” If yes, PostgreSQL. If no, SQLite can be a legitimate production database. Don’t underestimate how often “no” quietly turns into “yes” once you add a worker, then a second app instance, then an admin dashboard that runs heavy queries.
Interesting facts and a bit of history
Some context helps because the design choices weren’t arbitrary. They’re scars from real usage.
- SQLite was born in 2000 as an embedded database, originally written so a shipboard software project could work without a client/server database dependency at all; it went on to become the default “small SQL” engine for the world.
- PostgreSQL traces back to the 1980s (POSTGRES project at UC Berkeley), and its DNA shows: extensibility, correctness, and an academic obsession with transactional behavior.
- SQLite is arguably the most deployed database engine because it ships in phones, browsers, operating systems, and countless applications as a library.
- PostgreSQL popularized rich extensibility via custom types, operators, and extensions; this is why it’s the default “SQL plus” platform in many modern stacks.
- SQLite’s WAL mode (write-ahead logging) was added later to reduce writer blocking and improve concurrency; it changed what “SQLite is good for” in production.
- PostgreSQL’s MVCC means old row versions hang around until vacuum cleans them; this is a performance feature and an operational chore.
- SQLite’s database file format is famously portable across architectures and versions, but durability still depends on filesystem behavior.
- PostgreSQL also has a write-ahead log (same acronym as SQLite’s WAL mode, very different implementation), and it’s the basis for replication and point-in-time recovery.
- SQLite’s “database is locked” is not a bug; it’s an explicit outcome of the locking model. The bug is your assumption that it behaves like a server DB.
VPS realities: disks, memory, and neighbors
A VPS is not a laptop and not a managed database. It’s a small slice of a bigger machine with shared IO and sometimes unpredictable neighbors. Your database choice should respect that.
Disk IO is the first lie your benchmarks tell
On a VPS, your “SSD” might be fast, or it might be “fast when the neighbors are asleep.” SQLite and PostgreSQL both care about fsync behavior, but they experience it differently:
- SQLite writes to a single database file (plus journaling/WAL). Random writes can be punishing if your workload is churn-heavy.
- PostgreSQL writes to multiple files: data files and WAL segments. WAL writes are sequential-ish and can be kinder to real disks, but you now have background processes and checkpoints.
Memory is not just “cache”; it’s policy
SQLite relies heavily on the OS page cache. That’s fine—Linux is good at caching. PostgreSQL has its own shared buffers plus OS cache. If you mis-size it on a small VPS, you can end up double-caching and starving the rest of the system.
Process model matters when you have small RAM
SQLite lives inside your app process. PostgreSQL uses multiple processes and per-connection memory. On a 1 GB VPS, a pile of idle connections can be a performance bug, not a minor detail. If you run Postgres on small iron, you learn to love connection pooling.
Operational blast radius
SQLite’s blast radius is often “this file.” PostgreSQL’s blast radius is “this cluster,” but with better tooling to isolate and recover. SQLite can be recovered by copying a file—unless you copy it at the wrong moment. PostgreSQL can be recovered by replaying WAL—unless you never tested your backups. Pick your poison; then mitigate it.
Hands-on tasks: commands, outputs, decisions (12+)
Below are tasks you can run on a VPS today. Each one gives you a signal, not a vibe. The point is to decide based on evidence: IO capability, concurrency needs, and failure risks.
Task 1: Check CPU and memory pressure (are you even allowed to run Postgres?)
cr0x@server:~$ lscpu | egrep 'Model name|CPU\(s\)'
CPU(s): 2
Model name: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
cr0x@server:~$ free -h
total used free shared buff/cache available
Mem: 1.0Gi 220Mi 180Mi 12Mi 620Mi 690Mi
Swap: 1.0Gi 0B 1.0Gi
What it means: On 1 GB RAM, Postgres is possible but you must be disciplined (pool connections, tune memory). SQLite will feel effortless.
Decision: If you can’t afford a few hundred MB for Postgres plus headroom for your app, prefer SQLite or upgrade the VPS.
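If you do go the Postgres route on a box this size, a conservative starting point helps more than heroic tuning. A sketch, assuming a Debian/Ubuntu install like the one in Task 9; the exact values are judgment calls for a 1 GB VPS, not benchmarked recommendations:
cr0x@server:~$ sudo -u postgres psql -c "ALTER SYSTEM SET max_connections = 20;"
cr0x@server:~$ sudo -u postgres psql -c "ALTER SYSTEM SET shared_buffers = '192MB';"
cr0x@server:~$ sudo -u postgres psql -c "ALTER SYSTEM SET work_mem = '4MB';"
cr0x@server:~$ sudo systemctl restart postgresql
The first two only take effect after a restart. The point is to cap connection count and per-connection memory, and leave most of the RAM to the OS cache and your app.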
Task 2: Identify your storage type and mount options (durability lives here)
cr0x@server:~$ findmnt -no SOURCE,FSTYPE,OPTIONS /
/dev/vda1 ext4 rw,relatime,errors=remount-ro
What it means: ext4 with relatime is normal. If you see odd options like data=writeback or exotic network FS, you must treat SQLite durability claims with suspicion and tune Postgres carefully too.
Decision: If you’re on networked or weird storage, Postgres with tested WAL+fsync behavior is usually safer than “file copy database.”
Task 3: Quick disk latency check (your future “db is slow” ticket)
cr0x@server:~$ iostat -xz 1 3
Linux 6.2.0 (server) 12/30/2025 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
3.10 0.00 1.20 0.40 0.10 95.20
Device r/s w/s rkB/s wkB/s await svctm %util
vda 5.00 8.00 80.0 210.0 2.10 0.40 0.52
What it means: await in low single digits is decent. If you see 20–100ms spikes, both SQLite and Postgres will suffer, but SQLite will show it as stalls inside app threads.
Decision: High IO wait argues for Postgres with careful checkpoint tuning and possibly moving to better storage; it also argues for reducing write amplification either way.
Task 4: Measure filesystem sync cost (SQLite and Postgres both pay this bill)
cr0x@server:~$ sudo dd if=/dev/zero of=/var/tmp/fsync.test bs=4k count=25000 conv=fdatasync status=progress
102400000 bytes (102 MB, 98 MiB) copied, 1.52 s, 67.4 MB/s
25000+0 records in
25000+0 records out
102400000 bytes (102 MB, 98 MiB) copied, 1.52 s, 67.3 MB/s
What it means: This is crude, but it approximates “how painful is forcing durability.” If this is glacial, your “safe” settings will hurt.
Decision: If forced sync is expensive, SQLite needs WAL + sane synchronous settings; Postgres needs careful checkpointing and not overdoing synchronous_commit for non-critical writes.
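The dd test measures streaming throughput with one sync at the end; a database pays for many small syncs. If you can install fio, a per-write fdatasync test is closer to what a commit actually feels like (a sketch; the file path and size are arbitrary):
cr0x@server:~$ sudo apt-get install -y fio
cr0x@server:~$ fio --name=synclat --filename=/var/tmp/fio.test --rw=write --bs=4k --size=64M --fdatasync=1
cr0x@server:~$ rm /var/tmp/fio.test
Look at the fsync/fdatasync latency percentiles in the output: single-digit milliseconds is livable, tens of milliseconds means every durable commit will sting.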
Task 5: Verify open file limits (Postgres will care more)
cr0x@server:~$ ulimit -n
1024
What it means: 1024 is tight for Postgres under load with many connections and files. SQLite cares less, but your app might.
Decision: If you choose Postgres, raise limits via systemd or limits.conf; if you can’t, keep connections low and use a pooler.
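On systemd hosts the clean way to raise the limit for Postgres is a drop-in on the cluster unit rather than limits.conf. A sketch, assuming the Debian/Ubuntu postgresql@16-main unit that appears in Task 10 (adjust the version to yours):
cr0x@server:~$ sudo mkdir -p /etc/systemd/system/postgresql@.service.d
cr0x@server:~$ printf '[Service]\nLimitNOFILE=65536\n' | sudo tee /etc/systemd/system/postgresql@.service.d/limits.conf
cr0x@server:~$ sudo systemctl daemon-reload
cr0x@server:~$ sudo systemctl restart postgresql@16-main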
Task 6: Inspect live connection count (if it’s already a crowd, SQLite will get spicy)
cr0x@server:~$ sudo ss -tanp | awk '$4 ~ /:5432$/ {c++} END {print c+0}'
0
What it means: No Postgres right now, but the pattern is what matters: how many concurrent DB clients will exist?
Decision: If you expect dozens/hundreds of concurrent connections, Postgres plus a pooler wins. SQLite does not have “connections” in the same sense; it has “threads and processes fighting over a file.”
Task 7: Create a SQLite database with WAL and inspect pragmas (make it less fragile)
cr0x@server:~$ sqlite3 /var/lib/myapp/app.db 'PRAGMA journal_mode=WAL; PRAGMA synchronous=NORMAL; PRAGMA wal_autocheckpoint=1000;'
wal
What it means: WAL mode enabled; synchronous NORMAL is a common compromise (durable enough for many apps, less IO pain than FULL).
Decision: If you pick SQLite, you should be explicit about pragmas. Default settings are not “production policy,” they’re “generic library defaults.”
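One subtlety: journal_mode=WAL is stored in the database file, but synchronous and busy_timeout are per-connection settings, so your app must issue them every time it opens the database, not once from the shell. A minimal sketch of what each new connection should run (the values are examples, not policy):
PRAGMA busy_timeout = 5000;   -- wait up to 5 seconds for a lock instead of failing instantly
PRAGMA synchronous = NORMAL;  -- per-connection; pairs with WAL mode
Most SQLite drivers let you run statements right after connect; if yours has a connection hook, that is where these belong.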
Task 8: Simulate concurrent SQLite writes (spot the lock wall early)
cr0x@server:~$ for i in $(seq 1 20); do sqlite3 /var/lib/myapp/app.db "BEGIN IMMEDIATE; CREATE TABLE IF NOT EXISTS t(x); INSERT INTO t VALUES($i); COMMIT;" >/dev/null & done; wait; echo done
done
What it means: This is a rough test. If you rerun with more contention and start seeing “database is locked” in stderr, that’s your warning siren.
Decision: If your real workload resembles this (many writers), stop romanticizing SQLite and use Postgres.
Task 9: Install Postgres and confirm the service health
cr0x@server:~$ sudo apt-get update -qq
...output...
cr0x@server:~$ sudo apt-get install -y postgresql
...output...
cr0x@server:~$ sudo systemctl status postgresql --no-pager
● postgresql.service - PostgreSQL RDBMS
Loaded: loaded (/lib/systemd/system/postgresql.service; enabled)
Active: active (exited)
What it means: On Debian/Ubuntu, the wrapper service may show “active (exited)” while the cluster units run. Don’t panic; check the cluster.
Decision: If you can’t keep a service healthy on your VPS (permissions, disk full, memory pressure), SQLite might be the saner choice until you stabilize the host.
Task 10: Check Postgres cluster readiness
cr0x@server:~$ pg_lsclusters
Ver Cluster Port Status Owner Data directory Log file
16 main 5432 online postgres /var/lib/postgresql/16/main /var/log/postgresql/postgresql-16-main.log
What it means: It’s online. You have a data directory and a log file path—two things you will learn to respect.
Decision: If Postgres starts cleanly and stays online under your app, that’s a strong signal you can afford the ops.
Task 11: Inspect Postgres durability and checkpoint settings (don’t fly blind)
cr0x@server:~$ sudo -u postgres psql -c "SHOW synchronous_commit; SHOW fsync; SHOW full_page_writes; SHOW checkpoint_timeout; SHOW max_wal_size;"
synchronous_commit
-------------------
on
(1 row)
fsync
-------
on
(1 row)
full_page_writes
------------------
on
(1 row)
checkpoint_timeout
--------------------
5min
(1 row)
max_wal_size
--------------
1GB
(1 row)
What it means: Defaults are conservative. They aim for safety on generic hardware, not necessarily your specific VPS.
Decision: If you need high write throughput, you may tune checkpoints and WAL size. If you need maximum safety, keep these conservative and invest in backups and testing.
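If you do loosen checkpoint pressure, the changes are small and reversible. A sketch for a write-heavy small VPS; the numbers are assumptions, not a recommendation:
cr0x@server:~$ sudo -u postgres psql -c "ALTER SYSTEM SET max_wal_size = '2GB';"
cr0x@server:~$ sudo -u postgres psql -c "ALTER SYSTEM SET checkpoint_completion_target = 0.9;"
cr0x@server:~$ sudo -u postgres psql -c "SELECT pg_reload_conf();"
The trade-off: more WAL kept between checkpoints means more disk used and longer crash recovery, in exchange for fewer IO stalls.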
Task 12: Spot vacuum pressure (Postgres’s “housekeeping tax”)
cr0x@server:~$ sudo -u postgres psql -c "SELECT relname, n_dead_tup FROM pg_stat_user_tables ORDER BY n_dead_tup DESC LIMIT 5;"
relname | n_dead_tup
---------+------------
(0 rows)
What it means: No user tables yet. Later, this shows whether dead tuples are piling up. Piles mean bloat, slower queries, and eventually paging misery.
Decision: If you choose Postgres, you must monitor vacuum/bloat. If you can’t commit to that, SQLite’s simplicity starts looking attractive—assuming concurrency fits.
Task 13: Identify slow queries in Postgres (decide if you need indexes or a different DB)
cr0x@server:~$ sudo -u postgres psql -c "SHOW shared_preload_libraries;"
shared_preload_libraries
--------------------------
(1 row)
What it means: If pg_stat_statements isn’t enabled, you’re missing a key lens into query behavior.
Decision: If you’re running anything beyond toy traffic, enable query stats and treat them as production telemetry. SQLite has options, but Postgres makes this kind of analysis routine.
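Enabling pg_stat_statements is a one-time chore: a preload setting, a restart, and an extension per database. A sketch for Debian/Ubuntu (package and column names vary slightly across major versions, e.g. mean_time vs mean_exec_time):
cr0x@server:~$ sudo apt-get install -y postgresql-contrib
cr0x@server:~$ sudo -u postgres psql -c "ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';"
cr0x@server:~$ sudo systemctl restart postgresql
cr0x@server:~$ sudo -u postgres psql -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
cr0x@server:~$ sudo -u postgres psql -c "SELECT calls, mean_exec_time, left(query, 60) FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 5;"
The last query is the payoff: the statements that cost you the most, ranked, without guessing.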
Task 14: Estimate DB size and growth (SQLite file vs Postgres cluster)
cr0x@server:~$ du -sh /var/lib/myapp/app.db
48M /var/lib/myapp/app.db
cr0x@server:~$ sudo -u postgres psql -c "SELECT pg_size_pretty(pg_database_size(current_database()));"
pg_size_pretty
----------------
7289 kB
(1 row)
What it means: SQLite is one file; Postgres is a directory tree plus WAL. Growth patterns differ: SQLite file grows and may not shrink; Postgres can bloat unless vacuumed.
Decision: If you need predictable size management and retention, Postgres with proper vacuum and partitioning (if needed) tends to be easier to control than a monolithic file.
Task 15: Test backup/restore workflow (this decides whether you sleep)
cr0x@server:~$ sqlite3 /var/lib/myapp/app.db ".backup '/var/backups/app.db.bak'"
cr0x@server:~$ ls -lh /var/backups/app.db.bak
-rw-r--r-- 1 root root 48M Dec 30 03:12 /var/backups/app.db.bak
cr0x@server:~$ sudo -u postgres pg_dump -Fc -f /var/backups/pg.dump postgres
cr0x@server:~$ ls -lh /var/backups/pg.dump
-rw-r--r-- 1 postgres postgres 36K Dec 30 03:13 /var/backups/pg.dump
What it means: Both can be backed up. The key is consistency and restore testing. SQLite needs correct backup method; Postgres needs you to practice restore and permissions.
Decision: If you cannot and will not test restores, pick neither—because you’re not choosing a database, you’re choosing a future incident.
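A restore test doesn’t have to be elaborate to count. A minimal sketch using the backups from this task, restored on the same host (a real drill uses a scratch VM, and your database name will differ):
cr0x@server:~$ sqlite3 /var/backups/app.db.bak 'PRAGMA integrity_check;'
cr0x@server:~$ sudo -u postgres createdb restore_test
cr0x@server:~$ sudo -u postgres pg_restore -d restore_test /var/backups/pg.dump
cr0x@server:~$ sudo -u postgres psql -d restore_test -c "SELECT count(*) FROM pg_stat_user_tables;"
cr0x@server:~$ sudo -u postgres dropdb restore_test
If the integrity check doesn’t come back clean or the restore errors out, you learned it on a quiet afternoon instead of during an incident.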
Fast diagnosis playbook
This is the “something is slow” triage sequence. The goal is to isolate the bottleneck in minutes, not debate architecture in Slack for hours.
First: is it CPU, memory, or disk?
cr0x@server:~$ uptime
03:20:11 up 12 days, 2:41, 1 user, load average: 0.22, 0.40, 0.35
cr0x@server:~$ vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 184320 28000 635000 0 0 10 25 120 180 3 1 95 1 0
0 0 0 183900 28000 635200 0 0 0 0 110 170 2 1 97 0 0
Interpretation: High wa means disk IO wait; high si/so means swapping; high r with low idle means CPU pressure.
Action: If the host is swapping, fix memory first (reduce connections, tune Postgres, add RAM). If IO wait is high, look at checkpointing, fsync costs, and write patterns.
Second: is the database locked or blocked?
SQLite: look for lock errors in app logs; check if you’re doing long transactions.
Postgres: check blocking locks.
cr0x@server:~$ sudo -u postgres psql -c "SELECT pid, wait_event_type, wait_event, state, query FROM pg_stat_activity WHERE state <> 'idle' ORDER BY pid;"
pid | wait_event_type | wait_event | state | query
------+-----------------+------------+--------+-------
(0 rows)
Interpretation: If you see sessions waiting on locks, you’re not “slow,” you’re serialized. Different fix: shorten transactions, add indexes to reduce lock duration, avoid long-running DDL in peak hours.
Third: is it a query problem or a capacity problem?
For Postgres, identify slow queries and explain them. For SQLite, examine your access patterns and indexes and consider moving heavy queries off the hot path.
cr0x@server:~$ sudo -u postgres psql -c "EXPLAIN (ANALYZE, BUFFERS) SELECT 1;"
QUERY PLAN
--------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=1)
Planning Time: 0.020 ms
Execution Time: 0.010 ms
(3 rows)
Interpretation: In real use, you look for sequential scans on big tables, huge buffer hits, or time spent waiting on IO.
Action: If queries are slow because of missing indexes, fix schema. If slow because disk is slow, fix storage or reduce write churn. If slow because of concurrency, fix connection pooling or pick the correct database.
Common mistakes (symptoms → root cause → fix)
These are not moral failures. They’re predictable outcomes of treating a database like a black box.
1) “database is locked” appears sporadically (SQLite)
Symptoms: App errors under load, spikes during background jobs, requests failing and then succeeding on retry.
Root cause: Multiple writers or long transactions holding write locks. WAL helps, but a single writer still needs time.
Fix: Enable WAL; keep transactions short; serialize writes via a job queue; add busy_timeout; or migrate to Postgres if you need concurrent writes.
2) SQLite feels fast until you deploy multiple app instances
Symptoms: Works in dev, flaky in prod; performance tanks only after scaling horizontally.
Root cause: File locking across processes becomes contention. Also: shared filesystems are a trap.
Fix: Don’t share SQLite over NFS. If you need more than one writer process, use Postgres.
3) Postgres is “slow” but CPU is idle
Symptoms: High latency, low CPU, periodic stalls.
Root cause: IO wait during checkpoints or fsync-heavy write workload; max_wal_size too small; poor storage.
Fix: Increase max_wal_size; tune checkpoint settings; move WAL to faster disk if possible; reduce synchronous writes for non-critical paths (carefully).
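“Carefully” means scoping the relaxation, not flipping a global switch. Postgres lets you relax durability per transaction; in this sketch the table is hypothetical, the pattern is the point:
BEGIN;
SET LOCAL synchronous_commit = off;  -- only this transaction skips waiting for the WAL flush
INSERT INTO request_log(path, took_ms) VALUES ('/health', 3);
COMMIT;
A crash can lose the last few moments of such commits, but it cannot corrupt the database; keep customer-facing state on the default setting.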
4) Postgres falls over with many connections on a small VPS
Symptoms: Memory spikes, OOM kills, “too many clients,” random timeouts.
Root cause: One connection per request pattern; per-connection memory overhead; no pooling.
Fix: Use PgBouncer; reduce max_connections; use a sane pool size; fix app to reuse connections.
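A minimal PgBouncer setup fits on one screen. A sketch of /etc/pgbouncer/pgbouncer.ini for a single app database; names, sizes, and auth details are placeholders you must adapt:
[databases]
myapp = host=127.0.0.1 port=5432 dbname=myapp

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
max_client_conn = 200
default_pool_size = 10
Point the app at port 6432 instead of 5432. Transaction pooling is the usual win for web apps, but some session-level features (prepared statements, advisory locks, SET commands) need care under it; test before trusting it.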
5) Backups exist but restores fail
Symptoms: Restore test fails; permissions broken; missing roles; SQLite backup file corrupt or inconsistent.
Root cause: Backups taken incorrectly (SQLite file copy mid-write) or not tested (Postgres dumps missing globals/roles).
Fix: For SQLite, use .backup or the backup API; for Postgres, run restore drills including roles and schema; automate verification.
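The “missing roles” failure happens because pg_dump captures one database, not cluster-wide globals. A sketch of extending the Task 15 routine so roles come along (paths are examples):
cr0x@server:~$ sudo -u postgres pg_dumpall --globals-only -f /var/backups/pg-globals.sql
cr0x@server:~$ sudo -u postgres pg_dump -Fc -f /var/backups/pg.dump postgres
On restore, apply the globals file with psql first, then pg_restore the database dump.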
6) Postgres tables bloat and queries degrade over weeks
Symptoms: Disk usage grows faster than data; indexes swell; queries slow; vacuum runs constantly.
Root cause: MVCC dead tuples accumulate; autovacuum not keeping up; aggressive UPDATE/DELETE patterns.
Fix: Tune autovacuum per table; avoid hot updates where possible; consider partitioning or periodic maintenance.
7) SQLite file balloons and never shrinks
Symptoms: Disk use grows even after deletes; VPS runs low on disk.
Root cause: SQLite reuses pages but doesn’t always return space to filesystem; fragmentation; large deletes.
Fix: Periodic VACUUM (expensive); design retention strategy; consider splitting large tables or moving to Postgres if churn is high.
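Reclaiming space in SQLite is an explicit act. A sketch of the two usual options (VACUUM INTO needs SQLite 3.27 or newer; paths are examples):
cr0x@server:~$ sqlite3 /var/lib/myapp/app.db 'VACUUM;'
cr0x@server:~$ sqlite3 /var/lib/myapp/app.db "VACUUM INTO '/var/lib/myapp/app-compact.db'"
Plain VACUUM rewrites the file in place, needs roughly the database’s size in free disk, and wants a quiet moment with no writers; VACUUM INTO writes a compacted copy you can swap in during a maintenance window.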
8) “We used Postgres because it’s enterprise” and now ops are drowning
Symptoms: Nobody owns upgrades, vacuum, backups; the DB is a pet, not cattle.
Root cause: Choosing Postgres without allocating operational maturity.
Fix: Either invest in ops basics (monitoring, backup drills, upgrade cadence) or keep it simple with SQLite until you truly need the server DB.
Three corporate mini-stories
Mini-story 1: The incident caused by a wrong assumption (SQLite file on “shared storage”)
The company was mid-size, the product was healthy, and someone had a bright idea: run two app instances behind a load balancer “for resilience.” The database was SQLite, sitting on what the VPS provider advertised as “shared storage,” mounted into both instances. It seemed elegant. One file. Two instances. What could go wrong?
It worked for a few days. Then came the first traffic bump—nothing dramatic, just a marketing email. Requests started piling up. Latency spiked. Some users got errors; some got stale reads; a few saw odd partial updates that vanished on refresh.
The on-call dug through logs and found intermittent “database is locked,” but not consistently. Worse, there were occasional “disk I/O error” style messages that looked like hardware. They weren’t. They were the filesystem and lock manager having a disagreement about who owned the truth across two nodes.
The wrong assumption was subtle: “If the storage is shared, the file lock is shared.” On many shared filesystems, advisory locks don’t behave like local ext4 locks, especially under failure or latency. SQLite wasn’t “broken”; the environment violated the assumptions it makes to deliver ACID semantics.
The fix was boring: move to Postgres on one node first, then add a replica later. They also removed the shared mount and treated storage boundaries as failure boundaries. The incident report didn’t blame SQLite; it blamed the architecture that pretended a file could be a distributed system.
Mini-story 2: The optimization that backfired (Postgres tuned for speed, paid in data loss anxiety)
A different org had Postgres on a small VPS. Writes were heavy: events, logs, counters. The team wanted lower latency and saw a blog post about turning off durability knobs. They changed settings to reduce fsync pressure and made commits return faster. Everyone cheered. Graphs went down and to the right.
Two weeks later the VPS host had an unplanned reboot. Nothing dramatic—just one of those “node maintenance” events you only learn about after it happens. Postgres restarted fine, but a slice of the most recent writes was missing. Not catastrophic, but enough to trigger customer questions and internal alarm bells.
Now the real tax arrived: uncertainty. They couldn’t confidently say what was lost, and the product folks started treating the database as “maybe consistent.” That is a corrosive state. It turns every bug into a debate about whether the data is real.
The optimization backfired because it optimized the wrong thing: steady-state latency at the cost of predictable durability. There are valid reasons to relax durability for ephemeral analytics or caches. But they were using it for customer-facing state.
The eventual fix was to restore safe settings for core tables, isolate high-write low-value data into separate paths, and run proper backups with restore tests. They also introduced batching to reduce commit frequency rather than gambling on crash behavior.
Mini-story 3: The boring but correct practice that saved the day (backup drills and restore automation)
This one is less dramatic, which is the point. A team running a SaaS on a single VPS used Postgres. They were not fancy. They didn’t have a platform team. But they did one thing relentlessly: weekly restore drills to a scratch VM, with a checklist.
They had a script that pulled the latest backup, restored it, ran a small suite of sanity queries, and confirmed the app could boot against it. They also kept a minimal “runbook” describing how to promote the restored DB if the primary died. Nobody loved doing it. It was like flossing.
Then a developer accidentally ran a destructive migration against production. Not malicious. Just a fat-fingered environment variable and a migration tool that happily complied. Tables were dropped. The on-call muted alerts, swore quietly, and started the restore drill they had practiced.
They still had a bad hour, but not a bad week. They restored, re-ran migrations correctly, and replayed a short window of business events from logs. The CEO never had to learn what “WAL” stands for, which is the highest compliment operations can receive.
Quote (paraphrased): “You don’t rise to the occasion; you fall back to your preparation.” An idea repeated endlessly in reliability and ops circles, because it keeps being true.
Checklists / step-by-step plan
Checklist A: If you’re leaning SQLite (make it production-shaped)
- Confirm single-writer reality: list all code paths that write (web requests, workers, cron, admin scripts). If it’s more than one actor at a time, plan to serialize or migrate.
- Use WAL mode: set PRAGMA journal_mode=WAL.
- Set sane synchronous: usually NORMAL is a good VPS compromise; use FULL if you cannot tolerate recent-write loss on crash.
- Set busy_timeout: make the app wait briefly rather than fail instantly on lock contention.
- Back up correctly: use SQLite’s backup mechanism, not “cp the file during peak writes.”
- Plan for file growth: monitor DB file size and free disk; schedule periodic vacuum only if you must.
- Don’t put SQLite on NFS/shared mounts: local disk only, unless you enjoy debugging file locks across latency.
Checklist B: If you’re leaning PostgreSQL (make it boring, stable, and cheap)
- Right-size connections: keep max_connections sane; use a pooler for web apps.
- Set memory deliberately: tune shared_buffers conservatively on small RAM; leave headroom for OS cache and your app.
- Enable query visibility: turn on query stats so you can see what’s slow before users tell you.
- Monitor vacuum: watch dead tuples and autovacuum activity; bloat is a slow leak.
- Backups and restore tests: automate both. A backup without a restore drill is a wish.
- Upgrade planning: decide how you’ll handle minor updates and major version upgrades before you’re forced to.
- Disk management: monitor disk usage for data and WAL; avoid running at 90% full on a VPS.
Step-by-step: the no-regret decision path (15 minutes)
- Run Task 1–4 to understand RAM and IO reality.
- List your writers. If more than one concurrent writer exists now or soon, choose Postgres.
- If SQLite is still plausible, run Task 7–8. If lock contention appears under a toy concurrency test, choose Postgres.
- If choosing Postgres, run Task 9–12 and confirm you can keep it healthy on this VPS.
- Run Task 15 and do at least one restore drill. Pick the system whose restore path you can actually execute under stress.
Joke #2: The fastest database is the one you didn’t lose at 03:00, which is also why backups have the best ROI of any feature you’ll never demo.
FAQ
1) Can SQLite handle production traffic?
Yes, if “production traffic” means mostly reads, a small number of writes, and a controlled concurrency model. It’s used in plenty of real systems. It just doesn’t want to be your multi-tenant write coordinator.
2) Does WAL mode make SQLite “as good as Postgres”?
No. WAL reduces reader/writer blocking and improves concurrency, but you still have a single database file with locking semantics and fewer concurrency tools. Postgres is designed as a shared service.
3) Is Postgres overkill for a small VPS?
Sometimes. If your VPS is tiny and your workload is simple, Postgres can be extra moving parts. But if you have multiple writers or any growth trajectory, “overkill” quickly becomes “thank you for not making me migrate under pressure.”
4) What’s the biggest hidden cost of Postgres on a VPS?
Connection and memory management. Without pooling and sane limits, Postgres can burn RAM on idle sessions and die in a way that looks like “random instability.” It’s not random; it’s math.
5) What’s the biggest hidden cost of SQLite on a VPS?
Lock contention and operational assumptions. The moment you have multiple writers, long transactions, or you put the file on questionable storage, you inherit failure modes that feel mysterious until you accept the locking model.
6) If I start with SQLite, how painful is migrating to Postgres?
It ranges from “a weekend” to “a quarter,” depending on schema complexity, data volume, and how much your app relied on SQLite quirks. If you anticipate growth, design your app with a DB abstraction and migration tooling from day one.
7) Should I use SQLite for caching and Postgres for source of truth?
That can work, but don’t build a distributed system accidentally. If you need caching, consider in-memory caches or Postgres-native strategies. If you do use SQLite as a local cache, treat it as disposable and rebuildable.
8) What about durability: is SQLite unsafe?
SQLite can be durable when configured correctly and used on a filesystem that honors its expectations. The risk is not “SQLite is unsafe,” it’s “SQLite makes it easy for you to be unsafe without noticing.” Postgres centralizes those durability behaviors in a server that’s designed for crashes.
9) Do I need replication on a VPS?
Not always. For many VPS setups, the first win is reliable backups and restore drills. Replication is valuable once you have uptime requirements that exceed “restore within X minutes” and you can afford the complexity.
10) How do I decide if my app has “multiple writers”?
If writes can happen concurrently from more than one OS process or container (web workers, job workers, scheduled tasks, admin scripts), you have multiple writers. If you deploy multiple app instances, you definitely do.
Next steps you can do today
Pick a path and make it operationally real. Databases don’t fail because you chose the wrong brand; they fail because you didn’t match the system to the workload and didn’t practice recovery.
If you choose SQLite
- Enable WAL and set synchronous explicitly.
- Add a busy timeout and keep transactions short.
- Implement backups using SQLite’s backup mechanism and run a restore test.
- Write down a hard rule: “no shared filesystem, no multi-writer chaos.”
If you choose PostgreSQL
- Set up sane connection pooling and limits immediately.
- Turn on query visibility and watch for slow queries and locks.
- Automate backups and perform restore drills on schedule.
- Monitor disk usage and vacuum health before you need to.
The no-regret edition isn’t about picking the “best” database. It’s about picking the database whose failure modes you can predict, observe, and recover from on a VPS at human hours.