Backups don’t fail during your quarterly tabletop exercise. They fail at 02:17, when the on-call phone is vibrating off the nightstand and your brain is still negotiating with reality.
In that moment you don’t care about elegant architecture. You care about one thing: how fast can I restore, prove it’s correct, and get users unblocked without making the damage worse? MySQL and SQLite can both be recovered. The question is which one is easier to recover when you’re tired, rushed, and operating with incomplete information.
What “easier to recover” really means
Most comparisons between MySQL and SQLite backups are written by people who haven’t restored either one while Slack is melting down. Under pressure, “easier” is not “fewer features.” It’s a checklist of things you can do fast, repeatably, and safely.
The recovery questions that matter at 02:17
- Do I have a consistent backup? Not “a file exists.” Consistent means it represents a real point in time without torn pages or half-written transactions.
- Can I restore without guessing? Ideally: one command and a known-good procedure. Reality: you’ll still need judgment, but fewer forks in the road helps.
- Can I validate quickly? If verification takes hours, people will skip it. Then you’ll learn the backup is broken at the worst possible time.
- Can I do point-in-time recovery (PITR)? If you can’t, “we restored” may also mean “we lost the last six hours.”
- What happens if the storage layer is the problem? Corruption is rarely polite. It will happily look like an application bug until you’re deep in denial.
Here’s the operational truth: SQLite is simple because it’s mostly one file, and MySQL is recoverable because it’s a system with battle-tested tooling. Under pressure, both can be easy, and both can be a trap. The difference is the kind of trap.
One quote worth keeping in your runbook: Hope is not a strategy.
— General Gordon R. Sullivan
That’s the recovery mindset. Don’t hope the file copy was consistent. Don’t hope the binlogs exist. Don’t hope WAL was included. Prove it.
Interesting facts and historical context (the parts that matter)
Small historical details explain why each system behaves the way it does during backup and restore.
- SQLite started in 2000 as an embedded database for tools that needed a SQL engine without a server process. Its backup story is “copy data safely,” not “coordinate a cluster.”
- SQLite is famously “serverless”: no daemon, no network protocol. That simplicity is a recovery superpower—until concurrency and filesystem semantics get involved.
- SQLite’s WAL mode (write-ahead logging) became mainstream in the 2010s because it improves concurrency. It also changes what “a backup” means: the .db file alone may not contain the latest committed data.
- MySQL’s InnoDB became the default storage engine in MySQL 5.5. Before that, MyISAM was common and lacked crash-safe transactions. If your org has legacy assumptions, they may be older than your coworkers.
- MySQL binary logs were designed for replication but became the backbone of point-in-time recovery. They’re not optional if you care about undoing human mistakes.
- Percona popularized hot physical backups for InnoDB in the 2000s with XtraBackup, because logical dumps don’t scale and don’t preserve physical layout.
- “Copy the datadir” used to be a common MySQL folk remedy. It worked sometimes, then bit people hard when file-per-table, redo logs, and metadata got more complex.
- SQLite relies on filesystem locking semantics. Put it on a weird filesystem or a shared network mount with quirky locks and you can turn “simple file DB” into a performance and corruption festival.
History isn’t trivia. It’s the reason your restore plan works—or doesn’t—when the pager is loud.
Backup primitives: what you’re actually backing up
SQLite: “the database is a file” (plus sometimes two more)
SQLite stores data in a single primary database file (commonly app.db). But depending on journaling mode, you may also have:
- Rollback journal: app.db-journal (common in DELETE journal mode).
- WAL file: app.db-wal, plus the shared-memory file app.db-shm (in WAL mode).
Under pressure, the trap is forgetting that “the database” may be three files and that they must be captured consistently. The good news: if you can use SQLite’s own backup mechanism or take the snapshot at the filesystem layer correctly, recovery is usually straightforward.
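If you do rely on file copies, the safe version is boring: pause the writer and copy the whole set together. A minimal sketch, assuming a systemd service named app and the paths used throughout this article:
# Pause the single writer, copy the full SQLite file set, resume.
# Service name and paths are illustrative.
systemctl stop app
cp -a /var/lib/app/app.db     /backups/sqlite/
cp -a /var/lib/app/app.db-wal /backups/sqlite/ 2>/dev/null || true   # may be absent after a clean shutdown
cp -a /var/lib/app/app.db-shm /backups/sqlite/ 2>/dev/null || true   # recreated on next open; harmless to copy
systemctl start app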
MySQL: a running system with multiple backup layers
MySQL has a server process, redo logs, undo logs, data files, and optionally binary logs. Backups come in two broad flavors:
- Logical backups (e.g., mysqldump, mysqlpump): portable SQL, slower at scale, can be partial, usually easier to inspect.
- Physical backups (e.g., Percona XtraBackup, filesystem snapshots): faster, preserve physical structure, require more discipline and compatibility care.
Under pressure, MySQL’s advantage is that it was built for operational recovery at scale—if you adopted the right backup strategy ahead of time. If you didn’t, it can be brutally unforgiving.
Joke #1: Backups are like fire extinguishers—everybody loves the idea, and nobody checks the pressure gauge until the kitchen is on fire.
SQLite recovery under pressure: the good, the bad, the file
Why SQLite can be ridiculously easy
If you have a consistent copy, restoring SQLite can be as simple as putting a file back where it belongs. There’s no server to start, no user grants to rehydrate, no replication topology to untangle.
This is why SQLite is beloved for mobile apps, edge devices, small internal tools, and “we need something reliable but not a full database service” products. For restore-time simplicity, it’s hard to beat: file in, app starts, done.
Why SQLite can be surprisingly hard
The hard part isn’t restoring. It’s knowing that what you copied is consistent.
If you copied the database file while the application was writing, you can get a backup that looks fine until you run a query that touches the wrong page. Or until the first vacuum. Or until the CEO tries to export a report and you discover you restored a logically inconsistent database that passes casual smoke tests.
WAL mode: recovery’s best friend and worst enemy
WAL mode is great for concurrency. It’s also a classic backup foot-gun. In WAL mode, recent committed transactions may live in -wal until checkpointed into the main .db file. If you back up only app.db, you may restore to an older state than you think.
Under pressure, SQLite recovery success often boils down to one question: did your backup include a consistent snapshot of .db, -wal, and -shm, or did it use SQLite’s online backup method?
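If cold file copies are your only option, one hedge is to checkpoint the WAL first so the main file is current. A minimal sketch, assuming writers are paused and the path used in the tasks below:
# Fold pending WAL frames back into the main database file, then confirm the WAL shrank.
sqlite3 /var/lib/app/app.db "PRAGMA wal_checkpoint(TRUNCATE);"
ls -lh /var/lib/app/app.db-wal   # should now be zero-length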
Corruption reality
SQLite is robust, but it’s not magical. Corruption usually comes from the environment: bad storage, broken locking over network filesystems, or applications that treat SQLite like a multi-writer server database and do their own “creative” concurrency.
When SQLite corrupts, your recovery path often looks like this:
- Try to restore from a known-good backup.
- If no good backup exists, attempt a dump-and-reload, salvage, or table-by-table extraction.
- Accept that some data may be lost, and plan how to reconcile at the application layer.
MySQL recovery under pressure: options, sharp edges, and why it’s still sane
MySQL’s recovery advantage: multiple paths
MySQL gives you choices: logical restore, physical restore, crash recovery, replica promotion, point-in-time recovery via binlogs. Under pressure, multiple options are good if you already know which one you’re going to use. Otherwise it’s decision fatigue with SQL syntax.
Logical backups: reliable, but time can kill you
mysqldump is the duct tape of MySQL operations. You can use it almost anywhere, restore into slightly different versions, and inspect what’s inside.
The downside is speed and operational blast radius:
- Large databases can take a long time to dump and a longer time to restore.
- Restores are chatty and can overwhelm I/O and replication.
- Under pressure, you may be tempted to skip constraints, triggers, or indexes “temporarily.” Temporarily has a habit of becoming permanent.
Physical backups: fast restores, more prerequisites
Physical backups (like XtraBackup) can restore large InnoDB datasets much faster than logical SQL replay. They also preserve internal structures, which helps with predictable performance after restore.
The cost is that you must respect MySQL version compatibility, configuration, and the backup tool’s requirements. Physical restore can be a clean win—until you discover you never practiced it and your runbook is three years out of date.
PITR: the real reason MySQL wins many production arguments
When someone runs a destructive query, “restore last night’s backup” is not a plan, it’s a confession. MySQL with binary logs can recover to a specific timestamp or transaction boundary. That’s a big deal.
SQLite can do PITR-ish things if you ship WAL segments or take frequent snapshots, but it’s not a first-class, standardized operational flow across organizations the way MySQL binlog-based PITR is.
Crash recovery is built in, but don’t confuse it with backup
InnoDB crash recovery can replay redo logs and get you back to a consistent state after a crash. That is not the same as having a backup you can restore to another machine. Under pressure, teams often conflate these and then discover that “the server won’t start” is not solvable with optimism.
Joke #2: “We don’t need backups, we have RAID” is like saying you don’t need seatbelts because your car has four tires.
Pressure verdict: who wins and when
If you need the simplest possible restore
SQLite wins when your database is small-to-medium, writes are controlled, and you have a clean, consistent snapshot mechanism (SQLite backup API, or filesystem snapshots done right). Restore can be as simple as replacing a file and restarting an app.
If you need point-in-time recovery and operational flexibility
MySQL wins when you need to undo specific mistakes, restore large datasets, handle concurrency and multiple writers, and recover without losing hours of data. MySQL’s backup ecosystem is richer, and it’s built around operational reality.
Where teams actually get hurt
SQLite teams get hurt by wrong assumptions about file copying, WAL behavior, and filesystem locking.
MySQL teams get hurt by not choosing a backup strategy (logical vs physical + binlogs) and then improvising during an outage.
If you want the most recoverable system under pressure, the answer is not “choose MySQL” or “choose SQLite.” The answer is: choose the database that matches your operational maturity, and then practice the restore like you mean it.
Practical tasks with commands, outputs, and decisions (12+)
These are real, runnable commands. Each one includes what the output means and what decision you make next. Under pressure, you want commands that collapse uncertainty.
SQLite tasks
Task 1: Identify whether the database is using WAL
cr0x@server:~$ sqlite3 /var/lib/app/app.db "PRAGMA journal_mode;"
wal
Meaning: Output wal means active changes may be in app.db-wal.
Decision: Backups must include app.db, app.db-wal, and app.db-shm, or use SQLite’s online backup method.
Task 2: Check if WAL files exist (and their size)
cr0x@server:~$ ls -lh /var/lib/app/app.db*
-rw-r----- 1 app app 512M Dec 30 01:58 /var/lib/app/app.db
-rw-r----- 1 app app 96M Dec 30 02:15 /var/lib/app/app.db-wal
-rw-r----- 1 app app 32K Dec 30 02:15 /var/lib/app/app.db-shm
Meaning: A large -wal suggests many committed transactions not yet checkpointed into the main file.
Decision: If restoring from file copies, restore all three files together from the same snapshot set.
Task 3: Take a consistent online backup with SQLite’s backup command
cr0x@server:~$ sqlite3 /var/lib/app/app.db ".backup '/backups/app.db.bak'"
Meaning: SQLite creates a consistent backup even while the DB is in use (it cooperates with its own locking).
Decision: Prefer this over cp when you can; it avoids the “torn copy” problem.
Task 4: Integrity check on a restored SQLite file
cr0x@server:~$ sqlite3 /restore/app.db "PRAGMA integrity_check;"
ok
Meaning: ok is the best two-letter word you’ll read tonight.
Decision: If it’s not ok, stop and restore a different backup or move into salvage mode.
Task 5: Quickly list tables to confirm schema is present
cr0x@server:~$ sqlite3 /restore/app.db ".tables"
accounts audit_log invoices sessions users
Meaning: You have the expected tables; you’re not staring at an empty or wrong database file.
Decision: Proceed to application-level smoke tests. If tables are missing, you restored the wrong file or a partial backup.
Task 6: Salvage from a damaged SQLite database using dump
cr0x@server:~$ sqlite3 /var/lib/app/app.db ".dump" > /tmp/app_dump.sql
Error: database disk image is malformed
Meaning: The file is corrupt enough that a full dump failed.
Decision: Try restoring a prior backup. If none, attempt salvage tactics (page-level recovery tools or extracting unaffected tables), and plan for partial data loss.
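If your sqlite3 CLI build includes the .recover command (recent versions do), it often extracts more than .dump from a damaged file. A salvage sketch with illustrative paths, not a guarantee:
# Best-effort salvage into a fresh database, then check what actually came back.
sqlite3 /var/lib/app/app.db ".recover" | sqlite3 /restore/app_recovered.db
sqlite3 /restore/app_recovered.db "PRAGMA integrity_check;"
sqlite3 /restore/app_recovered.db "SELECT count(*) FROM users;"   # sanity-check a table you care about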
MySQL tasks
Task 7: Confirm binary logging is enabled (PITR capability check)
cr0x@server:~$ mysql -uroot -p -e "SHOW VARIABLES LIKE 'log_bin';"
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
Meaning: Binlogs are enabled, so PITR is possible if you retained the files.
Decision: If OFF, you cannot do true PITR. You’ll be restoring to the backup time and accepting more data loss.
Task 8: Find where binlogs live and what format they’re in
cr0x@server:~$ mysql -uroot -p -e "SHOW VARIABLES WHERE Variable_name IN ('log_bin_basename','binlog_format');"
+------------------+-----------------------+
| Variable_name    | Value                 |
+------------------+-----------------------+
| binlog_format    | ROW                   |
| log_bin_basename | /var/lib/mysql/binlog |
+------------------+-----------------------+
Meaning: You know where to look and that the format is ROW (good for correctness, less readable).
Decision: Make sure backups include binlogs or that they’re shipped elsewhere; ROW means you’ll likely use mysqlbinlog for replay rather than hand-editing.
Task 9: Verify the latest successful backup exists and is non-trivial in size
cr0x@server:~$ ls -lh /backups/mysql/full/
-rw-r----- 1 backup backup 8.2G Dec 30 01:00 full.sql.gz
-rw-r----- 1 backup backup  105 Dec 30 01:00 full.sql.gz.sha256
Meaning: A multi-GB dump suggests you didn’t back up an empty schema by mistake.
Decision: Verify checksum before restore. If the dump is suspiciously small, stop and investigate before you replace production with a very accurate empty database.
Task 10: Checksum validation before restore
cr0x@server:~$ sha256sum -c /backups/mysql/full/full.sql.gz.sha256
full.sql.gz: OK
Meaning: Your backup artifact survived storage, transfer, and time.
Decision: If this fails, don’t restore it. Find a different backup, or you’ll create a second incident.
Task 11: Restore a mysqldump into a new instance (safer than in-place)
cr0x@server:~$ zcat /backups/mysql/full/full.sql.gz | mysql -uroot -p --host=127.0.0.1 --protocol=tcp
Meaning: No output usually means success; errors will print to stderr.
Decision: Restore into a fresh instance or separate schema when possible, then cut over. In-place restores are how you turn “bad day” into “career development opportunity.”
Task 12: Confirm row counts for critical tables after restore
cr0x@server:~$ mysql -uroot -p -e "SELECT table_name, table_rows FROM information_schema.tables WHERE table_schema='app' AND table_name IN ('users','orders');"
+------------+------------+
| table_name | table_rows |
+------------+------------+
| users      |     184233 |
| orders     |     921044 |
+------------+------------+
Meaning: Counts are plausible (not zero, not absurdly small).
Decision: If counts are off, you may have restored the wrong backup, or the dump was taken mid-incident and is incomplete.
Task 13: Identify the binlog position at backup time (for PITR)
cr0x@server:~$ zcat /backups/mysql/full/full.sql.gz | grep -m1 "CHANGE MASTER TO"
-- CHANGE MASTER TO MASTER_LOG_FILE='binlog.003982', MASTER_LOG_POS=19483211;
Meaning: This dump recorded the binlog file/position consistent with the snapshot.
Decision: Use that position as the start point for replaying binlogs to reach the desired recovery time.
Task 14: Inspect binlogs around a suspected bad query time
cr0x@server:~$ mysqlbinlog --start-datetime="2025-12-30 01:00:00" --stop-datetime="2025-12-30 02:00:00" /var/lib/mysql/binlog.003982 | head
# at 19483211
#251230 1:00:00 server id 1 end_log_pos 19483315 CRC32 0xa1b2c3d4 Anonymous_GTID last_committed=0 sequence_number=1 rbr_only=no
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
Meaning: You’re reading the binlog stream for that window.
Decision: Decide whether you’re doing time-based replay (stop before the mistake) or filtering a specific statement (harder, riskier, sometimes necessary).
Task 15: Replay binlogs to reach a target time (PITR)
cr0x@server:~$ mysqlbinlog --start-position=19483211 --stop-datetime="2025-12-30 01:47:00" /var/lib/mysql/binlog.003982 | mysql -uroot -p
Meaning: Transactions between the backup position and the stop time are applied.
Decision: If you suspect clock drift, prefer stop-position or GTID-based recovery. Time-based recovery assumes your timestamps and timezones aren’t lying to you.
Task 16: Check InnoDB crash recovery signals in logs
cr0x@server:~$ journalctl -u mysql -n 50 --no-pager
Dec 30 02:11:04 db1 mysqld[2190]: InnoDB: Starting crash recovery from checkpoint LSN=7123489123
Dec 30 02:11:05 db1 mysqld[2190]: InnoDB: 1 transaction(s) which must be rolled back
Dec 30 02:11:06 db1 mysqld[2190]: InnoDB: Crash recovery finished.
Meaning: MySQL performed crash recovery and came up cleanly.
Decision: If crash recovery loops or fails, stop hammering restart. Consider restoring from backup; repeated restarts can worsen disk issues.
Fast diagnosis playbook: find the bottleneck quickly
This is the “stop thrashing and get signal” sequence. You’re trying to answer: what’s preventing recovery—missing artifacts, slow restore, corruption, or the wrong recovery target?
First: prove you have the right artifacts
- SQLite: Confirm the journaling mode, and confirm you have a consistent set of files (.db plus -wal and -shm when applicable).
- MySQL: Confirm the backup exists, passes its checksum, and (if PITR is needed) that binlogs exist for the required time window.
Second: prove the restore target is sane
- Restore into a new path/instance first whenever possible.
- Validate filesystem free space. Restores fail late and rudely when disks fill.
- Check version compatibility (especially physical MySQL restores).
Third: find whether you’re I/O-bound, CPU-bound, or locked
- I/O-bound symptoms: restore throughput low, high await times, storage latency spikes. Fix: move to faster disk, use snapshots/physical restore, avoid decompressing on the same slow volume. (A quick check is sketched after this list.)
- CPU-bound symptoms: decompression pegging cores, checksums slow. Fix: parallel decompression, move compute closer to data, or use pre-decompressed staging.
- Locked symptoms (SQLite): “database is locked”, long stalls. Fix: stop writers, use backup API, or restore from snapshot while app is down.
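A quick way to collapse that uncertainty, assuming the sysstat package is installed and the mount points are illustrative:
# Three commands to classify a slow restore.
iostat -x 1 5                   # high %util and await -> I/O-bound
top -b -n 1 | head -20          # cores pegged by gzip/mysql -> CPU-bound
df -h /var/lib/mysql /restore   # a filling target volume fails late and rudely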
Fourth: decide the recovery objective clearly
- Do you need last known good (quick restore) or point-in-time (precise)?
- Are you recovering from hardware failure, operator error, or data corruption? The path differs.
Common mistakes: symptom → root cause → fix
SQLite mistakes
- Symptom: After restore, recent writes are missing.
- Root cause: WAL mode enabled; the backup copied only .db without -wal.
- Fix: Back up via sqlite3 .backup, or snapshot all related files atomically; verify by comparing expected last-write markers.
- Symptom: "database disk image is malformed" during queries or dump.
- Root cause: Corrupt database file (often due to unsafe copy during writes, storage corruption, or broken locking on shared mounts).
- Fix: Restore a known-good backup; if unavailable, attempt salvage/dump of intact tables and rebuild DB; move DB to local disk with correct locking.
- Symptom: “database is locked” during backup or app startup.
- Root cause: Long-running writer transactions, or multiple writers in an environment that doesn’t coordinate properly.
- Fix: Quiesce writers; ensure WAL mode and busy-timeout settings are sane (a minimal sketch follows this list); run backups via the SQLite backup mechanism.
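For the locking fix above, a minimal sketch. Note that busy_timeout is per-connection, so the durable fix belongs in the application's connection setup; the CLI calls below only illustrate the settings (paths from earlier tasks):
# Confirm WAL mode, and illustrate the busy-timeout the app should set on every connection.
sqlite3 /var/lib/app/app.db "PRAGMA journal_mode;"          # expect: wal
sqlite3 /var/lib/app/app.db "PRAGMA busy_timeout = 5000;"   # wait up to 5s for locks (per connection)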
MySQL mistakes
- Symptom: Restore completes, but data is hours old.
- Root cause: No binlogs retained (or binlogs not backed up/shipped), so no PITR.
- Fix: Enable log_bin, set a retention policy, and back up or ship binlogs off-host (a config sketch follows this list). Test PITR quarterly.
- Symptom: Physical restore fails to start MySQL with InnoDB errors.
- Root cause: Version/config mismatch, missing redo/undo files, or restoring datadir without proper preparation step.
- Fix: Standardize versions, record configs with backups, and rehearse physical restore procedures in an isolated environment.
- Symptom: mysqldump restore is painfully slow and blocks production.
- Root cause: Restoring into a busy instance; single-threaded replay; heavy index creation during load.
- Fix: Restore into a new instance, then cut over. Consider physical backups for large datasets. Use sane load strategies (but don’t “optimize” blindly).
- Symptom: PITR replay causes duplicate key errors or weird constraint failures.
- Root cause: Started replay from the wrong binlog position, replayed into the wrong schema state, or mixed GTID/non-GTID assumptions.
- Fix: Always record binlog coordinates at backup time. Use a clean restore baseline, then replay forward. Prefer GTID-consistent setups if you operate at scale.
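For the log_bin fix above, one possible shape of the change on MySQL 8.0. The config path, retention value, and service name are assumptions to adapt, not drop-in values:
# Enable binlogs for PITR (MySQL 8.0 variable names; adjust for your version and packaging).
cat <<'EOF' | sudo tee /etc/mysql/conf.d/binlog.cnf
[mysqld]
log_bin = /var/lib/mysql/binlog
binlog_format = ROW
binlog_expire_logs_seconds = 604800
EOF
sudo systemctl restart mysql
mysql -uroot -p -e "SHOW VARIABLES LIKE 'log_bin';"   # expect: ON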
Three corporate-world mini-stories from the trenches
Incident #1: A wrong assumption (SQLite file copy ≠ backup)
A small product team shipped an internal approval workflow tool. It used SQLite because it was "just metadata," the traffic was low, and it kept deployment dead simple. The backup plan was a nightly cron job: cp app.db app.db.$(date +%F) to a backup directory and then to object storage.
It ran for months. Nobody thought about it. That’s always when trouble starts.
One morning, the tool started failing approvals with vague SQL errors. The on-call restored last night’s backup. The application came back, but approvals were missing—specifically, the last day of changes. Everyone blamed the restore. Then they blamed the app. Then, inevitably, they blamed the on-call.
The real issue: the database was in WAL mode, and the cron job copied only app.db. The “backup” was consistent in the sense that it was a valid SQLite file. It was also missing committed transactions that lived in the WAL at the time of copy. The nightly copy faithfully backed up an older view of reality.
The fix was boring: switch backups to SQLite’s online backup command (or snapshot the full set of WAL-related files atomically), add an integrity check after backup, and add a restore test. Suddenly restore meant “restore,” not “restore-ish.”
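The boring fix, sketched as a script. Paths, retention, and the shipping step are illustrative, not the team's actual pipeline:
#!/bin/sh
# Nightly SQLite backup: online backup, verify, checksum, then ship.
set -eu
SRC=/var/lib/app/app.db
DST=/backups/app.db.$(date +%F)
sqlite3 "$SRC" ".backup '$DST'"
# Refuse to keep a backup that fails its own integrity check.
sqlite3 "$DST" "PRAGMA integrity_check;" | grep -qx ok
sha256sum "$DST" > "$DST.sha256"
# Ship to object storage here; a periodic restore test belongs in its own job.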
Incident #2: An optimization that backfired (MySQL restore speed hacks)
A mid-sized SaaS company ran MySQL with nightly logical dumps. Restores were slow, so someone proposed a classic “speed up imports” play: disable foreign key checks and unique checks during restore, and crank bulk settings. It worked in staging. It worked in a dev sandbox. It even worked once in production, which is how bad ideas earn tenure.
Then they hit a real incident: a primary node failed and they needed to rebuild quickly. They restored the dump with constraints disabled and brought the service back. Metrics looked normal. The incident was declared resolved.
Two days later, customer support tickets spiked. The system had orphaned rows and duplicated associations in edge cases. Not everywhere. Just enough to be costly and humiliating. The “optimization” had allowed inconsistent intermediate states to slip through because the restore process wasn’t identical to normal operation, and application writes began before all constraints were re-enabled and verified.
The postmortem was painful but useful: restores must be treated as production changes with validation gates. They moved to restoring into a new instance, ran integrity checks and application-level reconciliation jobs, and only then cut over. They also revisited physical backups to reduce restore time without turning off correctness.
Incident #3: The boring but correct practice that saved the day (MySQL PITR with binlogs)
An enterprise team ran MySQL with weekly full backups, daily incremental backups, and continuous binlog shipping. Not glamorous. It required discipline: retention settings, monitoring for binlog gaps, periodic restore drills, and the social skill of insisting that “yes, we are testing restores again.”
One afternoon, an engineer ran a data cleanup script intended for a staging environment. It connected to production. The script wasn’t malicious; it was just confident. It deleted rows in a high-value table, and the application happily propagated the absence of data into caches and downstream processes.
They declared an incident and froze writes. Then they executed a practiced runbook: restore last full backup to a new instance, apply incremental backups, and replay binlogs to a timestamp just before the script started. The service was back with minimal data loss, and the team could later reconcile the few minutes of writes that were intentionally paused during recovery.
What saved them wasn’t genius. It was a predictable system: known backup coordinates, binlog continuity, and the habit of rehearsing. When the adrenaline hit, there was less thinking and more doing.
Checklists / step-by-step plan
SQLite: step-by-step recovery plan
- Stop writers (or put the app in maintenance mode). SQLite can handle concurrency, but recovery is not the time to negotiate locks with a live workload.
- Identify journaling mode and confirm which files must be restored.
- Restore as a set: .db plus WAL-related files if applicable, from the same snapshot.
- Run integrity checks and basic schema checks.
- Smoke test with application queries that touch real data paths, not just “SELECT 1”.
- Capture the broken artifact for later analysis. Don't overwrite it. Incidents are expensive; extract learning from them. (A condensed command sequence follows this checklist.)
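Condensed into commands, assuming the backup came from sqlite3 .backup (a single consistent file) and the service name and paths used earlier in this article:
# SQLite restore sequence, condensed. Preserve the broken artifact; never restore over it.
systemctl stop app
mkdir -p /var/lib/app/broken
mv /var/lib/app/app.db /var/lib/app/broken/
mv /var/lib/app/app.db-wal /var/lib/app/app.db-shm /var/lib/app/broken/ 2>/dev/null || true
cp -a /backups/app.db.2025-12-29 /var/lib/app/app.db
sqlite3 /var/lib/app/app.db "PRAGMA integrity_check;"   # expect: ok
systemctl start app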
MySQL: step-by-step recovery plan (logical + PITR)
- Define the recovery point: last good time or “just before the bad query.” Write it down.
- Freeze writes if the goal is correctness (especially for PITR). You can’t chase a moving target forever.
- Validate backup artifact (checksum, size sanity).
- Restore to a new instance when possible. In-place restores are for emergencies inside emergencies.
- Apply binlogs from the recorded position to the target timestamp/position.
- Validate data: row counts, critical invariants, and app-level checks.
- Cut over (DNS, load balancer, connection strings), then monitor error rates and replication if applicable.
- Post-restore hygiene: ensure binlog retention resumes, backups continue, and the old instance is preserved long enough for forensic comparison. (A condensed command sketch follows this checklist.)
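Strung together, reusing the coordinates from Tasks 13-15; the target host is a placeholder for "a fresh instance, not production":
# Logical restore plus PITR, condensed. Coordinates and timestamps come from Tasks 13-15.
sha256sum -c /backups/mysql/full/full.sql.gz.sha256
zcat /backups/mysql/full/full.sql.gz | mysql -uroot -p --host=10.0.0.42
mysqlbinlog --start-position=19483211 --stop-datetime="2025-12-30 01:47:00" \
  /var/lib/mysql/binlog.003982 | mysql -uroot -p --host=10.0.0.42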
MySQL: step-by-step recovery plan (physical restore mindset)
- Confirm version and config compatibility with the backup.
- Provision clean storage with enough space for data + overhead.
- Prepare the backup (apply logs) per your tool’s procedure.
- Restore the datadir and set ownership/permissions correctly.
- Start MySQL and watch logs until it’s fully ready.
- Validate with checks that match your business invariants. (One possible command shape is sketched after this checklist.)
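One possible shape of that sequence with Percona XtraBackup. Flags, paths, and the service name are assumptions to verify against your tool's documentation, not a finished runbook:
# Physical restore sketch (XtraBackup-style). Prepare first, then copy back into an empty datadir.
xtrabackup --prepare --target-dir=/backups/mysql/physical/2025-12-30
sudo systemctl stop mysql
xtrabackup --copy-back --target-dir=/backups/mysql/physical/2025-12-30 --datadir=/var/lib/mysql
sudo chown -R mysql:mysql /var/lib/mysql
sudo systemctl start mysql
journalctl -u mysql -f   # watch until InnoDB reports it is ready for connections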
FAQ
1) Is SQLite “easier to recover” because it’s a single file?
Sometimes. If you have a consistent snapshot, restoring is trivial. If you copied it unsafely or forgot WAL files, recovery becomes archaeology.
2) Can SQLite do point-in-time recovery like MySQL?
Not as a standardized operational flow. You can approximate PITR with frequent snapshots or WAL shipping, but it’s bespoke and easy to get wrong under pressure.
3) Is cp app.db ever a valid SQLite backup?
Only if you can guarantee the database is not being written to during the copy, or you’re taking a filesystem snapshot that’s crash-consistent and includes WAL-related files. Otherwise, use SQLite’s backup mechanism.
4) For MySQL, is mysqldump good enough?
For small-to-medium datasets and portability, yes. For large datasets or strict RTOs, you’ll want physical backups plus binlogs. And you’ll want to practice restores.
5) What’s the biggest MySQL recovery “gotcha” in real life?
Thinking you have PITR when you don’t. Binlogs disabled, binlogs not retained, or binlogs not shipped off-host turns “undo the mistake” into “restore last night and apologize.”
6) If MySQL has crash recovery, why do I need backups?
Crash recovery helps a single instance recover from an unclean shutdown. It doesn’t protect you from disk loss, operator error, ransomware, or “we need to restore to yesterday.”
7) Which is more likely to corrupt?
Both can corrupt if storage lies. SQLite is more sensitive to filesystem locking and unsafe copying patterns. MySQL can also corrupt, but its ecosystem and operational patterns tend to catch issues earlier—if you monitor and validate.
8) What’s the fastest safe restore path for each?
SQLite: restore a known-good snapshot of the DB (and WAL set if applicable), run PRAGMA integrity_check, restart app. MySQL: restore from physical backup to a fresh instance, then apply binlogs to the target point.
9) Should I keep SQLite on network storage?
Avoid it unless you are confident in the filesystem’s locking semantics and behavior under contention. SQLite is happiest on local disks. Network mounts are where “simple” goes to get complicated.
10) How often should we test restores?
At least quarterly for systems that matter, and after any major version/config change. If you’ve never restored successfully, you don’t have backups—you have storage.
Conclusion: practical next steps
If you want recovery to be easy under pressure, design for it:
- For SQLite: stop doing casual file copies. Use SQLite’s backup mechanism or consistent snapshots, and always account for WAL. Add integrity checks after backup and after restore.
- For MySQL: pick a strategy: logical for portability, physical for speed, and binlogs for PITR. Then practice restores into a clean environment until the runbook stops being aspirational fiction.
- For both: validate backups automatically, keep multiple generations, and run at least one real restore drill per quarter. The time to discover your backup is broken is not when your customers already did.
Recovery isn’t a feature you add later. It’s a habit you either build deliberately—or pay for in sweat at 02:17.