You click refresh. The banner is still there. Your homepage is down, your boss is asking if “the internet is broken,” and WordPress is cheerfully insisting it’s “briefly unavailable.”
That message is supposed to last seconds. When it sticks, it’s almost never mysterious. It’s a dead-simple lock file, a failed update, or a permissions/storage problem wearing a cheap disguise. Let’s treat it like an incident: verify impact, identify the blocking mechanism, restore service, and then fix the conditions that made it happen.
What the maintenance message really means
WordPress maintenance mode is not a “mode” in the modern feature-flag sense. It’s a file-based lock.
During an update (core, plugin, theme, translations), WordPress creates a file named .maintenance in the site root (same directory as wp-config.php).
While that file exists and appears “fresh enough,” WordPress short-circuits normal page rendering and returns the maintenance message.
The design is intentionally blunt. Updating code on disk while requests are actively executing PHP is a minefield. The lock file is WordPress’s way of saying:
“Back off. I’m swapping parts.”
Maintenance mode is supposed to clear automatically at the end of a successful update. When it doesn’t, the lock file remains, or WordPress keeps thinking an upgrade is in progress because something else is wedged: a half-extracted package, a permissions problem, a full disk, a cached response, or a PHP process that died mid-flight.
The message is honest about one thing: it really does happen during scheduled maintenance. The lie is “briefly.”
How WordPress decides to show the message
At a high level, WordPress does this:
- Creates .maintenance with a timestamp.
- Runs the update process (download, unzip, copy, maybe run DB upgrades).
- Deletes .maintenance when finished.
If the update process is interrupted—HTTP timeout, PHP fatal error, filesystem write failure, killed worker, crash, or a human clicking “Update all” and closing the tab mid-request—the cleanup step may never happen.
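One nuance is worth knowing before you start deleting things: WordPress compares the timestamp inside .maintenance against a staleness window (roughly ten minutes in current versions) when deciding whether to keep showing the banner. You can read the lock’s age yourself; a minimal sketch, adjusting the path to your WordPress root:
cr0x@server:~$ f=/var/www/example.com/public/.maintenance   # adjust to your docroot
cr0x@server:~$ ts=$(grep -o '[0-9]\+' "$f" | head -1)       # the $upgrading timestamp WordPress wrote
cr0x@server:~$ echo "lock age: $(( $(date +%s) - ts )) seconds"
If the age is far past that window and the banner is still up, start suspecting caches, repeated failed updates recreating the file, or a host-level maintenance page rather than the lock itself.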
Joke #1: WordPress maintenance mode is like a “Do Not Disturb” sign—great until housekeeping never comes back and you start living with yesterday’s towels.
Interesting facts and a little history (because it matters)
If you run WordPress in production, it helps to know what assumptions are baked into its update and locking mechanisms. Here are concrete facts that explain why this issue keeps showing up in 2025:
- The lock is a file, not a database flag. That choice dates back to early-WordPress-era simplicity: filesystem writes were assumed cheaper than schema changes and safe across shared hosting.
- Automatic background updates arrived in WordPress 3.7 (2013). Before that, most updates were manual and interactive, so failures were noticed faster—by annoyed humans.
- The maintenance message is intentionally generic. It avoids exposing internal details to unauthenticated users, which is good security hygiene and bad operator ergonomics.
- Core updates use a “copy then swap” style. WordPress tries to reduce partial-upgrade states, but plugins/themes are still vulnerable to half-written directories if the process is interrupted.
- Filesystem credentials logic exists because many sites can’t write to their own directories. This legacy from FTP-based shared hosting still influences how updates fail (credentials prompts, inability to write, silent partial failures).
- Translation updates are their own update stream. They can trigger maintenance mode too, even when you swear “we didn’t change anything.”
- Object caches and full-page caches can outlive the lock. You can delete .maintenance and still serve the banner if a cache upstream decided that message is “content.”
- Blue/green deployments made this rarer—until people started updating on live nodes again. In containerized environments, updating on-disk inside a running container is the modern equivalent of editing production with vim.
- Some hosts implement their own “maintenance” toggles. Managed WordPress platforms may present the same message via proxy rules, not WordPress itself.
Fast diagnosis playbook (first/second/third)
You want the shortest path to “site is back” without creating a more interesting incident. Here’s the order that works when you’re on-call and your coffee is still loading.
First: confirm whether it’s WordPress or the edge lying to you
- Bypass caches and test origin directly (or at least with cache-busting headers).
- If the banner persists only through CDN/WAF but not at origin, this is a cache purge incident, not a WordPress incident.
Second: check for .maintenance and its timestamp
- If .maintenance exists and is old, remove it and retest.
- If it’s new and updates are legitimately running, don’t yank the file blindly—verify the updater process isn’t still copying files.
Third: find the update that failed and why
- Check PHP-FPM/Apache/Nginx error logs for fatals/timeouts during upgrade.
- Check disk space and inode exhaustion (yes, inodes still ruin days).
- Check permissions/ownership: can the PHP user write to wp-content and (during core updates) to the WordPress root?
- Check for stuck wp-content/upgrade artifacts.
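If you want the playbook above as one quick triage pass, here’s a minimal sketch using the same docroot as the tasks below; adjust paths, the hostname, and the log location (nginx shown) to your stack.
cr0x@server:~$ curl -sS -o /dev/null -w 'origin status: %{http_code}\n' https://example.com/
cr0x@server:~$ ls -la /var/www/example.com/public/.maintenance 2>/dev/null || echo "no .maintenance"
cr0x@server:~$ df -h /var/www/example.com/public | tail -1
cr0x@server:~$ df -i /var/www/example.com/public | tail -1
cr0x@server:~$ sudo tail -n 20 /var/log/nginx/error.log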
Why it gets stuck: the real failure modes
1) The .maintenance file never got deleted
The most common case. A request timed out, a PHP worker died, someone navigated away, or the web server restarted mid-update.
WordPress doesn’t have a background janitor that reliably cleans up after catastrophic interruption.
If you remove the file and everything loads, you didn’t “fix” the update—you removed the lock. Sometimes that’s enough. Sometimes you just unblocked users while leaving a half-updated plugin that will crash later.
2) Partial update: plugin/theme directory in a broken state
Updates can leave behind:
- A plugin folder missing critical files (autoloaders, main plugin file).
- A new version extracted into a temp directory but not moved into place.
- A mix of old and new files because of permission failures mid-copy.
Removing .maintenance may just reveal the real problem: fatal errors, white screens, or admin login loops.
3) Filesystem permissions/ownership mismatch
WordPress updates need write access. On many servers, the code is owned by a deploy user (or root), but PHP runs as www-data (or similar).
Updates then fail halfway: downloaded zip exists, extraction fails, cleanup never runs, .maintenance remains.
4) Storage problems: full disk, full inodes, slow I/O, or network storage hiccups
As a storage engineer: this is where the “briefly” part dies.
Updates are write-heavy: download, unzip (lots of small files), rename, delete.
If your disk is full, or your inode table is full, or your NFS/EFS/SMB backend is having a moment, the update can hang or fail at a random step.
5) Caching layers keep serving the banner after it’s fixed
CDNs, reverse proxies, and WordPress caching plugins can cache the maintenance response as if it were a valid page.
The origin is healthy, but the edge keeps replaying the bad news.
6) Concurrent updates or “Update all” stampedes
Multiple admins clicking update buttons at the same time is an underrated chaos generator.
Even with a maintenance lock, you can still get overlapping attempts that leave temp directories and inconsistent states.
7) Managed host “maintenance mode” or platform updates
Some platforms show the same message while they snapshot, migrate, or patch. In that case, deleting .maintenance won’t help because WordPress isn’t the one emitting it.
Practical tasks with commands: diagnose, decide, fix
Below are real tasks you can run on a typical Linux host. Each includes: the command, what output means, and what decision you make.
Adjust paths to your docroot. Assume the WordPress root is /var/www/example.com/public.
Task 1: Confirm you’re seeing the origin, not a cached edge response
cr0x@server:~$ curl -sS -D- -o /dev/null -H 'Cache-Control: no-cache' https://example.com/ | sed -n '1,20p'
HTTP/2 503
date: Sat, 27 Dec 2025 12:01:11 GMT
content-type: text/html; charset=UTF-8
cache-control: no-cache, must-revalidate, max-age=0
server: nginx
What it means: HTTP 503 is consistent with WordPress maintenance mode. If you saw headers like x-cache: HIT or CDN-specific headers, you’d suspect caching.
Decision: If the response is 503 at origin too, proceed to server-side checks. If origin is 200 but edge is 503, purge/ban cache and review CDN rules.
Task 2: Check if .maintenance exists (and where)
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && ls -la .maintenance || echo "no .maintenance"'
-rw-r--r-- 1 www-data www-data 55 Dec 27 11:43 .maintenance
What it means: The file exists and is readable. Its presence alone can trigger the banner.
Decision: Check the timestamp freshness next. If it’s old, you can usually remove it. If it’s recent, confirm whether an update is actively running.
Task 3: Inspect the contents of .maintenance
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && cat .maintenance'
<?php $upgrading = 1766835803; ?>
What it means: It stores a UNIX timestamp. WordPress uses this to decide whether the lock is “stale.”
Decision: Convert it to human time and compare to now.
Task 4: Convert the upgrade timestamp and judge staleness
cr0x@server:~$ date -d @1766835803
Sat Dec 27 11:43:23 UTC 2025
What it means: Now you know when maintenance started.
Decision: If this was 30+ minutes ago and no updates are running, treat it as stuck and remove the file after checking logs for failures.
Task 5: Check whether an update process is actively running (PHP-FPM workers, unzip, wp-cli)
cr0x@server:~$ ps aux | egrep 'wp-cli|wordpress|unzip|php-fpm|php.*(update|upgrade)' | grep -v egrep
www-data 21904 0.2 1.1 312000 46000 ? S 11:42 0:02 php-fpm: pool www
What it means: Nothing obvious is running besides normal PHP workers.
Decision: Safe to assume the update is not in-flight. Move on to cleanup: remove the lock and then validate integrity.
Task 6: Remove .maintenance safely and retest
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && rm -v .maintenance'
removed '.maintenance'
What it means: The lock is gone.
Decision: Immediately retest the site and admin. If it loads, you still need to confirm updates didn’t leave the code half-broken.
Task 7: Check for the upgrade working directory that signals interrupted upgrades
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && ls -la wp-content | sed -n "1,60p"'
total 64
drwxr-xr-x 9 www-data www-data 4096 Dec 27 11:42 .
drwxr-xr-x 5 www-data www-data 4096 Dec 27 11:40 ..
drwxr-xr-x 2 www-data www-data 4096 Dec 27 11:42 upgrade
drwxr-xr-x 36 www-data www-data 4096 Dec 27 11:10 plugins
drwxr-xr-x 8 www-data www-data 4096 Dec 27 10:55 themes
What it means: The upgrade directory exists. That’s normal during upgrades; it’s suspicious when it sticks around full of debris.
Decision: Inspect it. If it contains partial extracts from hours ago, clean it up after confirming no update is running.
Task 8: Inspect wp-content/upgrade contents
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && find wp-content/upgrade -maxdepth 2 -type f | head'
wp-content/upgrade/temp-backup/plugin-akismet.1766835740.zip
wp-content/upgrade/akismet/akismet.php
What it means: You have leftover artifacts. Not always harmful, but a sign the update path didn’t finish cleanly.
Decision: If the site is stable, you can remove stale upgrade temp content to prevent future collisions and reclaim space.
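A cautious way to do that cleanup is to list first, delete second, so you see exactly what would go. A sketch, assuming no update is currently running and using a two-hour age threshold; tune -mmin to your incident window:
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && find wp-content/upgrade -mindepth 1 -mmin +120'
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && find wp-content/upgrade -mindepth 1 -mmin +120 -delete'
Review the first command’s output before running the second; -delete is unforgiving.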
Task 9: Check disk space (updates fail when you’re out of space)
cr0x@server:~$ df -h /var/www/example.com/public
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 40G 39G 300M 99% /
What it means: 99% usage. Updates need headroom for zip downloads and extraction. 300MB is “one plugin update away from shame.”
Decision: Free space before reattempting updates. Otherwise you’ll repeat the incident.
Task 10: Check inode exhaustion (the silent killer of “lots of small files” operations)
cr0x@server:~$ df -i /var/www/example.com/public
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 2621440 2621200 240 100% /
What it means: You have effectively zero inodes. Unzipping WordPress updates creates thousands of files; inode exhaustion makes “write” fail even when space remains.
Decision: Find and delete inode-heavy junk (cache directories, old backups, temp files). Then re-run updates.
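To find the junk, count inodes per directory instead of bytes. A hedged sketch (du --inodes needs a reasonably recent GNU coreutils; -x keeps it on one filesystem):
cr0x@server:~$ sudo du --inodes -x -d 2 / 2>/dev/null | sort -n | tail -15
The biggest offenders are usually cache directories, session files, and forgotten backup extractions; delete from those, not from WordPress itself.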
Task 11: Check file ownership and permissions in the WordPress root
cr0x@server:~$ sudo bash -lc 'cd /var/www/example.com/public && stat -c "%U:%G %a %n" wp-config.php wp-content wp-admin | sed -n "1,10p"'
root:root 640 wp-config.php
root:root 755 wp-content
root:root 755 wp-admin
What it means: Everything is owned by root. If PHP runs as www-data, WordPress can’t write updates unless you’ve configured filesystem methods/ACLs.
Decision: Decide your operational model: either (a) deploy immutably and do updates via CI/CD, or (b) allow WordPress to write by fixing ownership/ACLs. Don’t half-do both.
Task 12: Verify the PHP runtime user and whether it can write
cr0x@server:~$ ps -o user,group,comm -C php-fpm8.2 2>/dev/null | head
USER GROUP COMMAND
root root php-fpm8.2
www-data www-data php-fpm8.2
What it means: Workers run as www-data.
Decision: Ensure www-data has write permissions where updates occur (commonly wp-content and sometimes the full root for core updates). If you can’t allow that, stop doing in-place updates.
Task 13: Run WP-CLI to check core integrity and pending updates
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp core verify-checksums'
Success: WordPress installation verifies against checksums.
What it means: Core files match expected checksums for your version. Good sign that core didn’t half-upgrade.
Decision: If this fails, plan to reinstall core files (without touching wp-content) or redeploy from a known-good artifact.
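If you go the reinstall route, WP-CLI can re-download core files for your current version while leaving wp-content untouched. A sketch, assuming you want to stay on the installed version rather than upgrade mid-incident:
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp core download --force --skip-content --version="$(wp core version)"'
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp core verify-checksums'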
Task 14: Identify what was updating when it got stuck
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp plugin list --update=available'
+-------------------+----------+-----------+---------+
| name | status | update | version |
+-------------------+----------+-----------+---------+
| akismet | active | available | 5.3 |
| woocommerce | active | available | 8.4.0 |
+-------------------+----------+-----------+---------+
What it means: Updates are available; the stuck maintenance likely occurred during one of these (or during core/theme updates).
Decision: Don’t click “update all” again. Update one component at a time after stabilizing disk/permissions, watching logs.
Task 15: Check recent PHP errors that often coincide with failed upgrades
cr0x@server:~$ sudo tail -n 60 /var/log/php8.2-fpm.log
[27-Dec-2025 11:42:19] WARNING: [pool www] child 22001 said into stderr: "PHP Fatal error: Uncaught Error: Class 'Automattic\WooCommerce\Internal\DependencyManagement\Container' not found in /var/www/example.com/public/wp-content/plugins/woocommerce/woocommerce.php:45"
[27-Dec-2025 11:42:19] WARNING: [pool www] child 22001 said into stderr: "PHP Stack trace: #0 {main} thrown in /var/www/example.com/public/wp-content/plugins/woocommerce/woocommerce.php on line 45"
What it means: A plugin is in a broken state (missing class). That’s classic partial update: some files moved, others didn’t.
Decision: Restore that plugin directory from a known-good version (backup/artifact), or reinstall via WP-CLI. Don’t just delete random files.
Task 16: Reinstall a broken plugin cleanly (without guessing)
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp plugin deactivate woocommerce && wp plugin install woocommerce --force --activate'
Plugin 'woocommerce' deactivated.
Installing WooCommerce (9.1.0)
Downloading installation package from https://downloads.wordpress.org/plugin/woocommerce.9.1.0.zip...
Unpacking the package...
Installing the plugin...
Plugin installed successfully.
Activating 'woocommerce'...
Plugin 'woocommerce' activated.
Success: Installed 1 of 1 plugins.
What it means: You replaced the plugin files with a coherent set and reactivated it.
Decision: Retest critical flows (checkout, login, admin). If this is a revenue plugin, do not assume “activated” equals “healthy.”
Task 17: Check the web server error logs for permission/rename failures during update
cr0x@server:~$ sudo tail -n 80 /var/log/nginx/error.log
2025/12/27 11:42:18 [error] 21011#21011: *4411 FastCGI sent in stderr: "PHP message: PHP Warning: rename(/var/www/example.com/public/wp-content/plugins/akismet,/var/www/example.com/public/wp-content/plugins/akismet.old): Permission denied" while reading response header from upstream, client: 203.0.113.50, server: example.com, request: "POST /wp-admin/update.php?action=upgrade-plugin HTTP/2.0"
What it means: WordPress tried to rename directories as part of the update and got Permission denied.
Decision: Fix ownership/ACLs, then retry updates. Otherwise it will fail every time, usually at the rename step.
Task 18: Fix ownership for a “WordPress updates itself” model (opinionated, but common)
cr0x@server:~$ sudo chown -R www-data:www-data /var/www/example.com/public/wp-content
What it means: PHP can now write to wp-content. (You may also need write access to the root for core updates, depending on how you operate.)
Decision: If you run immutable deployments, do not do this. Instead, disable in-dashboard updates and push changes via pipeline.
Task 19: Clear caches that may be replaying the maintenance page
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp cache flush'
Success: The cache was flushed.
What it means: WordPress object cache cleared (works with some cache backends; varies).
Decision: If you have a separate full-page cache (Nginx FastCGI cache, Varnish, CDN), purge that too. If you can’t purge, at least bypass and confirm origin health.
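What “purge that too” means depends on the layer. Two hedged examples: for an Nginx FastCGI cache, the cache is just files under whatever fastcgi_cache_path points at (the path below is an assumption), and for Varnish a broad ban works; CDNs have their own purge APIs or dashboards.
cr0x@server:~$ sudo find /var/cache/nginx/fastcgi -type f -delete
cr0x@server:~$ sudo varnishadm 'ban req.url ~ .'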
Task 20: Validate HTTP status and body after fixes
cr0x@server:~$ curl -sS -D- -o /dev/null https://example.com/ | sed -n '1,15p'
HTTP/2 200
date: Sat, 27 Dec 2025 12:09:07 GMT
content-type: text/html; charset=UTF-8
server: nginx
What it means: You’re back to 200 OK.
Decision: Incident isn’t over until you confirm key pages and admin actions are healthy, and you’ve addressed the underlying cause (space, permissions, update process).
Three corporate-world mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A mid-size company ran WordPress behind a CDN and a WAF, with a separate origin cluster. Marketing scheduled a “minor plugin update” fifteen minutes before a product announcement.
Someone saw the maintenance banner and did what the internet tells you: deleted .maintenance on one origin node. The banner stayed up.
The wrong assumption was subtle: they assumed the banner was generated by the origin at the moment they refreshed. It wasn’t. The CDN had cached the 503 maintenance response for a surprisingly long time because of an edge rule intended to protect the origin during outages. It treated 503 as cacheable “to reduce load.”
Meanwhile, the update had actually succeeded on two nodes and failed on one. So the fleet was split-brain: some nodes were serving the updated plugin, one node had a partial directory and was throwing fatals, and the edge was happily replaying the maintenance page from cache.
The fix was not heroic. They purged the CDN cache for the affected paths, removed the broken node from rotation, reinstalled the plugin coherently, and only then reintroduced the node.
But the lesson stuck: when you see a generic message, you don’t get to assume where it was generated. Verify the layer first.
Their post-incident action item was simple and effective: make 503 responses non-cacheable at the edge unless explicitly whitelisted, and add an “origin bypass” check to the on-call runbook.
Mini-story 2: The optimization that backfired
Another org decided to “speed up” WordPress by moving wp-content to network storage so multiple web heads could share uploads and plugins. It worked fine under normal traffic.
Updates, however, became a roulette wheel.
On update day, a plugin zip would download, extraction would begin, and then the filesystem would stall for seconds at a time. NFS latency spikes meant directory renames and metadata operations took long enough that PHP requests hit timeouts.
WordPress would die mid-upgrade, leaving .maintenance behind and a plugin directory half-moved.
The team tried to compensate by increasing PHP timeouts and adding retries. That made the site “more tolerant” but also made failures slower and harder to detect. Users sat in maintenance mode longer. On-call sat in meetings longer. Everybody lost.
The boring fix was to stop optimizing the wrong thing: they moved plugin/theme code back to local disk on each node and only kept uploads on shared storage. Deployments synchronized plugin changes across nodes using artifacts.
Updates stopped sticking because the critical upgrade path no longer depended on chatty network filesystem metadata calls.
When people say “storage is slow,” they usually mean “metadata is slow.” WordPress updates are mostly metadata operations wearing a zip file costume.
Mini-story 3: The boring but correct practice that saved the day
A company with a heavily customized WordPress setup ran a strict release process: staging first, then production, with WP-CLI-driven updates and a real rollback path.
No one was allowed to click “Update” in production, which made the admin UI slightly less exciting. The internet survived.
One afternoon, automatic translation updates triggered maintenance mode briefly, then the site fell back to normal. Minutes later, error rates ticked up in monitoring—not from maintenance mode, but from a plugin autoloader conflict that only manifested under a certain cached state.
Because they had logs, metrics, and a known-good artifact, they didn’t start “fixing” production by hand. They rolled back to the previous artifact on all nodes, disabled the automatic update channel that introduced the change, and restored service consistently.
Then they reproduced the issue in staging and fixed it properly.
Their big win wasn’t a clever command. It was discipline: single source of truth for code, repeatable deployments, and the ability to revert without negotiating with a half-written plugin directory.
Joke #2: The most reliable WordPress update strategy is still “don’t do live surgery,” which is also my advice for most hobbies.
Common mistakes: symptom → root cause → fix
This is the part where you stop guessing. Match your symptom to a likely cause and do the specific fix.
1) Symptom: Maintenance banner stays for hours; site returns 503
- Root cause: Stale .maintenance file left behind after an interrupted upgrade.
- Fix: Remove .maintenance from the WordPress root; then verify plugin/theme/core integrity and check logs to find what failed.
2) Symptom: Banner is gone after deleting .maintenance, but now you get a white screen (500)
- Root cause: Partial plugin/theme update causing PHP fatal error.
- Fix: Check PHP error logs, then reinstall or restore the broken plugin/theme. Use WP-CLI where possible. Don’t “delete random folders until it works.”
3) Symptom: Some users see the banner, others don’t
- Root cause: CDN or reverse proxy caching the maintenance response; or multiple origin nodes with inconsistent state.
- Fix: Test origin directly, purge caches, and ensure all nodes have consistent code. Remove broken nodes from rotation.
4) Symptom: Updates always fail and maintenance mode often sticks
- Root cause: Permissions/ownership mismatch; PHP user cannot rename/write directories.
- Fix: Pick a model: either allow WordPress to write (ownership/ACLs) or disable in-dashboard updates and deploy via CI/CD.
5) Symptom: Update starts, then hangs; server load spikes; I/O wait climbs
- Root cause: Slow storage or network filesystem latency during unzip/rename operations.
- Fix: Move code to local disk, keep shared storage for uploads only, or redesign updates to be artifact-based. Investigate I/O bottlenecks and inode usage.
6) Symptom: Maintenance message keeps reappearing right after you remove it
- Root cause: Another update process is repeatedly failing and recreating .maintenance, often via auto-updates or cron.
- Fix: Disable auto-updates temporarily, inspect cron jobs, and locate the failing component via logs and WP-CLI update output.
7) Symptom: Only /wp-admin shows issues; front-end looks okay
- Root cause: Admin-side update endpoint failing; may be due to PHP timeouts, WAF rules blocking POST, or broken plugin loaded only in admin.
- Fix: Check WAF logs, PHP timeouts, and admin-specific plugin errors. Update using WP-CLI to bypass browser timeouts.
Checklists / step-by-step plan
Emergency restore checklist (get users back)
- Verify layer: Is the maintenance response coming from origin or cached upstream?
- Check .maintenance: If present and stale, remove it.
- Retest: Confirm HTTP 200 and that the homepage renders.
- Check logs: Look for PHP fatals and permission errors during the upgrade window.
- Fix broken component: Reinstall/restore the plugin/theme/core as needed.
- Stabilize storage: Ensure disk space and inodes aren’t near zero.
- Purge caches: Object cache + page cache + CDN cache as appropriate.
- Validate critical paths: Login, admin dashboard, and any revenue workflows.
Controlled recovery plan (when you suspect partial upgrades)
- Put the site in intentional maintenance (optional): If you have a real maintenance plugin/page, use that instead of relying on .maintenance chaos.
- Identify the exact failing update: Use WP-CLI lists and log timestamps.
- Rollback or reinstall cleanly: Replace entire plugin/theme directories; don’t try to patch missing files manually.
- Verify core checksums: Confirm base integrity before blaming everything else.
- Re-run updates one-by-one: Watch logs during each update.
- Document the root cause: Disk, inodes, permissions, caching, or process model—pick one primary cause and fix it.
Operational model checklist (choose your strategy)
This is where teams stop stepping on rakes. Decide how updates happen in your environment:
- Model A: WordPress self-updates on the server. Then you must ensure correct filesystem ownership/ACLs, sufficient disk/inodes, and safe caching rules.
- Model B: Immutable deploys (recommended for serious production). Then disable in-dashboard updates, build artifacts in CI, and deploy atomically with rollback.
Model A can work. Model B scales better. Mixing them produces the kind of incident where everyone is right and the site is still down.
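If you pick Model B, enforce it in configuration instead of in a wiki page. WordPress honors a couple of constants for this; a sketch using WP-CLI to set them (note that DISALLOW_FILE_MODS also disables the plugin/theme editors, and turning off the automatic updater means your pipeline must deliver security updates promptly):
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp config set DISALLOW_FILE_MODS true --raw'
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp config set AUTOMATIC_UPDATER_DISABLED true --raw'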
Prevention: stop re-living this incident
Stop doing big-bang “Update all” in production
Update one component at a time. Watch the system. If it fails, you’ll know what failed.
“Update all” is efficient for humans and brutal for debugging.
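In practice, “one at a time” looks like updating by name and checking between steps. A minimal sketch with WP-CLI (akismet is just the example from earlier; substitute your own plugin and log path):
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp plugin update akismet'
cr0x@server:~$ curl -sS -o /dev/null -w '%{http_code}\n' https://example.com/
cr0x@server:~$ sudo tail -n 20 /var/log/php8.2-fpm.log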
Give updates the resources they need
Updates are not compute-heavy, but they are filesystem-heavy. That means:
- Keep real disk headroom (not “99% used but technically fine”).
- Watch inodes, especially on small root filesystems and container overlay layers.
- Avoid running plugin/theme code on slow network filesystems if you can.
Make caching respect reality
If your edge caches 503s for long periods, you are choosing “faster outages.” Sometimes that’s defensible. Often it’s just cargo cult configuration.
Ensure your CDN/proxy treats maintenance responses as non-cacheable unless you intentionally want that behavior.
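You can check what the edge actually does rather than what you believe it does: while the origin is serving a 503, hit the site twice through the CDN/proxy and look at the cache headers. A sketch, hedged because header names vary by provider (x-cache, cf-cache-status, age, and so on):
cr0x@server:~$ for i in 1 2; do curl -sS -D- -o /dev/null https://example.com/ | egrep -i '^(HTTP|age:|x-cache|cf-cache-status)'; echo '---'; done
If the second request reports a HIT or a growing Age on the 503, your edge is caching outages.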
Log and observe updates like you mean it
WordPress is application software. Updates are deployments. Treat them accordingly:
- Capture PHP errors centrally.
- Record the time of updates (even a simple change log helps).
- Alert on sustained 503 rates and sudden error spikes in the upgrade window.
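Even without a metrics stack, you can get a rough status-code breakdown straight from the access log during an update window. A sketch assuming the default nginx combined log format and path ($9 is the status field there):
cr0x@server:~$ sudo tail -n 5000 /var/log/nginx/access.log | awk '{codes[$9]++} END {for (c in codes) print codes[c], c}' | sort -rn
If 503 climbs to the top of that list outside a planned window, someone should be paged before a customer does it for you.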
Use WP-CLI for updates, especially on busy sites
Browser-driven updates are fragile: they rely on a single HTTP request surviving long enough. WP-CLI updates are more controllable, easier to log, and easier to run in a session that won’t die when your laptop roams Wi‑Fi.
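“A session that won’t die” can be as boring as tmux plus a log file. A minimal sketch, assuming tmux is installed and using an example log path:
cr0x@server:~$ tmux new -s wp-updates
cr0x@server:~$ sudo -u www-data bash -lc 'cd /var/www/example.com/public && wp plugin update akismet 2>&1 | tee -a /tmp/wp-updates-$(date +%F).log'
If the SSH connection drops, tmux attach -t wp-updates puts you back in front of the update instead of in front of a mystery.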
One quote to keep you honest
“Hope is not a strategy.” — General Gordon R. Sullivan
You don’t need to become a philosopher about it. Just stop relying on “it usually clears itself” as an operational plan.
FAQ
1) Is it safe to delete the .maintenance file?
Usually, yes—if the update is not actively running. Deleting it removes the lock. The risk is that you expose users to a partially updated site. If you suspect partial upgrades, remove the lock, then immediately verify logs and reinstall the broken component.
2) Where exactly is .maintenance located?
In the WordPress root directory—typically the same directory that contains wp-config.php, wp-admin, and wp-includes. Not inside wp-content.
3) Why does WordPress return a 503 during maintenance?
Because it’s the correct signal to clients and caches: “service temporarily unavailable.” The problem is when caches mishandle it, or when “temporarily” becomes “until a human shows up.”
4) The file is gone, but I still see the maintenance message. What now?
Suspect caching first. Test origin directly and purge CDN/reverse-proxy/page cache. If origin is 200 but edge still serves 503, it’s not WordPress anymore—it’s cache state.
5) Can a plugin cause maintenance mode without me updating it?
Yes. Auto-updates, translation updates, and some managed-host update mechanisms can trigger maintenance. Also, a plugin can break the update process, leaving the lock behind even if the trigger was something else.
6) Should I disable automatic updates to prevent this?
Don’t disable security updates just because updates once hurt you. Instead, fix the underlying causes: permissions, disk/inodes, and a safe update process (WP-CLI, staging, artifacts).
Disabling auto-updates trades a reliability incident for a security incident. That’s not a bargain.
7) Why does this happen more on shared hosting?
Shared hosting often has constrained disk, limited I/O, odd permissions, and PHP timeouts. WordPress’s update mechanism assumes it can write and rename files quickly. On shared hosting, that assumption is routinely false.
8) How do I prevent partial plugin updates?
Use a consistent deployment model. Prefer artifact-based deploys or at least WP-CLI updates executed in a stable session. Ensure adequate disk/inode headroom and correct ownership/ACLs. Avoid updating during peak traffic where timeouts are more likely.
9) Does running WordPress on NFS/EFS make this worse?
It can. Updates involve many small file operations (metadata heavy). Network filesystems can introduce latency spikes that cause timeouts and partial moves. If you must use shared storage, keep uploads there and deploy code locally.
10) What’s the cleanest “enterprise” fix?
Disable in-dashboard updates in production, build versioned artifacts in CI, deploy atomically, and keep a rollback path. Let WordPress be an application, not an interactive file editor.
Conclusion: next steps that actually help
If you’re stuck on “Briefly unavailable for scheduled maintenance,” your fastest win is almost always:
confirm origin vs cache, remove .maintenance if stale, then validate what the update broke.
After that, fix the real cause: permissions, disk/inodes, slow storage, or an update process that depends on a fragile browser request.
Practical next steps:
- Put “check .maintenance + check disk/inodes + check logs” into your on-call runbook.
- Pick an update model (self-update or immutable deploy) and enforce it.
- Make 503 responses non-cacheable unless you intentionally want cached outages.
- Use WP-CLI for controlled, logged updates—one component at a time.