Debian 13: PHP-FPM socket permissions — the tiny fix that kills 502s (case #35)

Your page loads. Until it doesn’t. Then nginx throws a 502 and your incident channel lights up like a Christmas tree made of on-call fatigue.
You restart things. Sometimes it “works.” Sometimes it doesn’t. The load balancer keeps retrying, which is its way of helping you suffer longer.

On Debian 13, a very specific failure mode keeps showing up in the field: PHP-FPM is healthy, nginx is healthy, but the Unix socket between them
is not. Not broken. Not missing. Just… not accessible. A one-line permission fix will end hours of guesswork—if you understand what you’re actually fixing.

The case file: what “socket permissions” really means

When nginx talks to PHP-FPM, it usually does it one of two ways: TCP (127.0.0.1:9000) or a Unix domain socket
(something like /run/php/php8.3-fpm.sock). Unix sockets are faster to connect, avoid port exposure, and
are dead simple to reason about on a single host. Until they aren’t.

A Unix socket is a filesystem object. That means Linux permissions apply. And because it’s under /run,
it’s sitting on tmpfs: recreated at boot, often recreated at service start, and sometimes “helpfully” recreated by systemd units.
You don’t “chmod the socket once” and call it done. You arrange for the correct owner/group/mode to exist whenever the socket is created.

The recurring Debian 13 pain point: packages are sane, but your local configuration drift (or an old cookbook)
assumes the socket is owned by www-data:www-data with mode 0660. Meanwhile your nginx worker runs as
www-data but PHP-FPM may be configured differently, or the socket directory permissions don’t allow traversal,
or systemd is racing you with a default that reverts permissions after restarts.

The end result is usually one of these:

  • connect() to unix:/run/php/php8.3-fpm.sock failed (13: Permission denied)
  • connect() to unix:/run/php/php8.3-fpm.sock failed (2: No such file or directory)
  • upstream prematurely closed connection (often a different problem, but can be permission-adjacent during churn)

Our job is to turn those strings into a decision tree, not a séance.
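
As a rough sketch of that decision tree, the function below maps one nginx error-log line to the next thing to check. The function name and the category labels are made up for illustration; only the quoted error fragments come from the log strings listed above.

```shell
#!/usr/bin/env bash
# Map an nginx error-log line to the next diagnostic step.
# classify_upstream_error is a hypothetical helper; the labels are not nginx terminology.
classify_upstream_error() {
    local line=$1
    case "$line" in
        *"(13: Permission denied)"*)
            echo "permissions: check socket owner/group/mode, parent directories, then AppArmor" ;;
        *"(2: No such file or directory)"*)
            echo "path: compare fastcgi_pass with the pool's listen setting" ;;
        *"upstream prematurely closed connection"*)
            echo "upstream: read the PHP-FPM logs; probably not a socket permission issue" ;;
        *)
            echo "unknown: read the full log line" ;;
    esac
}
```

Feeding it the last error line, for example `classify_upstream_error "$(sudo tail -n 1 /var/log/nginx/error.log)"`, is enough to decide which branch of the playbook to start on.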

Fast diagnosis playbook (check 1/2/3/4)

1) Confirm nginx is failing to connect to a Unix socket (not PHP itself)

The first question is always: “Is this a transport problem or an application problem?”
If nginx can’t connect to the socket, PHP never even gets a chance to disappoint you.

2) Inspect the socket object and its parent directory

If the socket exists, check ownership/mode. Then check the directory permissions. Linux needs execute (“traverse”) on each
parent directory component, plus write permission on the socket itself; read permission alone is not enough to connect.

3) Verify PHP-FPM pool settings that control socket creation

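Socket permissions aren’t set by magic. They come from the pool config: listen, listen.owner,
listen.group, listen.mode. If those aren’t set, PHP-FPM’s defaults apply: the socket is owned by the user the master process
runs as (root under the stock systemd unit) with mode 0660, which is exactly the combination nginx workers can’t open.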

4) Only then: check AppArmor/SELinux and systemd tmpfiles

In Debian land, AppArmor is more common than SELinux. But don’t start there.
Start with the mundane. Most 502s are boring, and boring is fixable.

Facts & context: why Debian 13 makes this feel new

A few concrete facts and historical context points help explain why this “tiny fix” keeps biting teams during upgrades and migrations:

  1. Unix domain sockets predate most web stacks. They came from early Unix IPC designs and behave like filesystem objects, not network ports.
  2. /run replaced /var/run in modern distros. It’s a tmpfs designed for runtime state. Great for cleanup; annoying for “I chmodded it once” habits.
  3. Debian standardized PHP-FPM socket locations. The typical path /run/php/phpX.Y-fpm.sock is consistent, but consistency exposes drift: old nginx configs still point to older paths.
  4. PHP-FPM pools control socket mode at creation time. Once created, external permission changes are ephemeral: service restart recreates the socket and your manual changes disappear.
  5. nginx and PHP-FPM often run under different users for good reasons. You might run nginx as www-data but PHP-FPM pools as per-site users for isolation; the socket needs a shared group or ACL strategy.
  6. systemd changed the “boot-time ownership” story. tmpfiles and unit sandboxing can enforce filesystem rules you didn’t know existed, especially when using overrides.
  7. AppArmor profiles have become more common and stricter. A socket can be perfectly permissioned and still be blocked by a confinement policy, especially with custom paths.
  8. 502 is an nginx umbrella error. It doesn’t mean “PHP is down.” It means “upstream failed.” Sometimes upstream is just “a socket you can’t open.”
  9. Permissions errors are deterministic but look intermittent under load. If you have multiple nginx workers, rolling restarts, or multiple pools, you can get “sometimes works” when you’re actually hitting different upstreams.

One paraphrased idea from Gene Kim (DevOps/reliability author): Improving flow means removing small constraints that silently throttle the whole system.
Socket permissions are a small constraint with a big blast radius.

Joke #1: Unix sockets are like office doors—if you’re not on the right badge list, you’ll be “bad gateway” to everyone inside.

How 502s from PHP-FPM sockets actually present

Socket permission issues rarely show up as a clean “permission denied” banner in your browser.
They show up as 502s, random spikes in latency (from retries), and a lot of wasted time staring at PHP code that never executed.

The three common shapes

  • Permission denied (errno 13): nginx can see the socket path, but can’t open it. Typically wrong owner/group/mode,
    or missing execute permissions on the directory containing the socket.
  • No such file or directory (errno 2): nginx points to a socket path that doesn’t exist (wrong version, wrong pool name),
    or PHP-FPM didn’t create it (failed to start, misconfigured listen, or directory missing).
  • Connection refused / upstream timed out: can happen with TCP backends or overloaded PHP-FPM. With sockets, it’s more often
    backlog exhaustion (nginx logs “11: Resource temporarily unavailable”) or PHP-FPM not accepting connections quickly enough.

Why it “worked yesterday”

If your socket is under /run, a reboot resets it. A service restart recreates it. A package upgrade may reload units.
If your fix was manual (chmod on the socket), it was never a fix; it was a temporary truce.

Field tasks: commands, outputs, and decisions (12+)

These are the tasks I run when someone says “nginx 502 after Debian 13 upgrade” and the error smells like sockets.
Each task includes a command, what typical output looks like, and what decision you make from it.

Task 1: Confirm the exact nginx error string

cr0x@server:~$ sudo tail -n 30 /var/log/nginx/error.log
2025/12/30 09:21:44 [crit] 1842#1842: *912 connect() to unix:/run/php/php8.3-fpm.sock failed (13: Permission denied) while connecting to upstream, client: 203.0.113.10, server: example.internal, request: "GET /index.php HTTP/1.1", upstream: "fastcgi://unix:/run/php/php8.3-fpm.sock:", host: "example.internal"

What it means: nginx reached the filesystem path, tried to connect, kernel said “no.”
Decision: stop looking at PHP code; start inspecting socket ownership and directory permissions.

Task 2: Validate what nginx thinks the upstream is

cr0x@server:~$ sudo nginx -T 2>/dev/null | grep -E 'fastcgi_pass|upstream'
        fastcgi_pass unix:/run/php/php8.3-fpm.sock;

What it means: The active config points to that socket.
Decision: if the path is wrong (old PHP version, wrong pool), fix nginx first. If it’s right, move on.

Task 3: Check that PHP-FPM is running and which unit is relevant

cr0x@server:~$ systemctl status php8.3-fpm --no-pager
● php8.3-fpm.service - The PHP 8.3 FastCGI Process Manager
     Loaded: loaded (/lib/systemd/system/php8.3-fpm.service; enabled; preset: enabled)
     Active: active (running) since Tue 2025-12-30 09:10:03 UTC; 12min ago
       Docs: man:php-fpm8.3(8)
    Process: 1024 ExecStartPost=/usr/lib/php/php-fpm-socket-helper install /run/php/php8.3-fpm.sock /etc/php/8.3/fpm/pool.d/www.conf 83 (code=exited, status=0/SUCCESS)
   Main PID: 1012 (php-fpm8.3)
      Tasks: 8 (limit: 18925)
     Memory: 44.2M
        CPU: 2.021s
     CGroup: /system.slice/php8.3-fpm.service
             ├─1012 "php-fpm: master process (/etc/php/8.3/fpm/php-fpm.conf)"
             ├─1018 "php-fpm: pool www"
             └─1019 "php-fpm: pool www"

What it means: Service is up; it even ran a post-start helper related to sockets.
Decision: the upstream failure is probably permission/path, not “php-fpm is dead.”

Task 4: Inspect the socket object itself

cr0x@server:~$ sudo ls -l /run/php/php8.3-fpm.sock
srw-rw---- 1 root root 0 Dec 30 09:10 /run/php/php8.3-fpm.sock

What it means: Socket is owned by root:root with mode 660.
nginx running as www-data can’t connect unless it’s in group root (it shouldn’t be).
Decision: set correct listen.owner/listen.group (or ACL) so the socket is accessible.

Task 5: Confirm nginx worker user

cr0x@server:~$ ps -o user,group,comm -C nginx
USER     GROUP    COMMAND
root     root     nginx
www-data www-data nginx
www-data www-data nginx
www-data www-data nginx

What it means: Workers run as www-data.
Decision: the socket must be connectable by www-data (user or group), or you need to change nginx’s user intentionally.

Task 6: Check directory permissions on /run/php

cr0x@server:~$ sudo namei -l /run/php/php8.3-fpm.sock
f: /run/php/php8.3-fpm.sock
drwxr-xr-x root root /
drwxr-xr-x root root run
drwxr-x--- root www-data php
srw-rw---- root root php8.3-fpm.sock

What it means: nginx (www-data) can traverse /run/php because group is www-data and mode includes x.
But the socket itself is root:root, so it still fails.
Decision: fix socket ownership/mode, not the directory (the directory is already reasonable here).
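
If you would rather get a yes/no answer than read namei output, a small helper can walk the parent directories and report the topmost one the current user cannot traverse. This is a minimal sketch: first_blocked_dir is a hypothetical name, and the check only means something when it runs as the nginx worker user, because root bypasses directory permission checks.

```shell
#!/usr/bin/env bash
# Print the topmost parent directory of a path that the current user cannot
# traverse (or that does not exist). Prints nothing and returns 0 if all are reachable.
# first_blocked_dir is a hypothetical helper name.
first_blocked_dir() {
    local dir blocked=""
    dir=$(dirname -- "$1")
    while :; do
        # -x on a directory means "may traverse"; it also fails if the directory is missing.
        [ -x "$dir" ] || blocked=$dir
        case "$dir" in
            /|.) break ;;
        esac
        dir=$(dirname -- "$dir")
    done
    # Walking upward overwrites earlier hits, so the last one recorded is the topmost blocker.
    [ -z "$blocked" ] || { echo "$blocked"; return 1; }
}
```

One way to use it is to save the function in a file, source it in a shell started with sudo -u www-data, and call it with the socket path from fastcgi_pass. Empty output means traversal is fine and the problem is on the socket itself.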

Task 7: Locate the PHP-FPM pool config that defines listen settings

cr0x@server:~$ sudo grep -R --line-number -E '^listen(\.| =)|^user =|^group =' /etc/php/8.3/fpm/pool.d
/etc/php/8.3/fpm/pool.d/www.conf:31:user = www-data
/etc/php/8.3/fpm/pool.d/www.conf:32:group = www-data
/etc/php/8.3/fpm/pool.d/www.conf:41:listen = /run/php/php8.3-fpm.sock
/etc/php/8.3/fpm/pool.d/www.conf:42:listen.owner = root
/etc/php/8.3/fpm/pool.d/www.conf:43:listen.group = root
/etc/php/8.3/fpm/pool.d/www.conf:44:listen.mode = 0660

What it means: The pool explicitly creates a root-owned socket. That’s the bug, not a mystery.
Decision: set listen.owner/listen.group to a user/group nginx can use, typically www-data.
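
The grep in Task 7 shows what is set. The opposite question, which directives are missing or commented out, matters just as much, because an unset listen.owner falls back to the user the master process runs as. The sketch below lists the unset ones; missing_listen_settings is a hypothetical helper, and it treats an unreadable file the same as one with nothing set.

```shell
#!/usr/bin/env bash
# List the socket-permission directives a PHP-FPM pool file leaves unset.
# Lines starting with ';' are comments in pool files and do not count.
# missing_listen_settings is a hypothetical helper name.
missing_listen_settings() {
    local pool_file=$1 key
    for key in listen.owner listen.group listen.mode; do
        awk -v key="$key" '
            {
                line = $0
                sub(/^[ \t]+/, "", line)   # ignore leading whitespace
                if (index(line, key) == 1 &&
                    substr(line, length(key) + 1) ~ /^[ \t]*=/) found = 1
            }
            END { exit !found }
        ' "$pool_file" || echo "$key"
    done
}
```

Running `missing_listen_settings /etc/php/8.3/fpm/pool.d/www.conf` and getting no output means all three directives are explicit in that file; any name it prints is a setting you are leaving to defaults.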

Task 8: Validate PHP-FPM configuration parses cleanly

cr0x@server:~$ sudo php-fpm8.3 -t
[30-Dec-2025 09:24:10] NOTICE: configuration file /etc/php/8.3/fpm/php-fpm.conf test is successful

What it means: No syntax errors.
Decision: safe to restart PHP-FPM after changes; if this fails, don’t restart in production until fixed.

Task 9: Restart PHP-FPM and verify socket ownership changed

cr0x@server:~$ sudo systemctl restart php8.3-fpm
cr0x@server:~$ sudo ls -l /run/php/php8.3-fpm.sock
srw-rw---- 1 www-data www-data 0 Dec 30 09:25 /run/php/php8.3-fpm.sock

What it means: Socket is now owned by www-data, mode 660.
Decision: nginx should be able to connect. Next, validate with a request and logs.

Task 10: Confirm nginx config is valid and reload

cr0x@server:~$ sudo nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
cr0x@server:~$ sudo systemctl reload nginx

What it means: nginx accepted the config and reloaded without dropping connections.
Decision: if you changed any upstream paths, this is mandatory; otherwise you’re testing the wrong reality.

Task 11: Reproduce and verify a successful FastCGI request

cr0x@server:~$ curl -sS -D- http://127.0.0.1/index.php -o /dev/null | head -n 8
HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=UTF-8
Connection: keep-alive

What it means: nginx reached PHP-FPM successfully.
Decision: close the incident, then make the fix durable (tmpfiles/overrides) so it survives restarts and boots.

Task 12: If it still fails, check what the nginx user can see and write

cr0x@server:~$ sudo -u www-data bash -c 'test -S /run/php/php8.3-fpm.sock && echo "socket exists"; test -w /run/php/php8.3-fpm.sock && echo "socket writable"; cat /run/php/php8.3-fpm.sock'
socket exists
socket writable
cat: /run/php/php8.3-fpm.sock: No such device or address

What it means: The user can see the socket and has write permission on it, which is what connect() needs. The cat error is normal for sockets; it’s not a file you read.
Decision: filesystem access is okay; if nginx still errors, look for AppArmor or incorrect nginx path (or multiple pools).

Task 13: Look for AppArmor denials that block access

cr0x@server:~$ sudo journalctl -k --since "30 min ago" | grep -i apparmor | tail -n 5
Dec 30 09:26:01 server kernel: audit: type=1400 audit(1767086761.123:81): apparmor="DENIED" operation="connect" profile="/usr/sbin/nginx" name="/run/php/php8.3-fpm.sock" pid=1842 comm="nginx" requested_mask="wr" denied_mask="wr" fsuid=33 ouid=33

What it means: Permissions are fine, but policy blocks nginx from connecting.
Decision: adjust the AppArmor profile (or use distro defaults paths), not filesystem permissions.
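
Audit lines are long, and the three fields that matter are easy to lose in them. The filter below is a rough sketch that reduces a DENIED line to profile, operation, and object. It assumes the fields appear in the order shown in the sample line above (operation, then profile, then name); apparmor_denial_summary is a hypothetical name.

```shell
#!/usr/bin/env bash
# Read kernel log lines on stdin and print "<profile> denied <operation> on <name>"
# for each AppArmor denial. Other lines are dropped.
# Assumes operation=, profile= and name= appear in that order, as in the sample line.
apparmor_denial_summary() {
    sed -nE 's/.*apparmor="DENIED".*operation="([^"]*)".*profile="([^"]*)".*name="([^"]*)".*/\2 denied \1 on \3/p'
}
```

Typical use would be `sudo journalctl -k --since "30 min ago" | apparmor_denial_summary`. If the socket path shows up in the output, the fix belongs in the profile, not in chmod.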

Task 14: Verify php-fpm is actually listening on the expected socket

cr0x@server:~$ sudo ss -xlpn | grep php-fpm
u_str LISTEN 0      4096   /run/php/php8.3-fpm.sock  11159            * 0 users:(("php-fpm8.3",pid=1012,fd=8))

What it means: The master process is listening on that socket.
Decision: if it’s listening somewhere else, you’re chasing the wrong path in nginx or the wrong pool.

Task 15: Confirm systemd is not recreating directories with unexpected modes

cr0x@server:~$ systemd-tmpfiles --cat-config | grep -nE '/run/php|php-fpm' | head -n 20
219: d /run/php 0755 root root -

What it means: tmpfiles defines how /run/php is created at boot.
Decision: if this doesn’t match your desired ownership/mode, add an override in /etc/tmpfiles.d/.

The tiny fix that kills 502s (and why it works)

The “tiny fix” is almost always in the PHP-FPM pool config, not nginx. You want PHP-FPM to create the socket with
an owner/group/mode that matches the nginx worker. On Debian, nginx typically runs as www-data. So do that.

Fix the pool: set listen.owner, listen.group, listen.mode

Edit the pool file (often /etc/php/8.3/fpm/pool.d/www.conf, but your pool may be named differently).
These are the lines that matter:

cr0x@server:~$ sudo sed -n '35,55p' /etc/php/8.3/fpm/pool.d/www.conf
listen = /run/php/php8.3-fpm.sock
listen.owner = www-data
listen.group = www-data
listen.mode = 0660

Why it works: PHP-FPM creates the socket. If it creates it as root:root, nginx won’t connect.
If it creates it as www-data:www-data with 0660, nginx connects cleanly and you’re done.

When not to use www-data

If you run multiple sites and want isolation, don’t throw everything into www-data. Use a dedicated group
(e.g., nginx-php), keep nginx as www-data, and add www-data to that group,
while PHP-FPM pools set listen.group = nginx-php. That gives you a controlled shared boundary.

cr0x@server:~$ sudo groupadd --system nginx-php
cr0x@server:~$ sudo usermod -aG nginx-php www-data
cr0x@server:~$ id www-data
uid=33(www-data) gid=33(www-data) groups=33(www-data),990(nginx-php)

Decision: If you need per-site separation, pick a shared group strategy. If it’s a single app box, www-data everywhere is fine.

Make the fix stick: don’t chmod the socket directly

If you do this:

cr0x@server:~$ sudo chown www-data:www-data /run/php/php8.3-fpm.sock
cr0x@server:~$ sudo chmod 660 /run/php/php8.3-fpm.sock

It may immediately stop the 502s. It will also evaporate on the next restart. That’s not an engineering fix; it’s a stress-relief ritual.

Joke #2: Chmodding a runtime socket by hand is like fixing a leaky roof with a sticky note—comforting, brief, and fundamentally disrespectful to physics.

systemd-tmpfiles: the missing piece for sockets under /run

Here’s the subtlety that bites people: even if PHP-FPM creates the socket correctly, it can fail to create it if the directory
isn’t there or isn’t traversable. And /run is ephemeral. You need the directory /run/php created at boot,
with sane ownership and mode. Debian usually does this for you, but custom hardening or cleanup jobs can break it.

Check /run/php directory state

cr0x@server:~$ sudo ls -ld /run/php
drwxr-x--- 2 root www-data 80 Dec 30 09:25 /run/php

What it means: Group is www-data and execute bit is present. nginx can traverse; good.
Decision: if this were drwx------ root root, nginx would fail even if the socket was perfect.

Create a tmpfiles rule (when defaults aren’t enough)

If you have a nonstandard group (like nginx-php) or you changed ownership conventions, create your own tmpfiles entry.
This avoids relying on package defaults that may not match your environment.

cr0x@server:~$ printf '%s\n' 'd /run/php 0750 root nginx-php -' | sudo tee /etc/tmpfiles.d/php-sockets.conf
d /run/php 0750 root nginx-php -

Apply it immediately:

cr0x@server:~$ sudo systemd-tmpfiles --create /etc/tmpfiles.d/php-sockets.conf
cr0x@server:~$ sudo ls -ld /run/php
drwxr-x--- 2 root nginx-php 80 Dec 30 09:28 /run/php

Decision: if you manage fleets, this is the difference between “works on my box” and “stays working after reboot.”

Don’t forget to restart services that cached group membership

When you add www-data to a new group, processes that are already running keep the group list they started with. New nginx workers
pick up the membership when they are spawned, so a reload is usually enough; verify with ps and /proc, or restart nginx if in doubt.

cr0x@server:~$ sudo systemctl restart nginx
cr0x@server:~$ ps -o pid,user,group,comm -C nginx
  PID USER     GROUP    COMMAND
 2101 root     root     nginx
 2102 www-data www-data nginx
 2103 www-data www-data nginx

What it means: Primary group still shows www-data (that’s fine). Supplementary groups are not shown here.
Decision: use /proc status to confirm supplementary groups if needed.

cr0x@server:~$ sudo awk '/Groups:/{print}' /proc/2102/status
Groups:	33 990

What it means: Worker has both www-data and nginx-php.
Decision: group-based socket access will now work reliably.
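
For a scripted version of the same check, the sketch below tests whether a numeric GID appears on the Groups: line of a status file. It takes the file path as an argument so it can be pointed at any worker PID; status_has_gid is a hypothetical name.

```shell
#!/usr/bin/env bash
# Succeed if the numeric GID ($2) is listed on the "Groups:" line of a
# /proc/<pid>/status file ($1). status_has_gid is a hypothetical helper name.
status_has_gid() {
    local status_file=$1 gid=$2
    awk -v gid="$gid" '
        $1 == "Groups:" { for (i = 2; i <= NF; i++) if ($i == gid) found = 1 }
        END { exit !found }
    ' "$status_file"
}
```

With the PID and GID from the output above, `status_has_gid /proc/2102/status 990 && echo ok` confirms the worker actually carries the shared group, which is the condition group-based socket access depends on.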

nginx upstream config: don’t sabotage yourself

nginx configuration errors can masquerade as socket permission errors. Or combine with them for a two-for-one outage.
Treat nginx as a precise instrument, not a text file you poke until it stops screaming.

Use the correct fastcgi_pass syntax

On Debian, you’ll commonly see:

cr0x@server:~$ sudo sed -n '/location ~ \\.php\$/,/}/p' /etc/nginx/sites-enabled/app.conf
location ~ \.php$ {
    include snippets/fastcgi-php.conf;
    fastcgi_pass unix:/run/php/php8.3-fpm.sock;
}

Decision: keep it boring. Don’t invent a custom include unless you have a specific need and test coverage.

Beware stale paths after PHP upgrades

Debian 13 upgrades often coincide with PHP version bumps. If you upgraded from a previous Debian release, you may have an nginx site
still pointing to /run/php/php8.2-fpm.sock while PHP 8.3 is installed. That produces “No such file.”
Your fix then isn’t permissions; it’s accuracy.

cr0x@server:~$ sudo ls -1 /run/php
php8.3-fpm.sock

Decision: if only one socket exists, align nginx to that socket or create a pool/socket that matches your intended path.

Multiple pools: name them and point nginx explicitly

If you have multiple applications, don’t point everything at “www” out of habit.
Create separate pool files and separate sockets. Then use an upstream block per app if that helps readability.
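
To make “separate pool files and separate sockets” concrete, here is a minimal sketch that prints a pool definition for one application. Everything specific in it is an assumption for illustration: the render_pool name, the per-app user with a same-named group, the /run/php/<pool>.sock naming, and the pm values, which are placeholders you would size for your workload.

```shell
#!/usr/bin/env bash
# Print a minimal PHP-FPM pool definition for one application.
# Arguments: pool name, pool user (a same-named group is assumed), shared socket group.
# render_pool and all values below are illustrative placeholders.
render_pool() {
    local name=$1 user=$2 sock_group=$3
    cat <<EOF
[$name]
user = $user
group = $user
listen = /run/php/$name.sock
listen.owner = $user
listen.group = $sock_group
listen.mode = 0660
pm = dynamic
pm.max_children = 5
pm.start_servers = 2
pm.min_spare_servers = 1
pm.max_spare_servers = 3
EOF
}
```

For a hypothetical app called shop, `render_pool shop shop nginx-php` yields a pool whose socket is /run/php/shop.sock, and the matching nginx server block would use fastcgi_pass unix:/run/php/shop.sock. The socket name carries the pool name rather than the PHP version, so it does not change on upgrades.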

Logging that earns its disk space

Socket permission problems are easy to spot if you have the right log detail. If you don’t, they’re an expensive guessing game.
nginx error logs are usually enough, but PHP-FPM logs can confirm pool-level behavior (especially in multi-pool setups).

Check PHP-FPM logs around start/restart

cr0x@server:~$ sudo journalctl -u php8.3-fpm --since "30 min ago" --no-pager | tail -n 30
Dec 30 09:25:03 server php-fpm8.3[1012]: NOTICE: fpm is running, pid 1012
Dec 30 09:25:03 server php-fpm8.3[1012]: NOTICE: ready to handle connections

Decision: if you see bind/listen failures here, fix PHP-FPM pool config or directory creation. If logs are clean, keep focus on nginx-to-socket access.

Turn on useful nginx error detail during incident response

Temporarily bump nginx error log level if you’re stuck. Then turn it back down. Logging is like caffeine: helpful in a pinch, bad as a lifestyle.

Three corporate mini-stories from the trenches

1) Incident caused by a wrong assumption: “root owns it, so it must be secure”

A mid-sized company migrated a PHP monolith from an old VM to Debian 13. The engineers did the right things—immutable images,
a deploy pipeline, config in Git. They also carried forward an old “hardening” snippet that set
listen.owner = root and listen.group = root “to prevent access.”

The assumption was that nginx, being “the web server,” must be privileged enough to connect anyway. In their heads,
nginx was a single process with authority. In reality, nginx has a privileged master and unprivileged workers,
and it’s the workers that need to connect to upstreams. The workers were www-data.

The cutover looked fine at first because the old environment used TCP, not sockets. New environment used Unix sockets for “performance.”
Immediately after cutover, 502s appeared—only on PHP paths. Static files were fine, which made the incident feel like a PHP regression.
A developer rolled back code. No change. Someone restarted php-fpm. No change. Someone restarted nginx. Brief relief, then failure again.

The turning point was a single line in nginx error logs: errno 13. Once they saw the socket was root:root 0660, it was over.
They changed listen.group to a shared group, restarted PHP-FPM, and traffic stabilized. The postmortem takeaway was blunt:
permission models aren’t vibes. If you don’t know which process needs access, you’re configuring based on superstition.

2) Optimization that backfired: “Unix sockets are faster, so let’s flip everything”

A larger org had a platform team standardizing web stacks. They ran a lot of nginx+PHP-FPM pairs.
Someone noticed that Unix sockets avoid TCP overhead and decided to standardize on sockets across the fleet.
The change was rolled out as a “safe optimization” behind a feature flag.

It worked in staging. Of course it did. Staging was a single host with a single pool, vanilla permissions,
and nobody had touched the tmpfiles configuration since the dawn of time.

Production was messy. Some hosts ran multiple pools with per-app users. Some had nginx in a container.
Some had custom AppArmor profiles. Some had cleanup jobs that removed runtime directories during “maintenance”
because someone once confused /run with “cache.”

The rollout didn’t cause a clean outage. It caused a slow bleed: a small percentage of requests failed with 502,
correlated with specific nodes. That’s worse. A clean outage gets attention; partial failure gets blamed on “the app.”

The fix wasn’t “go back to TCP.” It was to treat sockets as infrastructure: define directory creation explicitly,
define socket ownership via pool config, and document group strategy. The optimization stopped backfiring when it stopped being “just a toggle.”

3) Boring but correct practice that saved the day: “nginx -T and namei before you panic”

A financial services team ran a strict change process and got teased for it. But their on-call rotations were quieter.
During a Debian 13 rollout, a node began returning 502s for PHP routes. The on-call engineer didn’t restart anything at first.
They followed a short runbook: capture nginx error, print active nginx config, inspect socket and directory with namei.

In five minutes they identified the culprit: nginx was pointing to a socket path from an older pool name. The new pool file existed,
php-fpm was running, and the right socket existed—just not at the path nginx was configured for. It wasn’t permissions at all.
It was configuration mismatch.

They updated the nginx site file, ran nginx -t, reloaded nginx, and the 502s stopped. No restarts of PHP-FPM needed.
That matters in environments where restarting php-fpm can drop in-flight requests or trigger slow warmups.

The practice that saved the day was dull: always verify what’s actually running (nginx -T) and always verify
filesystem traversal (namei -l). Boring is reliable. Reliable is profitable.

Common mistakes: symptom → root cause → fix

1) Symptom: nginx error says “Permission denied (13)” on the socket

Root cause: Socket owned by the wrong user/group, or mode too restrictive.

Fix: Set in the PHP-FPM pool:
listen.owner, listen.group, listen.mode = 0660. Restart PHP-FPM, verify with ls -l.

2) Symptom: “No such file or directory (2)” for the socket path

Root cause: nginx points to the wrong socket path (PHP version bump, renamed pool), or PHP-FPM didn’t create the socket.

Fix: Verify nginx active config with nginx -T. Verify PHP-FPM pool listen =. Verify socket exists in /run/php.

3) Symptom: Works after chmod, breaks after reboot

Root cause: Manual socket permission change is lost when PHP-FPM recreates the socket or when /run resets.

Fix: Change pool config (creation-time permissions) and ensure /run/php is created via tmpfiles if needed.

4) Symptom: Permissions look correct, still “Permission denied”

Root cause: AppArmor policy blocks nginx connecting to that socket path.

Fix: Confirm with kernel audit logs. Adjust AppArmor profile or revert to standard socket paths covered by existing profiles.

5) Symptom: Random 502s during deploys, especially with reloads

Root cause: Socket path changes between versions/pools, or multiple pools get restarted out of order, leaving nginx pointing at a temporarily missing socket.

Fix: Stabilize the socket path, coordinate restarts, and avoid changing socket names as part of routine deploys.
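
One way to coordinate restarts is to have the deploy step block until the socket is back before it moves on. The loop below is a minimal sketch of that idea; wait_for_socket and the default 10-second timeout are assumptions, and it only checks that the socket exists, not that PHP-FPM is answering requests.

```shell
#!/usr/bin/env bash
# Wait until a Unix socket exists at the given path, polling once per second.
# Arguments: socket path, timeout in seconds (default 10).
# Returns 0 once the socket exists, 1 if the timeout expires first.
# wait_for_socket is a hypothetical helper name.
wait_for_socket() {
    local path=$1 timeout=${2:-10} waited=0
    until [ -S "$path" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

A deploy script could restart PHP-FPM, call `wait_for_socket /run/php/php8.3-fpm.sock 15`, and fail the step if it returns non-zero instead of putting the node back into rotation with a missing upstream.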

6) Symptom: Only some vhosts fail; others are fine

Root cause: One site points to the wrong pool/socket; other sites are correct.

Fix: Audit each vhost fastcgi_pass value; don’t assume they’re consistent. Use explicit pool sockets per site.

7) Symptom: PHP-FPM is active, but socket never appears

Root cause: Pool misconfiguration, missing directory, or permission failure on directory creation.

Fix: Check journalctl -u php8.3-fpm. Ensure /run/php exists with correct mode/owner. Validate config with php-fpm8.3 -t.

Checklists / step-by-step plan

Incident response checklist (15 minutes, single host)

  1. Grab the nginx error line for a failing request from /var/log/nginx/error.log.
  2. Confirm the upstream socket path from active config using nginx -T.
  3. Check PHP-FPM service health with systemctl status php8.3-fpm.
  4. Inspect the socket with ls -l and the path traversal with namei -l.
  5. Confirm nginx worker user with ps.
  6. Compare pool config: listen, listen.owner, listen.group, listen.mode.
  7. Validate config: php-fpm8.3 -t.
  8. Restart PHP-FPM (only after config validation), then re-check socket ownership/mode.
  9. Reload nginx (after nginx -t) and test with curl.
  10. If still blocked, check AppArmor denials via kernel audit logs.

Hardening checklist (make it durable across reboots)

  1. Decide on a socket access model: single user (www-data) or shared group (recommended for multi-user pools).
  2. Ensure /run/php directory ownership/mode matches the model.
  3. If you diverge from distro defaults, add a tmpfiles rule under /etc/tmpfiles.d/.
  4. Keep socket paths stable; don’t encode PHP minor versions into nginx configs unless you intend to update them every upgrade.
  5. Document the pool-to-vhost mapping: which socket each server block uses.
  6. Make log sampling part of your validation: grep for Permission denied after changes.

Change plan (safe rollout for a fleet)

  1. Inventory all fastcgi_pass values across nginx configs.
  2. Inventory all PHP-FPM pools and their listen settings.
  3. Pick a standard: either per-site socket with shared group, or a single shared pool.
  4. Stage the change: adjust pool configs first, then nginx, then reload/restart in a controlled order.
  5. Automate verification: check socket exists, ownership, mode, and that nginx can serve a PHP health endpoint.
  6. Roll forward with canaries. Don’t “flip everything” unless you enjoy learning about unknown unknowns at 2 a.m.
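
The “automate verification” step can start as small as the sketch below: compare a path’s owner, group, and mode against what you expect, and additionally require that it is a socket. It relies on GNU stat as shipped on Debian; check_perms and check_socket are hypothetical names, and the mode is written the way stat prints it (660, not 0660).

```shell
#!/usr/bin/env bash
# Compare a path's owner, group and octal mode with expected values.
# Arguments: path, owner, group, mode (as printed by stat, e.g. 660).
# Returns 0 on match, 1 on mismatch, 2 if the path cannot be inspected.
check_perms() {
    local path=$1 want="$2 $3 $4" have
    have=$(stat -c '%U %G %a' -- "$path") || return 2
    if [ "$have" != "$want" ]; then
        echo "MISMATCH $path: have '$have', want '$want'" >&2
        return 1
    fi
}

# Same check, but the path must also be a Unix socket.
check_socket() {
    [ -S "$1" ] || { echo "NOT A SOCKET: $1" >&2; return 1; }
    check_perms "$@"
}
```

On a host using the single-user model, `check_socket /run/php/php8.3-fpm.sock www-data www-data 660` is the whole test; pair it with a curl against a PHP health endpoint and you have a canary check that fails loudly instead of returning 502s quietly.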

FAQ

1) Why does nginx show 502 instead of a clearer permissions error?

nginx is a reverse proxy. When it can’t talk to the upstream, it returns “Bad Gateway.” The detailed reason lives in the nginx error log,
not the HTTP response.

2) Should I use TCP instead of Unix sockets to avoid this?

TCP avoids filesystem permissions and AppArmor path rules, but introduces port management and potentially broader exposure.
Unix sockets are fine—just configure ownership/mode correctly and keep the path stable.

3) Is chmod 777 on the socket ever acceptable?

No. It “works” by eliminating the access control you needed. It also trains your team to solve incidents by widening blast radius.
Use correct owner/group/mode or a dedicated shared group.

4) What’s the safest permission mode for a socket nginx connects to?

Typically 0660 with ownership php-fpm-user:shared-group, and nginx in that group.
Avoid world-writable. Avoid root:root unless nginx is also privileged (it shouldn’t be).

5) Why does changing permissions on the socket file not persist?

PHP-FPM creates the socket when the pool starts. On restart, it deletes and recreates it, applying pool config settings.
Also, /run is tmpfs and resets on reboot.

6) I set listen.owner and listen.group, but it still creates root-owned sockets. Why?

You may be editing the wrong pool file, or the service is using a different config directory, or you have multiple pools creating different sockets.
Confirm with grep across /etc/php/8.3/fpm/pool.d and verify the socket path nginx uses.

7) Can directory permissions break this even if the socket looks correct?

Yes. nginx must be able to traverse every parent directory to reach the socket. Use namei -l to see where traversal fails.

8) How does AppArmor show up differently than filesystem permissions?

Filesystem permissions typically produce nginx error logs with errno 13 and no kernel audit line.
AppArmor will often emit a kernel audit “DENIED” entry naming /usr/sbin/nginx and the socket path.

9) I run PHP-FPM pools as per-app users. What’s the cleanest socket strategy?

Give each pool its own socket path and set listen.group to a shared group that nginx belongs to.
Keep listen.mode = 0660. This keeps access controlled while avoiding “everything is www-data” sprawl.

10) How do I prevent future upgrades from breaking socket paths?

Don’t hardcode PHP minor version sockets into dozens of vhosts without automation. Use stable socket names (per pool),
or manage nginx configs via a system that updates them during the upgrade.

Conclusion: next steps you can do today

Debian 13 didn’t invent PHP-FPM socket problems. It just made them easier to encounter: new installs, upgrades,
more systemd integration, and the usual “this server is special” drift that accumulates quietly until it doesn’t.

If you take nothing else: stop chmodding runtime sockets by hand. Fix socket creation at the source—the PHP-FPM pool—and make sure
/run/php exists with predictable permissions across boots. Then verify with logs and one clean request.

Practical next steps:

  1. Run nginx -T and record the active socket path for each vhost.
  2. Inspect socket ownership/mode with ls -l and traversal with namei -l.
  3. Set listen.owner, listen.group, listen.mode in the correct pool file.
  4. If you use custom groups or custom paths, add a tmpfiles rule to make /run/php deterministic.
  5. Re-test with curl and watch nginx error logs go quiet—the best kind of quiet.