You type the right username. The right password. You even paste it from the password manager, which means it’s definitely right. And OpenVPN replies with the coldest message in networking: AUTH_FAILED.
This is the kind of failure that wastes hours because it’s labeled like a human problem (“bad password”) while often being a systems problem (policy, plugin, clock, cipher, revocation, environment). Let’s treat it like an incident, not a morality play.
What AUTH_FAILED really means (and what it doesn’t)
AUTH_FAILED means: the server decided to reject the client during the “user authentication” stage and told the client so. That’s it. It does not mean “your password is wrong.” It doesn’t even mean the server checked your password.
OpenVPN authentication can involve multiple layers, and different layers produce deceptively similar client-facing outcomes:
- TLS identity: certificates, CAs, revocation, EKU, clock validity, TLS options.
- User auth: static file, PAM, LDAP, RADIUS, OAuth-ish wrappers, MFA gateways, custom scripts.
- Policy: per-user restrictions, group membership, “no concurrent sessions,” source IP rules, route permissions.
- Session plumbing: auth-token, renegotiation, auth caching, `client-config-dir` overrides.
Here’s the operationally annoying part: depending on configuration, the client may see AUTH_FAILED even if the real issue was a TLS detail, a plugin crash, a backend timeout, or a username normalization mismatch.
A single error string, many failure modes
OpenVPN is intentionally pluggable. That’s why it survives in enterprises that still have a mainframe and three identity providers. But pluggability means the “decision” can happen in a script or plugin that’s outside OpenVPN’s core. And when the script says “no,” the client hears “AUTH_FAILED,” even if the script said “no because LDAP is down” or “no because the username contains an @ and we strip it.”
One idea from Werner Vogels (Amazon CTO), paraphrased, that operations folks repeat for a reason: everything fails, all the time, so design and operate assuming it will. AUTH_FAILED is a perfect example: your job is to find which dependency failed.
Short joke #1: Passwords are like milk—everyone swears theirs is fresh until it starts failing in production.
Fast diagnosis playbook (check first/second/third)
This is the “stop the bleeding” flow. It’s biased toward reducing mean time to clarity, not explaining theory. You can do the deeper archaeology later.
First: confirm where you are failing (TLS vs user auth)
- Set client log verbosity to 4–6. Look for: “TLS: Initial packet,” “VERIFY OK,” “Peer Connection Initiated,” then “AUTH_FAILED.”
- If TLS never finishes, don’t chase passwords. Chase certs, time, ciphers, CA chain, and server listen/port/firewall.
- If TLS completes and then AUTH_FAILED, chase user auth pipeline: plugin, script, PAM/LDAP/RADIUS/MFA, and server policy checks.
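A rough way to apply this split to a saved client log is sketched below. The function name `classify_client_log` and its three labels are made up for illustration. The strings it searches for are the ones quoted in the list above, and real logs may word things differently depending on version and verbosity.

```shell
# classify_client_log FILE
# Prints the stage at which a saved client log shows the failure:
#   tls-incomplete   no "Peer Connection Initiated" line was logged
#   user-auth        TLS finished, then the server sent AUTH_FAILED
#   no-auth-failure  TLS finished and no AUTH_FAILED was seen
classify_client_log() {
    log_file=$1
    if ! grep -q 'Peer Connection Initiated' "$log_file"; then
        echo "tls-incomplete"
    elif grep -q 'AUTH_FAILED' "$log_file"; then
        echo "user-auth"
    else
        echo "no-auth-failure"
    fi
}
```

Capture the log first, for example with `openvpn --config client.ovpn --verb 5 2>&1 | tee client.log`, then run `classify_client_log client.log`. A result of `tls-incomplete` points at certs, time, ciphers, and reachability; `user-auth` points at the server-side auth pipeline.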
Second: check the server logs at the exact timestamp
- Find the connection line for that client (common fields: common-name, username, real address).
- Search for `AUTH_FAILED`, `PLUGIN_CALL`, `AUTH-PAM`, `radiusplugin`, `auth-user-pass-verify`, `client-denied`.
- If you see a backend error (LDAP bind failure, RADIUS timeout), treat it as an identity outage, not a user typo.
Third: validate your assumptions about username, realm, and auth method
- Does the server expect `samaccountname`, `user@domain`, or `DOMAIN\user`?
- Are you using `--auth-user-pass` with a password file, and does it include trailing whitespace?
- Is MFA required? Some setups reject “password-only” with AUTH_FAILED and a useless log message.
Fourth: check policy and state
- Concurrent session rules (duplicate-cn, management interface, session limits).
- Revoked cert or username disabled but still “correct credentials.”
- Client-config-dir overrides (per-user push, iroute, or `disable`).
Fifth: reproduce in a controlled way
- Try the same credentials from a known-good client config.
- Try a known-good user from the failing client config.
- This isolates “identity” from “client config” in minutes.
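The two cross-tests form a small decision table. The sketch below only encodes that reasoning; `locate_fault` and its output labels are hypothetical names, and each argument is whatever you observed by hand (`pass` or `fail`).

```shell
# locate_fault SUSPECT_USER_ON_GOOD_CONFIG GOOD_USER_ON_SUSPECT_CONFIG
# Each argument is "pass" or "fail", taken from the two manual tests.
locate_fault() {
    case "$1/$2" in
        fail/pass) echo "identity" ;;                      # the account or its policy
        pass/fail) echo "client-config" ;;                 # the profile or the device
        fail/fail) echo "both-or-server" ;;                # retest good user on good config
        pass/pass) echo "combination-or-intermittent" ;;   # per-user plus per-device state, or flaky backend
        *) echo "usage: locate_fault pass|fail pass|fail" >&2; return 2 ;;
    esac
}
```

For example, `locate_fault fail pass` prints `identity`: the failing credentials fail everywhere, while the suspect client config works fine for someone else.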
Facts & context: why OpenVPN auth is weirdly complicated
Some small, concrete facts help explain why AUTH_FAILED is a catch-all in the real world:
- OpenVPN began in 2001, and its extension points (scripts/plugins) were designed for heterogeneous enterprise auth long before “SSO” was a default expectation.
- User/password auth is not intrinsic to TLS VPNs; OpenVPN bolted it on to satisfy organizations that wanted a second factor beyond certificates.
- PAM integration made OpenVPN feel “native” on Unix—and also inherited PAM’s complexity: account policies, expired passwords, locked accounts, and module order.
- RADIUS support became a common enterprise pattern because it centralizes auth decisions and MFA, but it also introduces timeouts that look like bad passwords.
- OpenVPN’s `client-config-dir` is powerful: you can silently override per-user settings. That power includes silently breaking one user’s access.
- Revocation lists (CRLs) are operationally hard: they’re files, they expire, they get forgotten. A revoked cert can coexist with “correct” user credentials and still block you.
- Clock correctness is security-critical: certificates and some tokens are time-bounded. Skewed time yields authentication failures that masquerade as credential issues.
- Auth caching evolved (auth-token, renegotiation) to reduce repeated password prompts, but misconfiguration can cause loops or “stale token” failures.
- Modern cipher negotiation changed defaults in OpenVPN 2.5+ (data-ciphers). Mismatches can lead to early disconnects that users interpret as “auth problems.”
Client-side triage: prove what you sent
When someone says “the credentials are correct,” what they often mean is “the credentials work somewhere else.” That’s not the same claim. The OpenVPN client might be sending a different username than the user thinks. Or it might be sending the right username with the wrong encoding. Or the password file might contain a newline, carriage return, or BOM.
Look for evidence in the client log (not feelings)
On the client, increase verbosity and look for the sequence:
- TLS handshake established.
- Server pushes options.
- Client prompts for username/password or reads them from file.
- Server responds with AUTH_FAILED or pushes an auth-token.
If you don’t see a completed TLS handshake, don’t chase user auth. If you do, the server is reachable and the “password” conversation happened (or was attempted).
Pay attention to the username format
Enterprise OpenVPN setups frequently normalize usernames. Some strip realms; some require them. Some accept `DOMAIN\user` and reject `user@domain`. Some lower-case everything. Some treat “john.smith” and “John.Smith” as different users if they’re glued to a legacy directory.
When you’re troubleshooting, decide on one canonical representation and test that. Otherwise you’ll be “right” in three different ways and still wrong in production.
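As an illustration of picking one canonical form, the sketch below reduces the common variants to a bare lower-case name. This is an assumption about what your server wants: some setups require the realm, in which case the canonical form is different. `normalize_username` is a hypothetical helper, not something OpenVPN provides.

```shell
# normalize_username NAME
# Reduces DOMAIN\user and user@domain to a bare lower-case "user".
# Assumption: the server-side directory expects the bare, lower-case form.
normalize_username() {
    name=$1
    name=${name##*\\}   # drop everything up to the last backslash (DOMAIN\)
    name=${name%%@*}    # drop everything from the first @ onward (@domain)
    printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}
```

Comparing `normalize_username 'CORP\John.Smith'` with what the server log shows for the same attempt is a quick way to spot a normalization mismatch.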
Credential files are footguns
`--auth-user-pass /path/to/file` expects a file with two lines: username and password. That file can betray you with:
- Windows CRLF line endings (`\r\n`), creating a hidden `\r` in the password.
- A trailing space copied from a ticket.
- A UTF-8 BOM at the start of the username line.
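A small check for these three problems could look like the sketch below. `check_cred_file` is a hypothetical helper, and it relies on `head -c`, which GNU and BSD systems provide but POSIX does not guarantee.

```shell
# check_cred_file FILE
# Prints one line per problem found. Prints "clean" and returns 0 if none,
# otherwise returns 1.
check_cred_file() {
    file=$1
    problems=0
    cr=$(printf '\r')               # a literal carriage return
    bom=$(printf '\357\273\277')    # the three UTF-8 BOM bytes (EF BB BF)

    if grep -q "$cr" "$file"; then
        echo "carriage return found (CRLF line endings)"
        problems=1
    fi
    if [ "$(head -c 3 "$file")" = "$bom" ]; then
        echo "UTF-8 BOM at start of file"
        problems=1
    fi
    # Trailing space or tab, optionally followed by a carriage return.
    if grep -q "[[:blank:]]${cr}*\$" "$file"; then
        echo "trailing whitespace on a line"
        problems=1
    fi
    if [ "$problems" -eq 0 ]; then
        echo "clean"
    fi
    return "$problems"
}
```

Run it as `check_cred_file auth.txt` before blaming the directory. It reports the presence of a problem, not which line carries it, so follow up with `cat -A` if you need to see the bytes.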
Short joke #2: The only thing more persistent than an expired VPN password is a Windows carriage return hiding at the end of it.
Server-side triage: prove what the server decided
Server logs are the source of truth, but only if you’re logging enough and looking in the right place. OpenVPN can log via syslog, a file, systemd journal, or management interface. Plugins and auth scripts can log elsewhere. And sometimes the only clue is that the plugin returned non-zero.
Know your auth mechanism
On the server, user auth typically comes from one of these:
- PAM (`openvpn-plugin-auth-pam.so`): uses system PAM stack, can enforce password expiry/lockouts.
- External script (`auth-user-pass-verify`): your script decides. If it crashes, you get AUTH_FAILED.
- RADIUS plugin: delegates to RADIUS server; timeouts and shared secret issues look like bad passwords.
- LDAP plugin/script: binds to LDAP; wrong base DN, filter, or TLS settings cause rejects.
- MFA gateway: Duo-like flows often modify the password field (append push code) or use RADIUS challenge/response.
Differentiate “auth rejected” vs “auth broken”
An auth pipeline can reject because the user is wrong. It can also reject because the pipeline is broken. Operationally those are different severities:
- User wrong: one user, consistent, backend healthy.
- Pipeline broken: many users, timeouts, plugin errors, log bursts, CPU spikes, resolver failures.
Be ruthless about this distinction. If five people fail at once right after a deploy, it’s not five people suddenly forgetting how to type.
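One crude signal for "rejected" versus "broken" is how many distinct users are failing in a short window. The sketch below counts them from log lines on stdin. It assumes failure lines contain `AUTH_FAILED: user '<name>'`; your server's wording may differ, so adjust the pattern. The names `failing_users` and `triage` and the threshold are illustrative choices.

```shell
# failing_users
# Reads server log lines on stdin and prints the number of distinct
# usernames seen in lines of the form:  AUTH_FAILED: user 'name'
failing_users() {
    sed -n "s/.*AUTH_FAILED: user '\([^']*\)'.*/\1/p" \
        | sort -u \
        | wc -l \
        | tr -d ' '
}

# triage THRESHOLD
# Prints "pipeline-suspect" when at least THRESHOLD distinct users failed,
# otherwise "user-level". The threshold is a judgment call, not a standard.
triage() {
    threshold=$1
    count=$(failing_users)
    if [ "$count" -ge "$threshold" ]; then
        echo "pipeline-suspect ($count users)"
    else
        echo "user-level ($count users)"
    fi
}
```

A possible invocation on a systemd host is `journalctl -u openvpn-server@corp --since "10 min ago" | triage 3`. One user failing ten times is a user problem; ten users failing once each is probably yours.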
Practical tasks: commands, outputs, decisions
Below are hands-on checks you can run. Each one includes a realistic output snippet and the decision you make from it. Use them like a runbook, not like a buffet.
Task 1: Confirm the client actually reaches the server and completes TLS
cr0x@server:~$ sudo openvpn --config client.ovpn --verb 5
...
TCP/UDP: Preserving recently used remote address: [AF_INET]203.0.113.10:1194
TLS: Initial packet from [AF_INET]203.0.113.10:1194, sid=2c0e1f9d 1a2b3c4d
VERIFY OK: depth=1, CN=corp-vpn-ca
VERIFY OK: depth=0, CN=vpn-gateway-1
Control Channel: TLSv1.3, cipher TLS_AES_256_GCM_SHA384, peer certificate: 2048 bit RSA
[corp-vpn] Peer Connection Initiated with [AF_INET]203.0.113.10:1194
AUTH: Received control message: AUTH_FAILED
SIGTERM[soft,auth-failure] received, process exiting
Meaning: TLS worked; the failure is in user auth/policy. Decision: Stop debugging certificates and move to server auth logs/plugins.
Task 2: Check server journal for auth-related lines at the failure time
cr0x@server:~$ sudo journalctl -u openvpn-server@corp -S "2025-12-27 09:10:00" -U "2025-12-27 09:15:00"
Dec 27 09:12:31 vpn1 openvpn[1842]: 198.51.100.44:51233 TLS: Username/Password authentication deferred for username 'j.smith'
Dec 27 09:12:31 vpn1 openvpn[1842]: 198.51.100.44:51233 PLUGIN_CALL: POST /usr/lib/openvpn/openvpn-plugin-auth-pam.so/PLUGIN_AUTH_USER_PASS_VERIFY status=1
Dec 27 09:12:31 vpn1 openvpn[1842]: 198.51.100.44:51233 AUTH_FAILED: user 'j.smith'
Meaning: PAM plugin rejected. Decision: Validate PAM config, account state, and whether PAM is returning “expired/locked” vs “bad password.”
Task 3: Verify which auth mechanism the server is using
cr0x@server:~$ sudo grep -E '^(plugin|auth-user-pass-verify|client-cert-not-required|verify-client-cert|management)' /etc/openvpn/server/corp.conf
plugin /usr/lib/openvpn/openvpn-plugin-auth-pam.so login
verify-client-cert require
management 127.0.0.1 7505
Meaning: PAM plugin is in play; client cert is required too. Decision: Troubleshoot PAM and also check certificate validity/revocation if symptoms don’t match user auth.
Task 4: Test PAM auth directly on the server (isolating OpenVPN)
cr0x@server:~$ sudo pamtester login j.smith authenticate
Password:
pamtester: Authentication failure
Meaning: PAM itself rejects; OpenVPN is just the messenger. Decision: Check directory connectivity, PAM module order, password expiry, account lockout, and NSS/SSSD health.
Task 5: Inspect PAM stack used by the plugin
cr0x@server:~$ sudo sed -n '1,200p' /etc/pam.d/login
auth required pam_securetty.so
auth requisite pam_nologin.so
auth include common-auth
account include common-account
session include common-session
Meaning: The plugin is using the “login” service, which can be stricter than you expect (nologin, securetty). Decision: Consider a dedicated PAM service file (e.g., /etc/pam.d/openvpn) to avoid unrelated restrictions.
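A dedicated service file can be as small as the sketch below, which writes a candidate to a path of your choosing for review. The include names `common-auth` and `common-account` are copied from the `login` stack shown above and are an assumption about your distribution. `write_pam_example` is a hypothetical helper.

```shell
# write_pam_example PATH
# Writes a candidate PAM service definition to PATH for review. It does not
# touch /etc/pam.d; installing the file is left to you.
write_pam_example() {
    cat > "$1" <<'EOF'
# Candidate /etc/pam.d/openvpn: password check and account status only.
# No pam_securetty or pam_nologin, which target interactive logins.
auth    include common-auth
account include common-account
EOF
}

# After installing the file as /etc/pam.d/openvpn, the server config would
# name that service instead of "login":
#   plugin /usr/lib/openvpn/openvpn-plugin-auth-pam.so openvpn
```

Generate it with `write_pam_example ./openvpn.pam`, diff it against your existing stacks, and test the new service with `pamtester openvpn j.smith authenticate` before pointing OpenVPN at it.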
Task 6: Check SSSD / directory health (common silent culprit)
cr0x@server:~$ sudo systemctl status sssd --no-pager
● sssd.service - System Security Services Daemon
Loaded: loaded (/lib/systemd/system/sssd.service; enabled)
Active: active (running) since Sat 2025-12-27 08:40:18 UTC; 32min ago
...
Meaning: SSSD is up, but that doesn’t guarantee it can reach LDAP. Decision: Query a user and check logs for timeouts.
Task 7: Verify user lookup and group membership (policy may block)
cr0x@server:~$ id j.smith
uid=110245(j.smith) gid=10000(domain users) groups=10000(domain users),12010(vpn-users)
Meaning: User exists and is in vpn-users. Decision: If auth still fails, it’s not “user missing”; it’s password/account status or an upstream policy/MFA flow.
Task 8: Check account status (locked/expired) when using local or LDAP-backed auth
cr0x@server:~$ sudo passwd -S j.smith
j.smith L 2025-11-02 0 99999 7 -1
Meaning: “L” indicates locked account (for local accounts; interpretation varies with directory setups). Decision: Unlock/reset through the appropriate identity system; stop telling the user to “try again.”
Task 9: Validate client credential file for hidden characters
cr0x@server:~$ sed -n '1,2p' auth.txt | cat -A
j.smith$
MyS3cretP@ssw0rd^M$
Meaning: The password line ends with ^M (CR). That’s a different password. Decision: Convert file to Unix line endings and retry.
Task 10: Fix CRLF in the credential file and retest
cr0x@server:~$ sed -i 's/\r$//' auth.txt
cr0x@server:~$ sed -n '1,2p' auth.txt | cat -A
j.smith$
MyS3cretP@ssw0rd$
Meaning: CR removed. Decision: Re-run the client; if it works now, you just saved an hour of unnecessary LDAP blaming.
Task 11: Check for certificate revocation / CRL issues
cr0x@server:~$ sudo grep -E '^(crl-verify|ca|cert|key)' /etc/openvpn/server/corp.conf
ca /etc/openvpn/pki/ca.crt
cert /etc/openvpn/pki/issued/vpn-gateway-1.crt
key /etc/openvpn/pki/private/vpn-gateway-1.key
crl-verify /etc/openvpn/pki/crl.pem
Meaning: CRL is enforced. A revoked client cert can block access even if password is correct. Decision: Check whether the connecting cert is revoked and whether the CRL file is current/readable.
Task 12: Inspect the CRL file freshness and permissions
cr0x@server:~$ sudo ls -l /etc/openvpn/pki/crl.pem
-rw-r----- 1 root openvpn 2451 Nov 1 2024 /etc/openvpn/pki/crl.pem
Meaning: Old timestamp. If your PKI rotates, you may be enforcing a stale CRL (or missing recent revocations). Decision: Regenerate/refresh CRL as part of PKI ops and ensure OpenVPN can read it.
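The file's modification time is only a hint. The CRL carries its own `nextUpdate` field, which `openssl crl -noout -nextupdate` prints. The sketch below compares that field with the current time. The helper names are hypothetical, and the date parsing assumes GNU `date`.

```shell
# crl_next_update FILE
# Prints the nextUpdate field of a PEM-encoded CRL,
# e.g. "Nov  1 12:00:00 2024 GMT".
crl_next_update() {
    openssl crl -in "$1" -noout -nextupdate | cut -d= -f2
}

# next_update_passed "DATE" [NOW_EPOCH]
# Returns 0 if DATE is at or before NOW_EPOCH (default: the current time),
# 1 if it is still in the future, 2 if DATE cannot be parsed.
# Assumption: GNU date, which understands the openssl date format via -d.
next_update_passed() {
    next_epoch=$(date -u -d "$1" +%s) || return 2
    now_epoch=${2:-$(date -u +%s)}
    [ "$next_epoch" -le "$now_epoch" ]
}
```

A possible use is `next_update_passed "$(crl_next_update /etc/openvpn/pki/crl.pem)" && echo "CRL nextUpdate has passed"`. Running that from a scheduled job turns a forgotten CRL into an alert instead of an outage.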
Task 13: Confirm OpenVPN is reading the config you think it is
cr0x@server:~$ sudo systemctl cat openvpn-server@corp
# /lib/systemd/system/openvpn-server@.service
ExecStart=/usr/sbin/openvpn --status %t/openvpn-server/status-%i.log --status-version 2 --suppress-timestamps --config /etc/openvpn/server/%i.conf
Meaning: The instance uses /etc/openvpn/server/corp.conf. Decision: Stop editing the wrong file under /etc/openvpn/ and expecting miracles.
Task 14: Check management interface for real-time auth hints (if enabled)
cr0x@server:~$ printf "status 3\nquit\n" | nc 127.0.0.1 7505
OpenVPN STATISTICS
Updated,2025-12-27 09:13:02
CLIENT_LIST,UNDEF,198.51.100.44:51233,10.8.0.0,0,0,2025-12-27 09:12:31
END
Meaning: The client never became fully established (UNDEF common name is a tell). Decision: Focus on auth stage, not routing/push options.
Task 15: Verify server time (token/MFA and cert validity depend on it)
cr0x@server:~$ timedatectl
Local time: Sat 2025-12-27 09:13:55 UTC
Universal time: Fri 2025-12-27 09:13:55 UTC
RTC time: Fri 2025-12-27 09:13:55
Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
Meaning: Time is synced. Decision: If this were unsynchronized, fix NTP before you debug “random” auth failures, especially with short-lived tokens.
Task 16: Check DNS resolution on the VPN server (LDAP/RADIUS often uses hostnames)
cr0x@server:~$ getent hosts ldap01.corp.local
10.20.30.40 ldap01.corp.local
Meaning: Name resolves. Decision: If it fails or resolves to an old IP, fix DNS/hosts/SSSD resolver; auth backends won’t be reachable reliably.
Task 17: If using RADIUS, confirm connectivity (timeout vs reject)
cr0x@server:~$ nc -vz radius01 1812
Connection to radius01 1812 port [tcp/*] succeeded!
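Meaning: A TCP connect succeeded, which only proves name resolution, routing, and firewall basics. RADIUS authentication normally runs over UDP 1812, so this says nothing about whether the RADIUS daemon answers Access-Requests. Decision: If the network path is blocked, the RADIUS plugin may present it as AUTH_FAILED. To test the protocol itself, send a real request with a RADIUS client such as FreeRADIUS `radtest`.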
Task 18: Check OpenVPN version alignment (behavior changes across versions)
cr0x@server:~$ openvpn --version
OpenVPN 2.6.8 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD]
library versions: OpenSSL 3.0.13 30 Jan 2024, LZO 2.10
Meaning: Server is 2.6.x with OpenSSL 3. Decision: If clients are ancient, check cipher/data-ciphers and TLS min version. A mismatch can look like “auth weirdness” to users.
Common mistakes (symptom → root cause → fix)
1) Symptom: “AUTH_FAILED” immediately after “Peer Connection Initiated”
Root cause: Auth plugin or script is rejecting; credentials may be fine but the backend is down or policy denies the user.
Fix: Inspect server logs around `PLUGIN_CALL` / `AUTH-PAM` / script output. If it’s a backend outage, treat it as an incident and restore dependency health.
2) Symptom: Credentials work via SSH/PAM, but OpenVPN says AUTH_FAILED
Root cause: OpenVPN PAM plugin is pointing at a different PAM service (e.g., `login`) with additional restrictions (nologin, securetty, time-of-day rules).
Fix: Use a dedicated PAM service file (e.g., `/etc/pam.d/openvpn`) and set `plugin … openvpn` accordingly.
3) Symptom: Only Windows clients fail; Linux/macOS succeed
Root cause: Credential file saved with CRLF, adding `\r` to password; or GUI client sending username in a different format.
Fix: Normalize line endings; avoid password files when possible; test with manual prompt entry; standardize username format.
4) Symptom: Only one user fails, repeatedly, across devices
Root cause: Account locked, password expired, user removed from required group, or cert revoked while password still valid.
Fix: Check identity provider status for that user; verify group membership; check CRL and certificate status.
5) Symptom: Everyone fails right after an OpenVPN upgrade
Root cause: Cipher/TLS defaults changed; plugin binary incompatible; or stricter verification reveals existing cert problems.
Fix: Validate `data-ciphers` and TLS minimums; confirm plugin path and ABI compatibility; rollback if needed, then reintroduce changes deliberately.
6) Symptom: Random users fail, then succeed on retry
Root cause: Backend timeouts (RADIUS/LDAP), DNS flaps, overloaded auth server, or rate limiting/lockout policy.
Fix: Instrument and monitor backend latency; reduce timeouts carefully; add redundancy; fix DNS. Retrying is not a strategy.
7) Symptom: User enters correct password + MFA, still AUTH_FAILED
Root cause: MFA integration expects a different format (append OTP, use challenge/response) or requires a specific RADIUS attribute.
Fix: Confirm the MFA flow with identity team; update client instructions; ensure plugin supports the MFA mode you deployed.
8) Symptom: AUTH_FAILED on reconnect/renegotiation, not on first connect
Root cause: Auth token caching/renegotiation mismatch; `auth-nocache` and token policies fighting; session state lost on server restart.
Fix: Decide whether you want tokens; align server/client settings; test renegotiation windows; avoid “half-on” token configs.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
The company had a VPN gateway that “used LDAP.” That phrase had been repeated in tickets for years, like a charm to ward off questions. A new SRE got paged for an AUTH_FAILED storm after a directory change. Users insisted their credentials were correct. The helpdesk insisted the VPN was fine. The dashboard insisted nothing, because it didn’t exist.
The SRE tailed the OpenVPN logs and saw `PLUGIN_CALL` followed by `status=1`. No detail. Great. So they assumed LDAP bind failures and went hunting: connectivity, DNS, firewall. All looked normal. They spent an hour checking the directory team’s change calendar, preparing the polite “did you break LDAP?” message.
Before sending it, they checked the OpenVPN server config and noticed the plugin was PAM, not LDAP. LDAP was only an upstream dependency via SSSD, and SSSD’s cache was stale because the server couldn’t reach the time source after a routing change. The Kerberos/LDAP layer didn’t outright fail; it became slow and inconsistent. PAM returned failures. OpenVPN reported AUTH_FAILED.
The wrong assumption was subtle: “AUTH_FAILED after directory change implies LDAP bind failure.” The real issue was “identity resolution is timing out because time sync broke,” which is the kind of thing that makes you respect NTP like it’s a production database.
They fixed routing to NTP, restarted SSSD to clear the worst of the cache mess, and the failures vanished. The directory team never knew there was a near-miss blame email sitting in drafts.
Mini-story 2: The optimization that backfired
A security-minded team wanted fewer password prompts. They enabled token-based reauth so users wouldn’t have to type credentials on reconnect. It sounded good: fewer prompts, fewer lockouts, fewer tickets. They rolled it out during a “quiet” week, which in corporate time means “right before a holiday.”
At first, success. Then laptops started sleeping and waking, moving between networks, and trying to reconnect. Some reconnects worked. Others failed with AUTH_FAILED. The helpdesk escalated it as “password problems.” Users began resetting passwords, which of course didn’t fix token failures. Now there were both VPN failures and password reset churn.
The root cause was a mismatch between token lifetimes and renegotiation behavior. Some clients presented a stale token after a long sleep; the server treated it as invalid and rejected. Logging was insufficient, so everything looked like the user typed a bad password even though no password was typed at all.
The fix was boring: align token policies with realistic device behavior, increase logging around token validation, and add a client-side fallback to prompt for credentials when token auth fails. Also: don’t do auth-behavior changes right before holidays unless you enjoy spending holidays learning new swear words.
Mini-story 3: The boring but correct practice that saved the day
An internal platform team ran two OpenVPN gateways behind a load balancer. Nothing fancy. The fancy part was their discipline: every change had a canary, and every canary had a known-good test user and a known-bad test user. They also kept a tiny suite of “identity dependency checks” that ran every minute: can we resolve LDAP hostnames, can we bind, can we reach RADIUS, is time synced.
One morning, users started reporting AUTH_FAILED—only on one gateway. The canary checks caught it in under two minutes: LDAP bind latency spiked on vpn2, but not vpn1. The load balancer was still sending traffic to vpn2 because health checks were just “port open.”
Because they had per-node logs and structured checks, they didn’t debate whether users forgot passwords. They drained vpn2 from the pool, restoring service, then debugged calmly: vpn2 had a resolver config drift that pointed at an old DNS server. LDAP name resolution was intermittently failing, leading to auth timeouts and rejections.
They fixed resolver config, added a load balancer health check that actually exercised authentication dependencies, and wrote a postmortem with exactly one sentence of blame: “We trusted a port check to represent an identity pipeline.” It was boring, and it prevented the same incident from recurring in a new costume.
Checklists / step-by-step plan
Step-by-step: when one user reports AUTH_FAILED
- Capture a timestamp (including timezone) and client public IP.
- Ask for the exact username format they entered (including domain/realm).
- Confirm if they used a credential file or interactive prompt.
- Check server logs at that timestamp. Find the connection line and the auth decision line.
- Identify auth backend: PAM vs script vs RADIUS vs LDAP plugin.
- Test backend directly: PAM test, LDAP bind, RADIUS connectivity, group membership.
- Check policy layers: group membership, concurrent session limits, per-user CCD overrides.
- If certs involved: verify client cert not expired/revoked; validate CRL freshness.
- Decide: user remediation (unlock/reset) vs infrastructure incident (backend down).
Step-by-step: when many users report AUTH_FAILED
- Assume dependency outage until disproven.
- Check one known-good user from a known-good client config. If it fails, it’s not user education.
- Check identity backends: directory availability, RADIUS server health, MFA provider status, DNS, time sync.
- Check recent changes: OpenVPN upgrade, plugin updates, PAM config edits, cert rotations, firewall changes.
- Isolate by node if you have multiple gateways: drain one at a time to see if failures are localized.
- Increase logging temporarily (with care for sensitive data) to capture plugin/script exit paths.
- Stabilize first: rollback or bypass non-critical auth features if policy allows (e.g., temporarily disable strict group checks).
- Write down the timeline. You will forget in 24 hours, and future you will deserve better.
What to avoid (opinionated)
- Don’t ask users to reset passwords as a first step. It creates noise and hides systemic failures.
- Don’t debug from the client only. AUTH_FAILED is a server-side decision.
- Don’t change three variables at once (“we updated OpenVPN, rotated certs, and modified PAM”). That’s not engineering; that’s gambling.
- Do create one known-good client config and keep it pristine. It’s your control group.
FAQ
Why does OpenVPN say AUTH_FAILED when the password is correct?
Because the server rejected the authentication attempt for reasons that include—but are not limited to—wrong password: locked account, expired password, missing group, backend timeout, plugin failure, or MFA policy.
How do I tell if it’s TLS/cert problems or username/password problems?
Look for a completed TLS handshake in the client log. If you see “Peer Connection Initiated” and then AUTH_FAILED, TLS likely succeeded and user auth/policy failed.
Can a revoked certificate cause AUTH_FAILED even with correct credentials?
Yes. If the server requires client certs and enforces a CRL, a revoked cert will block you regardless of username/password correctness.
What’s the fastest server-side log to check on systemd-based Linux?
`journalctl -u openvpn-server@<instance>` around the failure time. You’re looking for plugin calls, script invocation, and explicit AUTH_FAILED lines with usernames.
Why do only Windows users fail?
Common causes: CRLF in a credential file, different username formatting in the GUI, or users copying a password with trailing whitespace. Validate the credential file with `cat -A` and standardize input.
We use PAM. Why does SSH work but OpenVPN fails?
PAM is modular. SSH might use the `sshd` PAM service while OpenVPN uses `login` (or another). Those stacks can differ in account rules, allowed TTYs, or nologin behavior.
Can DNS issues really present as AUTH_FAILED?
Yes. If your auth backend uses hostnames (LDAP, RADIUS, MFA gateways), resolution failures or stale DNS can cause timeouts that end in a generic auth reject.
What about time sync—does it matter for VPN auth?
Absolutely. Certificates have validity windows; MFA and token systems are time-dependent; Kerberos-based directory auth is extremely sensitive to clock skew. Fix time first if it’s wrong.
Should we enable more verbose logging permanently?
Keep baseline logs sufficient to distinguish TLS failures from auth backend failures. Increase verbosity temporarily during incidents. Be careful not to log sensitive material from scripts/plugins.
Is it safe to rely on “port open” health checks for VPN gateways?
No. A listener can be healthy while the auth pipeline is broken. Add health checks that validate dependency reachability (DNS, directory bind, RADIUS responsiveness) and drain nodes when those fail.
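A dependency-aware check does not need to be elaborate. The sketch below wraps each probe in a helper and fails the run if any probe fails. `check` and `run_checks` are hypothetical names, the hostname is the one used in the tasks above, and the time probe assumes a systemd host where `timedatectl show` is available. A directory bind and a RADIUS request would be added the same way with whatever client tools you already use.

```shell
# check NAME COMMAND [ARGS...]
# Runs COMMAND quietly; prints "ok NAME" or "FAIL NAME" and records failures
# in the global variable "failed".
failed=0
check() {
    name=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok $name"
    else
        echo "FAIL $name"
        failed=1
    fi
}

# run_checks
# Example probe set; returns non-zero if any probe failed so a load balancer
# or monitoring agent can act on the exit status.
run_checks() {
    failed=0
    check "dns ldap01.corp.local" getent hosts ldap01.corp.local
    check "time synchronized" \
        sh -c '[ "$(timedatectl show -p NTPSynchronized --value)" = yes ]'
    return "$failed"
}
```

Wire `run_checks` into whatever your load balancer polls, and drain the node when it returns non-zero. The point is that the health signal exercises the same dependencies the auth pipeline does.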
Conclusion: next steps that actually reduce pager noise
If you take only a few actions from this:
- Separate TLS failures from user-auth failures using client log evidence. Stop treating AUTH_FAILED as automatically “bad password.”
- Make the server explain itself: ensure logs include plugin/script outcomes and are easy to query by timestamp.
- Test auth backends directly (PAM/SSSD/LDAP/RADIUS) so you can say “identity outage” with a straight face.
- Standardize username formats and document MFA flows in one place that won’t rot.
- Create a known-good test user and a known-good client config, and use them as your control group during incidents.
AUTH_FAILED is not a verdict. It’s a hint that the server made a decision. Your job is to find which component whispered in its ear.