Your WordPress site says “Message sent.” Your app logs “Email queued.” Customers swear they never got anything.
Marketing pings you every 30 minutes. You check spam folders. Nothing. You re-send. Still nothing.
This is the worst kind of failure: silent, reputation-damaging, and just plausible enough that everyone assumes it’s “probably fine.”
Let’s make it not fine. Let’s make it boring. Boring is good in email delivery.
What actually breaks when WordPress/app emails “don’t send”
When someone says “email isn’t sending,” they usually mean one of four different failures:
1) The app never hands mail to anything real
Classic WordPress/PHP problem: mail() returns success because it handed content to a local MTA interface,
not because the message left the building. If there’s no local MTA, or it’s misconfigured, you get the illusion of success.
This is why “it works on my laptop” is not an email strategy.
2) The message leaves your host but dies at the next hop
Common causes: port 25 blocked by the hosting provider; outbound SMTP denied by a firewall; or your MTA tries to deliver directly
and gets rate-limited, greylisted, or rejected with a policy code. The mail might queue for hours, then bounce,
or it might be accepted and then filtered later. Which leads to…
3) The recipient accepts it, then spam filters bury it
This is the “delivered but not seen” failure. It’s also the most politically awkward: your system says “sent,”
the receiving server says “accepted,” and the user says “nope.” All three can be true.
Missing SPF/DKIM/DMARC alignment, bad reputation, broken reverse DNS, and “From” domains you don’t control are top offenders.
4) You’re sending legitimate traffic that looks illegitimate
Transactional mail (password resets, receipts, contact forms) has a pattern: bursts during events, identical templates,
similar subject lines, and links. That can look spammy if you don’t authenticate or if your relay IP is “new” to the world.
Deliverability is not morality. It’s pattern matching plus reputation.
The fix is not “try another plugin.” The fix is establishing a proper SMTP relay path with authentication,
consistent identity, DNS alignment, and logs you can subpoena from your own systems.
Interesting facts and brief history (so today’s failures make sense)
- SMTP predates the web. It was standardized in the early 1980s, and it still assumes a mostly-trusting network.
- Port 25 used to be “the internet’s mail pipe.” Now it’s frequently blocked for customer servers because compromised hosts spew spam.
- SPF came from “stop forged envelope senders.” It checks who may send mail for a domain via DNS, but it only covers one layer of identity.
- DKIM was built to survive forwarding. It signs message content so recipients can verify it wasn’t altered in transit (or by a “helpful” gateway).
- DMARC is policy and alignment, not magic. It tells receivers what to do when SPF/DKIM fail and requires identifiers to line up with your visible “From.”
- Greylisting is intentionally annoying. Some servers temporarily reject first attempts to see if you retry like a real MTA. Apps that “fire and forget” lose.
- “Accepted” is not the same as “in inbox.” SMTP acceptance means “we took responsibility,” not “we displayed it prominently to a human.”
- Reverse DNS is still a vibe check. Many receivers heavily distrust IPs without rDNS or with mismatched rDNS because that’s common in botnets.
- Big providers run machine learning on mail streams. Your message is judged by domain reputation, IP reputation, engagement, and content features.
Fast diagnosis playbook (first/second/third checks)
When email “isn’t sending,” speed matters. You want to identify the bottleneck in minutes, not after a meeting.
Here’s the triage order I use in production.
First: determine whether mail is leaving the app host at all
- Send a single test email with a unique subject.
- Check local logs or queue (Postfix/Exim) on the host. If there’s no queue and no logs, the app never handed it off.
- Confirm you’re not depending on PHP mail() without an MTA behind it.
Second: determine whether the next SMTP hop is reachable and accepting
- Test outbound connectivity to your relay on port 587 (and 465 if you must). Don’t assume port 25 works.
- Attempt an authenticated SMTP session. If auth fails, you don’t have “deliverability problems,” you have “login problems.”
- Look for immediate rejects with clear SMTP codes (5xx) versus deferrals (4xx).
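The reject-versus-deferral split is mechanical enough to script. Here is a minimal sketch of that sorting step. The function name `classify_smtp_code` is made up for illustration, and it only looks at the first digit of the reply code.

```shell
# Hypothetical helper (not part of any tool): sort an SMTP reply code into a triage bucket.
classify_smtp_code() {
    case "$1" in
        2[0-9][0-9]) echo "accepted" ;;   # this hop took the message
        4[0-9][0-9]) echo "deferred" ;;   # temporary failure: a real MTA retries
        5[0-9][0-9]) echo "rejected" ;;   # permanent failure: retrying does not help
        *)           echo "unknown"  ;;   # 3xx intermediate replies, garbage, etc.
    esac
}
```

For example, `classify_smtp_code 451` prints `deferred`, which tells you to look at retries and throttling rather than credentials.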
Third: determine whether the message is being rejected later (policy, reputation, authentication)
- Inspect headers of a message that does arrive somewhere (even to a test mailbox) for SPF/DKIM/DMARC results.
- Confirm From domain alignment and that you’re not sending as a domain you don’t control.
- Check if bounces are going somewhere you never monitor (classic: VERP/bounce address points to nowhere).
The goal of this playbook is to sort the failure into: app handoff, network/auth, or deliverability policy.
Each category has a different fix. Mixing them wastes days.
The architecture that actually delivers (and why)
The reliable pattern is simple: your app talks SMTP submission to a relay, the relay handles delivery.
The relay is authenticated, logged, rate-limited, and aligned with your domain identity.
What you should do (opinionated)
- Use SMTP submission (587) with authentication from WordPress/app to a relay.
- Send from a domain you control and set DNS records correctly (SPF/DKIM/DMARC).
- Centralize mail sending so you can enforce policy and see logs in one place.
- Separate transactional and marketing streams if volumes differ; different reputations, different failure modes.
- Prefer a managed SMTP relay unless you have a reason to operate outbound reputation yourself.
What you should avoid
- Direct-to-MX delivery from random web servers. It works until it doesn’t, then you own reputation, rate limits, and blocklists.
- Using “From: yourbrand.com” but actually sending via “somethingelse.com”. That breaks alignment and trust.
- Sprinkling different SMTP plugins/configs across servers. You’ll have 12 subtly broken variants.
One joke, as a treat: Email deliverability is like dating—if you show up with a fake name and no references, nobody’s calling back.
Choosing an SMTP relay: what matters and what’s noise
You can run your own relay (Postfix), use your cloud provider’s email relay, or buy a specialized transactional email service.
The “best” answer depends on compliance, volume, and whether you enjoy debugging reputation scores like it’s a hobby.
What matters
- Authentication options: SMTP AUTH over TLS, ideally per-application credentials.
- DKIM support: either the relay signs, or you sign; but it must be consistent.
- Bounce handling: you need visibility. At minimum, a mailbox for bounces; better, event logs.
- Rate controls: both to protect your reputation and to prevent accidental floods from a broken loop.
- IP/domain reputation management: who owns it, and how quickly it recovers after an incident.
- Logging access: you need delivery outcomes with message IDs. “We sent it” is not evidence.
What’s mostly noise
- “Unlimited” sending promises without a reputation plan.
- Fancy dashboards if you can’t export or correlate message IDs.
- “One click WordPress integration” that hides the actual SMTP identity and breaks alignment.
Hands-on tasks (commands, expected output, and decisions)
These tasks assume a Linux server running Postfix as a local MTA or relay, plus a WordPress/app host that submits mail.
Even if you’re using a managed relay, many of these checks still apply (network, TLS, DNS, and message tracing).
Task 1 — Confirm Postfix is installed and running
cr0x@server:~$ systemctl status postfix
● postfix.service - Postfix Mail Transport Agent
Loaded: loaded (/lib/systemd/system/postfix.service; enabled)
Active: active (running) since Mon 2026-01-04 08:11:02 UTC; 2h 13min ago
Docs: man:postfix(1)
Main PID: 1327 (master)
Tasks: 3 (limit: 4567)
Memory: 12.3M
CGroup: /system.slice/postfix.service
├─1327 /usr/lib/postfix/sbin/master -w
├─1331 qmgr -l -t unix -u
└─1332 tlsmgr -l -t unix -u
What it means: If it’s not active (running), your “SMTP relay” is imaginary.
Decision: If inactive, start it and check configuration before touching WordPress.
Task 2 — Check whether anything is stuck in the mail queue
cr0x@server:~$ postqueue -p
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
A1B2C3D4E5 3120 Mon Jan 4 09:22:14 no-reply@example.com
user@gmail.com
(connect to gmail-smtp-in.l.google.com[142.250.102.27]:25: Connection timed out)
What it means: Postfix accepted mail but can’t reach the remote MX (timeout). That’s network or port 25 egress.
Decision: If you see timeouts, stop trying direct-to-MX and relay via submission (587) to a provider.
Task 3 — Inspect recent mail logs for rejects/deferrals
cr0x@server:~$ sudo tail -n 50 /var/log/mail.log
Jan 4 10:05:12 server postfix/smtp[24518]: A1B2C3D4E5: to=<user@gmail.com>, relay=gmail-smtp-in.l.google.com[142.250.102.27]:25, delay=67, delays=0.1/0.02/67/0, dsn=4.4.1, status=deferred (connect to gmail-smtp-in.l.google.com[142.250.102.27]:25: Connection timed out)
Jan 4 10:05:45 server postfix/smtpd[24544]: warning: unknown[203.0.113.50]: SASL LOGIN authentication failed: authentication failure
Jan 4 10:05:45 server postfix/smtpd[24544]: disconnect from unknown[203.0.113.50] ehlo=1 auth=0/1 rset=1 quit=1 commands=3/4
What it means: You have two distinct problems: outbound delivery timeouts (direct-to-MX) and inbound auth failures (clients misconfigured or malicious).
Decision: Lock down submission, and fix your relay path so you’re not delivering on port 25 from random web nodes.
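If fifty lines of log are too many to eyeball, a rough tally of delivery outcomes gets you the shape of the problem faster. This is a sketch, assuming Postfix-style `status=` fields like the ones shown above; `summarize_status` is a hypothetical name.

```shell
# Hypothetical helper: count Postfix delivery outcomes from log lines on stdin.
# Prints one "status=<value> <count>" line per outcome, sorted by status name.
summarize_status() {
    grep -o 'status=[a-z]*' | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Feed it a slice of the log, for example `sudo tail -n 5000 /var/log/mail.log | summarize_status`. A pile of `status=deferred` next to almost no `status=sent` points at the network or the next hop, not at WordPress.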
Task 4 — Verify outbound connectivity to your intended relay (port 587)
cr0x@server:~$ nc -vz smtp.relay.local 587
Connection to smtp.relay.local 587 port [tcp/submission] succeeded!
What it means: Network path exists. If this fails, your firewall/security group/DNS is the culprit.
Decision: Fix routing/firewall before changing app settings.
Task 5 — Test TLS handshake and certificate details
cr0x@server:~$ openssl s_client -starttls smtp -connect smtp.relay.local:587 -servername smtp.relay.local -brief
CONNECTION ESTABLISHED
Protocol version: TLSv1.3
Ciphersuite: TLS_AES_256_GCM_SHA384
Peer certificate: CN = smtp.relay.local
Verification: OK
What it means: TLS is working and cert validates. If verification fails, many clients will refuse to send.
Decision: Fix certificates (or trust store) rather than “disabling verification” in the app.
Task 6 — Try an authenticated SMTP session (manual, fast signal)
cr0x@server:~$ swaks --to user@example.net --from no-reply@example.com --server smtp.relay.local --port 587 --auth LOGIN --auth-user app-smtp --auth-password 'REDACTED' --tls
=== Trying smtp.relay.local:587...
=== Connected to smtp.relay.local.
=== TLS started with cipher TLS_AES_256_GCM_SHA384
<= 220 smtp.relay.local ESMTP Postfix
=> EHLO server
<= 250-smtp.relay.local
<= 250-AUTH PLAIN LOGIN
<= 250 STARTTLS
=> AUTH LOGIN
<= 235 2.7.0 Authentication successful
=> MAIL FROM:<no-reply@example.com>
<= 250 2.1.0 Ok
=> RCPT TO:<user@example.net>
<= 250 2.1.5 Ok
=> DATA
<= 354 End data with <CR><LF>.<CR><LF>
=> .
<= 250 2.0.0 Ok: queued as 9F8E7D6C5B
=== Message sent.
What it means: You can submit mail successfully. The queued ID is traceable in relay logs.
Decision: If this works but WordPress doesn’t, your app plugin/config is wrong, not the network.
Task 7 — Trace a message ID through logs (stop guessing)
cr0x@server:~$ sudo grep '9F8E7D6C5B' /var/log/mail.log
Jan 4 11:14:22 server postfix/cleanup[27102]: 9F8E7D6C5B: message-id=<20260104111422.9F8E7D6C5B@smtp.relay.local>
Jan 4 11:14:22 server postfix/qmgr[1331]: 9F8E7D6C5B: from=<no-reply@example.com>, size=1023, nrcpt=1 (queue active)
Jan 4 11:14:23 server postfix/smtp[27105]: 9F8E7D6C5B: to=<user@example.net>, relay=mx.example.net[198.51.100.20]:25, delay=1.1, delays=0.1/0.02/0.4/0.6, dsn=2.0.0, status=sent (250 2.0.0 OK)
What it means: End-to-end delivery to the recipient MX succeeded.
Decision: If users still don’t see it, you’re in “spam filtering or mailbox rules” territory—move to header analysis and authentication checks.
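When you trace IDs all day, it helps to boil the grep output down to the three fields that matter. A minimal sketch, assuming the Postfix log layout shown above; `trace_queue_id` is a made-up name, not a Postfix tool.

```shell
# Hypothetical helper: for one queue ID, print "recipient dsn status" per delivery attempt.
# $1 = queue ID; log lines are read from stdin.
trace_queue_id() {
    grep -F "$1:" |
        sed -n 's/.* to=<\([^>]*\)>.* dsn=\([0-9.]*\), status=\([a-z]*\).*/\1 \2 \3/p'
}
```

Usage would look like `sudo cat /var/log/mail.log | trace_queue_id 9F8E7D6C5B`. Lines without a `to=` field (cleanup, qmgr) are skipped on purpose, so only delivery attempts show up.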
Task 8 — Confirm the server’s hostname and FQDN are sane
cr0x@server:~$ hostnamectl
Static hostname: smtp.relay.local
Icon name: computer-vm
Chassis: vm
Machine ID: 2b3f4c5d6e7f8a9b
Boot ID: 0a1b2c3d4e5f6a7b
Operating System: Ubuntu 22.04.4 LTS
Kernel: Linux 5.15.0-91-generic
Architecture: x86-64
What it means: A consistent hostname helps avoid weird HELO/EHLO identity mismatches.
Decision: If your hostname is localhost or changes, fix it and update Postfix myhostname.
Task 9 — Check DNS for SPF (and interpret the result)
cr0x@server:~$ dig +short TXT example.com
"v=spf1 include:_spf.relayvendor.net -all"
What it means: SPF is present and strict (-all). Good—if correct.
Decision: If you’re not actually sending through _spf.relayvendor.net, fix SPF or your mail will fail SPF checks.
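To compare what SPF authorizes with what you actually send through, it helps to list the include targets on their own. A small sketch, assuming the quoted output format of `dig +short TXT`; `spf_includes` is a hypothetical name, and it ignores qualifiers such as `+include:`.

```shell
# Hypothetical helper: read `dig +short TXT <domain>` output on stdin and
# print each include: target of the SPF record, one per line.
spf_includes() {
    grep '^"v=spf1[ "]' | tr -d '"' | tr ' ' '\n' | sed -n 's/^include://p'
}
```

Run it as `dig +short TXT example.com | spf_includes` and check that every relay you really use appears in the list, and that nothing you retired is still there.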
Task 10 — Check for DKIM key publication (selector)
cr0x@server:~$ dig +short TXT s1._domainkey.example.com
"v=DKIM1; k=rsa; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8A..."
What it means: DKIM public key exists for selector s1.
Decision: If missing, DKIM signing may still happen but won’t validate; publish the correct key or configure the relay to sign.
Task 11 — Check DMARC policy and reporting addresses
cr0x@server:~$ dig +short TXT _dmarc.example.com
"v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com; ruf=mailto:dmarc-forensics@example.com; adkim=s; aspf=s"
What it means: DMARC is enforcing quarantine and strict alignment.
Decision: If you’re still onboarding a relay, consider p=none temporarily while you validate alignment—then enforce.
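If you check DMARC across several domains, pulling out single tags beats squinting at the whole record. A minimal sketch, assuming one record per line in the quoted `dig +short` format; `dmarc_tag` is a made-up helper name.

```shell
# Hypothetical helper: print the value of one tag (p, sp, adkim, aspf, rua, ...)
# from a DMARC TXT record read on stdin. Prints nothing if the tag is absent.
# $1 = tag name
dmarc_tag() {
    tr -d '"' | tr ';' '\n' |
        sed -n 's/^ *'"$1"'=\(.*\)$/\1/p' |
        sed 's/ *$//'
}
```

For the record above, `dig +short TXT _dmarc.example.com | dmarc_tag p` prints `quarantine`, and `dmarc_tag adkim` prints `s`.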
Task 12 — Verify reverse DNS for your outbound IP
cr0x@server:~$ dig +short -x 198.51.100.10
smtp.example.com.
What it means: rDNS exists. If it points to something generic or missing, some receivers will distrust you.
Decision: If you operate your own IPs, set rDNS to a stable hostname whose forward (A/AAAA) record resolves back to the same IP.
Task 13 — Check whether port 25 egress is blocked (direct-to-MX reality check)
cr0x@server:~$ nc -vz gmail-smtp-in.l.google.com 25
nc: connect to gmail-smtp-in.l.google.com (142.250.102.27) port 25 (tcp) failed: Connection timed out
What it means: You can’t deliver directly to Gmail on port 25 from this host (blocked or filtered).
Decision: Use a submission relay on 587. Don’t fight your provider’s egress policy with “clever” hacks.
Task 14 — Confirm WordPress host isn’t trying to run “sendmail” without an MTA
cr0x@server:~$ command -v sendmail || echo "sendmail not found"
sendmail not found
What it means: PHP mail() will likely fail or silently no-op depending on configuration.
Decision: Configure WordPress to use SMTP submission explicitly; don’t rely on a missing local binary.
Task 15 — Watch live logs while sending a test (real-time truth)
cr0x@server:~$ sudo tail -f /var/log/mail.log
Jan 4 11:29:17 server postfix/smtpd[28110]: connect from web01[203.0.113.11]
Jan 4 11:29:18 server postfix/smtpd[28110]: 1A2B3C4D5E: client=web01[203.0.113.11], sasl_method=LOGIN, sasl_username=app-smtp
Jan 4 11:29:18 server postfix/cleanup[28113]: 1A2B3C4D5E: message-id=<6f9d2a3c9e1b@web01>
Jan 4 11:29:18 server postfix/qmgr[1331]: 1A2B3C4D5E: from=<no-reply@example.com>, size=1845, nrcpt=1 (queue active)
Jan 4 11:29:19 server postfix/smtp[28115]: 1A2B3C4D5E: to=<user@example.net>, relay=mx.example.net[198.51.100.20]:25, delay=1.2, dsn=2.0.0, status=sent (250 2.0.0 OK)
What it means: You’re seeing the client connect, authenticate, enqueue, and deliver. That’s the whole story.
Decision: If the connection never appears, WordPress isn’t using your relay. Fix plugin settings and credentials.
Task 16 — Validate Postfix configuration for obvious footguns
cr0x@server:~$ sudo postconf -n
myhostname = smtp.relay.local
mydestination = $myhostname, localhost.localdomain, localhost
inet_interfaces = all
smtpd_tls_security_level = may
smtp_tls_security_level = may
smtpd_sasl_auth_enable = yes
smtpd_recipient_restrictions = permit_sasl_authenticated,reject_unauth_destination
What it means: Key lines: SASL auth enabled, and reject_unauth_destination prevents open relay.
Decision: If you don’t see reject_unauth_destination, stop and fix it before exposing this server to the network.
Second joke (and last): An open relay is like leaving your car unlocked with the keys inside—except the tickets show up at your boss’s house.
A production-grade Postfix relay setup (with guardrails)
There are two common “relay” roles people confuse:
(1) a submission server that accepts authenticated mail from your apps,
and (2) a smart host that relays outbound mail through an upstream provider.
You can run both roles on one Postfix instance, but be explicit.
Pattern A: Postfix submission server that relays to a managed provider
This is what I recommend for most teams: apps submit to your Postfix on 587 with SMTP AUTH, Postfix forwards to a provider on 587.
You get centralized logs and policy, and you don’t own IP reputation.
Install packages (Debian/Ubuntu example)
cr0x@server:~$ sudo apt-get update
Hit:1 http://archive.ubuntu.com/ubuntu jammy InRelease
Reading package lists... Done
cr0x@server:~$ sudo apt-get install -y postfix libsasl2-modules mailutils
Reading package lists... Done
Building dependency tree... Done
Setting up postfix (3.6.4-1ubuntu1.3) ...
Decision: If you’re in an environment that already has Exim or another MTA, don’t stack MTAs. Pick one.
Configure Postfix for authenticated submission and safe relaying
Edit /etc/postfix/main.cf (showing the meat, not every default). Values are examples; adjust hostnames/domains.
cr0x@server:~$ sudo postconf -e "myhostname = smtp.example.com"
cr0x@server:~$ sudo postconf -e "myorigin = example.com"
cr0x@server:~$ sudo postconf -e "mydestination = \$myhostname, localhost.localdomain, localhost"
cr0x@server:~$ sudo postconf -e "inet_interfaces = all"
cr0x@server:~$ sudo postconf -e "mynetworks = 127.0.0.0/8 [::1]/128"
cr0x@server:~$ sudo postconf -e "smtpd_recipient_restrictions = permit_sasl_authenticated, reject_unauth_destination"
cr0x@server:~$ sudo postconf -e "smtpd_sasl_auth_enable = yes"
cr0x@server:~$ sudo postconf -e "smtpd_tls_security_level = may"
cr0x@server:~$ sudo postconf -e "smtpd_tls_auth_only = yes"
cr0x@server:~$ sudo postconf -e "smtp_tls_security_level = may"
What this does:
The server is not an open relay; only authenticated users can relay.
TLS is used for auth (so you’re not handing passwords over plaintext).
Enable the submission service on port 587
In /etc/postfix/master.cf, ensure submission is enabled and requires encryption.
You can do it with a safe append + manual review. After editing, verify with postfix check.
cr0x@server:~$ sudo grep -n '^submission' -A6 /etc/postfix/master.cf
submission inet n - y - - smtpd
-o syslog_name=postfix/submission
-o smtpd_tls_security_level=encrypt
-o smtpd_sasl_auth_enable=yes
-o smtpd_recipient_restrictions=permit_sasl_authenticated,reject
-o milter_macro_daemon_name=ORIGINATING
Decision: If you can’t enforce TLS on submission, don’t run submission on an untrusted network segment.
Configure Postfix to relay outbound via an upstream smart host
If your provider gives you smtp.provider.tld:587 with credentials, set relayhost and SASL password map.
cr0x@server:~$ sudo postconf -e "relayhost = [smtp.provider.tld]:587"
cr0x@server:~$ sudo postconf -e "smtp_sasl_auth_enable = yes"
cr0x@server:~$ sudo postconf -e "smtp_sasl_security_options = noanonymous"
cr0x@server:~$ sudo postconf -e "smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd"
cr0x@server:~$ sudo postconf -e "smtp_tls_security_level = encrypt"
cr0x@server:~$ sudo bash -c 'cat > /etc/postfix/sasl_passwd <<EOF
[smtp.provider.tld]:587 relay-user:relay-password
EOF'
cr0x@server:~$ sudo postmap /etc/postfix/sasl_passwd
cr0x@server:~$ sudo chmod 600 /etc/postfix/sasl_passwd /etc/postfix/sasl_passwd.db
Decision: If you’re checking /etc/postfix/sasl_passwd into Git, stop. Put credentials in a secret store and template them at deploy time.
Restart and validate configuration
cr0x@server:~$ sudo postfix check
cr0x@server:~$ sudo systemctl restart postfix
cr0x@server:~$ sudo systemctl is-active postfix
active
What it means: Postfix accepted the config and is running.
Decision: If postfix check complains, fix that first; don’t “restart until it works.”
Hardening basics you should not skip
- Firewall: expose 587 only to app networks/VPN, and 25 only if you truly need it (many don’t).
- Rate limiting: throttle per client to avoid a compromised WordPress sending 200k “invoices.”
- Auth separation: different SMTP credentials per environment (prod/stage/dev) and per app if possible.
- Logging retention: keep enough to investigate disputes; but don’t log message bodies unless policy allows.
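For the rate-limiting item, Postfix has per-client limits built in. The sketch below uses real Postfix parameters, but the numbers are placeholders I picked for illustration, and `apply_rate_limits` is a hypothetical wrapper name. Treat these limits as an abuse ceiling set well above normal volume, not as a traffic shaper.

```shell
# Sketch only: the numbers are placeholders, not recommendations.
# Wrapped in a function so nothing changes until you call apply_rate_limits yourself.
apply_rate_limits() {
    # Length of the window the limits below are measured over.
    sudo postconf -e "anvil_rate_time_unit = 60s"
    # Maximum message delivery requests per client IP per window.
    sudo postconf -e "smtpd_client_message_rate_limit = 100"
    # Maximum recipient addresses per client IP per window.
    sudo postconf -e "smtpd_client_recipient_rate_limit = 200"
    # Validate first, then reload so the running daemons pick up the change.
    sudo postfix check && sudo postfix reload
}
```

Two caveats: the limits are counted per client IP address, not per SASL user, and clients listed in `smtpd_client_event_limit_exceptions` (by default, whatever is in `mynetworks`) are exempt.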
One engineering quote to keep you honest: “Hope is not a strategy.” — paraphrased idea attributed to operations culture (often repeated in SRE teams).
DNS authentication that affects deliverability (SPF, DKIM, DMARC, rDNS)
Mail delivery is half SMTP and half identity. You need both.
Authentication doesn’t guarantee inbox placement, but lack of it guarantees pain.
SPF: who is allowed to send
SPF checks the envelope sender (Return-Path / MAIL FROM). Receivers query DNS for a TXT record on the domain.
If you send through a relay provider, your SPF must include them.
What to do: publish a single SPF record, keep it within SPF’s limit of 10 DNS-querying terms per evaluation (nested includes count toward it),
and use -all when you’re confident.
What to avoid: multiple SPF records (receivers treat it as permerror), and “just put +all” (which defeats the purpose).
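SPF evaluation allows at most 10 DNS-querying terms, and it is easy to drift past that as vendors pile up. Here is a rough counter as a sketch; `spf_lookup_terms` is a hypothetical name, and it only counts the top-level record, so lookups hidden inside included records still need to be added by hand.

```shell
# Hypothetical helper: count DNS-querying terms in one SPF record read on stdin
# (include, a, mx, ptr, exists, redirect). ip4/ip6/all cost no lookups.
# Top level only: terms inside included records also count toward the limit of 10.
spf_lookup_terms() {
    tr -d '"' | tr ' ' '\n' |
        grep -c -E '^[+?~-]?(include:|exists:|ptr(:|$)|a([:/]|$)|mx([:/]|$))|^redirect=' || true
}
```

Usage: `dig +short TXT example.com | grep v=spf1 | spf_lookup_terms`. If the top-level number alone is creeping toward 10, you are almost certainly over the limit once nested includes are expanded.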
DKIM: content signature
DKIM signs headers and body; receivers verify using the public key in DNS.
It’s sensitive to message modification—some “security” gateways and footers can break DKIM.
What to do: have exactly one component sign (either your relay or your provider), rotate keys, and keep selectors stable per sending stream.
DMARC: policy + alignment
DMARC adds two crucial ideas:
(1) tell receivers what to do on authentication failure, and
(2) require alignment between the domain in the visible From header and the authenticated SPF/DKIM domains.
Recommended rollout: start with p=none to observe, then move to quarantine, then reject.
If your org has multiple senders (helpdesk, CRM, marketing), do the inventory first or you’ll break legitimate mail.
Reverse DNS (rDNS): the IP’s identity
If you operate the outbound IP, set rDNS. If you use a provider, they usually handle it.
Receivers use rDNS as a credibility hint. Not strictly required everywhere, but missing rDNS is like showing up to a job interview with no ID.
Alignment reality check: “From” must be yours
The single most common self-inflicted wound: WordPress sends “From: wordpress@server-hostname” or “From: info@gmail.com”
while the envelope sender is something else entirely. DMARC sees misalignment and receivers treat you as suspicious.
Fix the From domain. Own it.
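Alignment is easier to reason about once you write the comparison down. The sketch below is deliberately naive: `org_domain` and `dmarc_aligned` are hypothetical helpers, and the "last two labels" rule is an assumption that breaks for suffixes like co.uk, where real receivers consult the Public Suffix List.

```shell
# Naive organizational domain: the last two labels of a hostname.
# Assumption for illustration only; wrong for multi-label suffixes such as co.uk.
org_domain() {
    echo "$1" | awk -F. '{ if (NF >= 2) print $(NF-1) "." $NF; else print $0 }'
}

# dmarc_aligned MODE FROM_DOMAIN AUTH_DOMAIN
# MODE "s" = strict (exact match), anything else = relaxed (same organizational domain).
# AUTH_DOMAIN is the DKIM d= domain or the SPF envelope-sender domain.
# Returns 0 when aligned, non-zero otherwise.
dmarc_aligned() {
    da_from=$(echo "$2" | tr '[:upper:]' '[:lower:]')
    da_auth=$(echo "$3" | tr '[:upper:]' '[:lower:]')
    if [ "$1" = "s" ]; then
        [ "$da_from" = "$da_auth" ]
    else
        [ "$(org_domain "$da_from")" = "$(org_domain "$da_auth")" ]
    fi
}
```

With this, `dmarc_aligned r example.com mail.example.com` succeeds, while `dmarc_aligned s example.com mail.example.com` fails. That second case is exactly what bites you when a record says adkim=s and the relay signs with a subdomain.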
WordPress/app configuration patterns that don’t sabotage you
WordPress itself is not the villain; inconsistent configuration is.
Your job is to make WordPress behave like a well-mannered client: authenticated SMTP submission, stable identity, sane headers.
Pick one sending path and enforce it
If you run Postfix submission internally, everything should submit to it. If you use a managed relay directly from WordPress,
do that consistently. What you don’t want is:
- some emails going through PHP mail(),
- some going through SMTP plugin A,
- and password resets going through plugin B because it “fixed” one thing last year.
Use a dedicated sender address for transactional mail
Use something like no-reply@yourdomain or notifications@yourdomain.
Keep it consistent. Receivers learn patterns. Humans do too.
Set Reply-To intentionally
If users should respond, set Reply-To to a monitored address. If they shouldn’t, still consider a monitored mailbox for bounce handling.
Silent failures often hide in bounces you never read.
Don’t send as a domain you don’t control
Sending “From: yourbrand@gmail.com” from your server is deliverability self-harm. It also makes auditing a mess.
Use your domain, authenticate it, and route replies where you want them.
Apps: prefer SMTP libraries with explicit error handling
Whether it’s PHP, Node, Python, or Java: configure your SMTP client to fail loudly and log SMTP responses.
If your app eats exceptions, you will learn about email failures from angry customers. That’s a rough monitoring system.
Monitoring, alerting, and evidence: stop guessing
Reliable email isn’t “set and forget.” It’s set, observe, and enforce.
You need at least three layers: service health, queue health, and deliverability signals.
Service health
- Alert if Postfix is down.
- Alert if disk is near full (mail queues and logs need space; full disk produces creative failures).
- Alert on TLS certificate expiry if you terminate TLS yourself.
Queue health
A growing mail queue is not “normal background noise.” It’s a visible symptom of upstream rejection, network trouble,
credential failure, or throttling.
cr0x@server:~$ postqueue -p | tail -n 5
-- 2 Kbytes in 1 Request.
Decision: If it’s consistently non-zero and increasing, alert and investigate SMTP response codes in logs.
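A queue check like this is easy to wire into cron or a monitoring agent. The sketch below assumes the summary line format shown above (and the "Mail queue is empty" line Postfix prints for an empty queue). `queue_size` and `queue_alert` are hypothetical names, and the threshold is yours to pick.

```shell
# Hypothetical helper: read `postqueue -p` output on stdin, print the number of queued messages.
queue_size() {
    awk '/^-- .* Requests?\.$/ { n = $(NF-1) }
         /^Mail queue is empty/ { n = 0 }
         END { print n + 0 }'
}

# queue_alert THRESHOLD: reads `postqueue -p` output on stdin.
# Prints a status line and returns 1 when the queue is larger than THRESHOLD.
queue_alert() {
    qa_n=$(queue_size)
    if [ "$qa_n" -gt "$1" ]; then
        echo "ALERT: $qa_n messages queued (threshold $1)"
        return 1
    fi
    echo "OK: $qa_n messages queued"
}
```

Usage: `postqueue -p | queue_alert 50`. A single reading only tells you the queue is big right now; to catch "consistently growing," run it on a schedule and alert only after several consecutive failures.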
Deliverability signals
At minimum: keep samples of delivered headers and verify SPF/DKIM/DMARC results are consistently passing.
Better: collect bounce codes and categorize (auth, policy, mailbox full, spam rejection).
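Categorizing bounces can start very small. Enhanced status codes (the dsn= values in Postfix logs) encode a subject in their second number, so a coarse bucket per code is a reasonable first pass. This is a sketch with made-up bucket names; `classify_dsn` is hypothetical.

```shell
# Hypothetical helper: map an enhanced status code (for example 5.1.1) to a coarse category.
# The second number is the subject: 1 = addressing, 2 = mailbox, 4 = network/routing,
# 7 = security or policy.
classify_dsn() {
    case "$1" in
        2.*)      echo "delivered" ;;
        [45].1.*) echo "address"   ;;   # bad or unknown recipient address
        [45].2.*) echo "mailbox"   ;;   # mailbox full, disabled, over quota
        [45].4.*) echo "network"   ;;   # routing or connection trouble
        [45].7.*) echo "policy"    ;;   # auth, SPF/DKIM/DMARC, spam or policy blocks
        [45].*)   echo "other"     ;;
        *)        echo "unknown"   ;;
    esac
}
```

For example, the `dsn=4.4.1` timeout from Task 3 lands in `network`, while a `5.7.1` rejection lands in `policy` and sends you back to the authentication section.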
Correlate on message IDs
Your relay should log queue IDs and upstream IDs. Your app should log a correlation ID for “email intent.”
If you can’t trace “user clicked reset password” → “message accepted by relay” → “delivered or bounced,”
you don’t have an email system. You have vibes.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A mid-size SaaS company ran WordPress for marketing pages and a custom app for the product.
Password resets were “sent” (according to the app logs). Contact forms were “sent” (according to WordPress).
Support tickets spiked: users couldn’t log in.
The wrong assumption was subtle: everyone believed that a successful API call to the application’s mailer meant a successful email delivery.
In reality, the app was configured to use the local sendmail interface, which just handed messages to a local queue. That queue existed—barely.
The VM image had an MTA installed months ago, but nobody owned it. No one monitored it. No one read mail logs.
During a routine OS patch cycle, the host rebooted and came back with a changed hostname and a missing resolver config (cloud-init “helped”).
Postfix started, but couldn’t resolve remote MX records. Messages queued and queued. The app kept returning success.
Meanwhile, a few messages trickled out after DNS recovered, arriving hours late and confusing everyone further.
The fix wasn’t heroic. They moved to SMTP submission to a single relay, logged queue IDs back into the app,
and set an alert that fired when the queue stayed above a small threshold for more than a few minutes.
The next time DNS hiccuped, they saw it immediately. No guessing. No customer archaeology.
Mini-story 2: The optimization that backfired
Another company decided their managed transactional email provider was “too expensive.”
They had a shiny Kubernetes cluster, a smart SRE team, and just enough confidence to be dangerous.
The plan: send mail directly from app pods using a lightweight SMTP library to recipients’ MX servers.
On paper, it looked efficient: fewer dependencies, no provider lock-in, and “SMTP is just a protocol.”
In reality, they rediscovered why outbound email is a specialized service.
Their cloud provider blocked port 25 from nodes. They requested an exception and got it—sort of.
Some egress paths worked; others were silently throttled. Delivery became non-deterministic across pods.
Then reputation entered the room. Their outbound IPs changed as nodes rotated.
DKIM signing was inconsistent because different deployments had slightly different secrets.
Some messages passed, some failed. DMARC alignment was unstable. Receivers began rate-limiting and junking.
The “optimization” didn’t just cost time; it trained providers to distrust their domain.
They backed out. The boring fix: a centralized relay with stable identity, consistent DKIM, and a provider that actually manages reputation.
The SRE team didn’t lose face; they gained an operational lesson: the cheapest email is the one you don’t have to explain.
Mini-story 3: The boring but correct practice that saved the day
A large enterprise had a habit I genuinely respect: every transactional email had a trace ID in a header,
and every SMTP hop logged it. Not “sometimes.” Every time.
It was enforced in CI: if your mailer didn’t add the header, the build failed.
One morning, finance reported missing invoice emails. Sales escalated. The usual storm began:
“Is the app down?” “Is it the network?” “Did someone change DNS?” People started composing theories like it was a group project.
Instead, the on-call pulled a trace ID from the app logs, grepped it in the relay logs, and found a clean SMTP 250 acceptance from the relay,
followed by repeated 4xx deferrals from the upstream provider: temporary policy throttling.
The queue was growing, but within configured bounds. Retries were happening. Nothing was “broken,” it was slow.
Because they had evidence, they did the only sensible thing: notified stakeholders of delay, reduced burst rate temporarily,
and let the queue drain. No panic changes. No config flailing.
Boring practice, boring day. That’s the dream.
Common mistakes: symptom → root cause → fix
“WordPress says sent, but nothing arrives” → PHP mail() hands off locally → use SMTP submission
Symptom: WordPress plugin reports success; no bounces; no mail logs.
Root cause: PHP mail() doesn’t guarantee delivery; local MTA missing or broken.
Fix: Configure WordPress to use SMTP (587) to your relay; verify with live relay logs and a test like swaks.
“Mail is queued for hours” → direct-to-MX blocked or deferred → relay via provider on 587
Symptom: postqueue -p shows many deferred messages; logs show timeouts to port 25.
Root cause: outbound port 25 blocked; greylisting without retries; network filtering.
Fix: route outbound via an upstream smart host on 587 with auth; keep retry logic in Postfix, not in your app.
“Gmail accepts but messages go to spam” → authentication/alignment weak → fix SPF/DKIM/DMARC
Symptom: messages arrive in spam; headers show SPF softfail/neutral or DKIM fail; DMARC fail.
Root cause: sending domain not aligned; missing DKIM; SPF points to wrong senders.
Fix: align From domain with authenticated domains; publish SPF include(s), enable DKIM signing, set DMARC policy after validation.
“Some emails send, others don’t” → multiple sending paths → enforce one route
Symptom: password reset works; contact form doesn’t; or vice versa.
Root cause: different plugins/functions using different MTAs or different credentials.
Fix: standardize: all mail through one SMTP submission endpoint; disable legacy plugins and local mail paths.
“Auth failed” → wrong credentials or TLS mismatch → fix SMTP AUTH and cert validation
Symptom: logs show SASL failures; app reports “Could not authenticate.”
Root cause: wrong username/password; using port 25 without submission; TLS required but not used.
Fix: confirm with swaks; enforce TLS on 587; rotate credentials and use per-app secrets.
“Bounces never show up” → Return-Path points nowhere → configure bounce handling
Symptom: users report non-delivery; you have no bounce mailbox.
Root cause: envelope sender not monitored; relay rewrites Return-Path; or you never set up a bounce domain.
Fix: ensure bounces go to a monitored mailbox/system; log and categorize DSNs.
“After enabling DMARC reject, legitimate mail breaks” → forgotten senders → inventory first
Symptom: helpdesk/CRM/legacy systems’ mail disappears.
Root cause: DMARC enforcement without mapping all sources; no DKIM signing on some systems.
Fix: start DMARC with p=none, review reports, then enforce; migrate stray senders to the relay or sign correctly.
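A quick way to check where a domain sits in that rollout is to parse its published DMARC TXT record. The sketch below assumes you already have the record text (for example from a DNS lookup of `_dmarc.example.com`). The stage names and the report address are hypothetical.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def rollout_stage(record: str) -> str:
    """Label the record by rollout stage (labels are illustrative, not standard terms)."""
    tags = parse_dmarc(record)
    if tags.get("v") != "DMARC1" or "p" not in tags:
        return "invalid"
    policy = tags["p"].lower()
    if policy == "none":
        # p=none without rua means nobody receives the aggregate reports
        return "monitoring" if "rua" in tags else "monitoring without reports"
    if policy in ("quarantine", "reject"):
        return "enforcing"
    return "invalid"


if __name__ == "__main__":
    # Hypothetical record for illustration.
    print(rollout_stage("v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"))
```

If the result is "monitoring without reports", add a `rua` address before doing anything else: p=none is only useful if someone reads the reports.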
Checklists / step-by-step plan
Plan A (recommended): WordPress/app → Postfix submission → managed relay provider
- Pick your sender domain for transactional mail (example: example.com).
- Stand up a submission endpoint (Postfix on 587) reachable from your app servers only.
- Configure upstream relayhost in Postfix to your provider on 587 with SMTP AUTH.
- Configure WordPress/app to use SMTP to your Postfix submission host with TLS and per-app credentials.
- Publish SPF for the domain to include the provider (and only what you actually use).
- Enable DKIM signing (provider or relay) and publish selector keys.
- Deploy DMARC as p=none initially; confirm alignment; then move to quarantine/reject.
- Set bounce handling (Return-Path) and monitor DSNs.
- Add monitoring: postfix service, queue size, log-based alerts for repeated 4xx/5xx.
- Run a test matrix: send to a few major providers; inspect headers; confirm pass results.
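The monitoring step in this plan can begin as a small log check. The following sketch counts delivery outcomes in Postfix log lines and flags a high failure ratio. It assumes the common syslog line shape containing `dsn=X.Y.Z, status=...`. The sample lines, thresholds, and function names are illustrative, so tune them against your own logs.

```python
import re
from collections import Counter

# Matches the "dsn=4.4.1, status=deferred" portion of a Postfix delivery log line.
LINE_RE = re.compile(r"dsn=(?P<dsn>\d\.\d{1,3}\.\d{1,3}), status=(?P<status>\w+)")


def count_statuses(log_lines) -> Counter:
    """Count delivery outcomes (sent, deferred, bounced, ...) across log lines."""
    counts = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts


def should_alert(counts: Counter, max_failure_ratio=0.2, min_messages=5) -> bool:
    """Alert when deferred + bounced exceed the ratio; thresholds are assumptions."""
    total = sum(counts.values())
    if total < min_messages:
        return False  # too little traffic to judge
    failed = counts["deferred"] + counts["bounced"]
    return failed / total > max_failure_ratio


# Hypothetical log excerpt for illustration.
SAMPLE_LOG = [
    "Jan 10 12:00:01 relay postfix/smtp[2211]: 4F1A2B3C4D: to=<a@example.org>, relay=mx.example.org[192.0.2.10]:25, delay=1.2, delays=0.1/0/0.5/0.6, dsn=2.0.0, status=sent (250 2.0.0 OK)",
    "Jan 10 12:00:02 relay postfix/smtp[2212]: 5A6B7C8D9E: to=<b@example.net>, relay=none, delay=30, delays=0.1/0/30/0, dsn=4.4.1, status=deferred (connect to mx.example.net[198.51.100.7]:25: Connection timed out)",
    "Jan 10 12:00:03 relay postfix/smtp[2211]: 6B7C8D9E0F: to=<c@example.org>, relay=mx.example.org[192.0.2.10]:25, delay=0.9, delays=0.1/0/0.4/0.4, dsn=2.0.0, status=sent (250 2.0.0 OK)",
    "Jan 10 12:00:04 relay postfix/smtp[2212]: 7C8D9E0F1A: to=<d@example.net>, relay=none, delay=30, delays=0.1/0/30/0, dsn=4.4.1, status=deferred (connect to mx.example.net[198.51.100.7]:25: Connection timed out)",
    "Jan 10 12:00:05 relay postfix/smtp[2211]: 8D9E0F1A2B: to=<e@example.org>, relay=mx.example.org[192.0.2.10]:25, delay=1.0, delays=0.1/0/0.5/0.4, dsn=2.0.0, status=sent (250 2.0.0 OK)",
]

if __name__ == "__main__":
    counts = count_statuses(SAMPLE_LOG)
    print(dict(counts), "alert" if should_alert(counts) else "ok")
```

Run it over the last few minutes of mail log from cron or your log shipper; the point is to hear about a rising deferral rate before marketing does.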
Emergency “make it send now” checklist (without creating future debt)
- Stop using direct-to-MX from web servers if port 25 is flaky.
- Get SMTP submission working with swaks first; then point WordPress/app to it.
- Use a From address on your domain (not a free mailbox domain).
- Publish SPF and DKIM before you start blasting.
- Turn on logging and keep the message IDs.
“We want to run our own outbound delivery” checklist (the reality tax)
- Confirm port 25 egress is permitted and stable across all egress paths.
- Set rDNS (PTR) with matching forward DNS for every sending IP; keep those IPs stable.
- Implement DKIM signing and key rotation.
- Implement feedback loop processing where available (complaints and bounces).
- Warm up IPs and domains (gradual traffic ramp).
- Build monitoring for blocklist events and rejection patterns.
If that list sounds like a second job, that’s because it is.
FAQ (the questions people ask at 2 a.m.)
1) Why does WordPress say “sent” when nothing arrives?
Because many paths only confirm local handoff (to PHP’s mail interface or a local MTA),
not final delivery. “Sent” often means “I dropped it into a pipe.”
2) Should I use SMTP port 25, 465, or 587?
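Use 587 for client submission with authentication and TLS (STARTTLS). Port 25 is for server-to-server delivery and is commonly blocked on hosting and consumer networks.
Port 465 is submission over implicit TLS. It was once deprecated, but RFC 8314 reinstated it and many providers support it. Either works if TLS and authentication are enforced; 587 remains the most widely supported choice.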
3) Can I just use Gmail SMTP for my site?
You can for low volume, but it’s usually a poor long-term fit: rate limits, credential management, and alignment issues.
Also, using a personal mailbox as a production dependency is how you end up debugging email from an airport lounge.
4) What’s the minimum DNS setup for deliverability?
SPF and DKIM at minimum, with DMARC added quickly. If you operate your own sending IPs, add rDNS.
Without these, you’re betting against the receivers’ default skepticism.
5) What does DMARC alignment actually mean?
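It means the domain humans see in the From: header must match the domain authenticated by SPF (the envelope sender) and/or DKIM (the d= signing domain).
In strict mode the match must be exact; in relaxed mode (the default) sharing the same organizational domain is enough. Misalignment is a common reason legitimate mail gets junked.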
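The alignment check itself is small enough to sketch. Note the loud assumption: this version takes the last two labels as the organizational domain, which is wrong for suffixes like co.uk. Real DMARC evaluators use the Public Suffix List. The function names and domains are illustrative.

```python
def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    ASSUMPTION: good enough for example.com style domains only;
    production checks need the Public Suffix List.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def aligned(from_domain: str, auth_domain: str, strict: bool = False) -> bool:
    """Compare the From: domain with an SPF or DKIM authenticated domain."""
    visible = from_domain.lower().rstrip(".")
    authenticated = auth_domain.lower().rstrip(".")
    if strict:
        return visible == authenticated  # exact match required
    return org_domain(visible) == org_domain(authenticated)  # relaxed mode


if __name__ == "__main__":
    print(aligned("example.com", "mail.example.com"))               # relaxed
    print(aligned("example.com", "mail.example.com", strict=True))  # strict
    print(aligned("example.com", "sendprovider.example.net"))       # provider's own domain
```

The last case is the classic one: the provider signs with its own domain, the signature is valid, and DMARC still fails because nothing authenticated lines up with your From: domain.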
6) Why do I see “deferred” messages in the queue?
Deferrals are temporary failures (4xx). Causes include greylisting, rate limits, temporary policy blocks, or network issues.
Deferrals are not “fine” if they accumulate; they’re only fine if they retry and drain predictably.
7) How do I prove an email was delivered?
You can prove that a server accepted it: your logs show a 250 response from the next hop (the recipient’s MX if you deliver directly, or your provider if you relay).
You cannot prove a human read it. If you need human-level proof, that’s a product problem (in-app notifications, receipts, etc.).
8) Should my WordPress server run Postfix directly?
Sometimes. But it’s usually better to keep WordPress as a client and centralize mail through a relay/submission host.
Web servers get compromised; you don’t want compromised web servers to have direct outbound email power.
9) Why do password reset emails fail more often than newsletters?
They don’t always fail more—people notice them more. Also, resets are often triggered during incidents (traffic spikes),
and their timing is urgent. Any queuing delay feels like a failure.
10) Is it okay to disable TLS verification in the SMTP plugin?
No, not in production. It “fixes” the symptom by removing security and making debugging harder later.
Fix certificates and trust properly. You want secure, repeatable behavior.
Conclusion: next steps that stick
Reliable email isn’t about the perfect plugin. It’s about having a real relay path, consistent identity, and logs that tell the truth.
Build a submission endpoint, relay through a provider unless you truly need to self-host reputation, and align SPF/DKIM/DMARC.
Do this this week
- Run the fast diagnosis playbook and classify the failure (handoff vs network/auth vs deliverability).
- Standardize on SMTP submission (587) from WordPress/app to one relay.
- Publish/verify SPF, DKIM, and DMARC alignment for the From domain you actually use.
- Add queue and log monitoring so you find out before your customers do.
Do this next
- Split transactional and marketing streams if you send both.
- Rotate credentials, separate prod/stage, and treat SMTP creds like any other production secret.
- Write down the trace path (app intent → relay queue ID → upstream outcome) and make it part of incident response.
Make email boring. Your future self will thank you, probably quietly, because they’ll finally have time for something else.