You ran the checks. The certificate is not expired. It’s signed by a public CA. The chain looks fine in a browser.
And yet your mail logs are full of TLS alerts, deferrals, or “Untrusted” warnings that refuse to die.
This is the part nobody tells you: email TLS is not “HTTPS but on port 25.” It’s a different culture, different defaults,
and different failure modes. A certificate can be perfectly valid and still be wrong for SMTP in a way that breaks real delivery.
What “valid certificate” actually means in SMTP land
When someone says “the cert is valid,” they usually mean: the leaf certificate’s signature verifies, it’s not expired,
and it chains to a trusted root in some client. That’s table stakes. SMTP adds more ways to be wrong.
In email, the server’s TLS identity is tied to DNS names, MX behavior, and policy. Some clients validate hostnames strictly.
Some don’t. Some care about SNI. Some can’t do SNI. Some enforce modern ciphers. Some are stuck in 2012 but still deliver mail for a bank.
“Valid” is a spectrum, and your counterpart MTA chooses where to draw the line.
Interesting facts and historical context (short, concrete, and useful)
- STARTTLS for SMTP was first specified in 1999 (RFC 2487) and revised in 2002 (RFC 3207). It was designed for opportunistic upgrade, not strict enforcement.
- SMTP originally shipped without encryption because the early Internet assumed a cooperative environment; encryption was bolted on later.
- “Opportunistic TLS” became the default norm for years: encrypt if possible, fall back if not. Great for deliverability, mediocre for confidentiality.
- Certificate name checks were historically inconsistent across MTAs. That inconsistency is why “it works with Gmail” is not a certificate test.
- SNI arrived later than many SMTP stacks. Some older MTAs still don’t send SNI, which matters on multi-tenant servers.
- Perfect Forward Secrecy became mainstream after 2013. Older cipher preferences can still break interop when one side disables “legacy” too aggressively.
- MTA-STS is relatively new (RFC 8461, 2018) and shifted parts of email TLS from “best effort” to “policy-driven enforcement.”
- DANE for SMTP exists (TLSA records, RFC 7672) and can enforce TLS without public CAs—great, until DNSSEC isn’t really deployed end-to-end.
- Many MTAs still treat port 465 as “SMTPS” for submission-style use, while port 25 remains STARTTLS-centric for server-to-server.
There’s a fundamental mismatch at play: the web is “client to one server name,” while SMTP is “server to whatever MX DNS returns, potentially many names,
potentially changing, potentially behind load balancers.” Certificates hate ambiguity. SMTP loves it.
One quote worth keeping in your head while debugging: Hope is not a strategy.
(paraphrased idea attributed to operations culture, widely used in SRE and project management)
“Valid” is what the remote side accepts. Your job is to predict what a wide variety of remote sides will accept—then configure accordingly.
Fast diagnosis playbook (check this, then that)
When TLS mail breaks, you can lose hours to vibes-based debugging. Don’t. Run this in order and you’ll usually find the bottleneck fast.
- First: confirm what name is being validated.
  - Are you connecting to mx1.example.com, mail.example.com, or the bare domain?
  - Does the certificate cover that exact name via SAN?
  - Are you behind a load balancer or proxy that changes the TLS endpoint?
- Second: capture the actual handshake the remote MTA sees.
  - Use openssl s_client with STARTTLS and (if applicable) SNI.
  - Verify chain presentation: leaf + intermediate(s), in order.
- Third: read your MTA logs for policy enforcement.
  - Is the remote side enforcing MTA-STS, DANE, or “TLS required” for submission?
  - Is your side enforcing hostname checks or minimum TLS versions?
- Fourth: verify protocol and cipher overlap.
  - TLSv1.2 is the floor for many modern MTAs; TLSv1.0/1.1 are increasingly refused.
  - But disabling everything but TLSv1.3 can still hurt if the peer is old.
- Fifth: eliminate “it’s the firewall” with evidence.
  - Confirm ports, NAT, and any TLS interception/inspection devices.
  - Check MTU/fragmentation if you see odd stalls mid-handshake.
Joke #1: Email TLS troubleshooting is like a detective novel where the villain is always “one missing intermediate,” wearing a different hat each chapter.
Why TLS fails even with a valid certificate
1) Hostname mismatch: the certificate is valid, just not for that MX name
This is the number-one “but it’s valid!” problem. The cert might be valid for mail.example.com,
while the sending MTA connects to mx1.example.com. Or it connects to the domain in the MX record which points to
a provider hostname you forgot exists. The chain is fine; the identity is wrong.
SMTP adds a twist: many MTAs validate against the name they connected to (the MX target), not what the server says in the SMTP banner.
The banner can be honest and still irrelevant.
Fix: put the MX hostnames in the certificate SANs, or change MX to a name you do control and can certify.
“Works in my browser” is meaningless if your browser never connects to the MX name.
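The name check a strict sender performs can be sketched in a few lines of shell. This is a simplified illustration, not any particular MTA's implementation: the function name san_covers is hypothetical, the input is assumed to be the comma-separated DNS: list that openssl prints for the SAN extension, and only exact matches plus a wildcard in the leftmost label are modeled.

```shell
# san_covers HOST "DNS:a.example.com, DNS:*.example.net"
# Exit 0 if HOST is covered by one of the SAN entries, 1 otherwise.
# Simplified model: case-insensitive exact match, or a "*." wildcard
# that stands in for exactly one leftmost label.
san_covers() {
  local host entries entry suffix label
  host=$(printf '%s' "${1%.}" | tr '[:upper:]' '[:lower:]')
  # One entry per line: lowercased, whitespace and "dns:" prefix removed.
  entries=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' \
    | tr -d ' \t' | tr ',' '\n' | sed 's/^dns://')
  while IFS= read -r entry; do
    [ "$entry" = "$host" ] && return 0
    case $entry in
      '*.'*)
        suffix=${entry#?}          # "*.example.com" -> ".example.com"
        label=${host%"$suffix"}    # "mx1.example.com" -> "mx1"
        # The wildcard must replace exactly one non-empty label.
        if [ "$label" != "$host" ] && [ -n "$label" ] \
           && [ "${label#*.}" = "$label" ]; then
          return 0
        fi
        ;;
    esac
  done <<EOF
$entries
EOF
  return 1
}
```

For example, `san_covers mx1.example.com "DNS:mail.example.com, DNS:smtp.example.com"` fails, which is exactly the mismatch described above. The point is that the comparison uses the MX target the sender dialed, not the banner or the domain in the address.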
2) Missing intermediate: browsers fetch it, MTAs often won’t
Modern browsers are extremely forgiving. They cache intermediates, fetch them via AIA, and paper over sloppy server configuration.
Many MTAs, by design, don’t do AIA fetching. They expect the server to present a complete chain.
This is why a certificate can verify on your laptop and fail in production mail delivery.
A typical case: the remote MTA runs an older OpenSSL build in a container with a minimal trust store and no AIA fetching.
It sees the leaf, can’t build a chain, and rejects it.
Fix: configure your MTA to present leaf + intermediates in the right order. Do not include the root.
3) SNI and multi-tenant endpoints: you got the “default” cert
If multiple domains share an IP, the server needs SNI to choose the right certificate. Many SMTP clients now send SNI,
but not all of them. Some older MTAs will connect without SNI and receive the default certificate, which may be valid, just not for you.
The fun part: this can be intermittent. Some senders succeed (they send SNI), some fail (they don’t). You get a ghost story in the logs.
Fix: set the default certificate to something that safely covers the primary MX name, or dedicate IPs for mail where possible.
If you must multi-tenant, test with and without SNI.
4) TLS version and cipher suite mismatch: “modernize” can mean “break mail”
Security teams love hardening guides. SREs love not being paged at 2 a.m. These loves are not always aligned.
If you disable TLSv1.2 or remove common ciphers, you can strand older but still legitimate senders.
On the other side, if the remote insists on TLSv1.2+ and your stack is stuck on TLSv1.0, you’re the legacy.
This shows up as handshake failures, protocol version alerts, or cipher negotiation errors.
Fix: keep TLSv1.2 enabled on SMTP servers. TLSv1.3 is great, but don’t assume universal support.
Disable known-bad ciphers, but keep a sane overlap set.
5) STARTTLS policy confusion: opportunistic vs required
STARTTLS was designed to be opportunistic. That means: if encryption fails, mail can still be delivered in plaintext.
But multiple mechanisms can change that behavior:
- Submission (587) often requires TLS, because it carries user credentials.
- Partner-to-partner transport might enforce TLS via explicit policy or connector settings.
- MTA-STS tells senders: “Only deliver to me over valid TLS, or don’t deliver.”
- DANE can enforce TLS at DNS level (when DNSSEC validates).
So you can have a “valid” certificate that still fails because policy requires hostname validation, or a particular name,
or a particular chain, not just encryption.
6) Incorrect EHLO/HELO name and reverse DNS: not TLS, but often confused with TLS
This one wastes time because it sits adjacent to TLS errors in logs. Some MTAs will refuse or penalize
connections with bad rDNS, inconsistent HELO names, or generic banners. That can look like a “TLS problem”
when it’s really a reputation/identity issue.
Fix: align PTR (reverse DNS), forward A/AAAA, and the HELO name your server uses. It won’t fix a broken chain,
but it will reduce noise while you fix TLS.
7) Middleboxes: TLS inspection, NAT, and “helpful” firewalls
SMTP STARTTLS is especially good at triggering corporate middleboxes that think everything should look like HTTPS.
Some devices mangle STARTTLS negotiation, strip extensions, or reset connections when they see unfamiliar handshakes.
Fix: bypass TLS inspection for SMTP, or use a path that doesn’t traverse the inspection device.
If you can’t, prepare for suffering and document every exception you add.
8) Wrong certificate format / key mismatch / permissions: basic but real
The cert can be valid and still not load. Postfix, Exim, and others will fail open or fail closed depending on configuration.
If your service falls back to plaintext or presents a default cert because it can’t read the right one, you’ll get confusing symptoms.
Fix: verify the running service is actually using the file you think it is, and that the key matches.
9) DANE and MTA-STS: when the rules change underneath you
DANE with DNSSEC can allow self-signed or private CA certs if the TLSA record matches. That’s powerful—and fragile.
Rotate a cert and forget to update TLSA? You just enforced your own outage.
MTA-STS is different: it still relies on public PKI, but it enforces “valid certificate” and “correct name” in a way opportunistic TLS doesn’t.
If you publish an MTA-STS policy and your MX endpoints aren’t consistently correct, you’ll see sudden delivery failures from senders that honor it.
Practical tasks: commands, outputs, decisions (12+)
These are production-grade checks. They’re not theory. Run them, read the output, decide the next move.
Commands assume a Linux server with common tooling.
Task 1: See what certificate your SMTP server actually presents (STARTTLS on 25)
cr0x@server:~$ openssl s_client -starttls smtp -connect mx1.example.com:25 -servername mx1.example.com -showcerts -verify_return_error
CONNECTED(00000003)
depth=2 C=US, O=Internet Security Research Group, CN=ISRG Root X1
verify return:1
depth=1 C=US, O=Let's Encrypt, CN=R11
verify return:1
depth=0 CN=mail.example.com
verify return:1
---
Certificate chain
0 s:CN=mail.example.com
i:C=US, O=Let's Encrypt, CN=R11
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
1 s:C=US, O=Let's Encrypt, CN=R11
i:C=US, O=Internet Security Research Group, CN=ISRG Root X1
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
---
SSL-Session:
Protocol : TLSv1.3
Cipher : TLS_AES_256_GCM_SHA384
...
What it means: The leaf cert CN is mail.example.com, but you connected to mx1.example.com.
If SANs don’t include mx1.example.com, strict senders will reject you.
Decision: Re-issue a certificate that includes the MX hostnames in SAN, or change MX to match the cert.
Task 2: Verify SAN coverage explicitly
cr0x@server:~$ openssl s_client -starttls smtp -connect mx1.example.com:25 -servername mx1.example.com </dev/null 2>/dev/null | openssl x509 -noout -text | sed -n '/Subject:/p; /Subject Alternative Name:/{p;n;p;}'
Subject: CN = mail.example.com
X509v3 Subject Alternative Name:
DNS:mail.example.com, DNS:smtp.example.com
What it means: No mx1.example.com in SAN. CN alone isn’t reliable, and modern validation uses SAN anyway.
Decision: Add mx1.example.com (and any other MX targets) to SAN during issuance.
Task 3: Check for missing intermediates (chain completeness)
cr0x@server:~$ openssl s_client -starttls smtp -connect mx1.example.com:25 -servername mx1.example.com -showcerts </dev/null 2>/dev/null | awk '/^Server certificate/{exit} /BEGIN CERTIFICATE/{i++; keep=1} keep{print > ("cert" i ".pem")} /END CERTIFICATE/{keep=0}'
cr0x@server:~$ for f in cert*.pem; do echo "== $f =="; openssl x509 -noout -subject -issuer -in "$f"; done
== cert1.pem ==
subject=CN = mail.example.com
issuer=C = US, O = Let's Encrypt, CN = R11
== cert2.pem ==
subject=C = US, O = Let's Encrypt, CN = R11
issuer=C = US, O = Internet Security Research Group, CN = ISRG Root X1
What it means: Server presented leaf + intermediate. Good. If you only saw cert1.pem,
many MTAs would fail chain building.
Decision: If the intermediate is missing, point smtpd_tls_cert_file (Postfix) or its equivalent at a file that contains the full chain.
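If you check many endpoints, reading subject/issuer pairs by eye gets old. The sketch below takes the subject= and issuer= lines printed by the loop above and checks that each certificate's issuer is the next certificate's subject. The function name chain_ok is hypothetical, and this compares names only; it is a quick ordering check, not signature verification.

```shell
# chain_ok: read "subject=..." / "issuer=..." lines on stdin, in the
# order the server presented the certificates. Exit 0 if the order is
# consistent, 1 if a link is broken or only a leaf was sent.
chain_ok() {
  awk '
    /^subject=/ { n++; subj[n] = substr($0, 9) }
    /^issuer=/  { iss[n] = substr($0, 8) }
    END {
      if (n == 0) { print "no certificates found"; exit 2 }
      for (i = 1; i < n; i++)
        if (iss[i] != subj[i + 1]) {
          printf "broken link: cert %d issuer is not cert %d subject\n", i, i + 1
          bad = 1
        }
      if (n == 1 && subj[1] != iss[1]) {
        print "leaf only: no intermediate presented"
        bad = 1
      }
      if (n > 1 && subj[n] == iss[n])
        print "note: self-signed root included; usually unnecessary"
      if (!bad) print "chain order looks consistent (" n " certs)"
      exit bad
    }'
}
```

Usage: pipe the output of the `for f in cert*.pem` loop into `chain_ok`. Lines such as `== cert1.pem ==` are ignored.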
Task 4: Confirm the private key matches the leaf certificate
cr0x@server:~$ openssl x509 -noout -modulus -in /etc/ssl/mail/fullchain.pem | openssl md5
MD5(stdin)= 6f0c9a7d5b0c0b3f4b6a7e3c2d1a9e10
cr0x@server:~$ openssl rsa -noout -modulus -in /etc/ssl/mail/privkey.pem | openssl md5
MD5(stdin)= 6f0c9a7d5b0c0b3f4b6a7e3c2d1a9e10
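What it means: Modulus hashes match: key and cert belong together.
If they didn’t, your MTA may fail TLS or present something else. The modulus check applies to RSA keys only; for ECDSA keys, compare the output of openssl x509 -noout -pubkey with openssl pkey -pubout instead.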
Decision: If mismatch, deploy the correct key/cert pair and reload the service.
Task 5: Confirm what Postfix is configured to use (and whether it’s actually enabled)
cr0x@server:~$ postconf -n | grep -E 'smtpd_tls_|smtp_tls_|myhostname|mydomain'
myhostname = mx1.example.com
smtpd_tls_cert_file = /etc/ssl/mail/fullchain.pem
smtpd_tls_key_file = /etc/ssl/mail/privkey.pem
smtpd_tls_security_level = may
smtp_tls_security_level = may
smtp_tls_loglevel = 1
smtpd_tls_loglevel = 1
What it means: Inbound and outbound TLS are opportunistic (may), and loglevel is low but non-zero.
Decision: If you’re trying to satisfy an enforced partner requirement, you may need per-destination policies (smtp_tls_policy_maps in Postfix), not a global encrypt setting.
Task 6: Turn logs into evidence (Postfix TLS log detail)
cr0x@server:~$ sudo postconf -e 'smtpd_tls_loglevel=2' 'smtp_tls_loglevel=2'
cr0x@server:~$ sudo systemctl reload postfix
cr0x@server:~$ sudo tail -n 30 /var/log/mail.log
Feb 4 12:10:21 mx1 postfix/smtpd[22110]: Anonymous TLS connection established from mail-oi1-f169.google.com[209.85.167.169]: TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits)
Feb 4 12:10:44 mx1 postfix/smtpd[22110]: warning: TLS library problem: error:0A000086:SSL routines::certificate verify failed: TLS alert unknown ca
What it means: First line: success with a modern sender. Second: another peer failed verification (or your server failed verifying theirs, depending on direction).
Decision: If failures are inbound and you don’t require client certs, ensure you’re not misconfigured to verify clients. If outbound, check your trust store.
Task 7: Identify whether SNI changes the certificate presented
cr0x@server:~$ openssl s_client -starttls smtp -connect 203.0.113.25:25 -showcerts </dev/null 2>/dev/null | openssl x509 -noout -subject
subject=CN = default.invalid
cr0x@server:~$ openssl s_client -starttls smtp -connect 203.0.113.25:25 -servername mx1.example.com -showcerts </dev/null 2>/dev/null | openssl x509 -noout -subject
subject=CN = mx1.example.com
What it means: Without SNI you get the default certificate (default.invalid), with SNI you get the correct one.
Decision: If you can’t dedicate IPs, set the default cert to something acceptable for your primary MX, and test with a non-SNI client.
Task 8: Test protocol overlap (force TLS versions)
cr0x@server:~$ openssl s_client -starttls smtp -connect mx1.example.com:25 -tls1_2 -servername mx1.example.com
cr0x@server:~$ openssl s_client -starttls smtp -connect mx1.example.com:25 -tls1_1 -servername mx1.example.com
...
ssl3_read_bytes:tlsv1 alert protocol version
...
What it means: TLS 1.2 works; TLS 1.1 is refused. That’s good hygiene in 2026.
If your peer only supports TLS 1.1, they will fail.
Decision: Keep TLS 1.2 enabled. Decide whether to support legacy peers case-by-case (usually not on public MX).
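One way to express that decision in Postfix is sketched below. It is an assumption-laden example, not a recommended baseline: the function name apply_tls_floor is hypothetical, the settings apply the same floor to inbound and outbound TLS, and everything is wrapped in a function so nothing changes until you call it. Check your own sender population before refusing TLS 1.0/1.1 on opportunistic connections.

```shell
# apply_tls_floor: allow TLS 1.2 and newer, refuse older protocols.
# Sketch only; review before running on a production MX.
apply_tls_floor() {
  # Inbound (smtpd): opportunistic and mandatory TLS share one floor.
  sudo postconf -e \
    'smtpd_tls_protocols=!SSLv2, !SSLv3, !TLSv1, !TLSv1.1' \
    'smtpd_tls_mandatory_protocols=!SSLv2, !SSLv3, !TLSv1, !TLSv1.1' &&
  # Outbound (smtp client): same floor.
  sudo postconf -e \
    'smtp_tls_protocols=!SSLv2, !SSLv3, !TLSv1, !TLSv1.1' \
    'smtp_tls_mandatory_protocols=!SSLv2, !SSLv3, !TLSv1, !TLSv1.1' &&
  sudo systemctl reload postfix
}
```

Newer Postfix releases (3.6 and later) also accept the shorter `>=TLSv1.2` form. After applying, rerun the forced-version tests above to confirm that TLS 1.2 still connects and TLS 1.1 is refused.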
Task 9: Confirm what ciphers your server offers (from the server side)
cr0x@server:~$ sudo postconf -n | grep -E 'smtpd_tls_mandatory_protocols|smtpd_tls_protocols|smtpd_tls_mandatory_ciphers|tls_high_cipherlist'
smtpd_tls_mandatory_protocols = !SSLv2, !SSLv3, !TLSv1, !TLSv1.1
smtpd_tls_protocols = !SSLv2, !SSLv3
smtpd_tls_mandatory_ciphers = high
What it means: Mandatory protocols are TLSv1.2+. Opportunistic still allows more unless limited.
“high” is vague; it depends on OpenSSL.
Decision: If interop issues exist, explicitly set a modern but compatible cipher list rather than relying on “high” magic.
Task 10: Check MX records and see what names the world actually connects to
cr0x@server:~$ dig +short MX example.com
10 mx1.example.com.
20 mx2.example.com.
cr0x@server:~$ dig +short A mx1.example.com
203.0.113.25
cr0x@server:~$ dig +short AAAA mx1.example.com
2001:db8:25::25
What it means: These are the hostnames that must be covered by certificates (or at least present correct defaults).
Decision: Ensure every MX target has a certificate whose SAN includes that exact hostname, for both IPv4 and IPv6 endpoints.
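To cover every MX target on both address families without typing each command, the lookups and the handshake can be combined into a loop. This is a minimal sketch: the function names mx_hosts, smtp_target, and check_all_mx are hypothetical, and it assumes dig and an OpenSSL release that accepts bracketed IPv6 literals in -connect.

```shell
# mx_hosts: read `dig +short MX` output on stdin, print bare hostnames.
mx_hosts() {
  awk 'NF >= 2 { h = $2; sub(/\.$/, "", h); if (h != "") print h }'
}

# smtp_target IP: format an address for `openssl s_client -connect`.
smtp_target() {
  case $1 in
    *:*) printf '[%s]:25\n' "$1" ;;   # IPv6 literal needs brackets
    *)   printf '%s:25\n' "$1" ;;
  esac
}

# check_all_mx DOMAIN: for each MX target and each A/AAAA address,
# print the subject and expiry of the certificate presented on port 25.
check_all_mx() {
  local domain host ip
  domain=$1
  for host in $(dig +short MX "$domain" | mx_hosts); do
    for ip in $(dig +short A "$host"; dig +short AAAA "$host"); do
      printf '%s %s: ' "$host" "$ip"
      openssl s_client -starttls smtp -connect "$(smtp_target "$ip")" \
        -servername "$host" </dev/null 2>/dev/null \
        | openssl x509 -noout -subject -enddate 2>/dev/null | tr '\n' ' '
      echo
    done
  done
}
```

Run it as `check_all_mx example.com`. A line with no subject after the address means the handshake on that endpoint failed or returned no certificate, which is the endpoint to look at first.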
Task 11: Verify reverse DNS alignment (cuts down “TLS-looking” delivery issues)
cr0x@server:~$ dig +short -x 203.0.113.25
mx1.example.com.
cr0x@server:~$ hostname -f
mx1.example.com
What it means: PTR matches your FQDN. Many receivers like this. Some require it for reputation scoring.
Decision: If PTR is wrong, fix it with your IP provider. Don’t chase TLS ghosts until basic identity is aligned.
Task 12: Check MTA-STS status from DNS (do you enforce strict TLS?)
cr0x@server:~$ dig +short TXT _mta-sts.example.com
"v=STSv1; id=2024020401"
cr0x@server:~$ dig +short TXT _smtp._tls.example.com
"v=TLSRPTv1; rua=mailto:tlsrpt@example.com"
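What it means: You publish MTA-STS and TLS reporting. The TXT record only signals that a policy exists; the mode (none, testing, or enforce) lives in the policy file served over HTTPS at mta-sts.example.com/.well-known/mta-sts.txt. Senders that honor STS and see enforce mode will require a valid certificate and a matching name.
Decision: If you’re not ready for strictness, stay in testing mode or don’t publish MTA-STS. If you are, ensure MX names and certs are consistent everywhere.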
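The DNS record alone does not tell you how strict the policy is. A small helper can pull fields out of the policy file itself. This is a sketch: the function name sts_field is hypothetical, and it assumes the policy uses the plain `key: value` line format with LF or CRLF line endings.

```shell
# sts_field NAME: read an MTA-STS policy on stdin and print every
# value of field NAME (for example "mode" or "mx"), one per line.
sts_field() {
  tr -d '\r' | awk -F': *' -v key="$1" '$1 == key { print $2 }'
}
```

Usage, assuming curl is available: `curl -s https://mta-sts.example.com/.well-known/mta-sts.txt | sts_field mode` prints the current mode, and the same command with `mx` lists the MX patterns senders will accept. Compare that list against the output of `dig +short MX example.com`; any MX target that no pattern covers will be refused by senders in enforce mode.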
Task 13: Look at TLSRPT (if you collect reports) to see who is failing and why
cr0x@server:~$ sudo grep -R "sts-policy" -n /var/mail/tlsrpt/ | head
/var/mail/tlsrpt/report-2026-02-04.json:42: "result-type": "sts-policy-invalid"
/var/mail/tlsrpt/report-2026-02-04.json:43: "sending-mta-ip": "198.51.100.10"
What it means: Some senders are failing due to STS policy issues, not raw TLS mechanics.
Decision: Fix the STS policy publication and alignment, or temporarily adjust policy mode to avoid hard failures during transition.
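Grepping for one failure type hides the others. A rough tally across all saved reports shows where to spend time first. This sketch assumes the reports are already unpacked as JSON under one directory; the function name tlsrpt_summary is hypothetical, and the pattern accepts the key spelled with either a hyphen or an underscore.

```shell
# tlsrpt_summary DIR: count failure result types across the JSON
# reports stored under DIR, most frequent first.
tlsrpt_summary() {
  grep -rhoE '"result[-_]type": *"[^"]+"' "$1" \
    | sed -E 's/.*: *"([^"]+)"/\1/' \
    | sort | uniq -c | sort -rn
}
```

Usage: `tlsrpt_summary /var/mail/tlsrpt/`. This counts how often each result type appears in the files, not the number of failed sessions behind it, so treat it as a triage aid rather than a metric.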
Task 14: Confirm your service is listening and advertising STARTTLS
cr0x@server:~$ nc -v mx1.example.com 25
Connection to mx1.example.com 25 port [tcp/smtp] succeeded!
220 mx1.example.com ESMTP Postfix
cr0x@server:~$ printf "EHLO test.example\r\nQUIT\r\n" | nc -v mx1.example.com 25
Connection to mx1.example.com 25 port [tcp/smtp] succeeded!
220 mx1.example.com ESMTP Postfix
250-mx1.example.com
250-PIPELINING
250-SIZE 52428800
250-STARTTLS
250-ENHANCEDSTATUSCODES
250 8BITMIME
What it means: STARTTLS is offered. If it’s missing, senders can’t upgrade—even if you have a great certificate.
Decision: If STARTTLS isn’t advertised, fix MTA TLS enablement and ensure no proxy strips extensions.
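For monitoring, the same check can be reduced to an exit code. A minimal sketch, with the hypothetical function name advertises_starttls, that reads an EHLO response on stdin:

```shell
# advertises_starttls: read an SMTP EHLO response on stdin.
# Exit 0 if a "250-STARTTLS" or "250 STARTTLS" line is present.
advertises_starttls() {
  tr -d '\r' | grep -qiE '^250[- ]STARTTLS$'
}
```

Usage: `printf "EHLO test.example\r\nQUIT\r\n" | nc -w 5 mx1.example.com 25 | advertises_starttls || echo "STARTTLS missing"`. Run it from outside your network as well, since a middlebox that strips the extension only shows up on the path it sits on.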
Three corporate mini-stories from the trenches
Mini-story 1: The outage caused by a wrong assumption (hostname validation)
A mid-sized SaaS company migrated inbound mail to a new cluster. The team did the “grown-up” thing:
new IPs, new MX records, and a shiny certificate from a public CA. They tested with a couple of external accounts,
watched mail flow, and declared victory.
Then a large customer’s security gateway started deferring messages with a TLS failure reason. Not bouncing. Deferring.
That’s worse: it piles up and becomes tomorrow’s problem, right when you were hoping to sleep.
The company’s cert was valid for mail.example.com. Their MX records pointed to mx1.example.com and mx2.example.com.
They assumed CN would cover it (“isn’t that what certificates do?”), and they assumed sending MTAs validate against the banner name.
The customer’s gateway validated against the MX name it connected to. Strictly. Correctly.
The fix was boring: re-issue the certificate with SANs covering both MX names and the submission name,
deploy fullchain, reload Postfix. Delivery recovered immediately for that customer and several others who had been quietly deferring too.
The lesson: a valid certificate is not an identity; it’s a claim. SMTP peers decide whether to believe it, and what they compare it to.
Your assumptions don’t get a vote.
Mini-story 2: The optimization that backfired (hardening into incompatibility)
A security initiative rolled through a large enterprise: standardize “modern TLS everywhere.” The mail team was told to disable
older protocols and ciphers “to align with compliance.” Nobody argued with the goal; they argued with the timeline.
The timeline won.
They shipped a configuration that effectively required TLS 1.3 for inbound SMTP. On paper, that looked progressive.
In practice, it turned their MX into a bouncer with a clipboard and no patience.
Some senders negotiated TLS 1.3 fine. Others—industrial devices, small business MTAs, and a few partner systems running old libraries—
could only do TLS 1.2. Delivery started failing intermittently. Not just spammy sources. Real invoices. Real ticket updates.
The incident was initially misclassified as “network instability” because it appeared random across senders.
The remediation was a controlled rollback: allow TLS 1.2, keep TLS 1.0/1.1 disabled, and use a sane cipher suite that still overlaps
with common OpenSSL builds. They also implemented monitoring that tracks TLS version distribution on inbound connections.
The lesson: hardening is not a single dial. It’s a negotiation. SMTP is an ecosystem, not your app.
Optimize for security without testing interop, and you’ll “securely” lose mail.
Mini-story 3: The boring, correct practice that saved the day (TLSRPT + staged enforcement)
Another company wanted to enforce encryption for inbound mail. They had legal requirements and a real threat model.
Instead of flipping every switch at once, they staged it like adults: observe, measure, then enforce.
They published TLS reporting and collected TLSRPT reports centrally. For weeks, the mail team reviewed failure categories.
Most were expected noise, but a few were real: one MX node presented the wrong chain after a package update;
another had an IPv6 address serving a stale certificate because automation only updated IPv4 VIPs.
They fixed the inconsistencies, added a daily job that checks every MX endpoint over IPv4 and IPv6,
and only then moved to MTA-STS enforce mode. When they finally enforced, the failure rate didn’t spike.
It actually dropped—because they removed misconfigurations that had been silently causing opportunistic TLS downgrades.
The lesson: boring observability beats heroic debugging. If you want strict TLS, earn it by making your endpoints boringly consistent first.
Joke #2: The only thing more persistent than a missing intermediate is the meeting invite to “circle back” on it next quarter.
Common mistakes: symptom → root cause → fix
1) “TLS handshake failed” with a certificate that looks fine in a browser
Symptom: Remote MTAs log certificate verify failed or unknown ca; your browser shows a lock.
Root cause: Missing intermediate in the chain. Browsers fetch; many MTAs don’t.
Fix: Serve fullchain.pem (leaf + intermediate) from your MTA. Do not rely on AIA fetching.
2) Works for some senders, fails for others (seemingly random)
Symptom: A subset of senders consistently fails STARTTLS validation; others succeed.
Root cause: SNI-dependent certificate selection, or different validation strictness across MTAs.
Fix: Test with and without SNI. Make default cert acceptable for your MX name or dedicate IPs.
3) “hostname mismatch” or “certificate does not match” errors
Symptom: Logs mention mismatch; delivery deferred or TLS downgraded.
Root cause: Certificate SAN doesn’t include the MX target hostname (common during migrations).
Fix: Re-issue the cert to include all MX hostnames in SAN; or adjust MX to match an existing cert name.
4) STARTTLS not offered, even though you configured certs
Symptom: EHLO output lacks 250-STARTTLS.
Root cause: TLS disabled in MTA config; chroot path issues; permission problems; or proxy stripping extensions.
Fix: Ensure TLS is enabled, cert/key readable by the MTA, and no middlebox modifies SMTP extensions.
5) Sudden failures after enabling MTA-STS
Symptom: Previously-delivered mail now defers from major providers; TLSRPT shows policy failures.
Root cause: Publishing MTA-STS turned opportunistic TLS into enforced validation, revealing name/chain inconsistencies.
Fix: Align certificates across all MX endpoints (including IPv6), verify names, fix chain presentation, then enforce again.
6) Only IPv6 delivery fails TLS
Symptom: IPv4 path works; IPv6 path shows wrong certificate or expired chain.
Root cause: Separate VIP, separate listener, stale automation, or different load balancer pool for AAAA.
Fix: Test both A and AAAA endpoints explicitly; unify automation so cert rollout covers both.
7) “no shared cipher” or “handshake failure” after hardening
Symptom: TLS fails at negotiation; logs show cipher mismatch or protocol alert.
Root cause: Over-hardening removed common overlap (or forced TLS 1.3 only).
Fix: Keep TLS 1.2; use a modern-but-compatible cipher list. Validate against a matrix of real sender stacks.
8) Certificate rotates, then partners break (especially with DANE)
Symptom: After rotation, a subset of senders refuse TLS even though the new cert is valid.
Root cause: TLSA records not updated (DANE), or partner pinning / cached expectations.
Fix: Update TLSA before or during rotation; plan overlap windows; communicate with strict partners.
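For the DANE case, the value to publish can be computed from the new certificate before it goes live. The sketch below assumes a "3 1 1" TLSA record (end-entity certificate, public key, SHA-256), which is a common choice for SMTP but not the only one; the function names tlsa_311_digest and tlsa_record are hypothetical, and sha256sum is assumed to be available.

```shell
# tlsa_311_digest CERT.pem: SHA-256 of the certificate's public key
# in DER form, i.e. the data field of a "3 1 1" TLSA record.
tlsa_311_digest() {
  openssl x509 -in "$1" -noout -pubkey \
    | openssl pkey -pubin -outform DER \
    | sha256sum | awk '{ print $1 }'
}

# tlsa_record HOST DIGEST: format the zone-file line for port 25.
tlsa_record() {
  printf '_25._tcp.%s. IN TLSA 3 1 1 %s\n' "${1%.}" "$2"
}
```

Usage: `tlsa_record mx1.example.com "$(tlsa_311_digest /etc/ssl/mail/fullchain.pem)"`, then compare against `dig +short TLSA _25._tcp.mx1.example.com`. Publish the new record alongside the old one, wait out the TTL, and only then switch certificates. If renewal reuses the same key pair, the digest does not change and no DNS update is needed.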
Checklists / step-by-step plan
Step-by-step: make an SMTP certificate actually work
- Inventory your public-facing names.
  - List all MX targets (including backup MX).
  - List submission/IMAP/POP hostnames if they share endpoints.
  - Include IPv6 AAAA endpoints.
- Issue certificates for the names people connect to.
  - Put every MX hostname in SAN.
  - Do not rely on CN-only behavior.
  - Decide whether you need wildcard SANs (usually avoid for mail; explicit names are cleaner).
- Deploy full chain correctly.
  - Use fullchain.pem (leaf + intermediates).
  - Do not include the root CA in the presented chain.
- Set the default certificate intentionally.
  - Assume some clients won’t send SNI.
  - Make the default cert valid for the primary MX name.
- Choose TLS versions with interop in mind.
  - Enable TLS 1.2 and TLS 1.3.
  - Disable TLS 1.0/1.1 unless you have a narrowly-scoped business need and compensating controls.
- Test from outside your network.
  - Run STARTTLS checks against each MX name and each IP family.
  - Test with and without SNI.
- Decide on enforcement mechanisms carefully.
  - Opportunistic TLS is fine for “baseline security.”
  - MTA-STS/DANE is for when you can keep your configuration consistent under change.
- Operationalize rotations.
  - Automate deployment and reloads.
  - Monitor expiry, chain correctness, and endpoint parity (IPv4/IPv6).
Checklist: before you publish MTA-STS enforce mode
- Every MX endpoint (all A/AAAA addresses) presents the same correct certificate chain.
- Certificate SAN includes every MX target hostname.
- Default certificate is acceptable when SNI is absent.
- TLS 1.2 works everywhere; TLS 1.3 works where supported.
- No middleboxes strip STARTTLS or tamper with negotiation.
- You have logging and a process for investigating TLS failures quickly.
Checklist: partner “TLS required” connector troubleshooting
- Get the exact hostname and port the partner connects to.
- Get the exact error text and whether it’s during handshake, name check, or chain validation.
- Test your side with openssl s_client -starttls smtp against that hostname.
- If you terminate TLS behind a load balancer, validate the LB cert and SNI behavior.
- Verify the partner isn’t pinning an old certificate or intermediate.
FAQ
1) Why does my certificate validate in a browser but fail for SMTP?
Browsers often fetch missing intermediates and have rich trust stores. Many MTAs don’t fetch intermediates and rely on what you present.
Also, SMTP peers validate the MX hostname they connect to, not whatever you happened to test in a browser.
2) Do I need a separate certificate for each MX server?
Not strictly. You can use one certificate with SAN entries for all MX hostnames, deployed everywhere.
Operationally, that’s often simpler—one issuance pipeline, one renewal schedule—if you can keep distribution tight.
3) Should I use a wildcard certificate for mail?
Usually: avoid it unless you have a strong reason. Wildcards don’t cover multi-label names and can hide sloppy inventory.
Explicit SANs make audits and policy enforcement less surprising.
4) Is port 465 required for secure email?
For server-to-server mail: no, port 25 with STARTTLS is the norm. For client submission, port 587 is standard with STARTTLS,
and port 465 is commonly used for implicit TLS submission. Use what your clients support, but don’t move MX traffic off 25.
5) What’s the difference between opportunistic TLS and enforced TLS in SMTP?
Opportunistic TLS upgrades when possible and can fall back to plaintext if negotiation fails. Enforced TLS (via MTA-STS, DANE,
or partner connectors) refuses delivery if TLS can’t be validated to policy.
6) Do all MTAs validate hostnames in certificates?
No, but more do than they used to—especially under MTA-STS or when configured for “TLS required.”
You must assume some peers will validate strictly and configure accordingly.
7) What’s the single most common misconfiguration you see?
Serving only the leaf certificate without the intermediate. It’s the classic “works on my laptop” bug in PKI clothing.
8) If we use DANE, can we skip public CAs entirely?
Potentially yes, if your domain is properly DNSSEC-signed and your peers validate DNSSEC and DANE for SMTP.
In practice, you need to understand your correspondent ecosystem before betting deliverability on universal DNSSEC validation.
9) Why do some senders connect to the IPv6 address and fail, while IPv4 works?
Because your AAAA points to a different listener or pool with a stale certificate, wrong chain, or different default cert.
Test and manage IPv6 as a first-class endpoint, not a checkbox.
10) Should I include the root certificate in the chain I present?
No. Present the leaf and the intermediates needed to build to a trusted root. Including the root is unnecessary and sometimes harmful.
Conclusion: practical next steps
“Valid certificate” is a comforting phrase, like “the server is up.” It’s also not an answer.
Email TLS fails in the cracks: wrong hostname, incomplete chain, SNI mismatch, policy enforcement, and inconsistent endpoints.
Fixing it is less about buying a better certificate and more about making identity, DNS, and deployment agree everywhere.
- Run STARTTLS tests against every MX name (and both IPv4/IPv6). Capture the presented chain and SANs.
- Make the certificate match how the world connects: SANs for all MX targets, correct default cert, consistent deployment.
- Serve the full chain (leaf + intermediates) and verify with non-browser tooling.
- Keep TLS 1.2 enabled; add TLS 1.3; avoid hardening that removes common overlap without evidence.
- If you want enforcement (MTA-STS/DANE), earn it by making your endpoints boringly consistent before you flip the switch.
Do those five things and most “mysterious TLS mail failures” stop being mysterious. They become what they always were:
configuration drift, identity mismatch, and optimistic assumptions meeting the real world.