Someone in Sales can’t send a 28 MB proposal, Legal can’t receive a 35 MB contract bundle, and Support is drowning in “552 Message size exceeds fixed maximum message size” tickets.
Meanwhile you’re staring at a mail queue that looks calm today, but you know what happens the minute you loosen the belt: your MTA becomes a free file-transfer service for every opportunistic spammer on the internet.
Raising message size limits is not hard. Doing it safely—end-to-end, without breaking delivery, without melting disk, without opening the abuse floodgates—is a systems job.
The limits live in more places than your mail admin remembers, and the failure modes are mostly boring until they’re catastrophic.
What actually limits email size (and why “just raise it” fails)
Email message size isn’t a single knob. It’s a chain of constraints: client → submission server → inbound gateway → content filters → mailbox store → outbound gateway → remote MX.
If any link rejects, the sender gets an NDR, a bounce, or (worse) a silent defer that turns into a stuck queue. The trick is to identify which link is the real constraint, then raise it only as far as your system and risk model allow.
The size you see is not the size you send
Attachments are usually base64-encoded, which adds roughly 33% overhead. Then there’s MIME structure, headers, and sometimes re-encoding by gateways.
A “25 MB attachment limit” often translates to something like a 33–36 MB message size limit in the MTA. If you don’t account for that, you’ll “raise it to 30 MB” and still block 23 MB PDFs that become 31 MB on the wire.
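If you want to stop guessing at the overhead, measure it. A minimal sketch, assuming a representative sample file at /tmp/proposal.pdf (the path is hypothetical; use any real attachment your users actually send):
cr0x@server:~$ stat -c %s /tmp/proposal.pdf
cr0x@server:~$ base64 /tmp/proposal.pdf | wc -c
The second number lands around 4/3 of the first, plus line breaks; add headers and MIME boundaries on top and you end up near the ~1.35x planning factor used later in this article.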
Where the limits hide
- MUA (client): Outlook, mobile clients, and webmail can enforce UI limits or server-provided policy.
- Submission (MSA): the authenticated SMTP endpoint may have stricter limits than inbound MX.
- MTA: Postfix/Exim/Sendmail/Exchange transport rules, plus per-connector caps.
- SMTP proxies: load balancers, TLS terminators, mail proxies (and “helpful” middleboxes) can cap request body size or timeouts.
- Filtering: antivirus, DLP, sandboxing, and spam filters may refuse large payloads or time out scanning them.
- Mailbox store: per-mailbox quotas, per-message limits, and database constraints.
- Remote side: your outbound limit doesn’t matter if the destination rejects anything above 10–25 MB.
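A quick way to inventory several of these layers from one shell session, assuming a Postfix + Dovecot + Nginx stack (substitute your own components; the effective limit is the minimum across the chain):
cr0x@server:~$ postconf message_size_limit mailbox_size_limit
cr0x@server:~$ doveconf -n | grep -i quota
cr0x@server:~$ nginx -T 2>/dev/null | grep client_max_body_size
None of these answers is authoritative on its own, but the exercise usually surfaces at least one cap nobody remembered setting.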
One quote to keep handy for change reviews: “Hope is not a strategy.” — Gene Kranz
Here’s the uncomfortable reality: raising your limit doesn’t guarantee delivery. It only guarantees you’ll accept more data, hold it longer, and spend more CPU and disk on it before learning the remote site won’t take it.
That’s why “safe” is about controls, observability, and fallback paths, not just bigger numbers.
Facts and historical context you can use in meetings
These are the small, concrete points that help you explain why you can’t just turn email into Dropbox.
- MIME attachments weren’t in the original email design. Early email was mostly plain text; MIME (Multipurpose Internet Mail Extensions) arrived later to standardize attachments and rich content.
- Base64 encoding inflates data. Binary files are encoded into ASCII-safe text for transport, costing about a third in size overhead in common cases.
- “SIZE” is an SMTP extension, not a guarantee. Many MTAs advertise a maximum message size in their EHLO response, but intermediaries and downstream components can still reject later.
- 25 MB became a social default, not a standard. Many providers converged around 10–25 MB because it balances usability and abuse risk; it’s operational convenience with a tuxedo on.
- Large messages are an availability risk. One giant message can tie up scanning, create long SMTP sessions, and increase queue latency; it’s not just “more storage.”
- Mailstores historically optimized for many small items. Lots of small messages are the normal workload; giant attachments create different IO patterns, compaction behavior, and index costs.
- Some filters decompress attachments. A “small” zip can inflate during scanning; limits may apply to the expanded size, not just the raw SMTP payload.
- Timeouts matter more as size increases. A 40 MB message on a slow client uplink can hold an SMTP transaction open long enough to hit per-connection limits or reverse proxy idle timeouts.
Joke #1: Email is the only file transfer protocol that apologizes to you while failing.
A decision framework: should you raise limits at all?
Raising limits is sometimes the right move. Sometimes it’s the wrong move dressed up as customer service. Decide like an operator, not like someone trying to close a ticket.
When raising limits makes sense
- Business-critical workflows rely on email. External parties won’t use your portal, and you can’t mandate a different channel.
- You have strong inbound controls. Modern spam filtering, malware scanning, and rate limiting are in place and monitored.
- Your storage and queue systems are sized for worst-case. “Normal day” doesn’t count; you need “marketing blast + incident + remote outages” day.
- You can tolerate nondelivery for some recipients. You’ll still hit remote limits; users must accept fallbacks (links, shared drives, secure file transfer).
When raising limits is a trap
- You’re trying to compensate for broken file sharing. Fix the file sharing. Email is not your backup plan for collaboration tooling.
- You don’t control your outbound reputation. Larger messages can increase retries, queue time, and spam score behavior; deliverability can get weird.
- Your filtering stack already runs hot. If your AV sandbox is near saturation, bigger payloads turn “near saturation” into “down.”
Set two limits, not one
Operators who survive set separate limits for:
- Inbound from the internet (stricter; higher abuse risk).
- Authenticated outbound/submission (slightly higher; still controlled by rate limits and user policies).
Then they create exceptions for specific senders/receivers or domains, not a global “everyone gets 100 MB now” party.
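In Postfix, that split is usually a conservative global message_size_limit for the public listener plus a per-service override on submission in master.cf. A minimal sketch; the byte values are illustrative, not recommendations:
# /etc/postfix/main.cf: public port 25 stays conservative (~25 MB wire)
message_size_limit = 26214400

# /etc/postfix/master.cf: authenticated submission gets its own, higher cap (~40 MB wire)
submission inet n       -       y       -       -       smtpd
  -o syslog_name=postfix/submission
  -o smtpd_sasl_auth_enable=yes
  -o message_size_limit=41943040
Reload Postfix and verify the advertised SIZE on both ports; the two listeners should now tell different truths on purpose.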
Fast diagnosis playbook (first / second / third checks)
If you’re on-call and someone screams “large emails are failing,” don’t spelunk configs first. Locate the bottleneck.
The winning move is to find the first component that knows the truth: the component returning the SMTP code.
First: identify where it fails (and get the SMTP code)
- Ask for the bounce / NDR text or message headers. You’re hunting for codes like 552, 554, 451 plus “message too big” strings.
- Check MTA logs for the transaction: did you reject at SMTP DATA time, or accept then bounce later?
- If the sender is internal, reproduce with swaks or a controlled test from a known host.
Second: decide if it’s an advertised limit, a proxy, or a filter timeout
- Immediate 552 on DATA: a configured max size in the receiving MTA/proxy.
- Accept then later bounce: downstream filter/mailstore rejection or content scanning result.
- Long delay then timeout: proxy idle timeout, content scanner backlog, or slow disk.
Third: check queue health and disk pressure
- Look for queue growth, deferred messages, and retry storms.
- Confirm disk space and inode availability; large messages can create many temporary files.
- Check IO wait; large message processing is often a disk throughput/latency issue.
If you can’t answer “where is it failing?” in five minutes, you don’t have an email size problem; you have an observability problem.
Practical tasks with commands, outputs, and decisions (12+)
The goal of these tasks is not to memorize commands. It’s to build a repeatable way to prove where the limit lives, whether your system can handle the new size, and whether abuse controls still work.
Task 1: Confirm Postfix maximum message size
cr0x@server:~$ postconf message_size_limit
message_size_limit = 10485760
What the output means: Postfix will reject messages larger than 10,485,760 bytes (~10 MB) at SMTP DATA.
Decision: If you’re raising to “25 MB attachments,” consider setting message_size_limit to ~35–40 MB to account for base64/MIME overhead, and validate downstream limits first.
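If the answer is yes, the mechanical change is one parameter and a reload. A sketch using ~36 MB (37748736 bytes) as an illustrative target:
cr0x@server:~$ sudo postconf -e 'message_size_limit = 37748736'
cr0x@server:~$ sudo systemctl reload postfix
cr0x@server:~$ postconf message_size_limit
message_size_limit = 37748736
If you deliver with the Postfix local agent, raise mailbox_size_limit to at least the same value; a mailbox_size_limit smaller than message_size_limit is treated as a configuration error.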
Task 2: Check per-message and quota limits in Dovecot (common for IMAP/LMTP backends)
cr0x@server:~$ doveconf -n | egrep 'quota_max_mail_size|quota ='
  quota = maildir:User quota
What the output means: No quota_max_mail_size is set, so Dovecot isn’t enforcing a per-message cap at delivery, but mailbox quotas still apply via the quota plugin.
Decision: If quota_max_mail_size is set and lower than your new target, raise it, or you’ll accept via SMTP and then fail at delivery to the mailbox.
Task 3: Identify the effective size advertised via SMTP (SIZE extension)
cr0x@server:~$ swaks --server mx1.example.net --ehlo test.example --quit-after EHLO
=== Trying mx1.example.net:25...
=== Connected to mx1.example.net.
< 220 mx1.example.net ESMTP Postfix
> EHLO test.example
< 250-mx1.example.net
< 250-SIZE 10485760
< 250-PIPELINING
< 250-STARTTLS
< 250 HELP
=== Connection closed with remote host.
What the output means: The server advertises a 10 MB limit via SMTP extension.
Decision: If you raise message_size_limit but clients still see the old SIZE, they may refuse to send. Postfix derives the advertised SIZE from message_size_limit once reloaded, but a proxy or a different listener in front can advertise its own value; make sure what’s advertised matches your real receiving limit.
Task 4: Verify Nginx (or another proxy) isn’t capping SMTP/HTTP upload paths
cr0x@server:~$ nginx -T 2>/dev/null | egrep 'client_max_body_size|proxy_read_timeout|proxy_send_timeout'
client_max_body_size 10m;
proxy_read_timeout 60s;
proxy_send_timeout 60s;
What the output means: If this Nginx fronts webmail or API-based submission, it caps requests at 10 MB and has short timeouts.
Decision: Increase body size and timeouts if large messages traverse this proxy, or accept that webmail users will still fail even if SMTP works.
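If large messages really do traverse this proxy (webmail uploads, API-based submission), the Nginx side is a small change; the values below are illustrative and should track your chosen wire-size limit:
# inside the relevant server/location block
client_max_body_size 40m;
proxy_read_timeout   300s;
proxy_send_timeout   300s;
Reload Nginx (nginx -s reload) and retest with a genuinely large upload; caps also hide in included per-vhost files, which nginx -T will show you.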
Task 5: Check Postfix queue depth and deferred messages
cr0x@server:~$ mailq | head -n 20
-Queue ID- --Size-- ----Arrival Time---- -Sender/Recipient-------
9D2F21C03A     84231 Mon Jan  4 10:12:41  alerts@example.net
(connect to mx.remote.tld[203.0.113.10]:25: Connection timed out)
                                          user@example.com

-- 83 Kbytes in 1 Request.
What the output means: You have at least one deferred delivery due to remote connection timeouts; queue isn’t huge here, but you need the full picture.
Decision: If queue grows after raising size, you may be accepting large messages that can’t be delivered. Add policy: reject early for known-small remote limits or provide user guidance.
Task 6: Summarize Postfix queue by status (active vs deferred)
cr0x@server:~$ postqueue -p | egrep -c '^[A-F0-9]'
42
What the output means: There are 42 queued entries (rough count based on queue ID lines).
Decision: If that number spikes during large-send events, you need throttling, better delivery routing, or lower limits for outbound to the internet.
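For a more useful view than a raw count, qshape (shipped with Postfix, sometimes in a separate postfix-perl-scripts package) buckets the queue by recipient domain and message age:
cr0x@server:~$ qshape deferred | head -n 10
cr0x@server:~$ qshape -s deferred | head -n 10
One remote domain dominating the deferred buckets is a destination problem, not a size-limit problem; treat it with routing and throttling, not a bigger cap.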
Task 7: Watch log evidence of size-based rejects
cr0x@server:~$ sudo journalctl -u postfix --since "30 min ago" | egrep 'message size|too large|552' | tail -n 5
Jan 04 10:14:22 mx1 postfix/smtpd[23144]: NOQUEUE: reject: RCPT from mail.remote.tld[203.0.113.55]: 552 5.3.4 Message size exceeds fixed maximum message size; from=<sender@remote.tld> to=<user@example.com> proto=ESMTP helo=<mail.remote.tld>
What the output means: The reject is happening at your SMTP daemon. That’s a local configured cap, not a downstream mailstore issue.
Decision: Raise the relevant receiving limit (and confirm you’re not missing a stricter per-service setting like submission vs inbound).
Task 8: Confirm disk space and inode availability (large mail stresses both)
cr0x@server:~$ df -h /var/spool/postfix /var/mail
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 200G 176G 15G 93% /
/dev/sdb1 500G 120G 355G 26% /var/mail
What the output means: Spool lives on / and is at 93% usage. That’s a queue outage waiting to happen.
Decision: Move spool to a dedicated filesystem or expand capacity before raising size limits; large messages increase transient spool usage significantly.
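Space is only half the story; a maildir-style store or a scanner that explodes archives into temp files can run out of inodes while gigabytes remain free. Check both:
cr0x@server:~$ df -i /var/spool/postfix /var/mail
If IUse% on the spool or mailstore is already high, fix that before making individual messages bigger.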
Task 9: Check IO pressure during message ingestion
cr0x@server:~$ iostat -xz 1 3
Linux 6.5.0 (mx1) 01/04/2026 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
10.21 0.00 4.12 32.55 0.00 53.12
Device r/s rkB/s rrqm/s %rrqm r_await rareq-sz w/s wkB/s wrqm/s %wrqm w_await wareq-sz aqu-sz %util
sda 3.00 64.00 0.00 0.00 2.10 21.33 220.00 18432.00 12.00 5.17 45.30 83.78 9.96 98.00
What the output means: High %iowait and near-100% disk utilization indicate the server is IO-bound. Large message processing will amplify this.
Decision: Before increasing size limits, fix IO (faster storage, separate spool, tune scanning concurrency). Otherwise you’ll turn a “sometimes slow” server into a “mostly down” server.
Task 10: Confirm Postfix recipient restrictions and auth separation (abuse control baseline)
cr0x@server:~$ postconf -n | egrep 'smtpd_recipient_restrictions|smtpd_client_restrictions|smtpd_tls_security_level|smtpd_sasl_auth_enable'
smtpd_client_restrictions = permit_mynetworks, reject_unknown_client_hostname
smtpd_recipient_restrictions = permit_mynetworks, permit_sasl_authenticated, reject_unauth_destination
smtpd_tls_security_level = may
smtpd_sasl_auth_enable = yes
What the output means: Relaying is restricted (good), and authenticated users are allowed through. TLS is opportunistic (may), which might be acceptable depending on policy.
Decision: If you raise message sizes for authenticated submission, keep inbound anonymous limits stricter. And make sure you’re not accidentally allowing open relay anywhere.
Task 11: Spot rate limiting gaps (connection and message concurrency)
cr0x@server:~$ postconf -n | egrep 'smtpd_client_connection_count_limit|smtpd_client_message_rate_limit|smtpd_client_recipient_rate_limit'
smtpd_client_connection_count_limit = 50
smtpd_client_message_rate_limit = 0
smtpd_client_recipient_rate_limit = 0
What the output means: Connection count is limited, but message and recipient rate limits are disabled (0). If you raise size limits, abuse can get expensive fast.
Decision: Implement rate limits for anonymous inbound and consider separate policies per listener (inbound MX vs submission). Bigger messages raise the cost per message; treat them like a scarce resource.
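A minimal Postfix sketch that closes the gap, using the built-in anvil rate limits; the numbers are illustrative starting points, applied per client per anvil_rate_time_unit:
# /etc/postfix/main.cf
anvil_rate_time_unit = 60s
smtpd_client_connection_count_limit = 20
smtpd_client_message_rate_limit = 30
smtpd_client_recipient_rate_limit = 100
Note that clients in mynetworks are exempt by default (smtpd_client_event_limit_exceptions), so these mostly constrain anonymous internet senders; authenticated submission needs its own per-user policy on top.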
Task 12: Measure the largest messages you’re already processing (reality check)
cr0x@server:~$ sudo find /var/spool/postfix/deferred -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n 5
38912044 /var/spool/postfix/deferred/9/9D2F21C03A
22100412 /var/spool/postfix/deferred/1/1A7B0D11F2
18000210 /var/spool/postfix/deferred/C/C2D11E9AA0
15011401 /var/spool/postfix/deferred/4/45B2A9B1C3
12000877 /var/spool/postfix/deferred/7/7CC9F1022B
What the output means: You already have ~39 MB items sitting deferred. Either your limit is higher than you think on ingress, or these grew later due to internal handling.
Decision: Don’t raise limits blindly. First, understand why large items are present: accepted before a config change, injected locally, or coming via a different listener with different caps.
Task 13: Validate end-to-end delivery to a known external provider (the uncomfortable truth)
cr0x@server:~$ swaks --server submission.example.net --port 587 --auth LOGIN --auth-user user@example.com --auth-password 'REDACTED' --tls --to external-test@remote.tld --from user@example.com --attach /tmp/25mb.bin
=== Trying submission.example.net:587...
=== Connected to submission.example.net.
< 220 submission.example.net ESMTP Postfix
> EHLO mx1.example.net
< 250-submission.example.net
< 250-SIZE 41943040
< 250-STARTTLS
< 250-AUTH PLAIN LOGIN
< 250 HELP
...
< 250 2.0.0 Ok: queued as 3F1A2B9C0D
What the output means: Submission advertises ~40 MB, and the message was accepted and queued. This is only half the story.
Decision: Track the queue ID through outbound delivery logs. If it later defers or bounces due to remote limits, you need user-facing guidance and possibly lower outbound size for internet destinations.
Task 14: Track that queue ID until delivered or bounced
cr0x@server:~$ sudo journalctl -u postfix --since "10 min ago" | egrep '3F1A2B9C0D' | tail -n 8
Jan 04 10:20:18 mx1 postfix/qmgr[1022]: 3F1A2B9C0D: from=<user@example.com>, size=35781234, nrcpt=1 (queue active)
Jan 04 10:20:49 mx1 postfix/smtp[24110]: 3F1A2B9C0D: to=<external-test@remote.tld>, relay=mx.remote.tld[203.0.113.10]:25, delay=31, delays=0.2/0.1/10/21, dsn=5.2.3, status=bounced (host mx.remote.tld[203.0.113.10] said: 552 5.2.3 Message size exceeds fixed maximum message size (in reply to end of DATA command))
What the output means: Your system accepted and tried to deliver a ~35.8 MB message, but the remote rejected it with 552 at end-of-DATA.
Decision: No amount of local limit raising fixes remote policies. Provide alternatives: file-sharing links, secure transfer, or domain-specific routing (if you have trusted partners with higher caps).
Task 15: Confirm content filter performance and backlog (example with Amavis-like log grep)
cr0x@server:~$ sudo journalctl --since "30 min ago" | egrep 'amavis|clamd|timeout|too large' | tail -n 8
Jan 04 10:18:03 mx1 amavis[1888]: TIMING [abc123]: smb=0.012 (0.012+0.000) out=0.004 (0.004+0.000) m=1.234 (1.221+0.013)
Jan 04 10:18:04 mx1 amavis[1888]: (1888-12) WARN: timed out while scanning, quarantined
Jan 04 10:18:05 mx1 postfix/smtp[24002]: warning: conversation with content-filter.example.net[192.0.2.25] timed out while sending message body
What the output means: Scanning timed out, and Postfix timed out sending the message to the content filter. Bigger messages will trigger more of this.
Decision: Tune filter timeouts and concurrency, scale the filter tier, or keep limits conservative. Your bottleneck is scanning, not SMTP.
Task 16: Verify TLS and session timeouts that matter more for large payloads
cr0x@server:~$ postconf -n | egrep 'smtp_data_xfer_timeout|smtpd_timeout|smtpd_data_restrictions'
smtp_data_xfer_timeout = 180s
smtpd_timeout = 60s
smtpd_data_restrictions = reject_unauth_pipelining
What the output means: Data transfer timeout for outbound is 180s; inbound session timeout is 60s. Large uploads from slow clients may be cut off.
Decision: Increase timeouts carefully on submission (authenticated users) if necessary, but keep inbound MX tighter to reduce abuse exposure and resource hogging.
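Per-listener overrides keep that separation explicit. A sketch, assuming the usual submission entry already exists in master.cf:
# /etc/postfix/master.cf: add under the existing submission service entry
  -o smtpd_timeout=300s
Leave the port 25 smtpd_timeout alone; the slow anonymous sessions you would be waiting for are mostly not the ones you want.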
Abuse controls when you increase limits
When you raise size limits, you’re raising the cost of each accepted message: disk, CPU, scanning time, and retry bandwidth.
Attackers love expensive operations. You’re offering them a pricier toy.
Separate trust zones: inbound MX vs authenticated submission
Do not apply the same policy to anonymous internet senders and authenticated employees. They are not the same threat model.
A clean setup:
- Inbound MX: lower size limit, strict timeouts, aggressive rate limiting, reject early.
- Submission (587/465): higher limit, per-user throttles, strong auth, monitoring tied to identity.
Rate limiting that actually matters
For large messages, rate limiting isn’t just “messages per minute.” It’s also:
- Concurrent connections per client IP (prevents upload swarms).
- Concurrent deliveries per domain (prevents retry storms to a failing remote).
- Bytes per unit time (many MTAs don’t do this directly; you approximate with session limits and message rates).
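Postfix approximates the last two with its per-destination scheduling parameters; a sketch with illustrative values:
# /etc/postfix/main.cf
# cap parallel deliveries to any single destination domain
default_destination_concurrency_limit = 10
# optional pacing: at most one delivery per destination per interval
# default_destination_rate_delay = 1s
There is no direct bytes-per-second-per-destination knob in stock Postfix; concurrency limits and rate delay are the levers you actually have.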
Early rejects: cheap is beautiful
The best rejection is the one you make before you write 80 MB to disk and run it through three scanners.
If you know a partner domain only accepts 10–15 MB, don’t accept 40 MB destined there and then bounce later. That’s operationally rude.
Use transport maps or policy services to enforce destination-based caps if your MTA supports it.
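Stock Postfix has no per-destination size cap, so “reject early for that partner” usually means a policy service consulted at RCPT time, where both the SIZE value announced in MAIL FROM and the recipient domain are visible. A sketch of the hookup only; the sizepolicy service itself is hypothetical and would be something you run separately (postfwd or a small custom check_policy_service daemon):
# /etc/postfix/main.cf (hookup only; the policy daemon is your own)
smtpd_recipient_restrictions =
    permit_mynetworks,
    permit_sasl_authenticated,
    reject_unauth_destination,
    check_policy_service unix:private/sizepolicy
Caveat: the size the policy service sees is whatever the client announced; the true size is only known at end of DATA, so this is an optimization for honest clients, not a guarantee.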
Identity-based controls for outbound
Once you allow bigger outbound messages, account compromise becomes more expensive. Add:
- Per-user daily send limits (messages and recipients).
- Attachment-type controls (block executables, risky archives).
- Alerts on unusual large-send patterns: “User who never sends attachments just sent 60 messages with 30 MB each.”
Joke #2: The quickest way to learn your size limit is too high is to watch a spammer treat your SMTP server like a free CDN.
Storage and performance: big mail is a disk problem wearing a tie
Most “we raised the limit and email got flaky” incidents are not SMTP bugs. They’re resource problems:
disk fills, IO latency spikes, scanners time out, queues swell, then retries create more load. A classic positive feedback loop, just with fewer graphs and more angry executives.
Spool vs mailstore: keep them apart
The Postfix queue/spool is a transient working set. It wants:
- predictable latency (low await times),
- headroom (space for bursts),
- fast fsync behavior (journaling tuned sanely),
- and isolation from mailstore compactions and backups.
If your spool shares a filesystem with OS logs, container images, or “temporary analytics exports,” you’re running your mail server on vibes.
Put the spool on dedicated storage, sized for worst-case deferred queues.
Large messages change the IO pattern
Small mail is metadata-heavy: lots of small files, directory lookups, indexes. Large mail adds throughput-heavy writes and reads.
Your system now needs both: good metadata performance and bulk throughput. On spinning disks, this is where dreams go to die.
Content scanning is the hidden CPU/disk tax
Antivirus and DLP scanning can:
- decompress archives (expanding workload),
- reassemble MIME parts,
- run heuristics that scale with size,
- and hold messages in temp storage while scanning.
If you raise size limits, you should treat scanning capacity like a first-class dependency. Otherwise you’ll accept mail and then stall in your own filter chain.
Timeouts aren’t “just network”
Large messages increase time spent in every phase: upload, filter handoff, scanning, queueing, delivery. A timeout that was “safe” at 10 MB may be wrong at 40 MB.
But don’t solve this by making every timeout huge. That’s how you let slowloris-style SMTP sessions hog sockets all day.
Tight timeouts on anonymous inbound, looser on authenticated submission—repeat it until it becomes policy.
Three corporate mini-stories (what actually happens)
Mini-story 1: The incident caused by a wrong assumption
A mid-sized company ran a clean-looking mail stack: inbound gateways, Postfix, content filtering, mailbox store. Support tickets came in: “Can’t email 20–30 MB files to customers.”
The mail admin raised message_size_limit on the inbound gateways and called it done. Everyone applauded.
Two days later, outbound queues grew. Not crazy at first—just a steady climb. Then Monday morning hit. Users were sending bigger attachments to partners, the gateway accepted them, and outbound delivery started bouncing with 552 errors from multiple recipient domains.
Now the queue contained a pile of large, undeliverable messages. Each retry attempt created more log noise, more disk churn, and more confusion.
The wrong assumption was simple: “If we accept bigger mail, it will get delivered.” The internet does not sign your internal policy document.
Many recipient domains enforce 10–25 MB caps and don’t negotiate. Your MTA will discover that only after reading the full DATA payload and attempting delivery.
The fix wasn’t “raise more limits.” The fix was to enforce destination-aware policies and to provide a sanctioned alternative for file transfer.
They ended up keeping higher limits only for a handful of trusted partner domains (with confirmed higher caps), while the general internet limit stayed conservative.
The real lesson: you don’t raise a limit; you change traffic shape. And traffic shape changes your failure modes.
Mini-story 2: The optimization that backfired
Another org tried to be clever. They had antivirus scanning delays, so they “optimized” by skipping deep scanning for messages above a certain size.
The logic sounded pragmatic: “Big PDFs from customers are usually fine, and scanning them takes forever. We’ll reduce latency.”
It worked for a week. Mail latency dropped. The queue looked healthier. Then they got hit with a campaign that used large attachments as a delivery vehicle—big enough to evade their shortcut and to slow down manual review.
A few compromised endpoints later, the security team got involved, and suddenly the email team was re-living the entire change approval meeting, but with more people and fewer jokes.
The optimization was based on a false premise: that “size correlates with safety.” Attackers can afford bytes. You can’t afford trust.
Large attachments are better for hiding payloads and exhausting systems; they’re not “probably safe.”
The eventual fix was boring: scale the scanning tier properly, keep consistent scanning policy, and add sane caps and timeouts so one message couldn’t monopolize resources.
They also implemented per-sender throttles to prevent “one compromised account sends 500 giant attachments” from becoming an infrastructure problem.
Mini-story 3: The boring but correct practice that saved the day
A regulated enterprise wanted to raise message size for authenticated submissions because a specific vendor only communicated via email and routinely sent large annotated documents.
The messaging team did something profoundly unfashionable: they staged it in a test environment that mirrored the real mail path, including the content filters and mailbox quotas.
They ran load tests: ten concurrent 35 MB sends, then twenty, then a mixed workload with normal mail plus large messages.
They watched disk IO, filter latencies, queue sizes, and mailbox write performance. They didn’t assume. They measured.
During testing they discovered an unexpected choke point: a temp filesystem used by the DLP scanner had a small size cap and default mount options. Under burst load, it filled and started failing scans.
In production this would have looked like random delivery deferrals, which is the worst kind of outage: intermittent enough to be doubted.
They fixed the temp storage sizing, separated spool storage, and set per-user throttles. When they finally raised the limit, nothing dramatic happened.
The change worked, users got their bigger emails, and on-call kept their weekend.
The practice that saved them wasn’t a fancy tool. It was end-to-end testing with realistic bottlenecks included.
Checklists / step-by-step plan
Step-by-step plan to raise limits safely
- Clarify the business requirement in bytes, not vibes. “25 MB attachments” is not a technical value. Decide target attachment size and compute wire size with overhead (plan ~1.35x for base64/MIME).
- Map your real mail path. Draw the chain: client → submission → gateways → filters → mailstore → outbound. Include proxies, DLP, archiving, journaling, and backups.
- Find current caps at every layer. MTA settings, proxies, filters, mailbox limits, quotas, remote connector caps.
- Pick separate limits by trust zone. Inbound MX cap lower; authenticated submission can be higher with per-user controls.
- Confirm scanning and temp storage behavior. Know where large messages are staged during scanning and what fills first (disk space, inodes, RAM, tempfs).
- Capacity plan the spool. Worst-case queue sizing: consider remote outages where messages defer for hours. Big messages make “hours” expensive.
- Implement rate limits and identity controls before raising size. Don’t ship a bigger attack surface without guardrails.
- Raise limits in small increments. Example: 10 MB → 20 MB → 35–40 MB wire. Observe after each step.
- Test end-to-end with real clients and SMTP tools. Validate both submission and inbound. Validate delivery to common external domains.
- Update user guidance. Make the “what to do when it bounces” path explicit: link-sharing, compression guidance, secure transfer options.
- Add dashboards and alerts specific to large-mail stress. Queue depth, deferred size distribution, filter timeouts, disk IO wait, temp filesystem usage.
- Post-change review. Look for shifts: increased average message size, queue churn, increased retry rates, increased scanning latency.
Go/no-go checklist for the change window
- Spool filesystem has at least 30–40% free space and healthy IO latency under load.
- Content filter tier has spare capacity; no timeouts in logs under test load.
- Submission and inbound listeners have clearly different policies.
- Rate limits are set and tested (including behavior for false positives).
- Monitoring exists for queue growth, deferred counts, and disk usage trends.
- A rollback plan is written and quick (single config revert + reload).
- User comms are ready: “Some external recipients may still reject messages above X MB; use Y instead.”
Common mistakes: symptom → root cause → fix
1) Symptom: Users still can’t send large attachments after you raised Postfix
Root cause: The client or submission service enforces a smaller cap than your inbound MX, or a proxy caps request size.
Fix: Check the authenticated submission listener (587/465) separately, confirm SMTP SIZE advertisement via swaks, and verify any webmail/proxy client_max_body_size or equivalent.
2) Symptom: Mail is accepted but bounces hours later
Root cause: Remote domains reject large messages (552) and your server retries until bounce; or downstream filter/mailstore rejects post-acceptance.
Fix: Track queue IDs. If remote rejects, set destination-aware guidance/limits. If internal components reject, align caps across filter and mailstore so you reject early at SMTP.
3) Symptom: Queue size explodes after raising limit
Root cause: You’re accepting expensive messages that can’t be delivered quickly; retries accumulate. Often triggered by external outages, bad routes, or remote throttling.
Fix: Add outbound concurrency controls, domain throttling, and conservative internet outbound caps. Consider lower max sizes for internet routes and higher caps for trusted partners.
4) Symptom: Random “timeout” errors in logs during large sends
Root cause: Proxy idle timeouts, content filter backlog, or disk IO saturation causing slow processing.
Fix: Increase timeouts on submission paths only, scale filter capacity, and fix IO bottlenecks. Don’t just crank timeouts globally; you’ll invite resource hogging.
5) Symptom: Disk fills even though average mail volume didn’t change much
Root cause: A few large messages can consume disproportionate spool space, especially when deferred or duplicated across scanning/journaling pipelines.
Fix: Size spool for bursts, separate spool and mailstore, monitor largest queued items, and set policy to reject or divert oversized mail early.
6) Symptom: Some users can send big mail, others can’t
Root cause: Different submission endpoints with different limits (e.g., regional gateways), or per-user policy/connector differences.
Fix: Inventory all listeners and connectors; standardize policy. Document exceptions explicitly instead of discovering them via user complaints.
7) Symptom: Security team reports increased malware detections after raising limits
Root cause: Larger attachments expand the threat surface; if scanning was skipped or timeouts increased without scaling, more risky content slips through or gets quarantined inconsistently.
Fix: Keep scanning consistent, scale scanning resources, tune timeouts carefully, and implement attachment type policies and sandboxing where appropriate.
8) Symptom: IMAP/webmail feels slow after the change
Root cause: Mailstore indexing, search, and client sync now deal with larger blobs; storage tier might be struggling with read amplification.
Fix: Revisit mailstore storage and indexing strategy, add caching where appropriate, and consider limiting large attachments to a maximum even if SMTP can accept more.
FAQ
1) If we set our limit to 100 MB, will partners be able to receive 100 MB?
Not reliably. Many remote domains will reject long before that, often around 10–25 MB. Your server will accept, queue, retry, then bounce.
If you need guaranteed transfer, use a file-sharing mechanism and email a link.
2) How do I translate “attachment limit” into MTA bytes?
Plan for base64/MIME overhead. A rough operational rule: multiply desired attachment size by ~1.35 to get a safe message_size_limit target, then add a little for headers and multi-part structure.
Test with real file types and clients; some clients bloat differently.
3) Should inbound and outbound have the same maximum size?
No. Inbound from the internet is higher risk. Authenticated submission is lower risk but still needs throttles because account compromise is real.
Separate limits by listener and enforce identity-based controls for outbound.
4) What’s the biggest operational risk when raising size limits?
Queue and storage blow-ups during external delivery failures. Large deferred messages consume spool space quickly and make every retry cycle more expensive.
The second biggest is filter tier saturation/timeouts.
5) Why do we see “accepted” but then a bounce later?
SMTP acceptance means your server agreed to take responsibility for delivery. It doesn’t mean the recipient will accept it.
Also, downstream components (filters, DLP, mailbox store) can reject after acceptance if their limits are lower.
6) Can’t we just increase timeouts so large sends don’t fail?
You can, but do it selectively. Longer timeouts on anonymous inbound allow slow sessions to hog sockets and resources.
Prefer longer timeouts on authenticated submission, plus connection limits and rate limiting.
7) How do we prevent abuse if we allow larger outbound attachments?
Use per-user and per-IP throttles, monitor large-send anomalies, and enforce attachment policies. Make sure authentication is strong (MFA where applicable) and that compromised accounts can’t send unlimited large payloads.
8) What should we monitor specifically after raising limits?
Queue depth (active/deferred), size distribution of queued items, disk usage and inode consumption on spool/temp, IO wait, filter timeouts, and outbound bounce rates (especially 552/5.2.3).
Also track “accepted then bounced” volume—those are expensive failures.
9) Is it better to reject large messages at SMTP time or accept and handle later?
Reject early whenever possible. Early rejection is cheap and honest. Accept-then-fail wastes resources and creates user confusion.
Align limits across the chain so the first MTA can reject with a clear message.
10) What do we tell users when a recipient rejects big messages?
Give them a policy that acknowledges reality: “We support sending up to X MB, but some recipients may only accept Y MB. Use approved file sharing for larger files.”
If you don’t provide an alternative, they’ll create one, and you won’t like it.
Conclusion: practical next steps
Raising email message size limits is easy to do and surprisingly easy to regret. The safe version is deliberate: measure bottlenecks, align limits across the chain, add abuse controls, and accept that the public internet won’t follow your internal policy memo.
Next steps you can execute this week
- Inventory limits end-to-end: submission, inbound MX, proxies, filters, mailbox store, quotas.
- Run two controlled tests: one large message from an internal client to an internal mailbox; one to a common external domain. Track queue IDs to final status.
- Fix your most likely bottleneck: spool disk headroom/IO, filter timeouts, or mismatched caps.
- Implement guardrails before the increase: rate limits, per-user controls, monitoring for large-message pressure.
- Raise limits in increments and watch metrics like an adult watching a toddler near an open staircase.
If you do it right, users stop complaining and your queue graphs stay boring. Boring is the highest compliment production systems can give you.