Nothing spikes blood pressure like a “gigabit” office where copying a 30 GB folder to the NAS crawls at 20 MB/s. Users swear the network is fine because Teams calls work. Someone else swears the NAS is “fast” because the spec sheet said “up to 2,000 MB/s”. And you’re stuck watching a progress bar lie to your face.
Here’s the practical truth: SMB copy performance is usually limited by one of three things—disk latency, network behavior, or SMB features that quietly trade CPU and chatter for safety. You don’t fix it by randomly flipping registry keys. You fix it by measuring, then applying the few tweaks that actually change the physics.
What actually makes SMB copies slow
SMB (Server Message Block) is not a single thing. It’s a protocol family with versions, dialect negotiation, optional integrity, optional encryption, optional multichannel, and a lot of “helpful” client behaviors. Copy speed is the emergent property of:
- Storage latency (especially small random writes for “lots of tiny files”).
- Network bandwidth (obvious), plus loss, retransmits, and bufferbloat (less obvious).
- CPU cycles on client and server for signing/encryption and for any antivirus/filter drivers.
- SMB flow control: credits, outstanding I/O, and how many requests can be in flight.
- Metadata churn: creates, closes, attribute reads, ACL checks, durable handles, directory enumeration.
- Application behavior: Explorer copy vs robocopy vs an app doing sync-style copies.
Two copy jobs can be “the same size” and behave like different planets:
- One 50 GB ISO is mostly sequential I/O. It wants bandwidth and large buffers.
- Two million 16 KB files is metadata and random I/O. It wants low latency, fast directory operations, and sane antivirus policies.
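A rough back-of-envelope model makes the gap concrete. All numbers here are illustrative assumptions (400 MB/s sequential throughput, 2 ms of per-file round-trip overhead), not measurements:

```python
# Back-of-envelope copy-time model. All inputs are illustrative assumptions.
def big_file_seconds(size_gb, throughput_mb_s):
    """Sequential copy: limited almost entirely by bandwidth."""
    return size_gb * 1024 / throughput_mb_s

def small_files_seconds(count, per_file_ms, payload_kb, throughput_mb_s):
    """Small files: per-file round trips (open/ACL check/close) dominate."""
    metadata = count * per_file_ms / 1000.0          # seconds spent on chatter
    data = count * payload_kb / 1024.0 / throughput_mb_s  # seconds moving bytes
    return metadata + data

# One 50 GB ISO vs two million 16 KB files on the same "fast" link.
print(round(big_file_seconds(50, 400)))                   # 128 s
print(round(small_files_seconds(2_000_000, 2, 16, 400)))  # 4078 s, ~98% metadata
```

Same link, same total gigabytes moved, a 30x difference: the small-file job spends almost all its time on per-file overhead, which is why bandwidth tuning barely touches it.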
SMB performance tuning, done right, is mostly about not fighting your bottleneck. Done wrong, it’s cargo cult: bigger MTU, more threads, more cache, more “optimization”. That’s how you end up with a NAS that’s “optimized” into a very expensive space heater.
Joke #1: SMB tuning is like seasoning soup—add salt slowly, because “just dump the whole shaker” is how you get paged at 2 a.m.
Fast diagnosis playbook (five steps, in order)
If you only remember one thing, remember this: separate disk from network from SMB features. Don’t start by changing settings. Start by proving where the ceiling is.
First: classify the workload
- Single large file? Expect line-rate if disks are fine and SMB isn’t doing expensive security features.
- Many small files? Expect slower. Then ask: “Is it unreasonably slower than last month?”
- Read vs write? Writes are often slower due to sync behavior, snapshots, RAID parity, or SLOG absence (for ZFS).
Second: prove raw network capacity (without SMB)
- Use iperf3 between client and NAS (or NAS and a test host on the same VLAN).
- If you can’t get close to expected bandwidth here, SMB will not save you.
Third: prove NAS storage capability (without SMB)
- Measure disk throughput and latency on the NAS itself (fio is ideal; basic tools work too).
- If the NAS is already at 80–100% CPU, your SMB tweaks are cosmetic.
Fourth: validate SMB dialect and expensive features
- Check SMB version, signing state, encryption state, multichannel, RSS.
- Look for antivirus, DLP, file screening, or quota enforcement overhead.
Fifth: capture one clean test and one “real” copy
- One large file generated on the client (so it’s not limited by client disk reads).
- One representative “problem” folder.
- Compare behavior. If only the small-file workload is bad, tune for metadata/latency, not bandwidth.
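The playbook above can be sketched as a single decision function. The thresholds are judgment calls, not standards; adjust them to your hardware:

```python
# Sketch of the triage order above. Thresholds are illustrative judgment calls.
def classify_bottleneck(iperf_pct_of_line, disk_await_ms, disk_util_pct, max_core_pct):
    """Return the most likely limiter from four cheap measurements."""
    if iperf_pct_of_line < 70:
        return "network"           # raw TCP can't fill the pipe: fix L1/L2/L3 first
    if disk_util_pct > 90 or disk_await_ms > 15:
        return "storage"           # NAS disks are gating writes
    if max_core_pct > 90:
        return "cpu-single-core"   # RSS/signing/encryption pinning one core
    return "smb-features-or-client"

# iperf3 fine, disks pegged: no SMB flag will save you.
print(classify_bottleneck(99, 18.5, 98, 40))  # storage
```

The point is the ordering: each check is cheap, and each one rules out a whole family of "tweaks" you no longer need to try.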
Facts and history that explain today’s weirdness
SMB performance problems often look like mysteries until you remember where SMB came from and what the protocol is trying to guarantee.
- SMB originated in the 1980s (IBM roots, later adopted and expanded by Microsoft). A lot of “chatty” behavior is historical baggage.
- SMB1 was designed for LANs with different assumptions; its inefficiencies (and security problems) are why modern environments should treat SMB1 as “no”.
- SMB2 (Vista/Server 2008 era) dramatically reduced chattiness and improved pipelining—this is one reason SMB2/3 can be much faster on high-latency links.
- SMB3 introduced encryption and multichannel (Windows 8/Server 2012 era). Great features, but they can tax CPU or expose NIC/driver weirdness.
- SMB signing existed long before “zero trust” was a catchphrase; it protects integrity, but has a measurable performance cost—especially on weak CPUs or when offloads aren’t available.
- Explorer copy has changed behavior across Windows releases (buffer sizes, retries, “friendly” UI). Robocopy is often more predictable for testing.
- Opportunistic locks (oplocks) and leasing were introduced to reduce network round trips by caching—until an app or scanner invalidates caches and forces constant breaks.
- SMB credits are flow control; on fat pipes with high RTT (or busy servers), the wrong credit behavior can throttle outstanding I/O.
- Jumbo frames became fashionable with 10GbE, but SMB doesn’t magically require them; jumbo frames mostly reduce CPU and interrupt overhead when everything is correctly configured end-to-end.
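The jumbo-frame point is easy to quantify. Using standard framing overheads (38 bytes of Ethernet preamble/header/FCS/inter-frame gap, 40 bytes of IPv4+TCP headers without options), the wire-efficiency gain is small; the frame-count reduction is the real win:

```python
# Why jumbo frames mostly save CPU: ~6x fewer frames, only ~4% better wire efficiency.
ETH_OVERHEAD = 38   # preamble + Ethernet header + FCS + inter-frame gap, bytes
IP_TCP_HDRS = 40    # IPv4 + TCP headers without options, bytes

def per_frame(mtu):
    payload = mtu - IP_TCP_HDRS
    efficiency = payload / (mtu + ETH_OVERHEAD)
    return payload, efficiency

for mtu in (1500, 9000):
    payload, eff = per_frame(mtu)
    frames_per_gb = (1024**3) // payload + 1
    print(mtu, payload, f"{eff:.1%}", frames_per_gb)
```

At MTU 1500 you push roughly 735k frames per gibibyte; at 9000, roughly 120k. That frame (and interrupt) reduction is where the CPU savings come from, which is why jumbo only pays off when every hop actually passes it.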
Practical tasks: commands, outputs, and decisions (12+)
These are the tasks I actually run when someone says “SMB is slow”. Each one includes: the command, what “good” looks like, and the decision you make.
Task 1: Confirm SMB dialect, encryption, signing (Windows client)
cr0x@server:~$ powershell -NoProfile -Command "Get-SmbConnection | Select-Object ServerName,ShareName,Dialect,Encrypted,Signed,NumOpens | Format-Table -Auto"
ServerName ShareName Dialect Encrypted Signed NumOpens
--------- --------- ------- --------- ------ --------
nas01 data 3.1.1 False True 12
What it means: You’re on SMB 3.1.1, encryption is off, signing is on. That “Signed=True” can be policy-driven.
Decision: If performance is low and CPU is high, verify whether signing is required. If it’s required by security policy, stop arguing and optimize elsewhere (RSS, multichannel, CPU, NIC drivers).
Task 2: Measure raw network throughput with iperf3
cr0x@server:~$ iperf3 -c nas01 -P 4 -t 10
Connecting to host nas01, port 5201
[SUM] 0.00-10.00 sec 4.65 GBytes 3.99 Gbits/sec 0 sender
[SUM] 0.00-10.00 sec 4.64 GBytes 3.99 Gbits/sec receiver
What it means: Your network path can actually do ~4 Gbps with four streams. Great.
Decision: If iperf3 is low, fix NIC speed/duplex, cabling, switch config, VLAN, QoS, or Wi-Fi before touching SMB.
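To compare the iperf3 number against file-copy speeds, convert units first: iperf3 reports decimal Gbits/sec, copy tools usually report MiB/s. A minimal conversion sketch (the ~8% protocol discount is an assumption, not a spec):

```python
# Sanity conversion: iperf3 Gbits/sec to a realistic SMB copy ceiling in MiB/s.
def gbps_to_mb_s(gbps, protocol_efficiency=0.92):
    """Decimal Gbit/s -> MiB/s, discounted for assumed TCP/SMB overhead (~8%)."""
    return gbps * 1e9 / 8 / (1024**2) * protocol_efficiency

print(round(gbps_to_mb_s(3.99)))  # 438: the rough copy ceiling on this path
print(round(gbps_to_mb_s(1.0)))   # 110: why "gigabit" copies cap near 110 MB/s
```

If robocopy reports a number close to this ceiling, stop tuning; you are done. If it reports a tenth of it, the gap is your investigation.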
Task 3: Check NIC link speed and errors (Linux NAS)
cr0x@server:~$ sudo ethtool eno1
Settings for eno1:
Speed: 10000Mb/s
Duplex: Full
Auto-negotiation: on
Link detected: yes
What it means: Link is up at 10GbE full duplex. That’s the starting line.
Decision: If it says 1000Mb/s when you expect 10GbE, stop. Find the bad cable, wrong SFP, or switch port config.
Task 4: Check interface counters for drops/retransmit hints (Linux)
cr0x@server:~$ ip -s link show dev eno1
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
RX: bytes packets errors dropped missed mcast
987654321 1234567 0 12 0 1234
TX: bytes packets errors dropped carrier collsns
123456789 2345678 0 8 0 0
What it means: Non-zero drops. A few might be noise; persistent growth during tests is a red flag.
Decision: If drops climb with copy, you’re looking at congestion, buffer issues, or offload/driver problems. Fix that before Samba tuning.
Task 5: Confirm MTU end-to-end (Linux)
cr0x@server:~$ ping -M do -s 8972 nas01 -c 2
PING nas01 (10.20.0.10) 8972(9000) bytes of data.
8972 bytes from 10.20.0.10: icmp_seq=1 ttl=64 time=0.321 ms
8972 bytes from 10.20.0.10: icmp_seq=2 ttl=64 time=0.299 ms
--- nas01 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
What it means: Jumbo frames (MTU 9000) likely work on this path.
Decision: If this fails, do not enable jumbo frames “because faster”. Partial jumbo is worse than no jumbo; it produces fragmentation or blackholes depending on gear.
Task 6: Check SMB multichannel state (Windows)
cr0x@server:~$ powershell -NoProfile -Command "Get-SmbMultichannelConnection | Format-Table -Auto"
Server Name Client IP Server IP Client Interface Index Server Interface Index RSS Capable RDMA Capable
----------- --------- --------- ---------------------- --------------------- ----------- ------------
nas01 10.20.0.21 10.20.0.10 14 12 True False
What it means: Multichannel exists and RSS is capable. If there were multiple interfaces, you’d see multiple rows.
Decision: If multichannel is expected but absent, check SMB server support (Samba version/config), NIC RSS settings, or whether there’s only one viable path.
Task 7: Check client RSS / offload posture (Windows)
cr0x@server:~$ powershell -NoProfile -Command "Get-NetAdapterRss | Select-Object Name,Enabled,NumberOfReceiveQueues,Profile | Format-Table -Auto"
Name Enabled NumberOfReceiveQueues Profile
---- ------- --------------------- -------
Ethernet0 True 8 ClosestProcessor
What it means: RSS is on. Good: SMB can spread processing.
Decision: If RSS is disabled on a 10GbE client/server, you’ll often bottleneck a single CPU core. Fix RSS before you touch Samba.
Task 8: Verify server-side CPU saturation (Linux NAS)
cr0x@server:~$ mpstat -P ALL 1 5
Linux 6.6.0 (nas01) 02/05/2026 _x86_64_ (16 CPU)
03:11:01 PM CPU %usr %nice %sys %iowait %irq %soft %idle
03:11:02 PM all 22.10 0.00 8.33 1.50 0.00 2.10 65.97
03:11:02 PM 7 91.00 0.00 6.00 0.00 0.00 0.00 3.00
What it means: One core is pinned ~91% while overall CPU is fine. That’s a classic sign of an affinity/interrupt/RSS issue, or single-threaded work (crypto, single queue).
Decision: If one core is pegged during SMB copy, investigate RSS, NIC interrupts, and whether signing/encryption is burning a single thread.
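The pattern to detect is "one hot core, low average". A small sketch of that check (the busy values mirror the mpstat sample above; the 85%/50% thresholds are judgment calls):

```python
# Detect the "one pinned core" pattern from per-CPU busy percentages.
# Values mirror the mpstat sample above; thresholds are illustrative.
def pinned_cores(per_cpu_busy, pin_threshold=85, avg_threshold=50):
    avg = sum(per_cpu_busy) / len(per_cpu_busy)
    hot = [i for i, busy in enumerate(per_cpu_busy) if busy >= pin_threshold]
    # Only flag when the box looks idle on average but one core is pegged.
    return hot if avg < avg_threshold and hot else []

busy = [20, 25, 18, 22, 30, 19, 21, 97, 15, 18, 20, 22, 25, 17, 19, 23]
print(pinned_cores(busy))  # [7]: core 7 is pegged while average CPU looks fine
```

This is exactly the case that average-CPU dashboards hide, which is why the mini-story later in this article took a baseline to catch.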
Task 9: Observe disk latency during the copy (Linux NAS)
cr0x@server:~$ iostat -x 1 5
avg-cpu: %user %nice %system %iowait %idle
18.12 0.00 6.44 4.10 71.34
Device r/s w/s rkB/s wkB/s await svctm %util
nvme0n1 2.0 450.0 64.0 51200.0 2.10 0.20 9.5
md0 0.0 820.0 0.0 52000.0 18.50 0.00 98.0
What it means: The backing array (md0) is near 100% utilized with ~18 ms await. That’s your limiter for writes.
Decision: Stop “SMB tuning” and fix storage: write cache, RAID level, ZFS recordsize, SLOG, spindle count, or move the workload to SSD.
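The "is storage the limiter" test reduces to two fields per device. A sketch of that decision over iostat-style rows (thresholds of 90% util and 10 ms await are rules of thumb, not standards):

```python
# Flag the device(s) gating a copy from `iostat -x` style fields.
# rows: (device, await_ms, util_pct). Thresholds are rules of thumb.
def limiting_devices(rows, util_pct=90, await_ms=10):
    return [dev for dev, aw, ut in rows if ut >= util_pct or aw >= await_ms]

# The sample above: fast NVMe member, saturated md array on top of it.
rows = [("nvme0n1", 2.10, 9.5), ("md0", 18.50, 98.0)]
print(limiting_devices(rows))  # ['md0']: tune storage, not smb.conf
```

Note the pattern in the sample: the member device is bored while the array is saturated, which points at RAID/parity/layout rather than the physical disk.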
Task 10: Confirm Samba is actually using aio and sensible socket options (Linux NAS)
cr0x@server:~$ testparm -s | egrep -i "aio|socket|smb2 leases|server multi channel|server signing|smb encrypt"
aio read size = 1
aio write size = 1
server multi channel support = yes
server signing = mandatory
smb encrypt = disabled
socket options =
smb2 leases = yes
What it means: AIO is enabled (threshold 1 means basically always), multichannel is supported, signing is mandatory.
Decision: If signing is mandatory and CPU is your bottleneck, consider relaxing it to negotiated signing (server signing = auto) if policy allows, or upgrade CPU / enable crypto acceleration. If multichannel is off, turn it on only after verifying NIC and switch stability.
Task 11: Check if SMB is falling back to SMB1 or NT1 anywhere (Linux server logs)
cr0x@server:~$ sudo journalctl -u smbd --since "1 hour ago" | egrep -i "SMB1|NT1|deprecated|protocol"
Feb 05 14:22:18 nas01 smbd[2143]: protocol negotiation failed: client requested NT1
What it means: Some client is trying SMB1 (NT1). This can cause horrible performance and security exposure.
Decision: Find the client, disable SMB1, and enforce minimum protocol SMB2_02 or higher.
Task 12: Test a controlled large-file write from a Windows client (robocopy)
cr0x@server:~$ powershell -NoProfile -Command "robocopy C:\temp \\nas01\data\perf-test bigfile.bin /np /r:0 /w:0 /mt:8"
-------------------------------------------------------------------------------
ROBOCOPY :: Robust File Copy for Windows
-------------------------------------------------------------------------------
Source : C:\temp\
Dest : \\nas01\data\perf-test\
Files : bigfile.bin
Options : *.* /MT:8 /R:0 /W:0 /NP
------------------------------------------------------------------------------
Total Copied Skipped Mismatch FAILED Extras
Dirs : 1 0 1 0 0 0
Files : 1 1 0 0 0 0
Bytes : 50.000 g 50.000 g 0 0 0 0
Times : 2:04:51 2:04:51 0:00:00 0:00:00
Speed : 410.123 MegaBytes/min
Speed : 6.835 MegaBytes/sec
What it means: 6.8 MB/s to a NAS is catastrophically low for a single stream. Now you know it’s not “small files”. Something is fundamentally limiting writes.
Decision: Correlate this with iostat (storage), iperf3 (network), and CPU. Don’t chase oplocks; chase the limiter.
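When correlating with iperf3, convert robocopy's MegaBytes/min line into MB/s first. A minimal parsing sketch (line format assumed to match the output above):

```python
# Convert robocopy's "MegaBytes/min" speed line to MB/s for comparison
# with iperf3 numbers. Line format assumed as in the output above.
def robocopy_mb_per_sec(speed_line):
    value = float(speed_line.split(":")[1].split()[0])
    return value / 60.0

print(round(robocopy_mb_per_sec("Speed : 410.123 MegaBytes/min"), 2))  # 6.84
```

6.8 MB/s against an iperf3-proven ~4 Gbps path means roughly 1.5% of the network ceiling: the network is exonerated, and the limiter is storage, CPU, or an SMB feature.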
Task 13: Confirm Windows isn’t writing with pathological buffering behavior (Performance counters)
cr0x@server:~$ powershell -NoProfile -Command "Get-Counter '\SMB Client Shares(*)\Avg. Write Queue Length' -SampleInterval 1 -MaxSamples 5 | Select-Object -ExpandProperty CounterSamples | Select-Object Path,CookedValue"
Path CookedValue
\\SMB Client Shares(\\nas01\data)\Avg. Write Queue Length 15
What it means: A large SMB client write queue suggests the client is waiting on server acknowledgments (server is slow or flow controlled).
Decision: If queue length stays high while network is underutilized, investigate server disk latency, SMB credits, or signing/encryption CPU.
Task 14: Check TCP retransmits on Linux (server or client)
cr0x@server:~$ netstat -s | egrep -i "retransm|segments retransm"
184 segments retransmitted
What it means: Retransmits exist. The absolute number isn’t the point; the trend during tests is.
Decision: If retransmits climb quickly during copies, fix layer 1/2/3 issues, offloads, or congestion. SMB tuning won’t compensate for packet loss.
Task 15: Validate Samba per-share options that affect caching and durability
cr0x@server:~$ testparm -s --section=data | egrep -i "strict sync|sync always|durable handles|kernel oplocks|oplocks|vfs objects"
strict sync = yes
sync always = yes
kernel oplocks = yes
oplocks = yes
durable handles = yes
vfs objects = acl_xattr
What it means: sync always = yes forces synchronous writes. That’s a performance killer unless you absolutely need it.
Decision: If this is a general-purpose share, turn off sync always. If it’s for a database that demands sync semantics, then you need proper storage (SLOG, BBWC, etc.), not wishful config.
The SMB tweaks that actually matter
Most “SMB tweaks” on the internet are either outdated, placebo, or dangerous in production. These are the ones that repeatedly move the needle—because they align with real bottlenecks.
1) Fix the boring stuff first: speed/duplex, cabling, and path symmetry
If your NAS negotiates 1GbE on one side and 10GbE on the other, SMB will dutifully deliver 1GbE performance. Same if you’ve got a “helpful” LACP bond that hashes all traffic onto one physical link because it’s one TCP session. You don’t need a tuning guide; you need to stop guessing.
2) SMB Multichannel: a real win, but only when your NICs behave
SMB Multichannel can use multiple TCP connections across NICs (or across multiple queues on one NIC with RSS). This helps both throughput and resilience. It also adds complexity: more flows, more chances to hit a buggy driver, and more sensitivity to asymmetry.
- Do it when you have 10/25/40GbE and modern Windows clients, and you can validate stable multi-queue behavior.
- Avoid “half-multichannel” where the server supports it but the client has RSS disabled (or vice versa). That’s how you pin a single core and wonder why the link is idle.
3) RSS and interrupt distribution: the silent throughput cap
If one CPU core is pegged during SMB transfers, you’ll plateau at a suspiciously consistent number (often a few Gbps or less) while the rest of the system naps. RSS is how you spread receive processing. Proper interrupt moderation and queue count matter, too.
What you do:
- Enable RSS on Windows NICs; verify multiple receive queues.
- On Linux, ensure multiqueue is enabled, and interrupts are distributed sensibly.
- Update NIC firmware/drivers. Yes, it’s boring. Yes, it matters.
4) Signing and encryption: know what you’re paying for
SMB signing ensures integrity: it prevents tampering. SMB encryption protects confidentiality. Both are good security controls. Both cost CPU and can reduce throughput.
Rules I use in production:
- If you’re on a trusted LAN and security policy allows: don’t require signing for every share. Leave it negotiable (server signing = auto in Samba terms), not mandatory.
- If you must encrypt: make sure the NAS CPU has AES-NI (or equivalent) and that you’re not pinning crypto to one core.
- Don’t encrypt twice (SMB encryption plus VPN encryption) unless you’ve budgeted CPU for it.
One idea from a reliability heavyweight applies here, paraphrased: “Hope is not a strategy” (attributed to Gene Kranz, often cited in engineering and ops discussions).
5) Don’t force synchronous writes unless you mean it
Settings like sync always (Samba) or storage-level “always sync” behaviors can turn a NAS into a write-latency machine. That may be correct for some transactional workloads. For user file shares, it’s usually self-harm.
If your organization says “we need zero data loss on power failure,” then buy hardware that supports it (battery-backed cache, proper journaling, or ZFS intent log design) and test it. Don’t bolt “sync always” onto a consumer SSD pool and call it enterprise.
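The cost of forced sync is easy to underestimate because it multiplies per write. A back-of-envelope sketch (0.2 ms per buffered write and 8 ms per flush on a spinning array are assumptions for illustration):

```python
# Back-of-envelope cost of forcing a flush on every write. All latencies
# are illustrative assumptions, not measurements.
def batch_write_seconds(n_writes, write_ms, flush_ms, sync_always):
    """Total time for a batch: per-write latency plus a flush when sync is forced."""
    per_write = write_ms + (flush_ms if sync_always else 0.0)
    return n_writes * per_write / 1000.0

# 100k small writes, 0.2 ms each, 8 ms per flush on a spinning array:
print(round(batch_write_seconds(100_000, 0.2, 8.0, sync_always=False), 1))  # 20.0 s
print(round(batch_write_seconds(100_000, 0.2, 8.0, sync_always=True), 1))   # 820.0 s
```

A 40x slowdown from one config line is why a fast low-latency log device (or battery-backed cache) is the correct answer when sync semantics are genuinely required: it shrinks the flush term instead of wishing it away.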
6) Tune for the workload: big files vs tiny files
For big sequential transfers, focus on:
- Bandwidth, RSS, multichannel, CPU headroom, and avoiding expensive security features where policy permits.
For millions of small files, focus on:
- Metadata performance on the NAS (SSDs for metadata, sufficient RAM, fast directory operations).
- Client and server antivirus exclusions (carefully scoped).
- Reducing needless attribute/ACL lookups where possible.
7) Samba AIO and sendfile: good defaults, but validate
Samba’s asynchronous I/O and zero-copy sendfile support can help. On modern kernels, Samba generally does a decent job. But you must verify your NAS isn’t constrained elsewhere.
- AIO helps when there’s benefit to parallel I/O and the underlying filesystem supports it well.
- sendfile can reduce CPU for reads. It won’t fix slow writes.
Be suspicious of random “socket options” pasted from 2009. If you see someone recommending TCP_NODELAY and huge buffers as a universal fix, you’re reading folklore.
8) Disable SMB1. Still. Always.
If an ancient device forces SMB1, isolate it on its own VLAN/share, restrict access, and plan its retirement. SMB1 is not “legacy-compatible”; it’s “attack-compatible”. It’s also slow, chatty, and unpleasant under latency.
Joke #2: If you’re still running SMB1, you’re not “supporting legacy”; you’re running a museum exhibit that occasionally gets ransomware.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A mid-size company rolled out a new NAS to replace an aging Windows file server. The purchase was justified on “raw throughput”—lots of disks, 10GbE uplinks, and a dashboard that proudly showed low CPU at idle. Migration weekend went fine. Monday didn’t.
Users complained that opening project folders took forever. Copies “stalled” at random. The helpdesk escalated it as “network issues” because ping looked fine and web apps were fast. Meanwhile, the storage team insisted the NAS was “barely doing anything” because the throughput graphs weren’t pegged.
The wrong assumption was that low throughput meant the system wasn’t busy. In reality, the workload had shifted from “big CAD files” to “thousands of small build artifacts” over the years. Explorer was doing attribute reads, ACL checks, and thumbnail-related metadata operations. On the NAS, the underlying RAID was fine for sequential throughput but had mediocre latency under metadata churn, and the share had strict sync enabled “for safety”.
Once someone graphed disk latency (await) and not just MB/s, it clicked. The array was spending its life at high utilization with small writes and flushes. The fix wasn’t an SMB magic flag. They moved metadata-heavy shares to SSD-backed storage, removed sync always from general shares, and added a scoped antivirus exclusion for build output directories. Copy speed jumped, and “folder open time” stopped being a ticket generator.
Mini-story 2: The optimization that backfired
A different org had slow SMB writes to a Samba-based NAS. Someone read a tuning blog and decided jumbo frames would “unlock performance”. They set MTU 9000 on the NAS, the core switch, and a handful of servers. Desktops and some access switches were left at 1500 because “it probably negotiates”. It did not.
For a day, performance was weird: sometimes it screamed, sometimes it crawled, sometimes it hung for seconds. Packet captures showed retransmits and fragmentation patterns that didn’t make sense for a LAN. The helpdesk saw intermittent “network drive disconnected” popups. The storage team saw no errors on disks. Everyone blamed everyone else, which is the circle of life.
The root problem was partial jumbo frames and inconsistent path MTU. Some flows happened to stay within a clean MTU domain; others crossed gear that couldn’t pass jumbo, leading to drops and retries. SMB, being a reliable protocol on top of TCP, dutifully retried—and throughput collapsed.
The eventual fix was brutally simple: revert to MTU 1500 everywhere, then reintroduce jumbo only after auditing every hop (including virtualization vSwitches and firewall interfaces). Performance became stable first, then fast. The “optimization” didn’t fail because jumbo frames are bad. It failed because the network was treated like a mood ring.
Mini-story 3: The boring but correct practice that saved the day
A large enterprise had recurring complaints: “SMB is slow” after every quarterly patch cycle. The storage team was tired of reactive firefighting, so they implemented a dull-sounding practice: a repeatable SMB performance baseline test, run weekly, stored with system metrics.
The baseline included three tests: iperf3 throughput, a large-file robocopy write, and a small-file metadata-heavy robocopy. They recorded NIC driver versions, SMB dialect, signing/encryption status, and NAS CPU/disk latency during each run.
One week, the large-file test dropped by half while iperf3 stayed normal. Disk latency was fine. CPU showed one core spiking. Because they had historical baselines, they quickly correlated the regression to a new NIC driver on a subset of Windows clients that disabled RSS due to a “compatibility” setting. No one would have noticed in normal monitoring because average CPU was okay and network graphs looked normal.
They rolled back the driver, re-enabled RSS, and performance returned. The win wasn’t a clever tweak. The win was having a baseline and treating SMB performance like a monitored SLO, not a rumor.
Common mistakes: symptom → root cause → fix
1) “Network is fast, but SMB copies are slow”
Symptom: iperf3 looks good, but file copies plateau low.
Root cause: CPU overhead from signing/encryption, or disk latency on the NAS, or single-core bottleneck due to RSS/offload issues.
Fix: Check SMB signing/encryption state, measure CPU per-core, confirm RSS/multichannel, then measure disk await during the copy.
2) “Single large file is fast, but folders with many small files are painfully slow”
Symptom: ISO copies fly; source trees crawl.
Root cause: Metadata overhead (creates/closes/ACL checks), antivirus scanning, snapshots, or slow directory operations.
Fix: Use robocopy with and without multithreading; review antivirus exclusions; put metadata-heavy shares on SSD; reduce pathological sync settings.
3) “It was fast yesterday; today it’s slow”
Symptom: Sudden regression, no topology changes (allegedly).
Root cause: Driver/firmware update, SMB policy change (signing required), path MTU changes, or a new inline security product.
Fix: Compare SMB connection properties and NIC settings; check change logs; validate MTU end-to-end; confirm no new filtering drivers or NAS updates.
4) “SMB copy stalls every few seconds”
Symptom: Bursty throughput; progress bar pauses.
Root cause: Bufferbloat/congestion, packet loss and retransmits, or storage flush behavior (sync writes, cache flush storms).
Fix: Check interface drops/retransmits; check switch queueing/QoS; measure NAS disk latency and cache settings; remove forced sync where not needed.
5) “Only some clients are slow”
Symptom: One department complains; others fine.
Root cause: Different NIC drivers, Wi-Fi vs wired, different GPO security (signing), or path differences (different switch stack).
Fix: Compare Get-SmbConnection outputs; run iperf3 from slow vs fast clients; check link speed, RSS, and error counters.
6) “Turning on jumbo frames made it worse”
Symptom: Intermittent hangs, retransmits, disconnects.
Root cause: Partial jumbo configuration or an intermediate device that can’t pass jumbo.
Fix: Revert to MTU 1500; prove jumbo with DF pings across every hop; then re-enable consistently or don’t bother.
7) “We enabled SMB encryption and now everything is slow”
Symptom: Throughput drops, CPU rises.
Root cause: CPU-bound crypto, no hardware acceleration, or single-threaded encryption path on that platform.
Fix: Ensure AES acceleration exists and is used; scale CPU; consider encrypting only sensitive shares; validate multichannel/RSS to spread load.
8) “We added LACP, so it should be faster”
Symptom: Still capped near one link speed.
Root cause: A single TCP flow hashes to one physical member; SMB without multichannel won’t necessarily stripe one file copy across links.
Fix: Use SMB multichannel for parallelism; or test multiple concurrent flows; tune hashing policy where appropriate; don’t sell LACP as per-flow throughput magic.
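The LACP behavior follows directly from how switches place flows: a hash of the flow tuple picks one member link, and one TCP session is one tuple. A toy simulation (Python's hash() stands in for the switch's hash function; real gear uses its own algorithm, but the one-flow-one-link property is the same):

```python
# Toy model of LACP flow placement: the switch hashes the flow 5-tuple,
# so one TCP flow always lands on one physical member. Python's hash()
# is a stand-in for the switch's (unknown) hash algorithm.
def lacp_member(src_ip, dst_ip, src_port, dst_port, n_links=2):
    return hash((src_ip, dst_ip, src_port, dst_port)) % n_links

# One big SMB copy = one TCP flow = one member, no matter how many links exist.
single_flow = {lacp_member("10.20.0.21", "10.20.0.10", 49512, 445) for _ in range(1000)}
print(len(single_flow))  # 1: the same flow hashes to the same link every time
```

Many concurrent flows do spread across members on average, which is why LACP helps aggregate load but not a single robocopy job; SMB Multichannel helps precisely because it opens multiple TCP connections (different source ports, different hashes) for one session.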
Checklists / step-by-step plan
Step-by-step: stabilize first, then speed up
- Pick two tests: one large file (10–50 GB) and one representative “small files” folder.
- Run iperf3 to validate raw network capacity and retransmit behavior.
- Verify link speed on client and NAS, and check counters for drops.
- Confirm SMB dialect and whether signing/encryption are enabled/required.
- Measure NAS CPU per-core during the copy; watch for a pinned core.
- Measure NAS disk latency (await/%util) during the copy.
- If network-limited: fix MTU consistency, switch congestion, cabling, NIC drivers.
- If CPU-limited: enable RSS, validate multichannel, consider reducing signing/encryption requirements where policy allows, or upgrade CPU.
- If disk-limited: move workload to faster storage, add SSDs, fix RAID/ZFS layout, remove forced sync for general shares.
- Only now: adjust Samba/SMB settings that match the proven bottleneck.
- Re-test with the same dataset and method. Keep results.
- Operationalize: create a weekly baseline job so you catch regressions before users do.
Quick “do / don’t” checklist
- Do treat SMB performance as a system: client + server + network + storage.
- Do measure latency, not just throughput.
- Do keep SMB1 disabled and enforce SMB2+.
- Don’t enable jumbo frames unless you can prove end-to-end MTU support.
- Don’t force synchronous writes on general-purpose shares.
- Don’t paste ancient Samba “socket options” and expect miracles.
- Don’t confuse LACP aggregate bandwidth with single-flow throughput.
FAQ
1) Why does Windows Explorer copy feel slower than robocopy?
Explorer is optimized for user experience: progress reporting, previews, and sometimes different buffering behavior. Robocopy is more deterministic for testing and can parallelize with /mt.
2) Should I enable SMB Multichannel on Samba?
If your clients are modern Windows and your NIC/driver stack is stable, yes—multichannel can increase throughput and reduce single-core bottlenecks. Validate with Get-SmbMultichannelConnection and watch CPU/interrupt behavior.
3) Do jumbo frames help SMB performance?
Sometimes. They reduce per-packet overhead and can improve CPU efficiency on 10GbE+. But only if every hop supports the MTU. Partial jumbo is a performance trap.
4) Why do small files copy so slowly even on fast SSD NAS?
Small-file copies are dominated by metadata operations, ACL checks, and open/close behavior—not bulk throughput. Antivirus scanning and snapshots can amplify the pain. Measure directory operation latency and server CPU, not just MB/s.
5) Is SMB signing worth the performance hit?
Signing protects against tampering and some classes of attacks. Whether it’s “worth it” is a security decision. Operationally: if you must sign, plan CPU accordingly and ensure RSS/multichannel are correct so you don’t bottleneck one core.
6) Should I disable SMB encryption to speed things up?
Only if policy allows and the network is trusted. A better first move is to ensure hardware crypto acceleration is available and to encrypt only sensitive shares where feasible.
7) Why does SMB performance cap at ~110 MB/s even on a “10GbE” NAS?
110 MB/s is suspiciously close to 1GbE line-rate. Check link speed negotiation, bad cables, a 1GbE switch in the path, or a client dock that’s only gigabit.
8) What’s the simplest way to tell if the NAS disks are the bottleneck?
Watch iostat -x during the copy. If %util is near 100% and await spikes, disks are gating you. If disks look fine but CPU or network counters spike, look there next.
9) Can LACP (bonding) double my SMB copy speed?
Not for a single TCP flow in most cases. LACP increases aggregate bandwidth across many flows. SMB Multichannel is the feature that can actually use multiple connections for a session.
10) Which Samba settings are most often “accidentally harmful”?
sync always and aggressive “durability” knobs applied to the wrong shares. Also, mandatory signing/encryption without CPU planning. And random socket options copied from old tuning posts.
Conclusion: practical next steps
If SMB copies to your NAS are slow, you don’t need a witch doctor. You need a short measurement loop and the discipline to change one thing at a time.
- Run iperf3 and validate link speed and errors. Prove the network.
- Run a large-file robocopy test and watch NAS CPU per-core and disk latency. Prove whether you’re CPU- or disk-limited.
- Check SMB features: dialect, signing, encryption, multichannel, RSS. Identify expensive options you may be paying for.
- Fix the root bottleneck, not the symptom: storage latency problems require storage fixes; packet loss requires network fixes; pinned cores require RSS/driver fixes.
- Establish a baseline so you catch regressions after patching, not after the CFO’s assistant tries to open “Q4 Budget FINAL v7.xlsx”.
Do those, and “SMB is slow” turns from a vague complaint into a solved problem with receipts.