Your VPN is slow. Video calls stutter, git pulls crawl, and someone in Finance starts saying “maybe we should just turn it off for a bit.” That’s how incidents are born: not from hackers in hoodies, but from a perfectly reasonable employee trying to get work done.
There’s one setting that reliably makes VPNs feel faster in minutes: split tunneling. It also reliably weakens your security model in ways that don’t show up in a speed test. This is the story of that trade-off—how it works, where it bites, and how to make an informed choice instead of a desperate one.
The setting: split tunneling, the performance cheat code
Split tunneling is the choice to send some traffic through the VPN and the rest directly to the internet (or to other networks) using your local default route. Full-tunnel (sometimes “force tunnel”) sends essentially everything through the VPN: internet, SaaS, updates, DNS—everything.
From the user’s chair, split tunneling looks like magic. Netflix buffers less. Zoom improves. “The VPN” stops being blamed for everything from latency to existential dread.
From the operator’s chair, split tunneling is a policy decision disguised as a performance toggle. You’re not “speeding up the tunnel.” You’re reducing how much you depend on it. That can be smart. It can also be the moment you unknowingly allow untrusted networks to become part of your threat model.
The exact knob you’re turning
- Routing scope: Do clients install 0.0.0.0/0 (and ::/0) routes to the VPN interface, or only routes for corporate subnets?
- DNS scope: Do clients use corporate resolvers for all queries, or only for internal domains? Do you enforce it?
- Egress control: Do security monitoring, DLP, TLS inspection, and logging happen at corporate egress… or not?
If you remember one thing: split tunneling is not just routing; it’s accountability. Full-tunnel makes the corporate network responsible for the client’s traffic. Split tunneling makes the client’s local network part of the path. That’s a different world.
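For concreteness, here is how the routing-scope knob looks in a WireGuard client config (a sketch; the prefixes and endpoint are illustrative, and other VPN stacks expose the same choice under different names):

```ini
# Full tunnel: the peer owns the default route; everything enters wg0.
[Peer]
PublicKey = <gateway-key>
Endpoint = vpn-gw.corp.example:51820
AllowedIPs = 0.0.0.0/0, ::/0

# Split tunnel: only corporate prefixes are routed into wg0; everything
# else follows the local default route.
[Peer]
PublicKey = <gateway-key>
Endpoint = vpn-gw.corp.example:51820
AllowedIPs = 10.50.0.0/16, 192.168.200.0/24
```

With wg-quick, AllowedIPs is exactly the routing scope: it decides which routes get installed toward the tunnel. One line is the whole full-vs-split decision.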
Why split tunneling is faster (and why full-tunnel gets blamed)
Performance is a pipeline. VPN overhead is only one stage. Full-tunnel is often slower because it forces traffic through additional chokepoints:
1) Hairpinning and path inflation
Full-tunnel means your user in Berlin accessing a SaaS service hosted in Frankfurt might route: Berlin → corporate VPN concentrator in Virginia → back to Frankfurt. That’s not “encryption overhead.” That’s geography punishing bad topology.
2) Security middleware becomes the bottleneck
When you full-tunnel, you commonly also enforce:
- Central DNS filtering
- Proxy policies
- CASB gateways
- DLP inspection
- SSL/TLS interception (where permitted)
All of those can be correct choices. They also add latency and reduce throughput, and they fail in creative ways at peak usage.
3) MTU/MSS issues show up under full-tunnel first
VPN encapsulation shrinks the effective MTU. If you don’t clamp MSS or set MTU sanely, you get fragmentation or blackholed PMTUD. Symptoms: some sites load, others hang; large uploads die; SMB is miserable; “works on hotspot” becomes the diagnostic tool of the damned.
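The arithmetic behind that shrinkage is worth seeing once. A sketch with illustrative numbers: WireGuard's worst-case overhead over an IPv6 underlay; an IPv4 underlay costs 60 bytes instead of 80.

```shell
# Why tunnels shrink the MTU: each encapsulation layer steals bytes.
LINK_MTU=1500            # typical Ethernet
OUTER_IP=40              # outer IPv6 header (IPv4 would be 20)
OUTER_UDP=8              # UDP header
WG_HDR=32                # WireGuard message overhead
TUNNEL_MTU=$((LINK_MTU - OUTER_IP - OUTER_UDP - WG_HDR))
# TCP MSS inside the tunnel: inner IPv4 (20) + TCP (20) headers come off too.
CLAMP_MSS=$((TUNNEL_MTU - 40))
echo "tunnel MTU=${TUNNEL_MTU} clamp MSS<=${CLAMP_MSS}"
```

The 1420 you so often see on wg0 interfaces is exactly this worst-case math; stack another overlay on top and the numbers shrink again.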
4) Your VPN concentrator is now your internet gateway
With full-tunnel, you’ve created a centralized egress point. That concentrates bandwidth, NAT state, conntrack, CPU for encryption, and logs. If capacity planning was optimistic, users will feel it immediately.
Split tunneling feels fast because it avoids those chokepoints for non-corporate traffic. It doesn’t make the VPN inherently faster; it makes the VPN less involved.
Joke #1: Turning on split tunneling to fix performance is like fixing traffic by removing the road signs. Sure, things move—until you need to explain the crash.
How split tunneling breaks your security assumptions
Security models rely on assumptions. Split tunneling quietly invalidates a few common ones.
Assumption A: “If you’re on VPN, you’re on a trusted network”
With split tunneling, you’re on two networks at once: corporate and local. Your laptop becomes a dual-homed device. If local network traffic can reach your machine, and your machine can reach corporate resources, you’ve created a bridge. Whether it’s exploitable depends on host firewalling, endpoint posture, and segmentation. But the assumption is already broken.
Assumption B: “Corporate monitoring sees your outbound traffic”
Full-tunnel makes it plausible that web traffic, DNS, and certain telemetry pass through corporate controls. With split tunneling, a user can be “on VPN” while their internet traffic bypasses your egress logs entirely. That affects:
- Incident response timelines
- DLP enforcement
- Malware command-and-control detection
- Data residency / regulatory controls
Assumption C: “DNS is centralized, so we can control and detect”
Split tunneling frequently comes with split DNS: internal names go to corporate resolvers, external names go to local. If implemented poorly, you’ll leak internal queries to public resolvers, or you’ll create ambiguous resolution that breaks apps in ways that look like “VPN flakiness.”
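One way to pin internal zones on Linux clients is a systemd-networkd routing domain (a sketch; assumes systemd manages the wg0 link, and the path, resolver address, and domain are illustrative):

```ini
# /etc/systemd/network/90-wg0.network (illustrative path and values)
[Match]
Name=wg0

[Network]
# Send queries for corp.example and its subdomains to the VPN resolver only;
# the leading ~ makes it a routing domain, not a search suffix.
DNS=10.50.0.53
Domains=~corp.example
```

The routing-domain detail matters: without the `~`, you've configured a search suffix, not query routing, and internal names can still leak to the local resolver.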
Assumption D: “The VPN client policy is enforceable”
On managed devices, you can enforce routes, DNS, and firewall policies. On BYOD, your control is weaker. Split tunneling on BYOD is a policy you can’t reliably audit. That’s a bad combination.
Assumption E: “Lateral movement starts inside corporate”
With split tunneling, the attack surface includes local networks: coffee shops, hotels, home IoT chaos. If the endpoint is compromised via local exposure (evil twin AP, local broadcast attacks, exposed services), the attacker now has a VPN path into corporate.
What to do with this reality
If you choose split tunneling, act like you chose it. Don’t pretend you still have a full-tunnel security posture. Instead:
- Harden endpoints: host firewall default deny inbound, strict posture checks, rapid patching.
- Make corporate access zero-trust-ish: per-app access, identity-based policies, least privilege, short-lived credentials.
- Segment internal networks so “VPN user” doesn’t mean “can see everything.”
- Log where it matters: identity, access decisions, endpoint signals—not just egress logs.
And if you need full-tunnel for compliance or monitoring, then commit and engineer for it: regional egress, capacity, MTU correctness, and sane DNS.
Interesting facts and historical context (the short, useful kind)
- IPsec (IKE) became mainstream in the late 1990s as enterprises needed secure site-to-site links without dedicated private circuits.
- PPTP was widely deployed in the 1990s because it was easy, not because it was great. It later became synonymous with “don’t use this.”
- SSL VPNs rose in the early 2000s as NAT and firewalls made classic IPsec client deployments painful; HTTPS-friendly tunnels won political battles.
- NAT traversal changed VPN ergonomics: IPsec NAT-T (UDP encapsulation) made VPNs more viable from behind consumer routers and hotel Wi‑Fi.
- “Full tunnel” became the default as central security stacks grew; it aligned with web proxies, centralized DNS filtering, and later DLP/CASB patterns.
- Split tunneling has been debated for decades because it’s a policy question: do you trust the endpoint and the local network?
- MTU problems got worse as encapsulation layers piled up (VPN + VLAN + cloud overlays). PMTUD is fragile in the real world.
- WireGuard (mid-to-late 2010s) popularized a simpler model with lean crypto and configuration, making “fast VPN” a more realistic expectation.
- COVID-era remote work turned VPN concentrators into internet-scale services overnight, exposing capacity planning sins and forcing quick choices like split tunneling.
Three corporate mini-stories from the trenches
Mini-story 1: The incident caused by a wrong assumption
A mid-sized company rolled out split tunneling to reduce load on their overworked VPN gateways. It worked. Tickets about “VPN slow” dropped within a week, and leadership called it a win.
Then a security alert surfaced: an internal admin portal had been accessed from a user account at an unusual time. The logs showed the user was connected to VPN, which initially satisfied everyone’s mental checklist: “they were on the corporate network, so it’s fine.”
It wasn’t fine. The endpoint had been compromised on a home network by a commodity infostealer that later escalated using cached browser sessions. Because of split tunneling, the attacker’s command-and-control traffic never passed through corporate egress. The first real hint came from identity telemetry, not network logs.
The painful part wasn’t the compromise; it was the timeline. The team lost days because they searched the wrong place: proxy logs and firewall logs that never saw the traffic.
The fix wasn’t “ban split tunneling.” They kept it, but they changed the model: tighter conditional access, reauth for admin portals, endpoint isolation triggers, and explicit training for responders: “on VPN” does not mean “we saw the traffic.”
Mini-story 2: The optimization that backfired
An engineering org decided to speed up remote developer workflows. They enabled split tunneling and also exempted a handful of cloud subnets from the VPN to “reduce latency” to CI services. It shaved seconds off some builds. Everyone cheered.
Two weeks later, weirdness: internal package downloads occasionally fetched the wrong artifacts. Not malicious. Just wrong. Developers saw intermittent checksum mismatches, mostly from home internet.
The root cause was boring and brutal: their internal artifact domain had split-horizon DNS, but the split configuration occasionally resolved external endpoints via the local resolver when the VPN DNS was slow to respond. Some traffic took the non-VPN route, hit a public mirror, and returned artifacts that didn’t match internal expectations. Integrity checks saved them from a supply-chain incident, but productivity cratered.
They fixed it by being less clever: internal domains always used corporate DNS, and those corporate DNS servers were made reachable and fast from anywhere. They also tightened routing rules so internal subdomains never leaked to local resolvers.
Mini-story 3: The boring but correct practice that saved the day
A financial services firm had a full-tunnel policy for managed laptops. They also had a habit that looked like overkill: every VPN change required a rollback plan and a “smoke test list” executed from three networks (home fiber, phone hotspot, and a hostile guest Wi‑Fi).
During a routine VPN client update, a subtle MTU regression appeared on one ISP. Large HTTPS responses stalled; small ones worked. Developers could log in, but cloning large repos failed, and some internal web apps partially loaded and then hung.
The smoke tests caught it in staging with the exact symptom pattern: “works on hotspot, fails on fiber.” The team rolled back before broad impact, then re-tested with packet captures. The problem was MSS not being clamped after a client-side change.
No heroics. No all-nighter. Just a checklist, an ugly test matrix, and someone who had been burned before. That practice paid for itself in one avoided outage.
Fast diagnosis playbook: what to check first/second/third
This is the “stop guessing” sequence. Run it in order. Each step tells you where to look next.
First: identify whether the problem is routing scope or tunnel health
- Is the slow destination supposed to be inside the tunnel or not?
- Is the default route through VPN (full-tunnel) or local (split-tunnel)?
- Is DNS for that destination coming from corporate resolvers or local?
If traffic isn’t going where you think, every other measurement is noise.
Second: check MTU/MSS and “some sites hang” patterns
- Small pages load, large downloads stall: suspect MTU/PMTUD/MSS.
- Works on hotspot but not on home ISP: suspect MTU or ISP filtering.
Third: local CPU and crypto limits vs network limits
- High CPU on client during transfer: encryption overhead or kernel/network stack issues.
- Low CPU but low throughput: path/latency, packet loss, or gateway bottleneck.
Fourth: gateway capacity and middleboxes
- Check VPN gateway CPU, memory, conntrack, NIC drops.
- Check firewalls/proxies doing inspection or shaping.
Fifth: measure where latency is introduced
- Is latency in the VPN handshake? DNS? TCP setup? TLS handshake? Application layer?
- Use traceroute and packet captures to pick the layer to blame.
Hands-on tasks: commands, what the output means, and the decision you make
These are the tasks I actually run when someone says “VPN is slow” or “split tunneling fixed it.” Adjust interface names and addresses for your environment.
Task 1: Confirm the VPN interface and assigned address
cr0x@server:~$ ip -brief addr
lo UNKNOWN 127.0.0.1/8 ::1/128
eth0 UP 10.10.20.15/24 fe80::5054:ff:fe12:3456/64
wg0 UP 10.99.0.12/32
Meaning: wg0 exists and is up; the VPN address is 10.99.0.12/32.
Decision: If the VPN interface is missing or DOWN, stop. Fix client connectivity before tuning performance.
Task 2: Determine whether you’re full-tunnel or split-tunnel
cr0x@server:~$ ip route show default
default via 10.10.20.1 dev eth0 proto dhcp src 10.10.20.15 metric 100
Meaning: The default route is local via eth0. That’s split tunneling (at least for the IPv4 default route).
Decision: If you expected full-tunnel, your policy isn’t applied or is being overridden. Fix routing before chasing throughput.
Task 3: Check if corporate networks are routed to the VPN
cr0x@server:~$ ip route | grep -E '10\.50\.|172\.16\.|192\.168\.200'
10.50.0.0/16 dev wg0 proto static
192.168.200.0/24 dev wg0 proto static
Meaning: Only specific prefixes go to VPN. That’s split tunneling by design.
Decision: Confirm the prefix list matches reality. Missing a subnet looks like “VPN can’t reach app” and gets misdiagnosed as slowness.
Task 4: Confirm actual path selection to a destination
cr0x@server:~$ ip route get 10.50.12.34
10.50.12.34 dev wg0 src 10.99.0.12 uid 1000
cache
Meaning: Traffic to 10.50.12.34 will use wg0.
Decision: If it routes via eth0 or another interface, your split policy is wrong (or you have conflicting routes).
Task 5: Check DNS resolver in use (systemd-resolved example)
cr0x@server:~$ resolvectl status
Global
Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
resolv.conf mode: stub
Link 2 (eth0)
Current Scopes: DNS
Protocols: +DefaultRoute
Current DNS Server: 1.1.1.1
DNS Servers: 1.1.1.1 1.0.0.1
Link 3 (wg0)
Current Scopes: DNS
Protocols: -DefaultRoute
DNS Servers: 10.50.0.53
DNS Domain: corp.example
Meaning: External DNS goes to 1.1.1.1, internal domain corp.example goes to 10.50.0.53 via VPN.
Decision: If internal lookups are going to public resolvers, you have a DNS leak and likely intermittent resolution failures. Fix split DNS policy.
Task 6: Validate internal name resolution goes to corporate DNS
cr0x@server:~$ dig +short @10.50.0.53 intranet.corp.example
10.50.12.34
Meaning: Corporate DNS resolves the internal name.
Decision: If it times out, your VPN path to DNS is broken or filtered. Prioritize DNS reachability—apps will “feel slow” when DNS is slow.
Task 7: Spot DNS leakage by comparing resolvers
cr0x@server:~$ dig +time=1 +tries=1 @1.1.1.1 intranet.corp.example
; <<>> DiG 9.18.24-1 <<>> +time=1 +tries=1 @1.1.1.1 intranet.corp.example
;; connection timed out; no servers could be reached
Meaning: Public resolver doesn’t know the internal domain (good). If it did answer, you’ve likely published something you didn’t mean to.
Decision: If clients are querying public DNS for internal zones, fix routing of DNS or enforce resolver settings.
Task 8: Check MTU on interfaces
cr0x@server:~$ ip link show wg0
3: wg0: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1420 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/none
Meaning: WireGuard MTU is 1420, a common safe value.
Decision: If MTU is too high (e.g., 1500 over an encapsulated path), expect blackholes. Consider lowering MTU or enforcing MSS clamping.
Task 9: Reproduce MTU/PMTUD issues with ping “do not fragment”
cr0x@server:~$ ping -M do -s 1372 -c 3 10.50.12.34
PING 10.50.12.34 (10.50.12.34) 1372(1400) bytes of data.
1380 bytes from 10.50.12.34: icmp_seq=1 ttl=63 time=42.1 ms
1380 bytes from 10.50.12.34: icmp_seq=2 ttl=63 time=41.7 ms
1380 bytes from 10.50.12.34: icmp_seq=3 ttl=63 time=42.5 ms
--- 10.50.12.34 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
Meaning: Packets of this size make it through without fragmentation.
Decision: If you see “Frag needed” or timeouts at moderate sizes, tune MTU/MSS. This is often the hidden “VPN is slow” root cause.
Task 10: Check TCP MSS clamping rules (iptables example)
cr0x@server:~$ sudo iptables -t mangle -S | grep -i mss
-A FORWARD -o wg0 -p tcp -m tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
Meaning: MSS clamping is enabled for forwarded TCP into the tunnel.
Decision: If absent and you see MTU symptoms, add MSS clamping (or fix MTU). If present but ineffective, your traffic may not traverse this box.
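If you run nftables instead of iptables, the equivalent clamp looks roughly like this (a ruleset fragment; the table and chain names are placeholders to merge into your own config):

```
table inet mangle {
  chain forward {
    type filter hook forward priority mangle; policy accept;
    # Clamp MSS on TCP SYNs forwarded into the tunnel to the path MTU
    oifname "wg0" tcp flags syn tcp option maxseg size set rt mtu
  }
}
```

Same caveat as the iptables rule: it only helps if the traffic actually traverses this box's forward hook.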
Task 11: Measure raw throughput with iperf3 through the tunnel
cr0x@server:~$ iperf3 -c 10.50.12.34 -t 10
Connecting to host 10.50.12.34, port 5201
[ 5] local 10.99.0.12 port 53322 connected to 10.50.12.34 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 11.2 MBytes 94.0 Mbits/sec 0 1.12 MBytes
[ 5] 1.00-2.00 sec 10.8 MBytes 90.4 Mbits/sec 3 1.02 MBytes
[ 5] 2.00-3.00 sec 10.6 MBytes 88.8 Mbits/sec 1 1.05 MBytes
[ 5] 3.00-4.00 sec 10.9 MBytes 91.2 Mbits/sec 0 1.09 MBytes
[ 5] 4.00-5.00 sec 10.7 MBytes 89.6 Mbits/sec 2 1.01 MBytes
[ 5] 5.00-6.00 sec 10.8 MBytes 90.4 Mbits/sec 0 1.08 MBytes
[ 5] 6.00-7.00 sec 10.7 MBytes 89.6 Mbits/sec 1 1.03 MBytes
[ 5] 7.00-8.00 sec 10.9 MBytes 91.2 Mbits/sec 0 1.10 MBytes
[ 5] 8.00-9.00 sec 10.7 MBytes 89.6 Mbits/sec 2 1.00 MBytes
[ 5] 9.00-10.00 sec 10.8 MBytes 90.4 Mbits/sec 0 1.06 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 108 MBytes 90.6 Mbits/sec 12 sender
[ 5] 0.00-10.00 sec 107 MBytes 90.0 Mbits/sec receiver
Meaning: ~90 Mbit/s through the tunnel, some retransmits but not catastrophic.
Decision: If throughput is low and retransmits high, suspect loss/MTU/path. If throughput is capped with low loss, suspect CPU/crypto or shaping.
Task 12: Check client-side CPU pressure during transfer
cr0x@server:~$ mpstat -P ALL 1 3
Linux 6.5.0-21-generic (laptop) 02/04/2026 _x86_64_ (8 CPU)
02:11:30 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:11:31 PM all 18.21 0.00 6.12 0.11 0.00 1.33 0.00 0.00 0.00 74.23
02:11:32 PM all 19.02 0.00 6.41 0.00 0.00 1.58 0.00 0.00 0.00 72.99
02:11:33 PM all 17.88 0.00 6.05 0.00 0.00 1.49 0.00 0.00 0.00 74.58
Meaning: CPU isn’t pegged; the client likely isn’t crypto-bound.
Decision: If CPU is near 100% during transfer, faster ciphers/hardware offload or a different VPN protocol may matter more than routing policy.
Task 13: Identify packet loss and jitter to the VPN gateway
cr0x@server:~$ mtr -rwzbc 50 vpn-gw.corp.example
Start: 2026-02-04T14:12:10+0000
HOST: laptop Loss% Snt Last Avg Best Wrst StDev
1.|-- 10.10.20.1 0.0% 50 1.2 1.3 0.9 3.9 0.5
2.|-- 100.64.0.1 0.0% 50 6.7 6.9 5.8 11.2 1.0
3.|-- 203.0.113.10 0.0% 50 18.0 18.3 16.7 25.9 1.9
4.|-- vpn-gw.corp.example 0.0% 50 42.8 43.1 40.9 52.0 2.1
Meaning: No loss, stable latency. Good underlay.
Decision: If loss appears before the gateway, tuning the VPN won’t fix the underlay. Consider alternate networks, ISP escalation, or multi-region gateways.
Task 14: Confirm whether your public IP changes with VPN (quick full-tunnel check)
cr0x@server:~$ curl -4 ifconfig.me
198.51.100.44
Meaning: This is your current IPv4 egress address. In full-tunnel, it should match corporate egress; in split tunnel, it’ll match the local ISP.
Decision: If policy requires corporate egress and you see an ISP address, you’re not full-tunneled (or you’re leaking).
Task 15: Check WireGuard handshake and transfer counters
cr0x@server:~$ sudo wg show
interface: wg0
public key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
private key: (hidden)
listening port: 51820
peer: YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY=
endpoint: 203.0.113.50:51820
allowed ips: 10.50.0.0/16, 192.168.200.0/24
latest handshake: 23 seconds ago
transfer: 1.92 GiB received, 336.45 MiB sent
persistent keepalive: every 25 seconds
Meaning: Handshake is recent; the tunnel is alive; routes are split by allowed ips.
Decision: If handshake is old and traffic stalls, suspect NAT timeouts, blocked UDP, or roaming issues. Consider keepalive or TCP-based VPN as fallback.
Task 16: Inspect conntrack pressure on a Linux VPN gateway
cr0x@server:~$ sudo sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_count = 248912
net.netfilter.nf_conntrack_max = 262144
Meaning: Conntrack is near max; new connections may drop or stall. Users will call it “slow VPN.”
Decision: Increase conntrack max (with RAM considerations), reduce full-tunnel egress load, or scale out gateways.
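A crude headroom check you can drop into monitoring (a sketch; the numbers are hard-coded from the output above, and the 90% threshold is an assumption to tune, not doctrine):

```shell
# Warn before the conntrack table fills: new flows start failing near the
# ceiling, and users report it as "slow VPN", not "dropped connections".
count=248912   # in practice: $(sysctl -n net.netfilter.nf_conntrack_count)
max=262144     # in practice: $(sysctl -n net.netfilter.nf_conntrack_max)
pct=$((count * 100 / max))
echo "conntrack table ${pct}% full (${count}/${max})"
if [ "$pct" -ge 90 ]; then
  echo "WARNING: raise nf_conntrack_max (mind RAM) or shed egress load"
fi
```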
Joke #2: VPN performance tuning is the art of moving packets faster by moving fewer packets. Accountants call it “cost optimization.”
Common mistakes: symptoms → root cause → fix
1) “VPN is slow, but only for some websites”
Symptom: Small pages load; larger assets hang; downloads stall; Slack works but your internal dashboard doesn’t fully render.
Root cause: MTU/PMTUD blackholing due to encapsulation, or missing MSS clamping.
Fix: Lower tunnel MTU and/or clamp MSS on the path that forwards into the tunnel. Verify with DF pings and a packet capture.
2) “Turning on split tunneling fixed everything”
Symptom: Performance improves dramatically when split tunneling is enabled.
Root cause: Full-tunnel path is hairpinning through distant egress, or overloaded egress/proxy stack.
Fix: Add regional VPN gateways/egress, bypass heavy inspection for low-risk destinations where acceptable, or use per-app VPN instead of all-or-nothing.
3) “Internal app is slow only on VPN; ping is fine”
Symptom: ICMP latency looks normal; the app is sluggish, especially during login or API calls.
Root cause: DNS latency, TLS handshake delays, proxy auth loops, or packet loss affecting TCP more than ICMP.
Fix: Measure DNS query time, TCP connect time, and TLS handshake time. Don’t use ping as your performance religion.
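To split that out per request, curl's `-w` timing variables do the job (a sketch; the function name is mine, and the URL you pass is whatever internal app is "slow"):

```shell
# Print where one HTTPS request spends its time: DNS lookup, TCP connect,
# TLS handshake, time-to-first-byte, and total.
time_url() {
  curl -s -o /dev/null \
    -w 'dns=%{time_namelookup}s tcp=%{time_connect}s tls=%{time_appconnect}s ttfb=%{time_starttransfer}s total=%{time_total}s\n' \
    "$1"
}
```

Run it against the same URL on and off VPN. If `dns=` dominates, chase resolvers; if `tls=` or `ttfb=` dominates, look at middleboxes and the app, not the tunnel.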
4) “After enabling split tunneling, we can’t reach some internal subnets”
Symptom: Only parts of corporate network are reachable; some hosts time out.
Root cause: Missing routes/allowed IPs, overlapping RFC1918 ranges with home networks, or conflicting route metrics.
Fix: Add explicit routes, avoid overlapping address space where possible, or use NAT/translation for remote access. Verify with ip route get.
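A crude way to catch the overlap case before users do (a sketch: exact string match only, with illustrative subnets; real overlap detection needs prefix math, and the values should come from `ip -brief addr` and `ip route`):

```shell
# Flag VPN-routed prefixes that are identical to the local LAN prefix.
local_net="192.168.1.0/24"
vpn_nets="10.50.0.0/16 192.168.1.0/24"
for net in $vpn_nets; do
  if [ "$net" = "$local_net" ]; then
    echo "OVERLap: $net is both the local LAN and a VPN route" | tr 'a-z' 'A-Z'
  fi
done
```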
5) “We enabled full-tunnel for security and now Zoom is unusable”
Symptom: Video/audio jitter, packet loss, poor call quality.
Root cause: Real-time traffic forced through distant egress and inspected middleboxes; DSCP markings may be lost; NAT state overload.
Fix: Use split tunneling for approved real-time services (if policy allows), or deploy local breakout/SD-WAN style egress closer to users. Prioritize UDP paths.
6) “Users are ‘on VPN’ but internal DNS randomly fails”
Symptom: Internal names sometimes resolve, sometimes time out; switching networks “fixes” it.
Root cause: Split DNS misconfiguration, DNS servers only reachable through a specific tunnel path, or resolver selection flapping.
Fix: Make corporate DNS reachable and redundant, pin internal zones to corporate resolvers, and set clear routing rules for DNS traffic.
7) “Security wants full-tunnel, networking wants split tunnel, nobody is happy”
Symptom: Endless debate, no decision, lots of exceptions.
Root cause: You’re trying to use a network tunnel as a universal security control.
Fix: Move controls up the stack: device posture, identity-aware proxies, app-level access, and segmentation. Then choose split/full based on measurable requirements, not vibes.
Checklists / step-by-step plan
Decision checklist: should you allow split tunneling?
- Compliance: Do you have requirements that mandate corporate egress for all traffic? If yes, default to full-tunnel and engineer for it.
- Device control: Are endpoints managed with enforceable host firewall and posture checks? If no, be very cautious with split tunneling.
- Internal segmentation: If a client is compromised, can it reach “everything” over VPN? If yes, you’re relying on the tunnel as a moat. Fix segmentation first.
- Monitoring strategy: Can IR succeed with identity + endpoint telemetry if network egress visibility is reduced? If no, don’t pretend split tunneling is harmless.
- Topology: Do you have regional gateways/egress? If no, your full-tunnel experience will be slow for distant users; either invest or accept split tunneling with safeguards.
Performance checklist: make full-tunnel not miserable
- Place gateways close to users (regionally) and keep failover tested.
- Right-size bandwidth, conntrack/NAT state, and CPU for crypto.
- Fix MTU/MSS deliberately; don’t “hope PMTUD works.”
- Make DNS fast, redundant, and reachable; measure it.
- Decide what gets inspected and what doesn’t; document exceptions as policy, not tribal knowledge.
- Measure user experience with synthetic tests from typical last-mile networks.
Security checklist: if you do split tunnel, do it like you mean it
- Default-deny inbound on endpoints; block local subnet inbound unless required.
- Use per-app access or identity-aware proxies for critical apps.
- Enforce MFA and step-up auth for sensitive operations.
- Shorten credential lifetimes; reduce reliance on long-lived sessions.
- Harden DNS: internal zones to corporate resolvers; prevent internal query leakage.
- Instrument endpoints and identity; assume less network visibility.
- Restrict VPN routes to what’s necessary; “entire RFC1918” is lazy and dangerous.
Change plan: rolling out split tunneling without chaos
- Define scope: Which traffic is excluded (internet default route? specific SaaS? real-time media?) and why.
- Model threats: Dual-homing risks, data exfil paths, DNS leaks, and responder visibility gaps.
- Pilot: Start with a controlled group on managed devices; collect performance and security telemetry.
- Guardrails: Enforce host firewall rules, posture checks, and internal segmentation before broad rollout.
- Measure: Use iperf3, DNS latency, and app-level synthetic checks pre/post.
- Document: Update IR runbooks: where logs exist and where they don’t.
- Rollback: Keep a one-click path back to full-tunnel for incidents.
The engineering quote (because this is really an operations problem)
Paraphrased idea — Werner Vogels: “Everything fails, all the time.”
Split tunneling is a failure-mode amplifier: it changes where failures happen and which teams can see them. If you adopt it, update your observability and incident muscle memory accordingly.
FAQ
1) What exactly is the “one setting” that makes VPN fast and insecure?
Split tunneling. It improves perceived speed by letting non-corporate traffic bypass the VPN path and corporate egress controls. That also reduces centralized visibility and can introduce dual-homing risks.
2) Is split tunneling always insecure?
No. It’s insecure relative to a model that assumes “VPN equals trusted, monitored network.” With strong endpoint security, segmentation, and identity-based access, split tunneling can be an acceptable trade.
3) Why does full-tunnel slow down SaaS so much?
Usually topology and middleboxes: hairpin routing through distant gateways, plus proxies/DLP/inspection stacks. Crypto overhead is rarely the main culprit on modern hardware.
4) Can I keep full-tunnel but improve performance?
Yes: regional gateways, local breakout with policy enforcement, capacity for NAT/conntrack, correct MTU/MSS, and fast DNS. Full-tunnel isn’t doomed; it’s just often underbuilt.
5) What’s the single fastest way to confirm whether traffic is going through the VPN?
Check the default route and a route lookup to the target with ip route get. For internet egress, compare public IP while connected.
6) What’s the most common “VPN slow” bug that isn’t actually bandwidth?
MTU/MSS problems. They create stalls and retransmits that feel like slowness and only appear for certain sites or payload sizes.
7) Does split tunneling cause DNS leaks?
It can, especially with split DNS setups that aren’t pinned properly. If internal queries go to local resolvers, you leak metadata and you get flaky app behavior.
8) Is WireGuard inherently faster than OpenVPN?
Often, yes—simpler design, kernel integration on many platforms, and modern crypto. But your bottleneck may still be underlay loss, gateway capacity, or egress inspection. A faster tunnel doesn’t fix a slow path.
9) How do overlapping home networks break split tunneling?
If corporate uses the same private ranges as home (common with 192.168.1.0/24), routes may prefer the local network and “internal” traffic never enters the VPN. Fix with non-overlapping addressing, NAT, or explicit routing/translation strategies.
10) What should we log if split tunneling reduces network visibility?
Identity events, device posture signals, VPN session metadata, and application access logs. Treat the endpoint and the app as the new perimeter—because they are.
Practical next steps
If you’re deciding between “fast” and “secure,” you’re already losing time. Do this instead:
- Measure the current bottleneck using the fast diagnosis playbook. Prove whether the issue is routing, MTU, loss, gateway capacity, or inspection.
- If you need full-tunnel, engineer it: regional gateways, correct MTU/MSS, enough conntrack/NAT state, and DNS that doesn’t wobble under load.
- If you choose split tunneling, change the security model: hardened endpoints, segmentation, and identity-based controls. Update incident response runbooks to match reality.
- Write down the policy in plain language: what bypasses the tunnel, what never does, and what the exceptions are. Then test it on real networks, not just the office Wi‑Fi.
The tunnel isn’t your product. Reliable access is. Split tunneling can be a tool, but it’s not a free lunch—more like a lunch that comes with a surprise audit.