Office VPN + VLAN: connect segments safely without flattening the network

The office is segmented. The VPN lands users “inside.” Suddenly the segmentation is… aspirational.
You thought you were giving accounting access to the finance app; you actually gave them the best seat in the house for scanning printers, hypervisors, and anything else with an IP.

This is the quiet failure mode of office networks: a VPN that acts like a giant layer‑2 extension cord.
It works great—right up to the moment you need it to be safe, auditable, and predictable. Then it’s a crime scene with nice uptime.

The mental model: VPN is a router, not a magic hallway

VLANs are about separating broadcast domains and creating administrative boundaries. VPNs are about transporting traffic across an untrusted or inconvenient network.
Combine them badly and you don’t get “secure remote access.” You get “one more path to everything,” with fewer eyeballs on it.

If you remember one thing: a remote-access VPN client is a host on a new interface. It needs routing, DNS, and firewall policy just like any other host.
Treating it like “inside” is how you flatten the network without even noticing.

What “flattening” usually looks like

  • VPN clients get an address in a “trusted” subnet that can route to all VLANs.
  • Firewall rules say “VPN → LAN allow” because a ticket demanded “make it work.”
  • Split tunneling is disabled “for security,” so every remote user’s internet traffic hairpins through your office.
  • No one can answer “which VPN user accessed which VLAN service” without guessing.

The goal isn’t to make VPN users miserable. The goal is to make their access deliberate: specific networks, specific ports, specific identities, with logs that stand up in daylight.

A few facts (and some history) that change decisions

  1. VLAN tagging was standardized by IEEE 802.1Q in 1998 as a way to do logical segmentation on shared switching infrastructure. VLANs were never meant to be an access control system by themselves.
  2. IPsec was designed for hostile networks, but enterprise deployments often use “allow any” policies because negotiating “what should be allowed” is socially harder than cryptography.
  3. Early remote-access VPNs often bridged layer 2 (TAP devices, “bridge mode”), which made Windows browsing and legacy protocols happy—and security teams unhappy later.
  4. NAT became the accidental standard for home networks, which is why VPNs so often fight overlapping RFC1918 address space in 2025.
  5. Split tunneling predates modern “Zero Trust” by decades. It’s not automatically insecure; it’s insecure when you don’t control device posture and egress policy.
  6. Network segmentation historically came from reliability needs as much as security: limiting broadcast storms and isolating failures was a practical driver long before ransomware was on every slide deck.
  7. 802.1X NAC (port-based authentication) tried to make “who is on this port?” a first-class question. Many orgs still don’t deploy it because it demands operational discipline.
  8. “VPN inside the perimeter” thinking is a relic from when “inside” meant “employees on corporate desktops.” BYOD and SaaS turned that into folklore.

The punchline: VLANs are boundaries, VPNs are transport. Security happens in routing policy, firewall rules, identity, and logging. Not in vibes.

Design goals that stop “flat network by accident”

1) Explicit reachability: least routes, not most hope

Your VPN should advertise only what a given user group needs. If the client can route to every VLAN, someone will eventually try. Sometimes by accident, sometimes not.
A clean design uses per-group route pushes, or per-peer AllowedIPs (WireGuard), or at least per-address-pool firewalling.
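
For a sense of what “least routes” looks like in practice, here is a minimal WireGuard sketch using the addresses from the tasks below and placeholder keys: on the client, AllowedIPs decides which destinations are routed into the tunnel; on the gateway, the per-peer AllowedIPs pins which source address that peer may use.

# Client profile (finance group), e.g. /etc/wireguard/wg0.conf on the laptop
[Interface]
Address = 10.90.0.23/32
PrivateKey = <client-private-key>

[Peer]
PublicKey = <gateway-public-key>
Endpoint = 203.0.113.10:51820
AllowedIPs = 10.20.10.0/24   # only the finance subnet is routed via the tunnel

# Matching peer entry on the gateway: accept only this client's /32 as a source
[Peer]
PublicKey = <client-public-key>
AllowedIPs = 10.90.0.23/32

Note the asymmetry: the gateway-side entry does not by itself stop the client from reaching other subnets; that job still belongs to the forward firewall policy.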

2) Enforce segmentation at layer 3/4 (and sometimes 7)

VLANs keep the switching sane. Firewalls keep the business sane. Put the enforcement where you can log it, review it, and test it: at the inter-VLAN boundary and at the VPN ingress.
“But the switch supports ACLs” is true—and also how you end up with rules no one can audit.

3) Don’t make VPN a transit for the whole internet unless you mean it

Full-tunnel VPN is sometimes required (regulated environments, strict egress control, hostile travel). But it’s an expensive choice:
bandwidth, latency, office firewall load, and the inevitable “Zoom is laggy” tickets.
If you do full-tunnel, design for it: capacity, QoS, and clear egress controls.

4) Make identity visible in logs

“10.9.0.23 accessed 10.20.30.40” is weak sauce. You want “alice@corp on device X accessed finance-api on port 443.”
That requires VPN authentication tied to identity (SAML/OIDC/RADIUS) and logs that include the user, assigned IP, and session times.

5) Make overlap and growth boring

VLAN sprawl happens. So does M&A. Plan your address space so you can add segments without readdressing, and avoid overlapping with common home networks.
If your office LAN is 192.168.0.0/16, you deserve the future pain you’re about to have.

Joke #1: “We’ll just allow VPN to LAN for now” is like “I’ll just juggle chainsaws until the auditor leaves.” It works right up to the moment it doesn’t.

Recommended architectures (and when to use them)

Pattern A: VPN lands in a dedicated “VPN users” VLAN + strict routing

This is the workhorse. VPN users get an IP pool that maps to a dedicated VLAN or routed interface (doesn’t have to be a VLAN, but usually is).
Inter-VLAN routing from that segment is controlled by a firewall policy.

When to use: Most offices, especially when you have multiple internal VLANs (corp, servers, VoIP, printers, IoT, guest).

Why it’s good: One chokepoint. One place to log. You can apply different policy to VPN users than to on-prem clients in the same “corp” VLAN.

Failure mode: Someone gets impatient and adds “VPN → any allow” because an internal service used a weird ephemeral port range and nobody wanted to fix the app.

Pattern B: Per-application access via reverse proxy / bastion (no general VLAN access)

Not everything needs network-layer access. Internal web apps can be published behind an identity-aware proxy. SSH can go via a bastion with short-lived certs.
RDP can be brokered.

When to use: Highly regulated environments, remote-first orgs, or when the “internal network” is mostly legacy and you don’t trust it.

Why it’s good: You can stop pretending that “network access” equals “app access.” It doesn’t.

Failure mode: Teams bypass it by exposing random ports “temporarily.” Temporary is the longest time unit in IT.

Pattern C: Site-to-site VPN between offices with VLAN-aware routing

Site-to-site is a different animal: two routing domains, usually stable endpoints, and pressure to make everything talk to everything.
Don’t. Route only the needed subnets between sites, and keep inter-site firewall rules as strict as you can tolerate.

When to use: Branch offices, warehouses, retail sites, plants.

Failure mode: Overlapping subnets (both sites use 10.0.0.0/24) and someone “fixes” it with NAT that later breaks Kerberos, logging, and troubleshooting.

Pattern D: VRF-based segmentation + VPN per VRF

If you have a real network team and gear that supports it, VRFs are the grown-up version of “lots of VLANs.”
You can run separate routing tables (corp, OT, guest) and attach VPN termination to the correct VRF with tightly scoped route leaking.

When to use: Medium to large environments, or anywhere with OT/ICS networks that must not mingle.

Failure mode: Route leaking becomes “just one more exception,” and suddenly your VRFs are a museum of past compromises.

Routing + firewalling: the rules that actually matter

Decide where inter-VLAN routing happens

Pick one: the core switch (SVIs), or a firewall/router. If you route on the switch, you still need enforcement somewhere.
The cleanest operational model is: switch does layer 2/3 forwarding, firewall enforces policy at the boundaries (either via routed links, ACLs, or “firewall-on-a-stick” designs).

My bias: if you care about segmentation as security, enforce it on a firewall with logging. Switch ACLs are fine for guardrails, not for your main safety story.

VPN ingress policy: treat it like a hostile VLAN

Remote devices are variable. Even corporate-managed laptops spend time on hotel Wi‑Fi, home networks, and coffee shops with captive portals designed by chaos.
So your VPN subnet should be treated closer to “semi-trusted” than “corp LAN.”

  • Allow VPN → DNS (your resolvers), NTP, and required internal apps.
  • Block VPN → management networks (switch, firewall, hypervisor mgmt) by default.
  • Block VPN → user VLANs unless you have a justified use case (helpdesk tools, etc.).
  • Log denies for early warning, then tune to reduce noise. (A minimal ruleset sketch follows this list.)
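
A minimal, standalone nftables sketch of that shape. It assumes an illustrative 10.99.0.0/24 management network and an internal resolver/NTP host at 10.20.0.53, and it is not meant to be layered on top of the ruleset shown in Task 5:

table inet vpn_edge {
  chain forward {
    type filter hook forward priority 0; policy drop;
    ct state established,related accept
    # management stays unreachable, and probes get logged
    iif "wg0" ip daddr 10.99.0.0/24 log prefix "vpn-mgmt-drop " drop
    # only the services VPN users actually need
    iif "wg0" ip daddr 10.20.0.53 udp dport { 53, 123 } accept
    iif "wg0" ip daddr 10.20.0.50 tcp dport 443 accept
    # everything else from the VPN zone falls through to the drop policy
  }
}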

Route advertisement is a security control

Firewalls are your last line. Routing is your first line. If you push routes for every subnet to every client, you are inviting lateral movement.
In WireGuard, AllowedIPs is both “which routes the client installs” and “which source subnets the server accepts.” That’s powerful. Use it.

Be intentional about split tunneling

Split tunnel means only internal subnets go through VPN; everything else goes direct. The risk isn’t “the internet.” The risk is a compromised endpoint that can talk to both worlds.
You mitigate that with device posture checks, endpoint security, and restricting what the VPN can reach.

Full tunnel means everything goes through your office. The risk is that your office becomes everyone’s bottleneck and your logs become a privacy and compliance minefield.
Choose based on egress control requirements, not fear.

One quote, because it’s still true

Hope is not a strategy. — paraphrased idea often attributed to engineers and operators in reliability circles

(No, I’m not betting production security on the exact provenance of a slogan. Neither should you.)

DNS, identity, and why IP-only thinking fails

DNS: make it boring and consistent

VPN users need name resolution that matches on-prem reality, or you’ll end up with shadow host files and mystery IPs in bookmarks.
Use split DNS: internal domains resolve via internal resolvers; public names via normal resolvers (or your controlled egress if full-tunnel).

Also: decide if VPN clients can resolve internal names that they can’t reach. In strict environments, you may intentionally prevent that to reduce reconnaissance.
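
On Linux clients that use systemd-resolved, one way to wire that up is to scope the internal domain to the VPN link (a sketch; the resolver and domain are the ones from the example environment):

cr0x@server:~$ resolvectl dns wg0 10.20.0.53
cr0x@server:~$ resolvectl domain wg0 '~corp.example'

The ~ prefix makes corp.example a routing-only domain: queries under it go to 10.20.0.53 via wg0, while everything else keeps using the client's normal resolvers. If you manage clients with wg-quick, a DNS= line in the client config sets the same servers persistently.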

Identity: bind access to who, not just where

VLAN segmentation is “where.” VPN adds “how you got here.” You still need “who.”
If your VPN authentication is just a shared PSK, you have a time bomb. Use per-user auth with MFA, short-lived certs/keys, and group-based authorization.

Observability: logs that answer security questions

You want to answer:

  • Which user had IP 10.90.0.14 at 14:32?
  • What internal destinations did that session talk to?
  • Were they denied by policy, and why?
  • Was this traffic normal for their role?

If your current stack can’t answer those without a manual scavenger hunt, fix that before you expand VPN reach.

Practical tasks: commands, outputs, and decisions (12+)

These are real checks you can run during design and incident response. Each one includes: command, what the output means, and the decision you make.
Assume Linux-based VPN gateway/firewall where relevant; adapt to your platform.

Task 1: Confirm the VPN interface and assigned address

cr0x@server:~$ ip -br addr show
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             203.0.113.10/24
vlan20@eth1      UP             10.20.0.1/24
vlan30@eth1      UP             10.30.0.1/24
wg0              UP             10.90.0.1/24

Meaning: wg0 is your VPN interface; the VPN client pool is 10.90.0.0/24.

Decision: Treat 10.90.0.0/24 as its own segment in firewall policy. Don’t lump it into “LAN.”

Task 2: Verify which routes VPN clients are expected to reach

cr0x@server:~$ ip route show
default via 203.0.113.1 dev eth0
10.20.0.0/24 dev vlan20 proto kernel scope link src 10.20.0.1
10.30.0.0/24 dev vlan30 proto kernel scope link src 10.30.0.1
10.90.0.0/24 dev wg0 proto kernel scope link src 10.90.0.1

Meaning: The gateway routes between VLAN20, VLAN30, and the VPN subnet.

Decision: If you don’t intend VPN clients to reach VLAN30, don’t rely on “we’ll block it later.” Remove route pushes to clients and add firewall denies.

Task 3: Check WireGuard peers and AllowedIPs (route scope)

cr0x@server:~$ sudo wg show
interface: wg0
  public key: Jkq...redacted...
  listening port: 51820

peer: w9h...redacted...
  preshared key: (hidden)
  endpoint: 198.51.100.25:53644
  allowed ips: 10.90.0.23/32, 10.20.10.0/24
  latest handshake: 1 minute, 12 seconds ago
  transfer: 1.23 GiB received, 5.44 GiB sent

Meaning: On the gateway, allowed ips is a source filter plus a return-route table: this peer may send traffic sourced from 10.90.0.23, and the gateway will route anything destined for 10.20.10.0/24 back to this peer (which only makes sense if the peer acts as a gateway for that subnet).

Decision: Tighten allowed ips to only what they need. Avoid giving a peer broad subnets “for convenience.”
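
If you need to tighten a live peer, wg can do it in place; note that allowed-ips is replaced, not appended, so update the persistent wg-quick config to match (the key is the redacted one from the output above):

cr0x@server:~$ sudo wg set wg0 peer w9h...redacted... allowed-ips 10.90.0.23/32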

Task 4: Confirm IP forwarding is enabled on the gateway

cr0x@server:~$ sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1

Meaning: The system will route between interfaces.

Decision: If you expected the gateway to be a strict endpoint (no routing), disable forwarding and redesign. If you expected routing, keep it on and enforce firewall policy.

Task 5: Inspect firewall default policy (nftables)

cr0x@server:~$ sudo nft list ruleset | sed -n '1,120p'
table inet filter {
  chain input {
    type filter hook input priority 0; policy drop;
    ct state established,related accept
    iif "lo" accept
    iif "eth0" tcp dport 22 accept
    iif "eth0" udp dport 51820 accept
  }

  chain forward {
    type filter hook forward priority 0; policy drop;
    ct state established,related accept
    iif "wg0" oif "vlan20" tcp dport { 443, 5432 } accept
    iif "wg0" oif "vlan20" udp dport 53 accept
    iif "wg0" oif "vlan30" drop
  }
}

Meaning: Default drop. VPN can reach specific TCP ports on VLAN20, can do DNS, and is explicitly blocked from VLAN30.

Decision: This is the shape you want: explicit allowlists, explicit denylists where needed, and a default deny that you can explain.

Task 6: Confirm the VPN isn’t accidentally NATing into “trusted” LAN

cr0x@server:~$ sudo nft list table ip nat
table ip nat {
  chain postrouting {
    type nat hook postrouting priority 100; policy accept;
    oif "eth0" masquerade
  }
}

Meaning: Only internet-bound traffic is NATed out eth0. Internal VLAN traffic is not masqueraded.

Decision: Avoid NAT between VPN and internal VLANs unless you are forced into it by overlapping subnets. NAT hides identity and complicates logging.

Task 7: Validate that inter-VLAN access is actually blocked (from a VPN client)

cr0x@server:~$ nc -vz 10.30.0.10 22
nc: connect to 10.30.0.10 port 22 (tcp) failed: Operation timed out

Meaning: The connection doesn’t succeed; likely blocked by firewall policy (or routing/host down).

Decision: Confirm in firewall logs. If you see “drop wg0→vlan30,” you’re enforcing segmentation correctly.

Task 8: Verify allowed app access is working (and not wider than needed)

cr0x@server:~$ nc -vz 10.20.0.50 5432
Connection to 10.20.0.50 5432 port [tcp/postgresql] succeeded!

Meaning: VPN client can reach the Postgres service on the approved port.

Decision: If this is the only required access, keep it this narrow. If the app needs more ports, document them and consider an app-layer proxy instead of widening the network.

Task 9: Check path MTU and fragmentation risk (classic VPN throughput killer)

cr0x@server:~$ ping -M do -s 1360 10.90.0.1 -c 3
PING 10.90.0.1 (10.90.0.1) 1360(1388) bytes of data.
1368 bytes from 10.90.0.1: icmp_seq=1 ttl=64 time=22.1 ms
1368 bytes from 10.90.0.1: icmp_seq=2 ttl=64 time=21.7 ms
1368 bytes from 10.90.0.1: icmp_seq=3 ttl=64 time=21.9 ms

--- 10.90.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms

Meaning: A 1360-byte payload with DF set succeeds. That suggests your effective MTU is at least ~1388 on that path.

Decision: If this fails, lower the VPN interface MTU (or fix PMTUD). Don’t “optimize” by ignoring MTU; it will optimize you into a ticket storm.
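
If you do need to clamp it, there are two usual knobs; the value here is illustrative, so measure first as above:

cr0x@server:~$ sudo ip link set dev wg0 mtu 1380   # immediate, but not persistent

# Persistent with wg-quick: set it in the [Interface] section of wg0.conf
[Interface]
MTU = 1380

WireGuard overhead is 60 bytes over IPv4 (80 over IPv6), which is why 1420 is the common default on a clean 1500-byte path; derive your number from measurement, not folklore.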

Task 10: Capture traffic on the VPN interface to see what’s really happening

cr0x@server:~$ sudo tcpdump -ni wg0 host 10.90.0.23 and tcp port 5432 -c 5
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wg0, link-type RAW (Raw IP), snapshot length 262144 bytes
12:44:01.121901 IP 10.90.0.23.49212 > 10.20.0.50.5432: Flags [S], seq 381909112, win 64240, options [mss 1360,sackOK,TS val 101 ecr 0,nop,wscale 7], length 0
12:44:01.138221 IP 10.20.0.50.5432 > 10.90.0.23.49212: Flags [S.], seq 21188218, ack 381909113, win 65160, options [mss 1460,sackOK,TS val 55 ecr 101,nop,wscale 7], length 0
12:44:01.138309 IP 10.90.0.23.49212 > 10.20.0.50.5432: Flags [.], ack 1, win 502, options [nop,nop,TS val 102 ecr 55], length 0

Meaning: SYN/SYN-ACK/ACK completes. Network path works; if the app is still “down,” it’s likely app auth, DNS, or user error—not routing.

Decision: Stop changing firewall rules. Escalate to app/service owners with proof.

Task 11: Confirm DNS resolution path for internal names

cr0x@server:~$ resolvectl status | sed -n '1,120p'
Global
       Protocols: -LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
 resolv.conf mode: stub

Link 3 (wg0)
    Current Scopes: DNS
         Protocols: +DefaultRoute
Current DNS Server: 10.20.0.53
       DNS Servers: 10.20.0.53
        DNS Domain: corp.example

Meaning: DNS for VPN interface points to internal resolver and search domain is set.

Decision: If VPN users can’t resolve *.corp.example, fix DHCP options for the VPN, or the VPN client config push. Don’t tell users to “just use IPs.” That’s how mistakes graduate into outages.

Task 12: Check which connections are active and whether they’re stuck

cr0x@server:~$ ss -tnp | sed -n '1,12p'
State   Recv-Q  Send-Q   Local Address:Port    Peer Address:Port  Process
ESTAB   0       0        10.20.0.50:5432       10.90.0.23:49212   users:(("postgres",pid=2214,fd=7))
SYN-RECV 0      0        10.20.0.50:443        10.90.0.23:51044   users:(("nginx",pid=1893,fd=15))

Meaning: The established DB session is healthy. The SYN-RECV entry means the server sent a SYN-ACK that never got ACKed, which usually points to asymmetric return routing or a firewall dropping one direction.

Decision: Investigate path symmetry and firewall state tracking; check return routes from 10.20.0.50 back to 10.90.0.0/24.

Task 13: Verify return routing from an internal server VLAN

cr0x@server:~$ ip route get 10.90.0.23
10.90.0.23 via 10.20.0.1 dev eth0 src 10.20.0.50 uid 0
    cache

Meaning: Run on the internal server (10.20.0.50), this shows replies to the VPN pool leaving via the gateway at 10.20.0.1, which can route to wg0.

Decision: If this instead points to a different router, you have asymmetric routing. Fix the default gateway or add a static route. Don’t “fix” it with NAT unless you like debugging blindfolded.
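
The usual one-line fix, applied on the host or router whose return path is wrong (addresses from this example):

cr0x@server:~$ sudo ip route add 10.90.0.0/24 via 10.20.0.1

Make it persistent in whatever manages that box's network configuration, or it will quietly vanish at the next reboot and restart the mystery.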

Task 14: Check firewall counters to see what’s being dropped

cr0x@server:~$ sudo nft list chain inet filter forward
table inet filter {
  chain forward {
    type filter hook forward priority 0; policy drop;
    ct state established,related accept
    iif "wg0" oif "vlan20" tcp dport { 443, 5432 } accept
    iif "wg0" oif "vlan20" udp dport 53 accept
    iif "wg0" oif "vlan30" drop counter packets 1842 bytes 122004
  }
}

Meaning: Traffic from VPN to VLAN30 is being dropped, and it’s not hypothetical—there are counters.

Decision: Decide if this is expected (good: you’re preventing lateral movement) or indicates a misdocumented dependency (bad: app actually needs VLAN30). Either way, you now have evidence.

Task 15: Measure VPN gateway CPU and crypto bottlenecks

cr0x@server:~$ mpstat -P ALL 1 3
Linux 6.8.0 (vpn-gw)  12/28/2025  _x86_64_  (8 CPU)

01:02:11 PM  CPU  %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
01:02:12 PM  all  12.5  0.0  38.1   0.0    0.0  31.4    0.0    0.0    0.0   18.0
01:02:12 PM    3  10.0  0.0  44.0   0.0    0.0  42.0    0.0    0.0    0.0    4.0

Meaning: High %soft and %sys suggests packet processing is hot; one CPU is nearly pinned. This can cap throughput.

Decision: Consider enabling multiqueue/NIC tuning, moving VPN termination to stronger hardware, or using a kernel-based implementation. Don’t blame “the ISP” until you’ve looked here.

Task 16: Check conntrack table pressure (stateful firewall pain)

cr0x@server:~$ sudo sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max
net.netfilter.nf_conntrack_count = 183742
net.netfilter.nf_conntrack_max = 262144

Meaning: You’re at ~70% of conntrack capacity. Spikes can cause drops and weirdness.

Decision: Increase conntrack sizing and/or reduce unnecessary flows (especially if full-tunnel). If you’re full-tunneling everyone’s streaming, you’re paying the price in states.

Fast diagnosis playbook

When VPN + VLAN access “sort of works” and everyone starts proposing random firewall changes, do this instead.
The goal is to find the bottleneck in minutes, not after a week of folklore.

First: isolate the class of failure

  1. Is it reachability? Can the client ping the VPN gateway IP? Can it reach a known internal service IP on a known port?
  2. Is it name resolution? Does internal DNS resolve? Does it resolve to the correct IP for that user’s segment?
  3. Is it policy? Do firewall counters/logs show drops for that source IP and destination?
  4. Is it performance? Do small pings work but large transfers stall? Think MTU, CPU, conntrack, and asymmetric routes.

Second: check routing symmetry

  1. From the gateway: route to destination VLAN exists.
  2. From the destination VLAN: return route to VPN subnet exists via the same gateway.
  3. On firewalls: state tracking sees both directions on the same box.

Third: confirm the VPN client’s installed routes

Most “VPN can’t reach subnet X” tickets are “client never installed route to X” or “client has a more specific route to a local network.”
This is where overlapping RFC1918 bites.

Fourth: test MTU early

VPN MTU issues waste time because they masquerade as “random slowness” and “some sites work, others don’t.”
Run DF pings with increasing size until you find the ceiling, then set MTU intentionally.
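
A quick sweep finds the ceiling; sizes and output here are illustrative, and the target is the VPN gateway as in Task 9:

cr0x@server:~$ for s in 1472 1432 1392 1360 1320; do ping -M do -c1 -W1 -s $s 10.90.0.1 >/dev/null 2>&1 && echo "ok   $s" || echo "drop $s"; done
drop 1472
drop 1432
ok   1392
ok   1360
ok   1320

The largest payload that succeeds, plus 28 bytes of ICMP and IP headers, is your effective path MTU; set the tunnel MTU at or below that and stop guessing.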

Fifth: look at capacity on the termination point

High CPU softirq, conntrack near max, NIC ring drops, or a single-threaded user-space VPN process can all cap throughput.
If you full-tunnel, assume you are now running an ISP. Act accordingly.

Common mistakes: symptoms → root cause → fix

1) “VPN users can access everything on every VLAN”

Symptom: Security review finds VPN pool can reach printers, hypervisors, and random management IPs.

Root cause: VPN subnet treated as “trusted LAN” and allowed through inter-VLAN routing rules.

Fix: Put VPN users into a dedicated policy zone; switch to default-deny forward policy; allowlist only required ports and destinations.

2) “Some internal apps work, others hang or are painfully slow”

Symptom: SSH works, but file transfers stall; HTTPS loads partially; RDP freezes.

Root cause: MTU/PMTUD failure across VPN, often with encapsulation overhead and blocked ICMP fragmentation-needed.

Fix: Allow relevant ICMP types; set VPN interface MTU lower; test with DF pings; avoid double-encapsulation where possible.
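
If the firewall is the culprit, the minimum viable nftables addition looks something like this (a sketch for the forward chain, not a full policy):

# let "fragmentation needed" / "packet too big" reach the senders
icmp type destination-unreachable accept
icmpv6 type packet-too-big accept

If that feels too broad, clamping TCP MSS on the tunnel (tcp flags syn tcp option maxseg size set rt mtu) is the blunter but very effective alternative.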

3) “It worked yesterday; today only one VLAN is unreachable from VPN”

Symptom: VPN can reach server VLAN but not corp VLAN, or vice versa.

Root cause: Return path changed (new core switch route, VRRP move, static route removed). Asymmetric routing breaks stateful firewalls.

Fix: Ensure the destination VLAN’s default gateway returns via the VPN gateway/firewall; add explicit routes; avoid routing around the policy enforcement point.

4) “After enabling full-tunnel, the office internet became terrible”

Symptom: Everyone complains, and graphs show firewall CPU and session counts spiking.

Root cause: Full-tunnel turned your VPN concentrator into an egress gateway without capacity planning; conntrack and NAT tables get hammered.

Fix: Either move back to split tunnel with posture controls, or scale the concentrator and egress path; add QoS; control what can transit.

5) “VPN clients can’t reach internal subnets from home”

Symptom: Office subnets overlap with home router subnets; client routes point local instead of VPN.

Root cause: Address planning collision (e.g., office uses 192.168.1.0/24).

Fix: Use less common RFC1918 ranges for corporate networks; if you must, use per-client NAT on VPN as a last resort and document the tradeoffs.

6) “DNS resolves, but users hit the wrong service”

Symptom: Name resolves to an IP that’s reachable but not the intended environment (prod vs staging, office vs cloud).

Root cause: Split DNS misconfiguration; search domains pushing unexpected FQDN resolution; multiple internal zones with inconsistent records.

Fix: Standardize internal DNS zones; explicitly configure VPN DNS; audit search domains; log DNS queries from VPN pool for early detection.

7) “We added a static route and now something else broke”

Symptom: After route changes, weird reachability issues appear between VLANs.

Root cause: Route leaking between VRFs or accidental more-specific routes causing traffic to bypass firewall policy.

Fix: Keep routing simple; document route intent; use route filters; verify with traceroute and firewall counters.

Three corporate mini-stories from real life

Mini-story 1: The incident caused by a wrong assumption

A midsize company had “nice segmentation”: separate VLANs for users, servers, management, VoIP, and guest Wi‑Fi.
Remote workers connected via VPN, got an IP, and everything “just worked.” Which was celebrated as a success.

The wrong assumption was subtle: “VPN users are basically on the user VLAN.” They weren’t. They were on a VPN pool that the firewall treated as “LAN,” because someone copied an existing rule set years ago.
So VPN clients could route to the management VLAN, which existed mostly because auditors liked the diagram.

Then a contractor’s laptop got infected offsite. Nothing cinematic—just a commodity infostealer with network discovery baked in.
Once the laptop connected to VPN, it started probing. The management interface of a hypervisor answered. Then a switch. Then a backup appliance with a web UI.

The outcome wasn’t “instant ransomware apocalypse,” but it was still bad: credential reuse turned into unauthorized access attempts, and noisy scanning triggered IDS alerts that the team had never tested at that volume.
The incident response was dominated by one question: “Why can a VPN user even see this?”

Fixing it was almost boring: create a dedicated VPN zone, default deny forward, allowlist only necessary app ports, and add explicit denies to management.
The embarrassing part was the postmortem line: segmentation existed on paper, but VPN bypassed it entirely.

Mini-story 2: The optimization that backfired

Another organization decided split tunneling was “unsafe,” so they moved to full-tunnel remote access.
All internet traffic from remote laptops was forced through the office firewall, filtered, logged, and NATed out through a single ISP circuit.

The first week looked fine—because the rollout started with a small pilot group of disciplined engineers who mostly used SSH and internal tools.
The second week included the sales org, video conferencing, and a pile of browser tabs streaming everything from training videos to large CRM assets.

The firewall didn’t crash. It just became the world’s most expensive hourglass. CPU spiked in softirq, conntrack crept toward max, and latency became a daily conversation.
Users started toggling the VPN on and off to get work done, which destroyed the very control the security team wanted.

The backfire wasn’t that full-tunnel is always wrong. It’s that they treated it as a policy flip, not an architectural change.
They hadn’t planned capacity, didn’t shape traffic, and didn’t define what “remote internet” should be allowed to transit.

The eventual fix was split tunnel with strict internal allowlists plus device posture checks, and a separate “high-risk travel” profile that used full-tunnel when it was actually needed.
The big win was cultural: the organization stopped believing there was one VPN mode that solved everything.

Mini-story 3: The boring but correct practice that saved the day

A company with multiple VLANs and a remote access VPN did something unfashionable: they maintained a living matrix of “who can access what,” tied to firewall objects and VPN groups.
Every change request had to state the destination service, port, and business owner. No owner, no access. People complained. Quietly.

One morning, an internal server VLAN started seeing strange east-west traffic spikes. Not huge, but consistent.
Monitoring flagged new connection patterns from the VPN subnet into server ranges that were not part of any approved access list.

The on-call didn’t have to guess. Firewall logs mapped the source VPN IP to a user identity, the deny counters showed repeated blocks to a management subnet, and the access matrix showed there was no valid reason for that user to be there.
They disabled that user’s VPN access and rotated a handful of credentials as a precaution.

Later analysis suggested a compromised endpoint. The important part: segmentation and logging meant the blast radius was small and the detection was early.
The team didn’t need heroics. They needed the boring paperwork and the default-deny stance that everyone had rolled their eyes at.

Joke #2: The access matrix was so unpopular that it probably qualifies as a distributed denial-of-service attack on developer patience.

Checklists / step-by-step plan

Step 1: Draw the segments that matter (not the ones you wish existed)

  • List VLANs and subnets: users, servers, management, printers, IoT, guest, OT.
  • List where VPN terminates: firewall, router, Linux gateway, cloud.
  • Identify inter-VLAN routing point(s) and where policy enforcement lives.

Decision pressure: If routing happens in three places, you don’t have segmentation—you have a puzzle.

Step 2: Pick the VPN client address pool like you’ll live with it for years

  • Use a dedicated RFC1918 range that won’t overlap with common home networks.
  • Make it large enough for growth (don’t paint yourself into a /24 if you’ll have 800 users).
  • Document it as a security zone: “VPN-Users.”

Step 3: Define access by role and service

  • Group users by what they need: dev, finance, IT, vendors, support.
  • For each group: enumerate destinations (subnets/hosts) and ports.
  • Prefer service-level destinations (VIPs, load balancers) over whole subnets.

Step 4: Implement “least routes” plus “least firewall rules”

  • Push only the necessary routes to each group.
  • Default deny between VPN zone and internal zones.
  • Allowlist ports with justification and change control.
  • Explicitly block management networks and sensitive east-west by default.

Step 5: Make DNS and time sane

  • Set VPN DNS servers and internal search domains intentionally.
  • Ensure NTP is reachable; authentication and logs depend on time.
  • Decide split DNS vs full internal recursion depending on policy.

Step 6: Logging and auditing that won’t embarrass you later

  • Log VPN auth events with user identity and assigned IP.
  • Log firewall denies from VPN zone (at least initially).
  • Retain enough data to answer “who accessed what” for your compliance needs.

Step 7: Test like a pessimist

  • Positive tests: required apps reachable from VPN.
  • Negative tests: management VLAN not reachable; user VLAN not reachable (unless required).
  • Performance tests: MTU, throughput, concurrency, failover.

Step 8: Operate it without making every change a fire drill

  • Put firewall rules and VPN config in version control.
  • Have a rollback plan that doesn’t require “the one person who knows it.”
  • Use staged rollouts for policy changes.

FAQ

1) Should VPN clients be in the same VLAN/subnet as office users?

No. Give VPN clients their own routed subnet (or VLAN-equivalent zone) and apply stricter policy.
If you need “same experience,” solve it with DNS and allowed access, not with shared layer‑2 adjacency.

2) Is split tunneling insecure?

Split tunneling is a tradeoff. It reduces load and latency, but assumes endpoints are reasonably managed.
If you can’t trust endpoints, full-tunnel is not a magic fix—attackers still ride the endpoint. Use posture checks and tight internal allowlists either way.

3) Why not just allow VPN users to reach entire server VLANs?

Because you’re granting reconnaissance and lateral movement. It also increases blast radius for compromised endpoints.
If users need “an app,” allow that app (VIP, port). If they need “a whole subnet,” ask why, then make it time-bounded and logged.

4) What’s the cleanest way to block VPN access to management networks?

Put management in dedicated subnets/VLANs and add explicit denies from the VPN zone to those subnets.
Also don’t advertise routes to those subnets to VPN clients unless an admin profile needs them.

5) Do VLANs alone provide security segmentation?

VLANs provide separation at layer 2. Without controlled routing and firewall policy, they’re not security boundaries.
Assume any routed adjacency without policy is a “maybe later” breach path.

6) How do I handle overlapping IP space between remote users and office VLANs?

Best: avoid overlaps through sane corporate address planning (don’t use common home ranges).
If you inherit the problem, last resort is NAT for VPN clients or renumbering. NAT works but complicates identity, logging, and some protocols.

7) Why do I see SYN-RECV states when VPN users connect to internal services?

Often asymmetric routing: the server replies via a different gateway than the one that saw the SYN. Stateful firewalls then drop the return traffic.
Fix the return route to go back through the VPN gateway/firewall, or ensure the firewall sees both directions.

8) Should I route inter-VLAN traffic on the core switch or on the firewall?

For pure performance, switches are great. For security segmentation with audit trails, firewalls are better.
Many environments do both: switching for basic routing, firewall as the enforcement point with explicit zone policies.

9) How do I prove segmentation is working?

Run negative tests from VPN clients (attempt management subnet access, scan known blocked ports), verify firewall counters/logs show drops, and document expected denies.
Segmentation you can’t test is just an architectural bedtime story.

Conclusion: next steps that don’t melt your network

Office VPN plus VLAN segmentation is not hard because of technology. It’s hard because “make it work” is easy, and “make it safe, scoped, and supportable” is a discipline.
The good news: the discipline is mostly routing scope, default-deny policy, and logs you can actually use.

Next steps you can execute this week:

  • Create a dedicated VPN users subnet/zone and stop treating it as “LAN.”
  • Inventory which internal services remote users truly need (destinations + ports), then implement allowlists.
  • Remove broad route pushes; advertise only needed subnets per group.
  • Run the fast diagnosis playbook on one “known flaky” app and fix MTU/return routing issues before they become culture.
  • Turn on logging that maps VPN IPs to identities and keep it long enough to answer uncomfortable questions.

You don’t need a flat network to be productive. You need a network where access is intentional—and where failure modes are predictable, not mysterious.
