You can ping the storage. TCP/3260 is open. Discovery returns a target IQN. Then Proxmox (or iscsiadm) tries to log in and faceplants:
“login failed”, “authorization failure”, or the extra-annoying variant: login “succeeds” but you still get no LUNs and therefore no disk.
This is the storage equivalent of getting into the building lobby and finding every door upstairs locked. The target is reachable. The LUN is not.
The fix is rarely magic; it’s usually one missing mapping, a mismatched IQN, or a well-meant security setting that got promoted to “production outage”.
What “target reachable but no LUN” really means
iSCSI is two separate problems wearing one trench coat:
transport (can I reach the target portal and authenticate?) and
presentation (does the target actually present me a block device, i.e., a LUN?).
“Target reachable” means discovery works or at least the TCP connection to 3260 works. It does not mean you are allowed to see a LUN.
It often means you’ve successfully talked to the target daemon, but it has decided—correctly or incorrectly—that you should get zero devices.
The confusing part: initiators, targets, and GUIs will all describe “no LUN for you” differently:
- Login fails: target rejects the session (CHAP mismatch, IQN not allowed, wrong portal group, auth method mismatch).
- Login succeeds, no disk: session is up but LUN masking/ACL/mapping prevents any LUN from being exposed.
- Disk appears then vanishes: multipath misconfig, timeouts, path flapping, or ALUA/TPG weirdness.
- Proxmox storage “OK” but nothing usable: you added an iSCSI storage but forgot the second layer (LVM/LVM-thin/ZFS over it), or the LUN is read-only.
Your job is to decide which layer is lying to you. Then you fix that layer, not the entire stack “just in case”.
Fast diagnosis playbook (do this first)
If you’re on call, you want a path to truth that takes minutes, not a spiritual journey through GUIs.
Here’s the sequence that finds the bottleneck fastest.
1) Verify the target is actually reachable on the right IP and port
Check that you’re reaching the intended portal (correct VLAN, correct interface, no NAT surprise), and that 3260 is open.
If this fails, nothing else matters.
2) Discover targets from the Proxmox node with iscsiadm
If discovery fails, it’s DNS/routing/firewall/portal configuration. If discovery succeeds, move on.
3) Log in manually and check sessions
If login fails: CHAP, IQN ACL, authentication method, or portal group mismatch.
If login succeeds: inspect whether any SCSI devices were mapped.
4) Look for LUNs: SCSI scan + block devices
If you have an iSCSI session but no /dev/sdX (or no multipath device), the target isn’t presenting a LUN to this initiator.
That is almost always LUN masking, missing mapping, or an ACL entry that exists but points to the wrong initiator IQN.
5) Confirm Proxmox is configured for the correct storage type
Proxmox “iSCSI” storage alone is not where VM disks live unless you’re doing direct LUNs. Most deployments pair iSCSI with LVM or LVM-thin.
Don’t “fix” iSCSI when the real issue is you expected iSCSI to behave like NFS.
6) If multipath is involved, verify it before blaming the array
A perfectly mapped LUN can still disappear behind broken multipath settings.
Confirm you see stable paths and one mapped device.
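If you want this playbook as one quick pass, a minimal sketch looks like the script below. The portal 10.10.20.50 and the target IQN are placeholders matching the examples later in this article; adjust both, and treat it as a triage aid, not a remediation tool.

#!/bin/bash
# iscsi-triage.sh: report which iSCSI layer is failing (sketch; adjust PORTAL/TARGET)
PORTAL="10.10.20.50"
TARGET="iqn.2023-10.lab:truenas.target01"

# transport: can we reach the portal at all?
nc -z -w 3 "$PORTAL" 3260 || { echo "transport: portal $PORTAL:3260 unreachable"; exit 1; }
# discovery: does the target daemon answer?
iscsiadm -m discovery -t sendtargets -p "$PORTAL" || { echo "discovery failed against $PORTAL"; exit 1; }
# session: are we logged in to this target?
iscsiadm -m session 2>/dev/null | grep -q "$TARGET" || echo "no session for $TARGET: check login, CHAP, IQN ACL"
# presentation: did any iSCSI block device actually show up?
lsblk -o NAME,TRAN --noheadings | grep -q iscsi || echo "no iSCSI disk visible: check target-side LUN mapping"

Each complaint the script prints tells you which layer (and which section below) to jump to.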
Interesting facts (and a bit of history) that help you debug
- iSCSI was standardized in 2004 (RFC 3720). That’s old enough to rent a car, which explains why many arrays still carry legacy defaults.
- Discovery and login are different steps: SendTargets discovery can work even when login is forbidden by ACLs.
- LUN masking predates iSCSI: the same idea existed for Fibre Channel. Your “no LUN” problem is a classic, just wearing Ethernet.
- Initiator identity is the IQN (or EUI). IPs help, but targets typically authorize by IQN because IPs lie and DHCP is chaos.
- CHAP is optional, and many environments still run without it—usually because someone decided “we’re on a private VLAN” and then forgot to keep it private.
- ALUA exists because storage controllers are political: some paths are “optimized” and some are “non-optimized,” and multipath needs hints to avoid slow paths.
- The default iSCSI port is 3260, but some appliances support multiple portals and port binding; you can be “reachable” on one portal and mapped on another.
- Linux open-iscsi stores node records under /etc/iscsi/. That persistence is convenient until it isn't: stale records cause very modern outages.
- Proxmox doesn't magically create a filesystem on an iSCSI LUN. It will happily connect and still leave you with "now what?".
The mental model: portals, targets, sessions, LUNs, ACLs
When troubleshooting, use a strict vocabulary. It prevents “we fixed it” conversations where nobody agrees on what “it” was.
Portal
A portal is an IP:port pair you connect to, typically TCP 3260. Storage arrays may expose multiple portals (per controller, per VLAN, per interface).
“Reachable target” often means “portal reachable”.
Target IQN
The target is the thing you log into, identified by an IQN like iqn.2020-01.example:storage.lun01.
One target may expose multiple LUNs—or none—depending on mapping.
Initiator IQN
Your Proxmox node (the initiator) also has an IQN, found in /etc/iscsi/initiatorname.iscsi.
Targets commonly use that IQN in access control lists (ACLs). If the IQN doesn’t match exactly, you’re a stranger.
Session vs LUN
A session is a logged-in connection. A LUN is an actual SCSI logical unit presented through that session.
You can have a session with zero LUNs. That’s not “broken networking”; it’s “broken mapping”.
CHAP and ACLs: two locks, different keys
CHAP answers “who are you?” ACLs answer “are you allowed to see this LUN?”. You can pass one and fail the other.
Some targets also have per-portal authentication settings—because one lock was never enough.
Proxmox storage layering
Proxmox can:
- Use iSCSI as a transport for LVM or LVM-thin (most common; note that only plain LVM can be marked shared across a cluster, while LVM-thin stays local to one node).
- Use iSCSI for direct LUN assignment to VMs (less common, more brittle).
- Combine iSCSI with multipath (common in serious environments).
If you expect an iSCSI target to appear like NFS (a place to put files), you'll chase the wrong problem for hours.
Practical tasks: commands, outputs, and decisions (12+)
Below are concrete checks you can run on a Proxmox node. Each one includes: the command, what output means, and what decision to make next.
Run them as root or with equivalent privileges.
Task 1: Confirm which initiator IQN Proxmox is using
cr0x@server:~$ cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1993-08.org.debian:01:9d3a3b21f9a7
Meaning: That IQN is your identity to the target. If the target ACL is configured for a different IQN (common after cloning nodes),
you will log in but see no LUNs, or you will be rejected outright.
Decision: Compare this exact string with the target’s initiator/host object. Fix the target ACL or change the initiator IQN intentionally (rare).
Task 2: Verify the iSCSI service is running
cr0x@server:~$ systemctl status iscsid --no-pager
● iscsid.service - Open-iSCSI
Loaded: loaded (/lib/systemd/system/iscsid.service; enabled)
Active: active (running) since Thu 2025-12-26 09:12:01 UTC; 1h 3min ago
Meaning: If iscsid isn’t running, discovery/login can behave inconsistently, especially with automatic startup.
Decision: If inactive/failed, inspect logs and fix service before touching storage configuration.
Task 3: Validate basic reachability to the correct portal IP
cr0x@server:~$ ping -c 2 10.10.20.50
PING 10.10.20.50 (10.10.20.50) 56(84) bytes of data.
64 bytes from 10.10.20.50: icmp_seq=1 ttl=64 time=0.410 ms
64 bytes from 10.10.20.50: icmp_seq=2 ttl=64 time=0.392 ms
--- 10.10.20.50 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
Meaning: ICMP works. That’s not proof of iSCSI success, but it rules out the dumbest routing issue.
Decision: If ping fails, fix routing/VLAN/MTU/firewall basics before anything else.
Task 4: Confirm TCP/3260 is reachable
cr0x@server:~$ nc -vz 10.10.20.50 3260
Connection to 10.10.20.50 3260 port [tcp/iscsi-target] succeeded!
Meaning: The portal is reachable at the transport level.
Decision: If this fails, check firewall rules, storage-side listening IPs, and whether you’re hitting the wrong VLAN/interface.
Task 5: Run discovery from the initiator
cr0x@server:~$ iscsiadm -m discovery -t sendtargets -p 10.10.20.50
10.10.20.50:3260,1 iqn.2023-10.lab:truenas.target01
Meaning: The target answers discovery and advertises a target IQN.
Decision: If discovery returns nothing, fix target portal/discovery settings. If it returns a target, proceed to login.
Task 6: Attempt a manual login and see the real error
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --login
Logging in to [iface: default, target: iqn.2023-10.lab:truenas.target01, portal: 10.10.20.50,3260]
Login to [iface: default, target: iqn.2023-10.lab:truenas.target01, portal: 10.10.20.50,3260] successful.
Meaning: Session established. Great. Now check whether any LUNs appear.
Decision: If you get “authorization failure” or “login failed”, jump to CHAP/IQN ACL checks.
Task 7: List iSCSI sessions (do you actually have one?)
cr0x@server:~$ iscsiadm -m session
tcp: [1] 10.10.20.50:3260,1 iqn.2023-10.lab:truenas.target01 (non-flash)
Meaning: If this is empty, you are not logged in. If it’s present, the “no LUN” issue is likely mapping/ACL.
Decision: With a session present, stop fiddling with firewall rules and start verifying LUN presentation.
Task 8: Check kernel messages for SCSI discovery results
cr0x@server:~$ dmesg -T | tail -n 25
[Thu Dec 26 10:18:44 2025] scsi host12: iSCSI Initiator over TCP/IP
[Thu Dec 26 10:18:44 2025] scsi 12:0:0:0: Direct-Access LIO-ORG lun01 4.0 PQ: 0 ANSI: 5
[Thu Dec 26 10:18:44 2025] sd 12:0:0:0: Attached scsi generic sg3 type 0
[Thu Dec 26 10:18:44 2025] sd 12:0:0:0: [sdc] 209715200 512-byte logical blocks: (107 GB/100 GiB)
[Thu Dec 26 10:18:44 2025] sd 12:0:0:0: [sdc] Write Protect is off
[Thu Dec 26 10:18:44 2025] sd 12:0:0:0: [sdc] Mode Sense: 43 00 00 00
[Thu Dec 26 10:18:44 2025] sd 12:0:0:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
Meaning: This is the happy case: the target presented a LUN and Linux created /dev/sdc.
Decision: If you see the session but no “Direct-Access” device lines, the target is presenting zero LUNs. Fix mapping/ACL on the target.
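If you fix the mapping on the target after this point, you usually don't need a full logout/login; asking open-iscsi to rescan the existing sessions is enough, assuming the session itself is healthy:

cr0x@server:~$ iscsiadm -m session --rescan
cr0x@server:~$ dmesg -T | tail -n 10

New "Direct-Access" lines after the rescan mean the LUN finally arrived. Silence means the mapping still doesn't include your initiator IQN, and some targets only report new LUNs after a fresh login, which is what Task 14 is for.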
Task 9: Enumerate block devices and confirm the iSCSI disk exists
cr0x@server:~$ lsblk -o NAME,SIZE,TYPE,MODEL,SERIAL,TRAN
NAME SIZE TYPE MODEL SERIAL TRAN
sda 447.1G disk Samsung_SSD S6Z2NX0R123456 sata
├─sda1 1G part sata
└─sda2 446.1G part sata
sdc 100G disk LIO-ORG_lun01 1234567890abcdef iscsi
Meaning: A disk with TRAN = iscsi exists. If none appear, you don’t have a LUN.
Decision: If the disk exists but Proxmox can’t use it, move up the stack (multipath, LVM, Proxmox storage config).
Task 10: Verify SCSI path and LUN IDs
cr0x@server:~$ lsscsi -tv
[12:0:0:0] disk LIO-ORG lun01 4.0 /dev/sdc
state=running
Meaning: You can see the host:bus:target:lun tuple. If you see hosts but no LUN entries, again: mapping.
Decision: If LUN exists but is wrong size, stop—someone mapped the wrong extent/volume.
Task 11: Check open-iscsi node records (stale config bites)
cr0x@server:~$ iscsiadm -m node
iqn.2023-10.lab:truenas.target01 10.10.20.50:3260,1 default
Meaning: Node records persist. If you changed CHAP credentials on the target but not here, login will fail forever with confidence.
Decision: If the record points to old portals or wrong iface, update or delete the node record and rediscover.
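When the record really is stale, the cleanest fix is to remove just that one record and rediscover. This sketch reuses the example target and portal from above and assumes nothing is actively using the session:

cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --logout
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 -o delete
cr0x@server:~$ iscsiadm -m discovery -t sendtargets -p 10.10.20.50

Rediscovery recreates the node record with current portal data; re-apply CHAP settings (Task 13) before logging back in.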
Task 12: Inspect CHAP settings for a node
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 -o show | egrep 'auth|username|password'
node.session.auth.authmethod = CHAP
node.session.auth.username = proxmox01
node.session.auth.password = ********
Meaning: Auth method and username are set. If target expects “None” and you force CHAP, login fails. If passwords differ, login fails.
Decision: Align initiator config with the target. Don’t “try random combinations”; treat this like SSH keys, not like a vending machine.
Task 13: Fix CHAP values (and know you changed them)
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --op=update -n node.session.auth.authmethod -v CHAP
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --op=update -n node.session.auth.username -v proxmox01
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --op=update -n node.session.auth.password -v 'CorrectHorseBatteryStaple'
Meaning: You updated the persistent node record.
Decision: Re-login and verify sessions and LUNs. If still no LUN, mapping/ACL is the culprit, not CHAP.
Task 14: Log out and log back in (clean state)
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --logout
Logging out of session [sid: 1, target: iqn.2023-10.lab:truenas.target01, portal: 10.10.20.50,3260]
Logout of [sid: 1, target: iqn.2023-10.lab:truenas.target01, portal: 10.10.20.50,3260] successful.
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --login
Logging in to [iface: default, target: iqn.2023-10.lab:truenas.target01, portal: 10.10.20.50,3260]
Login to [iface: default, target: iqn.2023-10.lab:truenas.target01, portal: 10.10.20.50,3260] successful.
Meaning: You forced a new negotiation and re-scan.
Decision: If session comes up but LUN still absent, stop touching initiator settings and fix target-side mapping.
Task 15: If multipath is expected, confirm it’s not the reason your LUN “disappeared”
cr0x@server:~$ systemctl status multipathd --no-pager
● multipathd.service - Device-Mapper Multipath Device Controller
Loaded: loaded (/lib/systemd/system/multipathd.service; enabled)
Active: active (running) since Thu 2025-12-26 09:20:10 UTC; 55min ago
Meaning: Multipath daemon is running.
Decision: If it’s not running but you have multiple portals, you may get duplicate devices or flapping paths; fix multipath before putting LVM on it.
Task 16: Inspect multipath mapping
cr0x@server:~$ multipath -ll
mpatha (36001405f2a3b4c5d6e7f890123456789) dm-3 LIO-ORG,lun01
size=100G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 12:0:0:0 sdc 8:32 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
`- 13:0:0:0 sdd 8:48 active ready running
Meaning: You have a single multipath device (/dev/mapper/mpatha) backed by two paths. That’s what you want.
Decision: Use the multipath device for LVM/LVM-thin; avoid using /dev/sdc directly or you’ll create a choose-your-own-adventure failure later.
Task 17: Check for duplicate SCSI devices when multipath is not configured
cr0x@server:~$ ls -l /dev/disk/by-path/ | egrep 'iscsi|ip-10.10.20'
lrwxrwxrwx 1 root root 9 Dec 26 10:19 ip-10.10.20.50:3260-iscsi-iqn.2023-10.lab:truenas.target01-lun-0 -> ../../sdc
lrwxrwxrwx 1 root root 9 Dec 26 10:19 ip-10.10.20.51:3260-iscsi-iqn.2023-10.lab:truenas.target01-lun-0 -> ../../sdd
Meaning: Two portals, two /dev/sdX devices. Without multipath, that’s two independent disks from Linux’s perspective, even if they’re the same LUN.
Decision: Enable and configure multipath before building LVM or filesystems, or you’ll eventually corrupt something with enthusiasm.
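A minimal /etc/multipath.conf for this layout might look like the sketch below. The WWID is the example one from Task 16 and the blacklist entry is illustrative; follow your array vendor's recommended multipath settings rather than copying this verbatim.

cr0x@server:~$ cat /etc/multipath.conf
defaults {
    user_friendly_names yes
    find_multipaths     yes
}
blacklist {
    # keep local boot disks out of multipath (adjust to your hardware)
    devnode "^nvme.*"
}
multipaths {
    multipath {
        wwid  36001405f2a3b4c5d6e7f890123456789
        alias mpatha
    }
}
cr0x@server:~$ systemctl restart multipathd
cr0x@server:~$ multipath -ll

After the restart, multipath -ll should show a single mpatha device with both paths, as in Task 16.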
Task 18: Check Proxmox storage configuration references
cr0x@server:~$ grep -nE 'iscsi|lvm|vgname|thinpool|portal|target|content' /etc/pve/storage.cfg
12:iscsi: iscsi-san
13: portal 10.10.20.50
14: target iqn.2023-10.lab:truenas.target01
15:lvmthin: vmdata
16: vgname vg_iscsi_vmdata
17: thinpool data
18: content images,rootdir
Meaning: Proxmox separates iSCSI transport (the “iscsi” stanza) from the actual VM storage (LVM-thin on a VG).
Decision: If you only have the iSCSI stanza but no LVM/LVM-thin/ZFS layer, the configuration is incomplete; that's not a login bug.
Task 19: Confirm a volume group exists on the iSCSI/multipath device (if using LVM)
cr0x@server:~$ pvs
PV VG Fmt Attr PSize PFree
/dev/mapper/mpatha vg_iscsi_vmdata lvm2 a-- 100.00g 20.00g
Meaning: LVM sees the disk and it’s being used as a PV, which implies the LUN is visible and stable.
Decision: If pvs shows nothing, you either have no LUN, no multipath device, or you haven’t initialized it. Pick the right fix.
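If the block device is there but LVM is empty, the bootstrap is short. This sketch matches the names from the storage.cfg example (vg_iscsi_vmdata, thin pool "data", storage ID vmdata); the thin pool size is a placeholder, LVM-thin stays local to a single node, and you should triple-check the device path before running pvcreate on anything.

cr0x@server:~$ pvcreate /dev/mapper/mpatha
cr0x@server:~$ vgcreate vg_iscsi_vmdata /dev/mapper/mpatha
cr0x@server:~$ lvcreate -L 90G --thinpool data vg_iscsi_vmdata
cr0x@server:~$ pvesm add lvmthin vmdata --vgname vg_iscsi_vmdata --thinpool data --content images,rootdir

For a LUN the whole cluster must share, skip the thin pool and add plain LVM instead (pvesm add lvm <id> --vgname <vg> --shared 1).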
Task 20: Look at iSCSI logs for the real rejection reason
cr0x@server:~$ journalctl -u iscsid -n 80 --no-pager
Dec 26 10:05:11 server iscsid[1243]: iSCSI daemon started
Dec 26 10:06:02 server iscsid[1243]: connection1:0 login rejected: Initiator failed authentication with target
Dec 26 10:06:02 server iscsid[1243]: connection1:0 login failed to authenticate with target iqn.2023-10.lab:truenas.target01
Meaning: Authentication failure. That’s CHAP or target-side auth mode mismatch, not “no LUN mapping”.
Decision: Fix CHAP credentials and auth method first; only then worry about LUN mapping.
Proxmox specifics: iSCSI storage vs LVM over iSCSI (and where people get lost)
Proxmox gives you enough rope to build solid storage—or to create a knotty, time-consuming incident.
The Proxmox UI makes it easy to add “iSCSI” and assume you’re done. You’re not, unless you’re intentionally doing raw LUNs per VM.
Pattern A (common): iSCSI + LVM-thin
You connect the host to a LUN, then build an LVM VG and thin pool on top. Proxmox stores VM disks as logical volumes.
This is the path most people take because it's efficient, supported, and relatively predictable. One caveat: LVM-thin cannot be marked shared in a Proxmox cluster, so for a LUN that several nodes must use, plain (thick) LVM on top of iSCSI is the supported shared pattern.
If you’re seeing “target reachable but no LUN”, you’re stuck at the very bottom: Proxmox can’t even see the block device to create the PV.
Pattern B (less common): direct LUN per VM
You present multiple LUNs and map them directly, often for database appliances or when you want array-native snapshots/replication.
This requires disciplined LUN mapping and naming, otherwise you’ll attach the wrong LUN to the wrong VM and have a bad afternoon.
Proxmox cluster angle: every node must be allowed
If you’re in a cluster and you want any node to run the VM, every node must be able to see the same backing storage.
“Works on node1” is a trap. It only means node1’s initiator IQN is in the target ACL.
Joke #1: iSCSI is like office security—your badge works perfectly right up until it doesn’t, and then it’s suddenly your problem.
Where it breaks: the real root causes behind “login failed” and “no LUN”
1) Wrong initiator IQN (or cloned nodes with identical IQNs)
Proxmox nodes built from templates often inherit the same InitiatorName. Targets see two hosts claiming the same identity.
Best case, one works and the other gets booted. Worst case, they alternately steal the session and you get intermittent storage timeouts.
Fix it by generating unique IQNs per node and updating target ACLs accordingly. Don’t “just allow all initiators” unless you enjoy explaining yourself later.
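A quick way to regenerate the identity on a cloned node: iscsi-iname (shipped with open-iscsi) produces a fresh random IQN, which you write into initiatorname.iscsi before restarting the daemon. Do this before the node carries production sessions, then add the new IQN to the target ACL. The prefix below mirrors the Debian-style default seen in Task 1; adapt it to your naming scheme.

cr0x@server:~$ echo "InitiatorName=$(iscsi-iname -p iqn.1993-08.org.debian:01)" > /etc/iscsi/initiatorname.iscsi
cr0x@server:~$ cat /etc/iscsi/initiatorname.iscsi
cr0x@server:~$ systemctl restart iscsid

Existing sessions keep the old identity until they are re-established, so plan a logout/login afterwards.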
2) CHAP mismatch or CHAP required on one side but not the other
Arrays and target stacks differ: some require CHAP per target, some per portal, some per initiator group. Proxmox stores CHAP in the node record.
If someone rotates credentials on the array and forgets to update Proxmox, the failure is deterministic.
3) LUN exists but is not mapped to this initiator (classic)
The target admin created the volume/extent and even created the target IQN. But they didn’t map the LUN to the host/initiator group, or they mapped it to the wrong group.
Discovery sees the target, login might succeed, and you still get zero devices.
This is the number one cause of “target reachable but no LUN”. It’s also the easiest to fix, which is why it’s so irritating.
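What the fix looks like depends entirely on the target. As one concrete illustration: if the target is a plain Linux LIO target managed with targetcli (the example outputs above show a LIO-ORG LUN), granting the Proxmox node access is roughly the commands below, run on the target host, not on Proxmox. The backstore name lun01 is hypothetical; appliances like TrueNAS and enterprise arrays expose the same concept through their UI as "initiator groups" or "host mappings".

cr0x@target:~$ targetcli /iscsi/iqn.2023-10.lab:truenas.target01/tpg1/acls create iqn.1993-08.org.debian:01:9d3a3b21f9a7
cr0x@target:~$ targetcli /iscsi/iqn.2023-10.lab:truenas.target01/tpg1/luns create /backstores/block/lun01
cr0x@target:~$ targetcli saveconfig

Then rescan or re-login from the Proxmox node and the disk should appear.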
4) Portal group / network binding mismatch
Many arrays have multiple portal groups. You might be logging into 10.10.20.50 but the LUN mapping applies to the portal group on 10.10.30.50.
Or the target is bound to one interface and discovery is hitting another service IP that doesn’t serve that target.
5) ALUA/multipath weirdness that looks like “no LUN”
If multipath is partially configured, you can see devices briefly and then lose them. Proxmox may show the storage as degraded, or the device may vanish entirely.
This is not “no LUN”, but it’s often misdiagnosed as such.
6) Stale initiator records pointing to an old target config
If you changed target IQN names, portals, or auth requirements, open-iscsi can keep trying the old settings.
Deleting and re-discovering node records is sometimes the cleanest fix—done carefully, during a maintenance window, and with awareness of what VMs are using the LUN.
7) Proxmox expecting LVM-thin but you presented a LUN with an existing filesystem
You can present a LUN that already has something on it (old LVM metadata, leftover partitions). Proxmox may refuse to initialize, or worse, you initialize the wrong disk.
If you’re migrating, be explicit about the plan: import vs reinitialize.
Paraphrasing an idea from Werner Vogels (Amazon): everything fails, all the time; design and operate with that assumption.
Common mistakes: symptom → root cause → fix
Symptom: Discovery works, login fails with “authorization failure”
Root cause: CHAP mismatch or target requires CHAP but initiator is set to None (or vice versa).
Fix: Align auth method and credentials on both sides. Verify with journalctl -u iscsid and iscsiadm -m node -o show.
Symptom: Login succeeds, iscsiadm -m session shows a session, but lsblk shows no iSCSI disk
Root cause: No LUN mapped to this initiator IQN, or LUN masking denies it.
Fix: On the target, map the LUN/extent to the correct initiator group/ACL and ensure the LUN ID is enabled. Then re-scan or re-login.
Symptom: Works on one Proxmox node, fails on another in the same cluster
Root cause: Target ACL includes only one initiator IQN. Or nodes share the same IQN due to cloning.
Fix: Give each node a unique initiator IQN and add all nodes to the target’s initiator group. Validate sessions from each node.
Symptom: Two disks appear for the same LUN (/dev/sdc and /dev/sdd), Proxmox gets confused
Root cause: Two portals without multipath; Linux sees two independent SCSI devices.
Fix: Configure multipath properly and use /dev/mapper/mpathX devices for LVM. Blacklist local boot disk.
Symptom: LUN appears, then disappears after minutes or under load
Root cause: MTU mismatch/jumbo frames issue, path flapping, or timeout settings too aggressive.
Fix: Validate MTU end-to-end, check switch counters, and review iSCSI timeouts. Stabilize network before tuning performance.
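A quick way to prove (or disprove) the jumbo-frame path is a do-not-fragment ping sized just under each MTU; 8972 bytes of payload plus 28 bytes of ICMP and IP headers equals 9000, and 1472 plus 28 equals 1500:

cr0x@server:~$ ping -M do -s 8972 -c 3 10.10.20.50
cr0x@server:~$ ping -M do -s 1472 -c 3 10.10.20.50

If the 1472-byte test passes and the 8972-byte test fails, some hop in the middle is still at MTU 1500, and your iSCSI traffic will suffer exactly under load.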
Symptom: Proxmox UI shows iSCSI storage “OK”, but you cannot create VM disks there
Root cause: You added iSCSI transport but didn’t create the LVM/LVM-thin storage on top (or you mapped no LUN).
Fix: Ensure the iSCSI LUN exists as a block device, then create a VG and thin pool and add it as LVM-thin in Proxmox.
Symptom: Login fails only after a reboot
Root cause: Startup ordering issues, wrong network interface binding, or missing node.startup settings for iSCSI.
Fix: Ensure network is up before iSCSI login, and set node startup to automatic where appropriate. Confirm with systemd logs.
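A sketch of the relevant settings, using the example target from the tasks above; the service names are the Debian/Proxmox ones, so verify them on your build:

cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --op=update -n node.startup -v automatic
cr0x@server:~$ iscsiadm -m node -T iqn.2023-10.lab:truenas.target01 -p 10.10.20.50 --op=update -n node.conn[0].startup -v automatic
cr0x@server:~$ systemctl enable iscsid open-iscsi

Storages defined in Proxmox's storage.cfg are also logged in by Proxmox itself when the storage is activated, which can mask (or merely delay) these ordering problems.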
Symptom: You see the LUN, but it’s read-only
Root cause: Target-side read-only mapping, snapshot/clone export, or SCSI reservation conflicts.
Fix: Fix mapping mode on the array; check for persistent reservations if clustering is involved.
Three corporate mini-stories from the trenches
Incident caused by a wrong assumption: “Discovery means it’s mapped”
A mid-sized company migrated from NFS to iSCSI for “better performance” (which was true, but incomplete). The storage admin created a target,
presented it on the right VLAN, and sent the IQN to the virtualization team. They added it in Proxmox and saw the target in discovery.
Everyone relaxed. Naturally, the maintenance window immediately turned into overtime.
The Proxmox nodes could log in, but no LUN ever appeared. The virtualization team assumed it was an initiator bug because “we can reach the target”.
They spent an hour toggling CHAP settings, restarting services, and re-adding storage in the GUI, which is the operational equivalent of shaking a printer.
The real issue: the LUN existed, but it was mapped to an initiator group containing the wrong IQN—an older ESXi host from a retired cluster.
Discovery still returned the target because discovery was open. Login worked because CHAP was disabled. But LUN masking did its job and showed nothing.
The fix took two minutes: add the correct initiator IQNs to the mapping and rescan. The lesson lasted longer:
stop using discovery success as proof of usable storage. Treat it like DNS: useful, not authoritative.
Optimization that backfired: jumbo frames with zero patience
Another shop had iSCSI working, but they wanted lower CPU and better throughput. Someone enabled MTU 9000 on the storage NICs and the Proxmox NICs.
They did not touch the switches because the switches were “already configured for jumbo somewhere”. That “somewhere” was a different set of ports.
The result was not immediate failure. That’s the worst kind. Discovery worked. Login worked. LUNs appeared.
Under load—VM backups, disk scrubs, or a busy database—paths started dropping. Multipath would fail over, then fail back, then mark paths dead.
VMs froze in ways that looked like guest OS bugs.
The team chased ghosts: Proxmox kernel versions, multipath settings, “maybe the array is overloaded”. Meanwhile, the network counters told the boring truth:
fragmentation, drops, and intermittent blackholing of large frames. iSCSI is sensitive to loss and latency spikes; it’s block I/O, not a casual file copy.
The fix was unglamorous: enforce consistent MTU end-to-end, or run MTU 1500 everywhere. They chose the second option for simplicity,
and performance stayed acceptable—because stable is faster than theoretical.
Boring but correct practice that saved the day: explicit host groups and a pre-flight test
A more disciplined team had a rule: every new Proxmox node must pass a storage pre-flight before it can join the cluster.
The pre-flight included verifying a unique initiator IQN, confirming it appears in the array’s initiator group, and logging into all portals.
They also required a test LUN mapping that could be safely attached and detached.
One day, a replacement node arrived during a time crunch. It was built from an old image. The iSCSI IQN was duplicated.
If they had added it directly to the cluster, it would have raced the existing node for sessions and likely caused path resets.
The pre-flight caught it immediately. They regenerated the initiator IQN, updated the host group mapping, and only then joined the node.
No outage. No drama. The best incident is the one you never get paged for.
Joke #2: The only thing more persistent than iSCSI node records is the person who insists “it worked in the lab.”
Checklists / step-by-step plan
Step-by-step: fix “target reachable but no LUN” (most common case)
- Confirm session state: run iscsiadm -m session. If no session, you have a login problem; skip to auth/ACL steps.
- Confirm initiator IQN: check /etc/iscsi/initiatorname.iscsi. Make sure it matches the target ACL exactly.
- Check for LUN devices: run lsblk -o NAME,TRAN,MODEL,SIZE and dmesg -T. If no iSCSI disk, you have a mapping issue.
- Fix target-side mapping: map the LUN/extent to the initiator group that includes your Proxmox node IQN(s). Ensure the LUN ID is enabled and not filtered.
- Re-login or rescan: logout/login with iscsiadm. Confirm the block device appears.
- Only then configure Proxmox storage layers: create the PV/VG/thin pool if using LVM-thin, then add LVM-thin in Proxmox.
Step-by-step: fix “login failed” (auth/ACL class of failures)
- Confirm TCP reachability: nc -vz <portal> 3260.
- Confirm discovery: iscsiadm -m discovery -t sendtargets -p <portal>.
- Read the daemon logs: journalctl -u iscsid -n 100 and capture the exact reason.
- Verify CHAP config: iscsiadm -m node -o show and align with target settings.
- Verify initiator IQN is allowed: check the target ACL/host group. Don't rely on IP-based rules unless you enjoy ambiguity.
- Remove stale node entries if necessary: delete only the relevant node record, rediscover, and reconfigure auth.
Step-by-step: Proxmox cluster readiness for iSCSI
- Every node has a unique IQN (verify on each node).
- Target ACL includes all node IQNs that may run VMs.
- Multipath is configured if you have multiple portals/controllers.
- Use stable device naming: prefer multipath WWIDs or /dev/disk/by-id links; avoid raw /dev/sdX.
- Test failover: disable one path and confirm I/O continues (carefully, with a safe test volume).
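To check the first item across the whole cluster in one shot, a quick loop over the nodes works; the node names pve1 through pve3 are hypothetical, and it relies on the root SSH access Proxmox cluster nodes normally have between each other:

cr0x@server:~$ for n in pve1 pve2 pve3; do ssh root@$n 'hostname; cat /etc/iscsi/initiatorname.iscsi'; done

If two nodes print the same InitiatorName, stop and fix that before they ever log into the same target.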
FAQ
1) Why can I discover the target but not see any LUNs?
Because discovery is just “what targets exist here?”. LUN visibility is controlled by mapping and ACLs. Discovery can be open while LUNs are locked down.
2) If login succeeds, doesn’t that prove the LUN is mapped?
No. Login proves you established a session. LUN mapping is a separate step. A session with zero LUNs is normal when masking denies access.
3) Proxmox says the iSCSI storage is online, but I can’t store VM images there. Why?
Proxmox “iSCSI” storage is typically just the transport definition. You usually need LVM or LVM-thin on top to store VM disks.
Add the iSCSI target, then build a VG/thin pool and add an LVM-thin storage entry.
4) Do I need multipath?
If you have multiple storage ports/controllers and you expect redundancy, yes. Without multipath you may get duplicate devices or path failures that look like random disk issues.
If you truly have one portal and one path, multipath is optional but still common in standardized builds.
5) What’s the fastest way to prove this is a mapping issue, not Proxmox?
From the node: if iscsiadm -m session shows a session but lsblk shows no iSCSI disk and dmesg shows no SCSI disk attach,
the target is presenting no LUNs to you. That’s mapping/ACL.
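One command bundles that proof. Print level 3 shows, per session, which SCSI disks (if any) were attached:

cr0x@server:~$ iscsiadm -m session -P 3 | grep -E 'Target:|Attached scsi disk|Session State'

A Target line with no "Attached scsi disk" beneath it is a session with zero LUNs: mapping or ACL, not Proxmox.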
6) Can two Proxmox nodes share the same initiator IQN?
Don’t. Some targets will tolerate it until they don’t. You’ll get session stealing, reservations behaving oddly, and intermittent storage resets.
Make IQNs unique per node, always.
7) Should I use CHAP?
Yes if you can. It’s not perfect security, but it prevents casual cross-talk between initiators and targets on shared networks.
If you skip CHAP, compensate with strict VLAN isolation and ACL discipline.
8) I changed CHAP credentials on the array. Why didn’t Proxmox pick it up?
open-iscsi persists node settings. You must update the node record (or re-discover and reconfigure).
Check with iscsiadm -m node -o show.
9) Why do I see the LUN as /dev/sdc on one boot and /dev/sdd on another?
Linux assigns /dev/sdX names dynamically. Use /dev/disk/by-id or multipath devices, not raw /dev/sdX, in any persistent config.
10) Is “no LUN” ever caused by Proxmox itself?
Rarely. Proxmox uses the Linux iSCSI stack. If Linux can’t see a LUN, Proxmox can’t either. Focus on the initiator/target configuration, not the UI.
Conclusion: next steps that prevent reoccurrence
When Proxmox says “iSCSI login failed” or you have a reachable target with no LUN, resist the urge to thrash.
Prove the layer that’s failing: transport, authentication, or presentation.
- First: validate reachability and discovery (nc, iscsiadm -m discovery).
- Second: validate login and read the real error (iscsiadm --login, journalctl -u iscsid).
- Third: validate LUN presentation (dmesg, lsblk, lsscsi).
- Then: only after the block device exists, build the Proxmox storage layer (LVM/LVM-thin) and, if needed, multipath.
Operationally, the best fix is policy: unique IQNs per node, explicit initiator groups on the target, and a pre-flight check before any node joins the cluster.
It’s boring. It works. Production tends to prefer those two qualities.