
How AI Agents Are Changing the Way SOCs Hunt, Report, and Stay Current

AI agents are compressing SOC cycle times right now. Here are six concrete use cases for threat hunting, triage, reporting, and posture assessment.


By the time you've correlated the right logs, verified the DBIR stats, mapped the threats to your tool stack, and written the weekly posture summary, half your week is gone.

That's the part AI agents are increasingly good at.

This article covers six specific use cases where agents are compressing SOC cycle times today — not theory, not roadmap. If you've been following this series, you now have the threat picture (pillar), the tool map (Article 1), and the log architecture (Article 2). This is the automation layer that ties it together.


Two paths to get started

Before diving into use cases, let's be clear about your options. Two main paths are viable right now, plus a third covered in the next section:

Path 1: Microsoft Security Copilot. Deploy this week with existing licensing. If you have Microsoft 365 Copilot or standalone Security Copilot, you can start using promptbooks and natural language queries immediately. Lower customization, higher speed to value.

Path 2: Custom agents via Sentinel API + MCP. More flexible, requires Python and a few hours of setup. Connect Claude (or Azure OpenAI) to your Sentinel workspace via REST API, build your own prompts, iterate. Higher customization, more investment upfront.

Which you start with depends on your licensing and how much you want to customize. Both are worth knowing.


A third path: the Sentinel MCP server

Between Security Copilot (path 1) and a fully custom agent build (path 2) sits a third option worth knowing, especially if you're in a local development or early evaluation phase.

The Sentinel MCP server exposes a set of tools that give a Claude Desktop or Claude Code session direct, conversational access to your Sentinel workspace. Connection requires the Microsoft Sentinel MCP endpoint URL and a valid Entra service principal. Data lake query access requires the data lake tier configured on your workspace.

Tools available today (generally available):

  • list_sentinel_workspaces — enumerate workspaces in a subscription
  • search_tables — discover what log tables exist in your workspace and their schemas
  • query_lake — run KQL against data lake storage with the full retention horizon

Tools in preview:

  • analyze_user_entity — entity behavior analysis for a specified UPN; returns sign-in risk score, behavioral anomalies, recent activity summary
  • analyze_url_entity — reputation and context lookups against the Threat Intelligence table

When path 3 fits: You want to run local threat hunting experiments before committing to a production agent build. You prefer natural language to KQL translation as an analyst productivity tool rather than an automated pipeline. You're evaluating use cases and want to see real results against real data before writing code.

When path 2 fits better: You need unattended automation, workflow integration, or a repeatable production pipeline. Path 3 is a human-in-the-loop experience by design.


Use case 1: Automated posture reporting

Weekly security posture summaries used to take 2-3 hours of manual work. Pull Secure Score from Defender for Cloud. Aggregate Sentinel incident trends. Check compliance status. Format it for leadership. Repeat next week.

An agent that queries these data sources and generates a structured markdown summary can compress that to 10 minutes of review.

What's working today

Security Copilot approach: Use the built-in promptbook for weekly posture summary. It pulls from Defender for Cloud recommendations, Secure Score trends, and recent incident data. Output is structured enough for exec reporting with minimal editing.

Custom agent approach: Export Sentinel workbook data to agent context, query Defender for Cloud APIs for current posture, and generate summaries on a schedule.

Here's a minimal Python example connecting to Sentinel for incident data:

import requests
from datetime import datetime, timedelta, timezone

def get_sentinel_incidents(workspace_name, resource_group, subscription_id, token):
    """Pull incidents created in the last 7 days from Sentinel for posture reporting."""

    # Note: the ARM path takes the workspace *name*, not the workspace GUID
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.OperationalInsights/workspaces/{workspace_name}"
        f"/providers/Microsoft.SecurityInsights/incidents"
    )

    cutoff = (datetime.now(timezone.utc) - timedelta(days=7)).strftime("%Y-%m-%dT%H:%M:%SZ")
    params = {
        "api-version": "2023-11-01",
        "$filter": f"properties/createdTimeUtc ge {cutoff}"
    }

    headers = {"Authorization": f"Bearer {token}"}

    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()

def summarize_for_posture(incidents):
    """Structure incident data for agent context."""
    
    summary = {
        "total_incidents": len(incidents.get("value", [])),
        "by_severity": {},
        "by_status": {}
    }
    
    for incident in incidents.get("value", []):
        props = incident.get("properties", {})
        severity = props.get("severity", "Unknown")
        status = props.get("status", "Unknown")
        
        summary["by_severity"][severity] = summary["by_severity"].get(severity, 0) + 1
        summary["by_status"][status] = summary["by_status"].get(status, 0) + 1
    
    return summary

Feed that structured data to Claude via API with a prompt like:

You are a security analyst preparing a weekly posture summary for leadership.

Given the following incident data from the past 7 days:
{incident_summary}

And the following Secure Score data:
{secure_score_data}

Generate a concise executive summary that covers:
1. Overall security posture trend (improving/stable/declining)
2. Notable incidents requiring leadership awareness
3. Top 3 recommended actions for the coming week

Keep it under 500 words. Use clear, non-technical language where possible.
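In code, that hand-off is a small prompt builder plus one model call. A minimal sketch — the prompt builder mirrors the template above; the commented Anthropic SDK call and model name are assumptions, so adapt them to your provider:

```python
import json

def build_posture_prompt(incident_summary, secure_score_data):
    """Assemble the weekly posture prompt from structured data (sketch)."""
    return (
        "You are a security analyst preparing a weekly posture summary for leadership.\n\n"
        "Given the following incident data from the past 7 days:\n"
        f"{json.dumps(incident_summary, indent=2)}\n\n"
        "And the following Secure Score data:\n"
        f"{json.dumps(secure_score_data, indent=2)}\n\n"
        "Generate a concise executive summary that covers:\n"
        "1. Overall security posture trend (improving/stable/declining)\n"
        "2. Notable incidents requiring leadership awareness\n"
        "3. Top 3 recommended actions for the coming week\n\n"
        "Keep it under 500 words. Use clear, non-technical language where possible."
    )

# Sending the prompt with the Anthropic SDK (assumed usage; reads ANTHROPIC_API_KEY):
#   import anthropic
#   client = anthropic.Anthropic()
#   msg = client.messages.create(model="claude-sonnet-4-5", max_tokens=1024,
#                                messages=[{"role": "user", "content": prompt}])
#   summary = msg.content[0].text
```

Keeping the prompt assembly in plain code (rather than asking the model to fetch its own data) also keeps the input auditable.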

Agents don't write better summaries than a skilled analyst. They write faster summaries that are good enough to start with. Your 10 minutes of review turns 80% output into 100% output.


Use case 2: Threat research contextualization

When a new campaign drops — a CISA advisory, a new LOLBAS technique gaining traction, a zero-day making rounds — the SOC needs to answer quickly: Are we tooled to detect this? Do we have the right logs?

That question used to take a morning of research. Cross-reference the advisory against your MITRE ATT&CK mapping. Check your Sentinel coverage. Review detection rules. Write up the gap analysis.

An agent with access to your environment data can produce that gap analysis in minutes.

What's working today

For this use case, the custom agent path shines. Security Copilot can help with research synthesis, but contextualizing against your specific tool coverage requires feeding it your data.

The architecture:

  1. Maintain a current MITRE ATT&CK mapping for your environment (what techniques your tools can detect)
  2. Keep your Sentinel detection rule coverage documented
  3. Feed both to the agent alongside the new threat intelligence

Here's how I structure the prompt:

threat_context_prompt = """
You are a threat intelligence analyst supporting SOC operations.

NEW THREAT INFORMATION:
{threat_advisory}

OUR CURRENT COVERAGE:
MITRE ATT&CK Techniques Detected: {mitre_coverage}
Sentinel Detection Rules Active: {detection_rules}
Log Sources Ingesting: {log_sources}

TASK:
1. Identify which techniques in this threat advisory we can currently detect
2. Identify gaps where we lack detection capability
3. Recommend specific actions to close gaps (new detection rules, log source additions, or tool procurements)
4. Assess overall readiness (High/Medium/Low) with justification

Be specific. Reference actual rule names and log sources from our coverage data.
"""

The value here is compression. What used to be a half-day of manual cross-referencing becomes a 15-minute review cycle. The agent does the correlation; the analyst validates and decides.
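Filling that template is a few lines of glue. A sketch, assuming you document coverage as a simple dict (or JSON file) with `mitre_techniques`, `detection_rules`, and `log_sources` keys — that structure is hypothetical, so adapt it to however you actually track coverage:

```python
def build_threat_context(template, advisory_text, coverage):
    """Fill the threat-context template with a new advisory and documented coverage.

    `coverage` is assumed to look like:
      {"mitre_techniques": [...], "detection_rules": [...], "log_sources": [...]}
    """
    return template.format(
        threat_advisory=advisory_text,
        mitre_coverage=", ".join(coverage.get("mitre_techniques", [])),
        detection_rules="\n".join(coverage.get("detection_rules", [])),
        log_sources=", ".join(coverage.get("log_sources", [])),
    )
```

The point of the indirection: the coverage doc is maintained once and reused for every new advisory, so the agent always sees current state.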


Use case 3: Natural language to KQL (Hunting)

"Show me all accounts that signed in from two countries in the same hour in the last 30 days."

That's a reasonable hunting hypothesis. Writing the KQL takes 15 minutes if you know KQL well, longer if you don't, and the query is often wrong the first time. You run it, adjust for your schema, run it again, tweak the time windows.

Security Copilot's natural language to KQL is genuinely useful here.

What's working today

In the advanced hunting experience in the unified Defender portal, Security Copilot can translate natural language into working KQL against your Sentinel data. The workflow:

  1. Describe what you're looking for in plain English
  2. Copilot generates the query
  3. You validate against your actual data schema
  4. Refine and iterate

Here's the critical limitation: the queries are starting points, not finished detections.

Copilot doesn't know your custom tables, your specific schema variations, or your performance considerations. The generated query might reference SigninLogs when your environment uses AADSignInLogs. It might not account for your data retention or partition structure.

What works:

  • Use Copilot for the initial structure
  • Always validate column names against your actual schema
  • Test performance before deploying as scheduled analytics

What doesn't work:

  • Expecting production-ready queries on first generation
  • Complex multi-table joins without manual refinement
  • Queries that need to account for your specific data quirks

The time savings are real — going from hypothesis to working query in 5 minutes instead of 30 — but the human validation step is non-negotiable.
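One cheap guardrail for that validation step: before running a generated query, check that every table it reads from actually exists in your workspace. A crude sketch — a regex first pass, not a KQL parser, and the known-table set is illustrative (generate the real list from search_tables or a schema export):

```python
import re

# Tables that exist in this workspace -- illustrative placeholder set
KNOWN_TABLES = {"SigninLogs", "AADSignInEventsBeta", "SecurityEvent", "SecurityAlert"}

def referenced_tables(kql):
    """Crude first-pass extraction of the tables a query reads from.

    Matches 'TableName |' at the start of a line (the pipe may sit on the next
    line, since \\s* spans newlines). Unions and joins mid-line slip through --
    this is a sketch, not a parser.
    """
    return set(re.findall(r"^\s*(\w+)\s*\|", kql, re.MULTILINE))

def unknown_tables(kql, known=KNOWN_TABLES):
    """Tables the generated query references that the workspace doesn't have."""
    return referenced_tables(kql) - known
```

If `unknown_tables` returns anything, hand the query back for refinement before it ever touches the workspace.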


Use case 4: Defender for Cloud posture loop

Defender for Cloud generates recommendations by the dozen. Enable this encryption. Restrict that network access. Update those permissions. Most SOCs do a manual quarterly review, which means recommendations pile up and the backlog feels unmanageable.

An agent that loops through open recommendations, categorizes by risk tier, drafts remediation actions, and outputs a prioritized ticketing summary changes the quarterly cadence to weekly.

What's working today

This is more ops automation than detection, but the cumulative security posture impact is significant.

The pattern:

import requests

def get_defender_recommendations(subscription_id, token):
    """Pull current Defender for Cloud assessments, including severity metadata."""
    
    url = f"https://management.azure.com/subscriptions/{subscription_id}/providers/Microsoft.Security/assessments"
    
    # $expand=metadata is needed to get severity and remediation text per assessment
    params = {"api-version": "2021-06-01", "$expand": "metadata"}
    headers = {"Authorization": f"Bearer {token}"}
    
    response = requests.get(url, headers=headers, params=params)
    response.raise_for_status()
    return response.json()

def categorize_recommendations(recommendations):
    """Sort unhealthy recommendations by severity."""
    
    categorized = {"high": [], "medium": [], "low": []}
    
    for rec in recommendations.get("value", []):
        props = rec.get("properties", {})
        
        # Skip assessments that are already healthy or not applicable
        if props.get("status", {}).get("code") != "Unhealthy":
            continue
        
        metadata = props.get("metadata", {})
        severity = metadata.get("severity", "Low").lower()
        
        # setdefault guards against severity values outside high/medium/low
        categorized.setdefault(severity, []).append({
            "name": props.get("displayName"),
            "description": metadata.get("description"),
            "remediation": metadata.get("remediationDescription"),
            "resource": rec.get("id")
        })
    
    return categorized

Feed the categorized recommendations to the agent with a prompt that generates ticket-ready summaries:

You are a cloud security engineer preparing remediation tickets.

Given the following Defender for Cloud recommendations:
{categorized_recommendations}

For each HIGH severity item, generate:
1. A ticket title (clear, actionable)
2. A brief description of the risk
3. Specific remediation steps
4. Estimated effort (hours)

Output as a table I can paste into our ticketing system.

The agent doesn't execute the remediation — that still requires human judgment and change management. But it transforms a pile of recommendations into an actionable work queue.
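The hand-off between the categorizer and the prompt is again plain code. A small sketch, assuming the categorized dict shape produced by categorize_recommendations above:

```python
def format_high_severity(categorized):
    """Render HIGH severity recommendations as bullets for the ticketing prompt."""
    lines = []
    for item in categorized.get("high", []):
        remediation = item.get("remediation") or "No remediation text provided"
        lines.append(f"- {item['name']}: {remediation}")
    return "\n".join(lines) if lines else "No high severity recommendations this week."
```

Deterministic formatting here means the agent only ever sees the items you chose to send, which keeps the prompt small and the run cost predictable.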


Stats that matter

Research across enterprise implementations shows measurable impact:

  • 50-70% faster incident resolution with AI-assisted triage (PwC Insights, 2025)
  • 40% reduction in operational overhead for routine reporting (PwC Insights, 2025)
  • 108 days faster breach containment in organizations using AI-assisted SOC operations (Agentic Digital Workers Research, 2024)
  • Up to 80% reduction in Level-1 support ticket volumes through AI-assisted triage (Agentic Digital Workers Research, 2024)

These come from practitioner studies tracking actual implementations, not vendor marketing claims. The variance is significant because results depend heavily on data quality and implementation maturity.


Use case 5: Entity triage and blast radius analysis

This use case addresses the most expensive phase of incident response — determining whether an affected account is compromised and, if so, what it had access to.

An analyst manually linking user identity, recent authentications, peer group behavior, and resource access takes 45-90 minutes per entity on a complex incident. An agent running a structured entity triage workflow can complete the same analysis in under 10 minutes.

The workflow

  1. Retrieve entity from incident object. Pull the UPN from the incident's entity object directly, not from the alert title or description. Alert text is attacker-influenced. Entity objects are structured telemetry.

  2. Run analyze_user_entity against the UPN. This returns sign-in risk score, recent authentication summary, behavioral anomaly flags, and associated device identifiers.

  3. Query query_lake for 30-day authentication baseline. Compare the incident's access pattern against historical behavior. Look for: first-time resource access, unusual authentication hours, impossible travel, sign-in from previously unseen IP ranges.

  4. Query blast_radius graph. What resources did this identity have access to? What groups, applications, and service principals are in scope? This defines the containment boundary if compromise is confirmed.

  5. Cross-validate analyze_user_entity verdict against SigninLogs. This step is mandatory. AI agent analysis of entity data is subject to cross-prompt injection manipulation (covered in the guardrails section below). Do not act on the entity verdict until you have cross-referenced it against raw SigninLogs or AADSignInEventsBeta records.

The output is a structured triage brief: entity risk assessment, baseline deviation summary, blast radius scope, and a recommended action — monitor, contain, or escalate. The analyst reviews and approves. The agent does not take containment actions autonomously.

Measured time savings: 45-90 minutes down to 5-10 minutes per entity, based on practitioner deployment data.
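The "agent proposes, analyst approves" step at the end of that workflow is worth encoding explicitly, so an unvalidated verdict can never escalate on its own. A minimal sketch with illustrative field names:

```python
from dataclasses import dataclass, field

@dataclass
class TriageBrief:
    """Structured output of the entity triage workflow (field names illustrative)."""
    upn: str
    risk_assessment: str                        # verdict from analyze_user_entity (step 2)
    baseline_deviations: list = field(default_factory=list)  # step 3 findings
    blast_radius: list = field(default_factory=list)         # step 4: in-scope resources
    cross_validated: bool = False               # step 5: checked against raw SigninLogs

def recommend_action(brief: TriageBrief) -> str:
    """The agent proposes; the analyst approves. An unvalidated verdict never escalates."""
    if not brief.cross_validated:
        return "monitor"  # hold until step 5 cross-validation is done
    if brief.risk_assessment == "high" and brief.blast_radius:
        return "escalate"
    if brief.risk_assessment == "high":
        return "contain"
    return "monitor"
```

Putting the gate in code rather than in the prompt matters: a prompt instruction can be overridden by injected content; a hard-coded check cannot.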


Use case 6: Threat intelligence and adversary context

When a Defender alert or Sentinel incident includes indicators — IPs, hashes, domains, usernames — an analyst manually correlating those against threat intelligence data, ATT&CK technique mappings, and prior campaign records is looking at 1-2 hours per incident.

This use case applies when you have indicators and need adversary context: which campaign uses this TTP, what have we seen from this infrastructure before, is this a known tool or a custom implant.

The workflow

  1. Extract indicators from incident telemetry. Pull IPs, file hashes, domains, and process names from the incident entity list or alert details.

  2. Query the Sentinel Threat Intelligence table (ThreatIntelligenceIndicator). Cross-reference indicators against your existing TI feed data. This stays entirely local — no external API calls to third-party TI services.

  3. Query query_lake for historical indicator matches. Has this IP, domain, or hash appeared in your environment before? When and where?

  4. Map to ATT&CK techniques. The agent maps observed behaviors — process injection, scheduled task creation, remote service installation — to MITRE ATT&CK technique IDs from the alert enrichment data Sentinel already carries.

  5. Generate adversary context brief. Indicator summary, historical environment activity, ATT&CK chain reconstruction with confidence rating (high/medium/low based on available evidence), recommended detection rules to validate coverage.

External TI API calls are intentionally excluded from this workflow. Calling third-party threat intelligence APIs from an automated agent loop introduces SSRF risk — an attacker who controls indicator data could construct payloads that manipulate outbound lookups. Use local ThreatIntelligenceIndicator data and query_lake. Microsoft's built-in TI ingestion handles external feed integration separately, where it belongs.

Time savings: 1-2 hours down to 10-15 minutes per incident context brief.
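Step 1 of that workflow benefits from being deterministic code rather than model output. A crude first-pass IOC extractor — the regexes are illustrative and intentionally conservative, and structured entity objects should still be preferred wherever they exist, since free text is attacker-influenced:

```python
import re

# Illustrative patterns -- extend for IPv6, MD5/SHA1, URLs as needed
IOC_PATTERNS = {
    "ipv4": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
    "sha256": r"\b[a-f0-9]{64}\b",
    "domain": r"\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b",
}

def extract_indicators(text):
    """First-pass IOC extraction from incident text.

    Only use this where structured entity lists aren't available; the output
    still goes through the local TI cross-reference, never straight to action.
    """
    found = {}
    for kind, pattern in IOC_PATTERNS.items():
        matches = set(re.findall(pattern, text, re.IGNORECASE))
        if matches:
            found[kind] = sorted(matches)
    return found
```

Everything extracted here feeds the local ThreatIntelligenceIndicator lookup in step 2 — consistent with the no-external-API rule above.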


When AI verdicts become an attack surface

AI agent triage is useful. It's also a new attack surface if you don't account for it.

Three specific risks are documented against entity triage and threat analysis agents:

Verdict laundering. An agent that processes attacker-controlled content — a phishing email body, a filename, a calendar invite subject — and produces a triage verdict has potentially been influenced by that content. If the attacker understands the agent's verdict format, they can craft content that biases the output toward "benign" before a human ever reviews it.

UPN injection via XPIA. Cross-prompt injection attacks embed instructions inside data the agent reads — a username, a document title, a Teams meeting invite. The agent reads the content as part of its analysis. The embedded instruction tells the agent to modify its verdict, skip a validation step, or misattribute the incident. This is a documented attack pattern against production AI deployments, not a theoretical risk.

Time window manipulation. An attacker who understands that your agent compares recent activity against a 30-day baseline can operate below the threshold. Slow-and-low credential abuse that stays within the agent's "normal" window won't generate anomaly flags. The counter is to vary the analysis window and not rely on a single baseline comparison.

The rule

Never close a security incident based solely on an AI agent verdict.

The agent proposes. The analyst reviews. The verdict is not final until a human has cross-referenced the agent's conclusion against source telemetry — raw SigninLogs, raw alert data, raw process execution records.

This isn't about not trusting AI. It's about understanding the threat model. If your agent can be manipulated by attacker-controlled input, the adversary will find that manipulation path before you do if you haven't accounted for it.

Operational requirements for any agent-assisted triage deployment:

  • Log all agent verdict outputs to a custom log table (EntityAnalyzerAudit_CL or equivalent)
  • Alert on verdict discrepancies between agent analysis and analyst final assessment
  • Never configure automated containment on agent verdict alone — human approval required at every containment decision
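The first requirement — logging every verdict — is worth doing from day one. A sketch of an audit record for a custom table like EntityAnalyzerAudit_CL (the field names are illustrative); hashing the exact content the agent analyzed gives you an evidence trail for investigating suspected injection later:

```python
import hashlib
from datetime import datetime, timezone

def build_audit_record(incident_id, entity, agent_verdict, raw_agent_input):
    """Build one audit row for the agent-verdict log (illustrative schema)."""
    return {
        "TimeGenerated": datetime.now(timezone.utc).isoformat(),
        "IncidentId": incident_id,
        "Entity": entity,
        "AgentVerdict": agent_verdict,   # what the agent proposed
        "AnalystVerdict": None,          # filled in at human review
        # Hash of the exact content the agent saw -- lets you pull and inspect
        # the input if a verdict later looks manipulated
        "InputSha256": hashlib.sha256(raw_agent_input.encode("utf-8")).hexdigest(),
    }

def verdict_discrepancy(record):
    """True when analyst and agent disagree -- the alerting condition above."""
    return (
        record["AnalystVerdict"] is not None
        and record["AnalystVerdict"] != record["AgentVerdict"]
    )
```

Alerting on `verdict_discrepancy` is what turns the audit table from passive storage into an injection detection signal.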

Where agents fail

Let's be honest about the failure modes:

Hallucination on technical specifics. KQL schema names, specific CVE details, API parameter syntax — agents will confidently generate plausible-looking technical content that's wrong. I've seen Copilot reference tables that don't exist and Claude generate API calls with invented parameters.

Over-confidence on low-signal alerts. Agents don't have good intuition for "this alert is probably noise" versus "this alert deserves investigation." They'll treat every input with the same level of attention.

Generic output without environment context. Agents without access to your actual environment data produce recommendations that could apply to anyone, which means they're useful to no one.

The fixes

Ground agents in real data sources. Connect them to Sentinel workspaces, Defender APIs, and your actual coverage documentation. Generic prompts produce generic outputs.

Build in human review checkpoints. For anything high-consequence — remediation actions, detection rule deployment, incident escalation — the agent proposes, the human approves.

Start narrow before going broad. Pick one use case. Get it working reliably. Expand from there. Trying to automate everything at once produces nothing useful.


Getting started this week

Two concrete entry points, depending on your situation. Let's walk through both.

Entry point 1: Security Copilot (30 minutes to try)

If you have Microsoft 365 E5 Security or standalone Security Copilot licensing:

  1. Open the Security Copilot experience in the Microsoft Defender portal
  2. Start with the incident summary promptbook — pick a recent incident and ask Copilot to summarize it
  3. Try the natural language to KQL interface — describe a hunting hypothesis and see what query it generates
  4. Validate the output against your actual data

That's enough to assess whether Copilot fits your workflow.

Entry point 2: Custom agent (an afternoon)

If you want more control or don't have Copilot licensing:

  1. Get API access to your Sentinel workspace (Azure REST API)
  2. Set up Anthropic API access (Claude) or Azure OpenAI Service
  3. Write a simple posture summary script using the patterns above
  4. Iterate on the prompt until the output is useful

See the "A third path: the Sentinel MCP server" section earlier in this article for the specific tools and setup steps for that approach.


For security leadership

Before investing in AI agent capabilities, three questions:

  1. Do analysts have the high-quality logs and tool coverage to make agent outputs meaningful? Agents process what you give them. If your telemetry is incomplete, agent analysis will be incomplete.

  2. What is the current cost of routine reporting and research tasks? Quantify it. If your senior analysts spend 30% of their time on tasks agents could handle, that's the value case.

  3. Where is analyst fatigue highest? That's where agents deliver the most immediate value. Not the most complex work — the most repetitive work.

The technology is ready. The constraint is usually data foundation, not agent capability.


Where this lands

This is where the series comes together.

You now have the threat picture (pillar article), the tool map (Article 1), the log architecture (Article 2), and the automation layer (this article). The blueprint covers detection and response. The next articles in this series cover the architecture decisions and migration steps that make all of it scale.

Start with one threat. One tool. One log source. One agent use case.

Map a specific threat from the Microsoft Digital Defense Report to your tool stack. Verify you have the logs to detect it. Build or deploy one agent workflow to accelerate your response.

Iterate from there.

These four articles are designed to be a blueprint. The value is in the implementation, not the reading. Pick one use case from this article. Deploy it this week.

Let me know what you build.


This article is part of the Threat-Informed Defense Series: The Agentic SOC. See the pillar article for the complete framework.