MCP servers have become a go-to for AI-driven access to services and capabilities. In cybersecurity, tools, agents, and people use them daily to fetch logs, hunt threats, or automate responses - straightforward in theory.
In practice, though, they often ship without enterprise safeguards, and LLMs' unpredictable outputs can trigger expansive actions and queries.
As someone working with the Sentinel data lake daily, I've used the official MCP server often for targeted data pulls and seen enterprises adopt it broadly. A few times, though, missing default safeguards led to high costs from unchecked queries - something I've encountered firsthand as well.
In this post, I detail a real-world example and present a patch for the official MCP server, showing the options you have and the pitfalls to watch for when combining AI with MCP servers.
MCP Cost Breakdown and Data Lake Realities
Cost Elements
Using an MCP server introduces layered costs, even if they’re often modest at first glance.
- LLM baseline: Any LLM integration incurs token-based fees, though free tiers and models exist.
- Sampling overhead: MCP can use sampling - utilizing your LLM - dipping into the same credit pool.
- MCP cost: Minimal in my experience - usually covered by Azure Functions or containers.
- Downstream service fees: Cost of using the end solution - like paying for queries in Sentinel data lake.
Data lake scans via MCP are billed at 0.005 USD/GB (East US, current pricing). While low per query, expenses compound with scale: companies now push 5-10x more data into lakes, because the cheaper tier lets them retain what they'd previously dropped. Overall, more data is stored in Sentinel for less cost.
A client with steady 1 TB/day ingestion? Looking back 90 days scans 90 TB - 450 USD at current rates. Intentional? Manageable. Accidental? Ouch.
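The arithmetic behind that number is simple; a quick sketch using the 0.005 USD/GB East US rate and decimal units (1 TB = 1000 GB), both taken from the figures above:

```python
# Sketch: estimate the cost of one full data lake scan at the quoted
# East US rate. Assumes decimal units (1 TB = 1000 GB).
RATE_USD_PER_GB = 0.005

def scan_cost_usd(daily_ingest_tb: float, lookback_days: int) -> float:
    """Cost of scanning `lookback_days` of steady ingestion once."""
    scanned_gb = daily_ingest_tb * 1000 * lookback_days
    return scanned_gb * RATE_USD_PER_GB

# 1 TB/day ingestion, 90-day lookback -> 90 TB scanned
print(scan_cost_usd(1, 90))  # 450.0
```

Note that this is the cost of a single query; an agent retrying or fanning out across tables multiplies it.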
The Core Problem in the Data Lake: Unbounded Queries
Sentinel's GUI (Portal, Hunting, Lake UI) defaults to 24-hour windows if the user doesn't specify one. The MCP server ignores this convention: without an explicit TimeGenerated filter, queries process the entire table.
Why does this happen so often?
- Users expect 24h defaults from Sentinel habits.
- Legacy SIEM queries were "free", so careful filtering was optional.
- LLM errors: Vague prompts interpreted loosely by non-deterministic LLMs.
Lacking native safeguards like time limits, MCP exposes teams to avoidable financial risk.
The Risk Scenario: Vague Prompts Trigger Full Scans
When prompting an LLM via MCP to retrieve data lake information, precision is critical. Ambiguous instructions often yield KQL queries lacking a TimeGenerated filter - scanning every row in targeted tables, from day one to present.
Take a Microsoft Foundry agent as an example: by default, it generates and executes broad queries without time bounds. Whether run by an automated agent or a user directly, this can incur substantial costs - especially in high-volume Sentinel lakes.
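To make the failure mode concrete, here is a minimal client-side check (a hypothetical helper, not part of the official server) that flags KQL missing a TimeGenerated reference before it is executed:

```python
import re

# Hypothetical guardrail: flag KQL that never mentions TimeGenerated.
# A real implementation would parse the query; this sketch only checks
# for the column name anywhere in the text.
TIME_FILTER = re.compile(r"\bTimeGenerated\b")

def has_time_bound(kql: str) -> bool:
    """Rough check: does the query reference TimeGenerated at all?"""
    return bool(TIME_FILTER.search(kql))

unbounded = "SecurityEvent | where EventID == 4625 | summarize count() by Account"
bounded = "SecurityEvent | where TimeGenerated > ago(1d) | where EventID == 4625"

print(has_time_bound(unbounded))  # False -> would scan the whole table
print(has_time_bound(bounded))    # True
```

The unbounded variant is exactly what a vague prompt tends to produce - and it scans every row from day one to present.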
[Image!!!] My test run.
Next, we’ll explore targeted solutions to enforce safeguards.
Prompt Engineering: The “Pretty Please” Approach
Solutions vary by application and desired assurance level.
Instruct the LLM via system prompts to follow specific behaviors - like mandating time filters. This isn’t foolproof against malicious inputs but works reliably for trusted users most of the time.
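A hedged example of the kind of instruction this means - the wording below is illustrative, not the exact prompt used in my tests:

```
When generating KQL for the Sentinel data lake:
- Always include a TimeGenerated filter; default to the last 24 hours
  (| where TimeGenerated > ago(1d)) unless the user specifies a range.
- If the user explicitly asks for an unbounded or very long range, warn
  them that the query will scan the full table and is billed per GB
  scanned, and ask for confirmation before running it.
```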
In our scenario, we don't want to prevent the user from intentionally running a query without a TimeGenerated filter. Rather, we want to prevent an expensive scan caused by an accidentally missing filter.
The following example from a Microsoft Foundry agent demonstrates this: adding a system prompt that always warns users of query implications and enforces TimeGenerated filters transforms risky requests. Users now receive confirmation prompts (raising awareness) alongside bounded queries.
[Image!!!]
Unfortunately, while effective most of the time, this approach is subject to LLM non-determinism. In my testing, the prompt was ignored multiple times. [Image!!!]
Note: Prompts can get ignored, and it varies by model - what clicks with one doesn’t always work with another.
Prompt engineering is effective in agentic workflows because the agent can follow structured tool instructions. With direct MCP calls from a client, you lose that layer of customizable guidance.
If you operate your own MCP server, you can embed best-practice recommendations in the server and tool descriptions so the LLM can follow them. These recommendations, however, don't carry the same weight as a system prompt. And when you don't even control the MCP server, you need to be more inventive about steering behavior.
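As a sketch of that first option, the guidance can live directly in the tool definition the server advertises. The dict below mirrors the general shape of an MCP tool listing (name, description, inputSchema); the field contents are illustrative, not the official server's actual definition:

```python
# Sketch: embedding best-practice guidance in an MCP tool description.
# Field contents are illustrative, not the official server's definition.
QUERY_TOOL = {
    "name": "query_lake",
    "description": (
        "Run a KQL query against the Sentinel data lake. "
        "IMPORTANT: always include a TimeGenerated filter "
        "(e.g. '| where TimeGenerated > ago(1d)'). Queries without one "
        "scan the entire table and are billed per GB scanned."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {"kql": {"type": "string"}},
        "required": ["kql"],
    },
}

# The LLM sees this description when listing tools and can (but is not
# guaranteed to) honor the embedded guidance.
print(QUERY_TOOL["name"])  # query_lake
```

This keeps the hint close to the tool call itself, but as noted above, a description is a suggestion to the model, not an enforced constraint.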