Shadow Agents: How to Find the AI You Don't Know You're Running
Most enterprises have more AI agents running in production than they realize. Learn how to discover ungoverned agents, build a complete inventory, and establish onboarding processes that prevent shadow AI from becoming a security and compliance liability.
The average enterprise has two to three times more AI agents running in production than its security team can account for. Discovering them and keeping them governed requires a fundamentally different approach than most organizations have today.
Wondering when ungoverned AI will become a serious enterprise liability? For many organizations, it already has. While security and compliance teams are focused on the AI agents they deliberately deployed, a parallel ecosystem of autonomous systems has quietly taken root, built by well-intentioned engineers and business teams, running without review, and accumulating risk with every action they take.
We call these shadow agents. And in discovery audits conducted across enterprises in 2025 and 2026, we found them in every organization we examined.
The scale of what organizations don’t know
In early 2026, a mid-sized financial services firm was preparing for a SOC 2 Type II audit. Compliance asked engineering leadership a simple question: how many AI agents are running in production?
The official answer was eight. The auditors found 23.
The fifteen undisclosed agents weren’t the product of rogue actors. A credit analysis team had built a system that pulled customer financial data, ran it through GPT-4, and stored the results in a shared Google Drive. Customer success had deployed an agent that drafted personalized ticket responses using the full customer database. An engineering team was using an agent to summarize deployment logs, logs that contained live authentication tokens.
Every one of these agents was built to solve a real problem. None had undergone security review. One of them had been sending customer Social Security numbers, income figures, and credit scores to a third-party LLM API for seven months without a data processing agreement in place.
The SOC 2 audit did not go well.
This isn’t an isolated case. Based on discovery audits we’ve conducted, a typical enterprise with more than 500 employees has two to three times more AI agents running in production than its security team knows about. Sixty-five percent of those shadow agents have access to sensitive data, including customer PII, financial records, and internal credentials. The average shadow agent runs for 4.5 months before detection.
Why shadow agents are structurally inevitable and what makes them dangerous
Shadow agents aren’t a failure of individual judgment. They’re a predictable outcome of two forces moving in opposite directions: the cost of deploying an AI agent has collapsed, while governance processes haven’t moved at all.
Three years ago, building an agent required ML engineering expertise, infrastructure setup, and meaningful budget approval. Today, any developer with an API key can deploy a functional agent in an afternoon using LangChain, CrewAI, or AutoGen. Business analysts are building agentic workflows without writing a single line of code. The informal technical barrier that once slowed deployment has effectively disappeared.
At the same time, most enterprise security review processes were designed for traditional software. They involve architecture review boards, security questionnaires, and approval chains that span weeks. When a team can solve a real business problem before lunch, a three-week review process isn’t just inconvenient. It’s irrational from their perspective. So they deploy now. They formalize later. Later doesn’t come.
There’s a third factor that compounds the problem: many teams don’t realize they’ve built an agent at all. A scheduled script that calls an LLM API, processes the response, and takes an action based on the output is an agent by every functional definition. But the team that built it thinks of it as “just a script.” If the organization’s governance policy uses the word “agent” without defining it, these systems fall through the cracks by default.
The result is a category of risk that’s uniquely difficult to manage: you cannot assess, mitigate, or monitor risks you don’t know exist.
The specific risks shadow agents create fall into four categories.
Uncontrolled data exposure. Shadow agents routinely access sensitive data and transfer it to third-party LLM providers without data processing agreements, encryption requirements, or retention controls. Under GDPR, CCPA, and multiple financial regulations, these transfers may constitute unauthorized data processing, often triggering mandatory breach notification requirements that the organization doesn’t discover until an audit.
Direct compliance liability. The EU AI Act requires organizations to maintain a registry of all AI systems. Shadow agents are, by definition, unregistered. SOC 2 auditors expect controls over automated decision-making. SEC examiners ask financial firms to account for algorithmic decision-making. Shadow agents make the honest answer to all of these questions “we don’t know.”
Cost exposure without accountability. Shadow agents consume API tokens, compute, and storage outside any budget or cost allocation. A single poorly optimized agent can generate thousands of dollars in monthly API costs. Across a dozen shadow agents, organizations face meaningful runaway costs that no one is tracking or accountable for.
Security blind spots. Agents that haven’t undergone security review often have overly broad permissions, hardcoded credentials, and no input validation. They’re prime targets for prompt injection attacks and can serve as entry points for broader system compromise, operating as persistent threats for months without detection.
Discovering what you don’t know: A multi-method approach
No single discovery technique catches everything. In our experience, network traffic analysis alone catches 50 to 60 percent of shadow agents. Combining four methods gets organizations above 90 percent coverage.
Network traffic analysis is the highest-leverage starting point. Monitor outbound connections to known LLM API endpoints, including OpenAI, Anthropic, Google Vertex AI, Azure OpenAI, Cohere, and Mistral. Any connection that doesn’t originate from a registered agent is a shadow agent candidate. This step can typically be completed in a day using existing network monitoring infrastructure.
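As a concrete sketch, the matching logic might look like the following. The endpoint list is illustrative rather than exhaustive, and the connection-log format (source name, destination host) is an assumption; adapt it to whatever your network monitoring tool exports.

```python
# Sketch: flag outbound connections to LLM API endpoints that don't
# originate from a registered agent. Host list and log format are
# illustrative assumptions, not a complete provider inventory.

LLM_API_HOSTS = {
    "api.openai.com",
    "api.anthropic.com",
    "aiplatform.googleapis.com",
    "openai.azure.com",   # Azure OpenAI endpoints end with this suffix
    "api.cohere.ai",
    "api.mistral.ai",
}

# Sources already present in the agent inventory (hypothetical names).
REGISTERED_SOURCES = {"support-bot-prod", "docs-summarizer"}

def is_llm_endpoint(host: str) -> bool:
    """Match exact hosts and provider-owned subdomains."""
    return any(host == h or host.endswith("." + h) for h in LLM_API_HOSTS)

def shadow_candidates(connections):
    """connections: iterable of (source_name, destination_host) tuples.
    Returns source names calling LLM APIs that aren't in the inventory."""
    return sorted({
        src for src, dst in connections
        if is_llm_endpoint(dst) and src not in REGISTERED_SOURCES
    })

flows = [
    ("support-bot-prod", "api.openai.com"),       # registered: fine
    ("cron-log-summary", "api.anthropic.com"),    # unregistered: candidate
    ("billing-svc", "db.internal.example.com"),   # not an LLM endpoint
]
print(shadow_candidates(flows))  # ['cron-log-summary']
```

The suffix match matters in practice: Azure OpenAI and Vertex AI traffic typically goes to region-specific subdomains rather than a single host.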
Cloud billing and API key audits catch what network scanning misses. Review cloud provider billing for AI-related charges. AWS Bedrock, Azure OpenAI Service, and Google Vertex AI all appear as distinct line items. Cross-reference those charges against known agents. Audit LLM provider API keys: how many exist, who created them, and which projects they’re associated with. Orphaned or unattributed API keys are strong indicators of shadow deployments.
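The cross-referencing step can be sketched as a simple pass over exported billing rows. The service names, row fields, and cost-allocation tag are assumptions modeled on a generic cloud billing export; substitute your provider's actual field names.

```python
# Sketch: flag AI-service billing line items with no cost-allocation tag
# mapping them to a registered agent. Row format is an illustrative
# assumption based on a generic billing export.

AI_SERVICES = {"Amazon Bedrock", "Azure OpenAI Service", "Vertex AI"}

def unattributed_ai_spend(billing_rows, registered_tags):
    """billing_rows: dicts with 'service', 'cost_usd', 'agent_tag' (may be None).
    Returns (flagged_rows, total_unattributed_cost_usd)."""
    flagged = [
        row for row in billing_rows
        if row["service"] in AI_SERVICES
        and row.get("agent_tag") not in registered_tags
    ]
    return flagged, round(sum(r["cost_usd"] for r in flagged), 2)

rows = [
    {"service": "Amazon Bedrock", "cost_usd": 412.50, "agent_tag": "support-bot-prod"},
    {"service": "Amazon Bedrock", "cost_usd": 1280.00, "agent_tag": None},   # orphaned
    {"service": "Vertex AI", "cost_usd": 95.10, "agent_tag": "unknown-team-key"},
    {"service": "Amazon S3", "cost_usd": 30.00, "agent_tag": None},          # not AI
]
flagged, total = unattributed_ai_spend(rows, {"support-bot-prod", "docs-summarizer"})
print(total)  # 1375.1
```

Each flagged row then becomes a discovery lead: trace the key or tag back to a team, and either register the agent or revoke the credential.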
Code repository scanning surfaces agents in development. Automated scanning for agent framework imports (LangChain, LlamaIndex, CrewAI, AutoGen, Semantic Kernel), LLM SDK references, and hardcoded API keys identifies repositories that contain agent-related code but aren’t listed in the agent inventory.
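A minimal scanner for a single source file might look like this. The import patterns and key format are illustrative assumptions; real secret-scanning tools use far more robust rules, and provider key formats vary.

```python
import re

# Sketch: scan source text for agent-framework imports and hardcoded
# LLM API keys. Patterns are illustrative, not production-grade rules.

FRAMEWORK_IMPORTS = re.compile(
    r"^\s*(?:from|import)\s+(langchain|llama_index|crewai|autogen|semantic_kernel)\b",
    re.MULTILINE,
)
# OpenAI-style secret keys start with "sk-"; the exact format is an assumption.
HARDCODED_KEY = re.compile(r"""["'](sk-[A-Za-z0-9\-_]{20,})["']""")

def scan_source(text: str) -> dict:
    """Return detected frameworks and a count of likely hardcoded keys."""
    return {
        "frameworks": sorted(set(FRAMEWORK_IMPORTS.findall(text))),
        "hardcoded_keys": len(HARDCODED_KEY.findall(text)),
    }

sample = '''
from langchain.agents import initialize_agent
import autogen
client = OpenAI(api_key="sk-abc123def456ghi789jkl012")
'''
print(scan_source(sample))
# {'frameworks': ['autogen', 'langchain'], 'hardcoded_keys': 1}
```

Run across all repositories, any file with a framework hit that maps to a deployed service but not to an inventory entry is a shadow agent candidate; any hardcoded key is an immediate remediation item regardless.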
Organizational outreach catches what technical scanning cannot. Some shadow agents run on individual laptops, use personal API keys, or operate through no-code platforms that don’t appear in infrastructure logs. Direct team surveys, framed around discovery rather than accountability, surface these systems. The question to ask is broad and non-judgmental: do you use any AI tools, automations, or agents in your workflow?
Building an inventory that doesn’t become stale
Discovery is a point-in-time exercise. Its value depends entirely on what comes next: a living inventory that captures enough information about each agent to support ongoing governance.
A complete agent inventory documents seven things for each agent: identity and ownership (the team that built it and the individual accountable for it); business purpose (what problem it solves and what tasks it performs); technical architecture (LLM providers, models, and frameworks); tool access (every external service, API, and database the agent can invoke); data access (what data it reads and writes, including sensitivity classification); security review status; and cost profile (estimated monthly API and compute spend).
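The seven fields translate naturally into a structured record. The field names, status values, and sensitivity labels below are illustrative choices, not a standard schema, but they show how the inventory directly supports the prioritization described next.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Sketch of an inventory record covering the seven fields above.
# Field names and enum values are illustrative assumptions.

@dataclass
class AgentRecord:
    name: str
    owner_team: str
    accountable_person: str
    business_purpose: str
    llm_stack: list        # providers, models, frameworks
    tool_access: list      # external services, APIs, databases it can invoke
    data_access: dict      # data source -> sensitivity classification
    security_review: str = "pending"   # pending | approved | rejected
    last_reviewed: Optional[date] = None
    est_monthly_cost_usd: float = 0.0

    def is_high_risk(self) -> bool:
        """Highest-priority remediation: sensitive data without approved review."""
        sensitive = any(c in ("pii", "financial", "credentials")
                        for c in self.data_access.values())
        return sensitive and self.security_review != "approved"

rec = AgentRecord(
    name="ticket-drafter",
    owner_team="customer-success",
    accountable_person="j.doe",
    business_purpose="Draft personalized ticket responses",
    llm_stack=["OpenAI GPT-4", "LangChain"],
    tool_access=["Zendesk API", "customer DB"],
    data_access={"customer DB": "pii"},
)
print(rec.is_high_risk())  # True
```

Encoding the risk test in the record itself keeps prioritization mechanical: the remediation queue is simply every record where `is_high_risk()` returns true.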
The inventory is not a one-time document. It needs to be updated whenever an agent is deployed, modified, or decommissioned, including verification that decommissioned agents have had all associated API keys, service accounts, and permissions actually revoked.
Prioritization matters. Not all shadow agents require the same urgency. Agents accessing customer PII, making decisions that affect customers or financial outcomes, or transferring data to third-party APIs without agreements represent the highest-priority remediation targets.
Preventing the next wave
Discovery solves the problem you have today. Without structural changes, shadow agents will reappear as fast as you find them. Prevention requires both technical controls and a fundamentally different approach to governance design.
The core principle: compliance must be easier than non-compliance. If the onboarding process takes three weeks and five approvals, teams will bypass it. If it takes two hours and a self-service form with automated checks, they’ll use it. Organizations that get this right reduce new shadow agent creation by roughly 80 percent within six months.
A well-designed onboarding process includes a self-service registration form that captures inventory fields; automated security checks that flag high-risk configurations; tiered review based on risk level (low-risk agents auto-approved with monitoring, high-risk agents requiring full architecture review); and automatic provisioning of governance infrastructure including audit logging, policy attachment, and cost monitoring.
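The tiered-review decision can be a small, auditable function over the registration form. The form field names and the three tiers below are illustrative policy choices; the point is that the routing logic is explicit and automatic rather than a committee decision.

```python
# Sketch: map a self-service registration form to a review path.
# Field names and tier thresholds are illustrative assumptions.

def review_tier(form: dict) -> str:
    """Decide the review path for a newly registered agent."""
    high_risk = (
        form.get("accesses_pii")
        or form.get("makes_customer_facing_decisions")
        or form.get("third_party_data_transfer_without_dpa")
    )
    if high_risk:
        return "full-architecture-review"
    if form.get("accesses_internal_data"):
        return "security-checklist-review"
    return "auto-approve-with-monitoring"

print(review_tier({"accesses_pii": True}))            # full-architecture-review
print(review_tier({"accesses_internal_data": True}))  # security-checklist-review
print(review_tier({}))                                # auto-approve-with-monitoring
```

Because the low-risk path resolves instantly, most registrations complete in minutes, which is what keeps the governed path faster than the ungoverned one.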
Technical controls reinforce the process. Network-level monitoring that alerts on new, unregistered connections to LLM endpoints makes shadow agents visible within hours rather than months. API gateway policies that route all LLM calls through a centralized proxy provide a logging and enforcement point that doesn’t depend on team behavior. Restricting API key provisioning to a controlled process closes the most common entry point for shadow agents.
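The enforcement logic at such a proxy reduces to a small authorization check before forwarding. This sketch omits the actual HTTP forwarding; the key-to-agent mapping and the policy are illustrative assumptions.

```python
import logging

# Sketch: the check a centralized LLM proxy applies before forwarding a
# request. Key names and policy are illustrative; forwarding is omitted.

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm-proxy")

# Proxy-issued key -> registered agent identity (hypothetical values).
REGISTERED_KEYS = {"key-support-bot": "support-bot-prod"}

def authorize(proxy_key: str, destination: str) -> str:
    """Return the agent identity if the call may proceed, else raise."""
    agent = REGISTERED_KEYS.get(proxy_key)
    if agent is None:
        log.warning("blocked unregistered key calling %s", destination)
        raise PermissionError("unregistered agent; complete onboarding first")
    log.info("agent=%s dest=%s", agent, destination)  # audit trail
    return agent

print(authorize("key-support-bot", "api.openai.com"))  # support-bot-prod
```

The design choice that matters is that teams never hold provider keys directly: the proxy holds them, so every LLM call is logged and attributable by construction rather than by team discipline.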
Finally, organizations need a published, concrete definition of what constitutes an AI agent, one specific enough to include scheduled scripts, no-code automations, and embedded SaaS features. If the definition is ambiguous, systems fall through the cracks by default.
Where to begin
For most organizations, the right starting point is a focused, time-bounded discovery effort rather than a comprehensive program.
In week one: Run a network traffic scan. Audit API keys and cloud billing. This combination typically surfaces the majority of shadow agents and produces a risk-prioritized list of systems requiring immediate attention.
In weeks two through four: Build the initial inventory from discovery findings. Begin security reviews for the highest-risk agents, particularly those with access to customer PII or those transferring data to third-party APIs. Establish interim monitoring for agents that can’t be reviewed immediately.
In month two: Launch a lightweight onboarding process. Publish an agent definition. Communicate to all teams that new agents must be registered before deployment, and make the registration process fast enough that it doesn’t create an incentive to bypass it.
Ongoing: Run quarterly discovery sweeps. Maintain automated monitoring for new agent activity. Track agents through their full lifecycle, including decommissioning.
The imperative is visibility
The financial services firm from our opening scenario eventually brought all 23 agents under governance. It took three months of remediation, including retroactive security reviews, data access policy implementation, and compliance documentation. The teams that built the shadow agents weren’t penalized. Instead, the firm used the experience to design an onboarding process that takes less than a day. Shadow agents still appear, but they’re now detected in hours.
That outcome is achievable for most organizations. But it requires treating shadow agent discovery not as a one-time audit exercise, but as an ongoing operational capability, one that combines technical monitoring, organizational outreach, and a governance process designed to be used rather than bypassed.
Every shadow agent in your environment is an autonomous decision-maker operating without oversight. It may be accessing data it shouldn’t touch, making choices that affect customers without any audit trail, or accumulating costs no one is tracking. The longer it runs undetected, the more exposure it creates.
The question isn’t whether your organization has shadow agents. It does. The question is whether you’ll find them before your auditor does, and whether the governance infrastructure you build will prevent the cycle from repeating.
This article draws on discovery audits and governance implementations conducted with enterprise clients in 2025 and 2026.
Frequently Asked Questions
What are shadow AI agents and why are they dangerous?
Shadow AI agents are autonomous AI systems running in production environments that have not been registered, reviewed, or approved through an organization’s security and governance processes. They are dangerous because they operate outside the visibility of security, compliance, and risk teams. Shadow agents may access sensitive data without proper authorization, make decisions that affect customers without audit trails, consume cloud resources without cost controls, and create compliance violations that the organization does not discover until an audit or incident. Because no one is monitoring them, shadow agents can run for months or years before they are detected, accumulating risk with every action they take.
How do shadow AI agents end up in production environments?
Shadow agents typically emerge through three paths. First, engineering teams spin up agents for internal workflows using readily available frameworks and API keys without going through formal security review because the process is too slow or too burdensome. Second, business teams use no-code and low-code platforms to build automated workflows that incorporate LLM calls, creating agents without realizing they have done so. Third, third-party SaaS tools embed agent capabilities that activate when customers enable certain features, introducing agents that the organization never explicitly deployed. In all three cases, the agents are created with good intentions to solve real problems, but they bypass the governance processes that would ensure they operate safely.
How do you discover AI agents that are already running in your environment?
Discovery requires a multi-layered approach. Start with network traffic analysis to identify outbound calls to known LLM API endpoints like OpenAI, Anthropic, Google, and Cohere. Review cloud billing records for charges related to AI services that have not been formally provisioned. Scan code repositories for API keys, SDK imports, and agent framework dependencies. Examine infrastructure-as-code templates and container registries for agent-related deployments. Query identity and access management systems for service accounts with permissions to AI provider APIs. Finally, conduct team surveys and interviews to surface agents built with no-code tools or running on individual workstations. No single method catches everything, which is why a combination of technical scanning and organizational outreach is necessary.
What should an AI agent inventory include?
A comprehensive agent inventory should document each agent’s identity and ownership, including the team that built it and the individual responsible for it; the business purpose it serves and the tasks it performs; the LLM providers and models it uses; every tool and external service it can access; and the data sources it reads from and writes to. It should also capture the deployment environment and infrastructure it runs on, the security review status and date of last review, the governance policies applied to it, the cost profile including estimated monthly spend on API calls and compute, and the compliance requirements that apply based on its function and the data it handles. The inventory should be a living document updated whenever an agent is deployed, modified, or decommissioned.
How do you prevent new shadow agents from appearing after initial discovery?
Prevention requires both technical controls and organizational processes. On the technical side, implement network-level monitoring that alerts when new connections to LLM API endpoints are detected from unregistered sources. Use API gateway policies that require all LLM API calls to route through a centralized proxy where they can be logged and policy-checked. Restrict the provisioning of AI service API keys to a controlled process that includes security review. On the organizational side, establish a lightweight agent onboarding process that teams can complete in hours rather than weeks so they have no incentive to bypass it. Create clear documentation explaining what constitutes an AI agent and why registration matters. Include agent governance in engineering onboarding and training. The goal is to make the governed path the easiest path.