Securing AI Workloads in AWS: Why Bedrock and SageMaker Need Runtime Detection and AI-Powered Response
Published 06/03/2026
Attackers are using AI to break into AWS environments and then turning around and using your AI — Bedrock and SageMaker — as the target. Posture alone can't keep up. Here's how cloud detection and response (CDR) solutions and AI-powered threat stories close the gap.
On November 28, 2025, the Sysdig Threat Research Team watched an attacker compromise an AWS environment and reach full administrator privileges in less than 10 minutes.
Not 10 hours. Ten minutes.
The initial access was familiar — credentials left in a public S3 bucket. What happened next was not. The attacker enumerated Amazon Bedrock, confirmed model invocation logging was disabled, and invoked multiple foundation models including Claude, Llama, and DeepSeek R1. They created Lambda backdoors to mint Bedrock credentials. They tried to spin up p4d GPU instances for model training. And the whole operation showed signs of being LLM-driven — scripts referencing hallucinated GitHub repos, reconnaissance moving at machine speed, decisions made faster than any human attacker could make them.
Two takeaways. First, attackers are using AI to attack AWS faster than humans can defend it. Second — and this is the part most security programs are still catching up on — Amazon Bedrock and Amazon SageMaker are no longer just AI services your developers are experimenting with. They're high-value targets. They have privileged identities. They run expensive workloads. They sit on top of sensitive training data. And right now, in most organizations, they're being secured the way developers' first AWS sandboxes were secured five years ago.
This post is about closing that gap. We'll walk through what attackers actually do to Bedrock and SageMaker, why posture-only AI security can't catch it, where runtime detection and AI-powered response come in, and how CDR solutions, paired with AI-SPM and CIEM, give security teams a fighting chance against AI-speed adversaries.
The new AI attack surface in AWS
AWS gives you two flagship managed AI services: Amazon Bedrock for foundation model access, and Amazon SageMaker for end-to-end ML model development and deployment. Both have become enterprise standards. Both have also become enterprise problems.
Here's what the data says about the state of AI workload security in AWS right now, drawing on Tenable Research's Cloud AI Risk Report 2025 and the other industry reports cited in the table:
| Stat | What it means | Source |
| --- | --- | --- |
| 91% | Of organizations using Amazon SageMaker had at least one notebook instance with root access, meaning administrator privileges that allow changes to system-critical files and the AI model itself. | |
| 14% | Of organizations using Amazon Bedrock did not explicitly block public access to at least one AI training bucket. 5% had at least one overly permissive bucket. | Tenable Cloud AI Risk Report 2025 |
| 97% | Of AI-related security breaches occurred in organizations without proper AI access controls. | |
| 20% | Of organizations experienced a breach linked to shadow AI (unsanctioned AI tools without IT or security oversight). These breaches added an average of $670K to total cost. | IBM Cost of a Data Breach 2025 |
| 38% | Of organizations are battling a “toxic cloud triad”: workloads that are publicly exposed, critically vulnerable, AND highly privileged. AI workloads frequently match all three. | |
| 8 minutes | Time from leaked credentials to full AWS admin via LLM-assisted attack in the Sysdig-observed November 2025 incident, including Bedrock LLMjacking. | Sysdig Threat Research Team |
Read those numbers as a system. Most AI workloads are overprivileged. Many are exposed. Nearly every organization that suffered an AI-related breach lacked proper AI access controls. And attackers are now operating at speeds that make manual response impossible. This is the operating environment your AI security strategy has to assume.
How attackers actually target Bedrock and SageMaker
If you only know one cloud AI attack pattern, know this one. LLMjacking — first identified by Sysdig in May 2024 — is the dominant motivation for credential theft in AWS environments running AI workloads. The attacker steals a key, validates Bedrock access, checks whether model invocation logging is enabled, and if it isn't, starts invoking foundation models on your bill.
Why? Two reasons. First, frontier model access is expensive, and the victim pays for it: Sysdig observed costs climbing toward $100,000 per day on victim accounts running Claude 3 Opus. Second, attackers resell hijacked LLM access through unfiltered chatbot services. There's a thriving underground economy for someone else's Bedrock quota.
But LLMjacking is just the most visible technique. The full picture looks like this:
Bedrock attack patterns
- LLMjacking. Stolen IAM credentials used to invoke foundation models. Look for
- Model invocation logging tampering. Attackers call GetModelInvocationLoggingConfiguration to check if you'd see them. If logging is off, the attack proceeds. If it's on, they try to disable it — mapped to MITRE ATT&CK T1562.008 (Disable or Modify Cloud Logs).
- Cross-region inference abuse. Attackers distribute invocations across regions to maximize throughput and minimize detection — your bill spikes, your single-region alerting misses it.
- Marketplace agreement abuse. Calls to AcceptAgreementRequest and GetListingView to enable model access the account didn't previously have.
- Guardrail removal. Attackers strip Bedrock guardrails to bypass content filters, then use the unrestricted model for fraud, social engineering, or adult content services.
- Public training bucket exfiltration. RAG data in publicly accessible S3 buckets gets vacuumed up — that's exactly how the Sysdig-observed November 2025 breach started.
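Several of the patterns above can be spotted by scanning CloudTrail records for a handful of event names. The sketch below is illustrative, not a production detection: it assumes the records are already parsed into dicts with the standard `eventName`, `awsRegion`, and `userIdentity.arn` fields, and the signal labels, the `bedrock_signals` helper, and the three-region threshold are assumptions made for this example.

```python
from collections import defaultdict

# Assumed mapping from CloudTrail eventName to a signal label (illustrative only).
SIGNAL_BY_EVENT = {
    "GetModelInvocationLoggingConfiguration": "logging-recon",
    "DeleteModelInvocationLoggingConfiguration": "logging-tamper",
    "DeleteGuardrail": "guardrail-removal",
    "AcceptAgreementRequest": "marketplace-agreement",
    "GetListingView": "marketplace-agreement",
}
INVOKE_EVENTS = {"InvokeModel", "InvokeModelWithResponseStream", "Converse", "ConverseStream"}


def bedrock_signals(events, region_threshold=3):
    """Return {principal ARN: sorted signal labels} for a batch of CloudTrail records."""
    signals = defaultdict(set)
    invoke_regions = defaultdict(set)
    for event in events:
        arn = event.get("userIdentity", {}).get("arn", "unknown")
        name = event.get("eventName", "")
        if name in SIGNAL_BY_EVENT:
            signals[arn].add(SIGNAL_BY_EVENT[name])
        elif name in INVOKE_EVENTS:
            invoke_regions[arn].add(event.get("awsRegion", "unknown"))
    # Invocations spread over many regions match the cross-region abuse pattern.
    for arn, regions in invoke_regions.items():
        if len(regions) >= region_threshold:
            signals[arn].add("cross-region-invocation")
    return {arn: sorted(labels) for arn, labels in signals.items()}
```

A principal that checks the logging configuration and then invokes models in three or more regions would come back with both `logging-recon` and `cross-region-invocation`. On their own these labels are weak evidence; they become useful once combined with identity and baseline context.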
SageMaker attack patterns
- Notebook instance compromise. With 91% of SageMaker users running at least one notebook with root, a single stolen session token can pivot to the underlying ML environment, training data, and IAM roles.
- Direct internet exposure. SageMaker notebooks default to direct internet access; one industry report found this in 97% of SageMaker deployments. Combined with permissive IAM, this is an entry point.
- Training job hijacking. A training job without proper VPC isolation can be intercepted, modified, or used to inject backdoors into the resulting model.
- GPU resource abuse. p4d, p5, and g5 instances are expensive — attackers spin them up for unauthorized model training, cryptomining, or to host their own LLM infrastructure.
- Training data poisoning. Sensitive or PII-laden data in training buckets gets exfiltrated; malicious data gets injected to corrupt model outputs downstream.
- Model artifact theft. Trained model artifacts in S3 — sometimes the most valuable IP an organization owns — get copied to attacker-controlled buckets via cross-account replication (MITRE ATT&CK T1537).
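The notebook-level risks in this list map onto a few fields in the SageMaker `DescribeNotebookInstance` response. Here is a minimal sketch of that check, assuming the response has already been fetched (for example with boto3's `describe_notebook_instance`) and passed in as a dict; the `audit_notebook` helper and its finding labels are hypothetical names for this example.

```python
def audit_notebook(description):
    """Flag risky settings in a DescribeNotebookInstance-shaped dict."""
    findings = []
    # RootAccess and DirectInternetAccess are reported as "Enabled" or "Disabled".
    if description.get("RootAccess") == "Enabled":
        findings.append("root-access-enabled")
    if description.get("DirectInternetAccess") == "Enabled":
        findings.append("direct-internet-access")
    # A notebook with no SubnetId is not attached to a customer VPC.
    if not description.get("SubnetId"):
        findings.append("no-vpc-subnet")
    return findings


# Example input using made-up values.
risky = {
    "NotebookInstanceName": "demo-notebook",
    "RootAccess": "Enabled",
    "DirectInternetAccess": "Enabled",
}
print(audit_notebook(risky))
```

Running the same function across every notebook in every region gives a quick view of how far an environment is from the hardened defaults.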
CASE STUDY: Anatomy of the November 2025 Bedrock breach
Step 1: Initial access. The attacker found AWS credentials in a publicly accessible S3 bucket containing RAG data for AI models. The IAM user had Lambda permissions and limited Bedrock access. Total time to foothold: minutes.
Step 2: Privilege escalation. Using Lambda function code injection, the attacker escalated and moved laterally across 19 unique AWS principals. LLMs assisted with reconnaissance and code generation throughout.
Step 3: Bedrock target. The attacker enumerated both custom and foundation models, then called GetModelInvocationLoggingConfiguration to verify logging was disabled before invoking multiple models: Claude variants, DeepSeek R1, Llama, and others.
Step 4: Persistence. A Terraform script was uploaded that would deploy a public Lambda backdoor capable of generating Bedrock credentials on demand.
Step 5: Resource abuse. Attempts to launch p4d GPU instances for ML workloads followed. Most failed on capacity limits, but the attacker eventually launched a costly GPU instance with a publicly accessible JupyterLab server, an alternative gateway in case the original credentials were rotated.
Total elapsed time from leaked credentials to full administrator control: under 10 minutes. No human SOC analyst could have triaged the original credential alert, investigated, and contained this in time.
Why AI posture management alone falls short
AI Security Posture Management (AI-SPM) is essential. You need it. AI-SPM discovers Bedrock and SageMaker resources, enforces configuration best practices, identifies overprivileged AI identities, and finds exposed training buckets. That's how you prevent the kind of misconfigurations that made the November 2025 attack possible in the first place.
But posture is preventive. It tells you what could go wrong. It doesn't tell you what is going wrong right now.
Posture-only AI security has three structural blind spots:
1. It doesn't see in-flight attacker behavior
Configuration scans run on schedules: hourly, daily, or whenever a change is detected. The Sysdig-observed attack completed in under 10 minutes. By the time a posture tool flagged the new Lambda backdoor or the disabled model invocation logging, the attacker had what they came for. You need real-time control-plane monitoring for that.
2. It can't distinguish legitimate use from abuse
A data scientist calling Bedrock to test prompts and an attacker using the same key to mass-invoke models look identical to a posture scan. Both have valid credentials, both call the same API. The difference is behavioral — volume, timing, geography, model selection, user-agent patterns. Behavioral signals require runtime detection.
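To make the behavioral difference concrete, here is a minimal sketch that compares one hour of a principal's Bedrock invocations against a stored baseline. Everything in it is an assumption made for illustration: the `Baseline` structure, the flattened event shape (`awsRegion`, `modelId`, `userAgent`), and the 3x volume factor. Timing is left out to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Baseline:
    """What 'normal' looked like for one principal (hypothetical structure)."""
    max_hourly_invocations: int = 0
    regions: set = field(default_factory=set)
    model_ids: set = field(default_factory=set)
    user_agents: set = field(default_factory=set)


def deviations(baseline, hour_of_events, volume_factor=3):
    """Return the dimensions on which this hour departs from the baseline."""
    found = []
    if len(hour_of_events) > volume_factor * max(baseline.max_hourly_invocations, 1):
        found.append("volume")
    if {e["awsRegion"] for e in hour_of_events} - baseline.regions:
        found.append("geography")
    if {e["modelId"] for e in hour_of_events} - baseline.model_ids:
        found.append("model-selection")
    if {e["userAgent"] for e in hour_of_events} - baseline.user_agents:
        found.append("user-agent")
    return found


# Made-up baseline and traffic for demonstration.
baseline = Baseline(10, {"us-east-1"}, {"model-a"}, {"Boto3/1.34"})
normal = [{"awsRegion": "us-east-1", "modelId": "model-a", "userAgent": "Boto3/1.34"}] * 5
burst = [{"awsRegion": "eu-west-1", "modelId": "model-b", "userAgent": "python-requests"}] * 40
print(deviations(baseline, normal), deviations(baseline, burst))
```

Both callers hold valid credentials and call the same API, so a configuration scan treats them the same. Only the comparison against history separates them.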
3. It produces alerts, not investigations
Posture findings come as a list of issues to fix. “Bedrock model invocation logging disabled.” “SageMaker notebook with internet access.” “Overprivileged IAM role.” These are useful as a backlog, but during an active intrusion they don't reconstruct the story for the SOC. You need correlation that turns 47 individual events into one narrative of what's happening.
This is exactly the gap cloud detection and response (CDR) was built to close.
AI-powered incident response, step by step
Here's what an AI workload incident actually looks like — using the November 2025 attack pattern as the example. Same chain, different ending.
| Attacker action | What Tenable Cloud Security sees | What happens next |
| --- | --- | --- |
| Credentials from the public S3 bucket used to log in. | Anomalous IAM authentication from a non-baseline IP and user-agent for that principal. | A new threat story opens. Confidence: medium. Severity tied to the principal's privilege level and Bedrock/SageMaker access. |
| GetModelInvocationLoggingConfiguration call. | Recognized as a high-signal pre-attack reconnaissance pattern. | Threat story confidence escalates. MITRE ATT&CK technique IDs (T1580 Discovery, pre-T1562.008) added to the story. |
| Multiple Bedrock model invocations to Claude, Llama, DeepSeek. | LLMjacking pattern matched: burst of invocations across foundation models from a principal with no prior Bedrock history. | Story enriched with cost impact estimate and the specific models being abused. AI-SPM cross-references the affected Bedrock workspace's exposure status. |
| New Lambda function created for credential generation. | T1098.001 Additional Cloud Credentials detected. Persistence pattern. | Response options surfaced: disable Lambda, revoke principal credentials, lock down the affected Bedrock workspace. |
| GPU instance launched with public JupyterLab. | T1496 Resource Hijacking + T1190 Exploit Public-Facing Application combined. | Threat story now spans 7 techniques. Blast radius shows all affected resources. SOC analyst opens ONE story instead of triaging 30+ alerts. |
| Cross-account S3 replication attempted on training data. | T1537 Transfer Data to Cloud Account. Exfiltration signal. | Critical severity. Recommended actions to block replication, isolate principal, and preserve forensic data are presented. |
The difference between this and a posture-only world isn't theoretical. It's the difference between a SOC analyst looking at one investigation that shows the whole attack and an analyst trying to manually connect 30+ alerts across CloudTrail, IAM, Bedrock logs, and SageMaker telemetry — while the attacker is already inside.
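The core idea of folding many alerts into one investigation can be sketched in a few lines. This is not how any particular product builds threat stories; it is a simplified illustration that groups alerts by principal within a time window, and the alert fields, the 30-minute window, and the severity rules are all assumptions. A real correlation engine would also have to follow an attacker across principals, as happened in the November 2025 incident.

```python
from datetime import datetime, timedelta


def build_stories(alerts, window=timedelta(minutes=30)):
    """Group alerts into per-principal stories; a quiet gap longer than `window` starts a new one."""
    stories, open_story = [], {}
    for alert in sorted(alerts, key=lambda a: a["time"]):
        story = open_story.get(alert["principal"])
        if story is None or alert["time"] - story["last_seen"] > window:
            story = {"principal": alert["principal"], "techniques": [], "alerts": []}
            stories.append(story)
            open_story[alert["principal"]] = story
        story["last_seen"] = alert["time"]
        story["alerts"].append(alert["title"])
        if alert["technique"] not in story["techniques"]:
            story["techniques"].append(alert["technique"])
    for story in stories:
        # Assumed rule: exfiltration is critical, three or more techniques is high.
        if "T1537" in story["techniques"]:
            story["severity"] = "critical"
        elif len(story["techniques"]) >= 3:
            story["severity"] = "high"
        else:
            story["severity"] = "medium"
    return stories


# Made-up alerts for demonstration.
t0 = datetime(2025, 11, 28, 12, 0)
alerts = [
    {"principal": "user/demo", "time": t0, "technique": "T1078.004", "title": "Anomalous login"},
    {"principal": "user/demo", "time": t0 + timedelta(minutes=2), "technique": "T1580", "title": "Logging config check"},
    {"principal": "user/demo", "time": t0 + timedelta(minutes=5), "technique": "T1098.001", "title": "Credential-minting Lambda"},
    {"principal": "user/demo", "time": t0 + timedelta(minutes=8), "technique": "T1537", "title": "Cross-account replication"},
    {"principal": "role/other", "time": t0 + timedelta(minutes=3), "technique": "T1580", "title": "Model enumeration"},
]
stories = build_stories(alerts)
```

With the sample data, the four alerts for `user/demo` collapse into one critical story, and the unrelated principal gets its own medium-severity story.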
Best practices for securing Bedrock and SageMaker workloads
Putting this all together, here's the operational checklist. Run it in this order.
- Inventory every AI resource in every cloud account. Use AI-SPM to automatically discover all Bedrock workspaces, SageMaker notebooks, training jobs, endpoints, and the IAM roles, data sources, and KMS keys attached to them. You can't protect what you don't know exists. Shadow AI is the modern shadow IT.
- Enable model invocation logging, and detect when it's probed or disabled. Bedrock model invocation logging is off by default. Turn it on across every account and every region. More importantly, treat any unexpected call to GetModelInvocationLoggingConfiguration (reconnaissance), DeleteModelInvocationLoggingConfiguration (turning Bedrock invocation logging off), or CloudTrail's StopLogging as a high-fidelity attack signal and route it directly to the SOC.
- Lock down SageMaker notebook defaults. Disable direct internet access by default. Eliminate root access on production notebooks. Require VPC isolation on training jobs. Force IAM roles to follow least privilege — the Tenable Cloud AI Risk Report's 91% root-access finding is the gap to close first.
- Apply identity-first controls (CIEM) to AI roles. Bedrock and SageMaker service roles are some of the most over-permissioned identities in modern AWS accounts. Use CIEM to right-size them, identify toxic combinations of permissions, and enforce just-in-time access for sensitive operations like model deployment.
- Block public access on AI training buckets. Treat S3 buckets containing training data, model artifacts, or RAG data as crown jewels. Block public access by default at the account level. Encrypt with customer-managed KMS keys. The November 2025 breach started with a public bucket — most LLMjacking incidents do.
- Add runtime detection (CDR) on top of posture. Posture catches misconfigurations. CDR catches behavior. You need both. Specifically: real-time monitoring of Bedrock invocation patterns, SageMaker notebook activity, training job creation, and GPU instance launches.
- Correlate AI workload alerts into threat stories. Don't make your SOC piece together 30 alerts during an active LLMjacking incident. Use AI-powered correlation that maps the full attack chain — across identity, posture, and runtime — into a single investigation tied to MITRE ATT&CK techniques.
- Set cost-based detection alongside security detection. LLMjacking is observable as a billing anomaly faster than it's observable as a security event in some organizations. Wire AWS Cost Anomaly Detection and Bedrock usage thresholds directly to your security alerting. A surprise $20K Bedrock invoice is a security incident, not a finance question.
- Run AI-specific red team and adversary emulation exercises. Use MITRE ATLAS — the AI-focused adversary tactics framework — alongside ATT&CK for Cloud to test your detections. Stratus Red Team has LLMjacking emulation. Run it. Verify your CDR sees it.
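The cost-based detection item in the checklist can be prototyped before any dedicated tooling is in place. The sketch below does not call AWS Cost Anomaly Detection; it is a stand-alone illustration that flags any hour whose Bedrock spend exceeds a multiple of the median of the preceding hours. The `spend_anomalies` helper, the 24-hour lookback, the 5x multiplier, and the sample figures are all assumptions.

```python
from statistics import median


def spend_anomalies(hourly_spend, lookback=24, multiplier=5.0, floor=1.0):
    """Return indexes of hours whose spend exceeds multiplier x the trailing median."""
    flagged = []
    for i in range(lookback, len(hourly_spend)):
        # The floor keeps a near-zero baseline from flagging trivial spend.
        baseline = max(median(hourly_spend[i - lookback:i]), floor)
        if hourly_spend[i] > multiplier * baseline:
            flagged.append(i)
    return flagged


# Made-up hourly spend in dollars: a steady day, then a sudden burst.
hourly = [2.0] * 24 + [2.5, 40.0, 3.0]
print(spend_anomalies(hourly))
```

A median baseline is used here because a single hijacked hour barely moves it, so the hours after a burst are still judged against normal spend. Whatever the method, the important design choice is that the output goes to security alerting, not just to a finance dashboard.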
The bottom line
In November 2025, an attacker reached AWS administrator access in eight minutes and used a victim's Bedrock account to invoke frontier models on the victim's bill. That's the operating environment for AI workload security right now.
Posture management alone won't keep up. Identity controls alone won't keep up. Runtime detection alone won't keep up — and runtime as a single layer misses the misconfigurations and excessive permissions that made the attack possible in the first place. The answer is all of it, together, in one platform, with AI-powered correlation that compresses a 30-alert incident into a single investigation.
Your developers are going to keep shipping Bedrock and SageMaker workloads faster than your security team can review them. That's not changing. What can change is whether you can see what's happening to those workloads in real time, and respond at the speed the attackers are now operating.
Frequently asked questions
What is LLMjacking?
LLMjacking is an attack where adversaries use stolen cloud credentials to invoke hosted large language models — typically on Amazon Bedrock, Azure OpenAI, or Google Vertex AI — at the victim's expense. The economic motivation is twofold: frontier model access is expensive, and there's a black market reselling hijacked LLM quota through unfiltered chatbot services. Sysdig first documented the attack pattern in May 2024 and has tracked its continued growth since.
Doesn't AWS already protect Bedrock and SageMaker?
AWS handles the security of the underlying services. You're responsible for how you configure and use them: IAM policies, network configuration, logging, data protection, and detecting abuse of legitimate API calls. The shared responsibility model still applies, and most LLMjacking and AI workload attacks succeed entirely within the customer's responsibility zone.
Does GuardDuty cover this?
Amazon GuardDuty has some Bedrock and SageMaker detection capabilities — it can flag suspicious model invocation patterns and removal of guardrails. But GuardDuty operates at the AWS-account level, doesn't correlate across cloud control plane, identity context, and workload behavior the way a CNAPP-integrated CDR does, and doesn't tie findings to exposure context across other domains. Most mature AWS-using organizations layer purpose-built cloud security on top of GuardDuty rather than relying on it alone.
What's the difference between AI-SPM and CDR?
AI-SPM is preventive. It discovers AI resources, enforces secure configurations, and flags misconfigurations and overprivileged access before they're exploited. CDR is detective and responsive. It monitors runtime behavior, surfaces active threats, and supports response actions in real time. You need both.
Can CDR solutions detect LLMjacking specifically?
Yes. CDR solutions monitor Bedrock control-plane activity for the behavioral patterns that characterize LLMjacking — calls to GetModelInvocationLoggingConfiguration, anomalous invocation volume, model enumeration from non-baseline principals, Marketplace agreement changes, cross-region invocation patterns, and guardrail modifications. Detections are mapped to MITRE ATT&CK techniques and correlated with identity, posture, and data context to produce a single threat story.
How does this work for organizations on Azure or GCP?
The same approach extends across clouds. CDR covers Azure AI services (Azure OpenAI, ML workspaces) and Google Cloud Vertex AI alongside Bedrock and SageMaker, with normalized findings and consistent threat stories across providers. Multi-cloud AI workload security is one of the strongest cases for a CNAPP-based approach over native cloud tools.
What MITRE ATT&CK techniques should I watch for in AI workload attacks?
The high-signal ones: T1078.004 (Valid Accounts: Cloud Accounts), T1580 (Cloud Infrastructure Discovery), T1562.008 (Disable or Modify Cloud Logs), T1098.001 (Additional Cloud Credentials), T1496 (Resource Hijacking — especially relevant for GPU abuse), T1530 (Data from Cloud Storage), and T1537 (Transfer Data to Cloud Account). MITRE ATLAS — the adversarial threat landscape for AI systems — complements ATT&CK with AI-specific techniques like prompt injection and model evasion.
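One practical way to start watching for these techniques is a lookup table from CloudTrail event names to technique IDs. The mapping below is an assumption made for illustration, not an official MITRE or AWS mapping, and several of these calls (`RunInstances`, `GetObject`) are routine in most accounts, so a tag is only a starting point for correlation, not an alert on its own.

```python
from collections import Counter

# Assumed, simplified mapping from CloudTrail eventName to an ATT&CK technique ID.
TECHNIQUE_BY_EVENT = {
    "GetModelInvocationLoggingConfiguration": "T1580",
    "DeleteModelInvocationLoggingConfiguration": "T1562.008",
    "StopLogging": "T1562.008",
    "CreateAccessKey": "T1098.001",
    "RunInstances": "T1496",
    "GetObject": "T1530",
    "PutBucketReplication": "T1537",
}


def tag_techniques(events):
    """Count how often each mapped technique appears in a batch of CloudTrail records."""
    counts = Counter(
        TECHNIQUE_BY_EVENT[event["eventName"]]
        for event in events
        if event.get("eventName") in TECHNIQUE_BY_EVENT
    )
    return dict(counts)
```

Events without a mapping are ignored, so the table can start small and grow as detections are validated against red team exercises.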
About the Author
Thomas Nuth is a seasoned cybersecurity executive with over 15 years of experience driving global go-to-market strategy, brand development, and market adoption for some of the world’s most innovative security companies. With a deep understanding of the evolving threat landscape—from cloud-native risk to AI-powered attacks—Thomas has played a pivotal role in shaping industry narratives and positioning next-gen technologies at the forefront of the cybersecurity conversation. Before joining Tenable, Thomas held positions at Wiz, Qualys, Fortinet, Forescout, and other innovative leaders in cybersecurity.
