LLMs Writing Code? Cool. LLMs Executing It? Dangerous.
Published 06/03/2025
Written by Olivia Rempe, Community Engagement Manager, Cloud Security Alliance.
There’s no denying it—Large Language Models (LLMs) have changed the game for software development.
They can autocomplete boilerplate, refactor legacy functions, and even generate entire microservices with a well-crafted prompt. But as tempting as it is to let that generated code run, here’s a word of caution:
- Letting an LLM write code is powerful.
- Letting it execute code? That’s dangerous.
CSA’s latest white paper, Securing LLM-Backed Systems: Essential Authorization Practices, spells out why combining non-deterministic models with execution rights is a major security risk—and what you should do about it.
The Risks: When Code Becomes a Threat Vector
LLMs don’t “understand” what they’re doing. They generate code probabilistically based on patterns—not verified logic. Combine that with runtime permissions, and you get a volatile mix:
Vulnerable Code Generation
LLMs can hallucinate insecure functions, use deprecated libraries, or output vulnerable patterns such as unsanitized SQL queries and unescaped shell commands (see the example below).
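To make this concrete, here is a minimal, hypothetical Python illustration of the kind of unsanitized query an LLM might emit, next to the parameterized version a reviewer should insist on. The table and values are invented for the example.

```python
import sqlite3

# Toy database purely for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "' OR '1'='1"  # attacker-controlled value

# Vulnerable pattern an LLM might generate: interpolating input into SQL
query = f"SELECT role FROM users WHERE name = '{user_input}'"
print(conn.execute(query).fetchall())   # returns every row -> injection

# Safer equivalent: a parameterized query treats the input as data only
print(conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall())                           # returns [] -> no match
```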
Prompt Injection Attacks
Malicious users can craft prompts that trick the model into generating dangerous or destructive code, and harmful code can also emerge unintentionally from benign prompts.
Excessive Privileges
The runtime environment may have broad access to internal systems, files, or the network, turning one bad line of code into a system-wide breach.
No Natural Boundaries
LLMs don’t distinguish between code and data, and they don’t enforce access controls. Left unchecked, they can pull in sensitive data, mutate state, or execute instructions outside their scope.
What Secure Execution Should Look Like
If you must allow runtime code execution from an LLM (say, for a developer tool or automation pipeline), CSA recommends a strict, multi-layered defense:
1. Sandbox the Execution Environment
- Use containerized or virtualized sandboxes with strict runtime limits
- Block outbound network access by default
- Whitelist libraries and functions that can be imported or called
- Prevent write access unless explicitly required (read-only by default)
- Limit CPU, memory, and execution time (e.g., 30 seconds max); see the sandbox sketch below
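As a rough illustration of these controls, the sketch below runs generated Python inside a Docker container with networking disabled, a read-only filesystem, and CPU, memory, and wall-clock limits. It assumes Docker is available on the host; the image name, the limit values, and the run_generated_code helper are illustrative choices, not requirements from the CSA paper.

```python
import os
import subprocess
import tempfile

def run_generated_code(code: str, timeout_s: int = 30) -> subprocess.CompletedProcess:
    """Run LLM-generated Python inside a locked-down container (illustrative defaults)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "snippet.py")
        with open(path, "w") as f:
            f.write(code)

        cmd = [
            "docker", "run", "--rm",
            "--network", "none",             # block outbound network by default
            "--read-only",                   # no writes to the container filesystem
            "--memory", "256m",              # cap memory
            "--cpus", "1",                   # cap CPU
            "--pids-limit", "64",            # limit process fan-out
            "-v", f"{path}:/snippet.py:ro",  # mount the generated code read-only
            "python:3.12-slim", "python", "/snippet.py",
        ]
        # Host-side wall-clock limit; raises TimeoutExpired if exceeded
        return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)

# Harmless snippets run; anything needing the network or disk writes fails
result = run_generated_code("print(sum(range(10)))")
print(result.stdout)
```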
2. Validate and Review the Code
- Use automated static code analysis and LLM-based reviewers to flag suspicious patterns
- Only allow specific types of code generation (e.g., string manipulation, math functions)
- Monitor for calls to file systems, subprocesses, or system-level operations
- Implement syntax checks, query validation, and function whitelisting; a minimal static check is sketched below
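One lightweight way to start, sketched below, is a shallow pass over the generated code's AST that flags imports outside an allow-list and calls to dangerous built-ins before any heavier analysis or execution. The ALLOWED_IMPORTS and FORBIDDEN_CALLS sets and the flag_suspicious helper are hypothetical; a real pipeline would pair this with a proper static analyzer and human review.

```python
import ast

# Illustrative policy: what the generated code may import or call
ALLOWED_IMPORTS = {"math", "statistics", "json"}
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__", "open"}

def flag_suspicious(code: str) -> list[str]:
    """Return findings for obviously risky constructs in generated code."""
    findings = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            if module.split(".")[0] not in ALLOWED_IMPORTS:
                findings.append(f"disallowed import: {module}")
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                findings.append(f"forbidden call: {node.func.id}()")
    return findings

print(flag_suspicious("import subprocess\nsubprocess.run(['rm', '-rf', '/'])"))
# ['disallowed import: subprocess']
print(flag_suspicious("import math\nprint(math.sqrt(2))"))
# []
```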
3. Insert Human-in-the-Loop Checkpoints
- Require human approval before executing code that:
  - Modifies data
  - Calls external APIs
  - Has destructive potential
- Log all executions, including who approved and what inputs were used (see the approval sketch below)
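The sketch below shows one shape this gate could take: generated code tagged with risky actions is blocked until a named approver signs off, and every attempt is written to an audit log. The action labels, the RISKY_ACTIONS set, and the execute_with_approval helper are assumptions made for the example.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("llm-exec-audit")

# Hypothetical labels an upstream classifier attaches to generated code
RISKY_ACTIONS = {"modifies_data", "calls_external_api", "destructive"}

def execute_with_approval(code: str, actions: set[str], approver: str | None) -> bool:
    """Require explicit human sign-off before risky code runs; log every attempt."""
    needs_approval = bool(actions & RISKY_ACTIONS)
    approved = (not needs_approval) or (approver is not None)

    audit_log.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actions": sorted(actions),
        "approver": approver,
        "executed": approved,
        "code_preview": code[:80],
    }))

    if not approved:
        return False
    # Hand the approved code to the sandboxed runner here
    return True

# A destructive snippet is blocked until someone signs off
execute_with_approval("db.delete_all()", {"modifies_data", "destructive"}, approver=None)
execute_with_approval("db.delete_all()", {"modifies_data", "destructive"}, approver="alice@example.com")
```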
4. Use the Orchestrator, Not the LLM, to Control Execution
- Treat the LLM as an untrusted advisor, not an autonomous actor
- The orchestrator (your trusted runtime) should handle:
  - User authentication
  - API calls
  - Identity and access checks
  - Execution permissions
- Don’t let the LLM pull the trigger: let it suggest, and have a human or secure service confirm (see the orchestrator sketch below)
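Here is a minimal sketch of that division of labor, assuming a simple role-based tool registry (TOOL_PERMISSIONS and handle_llm_suggestion are invented names): the model proposes a tool call, and the trusted orchestrator checks identity and permissions before anything runs.

```python
from dataclasses import dataclass

# Hypothetical registry: the orchestrator, not the LLM, knows who may run what
TOOL_PERMISSIONS = {
    "read_report": {"analyst", "admin"},
    "delete_records": {"admin"},
}

@dataclass
class User:
    name: str
    role: str

def handle_llm_suggestion(user: User, suggested_tool: str, args: dict) -> str:
    """Treat the LLM's output as an untrusted suggestion and gate it here."""
    if suggested_tool not in TOOL_PERMISSIONS:
        return f"rejected: unknown tool '{suggested_tool}'"
    if user.role not in TOOL_PERMISSIONS[suggested_tool]:
        return f"rejected: {user.name} lacks permission for '{suggested_tool}'"
    # Only the trusted runtime performs the action, under the user's identity
    return f"executed '{suggested_tool}' with {args} on behalf of {user.name}"

# The model can suggest anything; the orchestrator enforces access control
print(handle_llm_suggestion(User("bob", "analyst"), "delete_records", {"table": "users"}))
print(handle_llm_suggestion(User("ada", "admin"), "read_report", {"id": 7}))
```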
Build Safely or Not at All
In the rush to harness LLMs for productivity, too many teams skip the guardrails. But when the LLM moves from “suggesting code” to “executing code,” you enter a whole new threat landscape.
Just because the model can do it doesn’t mean it should—at least not without a secure environment, human oversight, and airtight permissions.
So the next time you build a tool that lets the LLM run scripts, ask yourself:
Would I let an intern with zero context and no code review access production?
If not, don’t let your LLM either.
CSA’s white paper breaks this all down—from sandboxing tips to orchestration patterns to execution audit trails.
Download “Securing LLM-Backed Systems: Essential Authorization Practices” today.