Mastering Claude's External Code Execution with Cloud Run
Developers: Master Claude's external code execution with Google Cloud Run. This guide covers setup, security, tool creation, and debugging for agentic workflows. See the full setup guide.


📋 At a Glance
- Difficulty: Advanced
- Time required: 1-2 hours (initial setup), variable for custom tool development
- Prerequisites:
  - Basic understanding of Python and Flask.
  - Familiarity with Google Cloud Platform (GCP) concepts (IAM, Cloud Run, Secret Manager).
  - gcloud CLI installed and authenticated.
  - Anthropic API key with sufficient quota for Claude 3.x models.
  - Basic knowledge of LLM tool use and prompt engineering.
- Works on: Any OS with the gcloud CLI and Python. Cloud components are platform-agnostic.
#How Does Claude's Code Execution Enhance Agentic Workflows?
Claude's external code execution transforms it from a reactive chatbot into a proactive agent by enabling it to perform real-world actions and computations. This capability, often referred to as "tool use," allows Claude to call developer-defined functions, execute code in a secure environment, and integrate with external systems. Instead of merely suggesting code or workflows, Claude can now execute them, debug them based on output, and iteratively refine its approach, significantly boosting its utility in complex development, data analysis, and automation tasks.
The Fireship video highlights this "superpower" by framing it as running code on "Google's world-class infrastructure" (Cloud Run). This isn't Claude directly running code on Cloud Run, but rather a developer deploying a custom code execution service to Cloud Run, which Claude can then invoke as a tool. This distinction is critical: Claude orchestrates the use of external tools; developers build and deploy those tools. This architecture provides security, scalability, and flexibility, allowing Claude to interact with virtually any system accessible via an API.
For example, Claude can be tasked with:
- Data Analysis: Execute Python scripts to process CSV files, run statistical models, or generate plots.
- Web Scraping: Use a tool to fetch data from websites, then process it.
- API Interaction: Call external APIs (e.g., GitHub, Slack, payment gateways) to perform actions like creating pull requests, sending messages, or processing transactions.
- Software Development: Write code, execute tests, identify errors, and even deploy simple applications.
This agentic capability is foundational for building autonomous AI systems that can achieve multi-step goals without constant human intervention.
#How Do I Configure a Secure Cloud Execution Environment for Claude?
Configuring a secure cloud execution environment for Claude involves setting up a Google Cloud Project, deploying a dedicated service (e.g., on Cloud Run) that acts as an intermediary for code execution, and exposing this service to Claude as a tool. This process ensures that Claude's code execution is sandboxed, adheres to strict access controls, and leverages the scalability and security features of Google Cloud. The critical element is building a robust, minimal-privilege execution service on Cloud Run that Claude can invoke via an API.
The following steps detail how to set up a basic Python-based code execution service on Google Cloud Run and expose it securely. This example focuses on Python, but the principles apply to any language or runtime.
Step 1: Initialize Your Google Cloud Project
What: Set up or select a Google Cloud Project and enable necessary APIs.
Why: A dedicated project provides resource isolation and billing control. Enabling APIs ensures the required services (Cloud Run, Secret Manager) are available.
How:
First, ensure you have the gcloud CLI installed and authenticated. If not, follow the official Google Cloud documentation for installation and authentication.
# Language: bash
# What: Authenticate gcloud CLI (if not already done)
# Why: Allows gcloud to interact with your GCP account.
gcloud auth login
# What: Set your active Google Cloud Project
# Why: All subsequent commands will apply to this project. Replace `YOUR_PROJECT_ID` with your actual project ID.
gcloud config set project YOUR_PROJECT_ID
# What: Enable the Cloud Run API and Secret Manager API
# Why: These services are essential for deploying our code execution environment and securely storing credentials.
gcloud services enable run.googleapis.com secretmanager.googleapis.com
Verify: You should see output confirming the APIs are enabled, or stating they are already enabled.
# Language: text
# Expected Output (example for services enable):
# Operation "operations/..." finished successfully.
Step 2: Create a Service Account for Cloud Run
What: Create a dedicated service account for your Cloud Run service and assign it minimal necessary permissions.
Why: Following the principle of least privilege, this service account will only have permissions required to run the service, enhancing security.
How:
# Language: bash
# What: Create a new service account
# Why: This account will be used by the Cloud Run service.
gcloud iam service-accounts create claude-code-executor \
--display-name "Claude Code Executor Service Account"
# What: Grant the service account permissions to log to Cloud Logging
# Why: Allows the Cloud Run service to write logs for debugging and monitoring.
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member "serviceAccount:claude-code-executor@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role "roles/logging.logWriter"
# What: Grant the service account permissions to access Secret Manager secrets (if you plan to use them)
# Why: If your code execution service needs to access API keys or other secrets, it needs this role.
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member "serviceAccount:claude-code-executor@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role "roles/secretmanager.secretAccessor"
Verify: Confirm the service account creation and policy bindings were successful.
# Language: text
# Expected Output (example for service account create):
# Created service account [claude-code-executor].
# Email: [claude-code-executor@YOUR_PROJECT_ID.iam.gserviceaccount.com]
# ...
Step 3: Develop the Code Execution Service (Python Flask)
What: Write a simple Python Flask application that accepts code, executes it in a sandboxed manner, and returns the output.
Why: This application will be deployed to Cloud Run and serve as the actual execution engine for Claude. Sandboxing is critical for security.
How:
Create a directory claude-executor-service and add the following files:
claude-executor-service/main.py
# Language: python
import os
import io
import sys
from flask import Flask, request, jsonify

app = Flask(__name__)


@app.route('/execute', methods=['POST'])
def execute_code():
    """
    Executes Python code received in the request body within a limited environment.
    """
    data = request.get_json(silent=True)
    if not data or 'code' not in data:
        return jsonify({"error": "Missing 'code' in request body."}), 400

    code_to_execute = data['code']

    # Redirect stdout and stderr to capture output.
    # Note: sys.stdout is process-wide, so concurrent requests in one process can mix output.
    old_stdout = sys.stdout
    old_stderr = sys.stderr
    redirected_output = io.StringIO()
    redirected_error = io.StringIO()
    sys.stdout = redirected_output
    sys.stderr = redirected_error

    execution_result = {}
    try:
        # Create a restricted execution environment
        # WARNING: This is a basic sandbox. For production, consider dedicated sandboxing libraries
        # or containerized execution environments (e.g., gVisor, Firecracker).
        # This example primarily limits access to built-in functions and global variables.
        global_vars = {'__builtins__': {
            'print': print, 'len': len, 'str': str, 'int': int, 'float': float,
            'list': list, 'dict': dict, 'tuple': tuple, 'set': set, 'range': range,
            'sum': sum, 'min': min, 'max': max, 'abs': abs, 'round': round,
            'type': type, 'isinstance': isinstance, 'getattr': getattr, 'setattr': setattr,
            'hasattr': hasattr, 'dir': dir, 'enumerate': enumerate, 'zip': zip,
            'map': map, 'filter': filter, 'sorted': sorted, 'reversed': reversed,
            'all': all, 'any': any, 'next': next, 'iter': iter, 'repr': repr,
            'ord': ord, 'chr': chr, 'hex': hex, 'oct': oct, 'bin': bin,
            'pow': pow, 'divmod': divmod, 'complex': complex, 'frozenset': frozenset,
            'memoryview': memoryview, 'bytearray': bytearray, 'bytes': bytes,
            'super': super, 'object': object, 'Exception': Exception,
            'KeyboardInterrupt': KeyboardInterrupt, 'SystemExit': SystemExit,
            'ArithmeticError': ArithmeticError, 'AssertionError': AssertionError,
            'AttributeError': AttributeError, 'EOFError': EOFError,
            'FloatingPointError': FloatingPointError, 'GeneratorExit': GeneratorExit,
            'ImportError': ImportError, 'IndexError': IndexError, 'KeyError': KeyError,
            'MemoryError': MemoryError, 'NameError': NameError, 'NotImplementedError': NotImplementedError,
            'OSError': OSError, 'OverflowError': OverflowError, 'ReferenceError': ReferenceError,
            'RuntimeError': RuntimeError, 'StopIteration': StopIteration, 'SyntaxError': SyntaxError,
            'IndentationError': IndentationError, 'TabError': TabError, 'SystemError': SystemError,
            'TypeError': TypeError, 'UnboundLocalError': UnboundLocalError,
            'UnicodeError': UnicodeError, 'UnicodeDecodeError': UnicodeDecodeError,
            'UnicodeEncodeError': UnicodeEncodeError, 'UnicodeTranslateError': UnicodeTranslateError,
            'ValueError': ValueError, 'ZeroDivisionError': ZeroDivisionError
            # Add other safe built-ins as needed
        }}
        # Use a single namespace: with separate globals and locals dicts, functions
        # defined by the submitted code cannot see each other when called.
        exec(code_to_execute, global_vars)
        execution_result['output'] = redirected_output.getvalue()
        execution_result['error'] = redirected_error.getvalue()
        execution_result['success'] = True
    except Exception as e:
        execution_result['output'] = redirected_output.getvalue()
        execution_result['error'] = redirected_error.getvalue() + f"\nExecution Error: {str(e)}"
        execution_result['success'] = False
    finally:
        # Restore stdout and stderr
        sys.stdout = old_stdout
        sys.stderr = old_stderr

    return jsonify(execution_result), 200


if __name__ == '__main__':
    # The Dockerfile starts the app this way, so debug stays off: Flask's debug mode
    # exposes an interactive debugger and must not be reachable on a public URL.
    app.run(debug=False, host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
claude-executor-service/requirements.txt
# Language: text
flask==3.0.3
claude-executor-service/Dockerfile
# Language: dockerfile
# Use the official slim Python image as a base (the Debian Buster variant is end-of-life)
FROM python:3.11-slim
# Set the working directory in the container
WORKDIR /app
# Copy the requirements file into the container
COPY requirements.txt .
# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the rest of the application code into the container
COPY . .
# Default port for local runs of the Flask app
# Cloud Run automatically sets the PORT environment variable
ENV PORT=8080
# Run the Flask application
CMD ["python", "main.py"]
> ⚠️ Security Warning: The exec() function in Python is inherently dangerous. The basic sandboxing provided in main.py is NOT sufficient for production environments where untrusted code might be executed. For a robust solution, consider:
- Dedicated Sandboxing Libraries: Tools like RestrictedPython or PyPy for more secure execution.
- Container Isolation: Running each execution in a fresh, isolated container (e.g., using Google Cloud Build, GKE, or a custom Firecracker setup).
- Code Linting/Analysis: Pre-screening code for malicious patterns.
- Time/Resource Limits: Preventing long-running or resource-intensive code.
This guide provides a functional example for demonstration; production systems require significantly more robust security measures.
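As one illustration of the "Code Linting/Analysis" item, the sketch below pre-screens submitted code with Python's standard ast module before it ever reaches exec(). The policy (no imports, no underscore-prefixed names or attributes) and the function name screen_code are assumptions made for this example, not part of the original service. A screen like this is a heuristic only: it can be bypassed (for instance through getattr with a string argument) and does not replace real isolation.

```python
import ast


def screen_code(source: str) -> list[str]:
    """Return a list of policy violations found in `source` (empty if none were found).

    Hypothetical policy for this sketch: reject imports and any name or
    attribute that starts with an underscore (e.g. __class__, __subclasses__).
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            problems.append(f"line {node.lineno}: import statements are not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            problems.append(f"line {node.lineno}: access to '{node.attr}' is not allowed")
        elif isinstance(node, ast.Name) and node.id.startswith("_"):
            problems.append(f"line {node.lineno}: name '{node.id}' is not allowed")
    return problems
```

In the /execute handler, a non-empty result could be returned to Claude as the error field with success set to false, so that it can rewrite the code instead of having it executed.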
Step 4: Deploy the Service to Google Cloud Run
What: Build the Docker image for your Flask application and deploy it to Google Cloud Run.
Why: Cloud Run provides a fully managed, scalable, and serverless environment for your service.
How:
Navigate to the claude-executor-service directory in your terminal.
# Language: bash
# What: Build the Docker image
# Why: Creates a container image from your Dockerfile and application code.
gcloud builds submit --tag gcr.io/YOUR_PROJECT_ID/claude-code-executor
# What: Deploy the image to Cloud Run
# Why: Creates a new Cloud Run service instance.
# --platform managed: Specifies a fully managed environment.
# --region: Choose a region close to your users or Anthropic's services.
# --allow-unauthenticated: Allows public access, so the application that relays Claude's tool calls can reach the service without an identity token.
# For production, consider IAP or API keys for authentication.
# --service-account: Assigns the service account created in Step 2.
gcloud run deploy claude-code-executor \
--image gcr.io/YOUR_PROJECT_ID/claude-code-executor \
--platform managed \
--region us-central1 \
--allow-unauthenticated \
--service-account "claude-code-executor@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--memory 512Mi \
--cpu 1 \
--min-instances 0 \
--max-instances 5
Verify: After deployment, you will receive a URL for your Cloud Run service.
# Language: text
# Expected Output:
# Service [claude-code-executor] revision [claude-code-executor-00001-...] has been deployed and is serving 100 percent of traffic.
# Service URL: https://claude-code-executor-...-uc.a.run.app
Test the endpoint manually using curl or Postman:
# Language: bash
# Replace YOUR_CLOUD_RUN_URL with the URL from the previous step
curl -X POST -H "Content-Type: application/json" \
-d '{"code": "print(\"Hello from Claude!\")"}' \
YOUR_CLOUD_RUN_URL/execute
> ✅ What you should see:
# Language: json
{
"error": "",
"output": "Hello from Claude!\n",
"success": true
}
If you get a 403 error, ensure --allow-unauthenticated was used, or check IAM permissions on the Cloud Run service.
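Because the service is deployed with --allow-unauthenticated, anyone who learns the URL can submit code. The deployment comments mention API keys as one option for production; a minimal sketch of such a check follows. The header name X-API-Key and the helper is_authorized are hypothetical names chosen for this example, and the expected key is assumed to come from an environment variable or Secret Manager.

```python
import hmac
from typing import Mapping

API_KEY_HEADER = "X-API-Key"  # hypothetical header name for this sketch


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Return True only if the request carries the expected shared secret."""
    if not expected_key:
        return False  # fail closed when no key has been configured
    supplied = headers.get(API_KEY_HEADER, "")
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

In the Flask handler this could be called as is_authorized(request.headers, expected_key) at the top of execute_code, returning a 401 response when it is False. The application that relays Claude's tool calls would then send the same header with every request.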
Step 5: Define the Tool for Claude
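What: Describe your Cloud Run service's /execute endpoint, first as an OpenAPI specification that documents the HTTP contract, then as the tool definition that Claude actually receives.
Why: Claude relies on the tool definition to understand how to call your tool, including its required parameters.
How:
The OpenAPI document below records the endpoint's request and response shapes. The Anthropic Messages API does not accept an OpenAPI document directly; its tools parameter takes the name, description, and input_schema form shown in the Python example after the spec.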
# Language: json
{
"openapi": "3.1.0",
"info": {
"title": "Code Executor",
"version": "1.0.0",
"description": "A tool to execute Python code in a sandboxed environment."
},
"servers": [
{
"url": "YOUR_CLOUD_RUN_URL"
}
],
"paths": {
"/execute": {
"post": {
"summary": "Execute Python Code",
"operationId": "execute_python_code",
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "The Python code to execute."
}
},
"required": ["code"]
}
}
}
},
"responses": {
"200": {
"description": "Successful execution output",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"output": {
"type": "string",
"description": "Standard output from the executed code."
},
"error": {
"type": "string",
"description": "Error output or exception message if execution failed."
},
"success": {
"type": "boolean",
"description": "True if code executed without uncaught exceptions, False otherwise."
}
}
}
}
}
}
}
}
}
}
}
Replace YOUR_CLOUD_RUN_URL with the actual URL from your Cloud Run service.
When interacting with Claude via its API (e.g., using the anthropic Python SDK), you would pass the equivalent tool definition:
# Language: python
import anthropic
import json
import os

client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))

# Your tool definition as a Python dictionary
code_executor_tool = {
    "name": "execute_python_code",
    "description": "Executes Python code in a sandboxed environment and returns the output.",
    "input_schema": {
        "type": "object",
        "properties": {
            "code": {
                "type": "string",
                "description": "The Python code to execute."
            }
        },
        "required": ["code"]
    }
}

# Example of how Claude might use the tool
message = client.messages.create(
    model="claude-3-opus-20240229",  # Or other Claude 3.x model
    max_tokens=2000,
    tools=[code_executor_tool],  # Pass the tool definition here
    messages=[
        {"role": "user", "content": "Write a Python script to print the first 5 prime numbers and then execute it. Show me the output."}
    ]
)
print(message.content)
# If Claude decides to use the tool, its response will contain tool_use blocks.
# You would then parse these, call your Cloud Run service, and pass the result back to Claude.
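The last two comments describe the step your application has to perform itself. A minimal sketch of it follows, assuming the response content has been converted to plain dicts and that execute is a function you supply which POSTs the code to YOUR_CLOUD_RUN_URL/execute and returns the parsed JSON. Both build_tool_results and execute are hypothetical names for this example.

```python
import json
from typing import Callable


def build_tool_results(content_blocks: list[dict],
                       execute: Callable[[str], dict]) -> list[dict]:
    """Turn each tool_use block in a Claude response into a matching tool_result block."""
    results = []
    for block in content_blocks:
        if block.get("type") != "tool_use":
            continue  # text blocks need no action
        if block["name"] != "execute_python_code":
            outcome = {"success": False, "error": f"Unknown tool: {block['name']}"}
        else:
            # `execute` is assumed to call the Cloud Run /execute endpoint
            outcome = execute(block["input"]["code"])
        results.append({
            "type": "tool_result",
            "tool_use_id": block["id"],  # must echo the id from the tool_use block
            "content": json.dumps(outcome),
            "is_error": not outcome.get("success", False),
        })
    return results
```

The returned list would be sent back as the content of a new user message, after appending Claude's own response as an assistant message, so that Claude can read the output and continue.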
#What Are the Best Practices for Developing and Debugging Claude's Executable Code?
Developing and debugging code executed by Claude requires a systematic approach that emphasizes clear tool definitions, robust error handling in the execution environment, and detailed logging. Since Claude operates without direct visual feedback, clear inputs, predictable outputs, and comprehensive error messages are paramount for Claude to self-correct and for developers to diagnose issues.
1. Clear and Concise Tool Definitions
What: Ensure your OpenAPI schema or tool definitions are precise and unambiguous.
Why: Claude relies entirely on these definitions to understand how to use your tool, what parameters it expects, and what kind of output to anticipate. Ambiguity leads to incorrect tool calls.
How:
- Descriptive description fields: Explain the tool's purpose and each parameter clearly.
- Strict input_schema: Define type, properties, and required fields accurately.
- Documented output: The Messages API tool definition has no output schema field, so Claude interprets results from the tool_result content alone. Describe the returned JSON in the tool's description and make sure it reflects the actual structure returned by your Cloud Run service.
- Example: The execute_python_code tool definition above is a good starting point. Ensure your Cloud Run service's actual response always matches the output, error, and success fields.
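A strict input_schema is most useful when your application also enforces it before forwarding a call. The sketch below checks a tool call's input against the required and type fields of such a schema using only the standard library. The helper name check_tool_input is an assumption for this example, and it covers only flat schemas like the one above, not the full JSON Schema vocabulary.

```python
# Maps JSON Schema type names to the Python types json.loads produces
JSON_TYPES = {"string": str, "integer": int, "number": (int, float),
              "boolean": bool, "object": dict, "array": list}


def check_tool_input(input_schema: dict, payload: dict) -> list[str]:
    """Return a list of problems with `payload`; an empty list means it looks valid."""
    problems = []
    properties = input_schema.get("properties", {})
    for name in input_schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field '{name}'")
    for name, value in payload.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected field '{name}'")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so treat it separately
        bool_mismatch = isinstance(value, bool) and spec.get("type") != "boolean"
        if expected and (bool_mismatch or not isinstance(value, expected)):
            problems.append(f"field '{name}' should be of type {spec['type']}")
    return problems
```

Returning these problems to Claude in a tool_result gives it a concrete message to correct against, rather than a generic 400 from the service.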
2. Robust Error Handling in the Execution Service
What: Implement comprehensive try-except blocks in your Cloud Run service to catch execution errors, API failures, and unexpected conditions.
Why: Unhandled exceptions crash your service or return generic errors, preventing Claude from understanding what went wrong. Detailed error messages enable Claude to debug its own code generation.
How:
- Capture all output: Redirect stdout and stderr to capture both successful prints and error messages.
- Structured error responses: Always return a consistent JSON structure, including an error field with a detailed message when something goes wrong.
- Specific error types: If possible, categorize errors (e.g., SyntaxError, RuntimeError, APIError) for Claude to better interpret.
# Language: python
# From main.py, specifically the try-except block
try:
    # The potentially failing part; a single namespace lets functions
    # defined by the submitted code call each other
    exec(code_to_execute, global_vars)
    execution_result['output'] = redirected_output.getvalue()
    execution_result['error'] = redirected_error.getvalue()
    execution_result['success'] = True
except Exception as e:
    execution_result['output'] = redirected_output.getvalue()
    execution_result['error'] = redirected_error.getvalue() + f"\nExecution Error: {str(e)}"
    execution_result['success'] = False
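The "specific error types" suggestion can be sketched by adding an error_type field next to the existing ones. The function below is a standalone illustration, not the service's actual handler: run_and_report and the error_type field are assumptions for this example, and the default environment exposes only print to keep the sketch short.

```python
import contextlib
import io
import traceback
from typing import Optional


def run_and_report(code: str, env: Optional[dict] = None) -> dict:
    """Execute `code` and return its output plus a categorised error, if any."""
    if env is None:
        env = {"__builtins__": {"print": print}}  # minimal environment for the sketch
    stdout = io.StringIO()
    result = {"success": True, "output": "", "error": "", "error_type": None}
    try:
        with contextlib.redirect_stdout(stdout):
            exec(code, env)
    except Exception as exc:
        result["success"] = False
        result["error_type"] = type(exc).__name__  # e.g. "SyntaxError", "ZeroDivisionError"
        # Keep only the final "Type: message" part, not the service's own stack frames
        result["error"] = "".join(traceback.format_exception_only(type(exc), exc)).strip()
    result["output"] = stdout.getvalue()
    return result
```

With a field like this, a system prompt can tell Claude to treat a SyntaxError as a cue to regenerate the code and a runtime error as a cue to inspect the partial output first.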
3. Comprehensive Logging and Monitoring
What: Utilize Google Cloud Logging for your Cloud Run service and log your Anthropic API requests and responses in your own application.
Why: Logs are your primary debugging tool. Cloud Logging provides a centralized view of your service's behavior, while your application's record of API traffic shows Claude's tool calls and the results you returned.
How:
- Cloud Run Logging: Cloud Run automatically sends stdout and stderr from your service to Cloud Logging. Ensure your Python app prints meaningful messages.
- Anthropic API Logging: When calling the Anthropic API, ensure you log the full tool_use blocks sent by Claude and the tool_result blocks you send back. This allows you to reconstruct the sequence of decisions Claude made.
- Correlation IDs: If possible, pass a unique correlation ID from your user session through to Claude's API calls and into your Cloud Run service logs. This links a specific user request to all associated LLM interactions and code executions.
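The correlation-ID idea can be sketched with one JSON object per log line. Cloud Logging generally treats a single-line JSON object on stdout as a structured entry and recognises fields such as severity and message; the correlation_id field and the helper names below are assumptions for this example.

```python
import json
import sys
import uuid


def new_correlation_id() -> str:
    """Generate an ID to attach to one user request and everything it triggers."""
    return uuid.uuid4().hex


def format_log(severity: str, message: str, correlation_id: str, **fields) -> str:
    """Build a single-line JSON log entry carrying the correlation ID."""
    entry = {"severity": severity, "message": message,
             "correlation_id": correlation_id, **fields}
    return json.dumps(entry, default=str)


def log_event(severity: str, message: str, correlation_id: str, **fields) -> None:
    """Write the entry to stdout, where Cloud Run collects it."""
    print(format_log(severity, message, correlation_id, **fields),
          file=sys.stdout, flush=True)
```

The application would create the ID once per user request, log it alongside each tool_use and tool_result block, and forward it to the Cloud Run service (for example in a request header) so that the service can log the same value.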
4. Iterative Prompt Engineering and Tool Testing
What: Test your tool with Claude using a variety of prompts, starting simple and gradually increasing complexity.
Why: Within a conversation, Claude adapts its tool use to the definitions, examples, and results it sees. Iterative testing helps you refine both your tool's definition and your initial prompts.
How:
- Start with basic calls: Ask Claude to execute a simple print("hello") to confirm basic connectivity.
- Test edge cases: Provide prompts that might lead to errors (e.g., invalid syntax, division by zero) and verify your error handling.
- Observe Claude's reasoning: Pay close attention to Claude's tool_use and tool_result blocks. If Claude misinterprets the tool, adjust the description or input_schema.
- Give explicit instructions: In your initial system prompt to Claude, you might include instructions on how to use the code execution tool, especially for complex scenarios or debugging. For example: "If code execution fails, analyze the error message and suggest a fix."
5. Resource and Time Limits
What: Configure resource limits (CPU, memory) and timeout settings for your Cloud Run service.
Why: Prevents runaway code from consuming excessive resources or causing long delays.
How:
- Cloud Run settings: During deployment, use the --memory, --cpu, and --timeout flags.
- Internal timeouts: Consider adding a timeout mechanism around your Python exec() call if your sandbox environment allows it, to stop execution of infinitely looping code.
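An in-process exec() call has no built-in timeout, so one way to get an internal limit is to swap it for a child Python process that can be killed. The sketch below takes that approach with the standard subprocess module; run_with_timeout is a hypothetical helper, and note that it does not apply the restricted built-ins from main.py, so it addresses runaway loops only, not sandboxing.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_seconds: float = 5.0) -> dict:
    """Run `code` in a separate Python process and stop it if it exceeds the timeout."""
    try:
        completed = subprocess.run(
            # -I runs Python in isolated mode (ignores PYTHON* env vars and user site-packages)
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return {"success": False, "output": "",
                "error": f"Timed out after {timeout_seconds} seconds"}
    return {"success": completed.returncode == 0,
            "output": completed.stdout,
            "error": completed.stderr}
```

The internal limit should sit below the Cloud Run --timeout value, so the service can still return a structured error to Claude before the platform cuts the request off.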
By following these practices, you can build a robust, debuggable, and secure system that effectively leverages Claude's code execution capabilities.
#When Is Claude's Code Execution NOT the Optimal Solution?
While powerful, Claude's external code execution is not a universal solution and presents trade-offs in scenarios requiring extreme security, real-time performance, or tightly coupled application logic. It introduces latency from API calls, adds complexity for setting up and maintaining external execution environments, and inherently carries security risks if sandboxing is not rigorously implemented. Developers must weigh these factors against the benefits of AI-driven automation.
Here are specific scenarios where alternatives might be more suitable:
High-Security, Production-Critical Environments:
- Why not Claude: Executing arbitrary code, even in a sandbox, introduces a non-zero risk of vulnerabilities, data exfiltration, or resource abuse. While sandboxing mitigates this, a perfectly secure sandbox is extremely difficult to achieve. For systems handling sensitive data or critical infrastructure, the risk surface expanded by LLM-driven code execution can be unacceptable.
- Alternative: Pre-approved, vetted, and thoroughly tested functions or microservices. Instead of allowing Claude to write and run code, allow it to call a limited set of pre-defined, secure functions with strictly controlled inputs and outputs.
Real-time or Low-Latency Applications:
- Why not Claude: The round-trip time for Claude to generate a tool call, for your application to send it to the Cloud Run service, for the code to execute, and for the result to be returned (potentially back to Claude, then to the user) can introduce significant latency. This makes it unsuitable for interactive UIs, high-frequency trading, or real-time control systems.
- Alternative: Direct application logic, optimized algorithms, or event-driven architectures where code execution is immediate and locally controlled.
Complex, State-Dependent Workflows:
- Why not Claude: While Claude can manage multi-step processes, maintaining complex state across multiple tool calls can be challenging. Each tool call is stateless from Claude's perspective unless explicitly managed through prompt history or external state management. Debugging state-related issues when Claude is orchestrating can be difficult.
- Alternative: Traditional workflow engines (e.g., Apache Airflow, Temporal), state machines, or human-in-the-loop systems that explicitly manage and persist state between steps.
Cost-Sensitive or High-Volume Simple Tasks:
- Why not Claude: Every interaction with Claude and every invocation of your Cloud Run service incurs costs. For very high volumes of simple, repetitive tasks that don't require LLM-level reasoning, the overhead of using Claude can be economically inefficient.
- Alternative: Dedicated serverless functions (e.g., Cloud Functions, AWS Lambda) or custom scripts triggered by event queues, which are often cheaper and faster for simple, defined operations.
When Code Generation is Sufficient:
- Why not Claude: If the goal is merely to generate code snippets, documentation, or design patterns for human developers to review and execute, then actual execution by Claude might be overkill.
- Alternative: Use Claude for code generation and review, but delegate execution and testing to human developers or CI/CD pipelines. This provides an additional layer of human oversight and quality control.
Unreliable or Extremely Volatile External APIs:
- Why not Claude: If the external APIs your code execution tool interacts with are frequently down, rate-limited, or return inconsistent data, Claude will struggle to reliably complete tasks. Its retry logic might be simplistic, and it may not have the context to understand transient vs. permanent errors.
- Alternative: Implement robust retry mechanisms, circuit breakers, and comprehensive error handling within your custom tool or directly in your application logic, potentially with human intervention for complex API failures.
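The retry mechanism suggested for unreliable APIs can live inside the tool, so that Claude only sees a failure once retries are exhausted. A minimal sketch with exponential backoff follows; TransientError and call_with_retry are hypothetical names, and deciding which failures count as transient (timeouts, HTTP 429 or 503, and so on) is left to the caller.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. timeouts, HTTP 429/503)."""


def call_with_retry(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `fn`, retrying on TransientError with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

Permanent errors (such as a 404 or a validation failure) are deliberately not retried here; they propagate immediately, so the tool can report them to Claude as-is.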
In summary, Claude's code execution is a powerful feature for agentic AI, but it's best suited for tasks that benefit from dynamic problem-solving, require complex logical steps, and where the overhead of external orchestration and potential security implications are acceptable. For highly optimized, secure, or real-time operations, direct application code or specialized services often remain superior.
#How Can I Integrate Claude's Code Execution with External Tools and APIs?
Integrating Claude's code execution with external tools and APIs involves designing your Cloud Run execution service to act as a proxy or wrapper for those external services, exposing a unified interface to Claude. This allows Claude to leverage a broad ecosystem of third-party tools (e.g., GitHub, Slack, TradingView, Notion, Figma) by calling your custom execution service, which then translates Claude's requests into specific API calls. The key is to map Claude's high-level instructions to concrete, secure API interactions.
Here's a breakdown of the process and considerations:
1. Design Your Cloud Run Service as an API Gateway/Wrapper
What: Instead of just executing arbitrary code, your Cloud Run service should contain functions that specifically call external APIs.
Why: This provides a controlled interface. Claude requests an action (e.g., "create a GitHub issue"), your service translates it to the GitHub API call, executes it, and returns the structured result. This is more secure and reliable than asking Claude to write raw API calls directly.
How:
- Modular Functions: Within your main.py (or other files), create distinct Python functions for each external API interaction (e.g., create_github_issue(title, body), send_slack_message(channel, text)).
- Input Validation: Rigorously validate inputs received from Claude before making external API calls.
- Error Handling: Wrap external API calls in try-except blocks to catch network errors, authentication failures, and API-specific error codes. Return meaningful error messages to Claude.
- Authentication: Store API keys for external services securely (e.g., in Google Secret Manager) and retrieve them within your Cloud Run service. Never embed them directly in code or expose them to Claude.
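The "Input Validation" point matters most where a value from Claude ends up inside a URL. The sketch below validates the fields of a hypothetical create-issue request before any API call is made. The regular expression is an approximation of GitHub's owner/repository naming rules chosen for this example, not an official pattern, and validate_issue_input is an assumed helper name.

```python
import re

# Approximate shape of "owner/repo_name"; an assumption for this sketch
REPO_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]*/[A-Za-z0-9._-]+")


def is_valid_repo(repo) -> bool:
    """Accept only 'owner/name' strings that cannot alter the URL path."""
    if not isinstance(repo, str) or not REPO_PATTERN.fullmatch(repo):
        return False
    # "." and ".." are path segments with special meaning, so reject them as names
    return repo.split("/", 1)[1] not in {".", ".."}


def validate_issue_input(data: dict) -> list[str]:
    """Return a list of problems with the request; an empty list means it passed."""
    problems = []
    if not is_valid_repo(data.get("repo")):
        problems.append("repo must look like 'owner/repo_name'")
    title = data.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("title must be a non-empty string")
    if not isinstance(data.get("body"), str):
        problems.append("body must be a string")
    return problems
```

Returning the problem list in the error field gives Claude a specific message to correct against, and keeps malformed repository names out of the request URL.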
2. Update Your Tool Definition for Claude
What: Your tool definition (OpenAPI spec) should reflect these new, specific functions, not just a generic execute_code endpoint.
Why: Claude needs to know what specific actions it can take and what parameters each action requires.
How:
Expand your OpenAPI schema to include multiple paths or operationId entries, each corresponding to a specific external API function.
Example: Integrating with GitHub (simplified)
Let's assume your Cloud Run service now has an endpoint /github/create-issue that takes repo, title, and body.
claude-executor-service/main.py (excerpt)
# Language: python
# ... (imports and Flask app setup) ...
import requests  # For making HTTP requests to the GitHub API (add requests to requirements.txt)

GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN")  # Fetched from Secret Manager or env var


@app.route('/github/create-issue', methods=['POST'])
def create_github_issue():
    data = request.get_json(silent=True)
    required_fields = ['repo', 'title', 'body']
    if not data or not all(field in data for field in required_fields):
        return jsonify({"error": f"Missing one of: {', '.join(required_fields)}"}), 400

    repo = data['repo']
    title = data['title']
    body = data['body']

    headers = {
        "Authorization": f"token {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json"
    }
    github_api_url = f"https://api.github.com/repos/{repo}/issues"

    try:
        # The timeout keeps a stalled GitHub call from holding the request open indefinitely
        response = requests.post(github_api_url, headers=headers,
                                 json={"title": title, "body": body}, timeout=10)
        response.raise_for_status()  # Raise an exception for HTTP errors
        issue_data = response.json()
        return jsonify({
            "success": True,
            "issue_number": issue_data['number'],
            "issue_url": issue_data['html_url']
        }), 200
    except requests.exceptions.RequestException as e:
        return jsonify({"success": False, "error": f"GitHub API error: {str(e)}"}), 500
Tool Definition (OpenAPI spec excerpt)
# Language: json
{
"openapi": "3.1.0",
"info": {
"title": "Agent Tools",
"version": "1.0.0",
"description": "Collection of tools for various tasks."
},
"servers": [
{
"url": "YOUR_CLOUD_RUN_URL"
}
],
"paths": {
"/github/create-issue": {
"post": {
"summary": "Create a GitHub Issue",
"operationId": "create_github_issue",
"requestBody": {
"required": true,
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"repo": {
"type": "string",
"description": "The full repository name (e.g., 'owner/repo_name')."
},
"title": {
"type": "string",
"description": "The title of the new GitHub issue."
},
"body": {
"type": "string",
"description": "The body content of the new GitHub issue."
}
},
"required": ["repo", "title", "body"]
}
}
}
},
"responses": {
"200": {
"description": "Successfully created issue",
"content": {
"application/json": {
"schema": {
"type": "object",
"properties": {
"success": { "type": "boolean" },
"issue_number": { "type": "integer" },
"issue_url": { "type": "string" }
}
}
}
}
}
}
}
},
"/execute": {
# ... (your existing code execution tool definition) ...
}
}
}
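For the Messages API, the same endpoint can be expressed in the compact name, description, and input_schema form, and the application needs some way to map a tool name back to a path on the service. The sketch below shows one possible arrangement; TOOL_ROUTES and endpoint_for are hypothetical names for this example.

```python
# Tool definition in the form passed via the `tools` parameter
create_github_issue_tool = {
    "name": "create_github_issue",
    "description": "Creates an issue in a GitHub repository and returns its number and URL.",
    "input_schema": {
        "type": "object",
        "properties": {
            "repo": {"type": "string",
                     "description": "The full repository name (e.g., 'owner/repo_name')."},
            "title": {"type": "string",
                      "description": "The title of the new GitHub issue."},
            "body": {"type": "string",
                     "description": "The body content of the new GitHub issue."},
        },
        "required": ["repo", "title", "body"],
    },
}

# Hypothetical routing table: tool name -> path on the Cloud Run service
TOOL_ROUTES = {
    "execute_python_code": "/execute",
    "create_github_issue": "/github/create-issue",
}


def endpoint_for(tool_name: str, base_url: str) -> str:
    """Resolve the URL to call for a tool_use block with the given tool name."""
    path = TOOL_ROUTES.get(tool_name)
    if path is None:
        raise KeyError(f"No route registered for tool '{tool_name}'")
    return base_url.rstrip("/") + path
```

Keeping the routing table explicit means Claude can only reach endpoints you have listed, even if it produces an unexpected tool name.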
3. Securely Manage API Keys and Credentials
What: Store all sensitive credentials (GitHub tokens, Slack webhook URLs, etc.) in Google Secret Manager.
Why: Hardcoding secrets is a major security risk. Secret Manager provides a secure, auditable, and versioned way to store and access secrets.
How:
- Create Secrets: Pipe the value in with printf rather than a here-string (<<<), which would append a trailing newline to the stored token.
# Language: bash
printf '%s' "YOUR_GITHUB_PERSONAL_ACCESS_TOKEN" | gcloud secrets create GITHUB_TOKEN --data-file=-
- Grant Access: Ensure your Cloud Run service account (claude-code-executor@YOUR_PROJECT_ID.iam.gserviceaccount.com) has the roles/secretmanager.secretAccessor role (granted in Step 2).
- Access in Code: In main.py, retrieve secrets using the Secret Manager client library or via environment variables if configured in Cloud Run.
# Language: python
# Example of accessing a secret via environment variable in Cloud Run
# In Cloud Run deployment, map secret to env var:
# --set-secrets GITHUB_TOKEN=GITHUB_TOKEN:latest
GITHUB_TOKEN = os.environ.get("GITHUB_TOKEN")
if not GITHUB_TOKEN:
    # Fallback or error handling if not found
    print("WARNING: GITHUB_TOKEN not found in environment variables.")
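Where a missing token should stop the service rather than produce a warning, a fail-fast variant is one option. The sketch below is an assumption about how you might prefer to handle it, not part of the original service; load_required_secret is a hypothetical helper, and it also strips stray whitespace such as a trailing newline stored with the secret.

```python
import os
from typing import Mapping


def load_required_secret(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Read a secret from the environment, failing loudly if it is absent or blank."""
    value = environ.get(name, "").strip()  # strip a trailing newline, if any
    if not value:
        raise RuntimeError(f"Required secret {name} is not set")
    return value
```

Calling load_required_secret("GITHUB_TOKEN") at import time makes a misconfigured revision fail at startup, which Cloud Run reports at deploy time, instead of failing on the first GitHub request.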
4. Iterative Prompting and Tool Chaining
What: Guide Claude through complex tasks requiring multiple tool calls or a combination of code execution and API interactions.
Why: Claude can chain tools together, using the output of one tool as input for another, or using code execution to process data before calling an API.
How:
- Clear Instructions: Provide Claude with a clear overall goal.
- Intermediate Steps: If the task is complex, you might initially guide Claude with prompts that break it down into smaller, manageable steps.
- Feedback Loop: Allow Claude to observe tool outputs and adjust its subsequent actions. For instance, if create_github_issue fails, Claude might use the execute_python_code tool to debug the input or analyze the error message.
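The feedback loop described above can be sketched as a bounded loop that keeps relaying tool results until Claude stops asking for tools. To keep the example self-contained, the model call and the tool runner are injected as functions: call_model is assumed to wrap client.messages.create and return a dict with content and stop_reason, and run_tool is assumed to call the Cloud Run service. Both names, and run_agent_loop itself, are hypothetical.

```python
from typing import Callable


def run_agent_loop(user_prompt: str,
                   call_model: Callable[[list[dict]], dict],
                   run_tool: Callable[[str, dict], str],
                   max_turns: int = 5) -> list[dict]:
    """Relay tool calls and results until the model finishes or max_turns is reached."""
    messages = [{"role": "user", "content": user_prompt}]
    for _ in range(max_turns):
        reply = call_model(messages)  # expected keys: "content", "stop_reason"
        messages.append({"role": "assistant", "content": reply["content"]})
        if reply.get("stop_reason") != "tool_use":
            break  # the model produced its final answer
        results = [
            {"type": "tool_result",
             "tool_use_id": block["id"],
             "content": run_tool(block["name"], block["input"])}
            for block in reply["content"] if block.get("type") == "tool_use"
        ]
        # Tool results go back as a user message so the model can react to them
        messages.append({"role": "user", "content": results})
    return messages
```

The max_turns cap is a design choice for this sketch: it bounds cost and prevents a model that keeps retrying a failing tool from looping indefinitely.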
By structuring your Cloud Run service as a robust API gateway for external tools and providing Claude with precise tool definitions, you can unlock a vast array of agentic capabilities, allowing Claude to automate complex workflows across diverse platforms.
#Frequently Asked Questions
Can Claude execute code directly on my local machine?
No. Claude is a cloud-hosted LLM and never runs code itself; it only emits tool_use requests, which your application fulfils by calling an execution environment such as the Cloud Run service described here. Your application could instead run those requests against a local environment, but executing model-generated code on your own machine is generally not recommended for security reasons.
What are the primary security concerns with LLM-driven code execution?
The main concerns are arbitrary code execution vulnerabilities (e.g., prompt injection leading to malicious code execution), data exfiltration, and resource abuse. Robust sandboxing, strict input validation, principle of least privilege for service accounts, and careful monitoring are crucial to mitigate these risks. Never use exec() without extreme caution and multiple layers of isolation.
Can I use other cloud providers or serverless functions instead of Google Cloud Run?
Yes, absolutely. The principles remain the same: deploy a secure, accessible HTTP endpoint that accepts code for execution (or specific API calls) and returns structured results. AWS Lambda, Azure Functions, or even a self-hosted server with appropriate security measures could serve as the backend for Claude's code execution tool.
#Quick Verification Checklist
- Google Cloud Project initialized and run.googleapis.com, secretmanager.googleapis.com APIs enabled.
- Dedicated claude-code-executor service account created with roles/logging.logWriter and roles/secretmanager.secretAccessor (if using secrets).
- Python Flask service deployed to Cloud Run at a public URL, configured with --allow-unauthenticated and the correct service account.
- Manual curl test to the Cloud Run /execute endpoint returns a successful JSON response with expected output.
- Claude's tool definition (OpenAPI spec or Python SDK definition) accurately reflects the Cloud Run service's endpoint, input schema, and expected output.
- Claude, when prompted, attempts to call the execute_python_code tool with correct parameters (visible in API logs if you're logging tool calls).
Last updated: July 28, 2024

Harit Narke
Senior SDET · Editor-in-Chief
Senior Software Development Engineer in Test with 10+ years in software engineering. Covers AI developer tools, agentic workflows, and emerging technology with engineering-first rigour. Testing claims, not taking them at face value.
