Code Sandbox MCP Server (Ruby)
A secure Docker-based MCP server for executing code in multiple languages, implemented in Ruby. The image is built on Alpine Linux to keep its size and attack surface small.
Features
- 12 Supported Languages: Python, JavaScript, TypeScript, Ruby, Bash, Zsh, Fish, Java, Clojure, Kotlin, Groovy, Scala
- Secure Execution: Runs in Docker with strict resource limits
- Alpine Linux: Optimized 978MB production image with JVM support
- Multi-stage Build: Separate optimized images for production and testing
- Ruby 3.4 + JDK 21: Latest stable Ruby with modern Java runtime
- Output Capture: Complete output capture with MCP protocol compliance
- Full MCP Protocol: Implements Model Context Protocol for tool calling
- Comprehensive Testing: RSpec test suite with 97.6% coverage
- Automatic Session Management: State persists between executions for each language
- Session Reset Tool: Clear language sessions when needed
Installation
Using Pre-built Images (Recommended)
# Pull the latest image from GitHub Container Registry
docker pull ghcr.io/timlikesai/code-sandbox-mcp:latest
# Run directly
docker run --rm --interactive ghcr.io/timlikesai/code-sandbox-mcp:latest
Building from Source
# Build the Docker image
docker compose build
# Run tests (optional)
docker compose run --rm code-sandbox-test bundle exec rspec
Configuration
Add this configuration to:
- Claude Desktop: Settings file
- Claude Code: .mcp.json file in your project root, or user-wide settings
{
"mcpServers": {
"code-sandbox": {
"command": "docker",
"args": [
"run", "--rm", "--interactive",
"--network", "none",
"ghcr.io/timlikesai/code-sandbox-mcp:latest"
]
}
}
}
Note: The --network none flag is a Docker CLI option that disables all network access for security. The container itself supports networking; this flag prevents it.
Claude Code CLI: You can also add the server from the command line:
claude mcp add code-sandbox -- docker run --rm --interactive --network none ghcr.io/timlikesai/code-sandbox-mcp:latest
Security Note: We recommend the --network none flag to disable network access for safety.
Advanced Configuration (Optional)
For additional security hardening:
{
"mcpServers": {
"code-sandbox": {
"command": "docker",
"args": [
"run", "--rm", "--interactive",
"--read-only",
"--tmpfs", "/tmp",
"--tmpfs", "/app/tmp",
"--memory", "512m",
"--cpus", "0.5",
"--network", "none",
"--security-opt", "no-new-privileges",
"--cap-drop", "ALL",
"ghcr.io/timlikesai/code-sandbox-mcp:latest"
]
}
}
}
Enabling Network Access (When Needed)
About Network Access: The container supports networking, but we recommend using Docker's --network none flag to disable it for security.
⚠️ Security Recommendation: Use --network none to prevent code from reaching the internet or your local network, or making external API calls.
To enable network access when needed, remove the --network none flag. The configuration below defines an isolated server and a network-enabled one side by side:
{
"mcpServers": {
"code-sandbox": {
"command": "docker",
"args": [
"run", "--rm", "--interactive",
"--network", "none",
"ghcr.io/timlikesai/code-sandbox-mcp:latest"
]
},
"code-sandbox-network": {
"command": "docker",
"args": [
"run", "--rm", "--interactive",
"ghcr.io/timlikesai/code-sandbox-mcp:latest"
]
}
}
}
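The configurations above differ only in which docker run flags they pass, so they can be generated rather than copied by hand. The following is a minimal sketch of that idea in Ruby. The helper name server_entry and its keyword arguments network: and hardened: are hypothetical, chosen for this example; the flags and image name are copied from the JSON examples in this README.

```ruby
require "json"

IMAGE = "ghcr.io/timlikesai/code-sandbox-mcp:latest"

# Hardening flags copied from the "Advanced Configuration" example.
HARDENING_FLAGS = [
  "--read-only",
  "--tmpfs", "/tmp",
  "--tmpfs", "/app/tmp",
  "--memory", "512m",
  "--cpus", "0.5",
  "--security-opt", "no-new-privileges",
  "--cap-drop", "ALL"
].freeze

# Build one MCP server entry. `network:` and `hardened:` are hypothetical
# keyword names used only in this sketch.
def server_entry(network: false, hardened: false)
  args = ["run", "--rm", "--interactive"]
  args += HARDENING_FLAGS if hardened
  # Network access stays disabled unless explicitly requested.
  args += ["--network", "none"] unless network
  args << IMAGE
  { "command" => "docker", "args" => args }
end

config = {
  "mcpServers" => {
    "code-sandbox" => server_entry(hardened: true),
    "code-sandbox-network" => server_entry(network: true)
  }
}

puts JSON.pretty_generate(config)
```

Defaulting network: to false means an entry is isolated unless network access is requested deliberately, which matches the recommendation above.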
Use cases that require network access:
- API calls (requests.get(), fetch(), curl)
- Package installation (pip install, npm install)
- Data downloading or web scraping
- External service integration
- Installing libraries for data science, web frameworks, etc.
Container-Level Package Security:
- ✅ Ephemeral Installs: Packages install within the container's temporary filesystem
- ✅ No Host Persistence: Nothing survives container restart
- ✅ Session Sharing: Installed packages available across all languages in the same container session
- ✅ Clean Slate: Each new container starts fresh with no previous installations
- ✅ Resource Bounded: All installs subject to container memory/disk limits
Docker Network Security:
- --network none: Complete network isolation (recommended for untrusted code)
- No --network flag: Full network access through Docker's default bridge network (intended for development and experimentation with code you trust)
- The container provides strong isolation boundaries regardless of network settings
Usage
Available Tools
- execute_code - Execute code with automatic session management
- validate_code - Validate syntax without execution
- reset_session - Reset sessions for specific languages or all languages
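This README does not list the argument schemas for these tools. MCP defines a standard tools/list method for discovering them, so a client can ask the server directly. The sketch below builds that request along with a tools/call request in the shape used elsewhere in this README. The helper name jsonrpc_request is hypothetical, and the sketch assumes the server answers tools/list without a prior initialize handshake.

```ruby
require "json"

# Build a single-line JSON-RPC 2.0 request. The MCP stdio transport expects
# one JSON message per line, with no embedded newlines.
def jsonrpc_request(id, method, params = nil)
  message = { "jsonrpc" => "2.0", "id" => id, "method" => method }
  message["params"] = params if params
  JSON.generate(message)
end

# Standard MCP discovery request: the reply describes each tool's input schema.
puts jsonrpc_request(1, "tools/list")

# Tool call using the execute_code arguments shown in this README
# ("language" and "code").
puts jsonrpc_request(2, "tools/call", {
  "name" => "execute_code",
  "arguments" => { "language" => "ruby", "code" => "puts 1 + 1" }
})
```

Piping the script's output into docker run --rm -i --network none ghcr.io/timlikesai/code-sandbox-mcp:latest sends both requests to one container.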
Session Management
Code execution is stateful by default. Each language maintains its own isolated session with:
- Variables and their values
- Function/class definitions
- Imported modules
- Execution history
Sessions expire after 1 hour of inactivity.
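To see session state in practice, two requests for the same language can be sent to one running container, with the second relying on a variable defined by the first. The sketch below only builds the two request lines. The helper name execute_code_request is hypothetical, and the request shape follows the execute_code calls shown elsewhere in this README.

```ruby
require "json"

# Build one execute_code request as a single JSON line.
def execute_code_request(id, language, code)
  JSON.generate({
    "jsonrpc" => "2.0",
    "id" => id,
    "method" => "tools/call",
    "params" => {
      "name" => "execute_code",
      "arguments" => { "language" => language, "code" => code }
    }
  })
end

requests = [
  # First call defines a variable in the Python session.
  execute_code_request(1, "python", "counter = 41"),
  # Second call reads and updates it; this only works if state persisted.
  execute_code_request(2, "python", "counter += 1\nprint(counter)")
]

puts requests
```

Saved as session_demo.rb (a hypothetical file name), it can be run with ruby session_demo.rb | docker run --rm -i --network none ghcr.io/timlikesai/code-sandbox-mcp:latest. Both requests must reach the same container, because each docker run --rm invocation starts from a clean slate.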
Quick Test
# Test execution (with network disabled)
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"execute_code","arguments":{"language":"python","code":"print(\"Hello World!\")"}}}' | docker run --rm -i --network none ghcr.io/timlikesai/code-sandbox-mcp:latest
# Test with network access and package installation
echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"execute_code","arguments":{"language":"python","code":"import subprocess; subprocess.run([\"pip\", \"install\", \"requests\"]); import requests; print(requests.get(\"https://httpbin.org/ip\").json())"}}}' | docker run --rm -i ghcr.io/timlikesai/code-sandbox-mcp:latest
# Debug mode
docker run --rm -it --network none ghcr.io/timlikesai/code-sandbox-mcp:latest bash
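The commands above print raw JSON-RPC responses. A small helper can pull out the text a tool returned. The sketch below assumes the generic MCP result shape for tools/call (a content array of typed items plus an optional isError flag); the exact fields this server emits are documented in examples/README.md and may carry more detail. The helper name tool_output and the sample response are hypothetical.

```ruby
require "json"

# Extract the text portion of a JSON-RPC response line.
# Returns a hash with :ok (no protocol or tool error) and :text.
def tool_output(line)
  response = JSON.parse(line)

  # Protocol-level failure: JSON-RPC "error" object instead of "result".
  if response.key?("error")
    return { ok: false, text: response["error"]["message"].to_s }
  end

  result = response.fetch("result", {})
  text = result.fetch("content", [])
               .select { |item| item["type"] == "text" }
               .map { |item| item["text"] }
               .join("\n")

  # Tool-level failure is signalled by isError inside the result.
  { ok: !result["isError"], text: text }
end

# Hand-written sample for illustration; not captured from the server.
sample = JSON.generate({
  "jsonrpc" => "2.0",
  "id" => 1,
  "result" => {
    "content" => [{ "type" => "text", "text" => "Hello World!" }],
    "isError" => false
  }
})

p tool_output(sample)
```

To apply it to live output, replace the sample with $stdin.each_line { |line| p tool_output(line) } and pipe the docker run output into the script.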
Security
Multiple layers of container security:
- Container Isolation with resource limits (512MB memory, 0.5 CPU when run with the --memory and --cpus flags from Advanced Configuration)
- Ephemeral Filesystem - nothing persists after container stops
- Package Installation Safety - packages install in container temp space only
- Network Isolation (configurable via Docker's --network none flag)
- Read-only Root Filesystem with writable /tmp only
- No Privileges (--security-opt no-new-privileges, --cap-drop ALL)
- Non-root User (executes as sandbox user)
- Auto-cleanup (--rm removes containers after execution)
- Configurable Timeout (default 30s via EXECUTION_TIMEOUT)
Package Installation Security: When network access is enabled, users can install packages (pip install, npm install, etc.) safely within the container. All installations are ephemeral and don't affect the host system or persist between container restarts.
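The configurable timeout listed above can be illustrated with a short Ruby sketch. This is not the server's actual implementation, only one common way to enforce a deadline on a child process. It assumes EXECUTION_TIMEOUT holds a whole number of seconds; the helper names configured_timeout and run_with_timeout are hypothetical.

```ruby
require "open3"
require "rbconfig"

# Assumption: EXECUTION_TIMEOUT is an integer number of seconds (default 30).
def configured_timeout
  Integer(ENV.fetch("EXECUTION_TIMEOUT", "30"))
end

# Run a command, capturing stdout and stderr together, and kill it if it
# runs longer than timeout_s seconds.
def run_with_timeout(argv, timeout_s = configured_timeout)
  Open3.popen2e(*argv) do |stdin, output, wait_thr|
    stdin.close
    # Drain output in a thread so the child never blocks on a full pipe.
    reader = Thread.new { output.read }

    if wait_thr.join(timeout_s)
      { timed_out: false, exit_code: wait_thr.value.exitstatus, output: reader.value }
    else
      begin
        Process.kill("KILL", wait_thr.pid)
      rescue Errno::ESRCH
        # The process exited just as the deadline passed.
      end
      wait_thr.join
      { timed_out: true, exit_code: nil, output: reader.value }
    end
  end
end

p run_with_timeout([RbConfig.ruby, "-e", "puts 6 * 7"])
p run_with_timeout([RbConfig.ruby, "-e", "sleep 5"], 0.5)
```

The first call finishes well inside the deadline; the second is killed after half a second and reported as timed out.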
Examples
See examples/README.md for comprehensive examples including:
- JSON examples for all 12 languages
- MCP protocol demonstrations
- Session management patterns
- Error handling
- Response format documentation
Testing
All tests run in Docker containers for security and consistency.
Quick Start
# Build the Docker images
docker compose build
# Run all tests (unit tests + example integration tests)
./examples/test_all_examples.sh
# With verbose output
VERBOSE=true ./examples/test_all_examples.sh
Docker Commands
# Build test image
docker compose build code-sandbox-test
# Run unit tests
docker compose run --rm code-sandbox-test bundle exec rspec
# Run code quality checks
docker compose run --rm code-sandbox-test bundle exec rubocop
docker compose run --rm code-sandbox-test bundle exec bundler-audit check --update
# Run all checks at once
docker compose run --rm code-sandbox-test bundle exec rake
# Interactive shell in test container
docker compose run --rm code-sandbox-test bash
Testing Examples
# Test all examples (same as CI) - uses production image
docker run --rm -v "$PWD/examples:/app/examples:ro" ghcr.io/timlikesai/code-sandbox-mcp:latest /app/examples/test_examples_in_container.sh
# Test single example
cat examples/correct_tool_call.json | docker run --rm -i ghcr.io/timlikesai/code-sandbox-mcp:latest
# Debug mode
docker run --rm -it ghcr.io/timlikesai/code-sandbox-mcp:latest bash
Architecture
Multi-stage Docker Build:
- Production: 978MB Alpine-based image with all 12 languages
- Test: 1.68GB with development dependencies
- Builder: Intermediate stage for gem compilation
Key Components:
- server.rb - MCP protocol (JSON-RPC)
- streaming_executor.rb - Code execution with output capture
- executor.rb - Core execution engine
- languages.rb - Language configurations
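To make the protocol layer concrete, the sketch below shows how a JSON-RPC handler for tools/call can be structured in general. It is not taken from this project's server.rb. The echo tool, the TOOLS registry, and the helper names handle_line and error_response are hypothetical; the error codes are the standard JSON-RPC 2.0 ones.

```ruby
require "json"

# Hypothetical tool registry: tool name => callable taking the arguments hash.
TOOLS = {
  "echo" => ->(args) { args.fetch("text", "") }
}.freeze

# Build a JSON-RPC 2.0 error response.
def error_response(id, code, message)
  JSON.generate({ "jsonrpc" => "2.0", "id" => id,
                  "error" => { "code" => code, "message" => message } })
end

# Handle one request line and return one response line.
def handle_line(line, tools = TOOLS)
  request = JSON.parse(line)
  return error_response(nil, -32600, "Invalid Request") unless request.is_a?(Hash)

  id = request["id"]
  unless request["method"] == "tools/call"
    return error_response(id, -32601, "Method not found")
  end

  tool = tools[request.dig("params", "name")]
  return error_response(id, -32602, "Unknown tool") unless tool

  text = tool.call(request.dig("params", "arguments") || {})
  JSON.generate({ "jsonrpc" => "2.0", "id" => id,
                  "result" => { "content" => [{ "type" => "text", "text" => text.to_s }] } })
rescue JSON::ParserError
  error_response(nil, -32700, "Parse error")
end

puts handle_line('{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"echo","arguments":{"text":"hi"}}}')
```

A stdio server would wrap this in a loop such as $stdin.each_line { |line| puts handle_line(line) }, with $stdout.sync = true so each response is flushed as soon as it is written.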
Performance:
- Cold start: ~1-2 seconds
- Execution: ~50-200ms per request
- Memory: <100MB typical, 512MB limit
License
MIT