# MCP Server Testing
This document describes how to test the Shannot MCP server for v0.5.0, which uses the PyPy sandbox architecture with session-based approval.
## Overview
Shannot v0.5.0 includes comprehensive MCP testing:
- **Protocol Tests**: JSON-RPC 2.0 message handling (`test/test_mcp_protocol.py`)
- **Server Tests**: Tool registration, validation, resources (`test/test_mcp_server.py`)
- **Integration Tests**: End-to-end workflows (`test/test_mcp_script_execution.py`)
Total Coverage: 60 tests covering protocol, server infrastructure, and execution workflows.
## Prerequisites
### Install Development Dependencies
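A typical uv-based setup installs the package in editable mode together with the test tooling; the `dev` extra name here is an assumption and may differ in `pyproject.toml`:

```bash
# Editable install plus test dependencies ("dev" extra name is an assumption)
uv pip install -e ".[dev]"

# Or install the test tools directly
uv pip install pytest pytest-cov
```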
### Install PyPy Sandbox Runtime
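The runtime is installed with the same `shannot setup` command referenced under Known Limitations below:

```bash
# Download and install the PyPy sandbox runtime used by integration tests
shannot setup
```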
## Running Tests
### Quick Test (All MCP Tests)
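All MCP test files share the `test_mcp` prefix, so a single glob runs the whole suite (matching the per-file commands below):

```bash
# Run protocol, server, and integration tests in one pass
uv run pytest test/test_mcp*.py -v
```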
### Individual Test Files
```bash
# Protocol tests (11 tests)
uv run pytest test/test_mcp_protocol.py -v

# Server tests (28 tests)
uv run pytest test/test_mcp_server.py -v

# Integration tests (21 tests)
uv run pytest test/test_mcp_script_execution.py -v
```
### Test with Coverage
```bash
# Run with coverage report
uv run pytest test/test_mcp*.py --cov=shannot.mcp --cov-report=term

# Generate HTML coverage report
uv run pytest test/test_mcp*.py --cov=shannot.mcp --cov-report=html
# Open htmlcov/index.html
```
## Test Coverage

### Protocol Tests (`test_mcp_protocol.py`)
Tests JSON-RPC 2.0 protocol implementation:
- ✅ Read valid JSON messages
- ✅ Handle EOF gracefully
- ✅ Handle invalid JSON
- ✅ Handle keyboard interrupt
- ✅ Write messages with proper formatting
- ✅ Handle broken pipe errors
- ✅ Handle I/O errors
- ✅ Serve loop processes requests
- ✅ Handle notifications
- ✅ Handle handler exceptions
Coverage: Pure stdlib implementation (json, sys, io)
### Server Tests (`test_mcp_server.py`)
Tests base MCP server infrastructure:
**Base Server:**
- ✅ Server initialization with metadata
- ✅ Tool registration
- ✅ Resource registration
- ✅ Handle initialize request
- ✅ Handle ping request
- ✅ Handle tools/list
- ✅ Handle tools/call
- ✅ Handle unknown methods
**Shannot Server:**
- ✅ Default profile loading
- ✅ Profile structure validation
- ✅ Tool registration (sandbox_run, session_result)
- ✅ Resource registration (profiles, status)
- ✅ AST-based script analysis
- ✅ Command extraction from AST
- ✅ Invalid script handling
- ✅ Invalid profile handling
- ✅ Denied operation handling
- ✅ Session result polling
Coverage: Server infrastructure, validation, AST analysis
### Integration Tests (`test_mcp_script_execution.py`)
Tests complete execution workflows:
**Execution Paths:**
- ✅ Fast path with allowed operations
- ✅ Review path with unapproved operations
- ✅ Blocked path with denied operations

**Session Management:**
- ✅ Session creation from review path
- ✅ Session result polling (pending)
- ✅ Session expiry handling
- ✅ Session cleanup

**AST Analysis:**
- ✅ Detect multiple subprocess calls
- ✅ Handle dynamic commands (limitation)
- ✅ Syntax error handling

**Profile Validation:**
- ✅ Different profiles have different allowlists
- ✅ Custom profile loading
- ✅ Session naming

**Resource Endpoints:**
- ✅ List profiles
- ✅ Get profile configuration
- ✅ Get runtime status

**Tool Schemas:**
- ✅ Python 3.6 syntax warnings
- ✅ Dynamic profile validation
- ✅ Session result schema
Coverage: End-to-end workflows, session lifecycle, profile management
## Test Organization

```
test/
├── test_mcp_protocol.py          # 11 tests - JSON-RPC protocol
├── test_mcp_server.py            # 28 tests - Server infrastructure
└── test_mcp_script_execution.py  # 21 tests - Integration workflows
```
## Manual Testing

### Interactive Server Testing
Start the MCP server manually to test JSON-RPC messages:
```bash
# Start server with verbose logging
shannot-mcp --verbose

# Or pipe individual JSON-RPC messages into a fresh server instance:
echo '{"jsonrpc": "2.0", "method": "initialize", "params": {}, "id": 1}' | shannot-mcp

# List tools
echo '{"jsonrpc": "2.0", "method": "tools/list", "id": 2}' | shannot-mcp

# Call sandbox_run
echo '{"jsonrpc": "2.0", "method": "tools/call", "params": {"name": "sandbox_run", "arguments": {"script": "print(\"hello\")", "profile": "minimal"}}, "id": 3}' | shannot-mcp
```
### Test with Claude Code
Install in Claude Code and test interactively:
```bash
# Install for Claude Code
shannot mcp install --client claude-code

# Restart Claude Code, then:
# > /mcp
# Should show shannot with 2 tools, 3 resources
```
Ask Claude:
- "Use the sandbox_run tool to check disk space with df -h"
- "List the available approval profiles"
- "What's the status of the PyPy sandbox runtime?"
## CI/CD Integration

### GitHub Actions
Tests run automatically on:
- Pushes to `main`
- Pull requests
- Manual workflow dispatch

See `.github/workflows/test.yml` for configuration.
### Pre-commit Hooks
Install pre-commit hooks for local testing:
```bash
# Install pre-commit
uv pip install pre-commit

# Install hooks
pre-commit install

# Run manually
pre-commit run --all-files
```
## Debugging Test Failures

### Enable Verbose Output
```bash
# Verbose pytest output
uv run pytest test/test_mcp*.py -vv

# Show print statements
uv run pytest test/test_mcp*.py -s

# Stop on first failure
uv run pytest test/test_mcp*.py -x
```
### Debug Specific Test

```bash
# Run specific test
uv run pytest test/test_mcp_server.py::TestShannotMCPServer::test_sandbox_run_tool_registered -vv

# Run with pdb debugger
uv run pytest test/test_mcp_server.py::TestShannotMCPServer::test_sandbox_run_tool_registered --pdb
```
### Check Logs
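The stdio transport reserves stdout for JSON-RPC messages, so diagnostics normally go to stderr; assuming `--verbose` follows that convention here, redirect stderr to a file to inspect it:

```bash
# Capture server diagnostics (assumes verbose logging is written to stderr)
shannot-mcp --verbose 2> /tmp/shannot-mcp.log
tail -f /tmp/shannot-mcp.log
```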
## Testing with Custom Profiles
Create a custom profile for testing:
```bash
# Create test profile
mkdir -p ~/.config/shannot
cat > ~/.config/shannot/test.json <<'EOF'
{
  "auto_approve": ["echo", "printf"],
  "always_deny": ["eval"]
}
EOF

# Test with custom profile
uv run pytest test/test_mcp_script_execution.py::TestScriptExecutionWorkflow::test_custom_profile -v
```
## Performance Testing

### Measure Test Duration

```bash
# Show slowest tests
uv run pytest test/test_mcp*.py --durations=10

# Run with timing
uv run pytest test/test_mcp*.py -v --tb=short --durations=0
```
### Parallel Testing

```bash
# Install pytest-xdist
uv pip install pytest-xdist

# Run tests in parallel
uv run pytest test/test_mcp*.py -n auto
```
## Known Limitations

### PyPy Sandbox Runtime
Tests that require actual script execution need the PyPy sandbox runtime:

```bash
# If the runtime is not available, tests will gracefully handle errors
# Install the runtime for full integration testing
shannot setup
```
### Session Cleanup
Integration tests create sessions. Cleanup is automatic, but you can manually check:
```bash
# List pending sessions
shannot approve list

# Clean up test sessions
shannot approve list | grep "test-session" | cut -d' ' -f1 | xargs -I{} shannot approve cancel {}
```
## Best Practices
- **Run tests before committing** (see the sketch after this list).
- **Check coverage** (see the sketch after this list).
- **Update tests when adding features**: new MCP tools/resources should have corresponding tests.
- **Use meaningful test names**: follow the pattern `test_<feature>_<scenario>`.
- **Mock external dependencies**: tests should not require network access or SSH.
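A minimal pre-commit check, mirroring the commands from the Running Tests and Test with Coverage sections above:

```bash
# Run the full MCP suite, then confirm coverage stays above the 80% target
uv run pytest test/test_mcp*.py -v
uv run pytest test/test_mcp*.py --cov=shannot.mcp --cov-report=term
```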
## Troubleshooting

### "PyPy sandbox runtime not found"
Tests will show warnings but continue. For full integration testing:
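```bash
# Install the PyPy sandbox runtime (same command as in Prerequisites)
shannot setup
```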
"Session directory already exists"¶
Clean up stale sessions:
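```bash
# Review pending sessions, then cancel any stale ones by ID
shannot approve list
shannot approve cancel <session-id>
```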
"Import errors"¶
Reinstall in development mode:
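```bash
# Reinstall Shannot in editable (development) mode from the repository root
uv pip install -e .
```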
"Tests hang"¶
Check for deadlocks in subprocess execution. A per-test timeout makes hangs fail fast; one option is the pytest-timeout plugin (an extra dependency, not part of the project):
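```bash
# pytest-timeout is not a project dependency; install it first
uv pip install pytest-timeout

# Fail any test that runs longer than 60 seconds
uv run pytest test/test_mcp*.py --timeout=60
```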
## Test Checklist
When adding new MCP features, ensure:
- [ ] Protocol tests for new JSON-RPC methods
- [ ] Server tests for tool/resource registration
- [ ] Integration tests for end-to-end workflows
- [ ] Tests pass locally
- [ ] Coverage remains >80%
- [ ] No new warnings or errors
- [ ] Documentation updated