# SecureLiteLLM
Access 100+ LLM providers through LiteLLM's unified interface. A true drop-in replacement: just change your import. Supports Groq, Together AI, Mistral, local vLLM/Ollama, and more.
## Quick Start
```python
# Standard LiteLLM
import litellm

response = litellm.completion(
    model="groq/llama3-70b-8192",
    messages=[{"role": "user", "content": "Hello"}]
)
```

```python
# With MACAW: only the import changes
from macaw_adapters import litellm

response = litellm.completion(
    model="groq/llama3-70b-8192",
    messages=[{"role": "user", "content": "Hello"}]
)
```

### True Drop-in Replacement
The litellm.completion() call is identical. All 100+ providers work automatically with MACAW security.
## Supported Providers
See LiteLLM Providers for the full list of 100+ supported providers.
## When to Use Each Path
| Scenario | Path | Example |
|---|---|---|
| Simple app, no user distinction | Module-level or class | litellm.completion(...) |
| Multi-user, per-user policies | bind_to_user | service.bind_to_user(user) |
| A2A communication, explicit control | invoke_tool | user.invoke_tool("tool:xxx/generate") |
## Usage Patterns
```python
# Pattern 1: Module-level (drop-in replacement)
from macaw_adapters import litellm

response = litellm.completion(model="groq/llama3-70b", messages=[...])

# Pattern 2: Class-based (more control)
from macaw_adapters.litellm import SecureLiteLLM

client = SecureLiteLLM(app_name="my-app")
response = client.completion(model="groq/llama3-70b", messages=[...])

# Pattern 3: OpenAI-style API
response = client.chat.completions.create(model="groq/llama3-70b", messages=[...])
```

## Constructor
```python
client = SecureLiteLLM(
    app_name="my-app",    # Default: macaw-litellm
    api_base=None,        # Custom endpoint (vLLM, Ollama)
    api_key=None,         # Or use provider env vars
    intent_policy={...},  # Optional: MAPL format
    jwt_token=None,       # User mode: JWT identity token
    user_name=None        # User mode: username for registration
)
```

| Parameter | Type | Description |
|---|---|---|
| app_name | str | Application name for MACAW (default: macaw-litellm) |
| api_base | str | Custom API endpoint for local models |
| api_key | str | Provider API key (or use env vars) |
| intent_policy | dict | Optional security policy (MAPL format) |
| jwt_token | str | If provided, enables user mode with this identity |
| user_name | str | Optional username for user mode registration |
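The exact MAPL schema is not shown on this page, so the following is only a hypothetical sketch of what an `intent_policy` dict might look like; the field names (`rules`, `resource`, `effect`, `conditions`) are illustrative assumptions, not the documented MAPL format. Consult the MACAW/MAPL reference for the real schema.

```python
# Hypothetical MAPL-style policy; field names are illustrative only.
intent_policy = {
    "rules": [
        {
            "resource": "tool:my-app/generate",  # would match completion() calls
            "effect": "allow",
            "conditions": {"max_tokens": 2000},
        },
        {
            "resource": "tool:my-app/embed",     # would match embedding() calls
            "effect": "deny",
        },
    ]
}

# Sketch of passing it at construction time (requires macaw_adapters):
# client = SecureLiteLLM(app_name="my-app", intent_policy=intent_policy)
```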
## MACAW-Protected Methods
These methods are routed through the MACAW PEP (Policy Enforcement Point) for policy enforcement, audit logging, and authenticated prompts:
```python
# Module-level functions
litellm.completion(model, messages, ...)
litellm.acompletion(model, messages, ...)  # async
litellm.embedding(model, input, ...)
litellm.text_completion(model, prompt, ...)

# Class methods
client.completion(model, messages, ...)
client.chat.completions.create(model, messages, ...)  # OpenAI-style
client.embeddings.create(model, input, ...)
```

## Local Model Support
Use api_base to connect to local models running on vLLM or Ollama:
```python
# vLLM (OpenAI-compatible endpoint)
response = litellm.completion(
    model="openai/llama3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://localhost:8000/v1"
)

# Ollama
response = litellm.completion(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://localhost:11434"
)
```

Local models get the same MACAW security: cryptographic signing, policy enforcement, and audit logging.
## Streaming Responses
Set stream=True for real-time token streaming. Policy enforcement happens before the first chunk is returned.
```python
response = litellm.completion(
    model="groq/llama3-70b",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

### Streaming Security
Policy checks (model, max_tokens, etc.) are validated before streaming begins. Each chunk is logged for audit compliance.
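Since each chunk follows the OpenAI-style `chunk.choices[0].delta.content` shape shown above, a caller can accumulate the streamed text as below. This is a minimal sketch: the mock chunks stand in for a live response, and `collect_stream`/`make_chunk` are illustrative helpers, not part of the adapter API.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Accumulate delta content from OpenAI-style streaming chunks."""
    parts = []
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content:  # the final chunk typically carries delta.content=None
            parts.append(content)
    return "".join(parts)

def make_chunk(text):
    """Build a mock chunk mimicking chunk.choices[0].delta.content."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

stream = [make_chunk("Roses "), make_chunk("are red"), make_chunk(None)]
print(collect_stream(stream))  # Roses are red
```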
## Multi-User: bind_to_user()
For multi-user SaaS apps where different users need different permissions:
```python
from macaw_adapters.litellm import SecureLiteLLM
from macaw_client import MACAWClient

# Service (shared)
service = SecureLiteLLM(app_name="my-service")

# User agent with JWT identity
user = MACAWClient(user_name="alice", iam_token=jwt_token, agent_type="user")
user.register()

# Bind user to service
user_client = service.bind_to_user(user)

# Calls now use alice's identity for policy evaluation
response = user_client.completion(model="groq/llama3-70b", messages=[...])

# Cleanup
user_client.unbind()
```

### Why bind_to_user?
Without bind_to_user(), all users share the service's identity. With it, each user's JWT flows through, enabling policies like: Alice = Groq only; Bob = any provider, max 2000 tokens.
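The Alice/Bob example above can be sketched as a toy policy evaluator. This is not MACAW's actual evaluation logic; the dict fields and `is_allowed` helper are hypothetical, shown only to make the per-user semantics concrete.

```python
# Illustrative per-user rules (field names are hypothetical, not real MAPL):
user_policies = {
    "alice": {"allowed_providers": ["groq"], "max_tokens": None},  # Groq only
    "bob":   {"allowed_providers": None, "max_tokens": 2000},      # None = any provider
}

def is_allowed(user, provider, max_tokens):
    """Toy check mirroring how per-user policy evaluation might behave."""
    policy = user_policies[user]
    if policy["allowed_providers"] is not None and provider not in policy["allowed_providers"]:
        return False
    if policy["max_tokens"] is not None and max_tokens > policy["max_tokens"]:
        return False
    return True

print(is_allowed("alice", "groq", 4000))    # True
print(is_allowed("alice", "openai", 100))   # False (Alice = Groq only)
print(is_allowed("bob", "openai", 2500))    # False (Bob capped at 2000 tokens)
```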
## MAPL Tool Names
SecureLiteLLM registers tools using MAPL-compliant resource names for policy matching:
| Tool Name | Method |
|---|---|
| tool:<app>/generate | completion(), chat.completions.create() |
| tool:<app>/complete | text_completion() |
| tool:<app>/embed | embedding(), embeddings.create() |
| tool:<app>/<name> | User-registered tools |
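The table above boils down to a simple naming convention. The helper below sketches it; `mapl_tool_name` and the method-to-action mapping are illustrative, not part of the SecureLiteLLM API (the adapter builds these names internally).

```python
# Hypothetical helper illustrating the tool-naming scheme from the table.
METHOD_TO_ACTION = {
    "completion": "generate",
    "chat.completions.create": "generate",
    "text_completion": "complete",
    "embedding": "embed",
    "embeddings.create": "embed",
}

def mapl_tool_name(app_name, method):
    # User-registered tools keep their own name (the tool:<app>/<name> row)
    action = METHOD_TO_ACTION.get(method, method)
    return f"tool:{app_name}/{action}"

print(mapl_tool_name("my-app", "completion"))  # tool:my-app/generate
print(mapl_tool_name("my-app", "embedding"))   # tool:my-app/embed
print(mapl_tool_name("my-app", "summarize"))   # tool:my-app/summarize
```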
## Environment Variables
LiteLLM uses provider-specific environment variables for API keys:
```bash
export GROQ_API_KEY="your-key"
export TOGETHER_API_KEY="your-key"
export MISTRAL_API_KEY="your-key"
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"
```

## Official API Reference
SDK Compatibility: litellm ≥1.0.0 • 100+ providers supported