
SecureLiteLLM

Access 100+ LLM providers through LiteLLM's unified interface. A true drop-in replacement: just change your import. Supports Groq, Together AI, Mistral, local vLLM/Ollama, and more.

LiteLLM v1.x · 100+ providers · LiteLLM Docs →

Quick Start

Before
import litellm

response = litellm.completion(
    model="groq/llama3-70b-8192",
    messages=[{"role": "user", "content": "Hello"}]
)

After
from macaw_adapters import litellm

response = litellm.completion(
    model="groq/llama3-70b-8192",
    messages=[{"role": "user", "content": "Hello"}]
)

True Drop-in Replacement

The litellm.completion() call is identical. All 100+ providers work automatically with MACAW security.


Supported Providers

Groq: groq/llama3-70b
Together AI: together_ai/...
Mistral: mistral/...
OpenAI: gpt-4, gpt-4o
Anthropic: claude-3-*
AWS Bedrock: bedrock/...
Azure OpenAI: azure/...
vLLM (local): openai/llama3
Ollama (local): ollama/llama3

See LiteLLM Providers for the full list of 100+ supported providers.


When to Use Each Path

Scenario                              Path                    Example
Simple app, no user distinction       Module-level or class   litellm.completion(...)
Multi-user, per-user policies         bind_to_user            service.bind_to_user(user)
A2A communication, explicit control   invoke_tool             user.invoke_tool("tool:xxx/generate")
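The decision above can be sketched as a small helper. This is illustrative only; the function and its flags are not part of the adapter's API:

```python
def choose_path(multi_user: bool = False, explicit_tool_control: bool = False) -> str:
    """Map the scenarios above to an integration path (illustrative only)."""
    if explicit_tool_control:
        # A2A communication: call tools directly, e.g. user.invoke_tool("tool:xxx/generate")
        return "invoke_tool"
    if multi_user:
        # Per-user policies: service.bind_to_user(user)
        return "bind_to_user"
    # Simple app: module-level litellm.completion(...) or the SecureLiteLLM class
    return "module-level"
```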

Usage Patterns

# Pattern 1: Module-level (drop-in replacement)
from macaw_adapters import litellm
response = litellm.completion(model="groq/llama3-70b", messages=[...])

# Pattern 2: Class-based (more control)
from macaw_adapters.litellm import SecureLiteLLM
client = SecureLiteLLM(app_name="my-app")
response = client.completion(model="groq/llama3-70b", messages=[...])

# Pattern 3: OpenAI-style API
response = client.chat.completions.create(model="groq/llama3-70b", messages=[...])

Constructor

client = SecureLiteLLM(
    app_name="my-app",         # Default: macaw-litellm
    api_base=None,             # Custom endpoint (vLLM, Ollama)
    api_key=None,              # Or use provider env vars
    intent_policy={...},       # Optional: MAPL format
    jwt_token=None,            # User mode: JWT identity token
    user_name=None             # User mode: username for registration
)
Parameter       Type    Description
app_name        str     Application name for MACAW (default: macaw-litellm)
api_base        str     Custom API endpoint for local models
api_key         str     Provider API key (or use env vars)
intent_policy   dict    Optional security policy (MAPL format)
jwt_token       str     If provided, enables user mode with this identity
user_name       str     Optional username for user mode registration

MACAW-Protected Methods

These methods are routed through MACAW PEP for policy enforcement, audit logging, and authenticated prompts:

# Module-level functions
litellm.completion(model, messages, ...)
litellm.acompletion(model, messages, ...)  # async
litellm.embedding(model, input, ...)
litellm.text_completion(model, prompt, ...)

# Class methods
client.completion(model, messages, ...)
client.chat.completions.create(model, messages, ...)  # OpenAI-style
client.embeddings.create(model, input, ...)

Local Model Support

Use api_base to connect to local models running on vLLM or Ollama:

# vLLM (OpenAI-compatible endpoint)
response = litellm.completion(
    model="openai/llama3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://localhost:8000/v1"
)

# Ollama
response = litellm.completion(
    model="ollama/llama3",
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://localhost:11434"
)

Local models get the same MACAW security: cryptographic signing, policy enforcement, and audit logging.


Streaming Responses

Set stream=True for real-time token streaming. Policy enforcement happens before the first chunk is returned.

response = litellm.completion(
    model="groq/llama3-70b",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Streaming Security

Policy checks (model, max_tokens, etc.) are validated before streaming begins. Each chunk is logged for audit compliance.


Multi-User: bind_to_user()

For multi-user SaaS apps where different users need different permissions:

from macaw_adapters.litellm import SecureLiteLLM
from macaw_client import MACAWClient

# Service (shared)
service = SecureLiteLLM(app_name="my-service")

# User agent with JWT identity
user = MACAWClient(user_name="alice", iam_token=jwt_token, agent_type="user")
user.register()

# Bind user to service
user_client = service.bind_to_user(user)

# Calls now use alice's identity for policy evaluation
response = user_client.completion(model="groq/llama3-70b", messages=[...])

# Cleanup
user_client.unbind()
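In a request handler it is worth guaranteeing the unbind even when the call fails. A minimal sketch using only the calls shown above; the handler name and arguments are illustrative:

```python
def handle_user_request(service, user_agent, messages):
    """Bind the caller's identity, run the completion, always unbind."""
    user_client = service.bind_to_user(user_agent)
    try:
        return user_client.completion(
            model="groq/llama3-70b",
            messages=messages,
        )
    finally:
        # Release the binding even if the completion raises
        user_client.unbind()
```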

Why bind_to_user?

Without bind_to_user(), all users share the service's identity. With it, each user's JWT flows through, enabling policies like: Alice = Groq only; Bob = any provider, max 2000 tokens.


MAPL Tool Names

SecureLiteLLM registers tools using MAPL-compliant resource names for policy matching:

Tool Name             Method
tool:<app>/generate   completion(), chat.completions.create()
tool:<app>/complete   text_completion()
tool:<app>/embed      embedding(), embeddings.create()
tool:<app>/<name>     User-registered tools
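Given this naming scheme, the tool name a policy must match can be derived from the app name and the method being called. A sketch that mirrors the mapping above; it is not an adapter API:

```python
# Illustrative mapping from adapter methods to MAPL tool suffixes
_METHOD_TO_TOOL = {
    "completion": "generate",
    "chat.completions.create": "generate",
    "text_completion": "complete",
    "embedding": "embed",
    "embeddings.create": "embed",
}

def mapl_tool_name(app_name, method):
    """Build the MAPL resource name a policy would match for a method call."""
    # User-registered tools keep their own name as the suffix
    suffix = _METHOD_TO_TOOL.get(method, method)
    return f"tool:{app_name}/{suffix}"

print(mapl_tool_name("my-app", "completion"))
# → tool:my-app/generate
```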

Environment Variables

LiteLLM uses provider-specific environment variables for API keys:

export GROQ_API_KEY="your-key"
export TOGETHER_API_KEY="your-key"
export MISTRAL_API_KEY="your-key"
export OPENAI_API_KEY="your-key"
export ANTHROPIC_API_KEY="your-key"

Official API Reference

SDK Compatibility: litellm ≥1.0.0 • 100+ providers supported

