OmniLLM - Unified Go SDK for Large Language Models


OmniLLM is a unified Go SDK that provides a consistent interface for interacting with multiple Large Language Model (LLM) providers including OpenAI, Anthropic (Claude), Google Gemini, X.AI (Grok), and Ollama. It implements the Chat Completions API pattern and offers both synchronous and streaming capabilities. Additional providers like AWS Bedrock are available as external modules.

✨ Features

  • 🔌 Multi-Provider Support: OpenAI, Anthropic (Claude), Google Gemini, X.AI (Grok), Ollama, plus external providers (AWS Bedrock, etc.)
  • 🎯 Unified API: Same interface across all providers
  • 📡 Streaming Support: Real-time response streaming for all providers
  • 🧠 Conversation Memory: Persistent conversation history using Key-Value Stores
  • 🔀 Fallback Providers: Automatic failover to backup providers when primary fails
  • ⚡ Circuit Breaker: Prevent cascading failures by temporarily skipping unhealthy providers
  • 🔢 Token Estimation: Pre-flight token counting to validate requests before sending
  • 💾 Response Caching: Cache identical requests with configurable TTL to reduce costs
  • 📊 Observability Hooks: Extensible hooks for tracing, logging, and metrics without modifying core library
  • 🔄 Retry with Backoff: Automatic retries for transient failures (rate limits, 5xx errors)
  • 🧪 Comprehensive Testing: Unit tests, integration tests, and mock implementations included
  • 🔧 Extensible: Easy to add new LLM providers
  • 📦 Modular: Provider-specific implementations in separate packages
  • 🏗️ Reference Architecture: Internal providers serve as reference implementations for external providers
  • 🔌 3rd Party Friendly: External providers can be injected without modifying core library
  • ⚡ Type Safe: Full Go type safety with comprehensive error handling

🏗️ Architecture

OmniLLM uses a clean, modular architecture that separates concerns and enables easy extensibility:

omnillm/
├── client.go            # Main ChatClient wrapper
├── providers.go         # Factory functions for built-in providers
├── types.go             # Type aliases for backward compatibility
├── memory.go            # Conversation memory management
├── observability.go     # ObservabilityHook interface for tracing/logging/metrics
├── errors.go            # Unified error handling
├── *_test.go            # Comprehensive unit tests
├── provider/            # 🎯 Public interface package for external providers
│   ├── interface.go     # Provider interface that all providers must implement
│   └── types.go         # Unified request/response types
├── providers/           # 📦 Individual provider packages (reference implementations)
│   ├── openai/          # OpenAI implementation
│   │   ├── openai.go    # HTTP client
│   │   ├── types.go     # OpenAI-specific types
│   │   ├── adapter.go   # provider.Provider implementation
│   │   └── *_test.go    # Provider tests
│   ├── anthropic/       # Anthropic implementation
│   │   ├── anthropic.go # HTTP client (SSE streaming)
│   │   ├── types.go     # Anthropic-specific types
│   │   ├── adapter.go   # provider.Provider implementation
│   │   └── *_test.go    # Provider and integration tests
│   ├── gemini/          # Google Gemini implementation
│   ├── xai/             # X.AI Grok implementation
│   └── ollama/          # Ollama implementation
└── testing/             # 🧪 Test utilities
    └── mock_kvs.go      # Mock KVS for memory testing
Key Architecture Benefits
  • 🎯 Public Interface: The provider package exports the Provider interface that external packages can implement
  • 🏗️ Reference Implementation: Internal providers follow the exact same structure that external providers should use
  • 🔌 Direct Injection: External providers are injected via ClientConfig.CustomProvider without modifying core code
  • 📦 Modular Design: Each provider is self-contained with its own HTTP client, types, and adapter
  • 🧪 Testable: Clean interfaces that can be easily mocked and tested
  • 🔧 Extensible: New providers can be added without touching existing code
  • ⚡ Native Implementation: Uses standard net/http for direct API communication (no official SDK dependencies)

🚀 Quick Start

Installation
go get github.com/agentplexus/omnillm
Basic Usage
package main

import (
    "context"
    "fmt"
    "log"
    
    "github.com/agentplexus/omnillm"
)

func main() {
    // Create a client for OpenAI
    client, err := omnillm.NewClient(omnillm.ClientConfig{
        Providers: []omnillm.ProviderConfig{
            {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-openai-api-key"},
        },
    })
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Create a chat completion request
    response, err := client.CreateChatCompletion(context.Background(), &omnillm.ChatCompletionRequest{
        Model: omnillm.ModelGPT4o,
        Messages: []omnillm.Message{
            {
                Role:    omnillm.RoleUser,
                Content: "Hello! How can you help me today?",
            },
        },
        MaxTokens:   &[]int{150}[0],
        Temperature: &[]float64{0.7}[0],
    })
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Response: %s\n", response.Choices[0].Message.Content)
    fmt.Printf("Tokens used: %d\n", response.Usage.TotalTokens)
}

🔧 Supported Providers

OpenAI
  • Models: GPT-5, GPT-4.1, GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo
  • Features: Chat completions, streaming, function calling
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-openai-api-key"},
    },
})
Anthropic (Claude)
  • Models: Claude-Opus-4.1, Claude-Opus-4, Claude-Sonnet-4, Claude-3.7-Sonnet, Claude-3.5-Haiku, Claude-3-Opus, Claude-3-Sonnet, Claude-3-Haiku
  • Features: Chat completions, streaming, system message support
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameAnthropic, APIKey: "your-anthropic-api-key"},
    },
})
Google Gemini
  • Models: Gemini-2.5-Pro, Gemini-2.5-Flash, Gemini-1.5-Pro, Gemini-1.5-Flash
  • Features: Chat completions, streaming
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameGemini, APIKey: "your-gemini-api-key"},
    },
})
AWS Bedrock (External Provider)

AWS Bedrock is available as an external module to avoid pulling AWS SDK dependencies for users who don't need it.

go get github.com/agentplexus/omnillm-bedrock
import (
    "github.com/agentplexus/omnillm"
    "github.com/agentplexus/omnillm-bedrock"
)

// Create the Bedrock provider
bedrockProvider, err := bedrock.NewProvider("us-east-1")
if err != nil {
    log.Fatal(err)
}

// Use it with omnillm via CustomProvider
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {CustomProvider: bedrockProvider},
    },
})

See External Providers for more details.

X.AI (Grok)
  • Models: Grok-4.1-Fast (Reasoning/Non-Reasoning), Grok-4 (0709), Grok-4-Fast (Reasoning/Non-Reasoning), Grok-Code-Fast, Grok-3, Grok-3-Mini, Grok-2, Grok-2-Vision
  • Features: Chat completions, streaming, OpenAI-compatible API, 2M context window (4.1/4-Fast models)
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameXAI, APIKey: "your-xai-api-key"},
    },
})
Ollama (Local Models)
  • Models: Llama 3, Mistral, CodeLlama, Gemma, Qwen2.5, DeepSeek-Coder
  • Features: Local inference, no API keys required, optimized for Apple Silicon
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOllama, BaseURL: "http://localhost:11434"},
    },
})

🔌 External Providers

Some providers with heavy SDK dependencies are available as separate modules to keep the core library lightweight. These are injected via ClientConfig.CustomProvider.

| Provider    | Module                                 | Why External                                |
|-------------|----------------------------------------|---------------------------------------------|
| AWS Bedrock | github.com/agentplexus/omnillm-bedrock | AWS SDK v2 adds 17+ transitive dependencies |
Using External Providers
import (
    "github.com/agentplexus/omnillm"
    "github.com/agentplexus/omnillm-bedrock"  // or your custom provider
)

// Create the external provider
bedrockProv, err := bedrock.NewProvider("us-east-1")
if err != nil {
    log.Fatal(err)
}

// Inject via CustomProvider in Providers slice
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {CustomProvider: bedrockProv},
    },
})
Creating Your Own External Provider

External providers implement the provider.Provider interface:

import "github.com/agentplexus/omnillm/provider"

type MyProvider struct{}

func (p *MyProvider) Name() string { return "myprovider" }
func (p *MyProvider) Close() error { return nil }

func (p *MyProvider) CreateChatCompletion(ctx context.Context, req *provider.ChatCompletionRequest) (*provider.ChatCompletionResponse, error) {
    // Your implementation
}

func (p *MyProvider) CreateChatCompletionStream(ctx context.Context, req *provider.ChatCompletionRequest) (provider.ChatCompletionStream, error) {
    // Your streaming implementation
}

See the omnillm-bedrock source code as a reference implementation.

📡 Streaming Example

stream, err := client.CreateChatCompletionStream(context.Background(), &omnillm.ChatCompletionRequest{
    Model: omnillm.ModelGPT4o,
    Messages: []omnillm.Message{
        {
            Role:    omnillm.RoleUser,
            Content: "Tell me a short story about AI.",
        },
    },
    MaxTokens:   &[]int{200}[0],
    Temperature: &[]float64{0.8}[0],
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()

fmt.Print("AI Response: ")
for {
    chunk, err := stream.Recv()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    
    if len(chunk.Choices) > 0 && chunk.Choices[0].Delta != nil {
        fmt.Print(chunk.Choices[0].Delta.Content)
    }
}
fmt.Println()

🧠 Conversation Memory

OmniLLM supports persistent conversation memory using any Key-Value Store that implements the Sogo KVS interface. This enables multi-turn conversations that persist across application restarts.

Memory Configuration
// Configure memory settings
memoryConfig := omnillm.MemoryConfig{
    MaxMessages: 50,                    // Keep last 50 messages per session
    TTL:         24 * time.Hour,       // Messages expire after 24 hours
    KeyPrefix:   "myapp:conversations", // Custom key prefix
}

// Create client with memory (using Redis, DynamoDB, etc.)
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-api-key"},
    },
    Memory:       kvsClient,          // Your KVS implementation
    MemoryConfig: &memoryConfig,
})
Memory-Aware Completions
// Create a session with system message
err = client.CreateConversationWithSystemMessage(ctx, "user-123", 
    "You are a helpful assistant that remembers our conversation history.")

// Use memory-aware completion - automatically loads conversation history
response, err := client.CreateChatCompletionWithMemory(ctx, "user-123", &omnillm.ChatCompletionRequest{
    Model: omnillm.ModelGPT4o,
    Messages: []omnillm.Message{
        {Role: omnillm.RoleUser, Content: "What did we discuss last time?"},
    },
    MaxTokens: &[]int{200}[0],
})

// The response will include context from previous conversations in this session
Memory Management
// Load conversation history
conversation, err := client.LoadConversation(ctx, "user-123")

// Get just the messages
messages, err := client.GetConversationMessages(ctx, "user-123")

// Manually append messages
err = client.AppendMessage(ctx, "user-123", omnillm.Message{
    Role:    omnillm.RoleUser,
    Content: "Remember this important fact: I prefer JSON responses.",
})

// Delete conversation
err = client.DeleteConversation(ctx, "user-123")
KVS Backend Support

Memory works with any KVS implementation:

  • Redis: For high-performance, distributed memory
  • DynamoDB: For AWS-native storage
  • In-Memory: For testing and development
  • Custom: Any implementation of the Sogo KVS interface
// Example with Redis (using a hypothetical Redis KVS implementation)
redisKVS := redis.NewKVSClient("localhost:6379")
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-key"},
    },
    Memory: redisKVS,
})

📊 Observability Hooks

OmniLLM supports observability hooks that allow you to add tracing, logging, and metrics to LLM calls without modifying the core library. This is useful for integrating with observability platforms like OpenTelemetry, Datadog, or custom monitoring solutions.

ObservabilityHook Interface
// LLMCallInfo provides metadata about the LLM call
type LLMCallInfo struct {
    CallID       string    // Unique identifier for correlating BeforeRequest/AfterResponse
    ProviderName string    // e.g., "openai", "anthropic"
    StartTime    time.Time // When the call started
}

// ObservabilityHook allows external packages to observe LLM calls
type ObservabilityHook interface {
    // BeforeRequest is called before each LLM call.
    // Returns a new context for trace/span propagation.
    BeforeRequest(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest) context.Context

    // AfterResponse is called after each LLM call completes (success or failure).
    AfterResponse(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest, resp *provider.ChatCompletionResponse, err error)

    // WrapStream wraps a stream for observability of streaming responses.
    // Note: AfterResponse is only called if stream creation fails. For streaming
    // completion timing, handle Close() or EOF detection in your wrapper.
    WrapStream(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest, stream provider.ChatCompletionStream) provider.ChatCompletionStream
}
Basic Usage
// Create a simple logging hook
type LoggingHook struct{}

func (h *LoggingHook) BeforeRequest(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest) context.Context {
    log.Printf("[%s] LLM call started: provider=%s model=%s", info.CallID, info.ProviderName, req.Model)
    return ctx
}

func (h *LoggingHook) AfterResponse(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, resp *omnillm.ChatCompletionResponse, err error) {
    duration := time.Since(info.StartTime)
    if err != nil {
        log.Printf("[%s] LLM call failed: provider=%s duration=%v error=%v", info.CallID, info.ProviderName, duration, err)
    } else {
        log.Printf("[%s] LLM call completed: provider=%s duration=%v tokens=%d", info.CallID, info.ProviderName, duration, resp.Usage.TotalTokens)
    }
}

func (h *LoggingHook) WrapStream(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, stream omnillm.ChatCompletionStream) omnillm.ChatCompletionStream {
    return stream // Return unwrapped for simple logging, or wrap for streaming metrics
}

// Use the hook when creating a client
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-api-key"},
    },
    ObservabilityHook: &LoggingHook{},
})
OpenTelemetry Integration Example
type OTelHook struct {
    tracer trace.Tracer
}

func (h *OTelHook) BeforeRequest(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest) context.Context {
    ctx, span := h.tracer.Start(ctx, "llm.chat_completion",
        trace.WithAttributes(
            attribute.String("llm.provider", info.ProviderName),
            attribute.String("llm.model", req.Model),
        ),
    )
    return ctx
}

func (h *OTelHook) AfterResponse(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, resp *omnillm.ChatCompletionResponse, err error) {
    span := trace.SpanFromContext(ctx)
    defer span.End()

    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
    } else if resp != nil {
        span.SetAttributes(
            attribute.Int("llm.tokens.total", resp.Usage.TotalTokens),
            attribute.Int("llm.tokens.prompt", resp.Usage.PromptTokens),
            attribute.Int("llm.tokens.completion", resp.Usage.CompletionTokens),
        )
    }
}

func (h *OTelHook) WrapStream(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, stream omnillm.ChatCompletionStream) omnillm.ChatCompletionStream {
    return &observableStream{stream: stream, ctx: ctx, info: info}
}
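The observableStream wrapper returned above is application code, not part of the library. A minimal sketch, assuming the Recv/Close methods used in the streaming example earlier and relying on the fact that an OpenTelemetry span's End() is a no-op after the first call:

// observableStream is a hypothetical wrapper that ends the span when the
// stream finishes (EOF) or is closed early.
type observableStream struct {
    stream omnillm.ChatCompletionStream
    ctx    context.Context
    info   omnillm.LLMCallInfo
}

func (s *observableStream) Recv() (*omnillm.ChatCompletionChunk, error) {
    chunk, err := s.stream.Recv()
    if err == io.EOF {
        trace.SpanFromContext(s.ctx).End() // stream completed normally
    }
    return chunk, err
}

func (s *observableStream) Close() error {
    trace.SpanFromContext(s.ctx).End() // no-op if already ended in Recv
    return s.stream.Close()
}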
Key Benefits
  • Non-Invasive: Add observability without modifying core library code
  • Provider Agnostic: Works with all LLM providers (OpenAI, Anthropic, Gemini, etc.)
  • Streaming Support: Wrap streams to observe streaming responses
  • Context Propagation: Pass trace context through the entire call chain
  • Flexible: Methods you don't need can be simple no-ops; all three are invoked whenever a hook is set

🔀 Fallback Providers

OmniLLM supports automatic failover to backup providers when the primary provider fails. Fallback only triggers on retryable errors (rate limits, server errors, network issues) - authentication errors and invalid requests do not trigger fallback.

Basic Usage
// Providers[0] is primary, Providers[1+] are fallbacks
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "openai-key"},       // Primary
        {Provider: omnillm.ProviderNameAnthropic, APIKey: "anthropic-key"}, // Fallback 1
        {Provider: omnillm.ProviderNameGemini, APIKey: "gemini-key"},       // Fallback 2
    },
})

// If OpenAI fails with a retryable error, automatically tries Anthropic, then Gemini
response, err := client.CreateChatCompletion(ctx, request)
With Circuit Breaker

Enable circuit breaker to temporarily skip providers that are failing repeatedly:

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "openai-key"},
        {Provider: omnillm.ProviderNameAnthropic, APIKey: "anthropic-key"},
    },
    CircuitBreakerConfig: &omnillm.CircuitBreakerConfig{
        FailureThreshold: 5,               // Open after 5 consecutive failures
        SuccessThreshold: 2,               // Close after 2 successes in half-open
        Timeout:          30 * time.Second, // Wait before trying again
    },
})
Error Classification

Fallback uses intelligent error classification:

| Error Type              | Triggers Fallback |
|-------------------------|-------------------|
| Rate limits (429)       | ✅ Yes            |
| Server errors (5xx)     | ✅ Yes            |
| Network errors          | ✅ Yes            |
| Auth errors (401/403)   | ❌ No             |
| Invalid requests (400)  | ❌ No             |
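The same classification is exported, so application code can reuse it when deciding whether to retry a failed call itself:

response, err := client.CreateChatCompletion(ctx, request)
if err != nil {
    if omnillm.IsRetryableError(err) {
        log.Printf("transient failure, safe to retry or fall back: %v", err)
    } else {
        log.Printf("permanent failure, fix the request instead of retrying: %v", err)
    }
}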

⚡ Circuit Breaker

The circuit breaker pattern prevents cascading failures by temporarily skipping providers that are unhealthy.

States
  • Closed: Normal operation, requests flow through
  • Open: Provider is failing, requests skip it immediately
  • Half-Open: Testing if provider has recovered
Configuration
cbConfig := &omnillm.CircuitBreakerConfig{
    FailureThreshold:     5,               // Failures before opening
    SuccessThreshold:     2,               // Successes to close from half-open
    Timeout:              30 * time.Second, // Wait before half-open
    FailureRateThreshold: 0.5,             // 50% failure rate opens circuit
    MinimumRequests:      10,              // Minimum requests for rate calculation
}
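When CircuitBreakerConfig is set on the client, this is handled automatically. The circuit breaker can also be driven manually for calls you manage yourself; a brief sketch using the exported API:

cb := omnillm.NewCircuitBreaker(*cbConfig)

if cb.AllowRequest() {
    _, err := client.CreateChatCompletion(ctx, request)
    if err != nil && omnillm.IsRetryableError(err) {
        cb.RecordFailure() // counts toward opening the circuit
    } else {
        cb.RecordSuccess() // non-retryable errors don't indicate provider health issues
    }
}

fmt.Println("circuit state:", cb.State()) // Closed, Open, or Half-Open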

🔢 Token Estimation

OmniLLM provides pre-flight token estimation to validate requests before sending them to the API. This helps avoid hitting context window limits.

Basic Usage
// Create estimator with default config
estimator := omnillm.NewTokenEstimator(omnillm.DefaultTokenEstimatorConfig())

// Estimate tokens for messages
tokens, err := estimator.EstimateTokens("gpt-4o", messages)

// Get model's context window
window := estimator.GetContextWindow("gpt-4o") // Returns 128000
Automatic Validation

Enable automatic token validation in client:

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-key"},
    },
    TokenEstimator: omnillm.NewTokenEstimator(omnillm.DefaultTokenEstimatorConfig()),
    ValidateTokens: true, // Rejects requests that exceed context window
})

// Returns TokenLimitError if request exceeds model limits
response, err := client.CreateChatCompletion(ctx, request)
if tlErr, ok := err.(*omnillm.TokenLimitError); ok {
    fmt.Printf("Request has %d tokens, but model only supports %d\n",
        tlErr.EstimatedTokens, tlErr.ContextWindow)
}
Built-in Context Windows

Token estimator includes context windows for 40+ models:

| Provider  | Models              | Context Window |
|-----------|---------------------|----------------|
| OpenAI    | GPT-4o, GPT-4o-mini | 128,000        |
| OpenAI    | o1                  | 200,000        |
| Anthropic | Claude 3/3.5/4      | 200,000        |
| Google    | Gemini 2.5          | 1,000,000      |
| Google    | Gemini 1.5 Pro      | 2,000,000      |
| X.AI      | Grok 3/4            | 128,000        |
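The package-level helpers expose the same data without constructing an estimator (messages here is the []omnillm.Message slice from the earlier examples):

tokens, err := omnillm.EstimatePromptTokens(omnillm.ModelGPT4o, messages)
if err != nil {
    log.Fatal(err)
}
window := omnillm.GetModelContextWindow(omnillm.ModelGPT4o)
fmt.Printf("estimated %d of %d tokens\n", tokens, window)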
Custom Configuration
config := omnillm.TokenEstimatorConfig{
    CharactersPerToken: 3.5, // More conservative estimate
    CustomContextWindows: map[string]int{
        "my-custom-model": 500000,
        "gpt-4o":          200000, // Override built-in
    },
}
estimator := omnillm.NewTokenEstimator(config)

💾 Response Caching

OmniLLM supports response caching to reduce API costs for identical requests. Caching uses the same KVS backend as conversation memory.

Basic Usage
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-key"},
    },
    Cache: kvsClient, // Your KVS implementation (Redis, DynamoDB, etc.)
    CacheConfig: &omnillm.CacheConfig{
        TTL:       1 * time.Hour,        // Cache duration
        KeyPrefix: "myapp:llm-cache",    // Key prefix in KVS
    },
})

// First call hits the API
response1, _ := client.CreateChatCompletion(ctx, request)

// Second identical call returns cached response
response2, _ := client.CreateChatCompletion(ctx, request)

// Check if response was from cache
if response2.ProviderMetadata["cache_hit"] == true {
    fmt.Println("Response was cached!")
}
Cache Configuration
cacheConfig := &omnillm.CacheConfig{
    TTL:                1 * time.Hour,       // Time-to-live
    KeyPrefix:          "omnillm:cache",     // Key prefix
    SkipStreaming:      true,                // Don't cache streaming (default)
    CacheableModels:    []string{"gpt-4o"},  // Only cache specific models (nil = all)
    IncludeTemperature: true,                // Temperature affects cache key
    IncludeSeed:        true,                // Seed affects cache key
}
Cache Key Generation

Cache keys are generated from a SHA-256 hash of:

  • Model name
  • Messages (role, content, name, tool_call_id)
  • MaxTokens, Temperature, TopP, TopK, Seed, Stop sequences

Different parameter values = different cache keys.
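Conceptually (this is an illustration, not the library's exact implementation), the key is a prefixed SHA-256 digest of the normalized fields:

// Illustrative sketch of the hashing scheme; the real field set and key layout
// live inside omnillm's CacheManager (see BuildCacheKey in the API docs).
// Requires crypto/sha256 and encoding/hex.
func cacheKey(prefix string, req *omnillm.ChatCompletionRequest) string {
    h := sha256.New()
    fmt.Fprintf(h, "model=%s;", req.Model)
    for _, m := range req.Messages {
        fmt.Fprintf(h, "msg=%v:%s;", m.Role, m.Content)
    }
    if req.Temperature != nil {
        fmt.Fprintf(h, "temp=%v;", *req.Temperature)
    }
    if req.MaxTokens != nil {
        fmt.Fprintf(h, "max=%d;", *req.MaxTokens)
    }
    return prefix + ":" + hex.EncodeToString(h.Sum(nil))
}

Two requests with identical parameters map to the same key; changing any hashed field yields a new key and therefore a fresh API call.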

🔄 Provider Switching

The unified interface makes it easy to switch between providers:

// Same request works with any provider
request := &omnillm.ChatCompletionRequest{
    Model: omnillm.ModelGPT4o, // or omnillm.ModelClaude3Sonnet, etc.
    Messages: []omnillm.Message{
        {Role: omnillm.RoleUser, Content: "Hello, world!"},
    },
    MaxTokens: &[]int{100}[0],
}

// OpenAI
openaiClient, _ := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "openai-key"},
    },
})

// Anthropic
anthropicClient, _ := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameAnthropic, APIKey: "anthropic-key"},
    },
})

// Gemini
geminiClient, _ := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameGemini, APIKey: "gemini-key"},
    },
})

// Same API call for all providers
response1, _ := openaiClient.CreateChatCompletion(ctx, request)
response2, _ := anthropicClient.CreateChatCompletion(ctx, request)
response3, _ := geminiClient.CreateChatCompletion(ctx, request)

🧪 Testing

OmniLLM includes a comprehensive test suite with both unit tests and integration tests.

Running Tests
# Run all unit tests (no API keys required)
go test ./... -short

# Run with coverage
go test ./... -short -cover

# Run integration tests (requires API keys)
ANTHROPIC_API_KEY=your-key go test ./providers/anthropic -v
OPENAI_API_KEY=your-key go test ./providers/openai -v
XAI_API_KEY=your-key go test ./providers/xai -v

# Run all tests including integration
ANTHROPIC_API_KEY=your-key OPENAI_API_KEY=your-key XAI_API_KEY=your-key go test ./... -v
Test Coverage
  • Unit Tests: Mock-based tests that run without external dependencies
  • Integration Tests: Real API tests that skip gracefully when API keys are not set
  • Memory Tests: Comprehensive conversation memory management tests
  • Provider Tests: Adapter logic, message conversion, and streaming tests
Writing Tests

The clean interface design makes testing straightforward:

// Mock the Provider interface for testing
type mockProvider struct{}

func (m *mockProvider) CreateChatCompletion(ctx context.Context, req *omnillm.ChatCompletionRequest) (*omnillm.ChatCompletionResponse, error) {
    return &omnillm.ChatCompletionResponse{
        Choices: []omnillm.ChatCompletionChoice{
            {
                Message: omnillm.Message{
                    Role:    omnillm.RoleAssistant,
                    Content: "Mock response",
                },
            },
        },
    }, nil
}

func (m *mockProvider) CreateChatCompletionStream(ctx context.Context, req *omnillm.ChatCompletionRequest) (omnillm.ChatCompletionStream, error) {
    return nil, nil
}

func (m *mockProvider) Close() error { return nil }
func (m *mockProvider) Name() string { return "mock" }
Conditional Integration Tests

Integration tests automatically skip when API keys are not available:

func TestAnthropicIntegration_Streaming(t *testing.T) {
    apiKey := os.Getenv("ANTHROPIC_API_KEY")
    if apiKey == "" {
        t.Skip("Skipping integration test: ANTHROPIC_API_KEY not set")
    }
    // Test code here...
}
Mock KVS for Memory Testing

OmniLLM provides a mock KVS implementation for testing memory functionality:

import omnillmtest "github.com/agentplexus/omnillm/testing"

// Create mock KVS for testing
mockKVS := omnillmtest.NewMockKVS()

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "test-key"},
    },
    Memory: mockKVS,
})

📚 Examples

The repository includes comprehensive examples:

  • Basic Usage: Simple chat completions with each provider
  • Streaming: Real-time response handling
  • Conversation: Multi-turn conversations with context
  • Memory Demo: Persistent conversation memory with KVS backend
  • Architecture Demo: Overview of the provider architecture
  • Custom Provider: How to create and use 3rd party providers

Run examples:

go run examples/basic/main.go
go run examples/streaming/main.go
go run examples/anthropic_streaming/main.go
go run examples/conversation/main.go
go run examples/memory_demo/main.go
go run examples/providers_demo/main.go
go run examples/xai/main.go
go run examples/ollama/main.go
go run examples/ollama_streaming/main.go
go run examples/gemini/main.go
go run examples/custom_provider/main.go

🔧 Configuration

Environment Variables
  • OPENAI_API_KEY: Your OpenAI API key
  • ANTHROPIC_API_KEY: Your Anthropic API key
  • GEMINI_API_KEY: Your Google Gemini API key
  • XAI_API_KEY: Your X.AI API key
Advanced Configuration
config := omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {
            Provider: omnillm.ProviderNameOpenAI,
            APIKey:   "your-api-key",
            BaseURL:  "https://custom-endpoint.com/v1",
            Extra: map[string]any{
                "timeout": 60, // Custom provider-specific settings
            },
        },
    },
}
Request Parameters

ChatCompletionRequest supports the following parameters with provider-specific availability:

| Parameter        | Type            | Providers                 | Description                         |
|------------------|-----------------|---------------------------|-------------------------------------|
| Model            | string          | All                       | Model identifier (required)         |
| Messages         | []Message       | All                       | Conversation messages (required)    |
| MaxTokens        | *int            | All                       | Maximum tokens to generate          |
| Temperature      | *float64        | All                       | Randomness (0.0-2.0)                |
| TopP             | *float64        | All                       | Nucleus sampling threshold          |
| TopK             | *int            | Anthropic, Gemini, Ollama | Top K token selection               |
| Stop             | []string        | All                       | Stop sequences                      |
| PresencePenalty  | *float64        | OpenAI, X.AI              | Penalize tokens by presence         |
| FrequencyPenalty | *float64        | OpenAI, X.AI              | Penalize tokens by frequency        |
| Seed             | *int            | OpenAI, X.AI, Ollama      | Reproducible outputs                |
| N                | *int            | OpenAI                    | Number of completions               |
| ResponseFormat   | *ResponseFormat | OpenAI, Gemini            | JSON mode ({"type": "json_object"}) |
| Logprobs         | *bool           | OpenAI                    | Return log probabilities            |
| TopLogprobs      | *int            | OpenAI                    | Top logprobs count (0-20)           |
| User             | *string         | OpenAI                    | End-user identifier                 |
| LogitBias        | map[string]int  | OpenAI                    | Token bias adjustments              |
// Helper for pointer values
func ptr[T any](v T) *T { return &v }

// Example: Reproducible outputs with seed
response, err := client.CreateChatCompletion(ctx, &omnillm.ChatCompletionRequest{
    Model:    omnillm.ModelGPT4o,
    Messages: messages,
    Seed:     ptr(42), // Same seed = same output
})

// Example: JSON mode response
response, err := client.CreateChatCompletion(ctx, &omnillm.ChatCompletionRequest{
    Model:    omnillm.ModelGPT4o,
    Messages: messages,
    ResponseFormat: &omnillm.ResponseFormat{Type: "json_object"},
})

// Example: TopK sampling (Anthropic/Gemini/Ollama)
response, err := client.CreateChatCompletion(ctx, &omnillm.ChatCompletionRequest{
    Model:    omnillm.ModelClaude3Sonnet,
    Messages: messages,
    TopK:     ptr(40), // Consider only top 40 tokens
})
Logging Configuration

OmniLLM supports injectable logging via Go's standard log/slog package. If no logger is provided, a null logger is used (no output).

import (
    "log/slog"
    "os"

    "github.com/agentplexus/omnillm"
)

// Use a custom logger
logger := slog.New(slog.NewJSONHandler(os.Stderr, &slog.HandlerOptions{
    Level: slog.LevelDebug,
}))

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {Provider: omnillm.ProviderNameOpenAI, APIKey: "your-api-key"},
    },
    Logger: logger, // Optional: defaults to null logger if not provided
})

// Access the logger if needed
client.Logger().Info("client initialized", slog.String("provider", "openai"))

The logger is used internally for non-critical errors (e.g., memory save failures) that shouldn't interrupt the main request flow.

Context-Aware Logging

OmniLLM supports request-scoped logging via context. This allows you to attach trace IDs, user IDs, or other request-specific attributes to all log output within a request:

import (
    "log/slog"

    "github.com/agentplexus/omnillm"
    "github.com/grokify/mogo/log/slogutil"
)

// Create a request-scoped logger with trace/user context
reqLogger := slog.Default().With(
    slog.String("trace_id", traceID),
    slog.String("user_id", userID),
    slog.String("request_id", requestID),
)

// Attach logger to context
ctx = slogutil.ContextWithLogger(ctx, reqLogger)

// All internal logging will now include trace_id, user_id, and request_id
response, err := client.CreateChatCompletionWithMemory(ctx, sessionID, req)

The context-aware logger is retrieved using slogutil.LoggerFromContext(ctx, fallback), which returns the context logger if present, or falls back to the client's configured logger.
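The same helper can be used in your own hooks or middleware to pick up the request-scoped logger, falling back to the client's logger (assuming it returns a *slog.Logger as described above):

logger := slogutil.LoggerFromContext(ctx, client.Logger())
logger.Info("handling LLM request")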

Retry with Backoff

OmniLLM supports automatic retries for transient failures (rate limits, 5xx errors) via a custom HTTP client. This uses the retryhttp package from github.com/grokify/mogo.

import (
    "log"
    "net/http"
    "os"
    "time"

    "github.com/agentplexus/omnillm"
    "github.com/grokify/mogo/net/http/retryhttp"
)

// Create retry transport with exponential backoff
rt := retryhttp.NewWithOptions(
    retryhttp.WithMaxRetries(5),                           // Max 5 retries
    retryhttp.WithInitialBackoff(500 * time.Millisecond),  // Start with 500ms
    retryhttp.WithMaxBackoff(30 * time.Second),            // Cap at 30s
    retryhttp.WithOnRetry(func(attempt int, req *http.Request, resp *http.Response, err error, backoff time.Duration) {
        log.Printf("Retry attempt %d, waiting %v", attempt, backoff)
    }),
)

// Create client with retry-enabled HTTP client
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Providers: []omnillm.ProviderConfig{
        {
            Provider: omnillm.ProviderNameOpenAI,
            APIKey:   os.Getenv("OPENAI_API_KEY"),
            HTTPClient: &http.Client{
                Transport: rt,
                Timeout:   2 * time.Minute, // Allow time for retries
            },
        },
    },
})

Retry Transport Features:

| Feature                | Default                 | Description                           |
|------------------------|-------------------------|---------------------------------------|
| Max Retries            | 3                       | Maximum retry attempts                |
| Initial Backoff        | 1s                      | Starting backoff duration             |
| Max Backoff            | 30s                     | Cap on backoff duration               |
| Backoff Multiplier     | 2.0                     | Exponential growth factor             |
| Jitter                 | 10%                     | Randomness to prevent thundering herd |
| Retryable Status Codes | 429, 500, 502, 503, 504 | Rate limits + 5xx errors              |

Additional Options:

  • WithRetryableStatusCodes(codes) - Custom status codes to retry
  • WithShouldRetry(fn) - Custom retry decision function
  • WithLogger(logger) - Structured logging for retry events
  • Respects Retry-After headers from API responses

Provider Support: Works with OpenAI, Anthropic, X.AI, and Ollama providers. Gemini and Bedrock use SDK clients with their own retry mechanisms.

🏗️ Adding New Providers

External packages can create providers without modifying the core library. This is the recommended approach for most use cases:

Step 1: Create Your Provider Package
// In your external package (e.g., github.com/yourname/omnillm-gemini)
package gemini

import (
    "context"
    "github.com/agentplexus/omnillm/provider"
)

// Step 1: HTTP Client (like providers/openai/openai.go)
type Client struct {
    apiKey string
    // your HTTP client implementation
}

func New(apiKey string) *Client {
    return &Client{apiKey: apiKey}
}

// Step 2: Provider Adapter (like providers/openai/adapter.go)
type Provider struct {
    client *Client
}

func NewProvider(apiKey string) provider.Provider {
    return &Provider{client: New(apiKey)}
}

func (p *Provider) CreateChatCompletion(ctx context.Context, req *provider.ChatCompletionRequest) (*provider.ChatCompletionResponse, error) {
    // Convert provider.ChatCompletionRequest to your API format
    // Make HTTP call via p.client
    // Convert response back to provider.ChatCompletionResponse
}

func (p *Provider) CreateChatCompletionStream(ctx context.Context, req *provider.ChatCompletionRequest) (provider.ChatCompletionStream, error) {
    // Your streaming implementation
}

func (p *Provider) Close() error { return p.client.Close() }
func (p *Provider) Name() string { return "gemini" }
Step 2: Use Your Provider
import (
    "context"
    "fmt"
    "log"

    "github.com/agentplexus/omnillm"
    "github.com/yourname/omnillm-gemini"
)

func main() {
    // Create your custom provider
    customProvider := gemini.NewProvider("your-api-key")

    // Inject it directly into omnillm - no core modifications needed!
    client, err := omnillm.NewClient(omnillm.ClientConfig{
        Providers: []omnillm.ProviderConfig{
            {CustomProvider: customProvider},
        },
    })
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Use the same omnillm API
    response, err := client.CreateChatCompletion(context.Background(), &omnillm.ChatCompletionRequest{
        Model:    "gemini-pro",
        Messages: []omnillm.Message{{Role: omnillm.RoleUser, Content: "Hello!"}},
    })
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(response.Choices[0].Message.Content)
}
🔧 Built-in Providers (For Core Contributors)

To add a built-in provider to the core library, follow the same structure as existing providers:

  1. Create Provider Package: providers/newprovider/

    • newprovider.go - HTTP client implementation
    • types.go - Provider-specific request/response types
    • adapter.go - provider.Provider interface implementation
  2. Update Core Files:

    • Add factory function in providers.go
    • Add provider constant in constants.go
    • Add model constants if needed
  3. Reference Implementation: Look at any existing provider (e.g., providers/openai/) as they all follow the exact same pattern that external providers should use

🎯 Why This Architecture?
  • 🔌 No Core Changes: External providers don't require modifying the core library
  • 🏗️ Reference Pattern: Internal providers demonstrate the exact structure external providers should follow
  • 🧪 Easy Testing: Both internal and external providers use the same provider.Provider interface
  • 📦 Self-Contained: Each provider manages its own HTTP client, types, and adapter logic
  • 🔧 Direct Injection: Clean dependency injection via ProviderConfig.CustomProvider

📊 Model Support

| Provider  | Models                                                                               | Features                                  |
|-----------|--------------------------------------------------------------------------------------|-------------------------------------------|
| OpenAI    | GPT-5, GPT-4.1, GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo                       | Chat, Streaming, Functions                |
| Anthropic | Claude-Opus-4.1, Claude-Opus-4, Claude-Sonnet-4, Claude-3.7-Sonnet, Claude-3.5-Haiku  | Chat, Streaming, System messages          |
| Gemini    | Gemini-2.5-Pro, Gemini-2.5-Flash, Gemini-1.5-Pro, Gemini-1.5-Flash                    | Chat, Streaming                           |
| X.AI      | Grok-4.1-Fast, Grok-4, Grok-4-Fast, Grok-Code-Fast, Grok-3, Grok-3-Mini, Grok-2       | Chat, Streaming, 2M context, Tool calling |
| Ollama    | Llama 3, Mistral, CodeLlama, Gemma, Qwen2.5, DeepSeek-Coder                           | Chat, Streaming, Local inference          |
| Bedrock*  | Claude models, Titan models                                                           | Chat, Multiple model families             |

*Available as external module

🚨 Error Handling

OmniLLM provides comprehensive error handling with provider-specific context:

response, err := client.CreateChatCompletion(ctx, request)
if err != nil {
    if apiErr, ok := err.(*omnillm.APIError); ok {
        fmt.Printf("Provider: %s, Status: %d, Message: %s\n", 
            apiErr.Provider, apiErr.StatusCode, apiErr.Message)
    }
}
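Errors can also be classified for retry and fallback decisions using the exported helpers:

switch omnillm.ClassifyError(err) {
case omnillm.ErrorCategoryRetryable:
    // rate limit (429), server error (5xx), or network error - worth retrying
case omnillm.ErrorCategoryNonRetryable:
    // auth error (401/403) or invalid request (400) - fix the request instead
default:
    // ErrorCategoryUnknown - treat conservatively
}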

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests to ensure everything works:
    go test ./... -short        # Run unit tests
    go build ./...              # Verify build
    go vet ./...                # Run static analysis
    
  5. Commit your changes (git commit -m 'Add some amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request
Adding Tests

When contributing new features:

  • Add unit tests for core logic
  • Add integration tests for provider implementations (with API key checks)
  • Ensure tests pass without API keys using the -short flag (see the guard sketch after this list)
  • Mock external dependencies when possible
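A minimal -short guard uses the standard library's testing.Short(); the test name here is illustrative:

func TestOpenAIIntegration(t *testing.T) {
    if testing.Short() {
        t.Skip("skipping integration test in -short mode")
    }
    // Real-API test code here...
}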

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Made with ❤️ for the Go and AI community

Documentation

Constants

const (
	EnvVarAnthropicAPIKey = "ANTHROPIC_API_KEY" // #nosec G101
	EnvVarOpenAIAPIKey    = "OPENAI_API_KEY"    // #nosec G101
	EnvVarGeminiAPIKey    = "GEMINI_API_KEY"    // #nosec G101
	EnvVarXAIAPIKey       = "XAI_API_KEY"       // #nosec G101
)
const (
	// Bedrock Models - Re-exported from models package
	ModelBedrockClaude3Opus   = models.BedrockClaude3Opus
	ModelBedrockClaude3Sonnet = models.BedrockClaude3Sonnet
	ModelBedrockClaudeOpus4   = models.BedrockClaudeOpus4
	ModelBedrockTitan         = models.BedrockTitan

	// Claude Models - Re-exported from models package
	ModelClaudeOpus4_1   = models.ClaudeOpus4_1
	ModelClaudeOpus4     = models.ClaudeOpus4
	ModelClaudeSonnet4   = models.ClaudeSonnet4
	ModelClaude3_7Sonnet = models.Claude3_7Sonnet
	ModelClaude3_5Haiku  = models.Claude3_5Haiku
	ModelClaude3Opus     = models.Claude3Opus
	ModelClaude3Sonnet   = models.Claude3Sonnet
	ModelClaude3Haiku    = models.Claude3Haiku

	// Gemini Models - Re-exported from models package
	ModelGemini2_5Pro       = models.Gemini2_5Pro
	ModelGemini2_5Flash     = models.Gemini2_5Flash
	ModelGeminiLive2_5Flash = models.GeminiLive2_5Flash
	ModelGemini1_5Pro       = models.Gemini1_5Pro
	ModelGemini1_5Flash     = models.Gemini1_5Flash
	ModelGeminiPro          = models.GeminiPro

	// Ollama Models - Re-exported from models package
	ModelOllamaLlama3_8B   = models.OllamaLlama3_8B
	ModelOllamaLlama3_70B  = models.OllamaLlama3_70B
	ModelOllamaMistral7B   = models.OllamaMistral7B
	ModelOllamaMixtral8x7B = models.OllamaMixtral8x7B
	ModelOllamaCodeLlama   = models.OllamaCodeLlama
	ModelOllamaGemma2B     = models.OllamaGemma2B
	ModelOllamaGemma7B     = models.OllamaGemma7B
	ModelOllamaQwen2_5     = models.OllamaQwen2_5
	ModelOllamaDeepSeek    = models.OllamaDeepSeek

	// OpenAI Models - Re-exported from models package
	ModelGPT5           = models.GPT5
	ModelGPT5Mini       = models.GPT5Mini
	ModelGPT5Nano       = models.GPT5Nano
	ModelGPT5ChatLatest = models.GPT5ChatLatest
	ModelGPT4_1         = models.GPT4_1
	ModelGPT4_1Mini     = models.GPT4_1Mini
	ModelGPT4_1Nano     = models.GPT4_1Nano
	ModelGPT4o          = models.GPT4o
	ModelGPT4oMini      = models.GPT4oMini
	ModelGPT4Turbo      = models.GPT4Turbo
	ModelGPT35Turbo     = models.GPT35Turbo

	// Vertex AI Models - Re-exported from models package
	ModelVertexClaudeOpus4 = models.VertexClaudeOpus4

	// X.AI Grok Models - Re-exported from models package
	// Grok 4.1 (Latest - November 2025)
	ModelGrok4_1FastReasoning    = models.Grok4_1FastReasoning
	ModelGrok4_1FastNonReasoning = models.Grok4_1FastNonReasoning

	// Grok 4 (July 2025)
	ModelGrok4_0709            = models.Grok4_0709
	ModelGrok4FastReasoning    = models.Grok4FastReasoning
	ModelGrok4FastNonReasoning = models.Grok4FastNonReasoning
	ModelGrokCodeFast1         = models.GrokCodeFast1

	// Grok 3
	ModelGrok3     = models.Grok3
	ModelGrok3Mini = models.Grok3Mini

	// Grok 2
	ModelGrok2_1212   = models.Grok2_1212
	ModelGrok2_Vision = models.Grok2_Vision

	// Deprecated models
	ModelGrokBeta   = models.GrokBeta
	ModelGrokVision = models.GrokVision
)

Common model constants for each provider.

NOTE: For new code, prefer importing "github.com/agentplexus/omnillm/models" directly for better organization and documentation. These constants are maintained for backwards compatibility with existing code.

const (
	RoleSystem    = provider.RoleSystem
	RoleUser      = provider.RoleUser
	RoleAssistant = provider.RoleAssistant
	RoleTool      = provider.RoleTool
)

Role constants for convenience

Variables

var (
	// Common errors
	ErrUnsupportedProvider  = errors.New("unsupported provider")
	ErrBedrockExternal      = errors.New("bedrock provider moved to github.com/agentplexus/omnillm-bedrock; use CustomProvider to inject it")
	ErrInvalidConfiguration = errors.New("invalid configuration")
	ErrNoProviders          = errors.New("at least one provider must be configured")
	ErrEmptyAPIKey          = errors.New("API key cannot be empty")
	ErrEmptyModel           = errors.New("model cannot be empty")
	ErrEmptyMessages        = errors.New("messages cannot be empty")
	ErrStreamClosed         = errors.New("stream is closed")
	ErrInvalidResponse      = errors.New("invalid response format")
	ErrRateLimitExceeded    = errors.New("rate limit exceeded")
	ErrQuotaExceeded        = errors.New("quota exceeded")
	ErrInvalidRequest       = errors.New("invalid request")
	ErrModelNotFound        = errors.New("model not found")
	ErrServerError          = errors.New("server error")
	ErrNetworkError         = errors.New("network error")
)

Functions

func EstimatePromptTokens added in v0.11.0

func EstimatePromptTokens(model string, messages []provider.Message) (int, error)

EstimatePromptTokens is a convenience function that creates a default estimator and estimates tokens for a set of messages.

func GetModelContextWindow added in v0.11.0

func GetModelContextWindow(model string) int

GetModelContextWindow is a convenience function that returns the context window for a model using the default estimator.

func IsNonRetryableError added in v0.11.0

func IsNonRetryableError(err error) bool

IsNonRetryableError returns true if the error is permanent and retrying won't help.

func IsRetryableError added in v0.11.0

func IsRetryableError(err error) bool

IsRetryableError returns true if the error is transient and the request can be retried. This is useful for fallback provider logic - only retry on retryable errors.

Types

type APIError

type APIError struct {
	StatusCode int          `json:"status_code"`
	Message    string       `json:"message"`
	Type       string       `json:"type"`
	Code       string       `json:"code"`
	Provider   ProviderName `json:"provider"`
}

APIError represents an error response from the API

func NewAPIError

func NewAPIError(provider ProviderName, statusCode int, message, errorType, code string) *APIError

NewAPIError creates a new API error

func (*APIError) Error

func (e *APIError) Error() string

type CacheConfig added in v0.11.0

type CacheConfig struct {
	// TTL is the time-to-live for cached responses.
	// Default: 1 hour
	TTL time.Duration

	// KeyPrefix is the prefix for cache keys in the KVS.
	// Default: "omnillm:cache"
	KeyPrefix string

	// SkipStreaming skips caching for streaming requests.
	// Default: true (streaming responses are not cached)
	SkipStreaming bool

	// CacheableModels limits caching to specific models.
	// If nil or empty, all models are cached.
	CacheableModels []string

	// ExcludeParameters lists parameters to exclude from cache key calculation.
	// Common exclusions: "user" (user ID shouldn't affect cache)
	// Default: ["user"]
	ExcludeParameters []string

	// IncludeTemperature includes temperature in cache key.
	// Set to false if you want to cache regardless of temperature setting.
	// Default: true
	IncludeTemperature bool

	// IncludeSeed includes seed in cache key.
	// Default: true
	IncludeSeed bool
}

CacheConfig configures response caching behavior

func DefaultCacheConfig added in v0.11.0

func DefaultCacheConfig() CacheConfig

DefaultCacheConfig returns a CacheConfig with sensible defaults

type CacheEntry added in v0.11.0

type CacheEntry struct {
	// Response is the cached chat completion response
	Response *provider.ChatCompletionResponse `json:"response"`

	// CachedAt is when the response was cached
	CachedAt time.Time `json:"cached_at"`

	// ExpiresAt is when the cache entry expires
	ExpiresAt time.Time `json:"expires_at"`

	// Model is the model used for the request
	Model string `json:"model"`

	// RequestHash is the hash of the request (for verification)
	RequestHash string `json:"request_hash"`
}

CacheEntry represents a cached response with metadata

func (*CacheEntry) IsExpired added in v0.11.0

func (e *CacheEntry) IsExpired() bool

IsExpired returns true if the cache entry has expired

type CacheHitError added in v0.11.0

type CacheHitError struct {
	Entry *CacheEntry
}

CacheHitError is a marker type to indicate a cache hit (not an actual error)

func (*CacheHitError) Error added in v0.11.0

func (e *CacheHitError) Error() string

type CacheManager added in v0.11.0

type CacheManager struct {
	// contains filtered or unexported fields
}

CacheManager handles response caching using a KVS backend

func NewCacheManager added in v0.11.0

func NewCacheManager(kvsClient kvs.Client, config CacheConfig) *CacheManager

NewCacheManager creates a new cache manager with the given KVS client and configuration. If config has zero values, defaults are used for those fields.

func (*CacheManager) BuildCacheKey added in v0.11.0

func (m *CacheManager) BuildCacheKey(req *provider.ChatCompletionRequest) string

BuildCacheKey generates a deterministic cache key for a request. The key is a hash of the normalized request parameters.

func (*CacheManager) Config added in v0.11.0

func (m *CacheManager) Config() CacheConfig

Config returns the cache configuration

func (*CacheManager) Delete added in v0.11.0

Delete removes a cache entry for the given request.

func (*CacheManager) Get added in v0.11.0

Get retrieves a cached response for the given request. Returns nil if no valid cache entry exists.

func (*CacheManager) Set added in v0.11.0

Set stores a response in the cache for the given request.

func (*CacheManager) ShouldCache added in v0.11.0

func (m *CacheManager) ShouldCache(req *provider.ChatCompletionRequest) bool

ShouldCache determines if a request should be cached. Returns false for streaming requests (if configured), non-cacheable models, etc.

type CacheStats added in v0.11.0

type CacheStats struct {
	Hits   int64
	Misses int64
}

CacheStats contains statistics about cache usage

type ChatClient

type ChatClient struct {
	// contains filtered or unexported fields
}

ChatClient is the main client interface that wraps a Provider

func NewClient

func NewClient(config ClientConfig) (*ChatClient, error)

NewClient creates a new ChatClient based on the provider

func (*ChatClient) AppendMessage

func (c *ChatClient) AppendMessage(ctx context.Context, sessionID string, message provider.Message) error

AppendMessage appends a message to a conversation in memory

func (*ChatClient) Cache added in v0.11.0

func (c *ChatClient) Cache() *CacheManager

Cache returns the cache manager (nil if not configured)

func (*ChatClient) Close

func (c *ChatClient) Close() error

Close closes the client

func (*ChatClient) CreateChatCompletion

CreateChatCompletion creates a chat completion

func (*ChatClient) CreateChatCompletionStream

func (c *ChatClient) CreateChatCompletionStream(ctx context.Context, req *provider.ChatCompletionRequest) (provider.ChatCompletionStream, error)

CreateChatCompletionStream creates a streaming chat completion

func (*ChatClient) CreateChatCompletionStreamWithMemory

func (c *ChatClient) CreateChatCompletionStreamWithMemory(ctx context.Context, sessionID string, req *provider.ChatCompletionRequest) (provider.ChatCompletionStream, error)

CreateChatCompletionStreamWithMemory creates a streaming chat completion using conversation memory

func (*ChatClient) CreateChatCompletionWithMemory

func (c *ChatClient) CreateChatCompletionWithMemory(ctx context.Context, sessionID string, req *provider.ChatCompletionRequest) (*provider.ChatCompletionResponse, error)

CreateChatCompletionWithMemory creates a chat completion using conversation memory

func (*ChatClient) CreateConversationWithSystemMessage

func (c *ChatClient) CreateConversationWithSystemMessage(ctx context.Context, sessionID, systemMessage string) error

CreateConversationWithSystemMessage creates a new conversation with a system message

func (*ChatClient) DeleteConversation

func (c *ChatClient) DeleteConversation(ctx context.Context, sessionID string) error

DeleteConversation removes a conversation from memory

func (*ChatClient) GetConversationMessages

func (c *ChatClient) GetConversationMessages(ctx context.Context, sessionID string) ([]provider.Message, error)

GetConversationMessages retrieves messages from a conversation

func (*ChatClient) HasCache added in v0.11.0

func (c *ChatClient) HasCache() bool

HasCache returns true if caching is configured

func (*ChatClient) HasMemory

func (c *ChatClient) HasMemory() bool

HasMemory returns true if memory is configured

func (*ChatClient) LoadConversation

func (c *ChatClient) LoadConversation(ctx context.Context, sessionID string) (*ConversationMemory, error)

LoadConversation loads a conversation from memory

func (*ChatClient) Logger

func (c *ChatClient) Logger() *slog.Logger

Logger returns the client's logger

func (*ChatClient) Memory

func (c *ChatClient) Memory() *MemoryManager

Memory returns the memory manager (nil if not configured)

func (*ChatClient) Provider

func (c *ChatClient) Provider() provider.Provider

Provider returns the underlying provider

func (*ChatClient) SaveConversation

func (c *ChatClient) SaveConversation(ctx context.Context, conversation *ConversationMemory) error

SaveConversation saves a conversation to memory

func (*ChatClient) TokenEstimator added in v0.11.0

func (c *ChatClient) TokenEstimator() TokenEstimator

TokenEstimator returns the token estimator (nil if not configured)

type ChatCompletionChoice

type ChatCompletionChoice = provider.ChatCompletionChoice

type ChatCompletionChunk

type ChatCompletionChunk = provider.ChatCompletionChunk

type ChatCompletionRequest

type ChatCompletionRequest = provider.ChatCompletionRequest

type ChatCompletionResponse

type ChatCompletionResponse = provider.ChatCompletionResponse

type ChatCompletionStream

type ChatCompletionStream = provider.ChatCompletionStream

ChatCompletionStream is an alias to the provider.ChatCompletionStream interface for backward compatibility

type CircuitBreaker added in v0.11.0

type CircuitBreaker struct {
	// contains filtered or unexported fields
}

CircuitBreaker implements the circuit breaker pattern for provider health tracking

func NewCircuitBreaker added in v0.11.0

func NewCircuitBreaker(config CircuitBreakerConfig) *CircuitBreaker

NewCircuitBreaker creates a new circuit breaker with the given configuration. If config has zero values, defaults are used for those fields.

func (*CircuitBreaker) AllowRequest added in v0.11.0

func (cb *CircuitBreaker) AllowRequest() bool

AllowRequest returns true if the request should be allowed to proceed. In closed state, always allows. In open state, allows only after timeout. In half-open state, allows a limited number of test requests.

func (*CircuitBreaker) RecordFailure added in v0.11.0

func (cb *CircuitBreaker) RecordFailure()

RecordFailure records a failed request. May open the circuit if thresholds are exceeded.

func (*CircuitBreaker) RecordSuccess added in v0.11.0

func (cb *CircuitBreaker) RecordSuccess()

RecordSuccess records a successful request. In half-open state, may close the circuit if enough successes.

func (*CircuitBreaker) Reset added in v0.11.0

func (cb *CircuitBreaker) Reset()

Reset resets the circuit breaker to closed state with cleared counters

func (*CircuitBreaker) State added in v0.11.0

func (cb *CircuitBreaker) State() CircuitState

State returns the current state of the circuit breaker

func (*CircuitBreaker) Stats added in v0.11.0

func (cb *CircuitBreaker) Stats() CircuitBreakerStats

Stats returns current statistics for monitoring

type CircuitBreakerConfig added in v0.11.0

type CircuitBreakerConfig struct {
	// FailureThreshold is the number of consecutive failures before opening the circuit.
	// Default: 5
	FailureThreshold int

	// SuccessThreshold is the number of consecutive successes in half-open state
	// required to close the circuit.
	// Default: 2
	SuccessThreshold int

	// Timeout is how long to wait in open state before transitioning to half-open.
	// Default: 30 seconds
	Timeout time.Duration

	// FailureRateThreshold triggers circuit open when the failure rate exceeds this value (0-1).
	// Only evaluated after MinimumRequests is reached.
	// Default: 0.5 (50%)
	FailureRateThreshold float64

	// MinimumRequests is the minimum number of requests before failure rate is evaluated.
	// Default: 10
	MinimumRequests int
}

CircuitBreakerConfig configures circuit breaker behavior

func DefaultCircuitBreakerConfig added in v0.11.0

func DefaultCircuitBreakerConfig() CircuitBreakerConfig

DefaultCircuitBreakerConfig returns a CircuitBreakerConfig with sensible defaults

type CircuitBreakerStats added in v0.11.0

type CircuitBreakerStats struct {
	State                CircuitState
	ConsecutiveFailures  int
	ConsecutiveSuccesses int
	TotalRequests        int
	TotalFailures        int
	FailureRate          float64
	LastFailure          time.Time
	LastStateChange      time.Time
}

CircuitBreakerStats contains statistics about the circuit breaker

type CircuitOpenError added in v0.11.0

type CircuitOpenError struct {
	Provider    string
	State       CircuitState
	LastFailure time.Time
	RetryAfter  time.Duration
}

CircuitOpenError is returned when a request is rejected due to open circuit

func (*CircuitOpenError) Error added in v0.11.0

func (e *CircuitOpenError) Error() string

type CircuitState added in v0.11.0

type CircuitState int

CircuitState represents the state of a circuit breaker

const (
	// CircuitClosed indicates normal operation - requests pass through
	CircuitClosed CircuitState = iota
	// CircuitOpen indicates the circuit is open - requests fail fast
	CircuitOpen
	// CircuitHalfOpen indicates the circuit is testing recovery
	CircuitHalfOpen
)

func (CircuitState) String added in v0.11.0

func (s CircuitState) String() string

String returns the string representation of the circuit state

type ClientConfig

type ClientConfig struct {
	// Providers is an ordered list of providers. Index 0 is the primary provider,
	// and indices 1+ are fallback providers tried in order on retryable errors.
	// This is the preferred way to configure providers.
	//
	// Example:
	//   Providers: []ProviderConfig{
	//       {Provider: ProviderNameOpenAI, APIKey: "openai-key"},      // Primary
	//       {Provider: ProviderNameAnthropic, APIKey: "anthropic-key"}, // Fallback 1
	//       {Provider: ProviderNameGemini, APIKey: "gemini-key"},       // Fallback 2
	//   }
	//
	// For custom providers, use CustomProvider field in ProviderConfig:
	//   Providers: []ProviderConfig{
	//       {CustomProvider: myCustomProvider},
	//   }
	Providers []ProviderConfig

	// CircuitBreakerConfig configures circuit breaker behavior for fallback providers.
	// If nil (default), circuit breaker is disabled.
	// When enabled, providers that fail repeatedly are temporarily skipped.
	CircuitBreakerConfig *CircuitBreakerConfig

	// Memory configuration (optional)
	Memory       kvs.Client
	MemoryConfig *MemoryConfig

	// ObservabilityHook is called before/after LLM calls (optional)
	ObservabilityHook ObservabilityHook

	// Logger for internal logging (optional, defaults to null logger)
	Logger *slog.Logger

	// TokenEstimator enables pre-flight token estimation (optional).
	// Use NewTokenEstimator() to create one with custom configuration.
	TokenEstimator TokenEstimator

	// ValidateTokens enables automatic token validation before requests.
	// When true and TokenEstimator is set, requests that would exceed
	// the model's context window are rejected with TokenLimitError.
	// Default: false
	ValidateTokens bool

	// Cache is the KVS client for response caching (optional).
	// If provided, identical requests will return cached responses.
	// Uses the same kvs.Client interface as Memory.
	Cache kvs.Client

	// CacheConfig configures response caching behavior.
	// If nil, DefaultCacheConfig() is used when Cache is provided.
	CacheConfig *CacheConfig
}

ClientConfig holds configuration for creating a client
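
A hedged configuration sketch combining a primary provider, one fallback, a circuit breaker, and pre-flight token validation; API keys come from environment variables, "os" and "log/slog" are assumed to be imported, and how the resulting ClientConfig is passed to the client constructor is not shown in this section:

// Sketch: a ClientConfig with fallback providers and token validation.
cbCfg := omnillm.DefaultCircuitBreakerConfig()

cfg := omnillm.ClientConfig{
	Providers: []omnillm.ProviderConfig{
		{Provider: omnillm.ProviderNameOpenAI, APIKey: os.Getenv("OPENAI_API_KEY")},       // primary
		{Provider: omnillm.ProviderNameAnthropic, APIKey: os.Getenv("ANTHROPIC_API_KEY")}, // fallback
	},
	CircuitBreakerConfig: &cbCfg,
	TokenEstimator:       omnillm.NewTokenEstimator(omnillm.DefaultTokenEstimatorConfig()),
	ValidateTokens:       true,
	Logger:               slog.Default(),
}
_ = cfg // pass cfg to the client constructor (not shown in this section)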

type ConversationMemory

type ConversationMemory struct {
	SessionID string         `json:"session_id"`
	Messages  []Message      `json:"messages"`
	CreatedAt time.Time      `json:"created_at"`
	UpdatedAt time.Time      `json:"updated_at"`
	Metadata  map[string]any `json:"metadata,omitempty"`
}

ConversationMemory represents stored conversation data

type ErrorCategory added in v0.11.0

type ErrorCategory int

ErrorCategory classifies errors for retry/fallback logic

const (
	// ErrorCategoryUnknown indicates the error type could not be determined
	ErrorCategoryUnknown ErrorCategory = iota
	// ErrorCategoryRetryable indicates the error is transient and the request can be retried
	// Examples: rate limits (429), server errors (5xx), network errors
	ErrorCategoryRetryable
	// ErrorCategoryNonRetryable indicates the error is permanent and retrying won't help
	// Examples: auth errors (401/403), invalid requests (400), not found (404)
	ErrorCategoryNonRetryable
)

func ClassifyError added in v0.11.0

func ClassifyError(err error) ErrorCategory

ClassifyError determines the category of an error for retry/fallback decisions
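
A small sketch of using ClassifyError to drive a retry decision:

// Sketch: decide whether an error is worth retrying.
func shouldRetry(err error) bool {
	switch omnillm.ClassifyError(err) {
	case omnillm.ErrorCategoryRetryable:
		return true // rate limits, 5xx, network errors
	case omnillm.ErrorCategoryNonRetryable:
		return false // auth or validation errors; retrying won't help
	default:
		return false // unknown: be conservative
	}
}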

func (ErrorCategory) String added in v0.11.0

func (c ErrorCategory) String() string

String returns the string representation of the error category

type FallbackAttempt added in v0.11.0

type FallbackAttempt struct {
	// Provider is the name of the provider that was tried
	Provider string

	// Error is the error returned, or nil on success
	Error error

	// Duration is how long the attempt took
	Duration time.Duration

	// Skipped indicates the provider was skipped (e.g., circuit open)
	Skipped bool
}

FallbackAttempt records information about a single fallback attempt

type FallbackError added in v0.11.0

type FallbackError struct {
	// Attempts contains information about each provider attempt
	Attempts []FallbackAttempt

	// LastError is the last error encountered
	LastError error
}

FallbackError is returned when all providers fail

func (*FallbackError) Error added in v0.11.0

func (e *FallbackError) Error() string

func (*FallbackError) Unwrap added in v0.11.0

func (e *FallbackError) Unwrap() error
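
When every provider fails, the returned error can be inspected attempt by attempt. A sketch, where err is the error returned by a chat completion call and "errors" and "log" are assumed to be imported:

// Sketch: inspect a FallbackError after all providers have failed.
var fbErr *omnillm.FallbackError
if errors.As(err, &fbErr) {
	for _, attempt := range fbErr.Attempts {
		if attempt.Skipped {
			log.Printf("provider %s skipped (circuit open)", attempt.Provider)
			continue
		}
		log.Printf("provider %s failed after %s: %v", attempt.Provider, attempt.Duration, attempt.Error)
	}
	log.Printf("last error: %v", fbErr.LastError)
}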

type FallbackProvider added in v0.11.0

type FallbackProvider struct {
	// contains filtered or unexported fields
}

FallbackProvider wraps multiple providers with fallback logic. It implements provider.Provider and tries providers in order until one succeeds.

func NewFallbackProvider added in v0.11.0

func NewFallbackProvider(
	primary provider.Provider,
	fallbacks []provider.Provider,
	config *FallbackProviderConfig,
) *FallbackProvider

NewFallbackProvider creates a provider that tries fallbacks on failure. The primary provider is tried first, then fallbacks in order.

func (*FallbackProvider) CircuitBreaker added in v0.11.0

func (fp *FallbackProvider) CircuitBreaker(providerName string) *CircuitBreaker

CircuitBreaker returns the circuit breaker for a provider, or nil if not configured

func (*FallbackProvider) Close added in v0.11.0

func (fp *FallbackProvider) Close() error

Close closes all providers

func (*FallbackProvider) CreateChatCompletion added in v0.11.0

func (fp *FallbackProvider) CreateChatCompletion(
	ctx context.Context,
	req *provider.ChatCompletionRequest,
) (*provider.ChatCompletionResponse, error)

CreateChatCompletion tries the primary provider first, then fallbacks on retryable errors.

func (*FallbackProvider) CreateChatCompletionStream added in v0.11.0

func (fp *FallbackProvider) CreateChatCompletionStream(
	ctx context.Context,
	req *provider.ChatCompletionRequest,
) (provider.ChatCompletionStream, error)

CreateChatCompletionStream tries the primary provider first, then fallbacks on retryable errors.

func (*FallbackProvider) FallbackProviders added in v0.11.0

func (fp *FallbackProvider) FallbackProviders() []provider.Provider

FallbackProviders returns the fallback providers

func (*FallbackProvider) Name added in v0.11.0

func (fp *FallbackProvider) Name() string

Name returns a composite name indicating fallback configuration

func (*FallbackProvider) PrimaryProvider added in v0.11.0

func (fp *FallbackProvider) PrimaryProvider() provider.Provider

PrimaryProvider returns the primary provider

type FallbackProviderConfig added in v0.11.0

type FallbackProviderConfig struct {
	// CircuitBreakerConfig configures circuit breaker behavior.
	// If nil, circuit breaker is disabled.
	CircuitBreakerConfig *CircuitBreakerConfig

	// Logger for logging fallback events
	Logger *slog.Logger
}

FallbackProviderConfig configures the fallback provider behavior

type LLMCallInfo

type LLMCallInfo struct {
	CallID       string    // Unique identifier for correlating BeforeRequest/AfterResponse
	ProviderName string    // e.g., "openai", "anthropic"
	StartTime    time.Time // When the call started
}

LLMCallInfo provides metadata about the LLM call for observability

type MemoryConfig

type MemoryConfig struct {
	// MaxMessages limits the number of messages to keep in memory per session
	MaxMessages int
	// TTL sets the time-to-live for stored conversations (0 for no expiration)
	TTL time.Duration
	// KeyPrefix allows customizing the key prefix for stored conversations
	KeyPrefix string
}

MemoryConfig holds configuration for conversation memory

func DefaultMemoryConfig

func DefaultMemoryConfig() MemoryConfig

DefaultMemoryConfig returns sensible defaults for memory configuration

type MemoryManager

type MemoryManager struct {
	// contains filtered or unexported fields
}

MemoryManager handles conversation persistence using KVS

func NewMemoryManager

func NewMemoryManager(kvsClient kvs.Client, config MemoryConfig) *MemoryManager

NewMemoryManager creates a new memory manager with the given KVS client and config

func (*MemoryManager) AppendMessage

func (m *MemoryManager) AppendMessage(ctx context.Context, sessionID string, message Message) error

AppendMessage adds a message to the conversation and saves it

func (*MemoryManager) AppendMessages

func (m *MemoryManager) AppendMessages(ctx context.Context, sessionID string, messages []Message) error

AppendMessages adds multiple messages to the conversation and saves it

func (*MemoryManager) CreateConversationWithSystemMessage

func (m *MemoryManager) CreateConversationWithSystemMessage(ctx context.Context, sessionID, systemMessage string) error

CreateConversationWithSystemMessage creates a new conversation with a system message

func (*MemoryManager) DeleteConversation

func (m *MemoryManager) DeleteConversation(ctx context.Context, sessionID string) error

DeleteConversation removes a conversation from memory

func (*MemoryManager) GetMessages

func (m *MemoryManager) GetMessages(ctx context.Context, sessionID string) ([]Message, error)

GetMessages returns just the messages from a conversation

func (*MemoryManager) LoadConversation

func (m *MemoryManager) LoadConversation(ctx context.Context, sessionID string) (*ConversationMemory, error)

LoadConversation retrieves a conversation from memory

func (*MemoryManager) SaveConversation

func (m *MemoryManager) SaveConversation(ctx context.Context, conversation *ConversationMemory) error

SaveConversation stores a conversation in memory

func (*MemoryManager) SetMetadata

func (m *MemoryManager) SetMetadata(ctx context.Context, sessionID string, metadata map[string]any) error

SetMetadata sets metadata for a conversation
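
A short sketch using a MemoryManager that has already been constructed with NewMemoryManager and a kvs.Client (the KVS implementation and its import path are outside this excerpt):

// Sketch: seed a conversation and read its messages back.
func seedConversation(ctx context.Context, mm *omnillm.MemoryManager, sessionID string) ([]omnillm.Message, error) {
	if err := mm.CreateConversationWithSystemMessage(ctx, sessionID, "You are a helpful assistant."); err != nil {
		return nil, err
	}
	return mm.GetMessages(ctx, sessionID)
}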

type Message

type Message = provider.Message

type ModelInfo

type ModelInfo struct {
	ID        string       `json:"id"`
	Provider  ProviderName `json:"provider"`
	Name      string       `json:"name"`
	MaxTokens int          `json:"max_tokens"`
}

ModelInfo represents information about a model

func GetModelInfo

func GetModelInfo(modelID string) *ModelInfo

GetModelInfo returns model information
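
For example (the model ID is illustrative, and "fmt" is assumed to be imported):

// Sketch: look up model metadata before choosing request limits.
if info := omnillm.GetModelInfo("gpt-4o"); info != nil {
	fmt.Printf("%s via %s, max tokens %d\n", info.Name, info.Provider, info.MaxTokens)
}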

type ObservabilityHook

type ObservabilityHook interface {
	// BeforeRequest is called before each LLM call.
	// Returns a new context for trace/span propagation.
	// The hook should not modify the request.
	BeforeRequest(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest) context.Context

	// AfterResponse is called after each LLM call completes.
	// This is called for both successful and failed requests.
	AfterResponse(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest, resp *provider.ChatCompletionResponse, err error)

	// WrapStream wraps a stream for observability.
	// This allows the hook to observe streaming responses.
	// The returned stream must implement the same interface as the input.
	//
	// Note: For streaming, AfterResponse is only called if stream creation fails.
	// To track streaming completion timing and content, the wrapper returned here
	// should handle Close() or detect EOF in Recv() to finalize metrics/traces.
	WrapStream(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest, stream provider.ChatCompletionStream) provider.ChatCompletionStream
}

ObservabilityHook allows external packages to observe LLM calls. Implementations can use this to add tracing, logging, or metrics without modifying the core OmniLLM library.
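
A minimal hook sketch that logs call latency with slog; it passes streams through unchanged, so streaming completion is not timed here. It assumes "context", "log/slog", and "time" are imported alongside omnillm and its provider subpackage:

// Sketch: a logging-only ObservabilityHook.
type loggingHook struct {
	logger *slog.Logger
}

func (h *loggingHook) BeforeRequest(ctx context.Context, info omnillm.LLMCallInfo, req *provider.ChatCompletionRequest) context.Context {
	h.logger.Info("llm call start", "call_id", info.CallID, "provider", info.ProviderName)
	return ctx // a tracing hook would return a context carrying a span here
}

func (h *loggingHook) AfterResponse(ctx context.Context, info omnillm.LLMCallInfo, req *provider.ChatCompletionRequest, resp *provider.ChatCompletionResponse, err error) {
	h.logger.Info("llm call done",
		"call_id", info.CallID,
		"provider", info.ProviderName,
		"duration", time.Since(info.StartTime),
		"error", err)
}

func (h *loggingHook) WrapStream(ctx context.Context, info omnillm.LLMCallInfo, req *provider.ChatCompletionRequest, stream provider.ChatCompletionStream) provider.ChatCompletionStream {
	// Pass through unchanged; wrap the stream to observe Recv/Close if needed.
	return stream
}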

type Provider

type Provider = provider.Provider

Provider is an alias to the provider.Provider interface for backward compatibility

type ProviderConfig added in v0.11.0

type ProviderConfig struct {
	// Provider is the provider type (e.g., ProviderNameOpenAI).
	// Ignored if CustomProvider is set.
	Provider ProviderName

	// APIKey is the API key for the provider
	APIKey string

	// BaseURL is an optional custom base URL
	BaseURL string

	// Region is for providers that require a region (e.g., AWS Bedrock)
	Region string

	// Timeout sets the HTTP client timeout for this provider
	Timeout time.Duration

	// HTTPClient is an optional custom HTTP client
	HTTPClient *http.Client

	// Extra holds provider-specific configuration
	Extra map[string]any

	// CustomProvider allows injecting a custom provider implementation.
	// When set, Provider, APIKey, BaseURL, etc. are ignored.
	CustomProvider provider.Provider
}

ProviderConfig holds configuration for a single provider instance. Used in the Providers slice where index 0 is primary and 1+ are fallbacks.

type ProviderName

type ProviderName string

ProviderName represents the different LLM provider names

const (
	ProviderNameOpenAI    ProviderName = "openai"
	ProviderNameAnthropic ProviderName = "anthropic"
	ProviderNameBedrock   ProviderName = "bedrock"
	ProviderNameOllama    ProviderName = "ollama"
	ProviderNameGemini    ProviderName = "gemini"
	ProviderNameXAI       ProviderName = "xai"
)

type Role

type Role = provider.Role

Type aliases for backward compatibility and convenience

type TokenEstimator added in v0.11.0

type TokenEstimator interface {
	// EstimateTokens estimates the token count for a set of messages.
	// The estimate may not be exact but should be reasonably close.
	EstimateTokens(model string, messages []provider.Message) (int, error)

	// GetContextWindow returns the maximum context window size for a model.
	// Returns 0 if the model is unknown.
	GetContextWindow(model string) int
}

TokenEstimator estimates token counts for messages before sending to the API. This is useful for validating requests won't exceed model limits.

func NewTokenEstimator added in v0.11.0

func NewTokenEstimator(config TokenEstimatorConfig) TokenEstimator

NewTokenEstimator creates a new token estimator with the given configuration. If config has zero values, defaults are used for those fields.

type TokenEstimatorConfig added in v0.11.0

type TokenEstimatorConfig struct {
	// CharactersPerToken is the average number of characters per token.
	// Default: 4.0 (reasonable for English text)
	// Lower values (e.g., 3.0) give more conservative estimates.
	CharactersPerToken float64

	// CustomContextWindows allows overriding context window sizes for specific models.
	// Keys should be model IDs (e.g., "gpt-4o", "claude-3-opus").
	CustomContextWindows map[string]int

	// TokenOverheadPerMessage is extra tokens added per message for formatting.
	// Default: 4 (accounts for role, separators, etc.)
	TokenOverheadPerMessage int
}

TokenEstimatorConfig configures token estimation behavior

func DefaultTokenEstimatorConfig added in v0.11.0

func DefaultTokenEstimatorConfig() TokenEstimatorConfig

DefaultTokenEstimatorConfig returns a TokenEstimatorConfig with sensible defaults

type TokenLimitError added in v0.11.0

type TokenLimitError struct {
	// EstimatedTokens is the estimated prompt token count
	EstimatedTokens int

	// ContextWindow is the model's maximum context window
	ContextWindow int

	// AvailableTokens is how many tokens are available (may be negative)
	AvailableTokens int

	// Model is the model ID
	Model string
}

TokenLimitError is returned when a request exceeds token limits

func (*TokenLimitError) Error added in v0.11.0

func (e *TokenLimitError) Error() string
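
When ValidateTokens is enabled on the client, an oversized request surfaces this error. A handling sketch, assuming "errors" and "log" are imported and err is the error from a chat completion call:

// Sketch: detect a token-limit rejection.
var tlErr *omnillm.TokenLimitError
if errors.As(err, &tlErr) {
	log.Printf("prompt of ~%d tokens does not fit %s's %d-token context window",
		tlErr.EstimatedTokens, tlErr.Model, tlErr.ContextWindow)
}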

type TokenValidation added in v0.11.0

type TokenValidation struct {
	// EstimatedTokens is the estimated prompt token count
	EstimatedTokens int

	// ContextWindow is the model's maximum context window
	ContextWindow int

	// MaxCompletionTokens is the requested max completion tokens
	MaxCompletionTokens int

	// AvailableTokens is how many tokens are available for completion
	// (ContextWindow - EstimatedTokens)
	AvailableTokens int

	// ExceedsLimit is true if the prompt exceeds the context window
	ExceedsLimit bool

	// ExceedsWithCompletion is true if prompt + max_tokens exceeds context
	ExceedsWithCompletion bool
}

TokenValidation contains the result of token validation

func ValidateTokens added in v0.11.0

func ValidateTokens(
	estimator TokenEstimator,
	model string,
	messages []provider.Message,
	maxCompletionTokens int,
) (*TokenValidation, error)

ValidateTokens checks if the request fits within model limits. Returns validation details including whether limits are exceeded.
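
A sketch of a pre-flight check using a default estimator; msgs stands for whatever message slice you are about to send, and "fmt" is assumed to be imported:

// Sketch: validate a request against the model's context window before sending.
func checkBudget(model string, msgs []omnillm.Message, maxCompletion int) error {
	est := omnillm.NewTokenEstimator(omnillm.DefaultTokenEstimatorConfig())
	v, err := omnillm.ValidateTokens(est, model, msgs, maxCompletion)
	if err != nil {
		return err
	}
	if v.ExceedsLimit {
		return fmt.Errorf("prompt (~%d tokens) exceeds the %d-token context window", v.EstimatedTokens, v.ContextWindow)
	}
	if v.ExceedsWithCompletion {
		return fmt.Errorf("prompt plus %d completion tokens exceeds the context window; ~%d tokens remain for completion",
			maxCompletion, v.AvailableTokens)
	}
	return nil
}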

type Tool

type Tool = provider.Tool

type ToolCall

type ToolCall = provider.ToolCall

type ToolFunction

type ToolFunction = provider.ToolFunction

type ToolSpec

type ToolSpec = provider.ToolSpec

type Usage

type Usage = provider.Usage

Directories

Path	Synopsis
examples
	basic	command
	conversation	command
	custom_provider	command
	gemini	command
	memory_demo	command
	ollama	command
	providers_demo	command
	streaming	command
	xai	command
models	Package models provides a comprehensive catalog of LLM model identifiers and documentation references for all supported providers.
provider	Package provider defines the core interfaces that external LLM providers must implement.
providers
	anthropic	Package anthropic provides Anthropic provider adapter for the OmniLLM unified interface
	gemini	Package gemini provides Google Gemini provider adapter for the OmniLLM unified interface
	ollama	Package ollama provides Ollama provider adapter for the OmniLLM unified interface
	openai	Package openai provides OpenAI provider adapter for the OmniLLM unified interface
	xai	Package xai provides X.AI Grok provider adapter for the OmniLLM unified interface
testing	Package testing provides mock implementations for testing
