feat: add MiniMax provider support
archestra-ai/archestra#2527

[Provider] Add MiniMax Support

Closes #1855 /claim #1855

Summary

This PR adds complete support for MiniMax AI as an LLM provider, covering the backend proxy, frontend chat integration, model pricing, and E2E tests.

Changes

Backend Integration

Core Implementation:

  • Adapter (backend/src/routes/proxy/adapterV2/minimax.ts)

    • OpenAI-compatible request/response adapter with MiniMax-specific extensions
    • Streaming support with SSE parsing
    • Token estimation using tiktoken (MiniMax streaming returns usage: null)
    • Reasoning content extraction from reasoning_details array
    • Tool call handling and formatting
    • TOON compression for tool results
    • Model optimization support
    • Custom error handling for MiniMax’s unique error format
  • Type Definitions (backend/src/types/llm-providers/minimax/ - 4 files)

    • Complete Zod schemas for API types (api.ts, messages.ts, tools.ts, index.ts)
    • Message, tool, and request/response schemas
    • Extended schema for reasoning_details thinking content
    • Stream chunk type allows role: "" (empty string) - MiniMax quirk
    • OpenAI-compatible format with MiniMax extensions
  • Routes (backend/src/routes/proxy/routesv2/minimax.ts)

    • HTTP proxy endpoints with agent support
    • Chat completions (streaming and non-streaming)
    • Error handling and validation
    • Workaround for missing /v1/models endpoint - uses hardcoded model list
  • Database Migration (backend/src/database/migrations/0131_add_minimax_token_prices.sql)

    • Token pricing for all MiniMax models (cost math sketched after this list)
    • MiniMax-M2: $0.30 input / $1.20 output per million tokens
    • MiniMax-M2.1: $0.30 input / $1.20 output per million tokens
    • MiniMax-M2.1-lightning: $0.30 input / $2.40 output per million tokens
  • Models Dev Client Integration (backend/src/clients/models-dev-client.ts)

    • Added minimax provider mapping for automatic price sync
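
For reference, the cost math implied by the migration's per-million-token prices, as a minimal TypeScript sketch (the pricing table and helper here are illustrative, not the actual database schema):

```typescript
// Illustrative pricing table mirroring the migration values (USD per million tokens).
const MINIMAX_PRICES: Record<string, { input: number; output: number }> = {
  "MiniMax-M2": { input: 0.3, output: 1.2 },
  "MiniMax-M2.1": { input: 0.3, output: 1.2 },
  "MiniMax-M2.1-lightning": { input: 0.3, output: 2.4 },
};

// Cost in USD for a single interaction.
function interactionCost(model: string, inputTokens: number, outputTokens: number): number {
  const price = MINIMAX_PRICES[model];
  if (!price) throw new Error(`Unknown MiniMax model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Example: 10k input + 2k output tokens on MiniMax-M2 costs $0.0054.
console.log(interactionCost("MiniMax-M2", 10_000, 2_000));
```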

Frontend Integration

  • Interaction Handler (frontend/src/lib/llmProviders/minimax.ts)

    • Message parsing and rendering
    • Reasoning content display (thinking)
    • Tool call display
    • Token usage calculation
    • Stop reason handling (sketched after this list)
  • UI Components

    • Model selector integration
    • API key form with MiniMax branding
    • Provider display name
    • Chat interface support
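
As an illustration of the stop-reason handling above, a minimal sketch (the UI labels and helper name are hypothetical; MiniMax follows the OpenAI finish_reason convention):

```typescript
// Hypothetical mapping from OpenAI-style finish_reason values to display labels.
type StopReason = "stop" | "length" | "tool_calls" | null;

function describeStopReason(reason: StopReason): string {
  switch (reason) {
    case "stop":
      return "Completed";
    case "length":
      return "Truncated: max tokens reached";
    case "tool_calls":
      return "Paused: waiting for tool results";
    default:
      return "Stream ended without a finish reason";
  }
}
```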

Key MiniMax Differences

1. No Models Endpoint

MiniMax doesn’t provide a /v1/models endpoint. We use a hardcoded list:

  • MiniMax-M2
  • MiniMax-M2.1
  • MiniMax-M2.1-lightning (fastest)
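
A minimal sketch of the workaround, shaped like an OpenAI /v1/models response so existing UI code can consume it (the authoritative list lives in backend/src/routes/proxy/routesv2/minimax.ts):

```typescript
// Hardcoded stand-in for the missing /v1/models endpoint.
const MINIMAX_MODELS = [
  { id: "MiniMax-M2", object: "model", owned_by: "minimax" },
  { id: "MiniMax-M2.1", object: "model", owned_by: "minimax" },
  { id: "MiniMax-M2.1-lightning", object: "model", owned_by: "minimax" },
] as const;

// Mimics the OpenAI list format: { object: "list", data: [...] }.
function listModels() {
  return { object: "list", data: MINIMAX_MODELS };
}
```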

2. Streaming Usage Data

Problem: The MiniMax streaming API returns "usage": null in every chunk, so no token counts are available.
Solution: Token estimation using tiktoken that:

  • Counts input tokens from request messages
  • Counts output tokens from accumulated text, reasoning, and tool calls
  • Runs automatically at stream end to ensure interactions are logged
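
A minimal sketch of the estimation approach with tiktoken (the cl100k_base encoding is an approximation, since MiniMax does not publish its tokenizer, and the function is illustrative rather than the adapter's exact code):

```typescript
import { get_encoding } from "tiktoken";

// Rough token estimate for a MiniMax stream, run once at stream end.
function estimateUsage(
  requestMessages: { role: string; content: string }[],
  accumulated: { text: string; reasoning: string; toolCallJson: string },
) {
  const enc = get_encoding("cl100k_base");
  try {
    // Input tokens: sum over all request messages.
    const inputTokens = requestMessages.reduce(
      (sum, m) => sum + enc.encode(m.content).length,
      0,
    );
    // Output tokens: accumulated text + reasoning + serialized tool calls.
    const outputTokens =
      enc.encode(accumulated.text).length +
      enc.encode(accumulated.reasoning).length +
      enc.encode(accumulated.toolCallJson).length;
    return { inputTokens, outputTokens };
  } finally {
    enc.free(); // the WASM encoder must be freed explicitly
  }
}
```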

3. Reasoning Details

MiniMax supports extended thinking with reasoning_details array:

  • Enable with extra_body: { reasoning_split: true }
  • Returns thinking process separate from main content
  • Adapter extracts reasoning deltas and accumulates full text
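
A minimal sketch of enabling and accumulating the reasoning stream (the reasoning_split flag comes from the MiniMax extension described above; the delta field shapes are assumptions, not the adapter's exact types):

```typescript
// Request enabling split reasoning; reasoning_split goes through extra_body
// when calling via the OpenAI SDK.
const request = {
  model: "MiniMax-M2.1",
  stream: true,
  messages: [{ role: "user", content: "Why is the sky blue?" }],
  reasoning_split: true,
};

// Accumulate reasoning deltas separately from the main content.
// The shape of reasoning_details items is assumed here.
let reasoning = "";
let content = "";

function onDelta(delta: {
  content?: string;
  reasoning_details?: { text?: string }[];
}): void {
  content += delta.content ?? "";
  for (const d of delta.reasoning_details ?? []) {
    reasoning += d.text ?? "";
  }
}
```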

4. Stream Chunk Role Field

MiniMax sends "role": "" (empty string) in some stream chunks instead of "role": "assistant". Type schema updated to allow both.
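
A minimal illustration of how a Zod schema can admit the empty-string role (the full delta schema lives in backend/src/types/llm-providers/minimax/):

```typescript
import { z } from "zod";

// MiniMax sometimes emits "role": "" in stream chunks instead of "assistant",
// so the delta role accepts both, and may be absent entirely.
const streamDeltaRole = z.union([z.literal("assistant"), z.literal("")]).optional();

const streamDelta = z.object({
  role: streamDeltaRole,
  content: z.string().optional(),
});
```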

Streaming Support

  • Non-streaming responses: Fully supported with native usage data
  • Streaming responses: Fully supported with SSE + token estimation
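
For reference, the SSE framing the streaming path has to parse, as a minimal sketch (standard OpenAI-style chat-completions streaming; the parser here is illustrative, not the adapter's exact code):

```typescript
// Each SSE event line is "data: <json chunk>"; the stream terminates with
// "data: [DONE]". Non-data lines (comments, blanks) are ignored.
function parseSseLine(line: string): Record<string, unknown> | "done" | null {
  if (!line.startsWith("data:")) return null;
  const payload = line.slice("data:".length).trim();
  if (payload === "[DONE]") return "done";
  return JSON.parse(payload) as Record<string, unknown>;
}
```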

Feature Completeness

LLM Proxy ✅

| Feature | Status | Notes |
| --- | --- | --- |
| Tool invocation | ✅ Supported | Full tool call format conversion |
| Tool persistence | ✅ Supported | Tool calls maintained across the conversation |
| Token/cost limits | ✅ Supported | Enforced per-request limits |
| Model optimization | ✅ Supported | Can switch to cheaper models |
| Tool results compression | ✅ Supported | TOON compression implemented |
| Dual LLM verification | ✅ Supported | Works with optimization rules |
| Metrics and observability | ✅ Supported | Full Prometheus metrics integration |
| Token estimation | ✅ Supported | Automatic for streaming (usage is null) |

Chat ✅

| Feature | Status | Notes |
| --- | --- | --- |
| Chat conversations | ✅ Works | Full message history support |
| Model listing | ✅ Works | Hardcoded model list (no API endpoint) |
| Model selection | ✅ Works | Dropdown with all available models |
| Streaming responses | ✅ Works | Real-time SSE streaming |
| Reasoning content | ✅ Works | Displays thinking process separately |
| Error handling | ✅ Works | MiniMax-specific error codes handled |
| API key management | ✅ Works | Personal/team/org-wide key hierarchy |
| Conversation titles | ✅ Works | Auto-generation using a fast model |
| Token tracking | ✅ Works | Estimated tokens logged to interactions |

Testing Infrastructure

  • E2E Test Coverage (5 test suites)
    • Tool invocation policies (tool-invocation.spec.ts)
    • Tool persistence (tool-persistence.spec.ts)
    • Tool result compression (tool-result-compression.spec.ts)
    • Model optimization (model-optimization.spec.ts)
    • Token cost limits (token-cost-limits.spec.ts)
  • WireMock Stubs (13 mapping files in helm/e2e-tests/mappings/)
    • Model listing
    • Chat completions with various scenarios
    • Tool invocation test cases
    • Compression and optimization scenarios
    • Cost limit testing
  • CI Configuration (.github/values-ci.yaml)

API Key Instructions

Obtaining an API Key

Visit the MiniMax Platform and generate an API key (a minimum recharge of $25 is required to use the API).

Demo Video

https://github.com/user-attachments/assets/4df301c2-be8b-425c-991c-d883743c284e

Testing Done

  • ✅ Chat conversation with MiniMax-M2.1-lightning
  • ✅ Verified streaming response with curl to actual API
  • ✅ Verified non-streaming response with curl to actual API
  • ✅ Token estimation accuracy verified (input/output/reasoning tokens)
  • ✅ Conversation title generation works
  • ✅ Interactions logged with cost data
  • ✅ Model listing shows all MiniMax models
  • ✅ pnpm lint passes
  • ✅ pnpm type-check passes

Documentation

Updated platform-supported-llm-providers.md
