Add Google VertexAI Provider Support

This PR adds support for Google VertexAI models (Gemini) as a new provider in the Hub, allowing users to route their LLM requests to Google's models through our unified API.
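
Since the Hub is compatible with existing OpenAI SDK clients (see Notes below), a routed request looks like any other OpenAI call. A minimal sketch; the base URL, API key, and model name are illustrative placeholders, not values fixed by this PR:

    from openai import OpenAI

    # Point the standard OpenAI client at the Hub (placeholder endpoint/key).
    client = OpenAI(
        base_url="http://localhost:3000/api/v1",
        api_key="hub-api-key",
    )

    # The model name tells the Hub to route this to the VertexAI provider.
    response = client.chat.completions.create(
        model="gemini-1.5-pro",
        messages=[{"role": "user", "content": "Say hello from Gemini."}],
    )
    print(response.choices[0].message.content)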

Definition of Done

  • Implementation of VertexAI provider
    • Chat completions endpoint with Gemini models
    • Text completions endpoint
    • Embeddings endpoint with textembedding-gecko
    • Tool/function calling support
    • Streaming support
    • Multi-modal support
  • Unit tests
    • Provider configuration tests
    • Request/response conversion tests
    • Error handling tests
    • Model format conversion tests
  • Integration tests
    • Chat completions test with recorded responses
    • Completions test with recorded responses
    • Embeddings test with recorded responses
    • Tool calling test with recorded responses
    • Test cassettes for offline testing
    • Quota retry mechanism
  • Documentation
    • Configuration examples in README
    • Authentication methods (API key and service account)
    • Model support documentation
    • Usage examples with OpenAI SDK
  • Configuration
    • Example configuration in config-example.yaml (see the sketch after this list)
    • Support for both API key and service account auth
    • Required parameters (project_id)
    • Optional parameters (location, credentials_path)
  • Error Handling
    • Proper status code mapping from Google API
    • Informative error messages
    • Quota limit handling with configurable retries
    • Authentication error handling
  • Code Review and Approval
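
As referenced in the Configuration item above, a minimal configuration sketch. Only project_id, location, and credentials_path are parameter names stated in this PR; the provider key and the api_key field are illustrative assumptions, and the authoritative example lives in config-example.yaml:

    providers:
      vertexai:                      # provider key name assumed
        project_id: my-gcp-project   # required
        location: us-central1        # optional; defaults to us-central1
        # Auth option 1: API key (field name illustrative)
        api_key: ${VERTEXAI_API_KEY}
        # Auth option 2: service account JSON key file
        # credentials_path: ../credentials/vertexai-key.json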

Changes Made

  1. Added new VertexAI provider implementation
    • Full support for Gemini models (chat, completion)
    • Text embeddings with textembedding-gecko
    • Tool/function calling with proper mapping (see the example after this list)
    • Streaming support for real-time responses
    • Multi-modal capabilities for image+text inputs
  2. Added comprehensive test suite
    • Unit tests for core functionality
    • Integration tests with recorded responses
    • Test cassettes for offline testing
    • Quota retry mechanism for rate limits
    • Both auth methods covered
  3. Updated documentation
    • Detailed VertexAI setup instructions
    • Authentication configuration guide
    • Model compatibility list
    • Usage examples with OpenAI SDK
    • Configuration parameter reference
  4. Added robust configuration support
    • API key authentication
    • Service account authentication (JSON key file)
    • Project ID configuration
    • Regional endpoint configuration
    • Default values for optional parameters
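
The tool-calling example referenced in item 1, as a hedged sketch using the Python OpenAI SDK; the client settings and the tool itself are hypothetical:

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:3000/api/v1", api_key="hub-api-key")  # placeholders

    # A hypothetical tool in the OpenAI function-calling format; the Hub
    # maps it to VertexAI function declarations and converts the model's
    # function calls back into OpenAI-style tool_calls.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="gemini-1.5-pro",
        messages=[{"role": "user", "content": "What's the weather in Lagos?"}],
        tools=tools,
    )
    print(response.choices[0].message.tool_calls)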

Testing

  • Unit tests: cargo test
  • Integration tests with credentials:
    # Service Account Auth
    export VERTEXAI_CREDENTIALS_PATH="../credentials/vertexai-key.json"
    
    # Record new responses
    RECORD_MODE=1 cargo test
    
    # Replay mode (default)
    cargo test
    
  • Test retry configuration:
    export RETRY_DELAY=60  # Seconds between retries
    

Security Considerations

  • Credentials stored outside repository
  • Test cassettes cleaned of sensitive data
  • Support for both API key and service account auth
  • Environment variable support for credentials
  • No hardcoded sensitive values

Notes

  • Default location is us-central1 (configurable)
  • Automatic retry on quota limits (configurable)
  • Test cassettes provided for offline development
  • Compatible with existing OpenAI SDK clients (see the sketch below)
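
A rough illustration of that compatibility, covering streaming and embeddings; apart from textembedding-gecko, the model names, base URL, and key are assumptions:

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:3000/api/v1", api_key="hub-api-key")  # placeholders

    # Streaming: chunks arrive incrementally, as with any OpenAI-compatible backend.
    stream = client.chat.completions.create(
        model="gemini-1.5-flash",
        messages=[{"role": "user", "content": "Write a haiku about quotas."}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)

    # Embeddings use the model named in this PR.
    embedding = client.embeddings.create(
        model="textembedding-gecko",
        input="The quick brown fox",
    )
    print(len(embedding.data[0].embedding))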

Fixes #19 /claim #19
