Skip to content
GitHubBuy Me A Coffee

Providers & models

Vault Operator supports 12 AI providers. Setup instructions for each one follow.

For all providers, open Settings > Vault Operator > Providers, click "+ Add provider", and pick your provider type.

Cloud providers

Anthropic

What you needAPI key from console.anthropic.com
Recommended modelsClaude Sonnet 4.6 (best overall), Claude Haiku (fast and cheap)
EmbeddingNot available natively. Use OpenAI for embeddings.

Setup:

  1. Create an account at console.anthropic.com
  2. Go to API Keys and create a new key
  3. In Vault Operator, select Anthropic as provider, paste the key, and pick a model

Best tool use

Anthropic models are the most reliable at calling Vault Operator's tools correctly. If quality matters most, start here.

OpenAI

What you needAPI key from platform.openai.com
Recommended modelsGPT-4o (balanced), o3 (reasoning), GPT-4o-mini (budget)
EmbeddingNative support. text-embedding-3-small recommended.

Setup:

  1. Create an account at platform.openai.com
  2. Go to API Keys and generate a new key
  3. In Vault Operator, select OpenAI as provider, paste the key, and pick a model

Embedding models

An OpenAI key also gives you access to embedding models for semantic search. Configure in Settings > Embeddings.

Google Gemini

What you needAPI key from Google AI Studio
Recommended modelsGemini 2.5 Flash (fast, free tier available), Gemini 2.5 Pro (best quality)
EmbeddingNot available natively

Setup:

  1. Go to Google AI Studio and sign in with your Google account
  2. Click Create API Key and copy it
  3. In Vault Operator, select Google Gemini as provider, paste the key
  4. Browse available models or pick from the pre-configured list

Free tier

Google Gemini has a free tier with reasonable rate limits. Good starting point if you want to try Vault Operator without paying.

OpenRouter

What you needAPI key from openrouter.ai
Recommended modelsAny. OpenRouter gives access to 100+ models from multiple providers.
EmbeddingNot available

Setup:

  1. Create an account at openrouter.ai
  2. Go to Keys and create a new API key
  3. In Vault Operator, select OpenRouter as provider, paste the key
  4. Browse or type any model ID (e.g., anthropic/claude-sonnet-4.6, google/gemini-2.5-pro)

Azure OpenAI

What you needAzure subscription, a deployed model, API key, and endpoint URL
Recommended modelsGPT-4o (deployed in your Azure region)
EmbeddingNative support via deployed embedding model

Setup:

  1. Deploy a model in your Azure OpenAI resource
  2. Copy the endpoint URL, API key, and deployment name
  3. In Vault Operator, select Azure OpenAI as provider and fill in all three fields

Enterprise use

Azure OpenAI fits organizations with compliance requirements. Data stays inside your Azure tenant.

Amazon Bedrock

What you needAWS account with Bedrock enabled, IAM user with invoke permissions, access key ID + secret access key
Recommended modelsClaude Sonnet 4.5, Claude Opus 4.5, Amazon Nova (via cross-region inference profiles)
EmbeddingNot supported in phase 1. Use OpenAI or Ollama for embeddings
Regionseu-central-1, eu-west-1, eu-west-2, eu-west-3, eu-north-1, us-east-1, us-east-2, us-west-2, plus Asia Pacific

Setup:

  1. In the AWS console, open Bedrock in your preferred region. For the EU, Frankfurt (eu-central-1) is the most common choice
  2. Go to Model access and request access to the model families you want to use. Approval is usually instant for the major foundation models
  3. Create an IAM user (or role) with a policy that allows these actions:
    json
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*"
    }
  4. For EU cross-region inference profiles (recommended), the resource ARN pattern covers all EU regions. For a more restricted policy, scope it to the specific inference profile ARNs you use
  5. Generate an access key ID and secret access key for the user and copy both
  6. In Vault Operator, select Amazon Bedrock as provider, pick your region, and paste the credentials. Use the quick pick dropdown to select a model

Cross-region inference profiles

Model IDs prefixed with eu. or us. are cross-region inference profiles. They route requests across the regions in that geography for higher availability. In Europe, eu.anthropic.claude-sonnet-4-5-20250929-v1:0 is the recommended default. It works from any EU region and keeps data inside the EU.

Direct regional model IDs (without a prefix) only work in the specific region that hosts the model. Frankfurt supports a smaller direct model list than the EU inference profiles do.

Temporary credentials

For AWS SSO or STS-issued credentials, fill the session token field as well. Long-lived IAM user credentials don't need it.

Billing

Bedrock bills per-token directly through your AWS account. There is no free tier for most foundation models. Check the AWS Bedrock pricing page before heavy use.

Gateway providers

ChatGPT (OAuth)

What you needAn active ChatGPT Plus or Pro subscription
Available modelsgpt-5, gpt-5.1, gpt-5.2, gpt-5-codex, gpt-5-codex-mini, gpt-5.1-codex variants, gpt-5.2-codex, gpt-5.3-codex (Codex-backend lineup)
EmbeddingNot available

Setup (OAuth PKCE loopback flow, desktop only):

  1. In Vault Operator, select ChatGPT (OAuth) as provider
  2. Click "Sign in with ChatGPT". The browser opens with auth.openai.com.
  3. Sign in with the same account that holds your ChatGPT Plus / Pro subscription
  4. After approval the browser redirects to a localhost callback the plugin opened for the duration of the flow. The tab closes itself.
  5. Click "Refresh" to load the Codex model lineup, then map the tiers (Budget / Main / Frontier)

Behind the scenes the plugin routes requests through chatgpt.com/backend-api/codex/responses, the same endpoint that the Codex CLI uses. Tokens are stored encrypted via your OS keychain (safeStorage). Refresh tokens auto-renew before expiry.

Covered by your subscription

ChatGPT-OAuth bills against your existing Plus / Pro plan, not against an OpenAI API key. There is no per-token cost; rate limits follow the subscription tier. The plugin still tracks the equivalent API cost in the sidebar footer for transparency.

Reasoning effort fixed at low

GPT-5 family models require a reasoning block in every request. Vault Operator sends reasoning: { effort: 'low' }, the narrowest value accepted across the family. This minimises latency and cost. Higher reasoning effort is not currently exposed as a setting; if you need it for a specific task, use the OpenAI API provider with a gpt-5*-pro model via the standard /v1/responses endpoint.

GitHub Copilot

What you needAn active GitHub Copilot subscription (Individual, Business, or Enterprise)
Recommended modelsGPT-4o, Claude Sonnet (available through Copilot)
EmbeddingNot available

Setup (OAuth device flow):

  1. In Vault Operator, select GitHub Copilot as provider
  2. Click "Sign in with GitHub". A device code appears.
  3. Open github.com/login/device in your browser
  4. Enter the code and authorize the app
  5. Vault Operator automatically detects your available models

No extra cost

If you already pay for GitHub Copilot, this costs nothing extra. The models come with your subscription.

Kilo Gateway

What you needA Kilo Code account with gateway access
Recommended modelsDepends on your organization's available models
EmbeddingNot available

Setup (device auth, recommended):

  1. In Vault Operator, select Kilo Gateway as provider
  2. Click "Sign in". A device code and URL appear.
  3. Open the URL in your browser, enter the code, and authorize
  4. Models are loaded dynamically from your organization

Setup (manual token):

  1. Obtain a gateway token from your Kilo Code admin
  2. In Vault Operator, select Kilo Gateway and choose "Manual Token"
  3. Paste the token. Models load automatically.

Local providers

Ollama

What you needOllama installed on your machine
Recommended modelsQwen 2.5 7B (balanced), Llama 3.2 (general), Codestral (code)
EmbeddingSupported via nomic-embed-text or similar

Setup:

  1. Install Ollama from ollama.ai
  2. Pull a model: ollama pull qwen2.5:7b
  3. In Vault Operator, select Ollama as provider. No API key needed.
  4. The Base URL field pre-fills with http://localhost:11434; adjust only if you run Ollama on a non-default port.
  5. Click "Refresh" to populate the model list from Ollama's native /api/tags endpoint.

Privacy

With Ollama, no data leaves your machine. Good for sensitive vaults.

LM Studio

What you needLM Studio installed with a model loaded
Recommended modelsAny GGUF model from the built-in catalog
EmbeddingSupported for compatible models

Setup:

  1. Install LM Studio from lmstudio.ai
  2. Download a model from the catalog and load it
  3. Start the local server (LM Studio > Developer tab)
  4. In Vault Operator, select LM Studio as provider. No API key needed.
  5. The Base URL field pre-fills with http://localhost:1234; adjust only if you changed the server port.

Custom endpoint

What you needAny OpenAI-compatible API endpoint
Recommended modelsDepends on the server
EmbeddingDepends on the server

Setup:

  1. In Vault Operator, select Custom as provider
  2. Enter the base URL (e.g., http://localhost:8080/v1)
  3. Enter an API key if your server requires one
  4. Type the model name exactly as the server expects

This works with any server that implements the OpenAI chat completions API: vLLM, text-generation-inference, LocalAI, and self-hosted endpoints.

Provider comparison

ProviderAuthCostPrivacyEmbeddingBest for
AnthropicAPI keyPay-per-useCloudNoBest quality
OpenAIAPI keyPay-per-useCloudYesStructured output, embeddings
Google GeminiAPI keyFree tier + payCloudNoFree starting point
OpenRouterAPI keyPay-per-useCloudNoModel variety
Azure OpenAIAPI key + endpointEnterpriseEnterprise tenantYesCompliance
Amazon BedrockIAM access keyPay-per-use via AWSCloud (your AWS account)NoEU data residency via eu-central-1
ChatGPT (OAuth)OAuth (PKCE)Plus / Pro subscriptionCloudNoExisting ChatGPT subscribers, Codex-line models
GitHub CopilotOAuthSubscriptionCloudNoExisting subscribers
Kilo GatewayDevice auth / tokenOrganizationCloudNoTeam deployments
OllamaNoneFreeFully localYesPrivacy, offline
LM StudioNoneFreeFully localYesVisual model browser
CustomVariesVariesVariesVariesSelf-hosted setups