📄️ OpenAI
LiteLLM supports OpenAI Chat + Embedding calls.
📄️ OpenAI (Text Completion)
LiteLLM supports OpenAI text completion models.
📄️ OpenAI-Compatible Endpoints
To call models hosted behind an OpenAI-compatible proxy, make two changes:
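As a rough sketch (the proxy URL and model name below are placeholders, not real endpoints), the two changes are: prefix the model name with `openai/`, and point `api_base` at your proxy:

```python
# Hypothetical sketch of calling a model behind an OpenAI-compatible proxy
# with LiteLLM. PROXY_URL and the model name are placeholders.
PROXY_URL = "http://0.0.0.0:8000"

def call_via_proxy(prompt: str):
    import litellm  # pip install litellm

    return litellm.completion(
        model="openai/my-hosted-model",  # change 1: "openai/" prefix
        api_base=PROXY_URL,              # change 2: custom base URL
        messages=[{"role": "user", "content": prompt}],
    )
```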
📄️ Azure OpenAI
API Keys, Params
📄️ Azure AI Studio
Ensure the following:
📄️ VertexAI [Anthropic, Gemini, Model Garden]
Pre-requisites
📄️ PaLM API - Google
Pre-requisites
📄️ Gemini - Google AI Studio
Pre-requisites
📄️ Mistral AI API
https://docs.mistral.ai/api/
📄️ Anthropic
LiteLLM supports Anthropic's Claude models.
📄️ AWS Sagemaker
LiteLLM supports all SageMaker Hugging Face JumpStart models.
📄️ AWS Bedrock
Anthropic, Amazon Titan, and AI21 LLMs are supported on Bedrock.
📄️ Cohere
API Keys
📄️ Anyscale
https://app.endpoints.anyscale.com/
📄️ Huggingface
LiteLLM supports the following types of Huggingface models:
📄️ IBM watsonx.ai
LiteLLM supports all IBM watsonx.ai foundational models and embeddings.
📄️ 🆕 Predibase
LiteLLM supports all models on Predibase.
📄️ Triton Inference Server
LiteLLM supports Embedding Models on Triton Inference Servers
📄️ Ollama
LiteLLM supports all models from Ollama.
📄️ Perplexity AI (pplx-api)
https://www.perplexity.ai
📄️ Groq
https://groq.com/
📄️ Deepseek
https://deepseek.com/
📄️ Fireworks AI
https://fireworks.ai/
📄️ vLLM
LiteLLM supports all models on vLLM.
📄️ Xinference [Xorbits Inference]
https://inference.readthedocs.io/en/latest/index.html
📄️ Cloudflare Workers AI
https://developers.cloudflare.com/workers-ai/models/text-generation/
📄️ DeepInfra
https://deepinfra.com/
📄️ AI21
LiteLLM supports j2-light, j2-mid, and j2-ultra from AI21.
📄️ NLP Cloud
LiteLLM supports all LLMs on NLP Cloud.
📄️ Replicate
LiteLLM supports all models on Replicate.
📄️ Together AI
LiteLLM supports all models on Together AI.
📄️ Voyage AI
https://docs.voyageai.com/embeddings/
📄️ Aleph Alpha
LiteLLM supports all models from Aleph Alpha.
📄️ Baseten
LiteLLM supports any Text-Generation-Inference models on Baseten.
📄️ OpenRouter
LiteLLM supports all text, chat, and vision models from OpenRouter.
📄️ Custom API Server (OpenAI Format)
LiteLLM allows you to call your custom endpoint in the OpenAI ChatCompletion format.
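For reference, a minimal sketch of the OpenAI ChatCompletion request shape such a custom endpoint would accept (field names follow the OpenAI chat format; the model name and values are illustrative placeholders):

```python
# The JSON body a custom OpenAI-format endpoint receives.
# Field names follow the OpenAI chat-completion schema;
# the model name and contents are placeholders.
request_body = {
    "model": "my-custom-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
}
```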
📄️ Petals
https://github.com/bigscience-workshop/petals