| Tool | Description | Category | Tags |
|---|---|---|---|
| Azure AI Foundry Models / OpenAI | Catalog of OpenAI and open models with enterprise governance and Azure AI Inference APIs. | Inference/Hosting & APIs | Azure, enterprise, governance |
| Fireworks AI | High-throughput inference and fine-tuning for open models; global, scalable endpoints. | Inference/Hosting & APIs | endpoints, fine-tuning, Fireworks |
| NVIDIA NIM | Prebuilt, optimized inference microservices for leading models on any NVIDIA-accelerated stack. | Inference/Hosting & APIs | GPU, inference, microservices |
| OpenRouter | One API that routes to hundreds of models with pricing, latency, and fallback controls (see the usage sketch after this table). | Inference/Hosting & APIs | API, catalog, fallback |
| Anthropic Claude API | Production API for Claude models with tools like web search and enterprise controls. | Inference/Hosting & APIs | Anthropic, API, Claude |
| Replicate | Run and deploy community and custom models with a simple cloud API and playgrounds. | Inference/Hosting & APIs | API, deploy, hosted models |
| Mistral AI (La Plateforme) | High-performance chat/embedding APIs and Studio; OpenAI-compatible endpoints. | Inference/Hosting & APIs | API, chat, compatibility |
| Cerebras Inference | Wafer-scale engine cloud with OpenAI-style APIs for ultra-fast open-model inference. | Inference/Hosting & APIs | API, Cerebras, inference |
| Cohere Platform | Enterprise LLMs for chat, embeddings, rerank, and tools; private and secure by design. | Inference/Hosting & APIs | API, Cohere, embeddings |
| SambaNova Cloud | RDU-accelerated inference platform with an OpenAI-compatible API for top open models. | Inference/Hosting & APIs | API, inference, LLM |
| xAI API (Grok) | Grok models via an OpenAI-compatible API; advanced reasoning, coding, and vision. | Inference/Hosting & APIs | API, compatibility, Grok |
| Anyscale Endpoints (Ray Serve) | OpenAI-compatible serving on Ray with autoscaling and many-model deployments. | Inference/Hosting & APIs | Anyscale, autoscaling, endpoints |
| IBM watsonx.ai | Business-ready foundation models and Model Gateway with governance and pricing controls. | Inference/Hosting & APIs | API, foundation models, governance |
| Baseten | Production inference platform with dedicated deployments, autoscaling, and GPU options. | Inference/Hosting & APIs | autoscaling, Baseten, dedicated |
| Databricks Mosaic AI Model Serving | Serve custom and foundation models as REST endpoints with AI Gateway governance. | Inference/Hosting & APIs | Databricks, Gateway, governance |
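
Several of the providers above (OpenRouter, Mistral, Cerebras Inference, SambaNova Cloud, xAI, Anyscale Endpoints) describe their endpoints as OpenAI-compatible, which in practice means the official OpenAI SDK can usually be pointed at them by swapping the base URL and API key. The sketch below shows that pattern with the OpenAI Python SDK; the base URL, environment variable name, and model id are placeholders, not real values for any specific provider.

```python
# Minimal sketch of the "OpenAI-compatible endpoint" pattern, assuming the
# official OpenAI Python SDK (openai>=1.0). The base URL, env var name, and
# model id below are placeholders; substitute the values from your provider's docs.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # placeholder provider endpoint
    api_key=os.environ["PROVIDER_API_KEY"],          # hypothetical env var for the provider key
)

response = client.chat.completions.create(
    model="provider/some-open-model",  # placeholder; model ids are provider-specific
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "In one sentence, what is an OpenAI-compatible API?"},
    ],
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Only the common chat-completions subset tends to be portable this way; provider-specific features such as OpenRouter's fallback routing or a gateway's governance controls still require each provider's own parameters or SDK.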
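
Not every entry leads with the OpenAI wire format: Anthropic's Claude API and the Cohere Platform, for example, ship their own SDKs and request shapes. As a contrast to the sketch above, here is a minimal sketch of Anthropic's Messages API via the `anthropic` Python package; the model id is a placeholder to be replaced with a current one from Anthropic's docs.

```python
# Minimal sketch of a provider-native SDK, using Anthropic's Messages API
# (the Claude API entry in the table) as the example.
import os

import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model id
    max_tokens=256,                    # required by the Messages API
    messages=[
        {"role": "user", "content": "List two things to check before picking an inference provider."},
    ],
)

# The response content is a list of blocks; text blocks expose a .text field.
print(message.content[0].text)
```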