| Tool | Description | Category | Tags |
|---|---|---|---|
| Azure AI Foundry Models / OpenAI | Catalog of OpenAI and open models with enterprise governance and Azure AI Inference APIs. | Inference/Hosting & APIs | Azure, enterprise, governance |
| Fireworks AI | High-throughput inference and fine-tuning for open models; global, scalable endpoints. | Inference/Hosting & APIs | endpoints, fine-tuning, Fireworks |
| NVIDIA NIM | Prebuilt, optimized inference microservices for leading models on any NVIDIA-accelerated stack. | Inference/Hosting & APIs | GPU, inference, microservices |
| OpenRouter | One API that routes to hundreds of models with pricing, latency, and fallback controls (see the usage sketch after this table). | Inference/Hosting & APIs | API, catalog, fallback |
| Anthropic Claude API | Production API for Claude models with tools like web search and enterprise controls. | Inference/Hosting & APIs | Anthropic, API, Claude |
| Replicate | Run and deploy community and custom models with a simple cloud API and playgrounds. | Inference/Hosting & APIs | API, deploy, hosted models |
| Mistral AI (La Plateforme) | High-performance chat/embedding APIs and Studio; OpenAI-compatible endpoints. | Inference/Hosting & APIs | API, chat, compatibility |
| Cerebras Inference | Wafer-scale engine cloud with OpenAI-style APIs for ultra-fast open-model inference. | Inference/Hosting & APIs | API, Cerebras, inference |
| Cohere Platform | Enterprise LLMs for chat, embeddings, rerank, and tools; private and secure by design. | Inference/Hosting & APIs | API, Cohere, embeddings |
| SambaNova Cloud | RDU-accelerated inference platform with an OpenAI-compatible API for top open models. | Inference/Hosting & APIs | API, inference, LLM |
| xAI API (Grok) | Grok models via an OpenAI-compatible API; advanced reasoning, coding, and vision. | Inference/Hosting & APIs | API, compatibility, Grok |
| Anyscale Endpoints (Ray Serve) | OpenAI-compatible serving on Ray with autoscaling and many-model deployments. | Inference/Hosting & APIs | Anyscale, autoscaling, endpoints |
| IBM watsonx.ai | Business-ready foundation models and Model Gateway with governance and pricing controls. | Inference/Hosting & APIs | API, foundation models, governance |
| Baseten | Production inference platform with dedicated deployments, autoscaling, and GPU options. | Inference/Hosting & APIs | autoscaling, Baseten, dedicated |
| Databricks Mosaic AI Model Serving | Serve custom and foundation models as REST endpoints with AI Gateway governance. | Inference/Hosting & APIs | Databricks, Gateway, governance |
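
Several of the providers above (OpenRouter, Mistral, Cerebras Inference, SambaNova Cloud, xAI, Anyscale Endpoints) describe their endpoints as OpenAI-compatible, which in practice means the official OpenAI SDK can usually be pointed at them by swapping the base URL and API key. The sketch below shows that pattern with the OpenAI Python SDK; the base URL, environment variable name, and model id are placeholders, not real values for any specific provider.

```python
# Minimal sketch of the "OpenAI-compatible endpoint" pattern, assuming the
# official OpenAI Python SDK (openai>=1.0). The base URL, env var name, and
# model id below are placeholders; substitute the values from your provider's docs.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # placeholder provider endpoint
    api_key=os.environ["PROVIDER_API_KEY"],          # hypothetical env var for the provider key
)

response = client.chat.completions.create(
    model="provider/some-open-model",  # placeholder; model ids are provider-specific
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "In one sentence, what is an OpenAI-compatible API?"},
    ],
    temperature=0.2,
)

print(response.choices[0].message.content)
```

Only the common chat-completions subset tends to be portable this way; provider-specific features such as OpenRouter's fallback routing or a gateway's governance controls still require each provider's own parameters or SDK.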
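
Not every entry leads with the OpenAI wire format: Anthropic's Claude API and the Cohere Platform, for example, ship their own SDKs and request shapes. As a contrast to the sketch above, here is a minimal sketch of Anthropic's Messages API via the `anthropic` Python package; the model id is a placeholder to be replaced with a current one from Anthropic's docs.

```python
# Minimal sketch of a provider-native SDK, using Anthropic's Messages API
# (the Claude API entry in the table) as the example.
import os

import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model id
    max_tokens=256,                    # required by the Messages API
    messages=[
        {"role": "user", "content": "List two things to check before picking an inference provider."},
    ],
)

# The response content is a list of blocks; text blocks expose a .text field.
print(message.content[0].text)
```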