| Name | Description | Score | Category | Tags |
| --- | --- | --- | --- | --- |
| OpenRouter | One API that routes to hundreds of models, with pricing, latency, and fallback controls (see the routing sketch after this table). | 0750 | Inference/Hosting & APIs | API, catalog, fallback |
| GroqCloud | Ultra-low-latency LPU-powered inference for text, speech, and vision models. | 0490 | Inference/Hosting & APIs | API, Groq, inference |
| Together AI Inference | Fast serverless APIs and dedicated endpoints for 200+ open models. | 0470 | Inference/Hosting & APIs | API, dedicated, inference |
| Cerebras Inference | Wafer-scale engine cloud with OpenAI-style APIs for ultra-fast open-model inference (see the client sketch after this table). | 0460 | Inference/Hosting & APIs | API, Cerebras, inference |
| SambaNova Cloud | RDU-accelerated inference platform with OpenAI-compatible endpoints for top open models. | 0450 | Inference/Hosting & APIs | API, inference, LLM |
| Replicate | Run and deploy community and custom models with a simple cloud API and playgrounds. | 0430 | Inference/Hosting & APIs | API, deploy, hosted models |
| Anthropic Claude API | Production API for Claude models with tools such as web search and enterprise controls. | 0430 | Inference/Hosting & APIs | Anthropic, API, Claude |
| IBM watsonx.ai | Business-ready foundation models and a Model Gateway with governance and pricing controls. | 0410 | Inference/Hosting & APIs | API, foundation models, governance |
| OpenAI API | Unified API for GPT, audio, vision, and realtime models, with tooling for evals, moderation, and assistants. | 0410 | Inference/Hosting & APIs | assistants, enterprise, evaluation |
| Mistral AI (La Plateforme) | High-performance chat and embedding APIs plus Studio; OpenAI-compatible endpoints. | 0400 | Inference/Hosting & APIs | API, chat, compatibility |
| xAI API (Grok) | Grok models via an OpenAI-compatible API; advanced reasoning, coding, and vision. | 0390 | Inference/Hosting & APIs | API, compatibility, Grok |
| Cohere Platform | Enterprise LLMs for chat, embeddings, rerank, and tools; private and secure by design. | 0390 | Inference/Hosting & APIs | API, Cohere, embeddings |
| Anyscale Endpoints (Ray Serve) | OpenAI-compatible serving on Ray with autoscaling and many-model deployments. | 0380 | Inference/Hosting & APIs | Anyscale, autoscaling, endpoints |
| Amazon Bedrock | Managed access to many foundation models, agents, guardrails, and knowledge bases via one API. | 0380 | Inference/Hosting & APIs | agents, AWS, foundation models |
| Snowflake Cortex | Run LLMs inside your governed data platform; AISQL functions and secure inference. | 0350 | Inference/Hosting & APIs | AISQL, Cortex, governed |
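Several entries above (Cerebras, SambaNova, Mistral, xAI, Anyscale, and GroqCloud among them) expose OpenAI-compatible endpoints, so the stock OpenAI client can target them by swapping the base URL and API key. A minimal sketch, using GroqCloud as the example; the base URL and model identifier are assumptions to verify against the provider's documentation:

```python
# Minimal sketch: pointing the OpenAI Python SDK at an OpenAI-compatible provider.
# The base URL and model name below are illustrative assumptions; check the
# provider's docs for the exact values and authentication details.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",  # assumed OpenAI-compatible endpoint
    api_key=os.environ["GROQ_API_KEY"],
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed model identifier
    messages=[{"role": "user", "content": "Summarize what an LPU is in one sentence."}],
)
print(response.choices[0].message.content)
```

The same client works against the other compatible providers by changing only `base_url`, the API key, and the model name.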
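OpenRouter's fallback controls route a request across an ordered list of candidate models, trying the next one when the first errors or is unavailable. A minimal sketch of what that looks like from an OpenAI-compatible client; the `models` request field and the model slugs are assumptions based on OpenRouter's model-routing feature, not verified values:

```python
# Minimal sketch of OpenRouter-style fallback routing: the request carries an
# ordered list of models, and the router falls back down the list on failure.
# The "models" field and the slugs below are assumptions; verify against
# OpenRouter's routing documentation before relying on them.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # primary model (assumed slug)
    extra_body={
        # Assumed fallback list: tried in order if the primary model fails.
        "models": ["anthropic/claude-3.5-haiku", "meta-llama/llama-3.1-70b-instruct"],
    },
    messages=[{"role": "user", "content": "Hello from a fallback-aware client."}],
)
print(response.model, response.choices[0].message.content)
```

The appeal of a router like this is that retry and provider-selection logic lives server-side, so the client stays a single OpenAI-style call.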