LM Studio (Desktop): popular desktop app for running and chatting with local LLMs, with simple model management and fast GPU/CPU offload for RP sessions. Category: Roleplay Frontends (Local/Self-Hosted UI). Tags: adult, desktop, GPU.
RunPod Serverless Endpoints: always-on, pre-warmed GPU endpoints for low-latency model inference at scale. Category: Inference/Hosting & APIs. Tags: endpoints, GPU, inference.
Modal Inference: serverless GPU inference with sub-second cold starts and Python-first workflows. Category: Inference/Hosting & APIs. Tags: GPU, inference, Modal.
Baseten: production inference platform with dedicated deployments, autoscaling, and a range of GPU options. Category: Inference/Hosting & APIs. Tags: autoscaling, Baseten, dedicated.
NVIDIA NIM: prebuilt, optimized inference microservices for leading models, deployable on any NVIDIA-accelerated stack. Category: Inference/Hosting & APIs. Tags: GPU, inference, microservices.
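NIM microservices expose an OpenAI-compatible HTTP API, so a chat request is an ordinary JSON POST. A minimal sketch of building such a request (the endpoint URL and model name below are placeholders for your own deployment, and the payload is only constructed, not sent):

```python
import json

# Hypothetical local NIM endpoint; substitute your deployment's URL.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> bytes:
    """Build an OpenAI-compatible chat-completions payload for a NIM endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload).encode("utf-8")

# Example model name; use whatever model your NIM container serves.
body = build_chat_request("meta/llama-3.1-8b-instruct", "Hello!")
```

POSTing `body` to the endpoint with `urllib.request` or `requests` (header `Content-Type: application/json`) returns the familiar OpenAI-style response, with the completion under `choices[0].message.content`.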
NVIDIA NGC Models: catalog of models optimized for NVIDIA GPUs (LLMs, vision, speech) with containers and inference recipes. Category: Model Hubs. Tags: containers, GPU, inference.
NVIDIA Riva: GPU-accelerated ASR/TTS SDK for low-latency voice AI deployments, on-prem or in the cloud. Category: Speech-to-Text. Tags: ASR, GPU, low latency.