Plans and Pricing

Free to start. Pay for deployment and inference.

Plans

Feature	Starter	Pro	Team	Enterprise
Price	Free	$99/mo	$249/seat/mo	Custom
Build pipelines	3	Unlimited	Unlimited	Unlimited
Optimize	3/month	Unlimited	Unlimited	Unlimited
Test in playground	100/day	Unlimited	Unlimited	Unlimited
Deploy endpoints	No	Yes	Yes	Yes
Always-on endpoints	No	No	Yes	Yes
TensorRT-LLM	No	No	Yes	Yes
Custom model uploads	No	No	Yes	Yes
Max replicas	-	8	32	Custom
Support	Community	Priority email	Shared Slack	Dedicated CSM

Starter is free forever. Build pipelines, optimize models, and test in the playground. No credit card required.

Pro ($99/mo or $79/mo annual) unlocks deployment. Your pipelines become live API endpoints with scale-to-zero. Pro also includes the full optimized model search (AWQ, GPTQ, FP8) and Forge GPU kernel optimization.

Team ($249/seat/mo or $199/seat/mo annual, min 3 seats) adds always-on endpoints with zero cold start, TensorRT-LLM, speculative decoding, custom model uploads, and audit logs.

Enterprise includes dedicated GPU infrastructure, custom SLAs, SOC 2/HIPAA compliance, and volume pricing. Contact sales.
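
To make the seat and billing arithmetic concrete, here is a minimal Python sketch of the subscription cost for the self-serve plans. The function and constant names are hypothetical and are not part of any RunInfra API. It assumes that "annual" means the listed per-month rate under annual billing, and that a Team account below the 3-seat minimum is still charged for 3 seats.

```python
# List prices from the plan table, in USD per month (per seat for Team).
PRICES = {
    "pro": {"monthly": 99, "annual": 79},
    "team": {"monthly": 249, "annual": 199},
}
TEAM_MIN_SEATS = 3  # Team requires at least 3 seats


def monthly_subscription_cost(plan: str, billing: str = "monthly", seats: int = 1) -> int:
    """Return the subscription cost in USD per month, excluding token usage."""
    if plan == "starter":
        return 0  # Starter is free
    if plan not in PRICES:
        # Enterprise pricing is custom, so there is no list price to compute.
        raise ValueError(f"no list price for plan: {plan!r}")
    if billing not in ("monthly", "annual"):
        raise ValueError(f"unknown billing period: {billing!r}")
    rate = PRICES[plan][billing]
    if plan == "team":
        # Assumption: seats below the minimum are billed as the minimum.
        return rate * max(seats, TEAM_MIN_SEATS)
    # Pro is listed as a flat price, so the seat count is ignored here.
    return rate
```

Under these assumptions, a two-person team on annual billing pays for three seats: 3 x $199 = $597/mo.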

Token pricing

When you deploy an endpoint, inference is billed per million tokens. Estimated starting rates by model size:

Model size	Input (from)	Output (from)
Small (1-8B)	$0.08 / MTok	$0.20 / MTok
Medium (8-30B)	$0.20 / MTok	$0.80 / MTok
Large (30-70B)	$0.45 / MTok	$1.50 / MTok
XL (70B+)	$0.80 / MTok	$2.50 / MTok

These are estimated starting prices. Your actual per-token cost depends on your full pipeline: model choice, quantization method, GPU tier, routing, and deployment mode. RunInfra shows the estimated cost for your specific pipeline in the deploy tab before you go live.
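
As a rough illustration of per-million-token billing, the sketch below computes a lower-bound estimate from the starting rates in the table. The names are hypothetical, and the result is only a floor: it ignores quantization, GPU tier, routing, and deployment mode, so the figure in the deploy tab may be higher.

```python
# Starting rates from the table, in USD per million tokens: (input, output).
STARTING_RATES = {
    "small": (0.08, 0.20),   # 1-8B
    "medium": (0.20, 0.80),  # 8-30B
    "large": (0.45, 1.50),   # 30-70B
    "xl": (0.80, 2.50),      # 70B+
}


def estimate_token_cost(size: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate inference cost in USD at the published starting rates."""
    input_rate, output_rate = STARTING_RATES[size]
    # Input and output tokens are counted and priced separately.
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# 50M input tokens and 10M output tokens on a medium model.
print(f"{estimate_token_cost('medium', 50_000_000, 10_000_000):.2f}")  # 18.00
```

Here the input side contributes 50 x $0.20 = $10 and the output side 10 x $0.80 = $8.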

Team plans get 10% off at 100M+ tokens/month. Enterprise gets up to 40% off.
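
The Team discount can be sketched as follows. This is an illustration under two assumptions the page does not confirm: that the 100M threshold counts input and output tokens combined, and that the 10% applies to the whole month's token bill once the threshold is reached (rather than only to tokens above it). The names are hypothetical.

```python
TEAM_DISCOUNT = 0.10                 # 10% off
TEAM_THRESHOLD_TOKENS = 100_000_000  # 100M tokens per month


def apply_team_discount(token_cost: float, tokens_this_month: int) -> float:
    """Return the month's token bill in USD after the Team volume discount."""
    # Assumption: the discount covers the entire bill once the threshold is met.
    if tokens_this_month >= TEAM_THRESHOLD_TOKENS:
        return token_cost * (1 - TEAM_DISCOUNT)
    return token_cost
```

With these assumptions, a $200 token bill at 120M tokens becomes $180, while the same bill at 50M tokens is unchanged.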

What counts as usage

Optimization session: One optimization run on one pipeline. Starter gets 3/month, resets monthly.
Playground request: One inference call in the test playground. Starter gets 100/day.
Token: Input tokens (your prompt) and output tokens (model response) are counted separately.
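
The Starter limits above amount to two counters with fixed allowances. The sketch below is a hypothetical client-side tracker, not RunInfra's actual enforcement logic; it leaves out the monthly and daily resets, which would zero the respective counter.

```python
from dataclasses import dataclass

OPTIMIZATIONS_PER_MONTH = 3        # Starter: 3 optimization sessions per month
PLAYGROUND_REQUESTS_PER_DAY = 100  # Starter: 100 playground requests per day


@dataclass
class StarterUsage:
    optimizations_this_month: int = 0
    playground_requests_today: int = 0

    def try_optimize(self) -> bool:
        """Record one optimization session if the monthly allowance permits."""
        if self.optimizations_this_month >= OPTIMIZATIONS_PER_MONTH:
            return False
        self.optimizations_this_month += 1
        return True

    def try_playground_request(self) -> bool:
        """Record one playground call if the daily allowance permits."""
        if self.playground_requests_today >= PLAYGROUND_REQUESTS_PER_DAY:
            return False
        self.playground_requests_today += 1
        return True


usage = StarterUsage()
print([usage.try_optimize() for _ in range(4)])  # [True, True, True, False]
```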

Manage your plan

Upgrade or downgrade at Settings > Billing. Track usage at Settings > Usage.
