Documentation

Build, optimize, and deploy AI inference pipelines through conversation.

RunInfra turns plain English into production AI endpoints. Describe what you need, and the agent handles the rest.

Your first pipeline in 5 minutes.

How to talk to the agent effectively.

Real conversations for every use case.

Deploy, test, and use your endpoint.

Learn more

How RunInfra makes models faster and cheaper.

100+ models from Hugging Face.

Per-token pricing and available GPU tiers.

Free to start. Pro from $99/mo.

Track requests, latency, cost, and errors.

The full workflow, step by step.
