You are billed for two things: per-token inference (input and output tokens are priced separately) and each optimization session beyond your plan’s included budget. Active mode deployments add a flat monthly base fee per warm replica.
A token is the standard LLM unit. As a rough rule of thumb, 1 token is about 4 characters of English text. The response returns usage.total_tokens, so you never need to guess.
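The rule of thumb above can be sketched as a pre-flight estimate, with the billed figure read back from the response afterwards. This is a minimal sketch: it assumes the response has been parsed into a dict with a usage object containing total_tokens, and the helper names and the sample payload are hypothetical.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough estimate only: about 4 characters of English text per token."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


def billed_tokens(response: dict) -> int:
    """Read the actual count from usage.total_tokens in a parsed response."""
    return int(response["usage"]["total_tokens"])


# Hypothetical payload, shaped only to show where the usage field sits.
sample_response = {"usage": {"total_tokens": 57}}

print(estimate_tokens("Summarize this paragraph in one sentence."))
print(billed_tokens(sample_response))
```

The estimate is useful for budgeting before a request is sent; for anything cost-related after the fact, the value in usage.total_tokens is the one to rely on.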
Sessions included per plan: Starter 3/mo, Pro 20/mo, Team 100/seat/mo. Overage sessions are $2.50 each on Pro and Team, billed from your credit balance. Enterprise has volume pricing.
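As a worked example of the session allowances and the $2.50 overage price, the sketch below computes the overage charge for one month. It deliberately ignores rolled-over sessions, raises an error for plans with no stated overage price, and the function and constant names are hypothetical rather than part of any RunInfra API.

```python
from decimal import Decimal

# Included sessions per month; the Team figure is per seat.
INCLUDED_SESSIONS = {"pro": 20, "team": 100}
OVERAGE_PRICE = Decimal("2.50")  # per overage session on Pro and Team


def overage_cost(plan: str, sessions_used: int, seats: int = 1) -> Decimal:
    """Return the credit charge for sessions beyond the included budget."""
    if plan not in INCLUDED_SESSIONS:
        # Only Pro and Team have a stated per-session overage price.
        raise ValueError(f"no per-session overage price stated for plan: {plan}")
    included = INCLUDED_SESSIONS[plan] * (seats if plan == "team" else 1)
    extra = max(0, sessions_used - included)
    return extra * OVERAGE_PRICE


print(overage_cost("pro", 26))            # 6 sessions over the 20 included
print(overage_cost("team", 310, seats=3)) # 10 sessions over the 300 included
```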
Unused sessions roll over on paid plans, up to 2x your monthly allowance: Pro carries up to 40 sessions and Team carries up to 200 per seat. Rolled-over sessions clear on any plan change.
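The rollover cap can be illustrated with a short calculation. This sketch assumes one particular reading of the rule, namely that the total balance available at the start of a month (new allowance plus carried sessions) is capped at 2x the monthly allowance; the page does not spell this out, and the function name is hypothetical.

```python
def next_month_balance(monthly_allowance: int, unused: int,
                       seats: int = 1, plan_changed: bool = False) -> int:
    """Sessions available next month under the assumed 2x balance cap."""
    allowance = monthly_allowance * seats
    if plan_changed:
        # Rolled-over sessions clear on any plan change.
        return allowance
    cap = 2 * allowance
    return min(allowance + max(0, unused), cap)


print(next_month_balance(20, 5))             # Pro with 5 unused sessions
print(next_month_balance(20, 30))            # Pro, capped at 2x allowance
print(next_month_balance(100, 150, seats=2)) # Team with 2 seats
```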
Failed or cancelled sessions are refunded to your budget automatically.
Settings > Usage shows a daily cost chart, token breakdown, per-model cost, and error counts for the current billing period and the previous two.
You can set a hard monthly cap at Settings > Billing > Spend limit. Once you hit the cap, requests return 403 plan_limit_exceeded.
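Client code may want to tell a spend-limit rejection apart from other 403 responses. The sketch below checks the status code and the plan_limit_exceeded error code; the shape of the error body is not documented here, so both layouts it accepts ({"error": "..."} and {"error": {"code": "..."}}) are assumptions, and the function name is hypothetical.

```python
def is_spend_limit_error(status_code: int, body: dict) -> bool:
    """Return True when a response looks like a spend-limit rejection."""
    if status_code != 403:
        return False
    error = body.get("error")
    # Assumed body shapes: a bare error string or an object with a "code" key.
    code = error.get("code") if isinstance(error, dict) else error
    return code == "plan_limit_exceeded"


# Hypothetical payloads for illustration only.
print(is_spend_limit_error(403, {"error": {"code": "plan_limit_exceeded"}}))
print(is_spend_limit_error(403, {"error": "forbidden"}))
```

Because the cap is a hard monthly limit, automatic retries are unlikely to help once this error appears; surfacing it to whoever manages the spend limit is probably the more useful response.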
Credits are a prepaid balance used for overage sessions, Active-mode base fees, and any post-plan tokens. Top up at Settings > Billing. Credits never expire as long as the account is active.
Volume pricing is available on Team and Enterprise. Contact sales for a quote.
Active mode keeps replicas warm 24/7. The base fee covers the reserved GPU time. The exact amount depends on GPU tier and min_replicas; the Deploy tab shows the projected monthly cost before you commit.
Invoicing is available on Team and Enterprise. Invoices are issued monthly and available at Settings > Billing > Invoices. PDF and PO fields are supported.
Payment is by card on Pro and Team, and by ACH, wire, or custom PO on Enterprise.
Plan changes are supported. Upgrades take effect immediately with prorated billing. Downgrades take effect at the start of the next billing cycle.

Not here?

Plans and pricing

The canonical plan comparison table.

GPUs and pricing

GPU tiers and per-token rates.

Account and access

Workspaces, seats, API keys.

Contact sales

Enterprise quotes and volume deals.