Troubleshooting

Common issues and how to fix them.

Chat and pipeline building

The agent keeps asking questions instead of building

Say: "Just go with your best recommendation and we'll iterate from there."

The agent picked the wrong model

Say the exact model you want: "Switch to Llama 3.1 8B."

I want to start over

Say: "Reset the pipeline and start from scratch."

Optimization

Optimization is taking too long

Optimization typically takes 2-5 minutes. If it seems stuck, ask: "What's the status of the optimization?"

Results don't meet my constraints

Try different approaches:

"Optimize again with a smaller model"
"Try a faster GPU"
"Relax the latency constraint to 300ms"

Quality score is too low with the optimized variant

Try a higher-precision variant: "Search for an FP8 version instead of AWQ 4-bit." FP8 preserves more quality than 4-bit quantization, especially on H100/H200 GPUs.

Deployment

Deployment failed

The agent shows error diagnostics. Common fixes:

"Try a different GPU tier" (current GPU may be unavailable)
"The model might be too large for this GPU. Recommend something bigger."

First request is slow (30-60 seconds)

This is normal for the very first request after deployment, while the model loads and compiles. Later requests are fast, with cold starts under 2 seconds thanks to RunInfra Cloud's weight caching.

Endpoint returns 503

The endpoint is stopped or still provisioning. Check status at Deployments or ask: "What's the status of my deployment?"
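If the endpoint is still provisioning, a client can poll until the 503 clears instead of failing immediately. The following is a minimal sketch, not an official client: `wait_until_ready` and `check_status` are hypothetical names, `check_status` is any callable you supply that sends a request to your endpoint and returns the HTTP status code, and the timeout and interval values are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_ready(
    check_status: Callable[[], int],
    timeout: float = 120.0,   # assumed value; tune for your deployment
    interval: float = 5.0,    # assumed value; seconds between polls
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Poll until the endpoint stops returning 503 or the timeout elapses.

    Returns the last status code seen, so the caller can tell whether the
    endpoint became available (anything other than 503) or is still down.
    """
    deadline = clock() + timeout
    while True:
        status = check_status()
        if status != 503 or clock() >= deadline:
            return status
        sleep(interval)
```

Polling only helps while the endpoint is provisioning. If this still returns 503 after the timeout, the endpoint may be stopped, so check its status at Deployments.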

API integration

401 Unauthorized

Your API key is invalid, revoked, or missing. Check:

Key starts with ri_
Authorization: Bearer ri_your_key header is present
Key hasn't been revoked
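The first two checks can be done in code before a request is sent. A minimal sketch, assuming only what this guide states (keys start with `ri_` and are sent as a Bearer token); `build_auth_headers` is a hypothetical helper, not part of any RunInfra SDK.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a key, or raise if it is obviously malformed."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("API key is missing")
    if not key.startswith("ri_"):
        raise ValueError("API key should start with 'ri_'")
    return {"Authorization": f"Bearer {key}"}
```

A revoked key cannot be detected locally. It passes these checks and still returns 401 from the server.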

403 Forbidden

The API key doesn't match the pipeline ID in the URL. Each key is scoped to one pipeline.

429 Too Many Requests

Rate limit exceeded. Wait for the period given in the Retry-After response header, then retry. If you hit the limit regularly, increase your key's rate limit at Settings > API Keys.
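Waiting for Retry-After and retrying can be automated. The sketch below is illustrative rather than an official client: `send` is a hypothetical callable you supply that performs one request and returns `(status, headers, body)`. It assumes Retry-After carries a number of seconds; any other form falls back to a default delay, and the default, cap, and attempt count are arbitrary assumptions.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(
    headers: Mapping[str, str], default: float = 1.0, cap: float = 60.0
) -> float:
    """Read Retry-After as seconds; use `default` if it is missing or not a number."""
    value = None
    for name, header_value in headers.items():
        if name.lower() == "retry-after":  # header names are case-insensitive
            value = header_value
    try:
        return min(max(0.0, float(value)), cap)
    except (TypeError, ValueError):
        return default


def call_with_rate_limit_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), waiting and retrying while the response status is 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _body = response
        if status != 429 or attempt == max_attempts:
            return response
        sleep(retry_after_seconds(headers))
    raise AssertionError("unreachable")
```

The last response is returned even if it is still a 429, so the caller can decide whether to give up or raise the key's rate limit.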

Upgrade required (403 with upgrade prompt)

You hit a plan limit. Upgrade at Settings > Billing.

Still stuck?

Free plan: Community support
Pro: Priority email support
Team: Shared Slack channel
Enterprise: Dedicated customer success manager

Send feedback from within the app or email support from your billing page.
