Troubleshooting
Common issues and how to fix them.
Chat and pipeline building
The agent keeps asking questions instead of building
Say: "Just go with your best recommendation and we'll iterate from there."
The agent picked the wrong model
Say the exact model you want: "Switch to Llama 3.1 8B."
I want to start over
Say: "Reset the pipeline and start from scratch."
Optimization
Optimization is taking too long
Optimization typically takes 2-5 minutes. If it seems stuck, ask: "What's the status of the optimization?"
Results don't meet my constraints
Try different approaches:
- "Optimize again with a smaller model"
- "Try a faster GPU"
- "Relax the latency constraint to 300ms"
Quality score is too low with the optimized variant
Try a higher-precision variant: "Search for an FP8 version instead of AWQ 4-bit." FP8 preserves more quality than 4-bit quantization, especially on H100/H200 GPUs.
Deployment
Deployment failed
The agent shows error diagnostics. Common fixes:
- "Try a different GPU tier" (current GPU may be unavailable)
- "The model might be too large for this GPU. Recommend something bigger."
First request is slow (30-60 seconds)
This is normal for the very first request after deployment, while the model loads and compiles. Subsequent requests are fast, and later cold starts take under 2 seconds thanks to RunInfra Cloud's weight caching.
Endpoint returns 503
The endpoint is stopped or still provisioning. Check status at Deployments or ask: "What's the status of my deployment?"
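If the endpoint is still provisioning, a client can poll until the 503 clears instead of failing right away. The following is a minimal sketch, not an official client: `wait_until_ready` and the `check_status` callable are hypothetical names, and `check_status` is assumed to be something you supply that sends a request to your endpoint and returns the HTTP status code.

```python
import time


def wait_until_ready(check_status, timeout_s=120.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until check_status() stops returning 503 or the timeout elapses.

    check_status is a caller-supplied function (hypothetical) that performs
    one request against the endpoint and returns its HTTP status code.
    """
    deadline = clock() + timeout_s
    while True:
        status = check_status()
        if status != 503:
            # Any other status (200, 401, 429, ...) is returned to the caller.
            return status
        if clock() + interval_s > deadline:
            raise TimeoutError("endpoint still returning 503")
        sleep(interval_s)
```

A timeout here suggests the endpoint is stopped rather than provisioning, in which case polling will not help and the deployment status should be checked instead.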
API integration
401 Unauthorized
Your API key is invalid, revoked, or missing. Check:
- Key starts with ri_
- Authorization: Bearer ri_your_key header is present
- Key hasn't been revoked
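The first two checks can be done on the client before a request is sent. This is a minimal sketch under the assumptions stated in this section (keys start with ri_ and are sent as a Bearer token); the function name `build_auth_headers` is hypothetical and not part of any RunInfra SDK.

```python
def build_auth_headers(api_key):
    """Return request headers for an API key, or raise if it looks malformed."""
    key = (api_key or "").strip()
    if not key:
        raise ValueError("API key is missing")
    if not key.startswith("ri_"):
        raise ValueError("API key should start with 'ri_'")
    # The key is sent as a Bearer token in the Authorization header.
    return {"Authorization": f"Bearer {key}"}
```

A revoked key cannot be detected locally. If the header is well formed and the request still returns 401, check whether the key has been revoked.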
403 Forbidden
The API key doesn't match the pipeline ID in the URL. Each key is scoped to one pipeline.
429 Too Many Requests
Rate limit exceeded. Wait for the Retry-After period, then retry. Increase your key's rate limit at Settings > API Keys.
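Waiting for the Retry-After period can be automated. The sketch below assumes the header follows the standard HTTP form (either a number of seconds or an HTTP date). The names `retry_after_seconds`, `call_with_retry`, and the `send` callable are hypothetical: `send` stands for your own request function and is assumed to return a `(status, headers, body)` tuple with headers as a plain dict.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0):
    """Convert a Retry-After value (seconds or HTTP date) to a delay in seconds."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default delay
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() and retry on 429, waiting for the Retry-After period each time."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        sleep(retry_after_seconds(headers.get("Retry-After")))
```

Capping the number of attempts keeps a client from looping indefinitely. If the final attempt still returns 429, the status is passed back to the caller.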
Upgrade required (403 with upgrade prompt)
You hit a plan limit. Upgrade at Settings > Billing.
Still stuck?
- Free plan: Community support
- Pro: Priority email support
- Team: Shared Slack channel
- Enterprise: Dedicated customer success manager
Send feedback from within the app or email support from your billing page.