A document AI pipeline takes a PDF, image, or scanned form and returns structured JSON: extracted fields, table cells, line items, signatures, and layout. RunInfra ships the recipe with Qwen2.5-VL and Llama 3.2 Vision as the canonical open vision-language models, so instead of paying a per-page fee you pay per million tokens against a model you own.
Architecture
Document (PDF / image / form)
-> Page renderer (PDF -> image tiles at the right DPI)
-> Vision-language model (Qwen2.5-VL 7B or Llama 3.2 11B Vision, FP8)
-> JSON schema enforcement (structured output)
-> Validated record
The vision encoder is fused with the LLM at the engine level (vLLM with multimodal support), so a single forward pass processes the page images and the schema prompt together.
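The page renderer step comes down to simple arithmetic: a PDF page is measured in points (1/72 inch), so the raster size follows from the chosen DPI. The sketch below illustrates that step only. The function names, the 150 DPI value, and the 1024-pixel tile size are assumptions for illustration, not RunInfra's documented settings.

```python
import math

POINTS_PER_INCH = 72  # PDF user-space unit: 1 point = 1/72 inch


def render_size(width_pt, height_pt, dpi):
    """Pixel dimensions of a PDF page rasterized at the given DPI."""
    return (
        math.ceil(width_pt * dpi / POINTS_PER_INCH),
        math.ceil(height_pt * dpi / POINTS_PER_INCH),
    )


def tile_boxes(width_px, height_px, tile_px):
    """Split a raster into (left, top, right, bottom) boxes of at most tile_px per side."""
    boxes = []
    for top in range(0, height_px, tile_px):
        for left in range(0, width_px, tile_px):
            boxes.append((
                left,
                top,
                min(left + tile_px, width_px),
                min(top + tile_px, height_px),
            ))
    return boxes


# US Letter is 612 x 792 points; 150 DPI and 1024 px tiles are assumed values.
width_px, height_px = render_size(612, 792, dpi=150)
tiles = tile_boxes(width_px, height_px, tile_px=1024)
print(width_px, height_px, len(tiles))
```

A higher DPI keeps small print legible but produces more pixels, and therefore more image tokens, per page, so the right value depends on the documents being processed.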
What you get out of the box
- Schema-driven output: pass a JSON schema, get a validated object back
- PDF + image input via base64 or data_url references
- Multi-page batching with one request per document, paged internally
- OpenAI-compatible chat completions endpoint with response_format
- Common formats supported: receipts, invoices, forms, ID cards, tables, contracts
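For the base64 input path listed above, a small helper can wrap a file's bytes in a data URL. This is a minimal sketch using only the Python standard library; to_data_url is a hypothetical name, and guessing the MIME type from the filename is an assumption made here, not something the endpoint is documented to require.

```python
import base64
import mimetypes


def to_data_url(data: bytes, filename: str) -> str:
    """Encode raw bytes as a base64 data URL, guessing the MIME type from the filename."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        raise ValueError(f"cannot determine MIME type for {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# The bytes here are a stand-in; in practice they come from reading the file.
print(to_data_url(b"hello", "invoice.pdf"))
```

The resulting string can be placed in the url field of an image_url content part, as in the quick example further down.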
Example prompt
In Pipes:
Build a document AI pipeline that extracts invoice data into structured JSON.
Use Qwen2.5-VL 7B. The schema needs: vendor name, invoice number, date,
line items (description, qty, unit price, total), subtotal, tax, total.
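The fields named in that prompt could be written as a JSON schema along these lines. This is a sketch: the property names (invoice_number, line_items, unit_price, and so on) are assumptions chosen for illustration, and the schema Pipes generates may differ.

```python
import json

# Schema for a single line item (field names are illustrative assumptions).
LINE_ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "description": {"type": "string"},
        "qty": {"type": "number"},
        "unit_price": {"type": "number"},
        "total": {"type": "number"},
    },
    "required": ["description", "qty", "unit_price", "total"],
}

# Schema for the whole invoice, covering every field listed in the prompt.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "invoice_number": {"type": "string"},
        "date": {"type": "string"},
        "line_items": {"type": "array", "items": LINE_ITEM_SCHEMA},
        "subtotal": {"type": "number"},
        "tax": {"type": "number"},
        "total": {"type": "number"},
    },
    "required": [
        "vendor", "invoice_number", "date",
        "line_items", "subtotal", "tax", "total",
    ],
}

# Wrapped in the shape the response_format parameter expects.
RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {"name": "invoice", "schema": INVOICE_SCHEMA},
}

print(json.dumps(RESPONSE_FORMAT, indent=2))
```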
Quick example
from openai import OpenAI
import base64

# Point the OpenAI client at the RunInfra endpoint.
client = OpenAI(base_url="https://api.runinfra.ai/v1", api_key="YOUR_RUNINFRA_API_KEY")

# Read the document and base64-encode it for the data URL.
with open("invoice.pdf", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="your-pipeline-id",
    messages=[{
        "role": "user",
        "content": [
            # The document itself, passed as a base64 data URL.
            {"type": "image_url", "image_url": {"url": f"data:application/pdf;base64,{b64}"}},
            {"type": "text", "text": "Extract this invoice."},
        ],
    }],
    # Constrain the reply to this JSON schema.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "invoice",
            "schema": {
                "type": "object",
                "properties": {
                    "vendor": {"type": "string"},
                    "invoice_number": {"type": "string"},
                    "total": {"type": "number"},
                },
                "required": ["vendor", "invoice_number", "total"],
            },
        },
    },
)

# The extracted record arrives as a JSON string in the message content.
print(resp.choices[0].message.content)
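Once the call returns, the record is a JSON string in resp.choices[0].message.content. The sketch below shows one way to parse it and re-check the required fields on the client side. parse_invoice is a hypothetical helper and the sample string is made up for illustration, not real model output.

```python
import json

# Required fields and their expected Python types, mirroring the schema above.
REQUIRED_FIELDS = {
    "vendor": str,
    "invoice_number": str,
    "total": (int, float),
}


def parse_invoice(content: str) -> dict:
    """Parse the model's JSON reply and check the required fields and their types."""
    record = json.loads(content)
    if not isinstance(record, dict):
        raise ValueError("expected a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {field}")
    return record


# Made-up sample; in practice pass resp.choices[0].message.content.
sample = '{"vendor": "Acme Corp", "invoice_number": "INV-001", "total": 129.5}'
invoice = parse_invoice(sample)
print(invoice["vendor"], invoice["total"])
```

The endpoint already enforces the schema, so this check is a defensive second layer. It is mostly useful for catching a truncated reply or a mismatch between the schema you sent and the fields your code reads.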
Deeper details
See runinfra.ai/use-cases/document-ai for the marketing page with per-document cost math and supported model list.