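LlamaIndex’s OpenAI LLM class and OpenAIEmbedding class both accept a custom api_base. Point both at RunInfra’s OpenAI-compatible endpoint, https://api.runinfra.ai/v1, and pass your RunInfra API key as api_key.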

Install

pip install llama-index llama-index-llms-openai llama-index-embeddings-openai

LLM

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="default",
    api_base="https://api.runinfra.ai/v1",
    api_key="YOUR_RUNINFRA_API_KEY",
)

response = llm.complete("What is RunInfra?")
print(response.text)
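
Hard-coding the key is fine for a quick test, but in shared code it is usually read from the environment. The following is a minimal sketch, assuming the key is exported as RUNINFRA_API_KEY. That variable name and the helper name runinfra_kwargs are chosen here for illustration and are not defined by RunInfra or LlamaIndex.

```python
import os

# Base URL taken from the LLM example above.
RUNINFRA_API_BASE = "https://api.runinfra.ai/v1"


def runinfra_kwargs(model="default", env_var="RUNINFRA_API_KEY"):
    """Build the keyword arguments shared by OpenAI(...) and OpenAIEmbedding(...).

    env_var is an assumed variable name; use whatever your deployment exports.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set {env_var} before creating the client")
    return {"model": model, "api_base": RUNINFRA_API_BASE, "api_key": api_key}
```

With this in place, the client could be created as OpenAI(**runinfra_kwargs()), and the same call works for OpenAIEmbedding.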

Embeddings

from llama_index.embeddings.openai import OpenAIEmbedding

embed = OpenAIEmbedding(
    model="default",
    api_base="https://api.runinfra.ai/v1",
    api_key="YOUR_RUNINFRA_API_KEY",
)

vector = embed.get_text_embedding("Hello world")
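
get_text_embedding returns a plain list of floats, so two embeddings can be compared without extra dependencies. The sketch below shows one common way to do that, cosine similarity. The function name is illustrative, and returning 0.0 for a zero vector is a convention chosen here, not something the API prescribes.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of floats."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

Against the live endpoint this could be called as cosine_similarity(embed.get_text_embedding("Hello world"), embed.get_text_embedding("Hi there")). Values closer to 1 indicate more similar texts.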

Full RAG example

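from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings

# Reuse the llm and embed objects created in the sections above.
Settings.llm = llm
Settings.embed_model = embed

# Load the files in ./docs and embed them into an in-memory vector index.
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)

# Retrieve the most relevant chunks and let the LLM answer from them.
query_engine = index.as_query_engine()
print(query_engine.query("How fast are RunInfra cold starts?"))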
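
When checking a RAG answer it often helps to see which chunks were retrieved. The sketch below assumes the object returned by query_engine.query(...) exposes a source_nodes list whose items have a score attribute and a get_content() method, as LlamaIndex's response and NodeWithScore objects do in recent releases. Verify this against your installed version. The helper name format_sources is hypothetical.

```python
def format_sources(source_nodes, max_chars=80):
    """Return one summary line per retrieved chunk: rank, score, and a text snippet."""
    lines = []
    for rank, item in enumerate(source_nodes, start=1):
        # score may be None when the retriever does not report one.
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        # Collapse whitespace so each snippet stays on a single line.
        snippet = " ".join(item.get_content().split())[:max_chars]
        lines.append(f"{rank}. score={score} {snippet}")
    return lines
```

A possible use is to keep the result of query_engine.query(...) in a variable named response, then print each line of format_sources(response.source_nodes).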

Streaming

for chunk in llm.stream_complete("Tell me a short story"):
    print(chunk.delta, end="", flush=True)
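
The loop above prints tokens as they arrive but discards them afterwards. A small sketch that also keeps the full text, assuming each chunk carries its new text in a delta attribute that may be None. The helper name collect_stream is illustrative, and the fake stream stands in for llm.stream_complete(...) so the example runs offline.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Print each delta as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        delta = chunk.delta or ""  # treat a missing delta as empty text
        print(delta, end="", flush=True)
        parts.append(delta)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Stand-in for llm.stream_complete("Tell me a short story").
fake_stream = [SimpleNamespace(delta=d) for d in ("Once ", "upon ", None, "a time.")]
story = collect_stream(fake_stream)
```

With a live client, collect_stream(llm.stream_complete("Tell me a short story")) would print the story incrementally and return it as one string.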

Next steps

LangChain

Same idea, different framework.

RAG cookbook

Raw RAG without a framework.

OpenAI compatibility

The contract powering this integration.

Embeddings API

Endpoint parameters and response shape.