Transcription - RunInfra

A transcription pipeline takes an audio file and returns the transcribed text through an OpenAI-compatible endpoint. RunInfra deploys an open ASR model (such as Whisper large-v3 or distil-large-v3) onto your serving runtime and exposes it at /v1/audio/transcriptions.

Architecture

Audio file (mp3 / mp4 / wav / m4a / webm)
  -> RunInfra /v1/audio/transcriptions (multipart upload)
  -> Whisper ASR deployment on the audio runtime
  -> Transcript (json / text / srt / vtt)

The endpoint is a transparent pass-through to the ASR model you deploy. The file plus any OpenAI-compatible fields (language, prompt, response_format) are forwarded to the deployment, and the response is returned as-is.
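As an illustration of the pass-through, the sketch below collects the optional OpenAI-compatible fields and sends them with the file. The helper names build_transcription_fields and transcribe are hypothetical, not part of RunInfra or the OpenAI SDK; the base URL and pipeline ID placeholder are taken from the quick example on this page. Whether a given field has any effect depends on the ASR model you deploy.

```python
from typing import Any, Dict, Optional


def build_transcription_fields(
    pipeline_id: str,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    response_format: str = "json",
) -> Dict[str, Any]:
    """Collect the form fields to forward, leaving out any that are unset."""
    fields: Dict[str, Any] = {"model": pipeline_id, "response_format": response_format}
    if language is not None:
        fields["language"] = language
    if prompt is not None:
        fields["prompt"] = prompt
    return fields


def transcribe(path: str, api_key: str, **field_kwargs: Any) -> Any:
    """Upload one audio file with the chosen fields (hypothetical helper)."""
    # Imported here so the field helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(base_url="https://api.runinfra.ai/v1", api_key=api_key)
    with open(path, "rb") as audio:
        return client.audio.transcriptions.create(
            file=audio, **build_transcription_fields(**field_kwargs)
        )
```

For example, transcribe("call.mp3", api_key, pipeline_id="your-pipeline-id", language="en") would forward the language hint alongside the file.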

What you get out of the box

OpenAI-compatible /v1/audio/transcriptions endpoint (multipart upload)
response_format: json, text, srt, vtt (availability depends on the deployed model)
Per-second billing metered on transcribed audio duration
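Because billing is metered on audio duration, a rough cost estimate only needs the length of the file and your rate. The sketch below is an assumption-heavy illustration: the function names are hypothetical, the price is a value you supply (no real RunInfra rate is implied), it reads duration from WAV files only via the standard library, and it assumes no rounding of partial seconds.

```python
import wave
from decimal import Decimal


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file as frames divided by sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def estimate_cost(duration_seconds: float, price_per_second: Decimal) -> Decimal:
    """Multiply duration by a caller-supplied per-second price (placeholder)."""
    # Decimal avoids binary floating-point drift in the money arithmetic.
    return Decimal(str(duration_seconds)) * price_per_second
```

Check your actual invoice or pricing page for the real rate and any rounding rules before relying on an estimate like this.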

Speaker diarization and PII redaction are not built into the managed transcription endpoint. The response shape, including any segment or speaker fields, is determined by the ASR model you deploy. If you need diarization or redaction, run those steps on the deployment side or as a post-processing pass over the returned text.
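A post-processing pass over the returned text could look like the minimal sketch below. It is not a RunInfra feature: redact_pii is a hypothetical helper, and the two regular expressions only catch simple email addresses and phone numbers written as three groups of digits. Real redaction usually needs a dedicated PII detection tool.

```python
import re

# Deliberately simple patterns; they will miss many real-world formats.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace matched emails and phone numbers with fixed placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)
```

You would call redact_pii(transcript.text) on the transcript before storing or displaying it.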

Example prompt

In the dashboard:

Build a transcription pipeline for our recorded support calls.
Use Whisper large-v3 and output SRT subtitles.

Quick example

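from openai import OpenAI

# Point the official OpenAI client at the RunInfra endpoint.
client = OpenAI(base_url="https://api.runinfra.ai/v1", api_key="YOUR_RUNINFRA_API_KEY")

# Open the audio in binary mode; the client sends it as a multipart upload.
with open("call.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="your-pipeline-id",  # the ID of your transcription pipeline
        file=f,
        response_format="json",
    )

# With response_format="json", the transcribed text is on the .text attribute.
print(transcript.text)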

Output shape

The default json format returns the transcribed text:

{
  "text": "Hello, I'm calling about my recent order..."
}

The text, srt, and vtt formats return the corresponding plain-text or subtitle body. Any additional fields depend on the ASR model you deploy, so check your deployment’s response before relying on a specific schema.
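If you request the srt format, you may want the subtitle body as structured cues. The sketch below parses standard SRT text (index line, a "start --> end" timing line, then one or more text lines, with blank lines between cues). The Cue class and parse_srt function are hypothetical names, and the parser assumes your deployment returns well-formed SRT, which you should confirm against a real response.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    index: int
    start: float  # seconds from the start of the audio
    end: float
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _seconds(stamp: str) -> float:
    """Convert an HH:MM:SS,mmm timestamp to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(body: str) -> List[Cue]:
    """Split an SRT body on blank lines and parse each cue block."""
    cues = []
    for block in re.split(r"\n\s*\n", body.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        if len(lines) < 2 or "-->" not in lines[1]:
            continue  # skip anything that is not a cue block
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0]), _seconds(start), _seconds(end), "\n".join(lines[2:])))
    return cues
```

Blocks that do not look like cues are skipped rather than raising, so a stray header or trailing note in the body does not break the parse.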

Deeper details

For per-minute cost math and the list of supported audio formats, see the marketing page at runinfra.ai/use-cases/transcription.
