A transcription pipeline takes audio files of any length and returns a transcript with timestamps, speaker labels, and optional PII redaction. RunInfra ships the recipe with Whisper (large-v3, distil-large-v3, or turbo) for ASR, pyannote-style diarization for speaker labels, and a small classifier for PII redaction.
Architecture
Audio file (mp3 / mp4 / wav / m4a / webm)
-> Whisper ASR (large-v3 or distil-large-v3, FP8)
-> Diarization (speaker turns, optional)
-> PII redaction pass (names, emails, phone numbers, optional)
-> Transcript with timestamps + speaker labels + redaction markers
Long-form audio is chunked with overlap, transcribed in parallel batches on the GPU, then stitched with timestamp alignment. The whole stack runs on one L40S for files under 90 minutes.
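The chunk-and-stitch step can be sketched in a few lines. This is an illustration of the general technique only, not RunInfra's implementation: the 30-second window, the 5-second overlap, and the names plan_chunks and stitch are all assumptions made for the example. Each chunk keeps the segments whose midpoint falls in the region it "owns", with ownership boundaries placed in the middle of each overlap, so a sentence heard by two neighbouring chunks appears once in the result.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    start: float  # seconds from the beginning of the file
    end: float    # seconds from the beginning of the file


def plan_chunks(duration: float, chunk_len: float = 30.0, overlap: float = 5.0) -> list[Chunk]:
    """Cover [0, duration] with windows of chunk_len seconds that overlap by `overlap` seconds."""
    if chunk_len <= overlap:
        raise ValueError("chunk_len must be greater than overlap")
    chunks = []
    step = chunk_len - overlap
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)
        chunks.append(Chunk(start, end))
        if end >= duration:
            break
        start += step
    return chunks


def stitch(chunks: list[Chunk], per_chunk_segments: list[list[dict]], overlap: float) -> list[dict]:
    """Shift chunk-local timestamps to file time and keep each segment once."""
    stitched = []
    last = len(chunks) - 1
    for i, (chunk, segments) in enumerate(zip(chunks, per_chunk_segments)):
        # Ownership boundaries sit halfway through each overlap region.
        lo = chunk.start if i == 0 else chunk.start + overlap / 2
        hi = float("inf") if i == last else chunk.end - overlap / 2
        for seg in segments:
            start = seg["start"] + chunk.start
            end = seg["end"] + chunk.start
            midpoint = (start + end) / 2
            if lo <= midpoint < hi:
                stitched.append({**seg, "start": start, "end": end})
    return stitched
```

With these assumed defaults, a 70-second file is planned as three windows (0 to 30, 25 to 55, 50 to 70), and a segment transcribed by both of the first two windows is kept only from the window that owns its midpoint.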
What you get out of the box
- OpenAI-compatible /v1/audio/transcriptions endpoint (multipart upload)
- response_format: json, text, srt, vtt, verbose_json
- Speaker labels via diarization (set diarize=true)
- PII redaction with replacement tokens (set redact=true)
- Long-form support: files up to several hours, chunked and stitched
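To make "replacement tokens" concrete, here is a minimal sketch of what redaction does to a string. It deliberately swaps in regular expressions for the classifier the recipe uses, so it only illustrates the output format. The [REDACTED_EMAIL] token matches the sample output on this page; the [REDACTED_PHONE] token, both patterns, and the redact function name are assumptions for the example.

```python
import re

# Simplified patterns for illustration; a real PII pass uses a classifier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str) -> str:
    """Replace emails and phone numbers with fixed replacement tokens."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    text = PHONE_RE.sub("[REDACTED_PHONE]", text)  # token name is hypothetical
    return text
```

Because the tokens are fixed strings, downstream code can count or filter redactions without knowing what was removed.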
Example prompt
In Pipes:
Build a transcription pipeline for our recorded support calls.
Use Whisper large-v3 with diarization and PII redaction.
Output should be SRT subtitles plus a JSON transcript with speaker labels.
Quick example
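from openai import OpenAI

# Point the OpenAI SDK at the RunInfra endpoint.
client = OpenAI(base_url="https://api.runinfra.ai/v1", api_key="YOUR_RUNINFRA_API_KEY")

with open("call.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="your-pipeline-id",
        file=f,
        response_format="verbose_json",
        # diarize and redact are not standard OpenAI parameters, so pass them via extra_body.
        extra_body={"diarize": True, "redact": True},
    )

# Recent openai SDK versions return segments as objects, not dicts, so read attributes.
for segment in transcript.segments:
    print(f"[{getattr(segment, 'speaker', 'unknown')}] {segment.text}")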
Output shape
{
  "text": "Hello, I'm calling about [REDACTED_EMAIL] order...",
  "segments": [
    { "start": 0.0, "end": 2.1, "speaker": "Speaker 1", "text": "Hello, I'm calling about [REDACTED_EMAIL] order..." }
  ],
  "language": "en"
}
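If you need both a JSON transcript and SRT subtitles from a single request, one option is to request verbose_json and derive the SRT locally from the segments. The sketch below assumes segments are plain dicts in the shape shown above (for example, after parsing the raw JSON response); srt_timestamp and segments_to_srt are hypothetical helper names, and prefixing each cue with the speaker label is a formatting choice for the example, not something the endpoint is documented to do.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments with start, end, text, and optional speaker."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        speaker = seg.get("speaker")
        text = f"{speaker}: {seg['text']}" if speaker else seg["text"]
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

Requesting response_format="srt" directly remains the simpler route when only subtitles are needed.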
Deeper details
See runinfra.ai/use-cases/transcription for the marketing page with per-minute cost math and supported audio formats.