Speech-to-text transcription using the Zerfoo OpenAI-compatible API server.
- Loads a Whisper GGUF model via `inference.LoadFile`
- Implements the `serve.Transcriber` interface to bridge the model to the API server
- Starts an in-process HTTP server using `serve.NewServer` with `serve.WithTranscriber`
- Sends the audio file to `/v1/audio/transcriptions` using a multipart form upload
- Prints the transcription result
This demonstrates embedding a full OpenAI-compatible transcription service inside a Go application. The same endpoint works with any OpenAI client library.
Requires a Whisper-architecture GGUF model file.
```sh
# Build
go build -o audio-transcription ./examples/audio-transcription/

# Transcribe an audio file
./audio-transcription --model path/to/whisper.gguf --audio recording.wav

# With language hint
./audio-transcription --model path/to/whisper.gguf --audio recording.mp3 --language en

# With GPU
./audio-transcription --model path/to/whisper.gguf --device cuda --audio recording.wav
```

| Flag | Default | Description |
|---|---|---|
| `--model` | (required) | Path to a Whisper GGUF model file |
| `--audio` | (required) | Path to an audio file (WAV, MP3, FLAC, OGG) |
| `--device` | `cpu` | Compute device: `"cpu"` or `"cuda"` |
| `--language` | (auto) | Optional language hint (e.g., `"en"`, `"fr"`) |