Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
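As a minimal sketch of the header format described above, the helper below builds the standard HTTP `Authorization` header. The function name `bearer_headers` is hypothetical and not part of this API.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Return the HTTP Authorization header in the form 'Bearer <token>'."""
    token = token.strip()
    if not token:
        # An empty token would produce a malformed header, so fail early.
        raise ValueError("auth token must be non-empty")
    return {"Authorization": f"Bearer {token}"}


if __name__ == "__main__":
    # "my-secret-token" is a placeholder, not a real credential.
    print(bearer_headers("my-secret-token"))
```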
Body
multipart/form-data
Audio file to transcribe, provided as a file upload or a public HTTP/HTTPS URL. Supported formats: .wav, .mp3, .m4a, .webm, .flac.
Model to use for transcription
Available options:
openai/whisper-large-v3
Optional ISO 639-1 language code. If auto is provided, the language is auto-detected.
Example:
"en"
Optional text to bias decoding.
The format of the response
Available options:
json, verbose_json
Sampling temperature between 0.0 and 1.0
Required range:
0 <= x <= 1
Controls the level of timestamp detail in verbose_json. Only used when response_format is verbose_json. Can be a single granularity or an array to get multiple levels.
Available options:
segment, word
Example:
["word", "segment"]
Response
OK
The transcribed text
Example:
"Hello, world!"