Audio to Text

Converts an audio file at a given URL to text using AI. Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, and webm. Use this to transcribe voice notes and other audio files, or to extract spoken text from videos. It uses Gemini 1.5 Pro, with OpenAI Whisper as a fallback.

Input Schema

Parameter	Type	Description
audio_url	string	Publicly accessible URL of the audio file to transcribe.
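
Because the tool fetches the file from a public URL, it can help to sanity-check the value of audio_url before sending it. The following is a minimal client-side sketch, not the tool's own validation: the function name looks_like_supported_audio_url is hypothetical, and it assumes the format can be inferred from the file extension in the URL path, which may not hold for every hosting service.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Formats listed in the tool description.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def looks_like_supported_audio_url(audio_url: str) -> bool:
    """Hypothetical pre-check: http(s) URL whose path ends in a listed extension."""
    parsed = urlparse(audio_url)
    # Require an absolute http or https URL with a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Look only at the path, so query strings such as ?token=... are ignored.
    extension = PurePosixPath(parsed.path).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

A URL that fails this check may still be accepted by the tool (for example, a link without a file extension), so treat the result as a hint rather than a guarantee.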

REST Endpoint

This tool does not currently have a dedicated public REST endpoint; it is available through the MCP server and Assistant tools.

Example MCP Call:


{
  "name": "audio_to_text",
  "arguments": {
    "audio_url": "https://example.com/audio.mp3"
  }
}
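
MCP clients normally send tool invocations as JSON-RPC 2.0 requests with the method tools/call, where the name and arguments shown above become the params object. Most MCP client libraries build this envelope for you; the sketch below only illustrates the shape. The function name build_tools_call_request and the default request id are assumptions made for illustration.

```python
import json


def build_tools_call_request(audio_url: str, request_id: int = 1) -> str:
    """Wrap an audio_to_text call in a JSON-RPC 2.0 tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,  # any id unique within the session
        "method": "tools/call",
        "params": {
            "name": "audio_to_text",
            "arguments": {"audio_url": audio_url},
        },
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_tools_call_request("https://example.com/audio.mp3"))
```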

Response:


{
  "success": true,
  "text": "The transcribed text goes here..."
}
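
A caller will usually want to check the success flag before using the text field. The following is a minimal sketch of that check, assuming the response arrives as a JSON string in the shape shown above. The shape of a failed response is not documented here, so the function (a hypothetical helper named extract_transcript) simply raises with the raw payload when success is not true.

```python
import json


def extract_transcript(raw_response: str) -> str:
    """Return the transcribed text, or raise if the call did not succeed."""
    data = json.loads(raw_response)
    # Assumption: any value other than true means the transcription failed.
    if data.get("success") is not True:
        raise RuntimeError(f"Transcription failed: {data!r}")
    text = data.get("text")
    if not isinstance(text, str):
        raise ValueError("Response has no 'text' string field")
    return text
```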