generateAudio Tool
The generateAudio tool lets AI assistants generate audio from text (Text-to-Speech) using AI providers. It currently fully supports two providers: gemini (the default) and vercel (which interfaces with OpenAI’s TTS).
Use Cases
- Converting text responses or generated stories into playable audio files.
- Narrating content for multimedia creation.
- Creating voice messages or podcast-style content within the conversation.
Parameters
The tool accepts the following parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | The text to convert to speech. |
| provider | string | No | AI provider to use for generation. gemini is the default and recommended. |
| voice | string | No | The voice to use. For Gemini: Aoede, Charon, Fenrir, Kore, Puck. For Vercel: alloy, echo, etc. |
| format | string | No | The audio format. Defaults to wav for Gemini, and mp3 for Vercel. |
| model | string | No | The TTS model to use. E.g., gpt-4o-mini-tts or tts-1. |
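The documented defaults can be illustrated with a small sketch. The helper name `build_audio_args` is hypothetical and not part of the tool; it only shows how the defaults from the table (gemini as provider, wav for Gemini, mp3 for Vercel) might combine into a parameter object. Rejecting unknown providers is an assumption made here for illustration.

```python
from typing import Optional

# Default audio format per provider, as stated in the parameters table.
DEFAULT_FORMAT = {"gemini": "wav", "vercel": "mp3"}


def build_audio_args(
    text: str,
    provider: str = "gemini",
    voice: Optional[str] = None,
    format: Optional[str] = None,  # mirrors the tool's parameter name
    model: Optional[str] = None,
) -> dict:
    """Hypothetical helper: assemble generateAudio parameters with documented defaults."""
    if not text:
        raise ValueError("text is required")
    # Assumption: only the two documented providers are accepted.
    if provider not in DEFAULT_FORMAT:
        raise ValueError(f"unsupported provider: {provider}")
    args = {
        "text": text,
        "provider": provider,
        "format": format or DEFAULT_FORMAT[provider],
    }
    # Optional parameters are included only when explicitly set.
    if voice is not None:
        args["voice"] = voice
    if model is not None:
        args["model"] = model
    return args
```

Calling `build_audio_args("Hello", provider="vercel", voice="alloy")` would fill in mp3 as the format, while omitting the provider falls back to gemini and wav.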
Behavior
- Credit Check: Validates and deducts the required credits for audio generation (similar to image generation pricing).
- Audio Generation: Calls the /api/ai/audio endpoint with the provided parameters.
- Storage: The generated audio is automatically uploaded to the site’s storage assets.
- Instance Asset: If an instance_id is provided in the context, the audio is also linked to the instance assets.
- Response: Returns the public URL of the generated audio along with metadata (mimeType, format, etc.).
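The order of the steps listed above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the function name, the credit arguments, and the three callables are hypothetical stand-ins for the credit system, the /api/ai/audio call, and the storage layer.

```python
from typing import Callable, Optional


def generate_audio_flow(
    args: dict,
    credits: int,
    cost: int,
    call_audio_endpoint: Callable[[dict], bytes],
    upload_asset: Callable[[bytes], str],
    link_to_instance: Callable[[str, str], None],
    instance_id: Optional[str] = None,
) -> dict:
    """Illustrative ordering of the steps; every callable is a hypothetical stand-in."""
    # 1. Credit check: stop early when the balance cannot cover the cost.
    #    How the real tool reports this case is not documented here.
    if credits < cost:
        raise RuntimeError("insufficient credits")
    # 2. Audio generation: stands in for the call to /api/ai/audio.
    audio = call_audio_endpoint(args)
    # 3. Storage: upload the audio and obtain its public URL.
    url = upload_asset(audio)
    # 4. Instance asset: link only when an instance_id is present in the context.
    if instance_id is not None:
        link_to_instance(instance_id, url)
    # 5. Response: the public URL plus basic fields.
    return {
        "success": True,
        "provider": args.get("provider", "gemini"),
        "audio_url": url,
    }
```

Passing the dependencies in as callables keeps the sketch runnable with simple stubs; credit deduction itself is left out because its mechanics are not described in this page.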
Example Usage
Basic TTS
{
  "text": "Hello world! This is an AI generated voice.",
  "provider": "gemini",
  "voice": "Puck"
}
Response
The tool returns an object containing the status and the URL to the generated audio file:
{
  "success": true,
  "provider": "gemini",
  "audio_url": "https://YOUR_STORAGE_URL/site_id/generated_audio_12345.wav",
  "mimeType": "audio/wav",
  "metadata": {
    "format": "wav",
    "voice": "Puck",
    "generated_at": "2024-05-15T12:00:00.000Z"
  },
  "message": "Successfully generated audio using gemini. Audio is saved and ready to use. URL: https://..."
}
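A caller consuming this response might check the success flag before using the URL. The sketch below assumes the response arrives as a JSON string shaped like the example above; the function name `extract_audio_url` is hypothetical, and the failure shape is not documented, so falling back to the message field is an assumption.

```python
import json


def extract_audio_url(raw: str) -> str:
    """Return audio_url from a generateAudio response, or raise if generation failed."""
    response = json.loads(raw)
    if not response.get("success"):
        # Assumption: a failed response carries a human-readable "message".
        raise RuntimeError(response.get("message", "audio generation failed"))
    return response["audio_url"]
```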