The GoogleTTS node uses the Google Cloud Text-to-Speech API to generate realistic spoken audio from text or SSML input.
It supports multiple languages, voices, and audio formats (MP3, OGG, WAV/LINEAR16), allowing you to convert workflow-generated text into spoken output for use in video narration, alerts, audio messages, or AI-powered assistants.
This node connects to Google Cloud using a stored ThirdParty - Google Cloud credential and outputs an audio file that can be played, downloaded, or embedded in later steps.
| Field | Type | Description | Required |
|---|---|---|---|
| ThirdParty - Google Cloud | Third-Party Token | Select or provide the Google Cloud third-party credential (OAuth token). | ✅ |
| Input Type | Picklist | Determines whether input is Text or SSML. | ✅ |
| Text | Text (multi-line) | Plain text input to synthesize. Used if Input Type = Text. | ✅ if Text mode |
| SSML | Text (multi-line) | SSML-formatted input for advanced control of pauses, tone, etc. | ✅ if SSML mode |
| Voice | Text | The full voice name (e.g., en-US-Neural2-D, en-GB-News-L). | ✅ |
| Audio Format | Picklist | Output format: MP3, OGG_OPUS, or LINEAR16 (WAV). | ✅ |
| Speaking Rate | Number | Adjusts the voice speed. Default = 1.0 (normal). | ❌ |
| Pitch (st) | Number | Pitch adjustment in semitones. Default = 0.0. | ❌ |
| Volume Gain (dB) | Number | Volume adjustment in decibels. Default = 0.0. | ❌ |
| Sample Rate (Hz) | Number | Overrides the output sample rate (e.g., 24000, 48000). Optional for MP3 and OGG_OPUS; required when Audio Format = LINEAR16. | ✅ if LINEAR16 |
| Chunk Size (chars) | Number | Max characters per TTS request. Default = 4500. | ❌ |
| File Name | Text | Base name for the output audio file (without extension). | ✅ |
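
For orientation, the fields above line up closely with the request body of Google Cloud's `text:synthesize` REST method. The sketch below shows one plausible mapping. How the node actually builds its request is not documented here, so the function name `build_synthesize_request` and the field-to-key mapping are assumptions made for illustration.

```python
def build_synthesize_request(voice, audio_format, text=None, ssml=None,
                             speaking_rate=1.0, pitch=0.0,
                             volume_gain_db=0.0, sample_rate_hz=None):
    """Build a JSON-serializable body for the text:synthesize REST method.

    Hypothetical helper: the mapping from node fields to API keys is assumed.
    """
    if (text is None) == (ssml is None):
        raise ValueError("Provide exactly one of text or ssml")

    # Assumption: the language code is the first two parts of the voice name,
    # e.g. "en-US-Neural2-D" -> "en-US".
    language_code = "-".join(voice.split("-")[:2])

    audio_config = {
        "audioEncoding": audio_format,   # "MP3", "OGG_OPUS", or "LINEAR16"
        "speakingRate": speaking_rate,
        "pitch": pitch,                  # semitones
        "volumeGainDb": volume_gain_db,
    }
    if sample_rate_hz is not None:
        audio_config["sampleRateHertz"] = sample_rate_hz

    return {
        "input": {"text": text} if text is not None else {"ssml": ssml},
        "voice": {"languageCode": language_code, "name": voice},
        "audioConfig": audio_config,
    }
```

Calling `build_synthesize_request("en-US-Neural2-D", "MP3", text="Hello")` yields a dictionary that can be serialized with `json.dumps` and posted with whatever HTTP client the surrounding code already uses.
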
| Variable | Type | Description |
|---|---|---|
| Audio.File | String | Full path of the generated audio file. |
| Audio.Voice | String | The Google TTS voice used. |
| Audio.Format | String | The format of the generated audio file (mp3, ogg, or wav). |
| Audio.Chunks | Number | Number of chunks the input text was divided into. |
| Audio.Characters | Number | Total number of characters processed. |
| taskMessage | String | Completion message. |
| statusReturn | String | Completed if successful, or Fail if an error occurred. |
```json
{
  "Audio": {
    "File": "C:\\MinuteView\\Working\\narration.mp3",
    "Voice": "en-US-Neural2-D",
    "Format": "MP3",
    "Chunks": 2,
    "Characters": 8400
  },
  "taskMessage": "Google TTS synthesis completed successfully",
  "statusReturn": "Completed"
}
```

| Setting | Example |
|---|---|
| ThirdParty - Google Cloud | GoogleCloud-ProdToken |
| Input Type | Text |
| Text | Welcome to MinuteView Automations, your engineering workflow companion. |
| Voice | en-US-Neural2-D |
| Audio Format | MP3 |
| Speaking Rate | 1.1 |
| Pitch (st) | 0.5 |
| File Name | welcome_message |
Result: → Generates welcome_message.mp3 in the working folder with a natural American English voice.

| Setting | Example |
|---|---|
| **Input Type** | `SSML` |
| **SSML** | `<speak>Hello there! <break time="500ms"/> <emphasis level="moderate">Welcome to MinuteView Automations.</emphasis></speak>` |
| **Voice** | `en-GB-Neural2-A` |
| **Audio Format** | `LINEAR16` |
| **Sample Rate (Hz)** | `48000` |
| **File Name** | `intro_voice` |

**Result:**
→ Produces `intro_voice.wav` with SSML-controlled timing and emphasis.
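
To illustrate what a `.wav` result like the one above involves, the following sketch wraps raw PCM audio in a WAV container using Python's standard `wave` module. It assumes the chunks are raw 16-bit mono samples with any per-chunk headers already removed. The function name `pcm_to_wav_bytes` is hypothetical, and the node's own implementation may differ.

```python
import io
import wave


def pcm_to_wav_bytes(pcm_chunks, sample_rate_hz=48000):
    """Concatenate raw 16-bit mono PCM chunks into one in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)               # mono (assumed)
        wav.setsampwidth(2)               # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate_hz)
        for chunk in pcm_chunks:
            wav.writeframes(chunk)        # header sizes are patched on close
    return buffer.getvalue()
```

Writing the returned bytes to `intro_voice.wav` produces a file that standard audio players can open.
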
---
## Notes
- The node automatically splits long text into chunks of at most **Chunk Size** characters (default 4500).
- All chunks are concatenated into a single output file.
- If **Audio Format** = `LINEAR16`, the node creates a valid `.wav` file with a PCM header.
- Language code is inferred automatically from the **Voice** name (e.g., `en-US-Neural2-D` → `en-US`).
- Compatible with any voice available in Google Cloud TTS.
- Input text can be in any language that the selected voice supports.
- Requires a valid **Google Cloud third-party token** (OAuth-based).
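
The chunking described in the notes above can be sketched as follows. The node's exact splitting rule is not documented, so this is only one plausible approach: it prefers to break at the end of a sentence, falls back to the last space, and hard-splits only when no break point exists. The name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text, max_chars=4500):
    """Split text into pieces of at most max_chars characters."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")

    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        window = remaining[:max_chars]
        # Prefer the last sentence end inside the window.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1                      # keep the punctuation with the chunk
        else:
            cut = window.rfind(" ")       # otherwise break at the last space
        if cut <= 0:
            cut = max_chars               # no break point found: hard split
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk would then be synthesized separately and the resulting audio joined in order, which is why **Audio.Chunks** can be greater than 1 for long inputs.
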
---
## Common Use Cases
| Scenario | Description |
|-----------|--------------|
| 🔊 **AI Narration** | Convert dynamically generated text to audio for training, documentation, or presentations. |
| 💬 **Chatbot Voice Output** | Generate speech responses for AI chat or assistant workflows. |
| 🎧 **Audio Alerts** | Play or send system notifications with voice messages. |
| 🗣️ **Multilingual Output** | Generate speech in any supported language and accent. |
---
## Status Messages
| Status | Description |
|---------|-------------|
| **Completed** | Audio synthesis completed successfully. |
| **Fail** | An error occurred (invalid credentials, empty input, or API failure). |
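
A downstream script can branch on these statuses. The sketch below assumes the node's output variables are available as a JSON document shaped like the sample output earlier on this page. How later steps actually read variables depends on the platform, and the helper name `audio_file_if_completed` is hypothetical.

```python
import json


def audio_file_if_completed(result_json):
    """Return Audio.File when statusReturn is Completed; raise otherwise."""
    result = json.loads(result_json)
    if result.get("statusReturn") != "Completed":
        # taskMessage is used as the error text when it is present.
        raise RuntimeError(result.get("taskMessage", "GoogleTTS failed"))
    return result["Audio"]["File"]
```
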
---
## Error Handling
The node logs detailed workflow messages in case of failure:
- Missing or invalid Google Cloud token
- Empty text or SSML input
- Invalid voice or audio format
- API error or connection issue
- Output file write error
Check the **Workflow Log** for `[ERROR] GoogleTTS failed:` entries to diagnose issues.
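
If the Workflow Log is exported as text, the relevant entries can be filtered with a short script. Only the `[ERROR] GoogleTTS failed:` prefix comes from this page. The rest of the log line layout is not specified, so the helper `find_tts_errors` and any sample lines passed to it are illustrative assumptions.

```python
ERROR_MARKER = "[ERROR] GoogleTTS failed:"


def find_tts_errors(log_lines):
    """Return the text that follows the error marker on each matching line."""
    errors = []
    for line in log_lines:
        _, marker, detail = line.partition(ERROR_MARKER)
        if marker:                        # marker is empty when not found
            errors.append(detail.strip())
    return errors
```
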
---
## Example Workflow Integration
```mermaid
graph LR
A[AI Generate Response] --> B[Google TTS]
B --> C[Save File to SharePoint]
B --> D[Play Audio Notification]
```

Category: AI & Google Cloud

Task Name: GoogleTTS