Imagine sending a voice message to a Telegram bot and getting it back in Morgan Freeman’s voice. Or cloning your own voice to create professional voiceovers without recording every single time.
Sounds like magic? It’s not. It’s automation, and you’re about to build it yourself in the next 15 minutes.
This guide walks you through creating a voice cloning Telegram bot using n8n, ElevenLabs, and Google Drive. No coding experience needed, just follow along.
What You’ll Build Today
By the end of this tutorial, you’ll have a fully functional Telegram bot that:
- Receives voice messages from you via Telegram
- Processes them through ElevenLabs’ AI voice cloning technology
- Transforms your voice into any pre-selected voice (celebrity, professional narrator, or your own cloned voice)
- Saves the cloned audio to Google Drive automatically
- Sends the AI-generated voice message back to you in seconds
This isn’t just a fun project. Content creators use this for voiceovers, podcasters create intro variations, and marketers generate multiple voice versions for A/B testing. The possibilities are endless once you automate voice cloning.
What You’ll Need Before Starting
Before we dive into building, gather these prerequisites:
1. n8n Account
- Sign up at n8n.io (free plan works fine)
- This is your automation platform where the magic happens
2. ElevenLabs Account
- Create an account at elevenlabs.io
- Grab your API key from the settings
- Note: Free tier gives you 10,000 characters per month
3. Telegram Bot Token
- Open Telegram and search for @BotFather
- Create a new bot and save the token (looks like: 123456789:ABCdefGHIjklMNOpqrsTUVwxyz)
4. Google Drive Account
- Any Gmail account works
- We’ll use this to store your cloned audio files
Got everything? Great! Let’s start building.
Understanding the Workflow Architecture
Before jumping into node configuration, let’s understand what happens behind the scenes:
- Trigger: You send a voice message to your Telegram bot
- Security Check: The workflow verifies it’s actually you (not a random person)
- Message Router: Identifies whether you sent text, voice, or image
- Audio Retrieval: Downloads your voice message from Telegram servers
- Voice Cloning: Sends audio to ElevenLabs for AI voice transformation
- Cloud Storage: Uploads the cloned audio to Google Drive
- Response: Sends the AI-generated voice back to you via Telegram
Think of it as an assembly line for voice cloning. Each node is a workstation handling one specific task. Now let’s build each station.
Step 1: Setting Up Your n8n Workspace
Log into your n8n account and create a new workflow:
- Click the “+” button or “New Workflow” on your dashboard
- Name your workflow something memorable like “Voice Clone Bot”
- You’ll see a blank canvas; this is where we’ll build
The interface might look intimidating at first, but it’s actually super intuitive. You’ll drag nodes (think of them as LEGO blocks) onto this canvas and connect them. Each node performs one action in your automation.
Step 2: Add the Telegram Trigger Node
The trigger is what starts your automation. Here’s how to set it up:
- Click the “+” button on the canvas
- Search for “Telegram Trigger” in the node panel
- Drag it onto your canvas
Now configure the node:
Connecting Your Telegram Bot:
- Click “Create New Credentials”
- Paste your Telegram Bot Token (the one you got from @BotFather)
- Name it something like “My Voice Clone Bot”
- Click “Save”
Trigger Settings:
- Under “Updates”, select “message”
- This tells the bot to activate whenever someone sends any message
- Leave other fields as default
Why this matters: Without this trigger, your workflow just sits idle. The Telegram Trigger is the doorbell that alerts your automation “Hey, someone sent a message!”
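For reference, here is roughly what a voice-message update looks like when it reaches the workflow, written as a Python dictionary. The field names follow the Telegram Bot API; every value is a made-up placeholder, and the helper name summarize_update is only for illustration.

```python
# Illustrative Telegram update for a voice message (all values are placeholders).
sample_update = {
    "update_id": 100000001,
    "message": {
        "message_id": 42,
        "from": {"id": 111111111, "is_bot": False, "first_name": "Alex"},
        "chat": {"id": 111111111, "type": "private"},
        "date": 1700000000,
        "voice": {
            "file_id": "EXAMPLE_FILE_ID",
            "duration": 3,
            "mime_type": "audio/ogg",
        },
    },
}


def summarize_update(update: dict) -> dict:
    """Pull out the three fields that the later steps rely on."""
    message = update.get("message", {})
    return {
        "sender_id": message.get("from", {}).get("id"),
        "chat_id": message.get("chat", {}).get("id"),
        "voice_file_id": message.get("voice", {}).get("file_id"),
    }


print(summarize_update(sample_update))
```

The sender ID feeds the security check, the voice file ID feeds the download step, and the chat ID tells the bot where to reply.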
Step 3: Add Security with the Sanitize Node
Security is critical. You don’t want random people using your ElevenLabs credits. This node blocks unauthorized users:
- Add a “Code” node after the Telegram Trigger
- Rename it to “Sanitize” (click the node name to edit)
- Select “Run Once for All Items” in settings
Important: In the node’s code, replace the example ID 7773500682 with your own Telegram user ID (you can get it from @userinfobot)
What this does: It checks every incoming message. If the sender’s ID matches yours, it proceeds. If not, it stops immediately. Think of it as a bouncer checking IDs at a club entrance.
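The check itself is a single comparison. The sketch below shows the logic in standalone Python; it is not the node’s actual code, and the names ALLOWED_USER_ID, is_authorized, and sanitize are made up for this example. Inside n8n the same comparison would run against the items the Code node receives.

```python
ALLOWED_USER_ID = 7773500682  # example ID from the text; replace with your own


def is_authorized(update: dict, allowed_user_id: int = ALLOWED_USER_ID) -> bool:
    """Return True only when the message sender matches the allowed ID."""
    sender = update.get("message", {}).get("from", {})
    return sender.get("id") == allowed_user_id


def sanitize(updates: list) -> list:
    """Keep updates from the allowed user and drop everything else."""
    return [update for update in updates if is_authorized(update)]


# Two incoming updates: one from the owner, one from a stranger.
incoming = [
    {"message": {"from": {"id": 7773500682}, "text": "hello"}},
    {"message": {"from": {"id": 12345}, "text": "let me in"}},
]
print(len(sanitize(incoming)))
```

Dropping unauthorized items, rather than raising an error, means strangers simply get no response and no ElevenLabs credits are spent on them.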
Step 4: Create the Message Router (Switch Node)
Not every message you send will be a voice message. Sometimes you might accidentally send text or images. The Switch node routes different message types:
- Add a “Switch” node after Sanitize
- This acts like a traffic controller—directing different messages to different paths
Configure three conditions:
Condition 1 – Text Messages:
- Click “Add Routing Rule”
- Set condition: {{ $json.message.text }} exists
- Rename output to “Text”
- This catches regular text messages (we won’t use this path, but it prevents errors)
Condition 2 – Voice Messages:
- Add another routing rule
- Set condition: {{ $json.message.voice.file_id }} exists
- Rename output to “Audio”
- This is the golden path—where voice messages go
Condition 3 – Images:
- Add third routing rule
- Set condition: {{ $json.message.photo[0] }} exists
- Rename output to “Image” (or “Immagine”, the Italian equivalent)
- Future-proofs your workflow if you want to add image processing later
The Switch node is basically an if-else statement in visual form. “If voice exists, go this way. If text exists, go that way.”
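Written out as code, the three rules might look like the following. This is a plain-Python sketch of the routing logic, not something you paste into n8n; the function name route_message and the “Unmatched” fallback label are assumptions added for the example.

```python
def route_message(update: dict) -> str:
    """Return the name of the Switch output an update would take."""
    message = update.get("message", {})
    if message.get("text"):
        return "Text"
    if message.get("voice", {}).get("file_id"):
        return "Audio"
    if message.get("photo"):
        return "Image"
    # Anything else (stickers, locations, ...) matches no rule.
    return "Unmatched"


print(route_message({"message": {"voice": {"file_id": "EXAMPLE_FILE_ID"}}}))
```

Only the “Audio” result continues to the next step; the other outputs are left unconnected.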
Step 5: Download Voice Messages (Get Audio Node)
When someone sends a voice message on Telegram, it doesn’t automatically include the audio file—just a reference ID. This node downloads the actual audio:
- Add a “Telegram” node connected to the “Audio” output of the Switch node
- Rename it to “Get audio”
Configure the node:
- Credentials: Use the same Telegram credentials from Step 2
- Resource: Select “File”
- Operation: Select “Get”
- File ID: Insert this expression: {{ $json.message.voice.file_id }}
That expression pulls the file ID from the incoming message and tells Telegram “Give me the actual audio file for this ID.”
Why this step exists: Telegram doesn’t send full files in webhook triggers for bandwidth reasons. You have to explicitly request the file download.
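Under the hood, fetching a file from Telegram takes two requests: getFile turns the file ID into a file path, and a second URL serves the bytes. Here is a minimal sketch with Python’s standard library, assuming the public Bot API endpoints; the function names are invented for this example, and the n8n node does the equivalent for you.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.telegram.org"


def get_file_url(bot_token: str, file_id: str) -> str:
    """URL of the getFile call that resolves a file_id to a file_path."""
    query = urllib.parse.urlencode({"file_id": file_id})
    return f"{API_ROOT}/bot{bot_token}/getFile?{query}"


def download_url(bot_token: str, file_path: str) -> str:
    """URL that serves the file content for a resolved file_path."""
    return f"{API_ROOT}/file/bot{bot_token}/{file_path}"


def fetch_voice(bot_token: str, file_id: str) -> bytes:
    """Resolve the file_id, then download the audio (two network calls)."""
    with urllib.request.urlopen(get_file_url(bot_token, file_id)) as resp:
        file_path = json.load(resp)["result"]["file_path"]
    with urllib.request.urlopen(download_url(bot_token, file_path)) as resp:
        return resp.read()
```

Calling fetch_voice needs a real bot token and a file ID from a recent message, so it is shown here without a sample run.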
Step 6: Clone the Voice with ElevenLabs
This is where the AI magic happens. We’ll send your audio to ElevenLabs and get back a voice-cloned version:
- Add an “HTTP Request” node after “Get audio”
- Rename it to “Generate cloned audio”
Configure the request:
Basic Settings:
- Method: POST
- URL: https://api.elevenlabs.io/v1/speech-to-speech/21m00Tcm4TlvDq8ikWAM
- Replace 21m00Tcm4TlvDq8ikWAM with your chosen Voice ID from ElevenLabs
Headers Section:
- Enable “Send Headers”
- Add header:
  - Name: xi-api-key
  - Value: YOUR_ELEVENLABS_API_KEY (get this from elevenlabs.io/settings)
- Add another header:
  - Name: Accept
  - Value: audio/mpeg
Body Section:
- Content Type: Select “Multipart Form Data”
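- Add parameter:
  - Name: output_format
  - Value: mp3_44100_128
  - Note: ElevenLabs’ API reference lists output_format as a URL query parameter rather than a form field, so if this setting seems to be ignored, append ?output_format=mp3_44100_128 to the URL instead
- Add another parameter:
  - Parameter Type: Form Binary Data
  - Name: audio
  - Input Data Field Name: data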
Response Options:
- Go to “Options” → “Response”
- Response Format: Select “File”
- Output Property Name: cloned_voice.mp3
What’s happening here: Your workflow uploads the Telegram voice message to ElevenLabs, which runs it through their AI voice cloning model, transforms it into your selected voice, and returns the new audio as an MP3 file.
Pro tip: Different Voice IDs create different voices. Experiment with various voices in the ElevenLabs library to find ones you like!
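If you are curious what the HTTP Request node sends, the sketch below builds an equivalent request with Python’s standard library. It uses the URL and headers listed above; the helper names, the voice.oga filename, and the random boundary are assumptions for illustration, and output_format is omitted for brevity.

```python
import urllib.request
import uuid

API_URL = "https://api.elevenlabs.io/v1/speech-to-speech/{voice_id}"


def build_multipart(audio: bytes, filename: str, boundary: str) -> bytes:
    """Encode a single binary field named 'audio' as multipart/form-data."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + audio + tail


def build_request(api_key: str, voice_id: str, audio: bytes) -> urllib.request.Request:
    """Assemble the POST request without sending it."""
    boundary = uuid.uuid4().hex
    return urllib.request.Request(
        API_URL.format(voice_id=voice_id),
        data=build_multipart(audio, "voice.oga", boundary),
        method="POST",
        headers={
            "xi-api-key": api_key,
            "Accept": "audio/mpeg",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# To actually send it (needs a real key and voice ID):
#   cloned = urllib.request.urlopen(build_request(key, voice_id, audio)).read()
```

Keeping the request builder separate from the network call makes it easy to inspect the URL, headers, and body before spending any credits.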
Step 7: Save to Google Drive
Storing your cloned audio in Google Drive creates a permanent library and makes sharing easier:
- Add a “Google Drive” node after “Generate cloned audio”
- Rename it to “Upload file”
Connect Google Drive:
- Click “Create New Credentials”
- Authorize n8n to access your Google Drive
- Follow the OAuth popup (it’s safe—n8n just needs upload permissions)
Configure Upload:
- Operation: Upload (default)
- File Name: cloned_{{ $json.result.file_path.match(/[^/]+$/)[0] }}
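- This produces filenames like “cloned_file_12.oga”: the name is the last segment of Telegram’s file path (for example voice/file_12.oga), so it keeps that extension even though the content is now MP3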
- Drive: Select “My Drive”
- Folder: Choose or create a folder called “Elevenlabs” (or whatever you prefer)
The expression in the filename pulls the original Telegram filename and adds “cloned_” as a prefix. This keeps your Drive organized.
Why Google Drive: You could skip this and send files directly back to Telegram, but Drive gives you a searchable library of all your cloned voices. Plus, you can share them easily or use them in other projects.
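To see what the filename expression does, here is the same regular expression in Python. The function name cloned_filename and the sample path are illustrative; only the pattern [^/]+$ and the “cloned_” prefix come from the expression above.

```python
import re


def cloned_filename(file_path: str) -> str:
    """Take the last path segment and prefix it with 'cloned_'."""
    match = re.search(r"[^/]+$", file_path)
    if match is None:
        raise ValueError(f"no filename in path: {file_path!r}")
    return "cloned_" + match.group(0)


print(cloned_filename("voice/file_12.oga"))
```

The pattern matches everything after the final slash, so a path with no folders passes through unchanged apart from the prefix.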
Step 8: Send Cloned Audio Back to Telegram
The final step—sending your AI-cloned voice back to you:
- Add a “Telegram” node after “Upload file”
- Rename it to “Send an audio file”
Configure the node:
- Credentials: Use your existing Telegram credentials
- Resource: Message
- Operation: Send Audio
- Chat ID: {{ $('Telegram Trigger').item.json.message.chat.id }}
- This expression automatically sends the audio back to whoever triggered the workflow (you!)
- Binary Data: Toggle ON
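- Binary Property: data (use the name of the binary property that holds the cloned audio on the incoming item; if you set a different Output Property Name in Step 6, enter that name here)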
The Chat ID expression is clever—it grabs the chat ID from the original trigger, so the bot knows exactly where to reply.
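In plain Python, the reply step boils down to reading the chat ID from the original trigger data and calling the Bot API’s sendAudio method. This is a sketch of the lookup and the endpoint only; the function names are made up, and the audio upload itself is left to the n8n node.

```python
TELEGRAM_API = "https://api.telegram.org"


def reply_chat_id(trigger_item: dict) -> int:
    """Same lookup as the expression: message.chat.id of the trigger item."""
    return trigger_item["message"]["chat"]["id"]


def send_audio_endpoint(bot_token: str) -> str:
    """URL of the Bot API sendAudio method for this bot."""
    return f"{TELEGRAM_API}/bot{bot_token}/sendAudio"


def send_audio_fields(trigger_item: dict) -> dict:
    """Non-file form fields; the audio bytes travel in a separate 'audio' part."""
    return {"chat_id": str(reply_chat_id(trigger_item))}
```

Reading the chat ID from the trigger, rather than from the previous node, matters because the Drive upload step outputs file metadata, not the original Telegram message.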
Step 9: Connect All Nodes and Test
Now connect all your nodes in order:
- Telegram Trigger → Sanitize
- Sanitize → Switch
- Switch (Audio output) → Get audio
- Get audio → Generate cloned audio
- Generate cloned audio → Upload file
- Upload file → Send an audio file
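One way to double-check your wiring is to write the connections down and walk them from the trigger. The sketch below does that in Python using the node names from this guide; the CONNECTIONS mapping and the execution_order helper are illustrative, not an n8n export format.

```python
# Each node maps to the node it feeds (Switch uses its "Audio" output).
CONNECTIONS = {
    "Telegram Trigger": "Sanitize",
    "Sanitize": "Switch",
    "Switch": "Get audio",
    "Get audio": "Generate cloned audio",
    "Generate cloned audio": "Upload file",
    "Upload file": "Send an audio file",
}


def execution_order(start: str, connections: dict) -> list:
    """Follow the connections from the start node until the chain ends."""
    order = [start]
    seen = {start}
    while order[-1] in connections:
        next_node = connections[order[-1]]
        if next_node in seen:
            raise ValueError(f"loop detected at {next_node!r}")
        order.append(next_node)
        seen.add(next_node)
    return order


print(" -> ".join(execution_order("Telegram Trigger", CONNECTIONS)))
```

If the walk stops before “Send an audio file”, a connection is missing somewhere along the chain.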
Activate your workflow:
- Click the toggle switch in the top right corner (it should turn blue)
- This puts your bot online and listening for messages
Testing Time:
- Open Telegram and go to your bot
- Send a voice message (say anything, such as “Testing my voice clone bot!”)
- Wait 10-20 seconds (the AI needs time to process)
- You’ll receive your voice message back, but in your selected ElevenLabs voice
- Check your Google Drive—the cloned audio should be saved there too
If it works—congratulations! You’ve just built an AI voice cloning bot. If not, check the execution log in n8n. It shows exactly where the workflow failed and why.
Final Verdict
You just built a sophisticated AI voice cloning system that most people think requires a team of developers.
But here’s the truth: automation tools like n8n make powerful AI accessible to everyone.
What you created in 15 minutes would have taken weeks of coding just a few years ago.
Now you have a bot that clones voices, stores them in the cloud, and delivers them instantly.
The real question isn’t what you built today, but what you’ll build tomorrow with these skills.