Build XiaoZhi AI ESP32-S3 Voice Chatbot — Custom Wake Word, Face Animation & Full Tutorial 2026

ESP32-S3 · XiaoZhi AI Pro · 2026

Build Your Own XiaoZhi AI Voice Chatbot with Custom Wake Word

Build a fully featured AI voice assistant on ESP32-S3 — featuring animated OLED face expressions, custom voice wake word detection using the ESP-SR MultiNet model, NTP-synchronized time, OTA firmware updates, Wi-Fi diagnostics via MCP protocol, and remote reboot. No coding required. Flash from browser via ESP Web Tools in minutes. Supports Hindi, Hinglish, English and multilingual voice conversations powered by Qwen and DeepSeek LLMs on the xiaozhi.me platform.

~55 min build · No coding needed · Free firmware · Voice wake word · Face animation · Chrome / Edge

What's New in XiaoZhi AI Pro

This is a complete upgrade over the original. The core components stay the same, except that the ESP32-S3 replaces the standard ESP32 to support on-device wake word detection, and the firmware gains eight major new capabilities spanning display, audio, and connectivity.

Face Animation · Custom Wake Word · NTP Clock Sync · Wi-Fi Diagnostics MCP · Remote Reboot MCP · Hide/Show Face Toggle · OTA Theme Flash · Voice LED Control

Contents

01 Introduction · 02 What's New · 03 Features · 04 Components · 05 Buy Kit · 06 Circuit Diagram · 07 Build Guide · 08 Flash Firmware · 09 Wi-Fi Setup · 10 Create Account · 11 Pair Device · 12 Configure AI · 13 Custom Wake Word · 14 FAQ & Troubleshooting · 15 Done!

Introduction — What is XiaoZhi AI?

Open-source voice-first AI on ESP32 — fully customizable

XiaoZhi AI (小智) is an open-source ESP32 firmware project (MIT license, 26K+ GitHub stars) that transforms an ESP32-S3 microcontroller into a cloud-connected AI voice assistant. It handles real-time streaming speech recognition (ASR), natural language processing via large language models like Qwen and DeepSeek, and expressive text-to-speech (TTS) output — all tied to the xiaozhi.me platform. Unlike closed ecosystems like Amazon Alexa or Google Assistant, XiaoZhi gives you complete control over your AI's personality, language, custom wake word, and behavior through an intuitive web console.

Version 2 of this tutorial upgrades to the ESP32-S3, which unlocks hardware-accelerated wake word detection through the Espressif ESP-SR MultiNet model — the standard ESP32 cannot run this. It also adds an animated OLED face with blinking and talking expressions, NTP time synchronization, voice-controlled face/text toggle, and two new MCP (Model Context Protocol) tools — live Wi-Fi diagnostics and remote device reboot. This is now a genuinely powerful, production-ready open source voice AI you can build at home for under $20 in components. Perfect for students, makers, hobbyists, and IoT enthusiasts.

What's New in XiaoZhi AI Pro

Major upgrades over the previous ESP32 version

Version 2 is not just a component swap — it brings eight significant new features that make the experience far more natural and capable. Here's what changed and why it matters.

Animated Face on OLED

The 128×64 display now shows expressive animated eye sprites — blinking, talking, and idle states — giving your AI a physical personality.

Custom Wake Word

No more button press to activate. Just say your chosen wake phrase — like "Hey Maxon" — and the AI wakes instantly. Powered by ESP32-S3's MultiNet model.

NTP Time Sync

The device synchronizes with internet time servers automatically. The AI now knows the correct time and date for every interaction.

Wi-Fi Diagnostics MCP

Ask "What's my Wi-Fi strength?" or "What's my IP address?" and the AI responds with live network details fetched via a custom MCP tool.

Remote Reboot MCP

Say "Reboot yourself" and the AI confirms, then triggers a firmware restart. No need to physically reach the device for a reset.

Face Toggle by Voice

Say "Hide your face and show text" or "Show your face only" — the display mode switches between face-only and text-only layouts on demand.

Voice-Activated LED Control

Say "Turn on the light" or "Set LED to blue" — control an external RGB LED strip or indicator via voice commands through a custom MCP GPIO tool.

OTA Theme Updates

Customize wake word, face animation sprites, and display themes wirelessly. Generate assets.bin in the console and flash over Wi-Fi — no USB cable required.

Features & Voice Commands

Everything your XiaoZhi AI V2 can do

Your XiaoZhi AI Pro responds to natural voice commands for nearly every function. Here is everything it can do, organized by capability.

Custom Wake Word

Say your phrase to activate — no button needed. Powered by MultiNet on ESP32-S3.

Animated Face

Expressive OLED animations: idle, talking, blinking states that respond to conversation.

Face Toggle by Voice

"Show only your face" / "Switch to text mode" — toggled with a voice command.

NTP Time Sync

"What time is it?" returns the accurate current time, synced automatically over internet.

Wi-Fi Diagnostics

"What's my IP?" / "How strong is my Wi-Fi?" — answered via the network MCP tool.

Remote Reboot

"Reboot yourself" — the AI confirms then triggers a firmware restart via MCP tool.

Volume Control

"Set volume to 50%" or "Louder" — adjusts output level mid-conversation.

Live Weather

"What's the weather in Mumbai?" fetches current conditions via MCP weather service.

Music Playback

"Play Hindi music" or "Play some jazz" — triggers MCP music service.

Conversation Memory

Short-term memory keeps context across turns for natural multi-step conversations.

Multilingual

Hindi, English, Hinglish — adapts to whichever language you speak in.

OTA Updates

Update firmware and wake word settings wirelessly via the XiaoZhi console. No USB needed.

Voice Command Quick Reference

Voice Command	What It Does
"Set volume to 50%"	Adjust audio output level mid-conversation
"Reboot yourself"	Remotely restart the device firmware via MCP tool
"Turn on the light"	Control RGB LED strip or GPIO devices by voice
"Show your face only"	Toggle OLED between face animation and text mode
"What time is it?"	Returns accurate time via NTP sync
"What's my IP?"	Shows Wi-Fi diagnostics: SSID, signal strength, IP address
"Weather in Delhi"	Fetches live conditions via MCP weather service
"Play some music"	Triggers MCP music playback service
"Tell me a joke"	Built-in humor, facts, and general knowledge responses
"Update your theme"	OTA firmware and wake word updates via cloud console

Expandable via MCP (Model Context Protocol): XiaoZhi supports custom MCP tools, so you can add GPIO control, relay switching, sensor reading, and more, all triggerable by voice. The Wi-Fi diagnostics and reboot tools in this version are examples of custom MCP integrations.
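
To make the MCP idea concrete, here is a minimal Python sketch of how a request such as "Reboot yourself" could travel as an MCP tool call. MCP messages are JSON-RPC 2.0 and tools are invoked with the `tools/call` method. The tool names (`device.reboot`, `wifi.status`), the handler functions, and every returned value are hypothetical placeholders for illustration, not the names or data used by the XiaoZhi firmware.

```python
import json

# Hypothetical tool handlers; real firmware would query the hardware instead.
def wifi_status(arguments):
    return {"ssid": "HomeNet", "rssi_dbm": -58, "ip": "192.168.1.42"}

def device_reboot(arguments):
    return {"rebooting": True, "delay_ms": arguments.get("delay_ms", 1000)}

# Hypothetical tool registry: tool name -> handler.
TOOLS = {"wifi.status": wifi_status, "device.reboot": device_reboot}

def make_tool_call(request_id, name, arguments=None):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }

def handle_request(request):
    """Dispatch a tools/call request and wrap the outcome as a JSON-RPC response."""
    params = request.get("params", {})
    handler = TOOLS.get(params.get("name"))
    if request.get("method") != "tools/call" or handler is None:
        # This sketch reports unknown methods and unknown tools the same way.
        return {"jsonrpc": "2.0", "id": request.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    result = handler(params.get("arguments", {}))
    # Tool results carry a list of content items; plain text is the simplest kind.
    return {"jsonrpc": "2.0", "id": request.get("id"),
            "result": {"content": [{"type": "text", "text": json.dumps(result)}],
                       "isError": False}}
```

In this sketch the language model decides which tool to call, and the device only needs a small registry of handlers, which is why adding a relay or sensor tool is mostly a matter of registering one more function.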

Full Build Walkthrough — Watch & Build Along

Complete video guide from breadboard to talking AI assistant

Follow along with the complete video build guide. The video covers every step: component overview, breadboard wiring, firmware flashing via browser, Wi-Fi configuration, account setup, device pairing, AI personality configuration, and custom wake word setup on the ESP32-S3.

Video chapters 0:00 — Intro & Components · 2:30 — Breadboard Circuit Assembly · 7:15 — Flashing Firmware via Browser · 9:40 — Wi-Fi Configuration · 11:20 — Account Setup & Pairing · 13:00 — AI Personality Configuration · 16:30 — Custom Wake Word Setup · 19:15 — Testing & Demo

Required Components

Everything needed — all available online or locally

The component list is nearly identical to the previous version. The only change is swapping the standard ESP32 dev board for an ESP32-S3 dev board, which provides the hardware acceleration needed for on-device wake word detection.

Component	Description	Qty	Note
ESP32-S3 Dev Board	38-pin variant · Main controller	1	V2 UPGRADE
INMP441 I2S Mic	Digital I2S microphone · Captures voice	1	—
MAX98357A Amplifier	I2S DAC + amplifier module	1	—
2W 4Ω Speaker	Small speaker for audio output	1	—
0.96" OLED (128×64)	I2C display · Face animations + text	1	NEW USE
Breadboard 400-tie	For prototyping the circuit	2	—
Jumper Wires M-M	Male-to-male · Assorted colors	~25	—
USB-C Data Cable	Must support data, not charge-only	1	—
Computer (Chrome/Edge)	Required for Web Serial flashing	1	—

Why ESP32-S3 specifically? The ESP32-S3 includes dedicated hardware for running the MultiNet speech model used by XiaoZhi for custom wake word detection. Standard ESP32 cannot run this model reliably.

Get the Ready-Made XiaoZhi AI S3 Kit

Don't want to source and assemble parts? We sell a fully assembled, 100% tested Version 2 kit with ESP32-S3. Power it on and start talking immediately.

Order on WhatsApp — 8535889926

100% Tested · Ready to Use · Fast Delivery · Support Included · Wake Word Pre-configured

Circuit Diagram

ESP32-S3 + INMP441 + MAX98357A + OLED + Speaker

The circuit is virtually identical to the previous version — the only hardware difference is using ESP32-S3 instead of the standard ESP32. All four modules connect the same way: I2S microphone, I2S amplifier, I2C OLED, and speaker.

Circuit Diagram — ESP32-S3 Version

Full circuit — ESP32-S3 + INMP441 microphone + MAX98357A amplifier + 0.96" OLED display

Refer to the diagram above for all pin assignments. The image contains the complete wiring reference for every component.

Breadboard Assembly Guide

Step-by-step hardware build — takes about 15–20 minutes

Refer to the assembled build photo below. Follow the circuit diagram for all wiring — the image shows the final layout with all modules connected.

Assembled Build Reference

Ready to power on: Once assembled, plug the ESP32-S3 into your computer using a USB-C data cable and proceed to firmware flashing.

Flash the XiaoZhi Firmware

Browser-based · No drivers · No IDE · ~2 minutes

The XiaoZhi firmware is flashed directly from your browser using ESP Web Tools. You need Google Chrome or Microsoft Edge on a desktop or laptop. Mobile browsers will not work.

Read this before clicking Flash: Do not plug in your ESP32-S3 yet. Click "Start Flashing" first, note which COM ports are listed, then plug in while holding the BOOT button. The new port that appears is your device.

Click "Start Flashing" below — do not plug in USB yet

The port selection dialog opens. Note currently listed ports.

Hold BOOT button on ESP32-S3 and plug in USB-C

Keep holding BOOT while plugging in. A new COM port will appear.

Select the new COM port that just appeared

That is your ESP32-S3 in bootloader mode. Click Connect.

Click "Install" → check "Erase Device" → Next → Install

Erasing is strongly recommended for a clean flash.

Wait approximately 2 minutes — do not unplug

Progress bar shows flashing status. Once complete, release BOOT and click Done.

One-Click XiaoZhi Firmware Flasher

No software to install — flashes directly from your browser via Web Serial

Chrome / Edge · No Drivers · ~2 Minutes · Erase & Flash

No new COM port appearing? Your USB cable may be charge-only. Try a different cable that explicitly supports data transfer. If it still doesn't appear, install CP2102 or CH340 drivers.
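
The "note the ports, plug in, pick the new one" routine can also be scripted. The sketch below is one possible way to do it in Python, assuming the third-party pyserial package is installed (`pip install pyserial`). The helper names are illustrative; only `serial.tools.list_ports.comports()` comes from pyserial itself.

```python
def new_ports(before, after):
    """Return port names present in `after` but not in `before`, sorted."""
    return sorted(set(after) - set(before))

def list_port_names():
    """List serial port names via pyserial (third-party, assumed installed)."""
    # Imported lazily so the pure diff helper above works without pyserial.
    from serial.tools import list_ports
    return [port.device for port in list_ports.comports()]

def wait_for_device():
    """Interactive helper: snapshot ports, prompt for plug-in, report new ones."""
    before = list_port_names()
    input("Hold BOOT, plug in the ESP32-S3, then press Enter... ")
    return new_ports(before, list_port_names())
```

If `wait_for_device()` returns an empty list, the computer did not register a new serial device at all, which points to the cable or the driver and not to the flasher.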

Connect to Your Wi-Fi Network

Via captive portal at 192.168.4.1 · 2.4 GHz networks only

After successful flashing, the ESP32-S3 boots and creates a temporary Wi-Fi hotspot named XiaoZhi-XXXX. Connect to this hotspot from your phone or computer, then follow these steps to configure your home Wi-Fi credentials through the device's built-in web portal.

Connect to "XiaoZhi-XXXX" hotspot

Open your phone or laptop Wi-Fi settings and connect to the network named XiaoZhi-XXXX. No password is required — this is the ESP32-S3's temporary access point.

Open 192.168.4.1 in your browser

Once connected, open a browser and go to 192.168.4.1. The ESP32-S3 configuration portal loads. You will see the main dashboard with device status and available tabs.

Configuration Portal — Main Dashboard

Go to the Advanced tab and select your timezone

Tap the Advanced tab in the top navigation. Find the Timezone dropdown and select your local timezone (e.g. Asia/Kolkata for India). This ensures the device reports accurate time in responses.

Advanced Tab — Timezone Selection

Click Save Configuration, then switch to WiFi Config

After selecting your timezone, click the Save Configuration button. Once saved, switch to the WiFi Config tab to set up your home network connection.

WiFi Config Tab — Select Network & Enter Password

Select your network, enter password, and click Connect

In the WiFi Config tab, click your home Wi-Fi network from the available list (2.4 GHz only). Type your Wi-Fi password in the password field and click the Connect button. The device will attempt to connect.

Wait for the success confirmation message

Once connected successfully, the portal displays a green success message. The ESP32-S3 restarts automatically and joins your home network. The OLED display will show a pairing code — do not disconnect power during this process.

Connection Successful

2.4 GHz networks only: The ESP32-S3 does not support 5 GHz Wi-Fi. If your router broadcasts both bands under the same name, confirm in the router settings that the 2.4 GHz band is enabled, or give it a separate SSID.
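
The timezone chosen in the Advanced tab matters because NTP servers report UTC only; the local clock is derived by applying an offset. The Python sketch below shows that conversion under simple assumptions: a fixed UTC offset in minutes (330 for Asia/Kolkata, which has no daylight saving) and an illustrative function name. It does not describe how the firmware itself implements time sync.

```python
from datetime import datetime, timedelta, timezone

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800

def ntp_to_local(ntp_seconds, utc_offset_minutes):
    """Convert an NTP timestamp (whole seconds) to a timezone-aware local time."""
    utc = datetime.fromtimestamp(ntp_seconds - NTP_UNIX_OFFSET, tz=timezone.utc)
    local_zone = timezone(timedelta(minutes=utc_offset_minutes))
    return utc.astimezone(local_zone)

# Midnight UTC on 2026-01-01, expressed as NTP seconds.
sample = NTP_UNIX_OFFSET + 1_767_225_600
india = ntp_to_local(sample, 330)  # 05:30 local time on the same date
```

With the wrong timezone selected, the device would still sync correctly but would answer "What time is it?" with a clock that is off by the difference between the two offsets.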

Create Your XiaoZhi Account

Free account at xiaozhi.me · Google login recommended

Before you can pair your device, you need a free account on the XiaoZhi AI platform. This is where you manage your agent's personality, language, voice, and advanced settings.

Reconnect to your home Wi-Fi network

Switch your device back from the XiaoZhi hotspot to your regular network.

Open xiaozhi.me in your browser

Click "Console" in the navigation.

Sign up using Google

The fastest method. Your account is active immediately.

XiaoZhi.me — Homepage

Pair Your ESP32-S3 Device

Enter the 6-digit code scrolling on the OLED display

After creating your account and signing into the console, you'll see the Agents page. Now link your physical ESP32-S3 to your account using the pairing code displayed on the OLED.

Click "+ Add Device" in the Agents console

An input dialog appears for the pairing code.

Read the 6-digit code scrolling on your OLED

The code refreshes every 30 seconds. Type it quickly.

Enter the code and click Confirm

The device links to your account.

Accept the agreement and click "Start Using"

Select the Open Source (Free) tier to continue.

Add Device Dialog — Console

Device paired successfully: Your ESP32-S3 now appears as an agent card in the console. You can see it listed as Online with a green indicator.

Configure Your AI Agent's Personality

Set name, voice, language, system prompt, MCP tools

Click "Configure Role" on your device card. This opens the full configuration panel where you design your AI's identity — its name, voice, language, and behavioral instructions.

Configure Role — XiaoZhi Console

What is "Role Introduction"? This is the system prompt — the core instruction set that defines who your AI is, how it behaves, what language it speaks, and what it knows. It's the AI's personality blueprint.

Settings reference — what each option does:

Setting	What It Controls	Recommended
Assistant Name	What the AI calls itself in greetings	Any name — e.g., "Maxon" or "Jarvis"
Dialogue Language	Primary language for voice output	Switch to preferred language
Voice Role	TTS voice and accent selection	Try several and pick the best fit
Role Introduction	Full personality and behavior system prompt	Use the generator tool below
Memory Type	How the AI retains conversation context	Short-term Memory
Language Model	AI engine powering responses	Xiaozhi Lite (free, fast)
Voice Recognition Speed	Speech-to-text processing speed	Normal
Character Speech Speed	How fast the AI talks	Normal or slightly slower
Official Services (MCP)	Built-in tools: Weather, Music, Jokes	Enable Weather, Music

Use the interactive system prompt generator below to craft a detailed, personalized instruction set for your AI.

Set Up Custom Wake Word

Say your own phrase to wake the AI — no button press needed

This is the most powerful upgrade in Version 2. Instead of pressing the BOOT button every time, you can wake your AI by saying a custom phrase — like "Hey Maxon". The ESP32-S3 runs the MultiNet model locally to detect your phrase.

The setup is done entirely through the XiaoZhi web console — no code, no flashing, just a few clicks. Here is the complete flow:

Manage Devices

Click this on your agent card in the console

Customize

Visible only when device is Online

Custom Wake Word

Select this tab in Theme Design

Generate assets.bin

Flash to device over Wi-Fi

Detailed step-by-step:

Click "Manage Devices" on your agent card

In the Agents section of the console, find your device card and click the "Manage Devices" button. This opens the device management panel.

Click "Customize" — only visible when device is Online

On the device entry, you'll see a "Customize" button next to Theme Settings. This button only appears when your ESP32-S3 is powered on and connected to Wi-Fi.

Manage Devices → Customize Button

Click "Manage Devices" then "Customize" to open the Xiaozhi AI Customization tool

Step 1 — Chip & Screen Configuration loads automatically

The Xiaozhi AI Customization tool opens. If your device is connected and active, it auto-detects your chip model (ESP32-S3) and screen resolution (128×64). Click Next.

Step 1 — Chip Configuration Auto-detected

Device configuration auto-loads: ESP32-S3, Screen 128×64px, RGB565 color format

Auto-detection not working? If the chip is not detected automatically, expand "Manual Configuration" and set Chip Model to ESP32-S3 and screen dimensions to 128×64 manually.

Step 2 — Theme Design: Click "Custom Wake Word"

The Theme Design page opens with four tabs: Wake Word Config, Font Config, Emoji Collection, Chat Background. Click the "Custom Wake Word" button.

Step 2 — Select Custom Wake Word

Theme Design → Wake Word Config → Click "Custom Wake Word"

Enter your Wake Word Name and Wake Command

The Custom Wake Word Settings section expands. Fill in two fields:
• Wake Word Name — a label for this wake word (e.g., Maxon)
• Wake Command — the exact phrase you will speak (e.g., Hey Maxon)
You can name it anything you want. Keep the phrase 2–3 syllables for best recognition.

Custom Wake Word Settings

Wake Word Name: Maxon · Wake Command: Hey Maxon · Recognition Model: MultiNet6 (English)

Select Recognition Model: choose English for English wake phrases

In the "Select Recognition Model" dropdown, choose MultiNet6 (English) for English wake commands, or MultiNet6 (Chinese) for Chinese commands. The sensitivity threshold default of 20 is fine — lower value means more sensitive.

Select Recognition Model

Select MultiNet6 (English) for English wake words — available only on ESP32-S3

Click Next → Step 3 Preview → Click "Generate assets.bin"

You arrive at the Step 3 Preview page. The device preview shows a 128×64 OLED simulation. The Configuration Summary confirms your wake word setting. Click the green "Generate assets.bin" button.

Step 3 — Preview & Generate

Preview confirms wake word is "Maxon" — click Generate assets.bin to proceed

Confirm configuration → Click "Start Generate"

A confirmation dialog shows the full configuration summary: Chip Model ESP32-S3, Resolution 128×64, Wake Word Maxon, and the list of files to be included (index.json ~1KB, srmodels.bin ~1.2MB). Click "Start Generate".

Generate assets.bin — Confirmation Dialog

Configuration summary confirms all settings — click "Start Generate" to build the binary

Wait for generation — then click "Flash to Device Online"

The assets.bin file generates in ~2 seconds (3.61 MB). When the success dialog appears with a green checkmark, make sure your ESP32-S3 is powered on and online, then click the blue "Flash to Device Online" button.

assets.bin Ready — Flash to Device

3.61 MB assets.bin generated in 2.2s — click "Flash to Device Online" to send via OTA

OTA flashing — device says "Updating the System"

The progress bar shows the OTA upload in real time. Your ESP32-S3 will speak "Updating the System" and the OLED may flash. Do not power off the device during this process.

OTA Flashing in Progress

Flashing in progress — do not close the window or power off the device

Wait 1–2 minutes — device restarts and wake word is active

After flashing completes, the device reboots automatically. Once it reconnects to Wi-Fi, your custom wake word is active. Say your phrase to test it — the AI should respond immediately without pressing any button.

Wake word is active: Say "Hey Maxon" (or your custom phrase) and the AI wakes and responds. No button press needed. The OLED face animation activates on wake.

Tips for best wake word recognition: Use a 2–3 syllable phrase. Speak clearly at normal conversational volume, about 30–100 cm from the microphone. If false triggers occur, increase the sensitivity threshold slightly (higher = less sensitive).

Frequently Asked Questions & Troubleshooting

Fix common issues with ESP32-S3, wake word, display, audio, and more

ESP32-S3 not detected / no COM port appearing

If your computer does not detect the ESP32-S3 when connected via USB, the most common cause is a charge-only cable. Use a USB cable that explicitly supports data transfer. If using a known-good data cable, install or update the CP2102 / CH340 / CH9102 USB-to-UART drivers. On Windows, open Device Manager and check if the port appears under "Ports (COM & LPT)" — if it shows as an unknown device, the driver is missing. Also try a different USB port, preferably USB 2.0. Hold the BOOT button while plugging in and while clicking "Connect" in the web flasher.
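
When a port does show up but you are unsure which driver it needs, the USB vendor and product IDs usually identify the bridge chip. The sketch below maps commonly reported VID:PID pairs to chip families and parses the hardware-ID string format that pyserial exposes (for example `USB VID:PID=1A86:55D4 SER=...`). The ID table is an assumption based on typical boards, so verify it against your own device; the function names are illustrative.

```python
import re

# Commonly reported USB IDs; clones and newer chip revisions may differ.
KNOWN_BRIDGES = {
    (0x10C4, 0xEA60): "Silicon Labs CP210x",
    (0x1A86, 0x7523): "WCH CH340",
    (0x1A86, 0x55D4): "WCH CH9102",
    (0x303A, 0x1001): "Espressif native USB Serial/JTAG",
}

def parse_hardware_id(text):
    """Extract (vid, pid) from a hardware-ID string, or return None if absent."""
    match = re.search(r"VID:PID=([0-9A-Fa-f]{4}):([0-9A-Fa-f]{4})", text)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)

def identify_bridge(hardware_id):
    """Name the USB-to-UART bridge behind a port, if it is in the table."""
    ids = parse_hardware_id(hardware_id)
    if ids is None:
        return "not a USB serial device"
    return KNOWN_BRIDGES.get(ids, "unknown USB serial device")
```

A result of "unknown USB serial device" does not mean the board is faulty; it only means the IDs are not in this small table.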

Wake word not responding / not detected

First verify the device is powered on and connected to Wi-Fi (the OLED should show the face animation or status). Ensure the INMP441 microphone is wired correctly: VDD→3.3V, GND→GND, SD→GPIO32, WS→GPIO25, SCK→GPIO26, and L/R→GND on the mic module itself. These GPIO numbers match the original ESP32 build; the ESP32-S3 has no GPIO22 to GPIO25 and normally reserves GPIO26 to GPIO32 for flash and PSRAM, so if the circuit diagram for your S3 board shows different pins, follow the diagram. Speak clearly at a distance of 30–100 cm. A standard (non-S3) ESP32 does NOT support the wake word; you must use an ESP32-S3 for MultiNet wake word detection. Try rebooting the device. If it still does not work, regenerate assets.bin with the wake word and flash it again via OTA. Check that the recognition model is set to "MultiNet6 (English)" for English commands in the XiaoZhi Customize tool.
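
A wiring table can be sanity-checked against the ESP32-S3 GPIO map before anything is powered up. The Python sketch below encodes the GPIO ranges from Espressif's documentation (numbers 0 to 21 and 26 to 48 exist; 26 to 32 normally serve flash and PSRAM; 33 to 37 are also taken on modules with octal flash or PSRAM, such as N16R8). The function and variable names are illustrative, and the check is deliberately simple; it does not cover strapping pins or the USB pins.

```python
# ESP32-S3 GPIO map as documented by Espressif: numbers 0-21 and 26-48 exist.
EXISTING = set(range(0, 22)) | set(range(26, 49))
FLASH_PSRAM = set(range(26, 33))  # normally wired to the module's SPI flash/PSRAM
OCTAL_LINES = set(range(33, 38))  # also taken when flash or PSRAM is octal (e.g. N16R8)

def check_wiring(wiring, octal_memory=True):
    """Map each problematic signal name to a short explanation."""
    problems = {}
    seen = {}  # gpio number -> first signal that claimed it
    for signal, gpio in wiring.items():
        if gpio not in EXISTING:
            problems[signal] = f"GPIO{gpio} does not exist on the ESP32-S3"
        elif gpio in FLASH_PSRAM:
            problems[signal] = f"GPIO{gpio} is normally reserved for flash/PSRAM"
        elif octal_memory and gpio in OCTAL_LINES:
            problems[signal] = f"GPIO{gpio} is used by octal flash/PSRAM"
        elif gpio in seen:
            problems[signal] = f"GPIO{gpio} is already used by {seen[gpio]}"
        seen.setdefault(gpio, signal)
    return problems

# Microphone pins as quoted in the answer above, and the OLED pins from the next one.
mic = {"SD": 32, "WS": 25, "SCK": 26}
oled = {"SCL": 15, "SDA": 4}
```

Running `check_wiring(mic)` flags all three microphone signals, while `check_wiring(oled)` comes back empty. That suggests the microphone numbers were carried over from the original ESP32 build, so treat the circuit diagram for your S3 board as the authority.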

OLED display is blank / no face animation

Check the I2C wiring: OLED VCC→3.3V, GND→GND, SCL→GPIO15, SDA→GPIO4. Ensure the OLED address matches the firmware default (0x3C for most SSD1306 displays). Try adjusting the display contrast or enabling the OLED reset pin in the firmware configuration. If the display worked before but stopped, check for loose Dupont wires on the breadboard. This tutorial assumes a 128×64 panel; if you are using a 128×32 OLED, select that resolution in the Customize tool instead. If the OLED shows text but no face animation, the emoji assets may not have been flashed correctly; regenerate and re-flash assets.bin.
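
One frequent source of confusion with the address check: many SSD1306 modules print 0x78 on the board, which is the same address as 0x3C written in 8-bit form (the 7-bit address shifted left by one, with the read/write bit appended). The short Python sketch below shows the arithmetic, along with the frame size of a 1-bit-per-pixel display; the function names are illustrative.

```python
def i2c_8bit_addresses(addr_7bit):
    """Return (write_byte, read_byte) for a 7-bit I2C address."""
    if not 0 <= addr_7bit <= 0x7F:
        raise ValueError("7-bit I2C addresses range from 0x00 to 0x7F")
    write_byte = addr_7bit << 1      # R/W bit = 0
    read_byte = write_byte | 1       # R/W bit = 1
    return write_byte, read_byte

def framebuffer_bytes(width, height):
    """Bytes needed for one full frame on a monochrome (1 bit per pixel) panel."""
    return width * height // 8

primary = i2c_8bit_addresses(0x3C)    # (0x78, 0x79)
alternate = i2c_8bit_addresses(0x3D)  # (0x7A, 0x7B)
```

So a module labeled 0x78 should respond at 0x3C, and one labeled 0x7A at 0x3D. A 128×64 frame takes 1024 bytes and a 128×32 frame takes 512, which is why selecting the wrong resolution produces a garbled or half-filled image.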

No audio / distorted sound from speaker

Verify the MAX98357A wiring: VIN→3.3V or 5V (check module specs), GND→GND, LRC→GPIO33, DIN→GPIO27, BCLK→GPIO14. GPIO27 and GPIO33 are carried over from the original ESP32 layout; on ESP32-S3 N16R8 modules they are normally used by flash and PSRAM, so follow the circuit diagram if it shows different pins. The speaker must be connected to the SPK+ and SPK− terminals, not to GND. For distorted audio, reduce the volume by saying "Set volume to 30%" or lower the gain in the audio codec configuration. Ensure the power supply can deliver at least 1A; a weak power source causes audio crackling. Keep audio signal wires (DIN, BCLK, LRC) away from power wires to reduce interference. If there is static noise, try adding a 100µF capacitor across the amplifier VIN and GND.

Device keeps restarting / boot loop

Continuous reboots are usually caused by insufficient power or memory overflow. Use a USB port that can deliver at least 1A — avoid USB hubs and front-panel ports. If using an ESP32 (budget version) without PSRAM, disconnect the OLED display to free memory. For ESP32-S3, ensure PSRAM is properly configured: the board must have at least 8MB PSRAM (N16R8 module). Check that the partition table matches your firmware version — v2 firmware requires v2 partition tables (8MB or 16MB). If you see "Brownout detector was triggered" in the serial monitor, the power supply is insufficient. Try reflashing the firmware with the "Erase Device" option checked.

Cannot connect to Wi-Fi / "Network Error" message

ESP32-S3 supports only 2.4 GHz Wi-Fi — it cannot connect to 5 GHz networks. If your router broadcasts both bands under the same SSID, the device may try to connect to the 5 GHz band and fail. Temporarily disable the 5 GHz band on your router, or create a separate 2.4 GHz SSID. Ensure the Wi-Fi password is correct (there is no show/hide toggle in the portal). If the captive portal at 192.168.4.1 does not load, disable mobile data on your phone and reconnect to the XiaoZhi-XXXX hotspot. For persistent issues, try rebooting the router and the device. If the error says "Network Error" or "Unable to Connect", the cloud server at xiaozhi.me may be temporarily unreachable — check your internet connection.

Flash fails / "overlap at address" error

This error occurs when the srmodels.bin file is too large for the allocated partition space. It typically happens when using a 4 MB flash chip with v2 firmware — v2 requires 8 MB or 16 MB flash. If you are using the v1 firmware branch, make sure you selected the correct partition table (v1/4m.csv for 4 MB flash). For ESP32-S3 boards with 2 MB PSRAM (like Super Mini S3), disable the English Speech Commands Model in menuconfig and set PSRAM to Quad Mode. The simplest fix is to use the pre-compiled firmware from the GitHub releases page that matches your board type exactly.
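
The overlap condition itself is simple arithmetic: an image that starts at one flash address and is larger than the gap to the next address collides with its neighbor. The Python sketch below checks a flashing plan for such collisions and for images that run past the end of the chip. All addresses, sizes, and file names in the example are hypothetical and chosen only to reproduce the situation described above; they are not the actual XiaoZhi layout.

```python
def find_overlaps(images, flash_bytes):
    """Check a flashing plan.

    images: iterable of (flash_address, size_in_bytes, name) tuples.
    Returns a list of human-readable problems (empty if the plan fits).
    """
    problems = []
    ordered = sorted(images)  # sort by flash address
    for (addr, size, name), (next_addr, _, next_name) in zip(ordered, ordered[1:]):
        if addr + size > next_addr:
            problems.append(f"{name} overlaps {next_name} at address {next_addr:#x}")
    for addr, size, name in ordered:
        if addr + size > flash_bytes:
            problems.append(f"{name} runs past the end of flash ({flash_bytes:#x})")
    return problems

MB = 1024 * 1024

# Hypothetical plan: a ~1.2 MB model squeezed into a slot that ends at 0x100000.
tight_plan = [
    (0x0, 0x5000, "bootloader.bin"),
    (0x8000, 0xC00, "partition-table.bin"),
    (0x10000, 1_258_291, "srmodels.bin"),
    (0x100000, 0x2F0000, "xiaozhi.bin"),
]

# Same images with the application moved up to make room for the model.
roomy_plan = tight_plan[:3] + [(0x200000, 0x2F0000, "xiaozhi.bin")]
```

In this made-up example the tight plan fails on any chip because the model collides with the application, while the roomy plan fits in 8 MB but not in 4 MB. That mirrors why the larger flash size is required once the speech model is included.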

Audio cuts off mid-sentence / TTS stops early

Audio interruptions during TTS playback are often network-related. The device streams audio, so if the Wi-Fi signal is weak or unstable, audio packets may arrive out of sequence or time out. Move the ESP32-S3 closer to the router and ensure it is on a dedicated 2.4 GHz network. Check for MQTT audio packet sequence warnings in the serial monitor; these indicate packet loss. If a custom MCP tool returns large responses, the TTS buffer may overflow, so try keeping responses concise. On rare occasions, the cloud TTS engine may have a transient issue; try asking the same question again. If the problem persists, test on a different network to rule out ISP issues.
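
To see what a sequence warning is measuring, consider a stream of numbered audio packets. The Python sketch below counts gaps and late arrivals in a list of sequence numbers. It is a generic illustration with a hypothetical function name; the firmware's actual packet format and logging are not described in this guide.

```python
def sequence_report(seq_numbers):
    """Count missing and out-of-order packets in a stream of sequence numbers.

    A packet that arrives late is first counted as missing (when it is
    skipped) and then as out of order (when it finally shows up).
    """
    missing = 0
    out_of_order = 0
    expected = None
    for seq in seq_numbers:
        if expected is None or seq == expected:
            expected = seq + 1          # stream is in step
        elif seq > expected:
            missing += seq - expected   # one or more packets were skipped
            expected = seq + 1
        else:
            out_of_order += 1           # late or duplicate packet
    return {"missing": missing, "out_of_order": out_of_order}

clean = sequence_report([1, 2, 3, 4])
lossy = sequence_report([1, 2, 5, 6])
reordered = sequence_report([1, 2, 4, 3, 5])
```

A clean stream reports zeros, the lossy one reports two missing packets, and the reordered one reports one skipped packet that later arrives out of order. On a weak Wi-Fi link both counters tend to climb together, which is what the audible dropouts correspond to.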

OLED shows "Connecting" forever / stuck on Wi-Fi

If the OLED remains stuck on the connecting screen, the device is unable to establish a Wi-Fi connection. Reset the Wi-Fi configuration by holding the BOOT button for 5 seconds; this clears the stored credentials and restarts the captive portal. Reconnect to the XiaoZhi-XXXX hotspot and re-enter your Wi-Fi details. Make sure your router is broadcasting on the 2.4 GHz band and is within range. If you changed your Wi-Fi password recently, the device still has the old password stored, so use the BOOT long-press reset to clear it. XiaoZhi does not support enterprise networks (WPA2-Enterprise) or networks that require a browser login page; use a personal hotspot instead.

Still having issues? Check the official XiaoZhi documentation at xiaozhi.dev/docs for detailed troubleshooting guides. You can also open an issue on the GitHub repository (26K+ stars, 130+ contributors) or join the community Discord for live help.

Final Setup — Activate Your AI

Save settings and bring your voice assistant to life

Your device is flashed, paired, and configured. Now bring everything together with this final activation sequence. Once completed, your XiaoZhi AI will respond to voice commands, display animated expressions, and be ready for daily use.

Save all settings in the XiaoZhi console

Click Save after configuring the role, personality, and MCP tools.

Hard reset the ESP32-S3

Press the physical RST or EN button on the board to apply all saved settings.

Wait for Wi-Fi and NTP sync

The OLED shows a connecting indicator. Once connected, the face animation appears.

Say your wake word

Speak your custom wake phrase — "Hey Maxon" or whatever you configured. The AI activates and is ready to listen.

Manual Activation (Backup)

Short press on BOOT also wakes the AI if you prefer not to use the wake word.

Auto-Sleep

After a few seconds of silence, the device sleeps to save power. Wake word or BOOT activates again.

Update Personality Anytime

Change role, voice, or language in the console at any time. Hit Save and hard reset to apply.

Change Wake Word Anytime

Repeat the Customize → Generate assets.bin → Flash process to use a different wake phrase.

Settings not applying after save? Always do a hard reset using the physical RST/EN button on the ESP32-S3 board after saving any configuration changes. A software reboot alone does not always apply new settings.

Your AI Voice Assistant is Ready

You've built, flashed, configured, and set up a custom wake word on a fully functional XiaoZhi AI V2. Say your phrase and start talking.

Save in Console → Hard Reset Board → Wait for Connection → Speak Wake Word

Want It Pre-Built & Ready to Go?

Get a fully assembled, tested XiaoZhi AI S3 kit with wake word pre-configured. Power it on and start talking immediately.

Order on WhatsApp — +91 8535889926

Pre-Tested · Fast Shipping · Free Support · Wake Word Configured
