Atom S3R AI Assistant

Atom S3R AI Assistant is a local-first voice assistant built around an M5Stack Atom S3R and the M5 Echo Pyramid audio base. The device acts as a small physical AI appliance: it listens through the microphone array, sends speech to a local backend, receives a generated spoken response, and plays it back through the Echo Pyramid speaker.

What It Does Today

The user speaks to the Atom S3R device.
The device streams microphone audio to a backend over WebSocket.
Whisper transcribes the audio to text.
Node-RED routes the request through the assistant pipeline.
Ollama runs a local Gemma model to generate a concise response.
Kokoro converts the response text into speech.
Node-RED streams the generated PCM audio back to the Atom.
The Atom plays the response through the Echo Pyramid speaker.
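
The steps above can be read as three backend stages chained together: speech to text, text to reply, reply to speech. The following is a minimal sketch of that chaining, not the project's actual Node-RED flow. The class name `VoicePipeline`, its method, and the three stub functions are hypothetical; in the real system the stages would call Whisper, Ollama, and Kokoro.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Hypothetical wrapper that chains the three backend stages."""
    transcribe: Callable[[bytes], str]   # stands in for Whisper
    generate: Callable[[str], str]       # stands in for Ollama running Gemma
    synthesize: Callable[[str], bytes]   # stands in for Kokoro

    def handle_utterance(self, pcm_in: bytes) -> bytes:
        """Turn captured microphone audio into reply audio."""
        text = self.transcribe(pcm_in).strip()
        if not text:
            # Nothing was recognized, so there is nothing to answer.
            return b""
        reply = self.generate(text)
        return self.synthesize(reply)


# Stub stages so the sketch runs without any model installed.
def fake_transcribe(pcm: bytes) -> str:
    return "what time is it"


def fake_generate(text: str) -> str:
    return f"You asked: {text}"


def fake_synthesize(text: str) -> bytes:
    # A real TTS stage would return PCM samples, not UTF-8 text.
    return text.encode("utf-8")


pipeline = VoicePipeline(fake_transcribe, fake_generate, fake_synthesize)
reply_audio = pipeline.handle_utterance(b"\x00\x01" * 160)
```

Keeping each stage behind a plain callable makes it possible to swap a stub for a real service one stage at a time while testing.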

Main Components

Atom S3R Firmware: Wi-Fi, WebSocket audio streaming, playback, LEDs, touch input, display, and local configuration.
M5 Echo Pyramid Base: microphone input, audio codec, speaker output, RGB LED feedback, and touch input.
Node-RED: orchestration and routing for the voice pipeline.
Whisper: local speech recognition.
Ollama and Gemma: local LLM response generation.
Kokoro: local text-to-speech generation.
LXC FastAPI Tools Service: a FastAPI service, hosted in an LXC container, that exposes reusable backend tools for future integrations.
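
As an illustration of how a client might talk to the Ollama component, here is a hedged sketch that builds a non-streaming request for Ollama's `/api/generate` endpoint and extracts the reply text. The host, the model tag `gemma3`, and both helper function names are assumptions for this example; the project itself routes this call through Node-RED and may use a different model tag or endpoint.

```python
import json
import urllib.request

# 11434 is Ollama's default port; the host is an assumption.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Hypothetical model tag; use whichever Gemma tag is pulled locally.
MODEL_NAME = "gemma3"


def build_generate_request(prompt: str,
                           model: str = MODEL_NAME,
                           url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: bytes) -> str:
    """Pull the generated text out of a non-streaming response body."""
    return json.loads(response_body)["response"].strip()


request = build_generate_request("Answer in one short sentence: what is an LXC?")
```

Sending the request is one further step, `urllib.request.urlopen(request).read()`, whose result can be passed to `extract_reply`. It is left out here so the sketch runs without a live Ollama server.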

Architecture

Atom S3R / Echo Pyramid
-> Node-RED WebSocket receiver
-> Whisper ASR
-> Node-RED router and orchestration flow
-> Local tools / OpenClaw / Ollama as needed
-> Kokoro TTS
-> Node-RED WebSocket sender
-> Atom speaker playback
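
The final hops of this chain send raw PCM to the device over WebSocket, which in practice means splitting the audio into small binary frames. The sketch below shows one way to do that. The audio format (16 kHz, 16-bit mono), the 20 ms frame length, and the function name `frame_pcm` are all assumptions chosen for illustration, not values taken from the firmware.

```python
from typing import List


def frame_pcm(pcm: bytes,
              sample_rate: int = 16000,    # assumed sample rate in Hz
              frame_ms: int = 20,          # assumed frame duration
              bytes_per_sample: int = 2    # 16-bit mono, assumed
              ) -> List[bytes]:
    """Split raw PCM into fixed-duration frames for binary WebSocket messages."""
    frame_bytes = sample_rate * frame_ms // 1000 * bytes_per_sample
    # Drop a trailing partial sample so every frame holds whole samples.
    usable = len(pcm) - (len(pcm) % bytes_per_sample)
    pcm = pcm[:usable]
    return [pcm[i:i + frame_bytes] for i in range(0, usable, frame_bytes)]


# One second of silence in the assumed format.
one_second = bytes(16000 * 2)
frames = frame_pcm(one_second)
```

With these assumed defaults each frame is 640 bytes, so one second of audio becomes 50 frames. The last frame may be shorter when the audio length is not a multiple of the frame size.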

Project Status

The core voice pipeline is working. The next major milestone is moving from a conversational prototype to a tool-using assistant by adding reusable backend APIs for real-world tasks.