Atom S3R AI Assistant

Atom S3R AI Assistant is a local-first voice assistant built around an M5Stack Atom S3R and the M5 Echo Pyramid audio base. The device acts as a small physical AI appliance: it listens through the microphone array, streams speech to a local backend, receives a generated spoken response, and plays it back through the Echo Pyramid speaker.

What It Does Today

  1. The user speaks to the Atom S3R device.
  2. The device streams microphone audio to a backend over WebSocket.
  3. Whisper transcribes the audio to text.
  4. Node-RED routes the request through the assistant pipeline.
  5. Ollama runs a local Gemma model to generate a concise response.
  6. Kokoro converts the response text into speech.
  7. Node-RED streams the generated PCM audio back to the Atom.
  8. The Atom plays the response through the Echo Pyramid speaker.
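
The eight steps above can be sketched as a single function that chains the stages. This is only an illustration of the data flow, not the project's code: the real orchestration runs in Node-RED, and every name here (run_turn, pcm_frames, Turn, the three callables) is hypothetical. The 640-byte frame size assumes 16 kHz, 16-bit mono PCM in 20 ms frames, which the project does not state.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

# Hypothetical stand-ins for the real services; none of these are project APIs.
Transcribe = Callable[[bytes], str]   # stands in for Whisper
Generate = Callable[[str], str]       # stands in for Ollama running Gemma
Synthesize = Callable[[str], bytes]   # stands in for Kokoro


@dataclass
class Turn:
    """One complete request/response exchange."""
    transcript: str
    reply: str
    audio: bytes


def run_turn(
    mic_chunks: Iterable[bytes],
    transcribe: Transcribe,
    generate: Generate,
    synthesize: Synthesize,
) -> Turn:
    """Run one voice turn through the pipeline stages in order."""
    utterance = b"".join(mic_chunks)     # steps 1-2: audio streamed from the device
    transcript = transcribe(utterance)   # step 3: speech to text
    reply = generate(transcript)         # steps 4-5: routing and LLM response
    audio = synthesize(reply)            # step 6: text to speech
    return Turn(transcript, reply, audio)


def pcm_frames(audio: bytes, frame_bytes: int = 640) -> Iterator[bytes]:
    """Split PCM audio into fixed-size frames for streaming (steps 7-8).

    The default of 640 bytes is an assumption (16 kHz, 16-bit mono, 20 ms).
    The last frame may be shorter than frame_bytes.
    """
    for start in range(0, len(audio), frame_bytes):
        yield audio[start:start + frame_bytes]
```

Passing the stages in as callables keeps the sketch testable with stubs; in practice each would wrap a network call to the corresponding local service.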

Main Components

  • Atom S3R Firmware: Wi-Fi, WebSocket audio streaming, playback, LEDs, touch input, display, and local configuration.
  • M5 Echo Pyramid Base: microphone input, audio codec, speaker output, RGB LED feedback, and touch input.
  • Node-RED: orchestration and routing for the voice pipeline.
  • Whisper: local speech recognition.
  • Ollama and Gemma: local LLM response generation.
  • Kokoro: local text-to-speech generation.
  • LXC FastAPI Tools Service: reusable backend tools for future integrations.

Architecture

Atom S3R / Echo Pyramid
-> Node-RED WebSocket receiver
-> Whisper ASR
-> Node-RED router and orchestration flow
-> Local tools / OpenClaw / Ollama as needed
-> Kokoro TTS
-> Node-RED WebSocket sender
-> Atom speaker playback
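
The "as needed" routing step in the diagram could look roughly like the sketch below. The project does not describe how routing decisions are made, so the keyword matching here is purely an assumption chosen for simplicity, and the Router class and its methods are hypothetical names. The fallback handler stands in for the Ollama path.

```python
import re
from typing import Callable, Dict

Handler = Callable[[str], str]


class Router:
    """Hypothetical keyword router: send a transcript to a tool or to the LLM."""

    def __init__(self, fallback: Handler) -> None:
        self._tools: Dict[str, Handler] = {}
        self._fallback = fallback  # e.g. the general LLM path

    def register(self, keyword: str, handler: Handler) -> None:
        """Associate a single lowercase keyword with a tool handler."""
        self._tools[keyword.lower()] = handler

    def route(self, transcript: str) -> str:
        """Call the first registered tool whose keyword appears as a word."""
        words = set(re.findall(r"[a-z0-9']+", transcript.lower()))
        for keyword, handler in self._tools.items():
            if keyword in words:
                return handler(transcript)
        return self._fallback(transcript)
```

A short usage example with stub handlers:

```python
router = Router(fallback=lambda text: "llm: " + text)
router.register("time", lambda text: "tool: clock")
router.route("What time is it?")   # handled by the clock stub
router.route("Tell me a joke")     # falls through to the LLM stub
```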

Project Status

The core voice pipeline is working. The next major milestone is moving from a conversational prototype to a tool-using assistant by adding reusable backend APIs for real-world tasks.