ARIA - ElevenLabs Worldwide Hackathon
AI Tinkerers - Austin
Hackathon Showcase 1st Place Winner


The team consists of a Dell AI Engineer, a Health AI Founder, and a UT Austin MS Data Engineer, with combined expertise in LLMs, computer vision, and production ETL/AWS.


ARIA - Adaptive Retail Intelligence Agent

ARIA is a stable, multimodal retail agent that speaks naturally, detects emotional triggers in real time, and delivers personalized product recommendations. It demonstrates how voice-powered AI can transform retail interactions through empathetic, context-aware conversations that boost customer satisfaction and sales.

Core Functionality

  • Voice-first conversational AI powered by Groq LLM with natural cashier-style dialogue
  • Trigger detection engine analyzing 16+ emotional/behavioral cues (fatigue, stress, hunger, cravings)
  • Smart recommendation system mapping triggers + customer history to relevant products
  • Adaptive customer profiles that learn from acceptance patterns over time
  • Real-time analytics showing live trigger weights, recommendations, and conversation flow
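As a rough illustration of how the trigger engine, recommendation mapping, and adaptive profile described above might fit together, here is a minimal Python sketch. The cue lexicon, the catalog, the scoring formula, and every name in it (`TRIGGER_CUES`, `CATALOG`, `detect_triggers`, `CustomerProfile`, `recommend`) are hypothetical and not taken from ARIA's code, which analyzes 16+ cues rather than the three shown here.

```python
from collections import defaultdict

# Hypothetical cue lexicon: trigger name -> phrases that hint at it.
TRIGGER_CUES = {
    "fatigue": ["tired", "exhausted", "long day"],
    "stress": ["stressed", "deadline", "overwhelmed"],
    "hunger": ["hungry", "starving", "skipped lunch"],
}

# Hypothetical catalog: product -> triggers it addresses.
CATALOG = {
    "cold brew coffee": {"fatigue"},
    "herbal tea": {"stress"},
    "protein bar": {"hunger", "fatigue"},
}


def detect_triggers(utterance: str) -> dict:
    """Return a weight per trigger: the fraction of its cues found in the text."""
    text = utterance.lower()
    weights = {}
    for trigger, cues in TRIGGER_CUES.items():
        hits = sum(1 for cue in cues if cue in text)
        if hits:
            weights[trigger] = hits / len(cues)
    return weights


class CustomerProfile:
    """Tracks how often a customer accepts each offered product."""

    def __init__(self):
        self.offers = defaultdict(int)
        self.accepts = defaultdict(int)

    def record(self, product: str, accepted: bool) -> None:
        self.offers[product] += 1
        if accepted:
            self.accepts[product] += 1

    def acceptance_rate(self, product: str) -> float:
        # Laplace smoothing, so a product never offered starts at a neutral 0.5.
        return (self.accepts[product] + 1) / (self.offers[product] + 2)


def recommend(weights: dict, profile: CustomerProfile):
    """Pick the product maximizing trigger match times acceptance rate."""
    best, best_score = None, 0.0
    for product, triggers in CATALOG.items():
        match = sum(weights.get(t, 0.0) for t in triggers)
        score = match * profile.acceptance_rate(product)
        if score > best_score:  # strict comparison: earlier catalog entries win ties
            best, best_score = product, score
    return best
```

In this sketch, repeatedly recording rejections for one product lowers its smoothed acceptance rate, so a competing product that matches the same trigger eventually takes over the recommendation.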

Working Prototype Stability

  • Fully functional FastAPI backend with WebSocket-based real-time conversations
  • Voice I/O through ElevenLabs TTS + browser speech-to-text transcription
  • Tested across three demo personas with stable multi-turn dialogues
  • Graceful error handling and fallback logic throughout the pipeline
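One common way to get the kind of fallback behavior listed above is to wrap each external call in a timeout and substitute a canned response on failure. The sketch below assumes that approach. The helper `with_fallback`, the stub coroutines, and the fallback phrase are hypothetical and only stand in for real service calls.

```python
import asyncio

# Hypothetical canned reply used when a downstream service fails.
FALLBACK_REPLY = "Sorry, I missed that. Could you say it again?"


async def with_fallback(call, *, timeout: float, fallback):
    """Await call(); on timeout or a handled error, return the fallback value."""
    try:
        return await asyncio.wait_for(call(), timeout=timeout)
    except (asyncio.TimeoutError, ConnectionError, ValueError):
        return fallback


# Stub coroutines standing in for a real LLM request.
async def healthy_llm():
    return "That will be $4.50."


async def slow_llm():
    await asyncio.sleep(1.0)  # longer than the timeout used below
    return "never reached"


async def broken_llm():
    raise ConnectionError("service unavailable")


async def demo():
    ok = await with_fallback(healthy_llm, timeout=0.5, fallback=FALLBACK_REPLY)
    slow = await with_fallback(slow_llm, timeout=0.05, fallback=FALLBACK_REPLY)
    down = await with_fallback(broken_llm, timeout=0.5, fallback=FALLBACK_REPLY)
    return ok, slow, down
```

Calling `asyncio.run(demo())` returns the healthy reply for the first call and the canned reply for the other two, so the conversation can continue even when a service stalls or errors out.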

Technical Complexity & Multimodal Orchestration

ARIA orchestrates a complete agentic loop:
Browser mic → Speech transcription → Trigger analysis → LLM reasoning → Product recommendation → Voice synthesis → Email automation

This pipeline combines symbolic rules, cloud LLM inference, real-time speech processing, and async orchestration, coordinating multiple AI services into one cohesive agent.
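The loop above can be sketched as a chain of async stages that pass a shared turn object along. This is an illustrative outline only: every stage is a local stub in place of the real speech, LLM, TTS, and email services, and the names (`Turn`, `analyze_triggers`, `reason`, `synthesize`, `send_email`, `run_turn`) are hypothetical. Running the final two stages concurrently is a choice made for this sketch, not something the project description states.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Turn:
    """State carried through one pass of the loop."""
    transcript: str
    triggers: list = field(default_factory=list)
    product: str = ""
    reply: str = ""
    audio: bytes = b""
    email_sent: bool = False


async def analyze_triggers(turn: Turn) -> None:
    # Stand-in for the symbolic rule layer.
    if "tired" in turn.transcript.lower():
        turn.triggers.append("fatigue")


async def reason(turn: Turn) -> None:
    # Stand-in for LLM reasoning and product selection.
    if "fatigue" in turn.triggers:
        turn.product = "cold brew coffee"
        turn.reply = f"Long day? How about a {turn.product}?"
    else:
        turn.reply = "Anything else today?"


async def synthesize(turn: Turn) -> None:
    # Stand-in for voice synthesis: encodes the reply text as bytes.
    turn.audio = turn.reply.encode("utf-8")


async def send_email(turn: Turn) -> None:
    # Stand-in for the email workflow: only "sends" when a product was offered.
    turn.email_sent = bool(turn.product)


async def run_turn(transcript: str) -> Turn:
    turn = Turn(transcript)
    await analyze_triggers(turn)  # trigger analysis must finish first
    await reason(turn)            # reasoning depends on the triggers
    # Synthesis and email both depend only on the reasoning output.
    await asyncio.gather(synthesize(turn), send_email(turn))
    return turn
```

For example, `asyncio.run(run_turn("I'm tired"))` yields a turn whose `triggers` list contains `"fatigue"` and whose `product` is set, while a transcript with no cue produces the generic reply and no email.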

Innovation & Real-World Impact

  • Psychology-aware conversations that detect subtle emotional states beyond explicit requests
  • Dynamic personalization adapting recommendation strategy based on individual acceptance patterns
  • Low-latency voice interaction making AI feel natural and human-like
  • Real-world applications: smart retail kiosks, lightly staffed stores, drive-through automation, accessibility assistance

ARIA shows how voice agents can reduce decision fatigue, increase conversions, and enhance customer experience in retail environments.

Theme Alignment: Browsers + Voices + Cloud + Tools = Cohesive Agent

  • 🌐 Browsers: Web Speech API + real-time WebSocket UI displaying live triggers, recommendations, and conversation state
  • 🗣️ Voices: ElevenLabs TTS + STT create natural bidirectional voice conversations; Anam.ai provides the face for the conversational agent persona
  • ☁️ Cloud: Groq LLM for conversational reasoning + n8n workflows for email automation
  • 🛠️ Tools: Custom trigger analyzer, recommendation engine, customer profiler, and product catalog form the agent’s decision layer

These components work as one unified multimodal agent that listens, thinks, speaks, and acts autonomously.

Technologies Used

Backend: FastAPI, WebSockets, Python asyncio, httpx
AI/ML: Groq API (Llama 3.3 70B), ElevenLabs API (TTS + STT), Anam.ai (conversational persona)
Frontend: Vanilla JavaScript, HTML/CSS, Web Speech API
Automation: n8n workflow engine, Docker
Data: Python dotenv, JSON-based profiles
Audio: MediaRecorder API, WebM/Opus codec
