At 2 AM, when everything feels heavy and there's nobody to call, Exhale gives you somewhere to go.
A privacy-first AI wellness app that guides you through a breathing exercise before opening a calm, private space to say what's on your mind.
Regulate first, talk second. Research shows people cannot process emotions effectively when their nervous system is activated. Exhale slows you down physiologically before a single word is typed, using a research-backed 4-2-6-2 breathing pattern that emphasizes the exhale for maximum parasympathetic activation.
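The breathing cycle can be modeled as a repeating list of timed phases. The following is a minimal sketch, assuming the four numbers in 4-2-6-2 are seconds of inhale, hold, exhale, and a second hold (called `rest` here); the names `PATTERN` and `phaseAt` are illustrative and not taken from the app.

```typescript
// Assumption: "4-2-6-2" means seconds of inhale, hold, exhale, hold.
// The second hold is called "rest" here only to tell the two apart.
type PhaseName = "inhale" | "hold" | "exhale" | "rest";

interface Phase {
  name: PhaseName;
  seconds: number;
}

const PATTERN: readonly Phase[] = [
  { name: "inhale", seconds: 4 },
  { name: "hold", seconds: 2 },
  { name: "exhale", seconds: 6 },
  { name: "rest", seconds: 2 },
];

// Length of one full cycle in seconds (4 + 2 + 6 + 2).
const CYCLE_SECONDS = PATTERN.reduce((sum, p) => sum + p.seconds, 0);

// Returns the phase active `elapsed` seconds after the exercise began,
// plus how many seconds remain in that phase.
function phaseAt(elapsed: number): { phase: PhaseName; remaining: number } {
  let t = ((elapsed % CYCLE_SECONDS) + CYCLE_SECONDS) % CYCLE_SECONDS;
  for (const p of PATTERN) {
    if (t < p.seconds) return { phase: p.name, remaining: p.seconds - t };
    t -= p.seconds;
  }
  // Only reached for non-finite input; keeps the return type total.
  return { phase: "inhale", remaining: 4 };
}
```

A UI could call `phaseAt` on every animation frame and drive the on-screen prompt from the result, so the visuals never drift from the timing.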
Behavior as context. Everything you do before the conversation begins (time on the landing page, breathing completion, moods selected, time of day) is assembled into a context object and passed silently to the AI, so it has a sense of your state before you say a single word.
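As an illustration of the context object, here is a hedged sketch. The field names, the time-of-day buckets, and the prompt wording are all assumptions made for the example; the app's real shape may differ.

```typescript
// Hypothetical shapes: field names are illustrative, not taken from the app.
interface PreChatSignals {
  landingMs: number; // time spent on the landing screen
  breathingCompleted: boolean;
  moods: string[];
  startedAt: Date;
}

interface SessionContext {
  secondsOnLanding: number;
  breathingCompleted: boolean;
  moods: string[];
  timeOfDay: "morning" | "afternoon" | "evening" | "late night";
}

// Assumed bucket boundaries; pick whatever fits the product.
function timeOfDay(hour: number): SessionContext["timeOfDay"] {
  if (hour >= 5 && hour < 12) return "morning";
  if (hour >= 12 && hour < 17) return "afternoon";
  if (hour >= 17 && hour < 22) return "evening";
  return "late night";
}

function buildContext(s: PreChatSignals): SessionContext {
  return {
    secondsOnLanding: Math.round(s.landingMs / 1000),
    breathingCompleted: s.breathingCompleted,
    moods: [...s.moods],
    timeOfDay: timeOfDay(s.startedAt.getHours()),
  };
}

// Rendered into the system prompt, so the user never sees it.
function contextToPrompt(ctx: SessionContext): string {
  const moods = ctx.moods.length > 0 ? ctx.moods.join(", ") : "none selected";
  return [
    `Time of day: ${ctx.timeOfDay}`,
    `Seconds on landing screen: ${ctx.secondsOnLanding}`,
    `Breathing exercise completed: ${ctx.breathingCompleted ? "yes" : "no"}`,
    `Moods selected: ${moods}`,
  ].join("\n");
}
```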
Privacy by architecture. Conversations are held in temporary session storage and deleted the moment you leave. Nothing is ever permanently stored. Not by policy, by design.
Type or speak, your choice. After the breathing exercise you can type, or switch to voice at any point mid-conversation. The mic icon sits quietly in the input field. Tap it and the app shifts into a full voice overlay: it listens, detects silence to know when you are done speaking, transcribes via Whisper, and responds in a calm AI voice via Orpheus TTS. Switch back to text at any time; everything carries over seamlessly.
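Silence detection of this kind is often done by tracking the loudness of incoming audio frames. The sketch below is one possible approach, not necessarily the one Exhale uses: it assumes audio arrives as frames of samples in the range -1 to 1, and the threshold and hold time are illustrative values.

```typescript
// Root-mean-square loudness of one frame of samples in [-1, 1].
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    const s = frame[i] ?? 0;
    sumSquares += s * s;
  }
  return Math.sqrt(sumSquares / frame.length);
}

// Hypothetical helper: reports "done speaking" once the user has spoken
// and then stayed quiet for `holdMs` milliseconds.
class SilenceDetector {
  private readonly threshold: number;
  private readonly holdMs: number;
  private silentMs = 0;
  private heardSpeech = false;

  constructor(threshold = 0.02, holdMs = 1500) {
    this.threshold = threshold; // assumed loudness cutoff
    this.holdMs = holdMs; // assumed quiet period before stopping
  }

  // Feed one frame that covers `frameMs` milliseconds of audio.
  push(frame: Float32Array, frameMs: number): boolean {
    if (rms(frame) >= this.threshold) {
      this.heardSpeech = true;
      this.silentMs = 0;
      return false;
    }
    // Ignore quiet frames before the user has said anything.
    if (!this.heardSpeech) return false;
    this.silentMs += frameMs;
    return this.silentMs >= this.holdMs;
  }
}
```

Requiring speech before counting silence keeps the recorder from stopping while the user is still gathering their thoughts.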
flowchart LR
User["User"] --> Next["Next.js\nVercel"]
Next --> Chat["api/chat"]
Next --> Crisis["api/crisis"]
Next --> Transcribe["api/transcribe"]
Next --> Speak["api/speak"]
Chat --> Redis["Redis\nephemeral session"]
Chat --> Llama["Llama 3.3 70B"]
Crisis --> Llama
Transcribe --> Whisper["Whisper\nv3 Turbo"]
Speak --> Orpheus["Orpheus TTS"]
Redis -->|"deleted on exit"| Gone["gone forever"]
Every conversation lives in Redis under a random UUID with a 1-hour TTL. The client sends only a session ID and the new message; the full conversation history never travels through the browser after turn 1. On session end the Redis key is explicitly deleted. The TTL is a safety net, not the primary mechanism.
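The lifecycle described here (append on each message, expire after an hour, delete explicitly on exit) can be sketched with an in-memory stand-in for Redis so the example runs anywhere. The class and method names are hypothetical, and refreshing the TTL on every write is an assumption; a real deployment would issue the equivalent Redis commands instead.

```typescript
interface Message {
  role: "user" | "assistant";
  content: string;
}

const SESSION_TTL_MS = 60 * 60 * 1000; // 1 hour

// In-memory stand-in for Redis. `now` is passed in (milliseconds) so that
// expiry can be exercised without waiting.
class EphemeralSessionStore {
  private sessions = new Map<string, { messages: Message[]; expiresAt: number }>();

  // `sessionId` is expected to be a random UUID created at session start.
  // Assumption: each write refreshes the TTL.
  append(sessionId: string, message: Message, now: number): Message[] {
    const messages = [...this.read(sessionId, now), message];
    this.sessions.set(sessionId, { messages, expiresAt: now + SESSION_TTL_MS });
    return messages;
  }

  read(sessionId: string, now: number): Message[] {
    const entry = this.sessions.get(sessionId);
    if (!entry) return [];
    if (entry.expiresAt <= now) {
      // Safety net: the TTL has elapsed, so drop the data.
      this.sessions.delete(sessionId);
      return [];
    }
    return entry.messages;
  }

  // Primary cleanup path: called explicitly when the session ends.
  end(sessionId: string): void {
    this.sessions.delete(sessionId);
  }
}
```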
After every 2 turns beyond turn 3, older messages are compressed into a running emotional summary via a lightweight LLM call. Only the summary and last 4 full turns are sent on each request. Token count stays roughly flat at around 1,200 tokens per request regardless of conversation length, compared to around 6,500 tokens at turn 20 without compression.
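The rolling summary can be sketched as a small state update. This is one reading of the schedule above, in which compression runs at turns 5, 7, 9, and so on; the names are hypothetical, and `summarize` stands in for the lightweight LLM call. Under this reading the window can briefly hold a fifth turn between compressions.

```typescript
interface Turn {
  user: string;
  assistant: string;
}

interface History {
  summary: string; // running emotional summary of older turns
  recent: Turn[]; // full turns kept verbatim
  turnCount: number;
}

const KEEP_RECENT = 4;

// Assumed interpretation of "every 2 turns beyond turn 3".
function shouldCompress(turnCount: number): boolean {
  return turnCount > 3 && (turnCount - 3) % 2 === 0;
}

// Splits the window into turns to fold into the summary and turns to keep.
function splitWindow(turns: Turn[]): { older: Turn[]; recent: Turn[] } {
  const cut = Math.max(0, turns.length - KEEP_RECENT);
  return { older: turns.slice(0, cut), recent: turns.slice(cut) };
}

// Stand-in for the LLM call that merges older turns into the summary.
type Summarize = (previousSummary: string, older: Turn[]) => Promise<string>;

async function addTurn(history: History, turn: Turn, summarize: Summarize): Promise<History> {
  const turnCount = history.turnCount + 1;
  const turns = [...history.recent, turn];
  if (!shouldCompress(turnCount)) {
    return { summary: history.summary, recent: turns, turnCount };
  }
  const { older, recent } = splitWindow(turns);
  if (older.length === 0) return { summary: history.summary, recent, turnCount };
  const summary = await summarize(history.summary, older);
  return { summary, recent, turnCount };
}
```

Because the prompt is built from `summary` plus `recent`, its size depends on the window length rather than on how long the conversation has run.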
Two-layer system. First, the system prompt instructs the AI to detect crisis language and respond warmly while surfacing 988. Second, a parallel AI classifier runs on every user message and judges intent rather than matching keywords. "I want to jump off this hamster wheel" is not a crisis. "I want to jump off a building" is. Only genuine self-harm or suicidal language triggers the response.
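One way to wire the two layers together is to run the chat model and the classifier concurrently and merge the results. The sketch below assumes both are injected as async functions; `respond`, `withLifeline`, and the note text are hypothetical, and how Exhale actually merges the classifier verdict into the reply is not specified here.

```typescript
type Verdict = "crisis" | "not_crisis";
type Classifier = (message: string) => Promise<Verdict>; // layer 2
type ChatModel = (message: string) => Promise<string>; // layer 1 (system prompt)

// Placeholder wording for illustration only.
const LIFELINE_NOTE =
  "You can call or text 988 to reach the Suicide and Crisis Lifeline at any time.";

// Adds the lifeline note if the classifier flagged a crisis and the model's
// own reply did not already mention 988. The reply itself is never replaced,
// so the conversation keeps going.
function withLifeline(reply: string, verdict: Verdict): string {
  if (verdict !== "crisis" || reply.includes("988")) return reply;
  return `${reply}\n\n${LIFELINE_NOTE}`;
}

async function respond(
  message: string,
  chat: ChatModel,
  classify: Classifier,
): Promise<{ reply: string; crisis: boolean }> {
  // Run both layers in parallel so the classifier adds no extra latency.
  const [reply, verdict] = await Promise.all([chat(message), classify(message)]);
  return { reply: withLifeline(reply, verdict), crisis: verdict === "crisis" };
}
```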
| Layer | Technology | Why |
|---|---|---|
| Framework | Next.js 16 + TypeScript | App router, API routes, strong typing |
| Styling | CSS custom properties + inline styles | Design tokens, warm dark theme |
| Animation | Canvas API | Particle animation |
| LLM | Llama 3.3 70B | Free tier, OpenAI compatible, fast |
| Speech to Text | Whisper Large v3 Turbo | Voice input transcription |
| Text to Speech | Orpheus | AI voice responses |
| Session storage | Redis | Ephemeral, serverless, auto-expiry |
| Deployment | Vercel | Seamless Next.js deployment |
| Permanent storage | None | Privacy by design |
- Crisis detection is AI-powered, not keyword-based: it understands intent, not just phrases
- The 988 Suicide and Crisis Lifeline is surfaced warmly when crisis language is detected, and the conversation is never shut down
- The disclaimer is persistent throughout the chat, not buried in terms of service
- No therapy claims: Exhale never claims to diagnose, treat, or provide therapy
- Session limits: 20-turn maximum and 10-minute inactivity timeout; a detected crisis overrides both
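The session limits and the crisis override can be expressed as a single check. This is a minimal sketch with hypothetical names; the constants come from the list above.

```typescript
const MAX_TURNS = 20;
const INACTIVITY_MS = 10 * 60 * 1000; // 10 minutes

interface SessionState {
  turnCount: number;
  lastActivityAt: number; // epoch milliseconds
  crisisActive: boolean; // set when crisis language has been detected
}

type LimitStatus = "open" | "turn_limit" | "timed_out";

function checkLimits(s: SessionState, now: number): LimitStatus {
  // A detected crisis overrides both limits: the session stays open.
  if (s.crisisActive) return "open";
  if (s.turnCount >= MAX_TURNS) return "turn_limit";
  if (now - s.lastActivityAt >= INACTIVITY_MS) return "timed_out";
  return "open";
}
```

Checking the crisis flag first means neither limit can end a session while that flag is set.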
You will need:
- An API key for an LLM provider that supports chat, speech to text, and text to speech
- A Redis instance for session storage
Clone the repo and add your credentials to .env.local:
GROQ_API_KEY=
UPSTASH_REDIS_REST_URL=
UPSTASH_REDIS_REST_TOKEN=
Then run:
cd exhale
npm install
npm run dev

| Version | Focus | Status |
|---|---|---|
| V1 | Core text chat — breathing, mood selection, AI conversation, crisis detection | Shipped |
| V2 | Session architecture — Redis history, rolling summarization, rate limiting | Shipped |
| V3 | Voice mode + mobile — Whisper STT, Orpheus TTS, iOS fixes, Safari audio | Shipped |
| V4 | Voice optimization — sentence-by-sentence streaming, target latency ~1.5s | Planned |
Built by Aaris Khan — April 2026
A portfolio project demonstrating full-stack development, ephemeral session architecture, multimodal AI integration, responsible AI design, and empathy-driven product thinking.

