Aaris03Khan/Exhale



At 2 AM, when everything feels heavy and there's nobody to call, exhale gives you somewhere to go.

A privacy-first AI wellness app that guides you through a breathing exercise before opening a calm, private space to say what's on your mind.


How it works

Regulate first, talk second. Research shows people cannot process emotions effectively when their nervous system is activated. Exhale slows you down physiologically before a single word is typed, using a research-backed 4-2-6-2 breathing pattern that emphasizes the exhale for maximum parasympathetic activation.
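
A minimal sketch of how a 4-2-6-2 cycle could be timed, assuming the four numbers are seconds of inhale, hold, exhale, and hold (the README does not spell out the phases). The names `BreathPhase`, `CYCLE`, and `phaseAt` are illustrative and not taken from the Exhale codebase.

```typescript
// Assumption: 4-2-6-2 means inhale 4 s, hold 2 s, exhale 6 s, hold 2 s.
type BreathPhase = { label: "inhale" | "hold" | "exhale"; seconds: number };

const CYCLE: BreathPhase[] = [
  { label: "inhale", seconds: 4 },
  { label: "hold", seconds: 2 },
  { label: "exhale", seconds: 6 },
  { label: "hold", seconds: 2 },
];

// Total length of one cycle in seconds (4 + 2 + 6 + 2).
const CYCLE_SECONDS = CYCLE.reduce((sum, p) => sum + p.seconds, 0);

// Returns the phase active at `elapsedSeconds` since the exercise started,
// plus how many seconds remain in that phase.
function phaseAt(elapsedSeconds: number): { label: BreathPhase["label"]; remaining: number } {
  let t = elapsedSeconds % CYCLE_SECONDS;
  for (const phase of CYCLE) {
    if (t < phase.seconds) {
      return { label: phase.label, remaining: phase.seconds - t };
    }
    t -= phase.seconds;
  }
  // Not reached for non-negative input; kept so every path returns a value.
  return { label: "inhale", remaining: 4 };
}
```

Under this reading the exhale is the longest phase in each cycle, which matches the emphasis described above. A UI loop could call `phaseAt` on each animation frame to drive the prompt and the countdown.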

Behavior as context. Every action before the conversation begins (time on landing, breathing completion, moods selected, time of day) is assembled into a context object and passed silently to the AI. It understands who you are before you say a single word.
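
As an illustration, the context object could look something like the sketch below. The interface, its field names, the time-of-day buckets, and the idea of rendering it as a note for the system prompt are all assumptions made for this example; the README only lists which signals are collected.

```typescript
// Hypothetical shape: field names are assumptions based on the signals listed above.
interface PreChatContext {
  secondsOnLanding: number;
  breathingCompleted: boolean;
  moods: string[];
  localHour: number; // 0-23, from the visitor's clock
}

// Assumed buckets; the real app may slice the day differently.
function timeOfDay(hour: number): string {
  if (hour < 5) return "late night";
  if (hour < 12) return "morning";
  if (hour < 17) return "afternoon";
  if (hour < 22) return "evening";
  return "late night";
}

// Renders the context as a short note that could be prepended to the system prompt.
function describeContext(ctx: PreChatContext): string {
  const moods = ctx.moods.length > 0 ? ctx.moods.join(", ") : "none selected";
  return [
    `Time of day: ${timeOfDay(ctx.localHour)}`,
    `Time on landing page: ${Math.round(ctx.secondsOnLanding)}s`,
    `Breathing exercise: ${ctx.breathingCompleted ? "completed" : "skipped"}`,
    `Moods: ${moods}`,
  ].join("\n");
}
```

Keeping the rendering in one pure function makes it easy to check exactly what the model is told about a visitor before the first message.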

Privacy by architecture. Conversations are held in temporary session storage and deleted the moment you leave. Nothing is ever permanently stored. Not by policy, by design.

Type or speak, your choice. After the breathing exercise you can type or switch to voice at any point mid-conversation. The mic icon sits quietly in the input field. Tap it and the app shifts into a full voice overlay. It listens, transcribes via Whisper, responds in a calm AI voice via Orpheus TTS, and detects silence to know when you are done speaking. Switch back to text at any time; everything carries over seamlessly.
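
The README does not say how silence is detected. One common approach, sketched here purely as an assumption, is an energy threshold: compute the RMS level of each audio frame and treat a run of quiet frames after speech as the end of the utterance. The function names, the threshold, and the frame counts are hypothetical.

```typescript
// Assumption: audio arrives as fixed-length frames of samples in [-1, 1].
function rms(frame: number[]): number {
  if (frame.length === 0) return 0;
  let sumSquares = 0;
  for (const s of frame) sumSquares += s * s;
  return Math.sqrt(sumSquares / frame.length);
}

// Returns the index of the frame at which the speaker is considered done:
// the first point where `silentFramesNeeded` consecutive quiet frames follow
// at least one loud frame. Returns -1 if that never happens.
function endOfSpeechIndex(frames: number[][], threshold: number, silentFramesNeeded: number): number {
  let heardSpeech = false;
  let quietRun = 0;
  let index = 0;
  for (const frame of frames) {
    if (rms(frame) >= threshold) {
      heardSpeech = true;
      quietRun = 0;
    } else if (heardSpeech) {
      quietRun++;
      if (quietRun >= silentFramesNeeded) return index;
    }
    index++;
  }
  return -1;
}
```

Requiring at least one loud frame first avoids ending the recording before the person has started speaking.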


Architecture

flowchart LR
    User["User"] --> Next["Next.js\nVercel"]
    
    Next --> Chat["api/chat"]
    Next --> Crisis["api/crisis"]
    Next --> Transcribe["api/transcribe"]
    Next --> Speak["api/speak"]

    Chat --> Redis["Redis\nephemeral session"]
    Chat --> Llama["Llama 3.3 70B"]
    Crisis --> Llama
    Transcribe --> Whisper["Whisper\nv3 Turbo"]
    Speak --> Orpheus["Orpheus TTS"]

    Redis -->|"deleted on exit"| Gone["gone forever"]

Session architecture

Every conversation lives in Redis under a random UUID with a 1-hour TTL. The client sends only a session ID and the new message; the full conversation history never travels through the browser after turn 1. On session end the Redis key is explicitly deleted. The TTL is a safety net, not the primary mechanism.
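
The lifecycle described above can be sketched with an in-memory stand-in for Redis, so the example runs without a server. Everything here is hypothetical: the `EphemeralStore` class, the `session:` key prefix, storing history as a JSON string, and refreshing the TTL on each write are assumptions, not details confirmed by the README.

```typescript
// Minimal stand-in for a Redis client: set-with-expiry, get, and delete only.
interface StoredEntry { value: string; expiresAt: number }

class EphemeralStore {
  private entries: Record<string, StoredEntry> = {};
  private now: () => number;

  // The clock is injected so expiry can be exercised without waiting.
  constructor(now: () => number) { this.now = now; }

  set(key: string, value: string, ttlSeconds: number): void {
    this.entries[key] = { value, expiresAt: this.now() + ttlSeconds * 1000 };
  }

  get(key: string): string | null {
    const entry = this.entries[key];
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      delete this.entries[key];
      return null;
    }
    return entry.value;
  }

  del(key: string): void { delete this.entries[key]; }
}

const SESSION_TTL_SECONDS = 60 * 60; // 1-hour safety net

// Appends one message to the server-side history. The caller supplies only
// the session ID and the new message, never the full history.
function appendMessage(store: EphemeralStore, sessionId: string, message: string): string[] {
  const key = `session:${sessionId}`;
  const raw = store.get(key);
  const history: string[] = raw ? JSON.parse(raw) : [];
  history.push(message);
  store.set(key, JSON.stringify(history), SESSION_TTL_SECONDS); // assumption: TTL refreshed per write
  return history;
}

// Primary cleanup path: delete explicitly when the session ends.
function endSession(store: EphemeralStore, sessionId: string): void {
  store.del(`session:${sessionId}`);
}
```

With a real Redis client the same three operations would map onto a set with an expiry, a get, and a delete. The explicit `endSession` call does the cleanup in the normal case, and the expiry only matters when the client disappears without ending the session.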

Token efficiency

After every 2 turns beyond turn 3, older messages are compressed into a running emotional summary via a lightweight LLM call. Only the summary and last 4 full turns are sent on each request. Token count stays roughly flat at around 1,200 tokens per request regardless of conversation length, compared to around 6,500 tokens at turn 20 without compression.
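
A rough sketch of the windowing logic, under stated assumptions: "every 2 turns beyond turn 3" is read here as turns 5, 7, 9 and so on, a turn is one user message plus one assistant reply, and the summarizer itself (the lightweight LLM call) is left out. All names are illustrative.

```typescript
interface ChatTurn { user: string; assistant: string }

const KEEP_FULL_TURNS = 4;

// Assumption: compression fires on turns 5, 7, 9, ...
function shouldCompress(turnNumber: number): boolean {
  return turnNumber > 3 && (turnNumber - 3) % 2 === 0;
}

// Splits history into the older turns to fold into the summary and the
// recent turns that are sent verbatim.
function splitHistory(turns: ChatTurn[]): { toSummarize: ChatTurn[]; recent: ChatTurn[] } {
  const cut = Math.max(0, turns.length - KEEP_FULL_TURNS);
  return { toSummarize: turns.slice(0, cut), recent: turns.slice(cut) };
}

// Builds the lines sent on a request: the running summary (if any) plus the
// last few full turns. The size is bounded no matter how long the chat is.
function requestPayload(summary: string, turns: ChatTurn[]): string[] {
  const { recent } = splitHistory(turns);
  const lines: string[] = summary ? [`Summary so far: ${summary}`] : [];
  for (const t of recent) {
    lines.push(`User: ${t.user}`, `Assistant: ${t.assistant}`);
  }
  return lines;
}
```

Because the payload never holds more than one summary line and four full turns, its size at turn 20 is the same as at turn 6, which is the flat token curve described above.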

Crisis detection

Two-layer system. First, the system prompt instructs the AI to detect crisis language and respond warmly while surfacing 988. Second, a parallel AI classifier runs on every user message; it is intent-based, not keyword-based. "I want to jump off this hamster wheel" is not a crisis. "I want to jump off a building" is. Only genuine self-harm or suicidal language triggers the response.
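
One way the two layers could be wired together is sketched below. The classifier and the chat call are passed in as functions, so nothing here depends on a particular provider SDK. The type names, the two-label verdict, and the `showCrisisResources` flag are assumptions made for illustration.

```typescript
type CrisisVerdict = "crisis" | "not_crisis";
interface TurnOutcome { reply: string; showCrisisResources: boolean }

// Both layers are injected; in the real app each would wrap an LLM call.
type ClassifyFn = (message: string) => Promise<CrisisVerdict>;
type ReplyFn = (message: string) => Promise<string>;

// Merges the two layers. The reply is always kept: a crisis verdict adds
// resources on top of the conversation instead of ending it.
function mergeLayers(reply: string, verdict: CrisisVerdict): TurnOutcome {
  return { reply, showCrisisResources: verdict === "crisis" };
}

// Runs the classifier alongside the chat completion rather than before it,
// so the turn takes roughly as long as the slower of the two calls.
async function handleUserMessage(
  message: string,
  classify: ClassifyFn,
  reply: ReplyFn,
): Promise<TurnOutcome> {
  const [verdict, text] = await Promise.all([classify(message), reply(message)]);
  return mergeLayers(text, verdict);
}
```

Keeping the classifier separate from the chat prompt means a missed cue in one layer can still be caught by the other.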


Tech stack

| Layer | Technology | Why |
| --- | --- | --- |
| Framework | Next.js 16 + TypeScript | App router, API routes, strong typing |
| Styling | CSS custom properties + inline styles | Design tokens, warm dark theme |
| Animation | Canvas API | Particle animation |
| LLM | Llama 3.3 70B | Free tier, OpenAI compatible, fast |
| Speech to Text | Whisper Large v3 Turbo | Voice input transcription |
| Text to Speech | Orpheus | AI voice responses |
| Session storage | Redis | Ephemeral, serverless, auto-expiry |
| Deployment | Vercel | Seamless Next.js deployment |
| Permanent storage | None | Privacy by design |

Responsible AI

  • Crisis detection is AI-powered, not keyword-based: it understands intent, not just phrases
  • The 988 Suicide and Crisis Lifeline is surfaced warmly when crisis language is detected; the conversation is never shut down
  • The disclaimer is persistent throughout the chat, not buried in terms of service
  • No therapy claims: Exhale never claims to diagnose, treat, or provide therapy
  • Session limits: 20-turn maximum and 10-minute inactivity timeout; a crisis overrides both
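
The session limits in the last bullet reduce to a small decision function. The sketch below assumes the limits are checked on each request against a stored turn count and last-activity timestamp; `SessionState`, `LimitDecision`, and `checkLimits` are hypothetical names.

```typescript
const MAX_TURNS = 20;
const INACTIVITY_LIMIT_MS = 10 * 60 * 1000; // 10 minutes

interface SessionState {
  turnCount: number;
  lastActivityMs: number;
  crisisActive: boolean;
}

type LimitDecision = "continue" | "turn_limit" | "inactive";

function checkLimits(state: SessionState, nowMs: number): LimitDecision {
  // A crisis overrides both limits: the conversation is never cut off.
  if (state.crisisActive) return "continue";
  if (nowMs - state.lastActivityMs >= INACTIVITY_LIMIT_MS) return "inactive";
  if (state.turnCount >= MAX_TURNS) return "turn_limit";
  return "continue";
}
```

Checking the crisis flag first makes the override explicit, so it cannot be lost if more limits are added later.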

Local setup

You will need:

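  • An API key for an LLM provider that supports chat, speech to text, and text to speech (the variables below expect a Groq key)
  • A Redis instance for session storage (the variables below expect an Upstash Redis REST URL and token)

Clone the repo, then add your credentials to .env.local: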

GROQ_API_KEY=
UPSTASH_REDIS_REST_URL=
UPSTASH_REDIS_REST_TOKEN=

Then run:

cd exhale
npm install
npm run dev

Roadmap

| Version | Focus | Status |
| --- | --- | --- |
| V1 | Core text chat: breathing, mood selection, AI conversation, crisis detection | Shipped |
| V2 | Session architecture: Redis history, rolling summarization, rate limiting | Shipped |
| V3 | Voice mode + mobile: Whisper STT, Orpheus TTS, iOS fixes, Safari audio | Shipped |
| V4 | Voice optimization: sentence-by-sentence streaming, target latency ~1.5s | Planned |

About

Built by Aaris Khan — April 2026

A portfolio project demonstrating full-stack development, ephemeral session architecture, multimodal AI integration, responsible AI design, and empathy-driven product thinking.

About

Breathe first. Then talk. A private AI space for when you need somewhere to put what you're carrying.
