GitHub - Aaris03Khan/Exhale: Breathe first. Then talk. A private AI space for when you need somewhere to put what you're carrying.

At 2 AM, when everything feels heavy and there's nobody to call, Exhale gives you somewhere to go.

A privacy-first AI wellness app that guides you through a breathing exercise before opening a calm, private space to say what's on your mind.

How it works

Regulate first, talk second. Research shows people cannot process emotions effectively when their nervous system is activated. Exhale slows you down physiologically before a single word is typed, using a research-backed 4-2-6-2 breathing pattern that emphasizes the exhale for maximum parasympathetic activation.
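As a rough illustration of the 4-2-6-2 pattern, the sketch below models one cycle as a list of timed phases and looks up which phase is active at a given moment. It assumes the four numbers are seconds of inhale, hold, exhale, and hold, in that order; the names `CYCLE` and `phaseAt` are hypothetical and not taken from the repo.

```typescript
// Assumption: 4-2-6-2 means inhale 4 s, hold 2 s, exhale 6 s, hold 2 s.
type PhaseName = "inhale" | "hold" | "exhale";

interface Phase {
  name: PhaseName;
  seconds: number;
}

const CYCLE: readonly Phase[] = [
  { name: "inhale", seconds: 4 },
  { name: "hold", seconds: 2 },
  { name: "exhale", seconds: 6 }, // the longest phase: the exhale is emphasized
  { name: "hold", seconds: 2 },
];

// Total length of one breathing cycle in seconds.
const cycleSeconds = CYCLE.reduce((sum, p) => sum + p.seconds, 0);

// Return the phase active `elapsed` seconds after the exercise began.
function phaseAt(elapsed: number): Phase {
  // Wrap into a single cycle; the double modulo also handles negative input.
  let t = ((elapsed % cycleSeconds) + cycleSeconds) % cycleSeconds;
  for (const phase of CYCLE) {
    if (t < phase.seconds) return phase;
    t -= phase.seconds;
  }
  return CYCLE[0]; // not reached; keeps the return type total
}
```

A UI loop could call `phaseAt` on each animation frame and drive the on-screen prompt from the returned phase name.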

Behavior as context. Every action before the conversation begins (time on landing, breathing completion, moods selected, time of day) is assembled into a context object and passed silently to the AI. It has a sense of where you are before you say a single word.
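A context object of this kind might look like the sketch below. The README names the signals but not the schema, so every field name, the time-of-day buckets, and the `contextNote` helper are assumptions made for illustration.

```typescript
// Hypothetical schema: the README lists these signals but not their field names.
type TimeOfDay = "morning" | "afternoon" | "evening" | "late night";

interface PreChatContext {
  secondsOnLanding: number;
  breathingCompleted: boolean;
  moods: string[];
  timeOfDay: TimeOfDay;
}

interface RawSignals {
  landingStartedAt: number; // epoch ms when the landing page loaded
  chatOpenedAt: number; // epoch ms when the chat opened
  breathingCompleted: boolean;
  moods: string[];
}

// Bucket boundaries are illustrative assumptions.
function timeOfDayFor(hour: number): TimeOfDay {
  if (hour >= 5 && hour < 12) return "morning";
  if (hour >= 12 && hour < 17) return "afternoon";
  if (hour >= 17 && hour < 22) return "evening";
  return "late night";
}

// Assemble the context from signals gathered before the first message.
function buildContext(signals: RawSignals): PreChatContext {
  const opened = new Date(signals.chatOpenedAt);
  return {
    secondsOnLanding: Math.round(
      (signals.chatOpenedAt - signals.landingStartedAt) / 1000,
    ),
    breathingCompleted: signals.breathingCompleted,
    moods: [...signals.moods],
    timeOfDay: timeOfDayFor(opened.getHours()), // local time of the runtime
  };
}

// Render the context as a short note that could be appended to a system prompt.
function contextNote(ctx: PreChatContext): string {
  const moods = ctx.moods.length > 0 ? ctx.moods.join(", ") : "none selected";
  return (
    `Time of day: ${ctx.timeOfDay}. ` +
    `Seconds on landing page: ${ctx.secondsOnLanding}. ` +
    `Breathing exercise completed: ${ctx.breathingCompleted ? "yes" : "no"}. ` +
    `Moods: ${moods}.`
  );
}
```

Keeping the note as plain text means it can ride along with the system prompt without the user ever seeing it.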

Privacy by architecture. Conversations are held in temporary session storage and deleted the moment you leave. Nothing is ever permanently stored. Not by policy, by design.

Talk or speak, your choice. After the breathing exercise you can type or switch to voice at any point mid-conversation. The mic icon sits quietly in the input field. Tap it and the app shifts into a full voice overlay. It listens, transcribes via Whisper, responds in a calm AI voice via Orpheus TTS, and detects silence to know when you are done speaking. Switch back to text at any time; everything carries over seamlessly.
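The README does not say how silence is detected. One common approach, shown here purely as an assumption, is to measure the loudness (RMS) of each audio frame and treat a long enough run of quiet frames after speech as the end of the utterance. The threshold, the 1.5 second window, and the `SilenceDetector` class are all hypothetical.

```typescript
// Root-mean-square level of one audio frame (samples expected in [-1, 1]).
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) {
    sum += frame[i] * frame[i];
  }
  return Math.sqrt(sum / frame.length);
}

// Assumed technique: energy threshold plus a required run of silence.
class SilenceDetector {
  private silentMs = 0;
  private heardSpeech = false;
  private readonly threshold: number;
  private readonly requiredSilenceMs: number;

  constructor(threshold: number = 0.01, requiredSilenceMs: number = 1500) {
    this.threshold = threshold;
    this.requiredSilenceMs = requiredSilenceMs;
  }

  // Feed one frame of `frameMs` milliseconds.
  // Returns true once speech has been heard and then enough silence followed.
  push(frame: Float32Array, frameMs: number): boolean {
    if (rms(frame) >= this.threshold) {
      this.heardSpeech = true;
      this.silentMs = 0; // any speech resets the silence counter
      return false;
    }
    if (!this.heardSpeech) return false; // ignore silence before the user starts
    this.silentMs += frameMs;
    return this.silentMs >= this.requiredSilenceMs;
  }
}
```

Ignoring silence before the first loud frame avoids cutting the recording off while the user is still gathering their thoughts.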

Architecture

flowchart LR
    User["User"] --> Next["Next.js\nVercel"]
    
    Next --> Chat["api/chat"]
    Next --> Crisis["api/crisis"]
    Next --> Transcribe["api/transcribe"]
    Next --> Speak["api/speak"]

    Chat --> Redis["Redis\nephemeral session"]
    Chat --> Llama["Llama 3.3 70B"]
    Crisis --> Llama
    Transcribe --> Whisper["Whisper\nv3 Turbo"]
    Speak --> Orpheus["Orpheus TTS"]

    Redis -->|"deleted on exit"| Gone["gone forever"]

Session architecture

Every conversation lives in Redis under a random UUID with a 1-hour TTL. The client sends only a session ID and the new message; the full conversation history never travels through the browser after turn 1. On session end the Redis key is explicitly deleted. The TTL is a safety net, not the primary mechanism.
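The lifecycle described here (create under a random UUID, write with a TTL, delete explicitly on exit) can be sketched as follows. To stay runnable without a Redis instance, the sketch uses an in-memory stand-in behind a small `SessionStore` interface; that interface, the `session:` key prefix, and the choice to refresh the TTL on every write are assumptions, not details from the repo.

```typescript
import { randomUUID } from "node:crypto";

const SESSION_TTL_SECONDS = 60 * 60; // 1 hour, as stated above

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Minimal key-value surface the sketch needs; a Redis client would sit behind it.
interface SessionStore {
  set(key: string, value: string, ttlSeconds: number): void;
  get(key: string): string | null;
  del(key: string): void;
}

// In-memory stand-in for Redis. Expiry is checked lazily on read.
class MemoryStore implements SessionStore {
  private data = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }
  set(key: string, value: string, ttlSeconds: number): void {
    this.data.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
  get(key: string): string | null {
    const entry = this.data.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.data.delete(key);
      return null;
    }
    return entry.value;
  }
  del(key: string): void {
    this.data.delete(key);
  }
}

const keyFor = (sessionId: string): string => `session:${sessionId}`;

// Create an empty history under a fresh random UUID.
function startSession(store: SessionStore): string {
  const id = randomUUID();
  store.set(keyFor(id), JSON.stringify([]), SESSION_TTL_SECONDS);
  return id;
}

// Append one message server-side; returns null if the session is gone.
function appendMessage(
  store: SessionStore,
  sessionId: string,
  message: ChatMessage,
): ChatMessage[] | null {
  const raw = store.get(keyFor(sessionId));
  if (raw === null) return null;
  const history = JSON.parse(raw) as ChatMessage[];
  history.push(message);
  // Assumption: each write refreshes the TTL.
  store.set(keyFor(sessionId), JSON.stringify(history), SESSION_TTL_SECONDS);
  return history;
}

// Primary cleanup path: delete explicitly instead of waiting for the TTL.
function endSession(store: SessionStore, sessionId: string): void {
  store.del(keyFor(sessionId));
}
```

Because the history is keyed by session ID on the server, the browser only ever needs to hold the ID and the message it is about to send.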

Token efficiency

After every 2 turns beyond turn 3, older messages are compressed into a running emotional summary via a lightweight LLM call. Only the summary and last 4 full turns are sent on each request. Token count stays roughly flat at around 1,200 tokens per request regardless of conversation length, compared to around 6,500 tokens at turn 20 without compression.
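The windowing logic can be sketched as two pure functions: one deciding when to refresh the summary, one assembling the messages for a request. This assumes a "turn" is one user message plus one assistant reply, and reads "every 2 turns beyond turn 3" as turns 5, 7, 9, and so on. Both readings, and all names below, are assumptions; the LLM call that produces the summary text is left out.

```typescript
interface Turn {
  user: string;
  assistant: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

const KEEP_FULL_TURNS = 4; // last 4 full turns are sent verbatim

// Assumed reading of "every 2 turns beyond turn 3": turns 5, 7, 9, ...
function shouldSummarize(turnNumber: number): boolean {
  return turnNumber > 3 && (turnNumber - 3) % 2 === 0;
}

// Turns old enough to be folded into the running summary.
function turnsToCompress(turns: Turn[]): Turn[] {
  return turns.slice(0, Math.max(0, turns.length - KEEP_FULL_TURNS));
}

// Build the request: summary (if any) + last 4 full turns + the new message.
function buildRequest(
  summary: string | null,
  turns: Turn[],
  newMessage: string,
): ChatMessage[] {
  const messages: ChatMessage[] = [];
  if (summary) {
    messages.push({
      role: "system",
      content: `Summary of the conversation so far: ${summary}`,
    });
  }
  for (const turn of turns.slice(-KEEP_FULL_TURNS)) {
    messages.push({ role: "user", content: turn.user });
    messages.push({ role: "assistant", content: turn.assistant });
  }
  messages.push({ role: "user", content: newMessage });
  return messages;
}
```

However long the conversation gets, `buildRequest` returns at most ten messages (one summary, eight from the last four turns, one new), which is why the request size stays roughly flat.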

Crisis detection

A two-layer system. First, the system prompt instructs the AI to detect crisis language and respond warmly while surfacing 988. Second, a parallel AI classifier runs on every user message; it is intent-based, not keyword-based. "I want to jump off this hamster wheel" is not a crisis. "I want to jump off a building" is. Only genuine self-harm or suicidal language triggers the response.
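The "parallel" part of the second layer can be sketched by running the chat call and the classifier call concurrently and combining their results. Both model calls are injected as functions, so nothing here depends on a particular provider; the type names and the `respond` function are hypothetical, and the classifier itself (an LLM judging intent) is not shown.

```typescript
// Hypothetical signatures for the two model calls.
type ChatModel = (message: string) => Promise<string>;
type CrisisClassifier = (message: string) => Promise<boolean>;

interface Reply {
  text: string;
  showCrisisResources: boolean; // e.g. surface the 988 Lifeline in the UI
}

// Run the chat reply and the crisis classifier at the same time,
// so the classifier adds no extra latency on top of the chat call.
async function respond(
  message: string,
  chat: ChatModel,
  isCrisis: CrisisClassifier,
): Promise<Reply> {
  const [text, crisis] = await Promise.all([chat(message), isCrisis(message)]);
  return { text, showCrisisResources: crisis };
}
```

Note that `Promise.all` rejects if either call fails, so a production version would need to decide how to behave when the classifier is unavailable.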

Tech stack

Layer	Technology	Why
Framework	Next.js 16 + TypeScript	App router, API routes, strong typing
Styling	CSS custom properties + inline styles	Design tokens, warm dark theme
Animation	Canvas API	Particle animation
LLM	Llama 3.3 70B	Free tier, OpenAI compatible, fast
Speech to Text	Whisper Large v3 Turbo	Voice input transcription
Text to Speech	Orpheus	AI voice responses
Session storage	Redis	Ephemeral, serverless, auto-expiry
Deployment	Vercel	Seamless Next.js deployment
Permanent storage	None	Privacy by design

Responsible AI

Crisis detection is AI-powered, not keyword-based: it understands intent, not just phrases
988 Suicide and Crisis Lifeline is surfaced warmly when crisis language is detected; the conversation is never shut down
Disclaimer is persistent throughout the chat, not buried in terms of service
No therapy claims: Exhale never claims to diagnose, treat, or provide therapy
Session limits: 20-turn maximum and 10-minute inactivity timeout; a crisis overrides both
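The session-limit rule in the last item could be expressed as a single check, sketched below. The two constants come from the list above; the `SessionState` shape and the `endReason` function are assumptions for illustration.

```typescript
const MAX_TURNS = 20;
const INACTIVITY_LIMIT_MS = 10 * 60 * 1000; // 10 minutes

// Hypothetical per-session bookkeeping.
interface SessionState {
  turnCount: number;
  lastActivityAt: number; // epoch ms of the last user action
  crisisActive: boolean; // set when crisis language has been detected
}

type EndReason = "turn_limit" | "inactive" | null;

// Decide whether the session should end, and why. null means keep going.
function endReason(state: SessionState, now: number): EndReason {
  if (state.crisisActive) return null; // a crisis overrides both limits
  if (state.turnCount >= MAX_TURNS) return "turn_limit";
  if (now - state.lastActivityAt >= INACTIVITY_LIMIT_MS) return "inactive";
  return null;
}
```

Checking the crisis flag first guarantees that neither limit can close a conversation while someone is in distress.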

Local setup

You will need:

An API key for an LLM provider that supports chat, speech to text, and text to speech (the variable name below suggests Groq)
A Redis instance for session storage

Clone the repo and add your credentials to .env.local:

GROQ_API_KEY=
UPSTASH_REDIS_REST_URL=
UPSTASH_REDIS_REST_TOKEN=

Then run:

cd exhale
npm install
npm run dev
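If the app fails on startup, a missing credential is a likely cause. The helper below is a small, hypothetical check (not part of the repo) that reports which of the three variables from .env.local are unset or blank.

```typescript
// The three variable names listed in the setup steps above.
const REQUIRED_ENV = [
  "GROQ_API_KEY",
  "UPSTASH_REDIS_REST_URL",
  "UPSTASH_REDIS_REST_TOKEN",
] as const;

// Return the names of required variables that are unset or blank.
function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV.filter((name) => (env[name] ?? "").trim() === "");
}

// Example use in a Node.js process (commented out so the sketch has no side effects):
// const missing = missingEnv(process.env);
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```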

Roadmap

Version	Focus	Status
V1	Core text chat — breathing, mood selection, AI conversation, crisis detection	Shipped
V2	Session architecture — Redis history, rolling summarization, rate limiting	Shipped
V3	Voice mode + mobile — Whisper STT, Orpheus TTS, iOS fixes, Safari audio	Shipped
V4	Voice optimization — sentence-by-sentence streaming, target latency ~1.5s	Planned

About

Built by Aaris Khan — April 2026

A portfolio project demonstrating full-stack development, ephemeral session architecture, multimodal AI integration, responsible AI design, and empathy-driven product thinking.
