GraphTechnologyDevelopers/rag-architect

 
 

rag-architect

A principal-level AI architecture profile for production RAG, agentic workflows, evaluation, observability, and implementation-ready issue design.

rag-architect is an open-source Hermes Agent profile and skill pack for turning ambitious AI/RAG goals into production-grade architecture, ADRs, evaluation plans, observability specs, and implementation-ready GitHub issues.

It is designed for teams building real AI systems where quality, reliability, latency, cost, and business impact all matter.

Created by Grey Newell · GitHub: @greynewell

If this helps you ship better RAG or agentic systems, please ⭐ star the repo and share it with builders working on production AI.


What this is

rag-architect is a reusable set of role instructions, skills, templates, and checklists for a principal-level AI architect agent.

It helps an AI agent act less like a code generator and more like a senior technical leader who can:

  • design production retrieval-augmented generation (RAG) systems
  • own Pinecone namespace strategy, chunking, embeddings, hybrid retrieval, and reranking decisions
  • define golden datasets, retrieval metrics, generation scoring, latency benchmarks, and regression gates
  • instrument AI systems for cost, token usage, quality signals, and anomaly detection
  • turn architecture into high-quality GitHub issues that simpler coding agents can execute
  • keep model routing and unit economics visible without sacrificing output quality
  • document clear reasoning, tradeoffs, acceptance criteria, rollout plans, and rollback paths

The core idea: production AI architecture should be executable by implementation agents.


Who this is for

Use this if you are:

  • building production RAG systems
  • designing LLM agents connected to internal tools, APIs, or data sources
  • managing multiple Pinecone namespaces or retrieval strategies
  • trying to make AI features ship in weeks, not months
  • creating eval frameworks for retrieval, generation, and agent workflows
  • writing ADRs, architecture docs, or GitHub issues for AI engineering teams
  • using Hermes Agent profiles and skills to specialize agent behavior

You can use the materials directly with Hermes Agent, or adapt the templates and checklists for Claude Code, Codex, OpenCode, Cursor, or your own agent workflows.


What's inside

rag-architect/
├── SOUL.md
├── skills/
│   └── ai-architecture/
│       ├── principal-ai-architect/
│       │   ├── SKILL.md
│       │   ├── templates/adr.md
│       │   └── references/architecture-output-checklist.md
│       ├── production-rag-architecture/
│       │   ├── SKILL.md
│       │   ├── templates/rag-design-review.md
│       │   └── references/namespace-decision-matrix.md
│       ├── rag-evaluation-observability/
│       │   ├── SKILL.md
│       │   ├── templates/eval-plan.md
│       │   └── references/llm-judge-rubric.md
│       └── agentic-workflow-issue-factory/
│           ├── SKILL.md
│           ├── templates/implementation-issue.md
│           └── templates/agent-tool-design.md
└── scripts/validate.py

SOUL.md

SOUL.md defines the persistent role identity and operating standard for the rag-architect profile. It tells the agent to optimize for:

  • measurable business impact
  • production reliability
  • evaluation-driven iteration
  • clean abstractions
  • cost-aware model routing
  • observability and anomaly detection
  • documentation that simpler coding agents can execute

Skills

| Skill | Use it for |
| --- | --- |
| principal-ai-architect | principal-level architecture, ADRs, roadmap slicing, tradeoff analysis, business-aligned planning |
| production-rag-architecture | ingestion, chunking, embeddings, Pinecone namespaces, hybrid retrieval, reranking, context assembly |
| rag-evaluation-observability | golden datasets, retrieval/generation metrics, regression gates, LLM-as-judge rubrics, traces, latency/cost benchmarks |
| agentic-workflow-issue-factory | GitHub issues, implementation specs, acceptance criteria, rollout notes, agent tool design |

Quick start with Hermes Agent

Hermes Agent supports profiles and profile-local skills. This repo is structured so you can copy it into a rag-architect profile.

1. Clone the repo

git clone https://github.com/greynewell/rag-architect.git
cd rag-architect

2. Install into a Hermes profile

mkdir -p ~/.hermes/profiles/rag-architect
cp SOUL.md ~/.hermes/profiles/rag-architect/SOUL.md
mkdir -p ~/.hermes/profiles/rag-architect/skills
cp -R skills/* ~/.hermes/profiles/rag-architect/skills/

3. Launch Hermes with the profile

hermes --profile rag-architect

Then ask for work like:

Design a production RAG architecture for our internal knowledge base.
Turn this RAG roadmap into GitHub issues that coding agents can implement.
Review our Pinecone namespace strategy and propose eval gates.
Create an ADR for model routing and cost-tiering.

Use without Hermes

You can still use this repo as a standalone architecture toolkit:

  • read SOUL.md as the operating charter
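  • use skills/ai-architecture/principal-ai-architect/templates/adr.md for architecture decisions
  • use skills/ai-architecture/production-rag-architecture/templates/rag-design-review.md for RAG design reviews
  • use skills/ai-architecture/rag-evaluation-observability/templates/eval-plan.md for evaluation plans
  • use skills/ai-architecture/agentic-workflow-issue-factory/templates/implementation-issue.md for GitHub issues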
  • paste relevant SKILL.md files into your preferred coding agent as task context
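
If you would rather script the last step than copy files by hand, a small helper can assemble the context for you. The sketch below assumes the directory layout shown under "What's inside"; the function name build_task_context and the "---" separator are illustrative choices, not part of the repo.

```python
from pathlib import Path


def build_task_context(repo_root: str, skill_names: list[str]) -> str:
    """Join SOUL.md and the selected SKILL.md files into one prompt string.

    Hypothetical helper: assumes skills live under skills/ai-architecture/.
    """
    root = Path(repo_root)
    parts = []

    # Include the operating charter first, if the file is present.
    soul = root / "SOUL.md"
    if soul.is_file():
        body = soul.read_text(encoding="utf-8").strip()
        parts.append("# SOUL.md\n\n" + body)

    # Append each requested skill, failing loudly on an unknown name.
    for name in skill_names:
        skill = root / "skills" / "ai-architecture" / name / "SKILL.md"
        if not skill.is_file():
            raise FileNotFoundError(f"no SKILL.md for skill {name!r} at {skill}")
        body = skill.read_text(encoding="utf-8").strip()
        parts.append(f"# {name}/SKILL.md\n\n" + body)

    return "\n\n---\n\n".join(parts)
```

From the repo root, build_task_context(".", ["production-rag-architecture"]) returns a single string you can paste into an agent session.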

Example outputs this profile is designed to produce

Architecture decision records

  • Pinecone multi-namespace strategy
  • embedding model selection
  • hybrid retrieval and reranking policy
  • model routing and cost-tiering
  • agent tool side-effect and permission policy

GitHub issue specs

  • Add Recall@k breakdown by namespace
  • Add no-answer eval cases for hallucination resistance
  • Add cost-per-request trace fields
  • Add hybrid retrieval eval slice for acronym-heavy queries
  • Add router logging for selected namespace and model tier
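
To make issues like "cost-per-request trace fields" and "router logging" concrete, here is one possible shape for a per-request trace record. It is only a sketch: the field names, the RequestTrace class, and the prices in PRICE_PER_1K are all assumptions for illustration, and the prices are placeholders rather than real vendor rates.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical prices in USD per 1,000 tokens, keyed by model tier.
# Placeholder numbers only; substitute your provider's actual pricing.
PRICE_PER_1K = {
    "small": {"prompt": 0.0005, "completion": 0.0015},
    "large": {"prompt": 0.0050, "completion": 0.0150},
}


@dataclass
class RequestTrace:
    request_id: str
    namespace: str           # namespace the router selected
    model_tier: str          # model tier the router selected
    prompt_tokens: int
    completion_tokens: int
    latency_ms: float

    @property
    def cost_usd(self) -> float:
        """Token cost for this request under the assumed price table."""
        price = PRICE_PER_1K[self.model_tier]
        total = (self.prompt_tokens * price["prompt"]
                 + self.completion_tokens * price["completion"])
        return total / 1000

    def to_json(self) -> str:
        """Serialize the trace, including the derived cost, as one log line."""
        record = asdict(self)
        record["cost_usd"] = round(self.cost_usd, 6)
        return json.dumps(record, sort_keys=True)
```

Emitting one such line per request keeps the selected namespace, model tier, and cost queryable together, which is what the per-namespace and per-tier breakdowns in the issues above depend on.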

Evaluation plans

  • golden dataset schema
  • retrieval metrics: Recall@k, MRR, NDCG@k, context precision
  • generation metrics: faithfulness, relevance, completeness, citation quality
  • agent metrics: task completion, tool correctness, escalation behavior
  • operational metrics: p50/p95/p99 latency, cost per successful answer
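
As a reference point for the retrieval metrics listed above, the following sketch computes Recall@k, MRR, and a binary-relevance NDCG@k. It assumes each golden-dataset case provides a ranked list of retrieved document IDs (without duplicates) and a set of relevant IDs; graded relevance would need a different gain term. The function names are illustrative and not taken from the repo.

```python
import math


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant document, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(cases: list[tuple[list[str], set[str]]]) -> float:
    """MRR: the average reciprocal rank over all (retrieved, relevant) cases."""
    if not cases:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in cases) / len(cases)


def ndcg_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Binary-relevance NDCG@k: DCG of the ranking over DCG of an ideal ranking."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0
```

Averaging these per-case values over the whole golden dataset, or over a slice such as one namespace, gives the numbers a regression gate can compare between runs.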

Design philosophy

Production AI systems fail in ways demos do not reveal.

A useful AI architecture agent must therefore ask:

  • What business workflow improves?
  • What evidence proves quality improved?
  • What happens when retrieval confidence is low?
  • What is the cost per successful answer?
  • What trace explains this exact model response?
  • What regression gate prevents a silent quality drop?
  • What issue can a coding agent implement in one to three sessions?
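
Several of these questions reduce to small, checkable computations. The sketch below shows one way to compute cost per successful answer, a nearest-rank latency percentile, and a simple regression gate. The record keys (cost_usd, success), the metric names, and the default 0.02 tolerance are assumptions chosen for illustration.

```python
import math


def cost_per_successful_answer(records: list[dict]) -> float:
    """Total spend divided by the number of successful answers.

    Failed requests still cost money, so their cost stays in the numerator.
    """
    total_cost = sum(r["cost_usd"] for r in records)
    successes = sum(1 for r in records if r["success"])
    return total_cost / successes if successes else math.inf


def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100], e.g. p=95 for p95 latency."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def regression_gate(baseline: dict, candidate: dict, max_drop: float = 0.02) -> list[str]:
    """Return the metrics whose candidate score fell more than max_drop below baseline.

    A metric missing from the candidate run is treated as 0.0 and therefore fails.
    """
    return [
        name
        for name, base in baseline.items()
        if candidate.get(name, 0.0) < base - max_drop
    ]
```

A CI job could fail the build whenever regression_gate returns a non-empty list, which turns "silent quality drop" into a visible, blocking signal.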

rag-architect encodes those questions into reusable skills, templates, and checklists.


Validation

Run the local validator before publishing changes:

python3 scripts/validate.py

It checks:

  • skill frontmatter
  • required fields
  • linked template/reference files
  • markdown presence
  • common secret-like strings
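
For readers adapting these checks to their own skill packs, here is a minimal sketch of what a frontmatter and secret scan could look like. It is not the code in scripts/validate.py: the required field names (name, description), the two secret patterns, and the function names are assumptions, and the parser only handles flat `key: value` frontmatter rather than full YAML.

```python
import re

# Assumed required frontmatter keys; the real validator may require others.
REQUIRED_FIELDS = ("name", "description")

# A couple of illustrative secret-like patterns, far from exhaustive.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
]


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Skip nested, indented, or commented lines; only top-level keys count.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return {}  # no closing delimiter found


def check_skill(text: str) -> list[str]:
    """Return a list of human-readable problems found in one SKILL.md body."""
    problems = []
    fields = parse_frontmatter(text)
    if not fields:
        problems.append("missing, empty, or unterminated frontmatter")
    else:
        for name in REQUIRED_FIELDS:
            if not fields.get(name):
                problems.append(f"missing required field: {name}")
    for pattern in SECRET_PATTERNS:
        if pattern.search(text):
            problems.append(f"secret-like string matches {pattern.pattern}")
    return problems
```

An empty list from check_skill means the file passed these particular checks; linked-file and markdown-presence checks would be separate steps.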

Repository topics

Recommended GitHub topics for discoverability:

rag
retrieval-augmented-generation
llm
ai-agents
agentic-workflows
ai-architecture
llmops
mlops
evaluation
observability
pinecone
hermes-agent
prompt-engineering
golden-datasets
model-routing

Contributing

Contributions are welcome.

Good contributions include:

  • sharper RAG design checklists
  • better eval rubrics
  • additional ADR templates
  • more production observability fields
  • examples from real agentic workflows
  • improvements to Hermes Agent installation docs
  • issue templates for AI engineering teams

Please keep contributions practical, implementation-ready, and grounded in production AI systems.


Star and share

If rag-architect helps you design better production AI systems:

  • ⭐ star the repo
  • share it with RAG/LLM engineering teams
  • use the templates in your next architecture review
  • open an issue with your own production RAG lessons

License

MIT License. See LICENSE.
