rag-architect

A principal-level AI architecture profile for production RAG, agentic workflows, evaluation, observability, and implementation-ready issue design.

rag-architect is an open-source Hermes Agent profile and skill pack for turning ambitious AI/RAG goals into production-grade architecture, ADRs, evaluation plans, observability specs, and implementation-ready GitHub issues.

It is designed for teams building real AI systems where quality, reliability, latency, cost, and business impact all matter.

Created by Grey Newell · GitHub: @greynewell

If this helps you ship better RAG or agentic systems, please ⭐ star the repo and share it with builders working on production AI.

What this is

rag-architect is a reusable operating model for a principal-level AI architect agent: a role charter (SOUL.md) plus skills, templates, and checklists.

It helps an AI agent act less like a code generator and more like a senior technical leader who can:

design production Retrieval-Augmented Generation (RAG) systems
own Pinecone namespace strategy, chunking, embeddings, hybrid retrieval, and reranking decisions
define golden datasets, retrieval metrics, generation scoring, latency benchmarks, and regression gates
instrument AI systems for cost, token usage, quality signals, and anomaly detection
turn architecture into high-quality GitHub issues that simpler coding agents can execute
keep model routing and unit economics visible without sacrificing output quality
document clear reasoning, tradeoffs, acceptance criteria, rollout plans, and rollback paths

The core idea: production AI architecture should be executable by implementation agents.

Who this is for

Use this if you are:

building production RAG systems
designing LLM agents connected to internal tools, APIs, or data sources
managing multiple Pinecone namespaces or retrieval strategies
trying to make AI features ship in weeks, not months
creating eval frameworks for retrieval, generation, and agent workflows
writing ADRs, architecture docs, or GitHub issues for AI engineering teams
using Hermes Agent profiles and skills to specialize agent behavior

You can use the materials directly with Hermes Agent, or adapt the templates and checklists for Claude Code, Codex, OpenCode, Cursor, or your own agent workflows.

What's inside

rag-architect/
├── SOUL.md
├── skills/
│   └── ai-architecture/
│       ├── principal-ai-architect/
│       │   ├── SKILL.md
│       │   ├── templates/adr.md
│       │   └── references/architecture-output-checklist.md
│       ├── production-rag-architecture/
│       │   ├── SKILL.md
│       │   ├── templates/rag-design-review.md
│       │   └── references/namespace-decision-matrix.md
│       ├── rag-evaluation-observability/
│       │   ├── SKILL.md
│       │   ├── templates/eval-plan.md
│       │   └── references/llm-judge-rubric.md
│       └── agentic-workflow-issue-factory/
│           ├── SKILL.md
│           ├── templates/implementation-issue.md
│           └── templates/agent-tool-design.md
└── scripts/validate.py

SOUL.md

SOUL.md defines the persistent role identity and operating standard for the rag-architect profile. It tells the agent to optimize for:

measurable business impact
production reliability
evaluation-driven iteration
clean abstractions
cost-aware model routing
observability and anomaly detection
documentation that simpler coding agents can execute

Skills

| Skill | Use it for |
| --- | --- |
| `principal-ai-architect` | principal-level architecture, ADRs, roadmap slicing, tradeoff analysis, business-aligned planning |
| `production-rag-architecture` | ingestion, chunking, embeddings, Pinecone namespaces, hybrid retrieval, reranking, context assembly |
| `rag-evaluation-observability` | golden datasets, retrieval/generation metrics, regression gates, LLM-as-judge rubrics, traces, latency/cost benchmarks |
| `agentic-workflow-issue-factory` | GitHub issues, implementation specs, acceptance criteria, rollout notes, agent tool design |

Quick start with Hermes Agent

Hermes Agent supports profiles and profile-local skills. This repo is structured so you can copy it into a rag-architect profile.

1. Clone the repo

git clone https://github.com/greynewell/rag-architect.git
cd rag-architect

2. Install into a Hermes profile

mkdir -p ~/.hermes/profiles/rag-architect
cp SOUL.md ~/.hermes/profiles/rag-architect/SOUL.md
mkdir -p ~/.hermes/profiles/rag-architect/skills
cp -R skills/* ~/.hermes/profiles/rag-architect/skills/

3. Launch Hermes with the profile

hermes --profile rag-architect

Then ask for work like:

Design a production RAG architecture for our internal knowledge base.

Turn this RAG roadmap into GitHub issues that coding agents can implement.

Review our Pinecone namespace strategy and propose eval gates.

Create an ADR for model routing and cost-tiering.

Use without Hermes

You can still use this repo as a standalone architecture toolkit:

read SOUL.md as the operating charter
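use skills/ai-architecture/principal-ai-architect/templates/adr.md for architecture decisions
use skills/ai-architecture/production-rag-architecture/templates/rag-design-review.md for RAG design reviews
use skills/ai-architecture/rag-evaluation-observability/templates/eval-plan.md for evaluation plans
use skills/ai-architecture/agentic-workflow-issue-factory/templates/implementation-issue.md for GitHub issues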
paste relevant SKILL.md files into your preferred coding agent as task context

Example outputs this profile is designed to produce

Architecture decision records

Pinecone multi-namespace strategy
embedding model selection
hybrid retrieval and reranking policy
model routing and cost-tiering
agent tool side-effect and permission policy

GitHub issue specs

Add Recall@k breakdown by namespace
Add no-answer eval cases for hallucination resistance
Add cost-per-request trace fields
Add hybrid retrieval eval slice for acronym-heavy queries
Add router logging for selected namespace and model tier
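
As an illustration of what the trace-field and router-logging issues above might ask for, here is a minimal sketch of a per-request trace record. Everything in it is an assumption made for the example: the `RequestTrace` class, its field names, the tier names, and the prices in `PRICE_PER_1K` are hypothetical and do not come from this repo or any provider's price list.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical USD prices per 1K tokens for two made-up model tiers.
# Replace with the real prices for whatever models you route to.
PRICE_PER_1K = {
    "small": {"input": 0.0005, "output": 0.0015},
    "large": {"input": 0.0050, "output": 0.0150},
}


@dataclass
class RequestTrace:
    """One trace record per request (illustrative field names)."""

    request_id: str
    namespace: str      # namespace the router selected
    model_tier: str     # model tier the router selected
    input_tokens: int
    output_tokens: int
    latency_ms: float

    def cost_usd(self) -> float:
        """Estimate request cost from token counts and the tier's prices."""
        price = PRICE_PER_1K[self.model_tier]
        return (
            self.input_tokens / 1000 * price["input"]
            + self.output_tokens / 1000 * price["output"]
        )

    def to_log_line(self) -> str:
        """Serialize the trace plus its derived cost as one JSON log line."""
        record = asdict(self)
        record["cost_usd"] = round(self.cost_usd(), 6)
        return json.dumps(record, sort_keys=True)


trace = RequestTrace("req-001", "hr-policies", "small", 2000, 1000, 412.5)
print(trace.to_log_line())
```

Logging the selected namespace and model tier next to token counts keeps cost attributable to individual routing decisions, not only to the system as a whole.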

Evaluation plans

golden dataset schema
retrieval metrics: Recall@k, MRR, NDCG@k, context precision
generation metrics: faithfulness, relevance, completeness, citation quality
agent metrics: task completion, tool correctness, escalation behavior
operational metrics: p50/p95/p99 latency, cost per successful answer
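
The retrieval metrics listed above can be sketched in a few lines. This is a minimal illustration, not the repo's eval code: it assumes binary relevance labels, unique document IDs in each ranked list, and hypothetical function names.

```python
import math
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[tuple]) -> float:
    """MRR: average reciprocal rank over (retrieved, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """NDCG@k with binary gains: DCG of the ranking over DCG of an ideal one."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# One golden-dataset query: two relevant documents, four retrieved.
retrieved = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(retrieved, relevant, k=3))
print(reciprocal_rank(retrieved, relevant))
print(ndcg_at_k(retrieved, relevant, k=3))
```

Graded relevance labels would need a gain term in place of the constant 1.0 used here.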

Design philosophy

Production AI systems fail in ways demos do not reveal.

A useful AI architecture agent must therefore ask:

What business workflow improves?
What evidence proves quality improved?
What happens when retrieval confidence is low?
What is the cost per successful answer?
What trace explains this exact model response?
What regression gate prevents a silent quality drop?
What issue can a coding agent implement in one to three sessions?

rag-architect encodes those questions into reusable skills, templates, and checklists.
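
To make the regression-gate question concrete, here is a minimal sketch of a gate that compares a candidate eval run against a stored baseline. The function name, metric names, and thresholds are hypothetical, and the sketch assumes every gated metric is higher-is-better.

```python
from typing import Dict, List


def regression_gate(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    max_drop: Dict[str, float],
) -> List[str]:
    """Return a list of failure messages; an empty list means the gate passes.

    Only metrics named in max_drop are gated. Each is assumed to be
    higher-is-better, so a failure is a drop larger than the allowed amount.
    """
    failures = []
    for metric, allowed in max_drop.items():
        if metric not in baseline or metric not in candidate:
            failures.append(f"{metric}: missing from baseline or candidate run")
            continue
        drop = baseline[metric] - candidate[metric]
        if drop > allowed:
            failures.append(
                f"{metric}: dropped {drop:.3f} (allowed {allowed:.3f})"
            )
    return failures


# Illustrative numbers only, not results from any real system.
baseline = {"recall_at_5": 0.82, "faithfulness": 0.91}
candidate = {"recall_at_5": 0.80, "faithfulness": 0.905}
max_drop = {"recall_at_5": 0.01, "faithfulness": 0.01}

failures = regression_gate(baseline, candidate, max_drop)
for message in failures:
    print("GATE FAILED:", message)
```

In CI, a non-empty failure list would typically translate into a non-zero exit code so that a silent quality drop blocks the merge.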

Validation

Run the local validator before publishing changes:

python3 scripts/validate.py

It checks:

skill frontmatter
required fields
linked template/reference files
markdown presence
common secret-like strings
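
For readers who want a feel for the first two checks, the sketch below shows one way a frontmatter check could work. It is not the code in scripts/validate.py: it assumes each SKILL.md opens with a `---`-delimited block of simple `key: value` lines, and the required field names (`name`, `description`) are a guess made for the example.

```python
from typing import Dict, List

# Assumed required fields; the real validator may require different ones.
REQUIRED_FIELDS = ("name", "description")


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Parse top-level `key: value` pairs from a leading `---` block.

    Returns an empty dict if the block is missing or never closed.
    Nested or multi-line YAML values are ignored in this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        # Skip indented (nested) lines and lines without a colon.
        if sep and key.strip() and not key.startswith((" ", "\t")):
            fields[key.strip()] = value.strip()
    return {}  # reached end of file without a closing delimiter


def missing_fields(text: str) -> List[str]:
    """List the required fields that are absent or empty."""
    fields = parse_frontmatter(text)
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


good = "---\nname: example-skill\ndescription: Example only.\n---\n# Body\n"
bad = "---\nname: example-skill\n---\n# Body\n"
print(missing_fields(good))
print(missing_fields(bad))
```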

Repository topics

Recommended GitHub topics for discoverability:

rag
retrieval-augmented-generation
llm
ai-agents
agentic-workflows
ai-architecture
llmops
mlops
evaluation
observability
pinecone
hermes-agent
prompt-engineering
golden-datasets
model-routing

Contributing

Contributions are welcome.

Good contributions include:

sharper RAG design checklists
better eval rubrics
additional ADR templates
more production observability fields
examples from real agentic workflows
improvements to Hermes Agent installation docs
issue templates for AI engineering teams

Please keep contributions practical, implementation-ready, and grounded in production AI systems.

Star and share

If rag-architect helps you design better production AI systems:

⭐ star the repo
share it with RAG/LLM engineering teams
use the templates in your next architecture review
open an issue with your own production RAG lessons

License

MIT License. See LICENSE.
