74 commits
2e763ec
bugfix: remove enable_mla in unit test to pass compiling. (#1129)
phantomlei3 Mar 30, 2026
c806739
feat: add an interface for the thread pool to bind cpu-core. (#1112)
Dragonliu2018 Mar 30, 2026
fd45a37
fix: remove sensitive information. (#1132)
zhang-minchao Mar 30, 2026
1ca7807
bugfix: disable block copy kernel by default on non-NPU builds. (#1133)
yingxudeng Mar 30, 2026
c9ecb7e
bugfix: ensure multi-stream initialization takes effect in RecMaster.…
Kang-Meng Mar 30, 2026
a582b36
bugfix: remove sensitive information. (#1137)
yq33victor Mar 30, 2026
5799823
perf: reserve vector capacity before batch push_back. (#1089)
Dragonliu2018 Mar 31, 2026
c475cc6
bugfix: unify model name extraction for model_id and model_version (#…
kuma-loong Mar 31, 2026
2912904
perf: replace push_back with emplace_back to eliminate temporary obje…
Dragonliu2018 Mar 31, 2026
4c681cf
bugfix: fix mla option propagation in speculative engine path. (#1141)
phantomlei3 Mar 31, 2026
1df961a
feat: add rope_in_place tilelang kernel for npu device. (#964)
zhang-minchao Mar 31, 2026
4a31645
bugfix: profile TPOT with real decode-state requests. (#1116)
JimHsiung Mar 31, 2026
fd0ed3d
bugfix: align rec initialization flags with options. (#1142)
yingxudeng Mar 31, 2026
6bf3203
refactor: extract multi-modal input processors to processors dir. (#1…
wly-115 Mar 31, 2026
31cc2ec
feat: add CLAUDE.md/AGENTS.md and code review skills. (#1120)
XuZhang99 Mar 31, 2026
93245b3
docs: add xLLM git workflow skill. (#1148)
RobbieLeung Mar 31, 2026
d44bbdf
chore: update submodule configuration and initialization commands to …
XuZhang99 Apr 1, 2026
63c727f
feat: add flashinfer version sampling ops. (#1156)
XuZhang99 Apr 1, 2026
e85c131
bugfix: compute q_cu_seq_lens during SHM deserialization to fix NPU t…
yq33victor Apr 1, 2026
fcfd12c
feat: support oxygenvlm model on mlu device. (#1131)
phantomlei3 Apr 2, 2026
e539c93
feat: support joyai-llm-flash model on npu device. (#1121)
longhui-z Apr 2, 2026
f3e59ac
bugfix: reslove prefill coredump when enable piecewisegraph and tp on…
XuZhang99 Apr 2, 2026
5b3a9a4
feat: support torch2.10.0+cu130 for cuda device. (#1166)
XuZhang99 Apr 3, 2026
87d9e35
feat: implement column parallel for lm head to improve performance. (…
wxh571001500 Apr 3, 2026
7cb0377
feat: support QwenImageEditPlus pipeline on npu deivce. (#1163)
shan-chen-feng Apr 3, 2026
c728037
bugfix: fix undefined tensor device crash in DP empty-input path. (#1…
yq33victor Apr 5, 2026
80167b8
bugfix: rollback shared prefix blocks on allocate failure. (#1146)
RobbieLeung Apr 5, 2026
6fc6cce
feat: support etcd auth in etcd client via environment variables. (#1…
JimHsiung Apr 5, 2026
34c9a59
perf: optimize qwen3.5 hybrid linear cache flow[4/N]. (#1160)
JC-ut0 Apr 5, 2026
8eb6391
feat: support video inference for Qwen3-VL on NPU device. (#1151)
xanecdotex Apr 5, 2026
f9e58db
refactor: extract chat json preprocessing into ChatJsonParser class h…
liutongxuan Apr 7, 2026
2e5dbf2
feat: support namespace prefix for etcd key. (#1189)
JimHsiung Apr 7, 2026
c5ad9a8
feat: add onerec model implement[3/N]. (#1050)
DragonFive Apr 7, 2026
ae9c62d
bugfix: fix qwen3.5 gated delta net conv state indices for acl graph[…
yingxudeng Apr 7, 2026
ee22666
refactor: speed up compilation on cuda device. (#1200)
XuZhang99 Apr 7, 2026
918f99c
refactor: rename .agent to .agents and refine AGENTS.md. (#1208)
XuZhang99 Apr 7, 2026
f57028e
feat: add logging for server process lifecycle. (#1206)
XuZhang99 Apr 7, 2026
7e04865
feat: add cuda block copy kernel support. (#1169)
RobbieLeung Apr 8, 2026
b3e1c16
feat: adapt MooncakeTransferEngine for AscendDirectTransport. (#1201)
Clement-Wang26 Apr 8, 2026
a4bc532
refactor: remove zero page for xtensor. (#1205)
Clement-Wang26 Apr 8, 2026
d4490cf
feat: add onerec model implement[4/N]. (#1051)
DragonFive Apr 8, 2026
990393f
feat: add onerec in supported model docs and align rec utility style.…
DragonFive Apr 8, 2026
535b3bc
docs: add bilingual generative recommendation design docs. (#1182)
DragonFive Apr 8, 2026
0f7d258
feat: support flux model on mlu device. (#1138)
phantomlei3 Apr 8, 2026
ef7c18e
bugfix: fix mtp sampling mixed issue. (#1128)
phantomlei3 Apr 8, 2026
fb683db
bugfix: fix DeepSeek-3.2 failures when ACL Graph is enabled. (#1172)
DongheJin Apr 8, 2026
966b813
feat: add constrain decodeing for onerec. (#1158)
DragonFive Apr 9, 2026
eaf90ac
feat: route REC defaults by pipeline for different model. (#1183)
DragonFive Apr 9, 2026
df61ba7
feat: support ds v3 prefix cache and chunked prefill. (#1239)
phantomlei3 Apr 9, 2026
ad32b18
refactor: eliminate FLAGS_backend from APIService with ServingMode an…
liutongxuan Apr 10, 2026
b98b9b9
refactor: optimize beam search sequence reuse. (#1223)
RobbieLeung Apr 10, 2026
d18b16a
bugfix: fallback default attn_mask when chunked prefill yields empty …
yq33victor Apr 10, 2026
3f46e30
bugfix: preserve ND format for OneRec dual embedding. (#1234)
DragonFive Apr 10, 2026
5c44d8a
refactor: move rec model utils and align OneRec style. (#1236)
DragonFive Apr 10, 2026
47acd9f
bugfix: forward REC tokenizer methods through proxy. (#1249)
DragonFive Apr 10, 2026
8a4cc1a
bugfix: fix build error on cuda and ilu device. (#1252)
XuZhang99 Apr 10, 2026
9a3da8c
refactor: unify mooncake kv cache transfer naming and helper APIs. (#…
Clement-Wang26 Apr 10, 2026
713031b
feat: add OneRec REC logprobs and multi-item outputs. (#1184)
DragonFive Apr 10, 2026
d2f8822
refactor: simplify xllm server startup routing and lifecycle helpers.…
liutongxuan Apr 13, 2026
6b311e0
bugfix: correctly reuse residuals in FBCache. (#1265)
z-jun03 Apr 14, 2026
c326cb7
feat: add fused_gdn_gating kernel in tilelang-ascend. (#1267)
zhang-minchao Apr 14, 2026
eae1623
feat: add mlu mooncake pd push support. (#1246)
phantomlei3 Apr 14, 2026
a7a52bb
feat: improve cuda shared memory tensor handling. (#1222)
RobbieLeung Apr 14, 2026
f1816bc
feat: add gemma_rms_norm and fused_qkvzba_split_reshape_cat operator.…
fems14 Apr 14, 2026
fe14046
feat: support embedding interface for all generate VLM models. (#1136)
xanecdotex Apr 14, 2026
2f46b55
bugfix: align HTTP content types with vLLM. (#1275)
RobbieLeung Apr 14, 2026
17f4427
refactor: change torch::empty to torch::zeros when allocating kvcache…
XuZhang99 Apr 15, 2026
43b9833
refactor: clean up repository root layout. (#1279)
RobbieLeung Apr 15, 2026
d1fd2dd
refactor: separate kv_cache_transfer from kv_cache. (#1283)
XuZhang99 Apr 15, 2026
7474fe0
feat: support Qwen down_proj fallback for compressed-tensors ignored …
yingxudeng Apr 15, 2026
460bdd9
bugfix: fix llm default stream execution without overlap. (#1274)
RobbieLeung Apr 15, 2026
7f42ffa
add glm5.0 cp configuration
ltdo111 Apr 15, 2026
324510d
add GLM/CP handle longseq benchmark data
ltdo111 Apr 15, 2026
c4418c6
update some data
ltdo111 Apr 15, 2026
115 changes: 115 additions & 0 deletions .agents/skills/code-review/SKILL.md
@@ -0,0 +1,115 @@
---
name: code-review
description: Review code changes for quality, security, performance, and correctness following project-specific standards. Use when reviewing pull requests, examining git diffs, or when the user asks for a code review. This skill should be used proactively — when the user asks for a review without specifying commits, automatically detect the current branch and diff against the main branch.
---

# Code Review

## Workflow

### Step 1: Determine the diff

If the user provides explicit SHAs or a PR link, use those. Otherwise, **auto-detect**:

```bash
# Fetch latest remote state
git fetch origin main --quiet

# Detect current branch
CURRENT_BRANCH=$(git branch --show-current)

# Find the merge base with origin/main
MERGE_BASE=$(git merge-base origin/main HEAD)

# Show what changed
git diff --stat "$MERGE_BASE"..HEAD
git diff "$MERGE_BASE"..HEAD
```

If `CURRENT_BRANCH` is `main`, warn the user and ask which commits to review.
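
The branch check described above could be sketched as follows. This is a minimal sketch, not part of the skill itself: it assumes the default branch is named `main`, and `needs_explicit_range` is a hypothetical helper name introduced only for illustration. It also treats an empty branch name as needing an explicit range, since `git branch --show-current` prints nothing on a detached HEAD.

```bash
# Hypothetical helper: succeeds (exit 0) when the branch name means there is
# nothing sensible to auto-diff, so the user must name the commits to review.
needs_explicit_range() {
  branch="$1"
  # An empty name covers the detached-HEAD case.
  [ -z "$branch" ] || [ "$branch" = "main" ]
}

# Outside a git repository this falls back to an empty branch name.
CURRENT_BRANCH=$(git branch --show-current 2>/dev/null || true)
if needs_explicit_range "$CURRENT_BRANCH"; then
  echo "On main (or detached HEAD): ask the user which commits to review." >&2
fi
```

Keeping the check in a small function makes it easy to adjust if the repository's default branch has a different name.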

### Step 2: Read project standards

Read [custom-code-style.md](references/custom-code-style.md) for project-specific coding style.

### Step 3: Review against the checklist

**Correctness:**
- Logic handles edge cases and boundary conditions
- Error handling is comprehensive (no silent failures)
- Type safety maintained (no unsafe casts, proper use of `std::optional`)
- Resource lifecycle correct (RAII, no leaks, proper cleanup order)

**Architecture:**
- Clean separation of concerns, no layer violations
- Dependencies flow in the correct direction
- Changes align with existing patterns in the codebase
- No unnecessary coupling introduced

**Performance & Concurrency:**
- No performance regressions on hot paths
- Thread safety: proper locking, no data races
- CUDA/NPU kernels: memory coalescing, occupancy, sync correctness
- No unnecessary copies of large objects (tensors, vectors)

**Testing:**
- Tests verify actual logic, not just mock wiring
- Edge cases and error paths covered
- Integration tests for cross-component changes

**Production Readiness:**
- Backward compatibility maintained (or breaking changes documented)
- Migration strategy for schema/config changes
- No hardcoded values that should be configurable

### Step 4: Output findings

Use the format below.

## Output Format

### Strengths
[Specific things done well, with file:line references]

### Issues

#### Critical (Must Fix)
[Bugs, security holes, data loss risks, broken functionality]

#### Important (Should Fix)
[Architecture problems, missing error handling, test gaps, performance issues]

#### Minor (Nice to Have)
[Style, optimization opportunities, documentation improvements]

**Each issue must include:**
- **File:line** reference
- **What** is wrong
- **Why** it matters
- **How** to fix (if not obvious)

### Recommendations
[Broader improvements for code quality, architecture, or process]

### Assessment

**Ready to merge?** [Yes / No / With fixes]

**Reasoning:** [1-2 sentence technical assessment]

## Rules

**DO:**
- Apply project-specific style from [custom-code-style.md](references/custom-code-style.md)
- Check that changes follow DDD (Domain-Driven Design) principles and keep the codebase clean and maintainable
- Categorize by actual severity (not everything is Critical)
- Be specific with file:line references
- Explain WHY issues matter
- Acknowledge strengths
- Give a clear verdict

**DON'T:**
- Approve without thorough review
- Mark nitpicks as Critical
- Give feedback on code not in the diff
- Be vague (e.g., "improve error handling" without specifics)
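
To keep feedback inside the diff and to make file:line references easy to produce, one option is to list the line ranges each hunk touches. The sketch below is an assumption-laden illustration, not part of the skill: `changed_ranges` is a hypothetical helper that reads a unified diff on stdin and prints `file:start-end` for each hunk on the new side. It assumes file paths contain no spaces and that no added line itself begins with `++ `.

```bash
# Hypothetical helper: print "file:start-end" for every hunk in a unified diff.
changed_ranges() {
  awk '
    # New-side file header, e.g. "+++ b/path/to/file"
    /^\+\+\+ / { file = $2; sub(/^b\//, "", file); next }
    # Hunk header, e.g. "@@ -10,2 +12,3 @@"
    /^@@ / {
      split($3, parts, ",")
      start = substr(parts[1], 2)                 # drop the leading "+"
      count = (parts[2] == "") ? 1 : parts[2]     # an omitted count means 1
      # Skip pure deletions, which have no lines on the new side.
      if (count > 0) printf "%s:%d-%d\n", file, start, start + count - 1
    }
  '
}
```

With the variables from Step 1, a typical call would be `git diff -U0 "$MERGE_BASE"..HEAD | changed_ranges`, where `-U0` drops context lines so the ranges cover only changed lines.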