[KYUUBI #7445] Add AGENTS.md to guide AI coding agents#7452
Open
wangzhigang1999 wants to merge 14 commits into
Open
[KYUUBI #7445] Add AGENTS.md to guide AI coding agents#7452wangzhigang1999 wants to merge 14 commits into
wangzhigang1999 wants to merge 14 commits into
Conversation
Verified all commands, file paths, class names, and Maven profiles against the actual repository. Corrected the following: - Thrift IDL is upstream (Hive TCLIService via kyuubi-relocated-thrift); rewrote the wire-format references accordingly. Only Kyuubi-side delta is python/scripts/thrift-patches/. - SessionManager and OperationManager live in kyuubi-common, not kyuubi-server. Corrected high-sensitivity area paths. - Replaced KyuubiBaseConnectionTest (does not exist) with the actual KyuubiStatementTest example. - Replaced PlanOnlyStatementSuite (does not exist) with PlanOnlyOperationSuite. - Replaced the broad "KYUUBI_UPDATE=1 build/mvn clean test" with dev/gen/gen_all_config_docs.sh, which is the scoped wrapper that actually regenerates settings.md. - Rewrote the CI failure investigation section: Kyuubi CI does not emit per-test check-run annotations the way Spark CI does, so the recipe now uses gh run --log-failed and artifact download. - Enumerated the externals/* engine modules explicitly instead of relying on a glob pattern that did not cover kyuubi-trino-engine.
Verified against root pom.xml, kyuubi-common/pom.xml, ConfigBuilder.scala, and live source. Corrections: - Removed externals/kyuubi-chat-engine reference; the CHAT engine was removed in Kyuubi 1.12 (per docs/deployment/migration-guide.md). - Rephrased the engine profile matrix: only Spark and Scala have version profiles in pom.xml; Flink (1.20.3), Hive (3.1.3), and Trino client (411) are pinned to single versions. - Java tests use JUnit 4 (no AssertJ dependency in any pom.xml); corrected the testing section accordingly. - kyuubi-common does not contain in-tree Thrift IDL; described it as hosting relocated Hive Thrift RPC types instead. - Distinguished integration-style tests from unit tests for build/dist prerequisite (per docs/contributing/code/testing.md). - Server-only configs are declared via .serverOnly on the ConfigBuilder; serverOnlyConfEntries is the private registry that tracks them. - Softened the "Never" rules for process.destroy() and SparkSession.active to "do not introduce new usage", since legacy usages exist in Spark plugin entry points that require a no-arg constructor.
Markdown reflows naturally on render; the hard wraps at ~70 chars were a stylistic choice from the initial draft that made the source harder to edit. Code fences, tables, nested list items, and the license header are preserved unchanged.
- Server boundary: rephrase from "no Spark/Flink/Hive/Trino dependency" to "no Kyuubi engine module dependency". The server legitimately uses io.trino:trino-client for the Trino frontend and similar protocol/client libraries are allowed. - Engine profile matrix: Flink also has version profiles (flink-1.17/1.18/1.19/1.20). Updated the table and the surrounding text; Hive and Trino remain pinned. - Spark SQL syntax: Kyuubi extensions already add project-specific clauses (OPTIMIZE ... ZORDER BY). Rephrased from "no SQL syntax that Spark doesn't support" to require maintainer agreement and version-gated parser support for new clauses. - Thrift dependencies: kyuubi-common pulls both kyuubi-relocated-thrift and kyuubi-relocated-hive-service-rpc; spelled both out. - Utils.isTesting: legacy usages remain in production (EngineRef, JdbcProcessBuilder, TrinoProcessBuilder, BatchesResource). Softened from "never" to "do not introduce new ones without reviewer agreement". - git config user.email: removed the "reviewers will block" framing, which is not documented as a hard policy; kept the practical fix-it guidance.
…d process.destroy primary rules - Pre-flight: warn that Apache RAT scans the working tree and only honors .rat-excludes (not .gitignore), so private ignored files outside the repo root are safer. - Scala specifics: rephrase the SparkSession.active rule to match the softened Never-list wording (allow Spark plugin/catalog entry points with no parameterized path). - Error handling: rephrase the process.destroy() rule to "do not introduce direct process.destroy() / destroyForcibly()" for consistency with the Never list.
… Spark/Iceberg style
pan3793
reviewed
May 17, 2026
Member
pan3793
left a comment
There was a problem hiding this comment.
@wangzhigang1999, thank you for drafting this proposal, I went through it and left some comments, some points might need more discussion.
- Use `apache` remote; Kyuubi scripts assume the name - Engine list trails with `etc.`; drop Hive from the example set - Add `-am` to single-module Maven examples - Spell out "dependency" instead of "dep" - Drop the "Prefer Java" stance; both languages are first-class - Soften `Option[T]` rule with perf exception - Remove Python thrift-patches reference - Note Thrift Operation reuse with operation-level config keys - Cite fixed version when referencing upstream issues - Keep PR prose precise and concise - Switch AI disclosure to `Assisted-by: <AGENT>:<MODEL>` - Loosen Java test note to defer to surrounding module style - Mention GitHub Discussions alongside the dev mailing list
f9f67d4 to
d0c685c
Compare
…NT>:<MODEL_ID> - AGENTS.md: pin the example to the canonical model id (`claude-opus-4-7`). - PULL_REQUEST_TEMPLATE: rename the field from `Generated-by:` to `Assisted-by:`, broaden the prompt to cover review/exploration, and align the example.
aajisaka
reviewed
May 19, 2026
aajisaka
approved these changes
May 20, 2026
pan3793
reviewed
May 20, 2026
|
|
||
| Allowed under [ASF generative tooling guidance](https://www.apache.org/legal/generative-tooling.html) and Kyuubi's PR template. AI is used for more than code generation (review, reading, solution exploration), so disclose assistance rather than implying the patch was machine-authored. | ||
|
|
||
| - Disclose assistance via `Assisted-by: <AGENT>:<MODEL_ID_OR_VERSION>` in the PR template field, following the [Linux kernel convention](https://docs.kernel.org/process/coding-assistants.html). Example: `Assisted-by: Claude:claude-opus-4-7`. |
Member
There was a problem hiding this comment.
since we don't strictly follow Linux policy, we don't need to mention that, otherwise, this gives the LLM a noise to have one additional fetch.
other community has a similar but a little different policy, e.g.
https://llvm.org/docs/AIToolPolicy.html
For instance, use a commit message trailer like `Assisted-by: `. This transparency helps the community develop best practices and understand the role of these new tools.
I think we can have a more relaxed value for the Assisted-by: tag, all of the following tags are acceptable.
Assisted-by: Claude:claude-opus-4-7Assisted-by: Claude Opus 4.7Assisted-by: OpenCode with DeepSeek V4 Pro
Contributor
Author
There was a problem hiding this comment.
+1 and fixed—this is indeed better.
…t per review - Drop the Linux kernel reference and the strict `<AGENT>:<MODEL>` form; state that the value is free-form, listing examples that cover the canonical model id, a human-readable variant, and a multi-tool form. - Move the "keep prose precise and concise" guidance from the source- Comments section to the PR Workflow section, where it belongs.
pan3793
reviewed
May 20, 2026
pan3793
reviewed
May 20, 2026
pan3793
approved these changes
May 20, 2026
Member
|
will leave it open for another day before merging, in case other reviewers may want to have a look. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why are the changes needed?
Resolves #7445. Adds
AGENTS.mdat the repo root — a terse, command-oriented guide for AI coding agents contributing to Kyuubi. ComplementsCONTRIBUTING.md. Content is distilled from review feedback on landed PRs and aligned with sibling Apache projects (Spark, Iceberg, Airflow, DataFusion); the methodology and underlying rule set are linked in #7445.How was this patch tested?
Documentation-only change. Every referenced command, Maven profile, script path, class name, and file path was verified against the current
master. Maven invocations (-Pfast,-DwildcardSuites,-Dtest,dev/reformatvia Spotless) were run on tiny modules to confirm the documented usage actually executes.Was this patch authored or co-authored using generative AI tooling?
Assisted-by: Claude:claude-opus-4-7