Add AI policy to CONTRIBUTING.md #452

Open
PikalaxALT wants to merge 14 commits into master from contributing-ai-policy
Conversation

@PikalaxALT
Collaborator

@PikalaxALT PikalaxALT commented Feb 15, 2026

Commit and PR title authored by GitHub Copilot; the change itself, and this description, were authored by the signing contributor.

PR checklist

  • This comment contains a description of changes (with reason).
  • The ROM compiles to matching locally (make compare_heartgold && make compare_soulsilver).
  • All C code is correctly formatted (git clang-format).
  • This pull request is labeled according to the kind of work it represents.

Discord contact info

Commit title authored by GitHub Copilot; the change itself, and this description, were authored by the signing contributor.
red031000 previously approved these changes Feb 15, 2026
@welidev

welidev commented Feb 15, 2026

I don't want to play devil's advocate but what is considered LLM usage? I think it's a spectrum more than a boolean.

Is asking a conversational AI service to help you with an instruction LLM usage?
Is giving portions of the code to a conversational AI to understand it considered LLM usage?
Is tab/autocompletion considered LLM usage?

If the answer to any of the above is no, how do you actually prove empirically that something was written directly by an AI and was not simply AI-adjacent (the AI helped but didn't write the file directly)?

@red031000
Member

generally speaking anything that uses an llm in any way constitutes llm usage

@welidev

welidev commented Feb 15, 2026

> generally speaking anything that uses an llm in any way constitutes llm usage

Just to be clear, if at any point during my PR I ask an LLM about anything related to the PR, e.g

hey what does the instruction [foo] do in arm7

That would automatically mark the entire PR as LLM-assisted? That seems a bit harsh.

@red031000
Member

I believe the point is about llm generated content being used
asking an llm a question for explanation doesn't mean that the llm actually created anything in the PR

@PikalaxALT
Collaborator Author

> > generally speaking anything that uses an llm in any way constitutes llm usage
>
> Just to be clear, if at any point during my PR I ask an LLM about anything related to the PR, e.g
>
> hey what does the instruction [foo] do in arm7
>
> That would automatically mark the entire PR as LLM-assisted? That seems a bit harsh.

The policy covers any WRITE operations on the repository, including contents, commits, issues, and pull requests. Simply asking a question does not fall into that scope.

adrienntindall previously approved these changes Feb 16, 2026
Collaborator

@adrienntindall adrienntindall left a comment

Will merge on approval from @tgsm

@tgsm
Collaborator

tgsm commented Feb 16, 2026

> The policy covers any WRITE operations on the repository, including contents, commits, issues, and pull requests. Simply asking a question does not fall into that scope.

This should be stated in the policy.

Expanded AI policy to clarify disclosure requirements and exceptions.

darn copilot prefilling commit messages and descriptions... where do i turn this off?
@PikalaxALT PikalaxALT dismissed stale reviews from red031000 and adrienntindall via 37f3ae0 March 1, 2026 18:13
Comment thread CONTRIBUTING.md Outdated
- Pull request reviews
- Issues
- Issue comments
- Messages to maintaners
Member

maintainers

@tkolarik

tkolarik commented Mar 6, 2026

A couple thoughts on the implementation side of this policy.

It might help to clarify even further the boundary of what counts as LLM usage. The policy already distinguishes between contributing text and “asking questions about the code”, but there are a few potential gray areas that might still come up in practice. For example:

  • simple grammar correction (especially relevant for non-native English speakers)
  • pseudocode generated by an LLM and implemented by a human.

Explicitly stating whether those fall under disclosure could help avoid confusion.

The disclosure requirement itself might benefit from being a bit more specific. Right now it says LLM usage must be disclosed in full, but it doesn’t say where or how. For example, should that go in the PR description, the commit message, or somewhere else? It is also unclear how granular the description of the LLM's involvement needs to be. A small guideline there could make compliance easier.

Since detection may not always be straightforward, it might be worth considering a warning for first-time violations where intent to conceal isn’t clear. That would give contributors a chance to correct behavior while still allowing maintainers to take stronger action in cases of deliberate nondisclosure.

Overall, the intent of the policy makes sense; these are just a few areas where a little extra clarity might help with implementation.

@red031000
Member

this is already pretty clear imo
grammar changes modify content, therefore they are included (suggestion: use a non-llm spellchecker)
pseudocode doesn't introduce anything that's committed, so disclosure isn't required (idk if pseudocode could even be useful here???)
disclosing in the PR body is fine; clarifying that might be an idea, but imo it's not required

@Laquinh
Contributor

Laquinh commented Mar 24, 2026

I agree with red's interpretation.

Basically, if we imagine copyright could be assigned to AI-generated contents, then the contents subject to that hypothetical copyright are the ones that must be disclosed.

So just copy-pasting AI-generated code must be disclosed; modifying AI-generated code would be a derivative work, so it also must be disclosed; and letting AI modify your own code would be an AI-generated derivative work, so it also must be disclosed. When asking the AI how to do something and then doing it yourself, the AI has no authorship over your implementation, so it doesn't need to be disclosed.

All the same with text in issues instead of code.

I agree with tkolarik that first-time violations should result in a warning instead of an instant ban, and that the form of disclosure should be specified. I believe it makes more sense for disclosure about code to be in the commit description as well as the PR. Pull Requests are a GitHub thing separate from the actual Git repository, whereas the commit description is guaranteed to remain for as long as the repository exists.

Though I'm personally not too fond of accepting usage of AI, for what it's worth. And from what I've seen, AI-written PR summaries and the like are kind of weird.

@PikalaxALT PikalaxALT requested a review from red031000 April 12, 2026 00:47
red031000 previously approved these changes Apr 12, 2026
tgsm previously approved these changes Apr 12, 2026
@adrienntindall
Collaborator

I've been mulling over what the AI policy should be, and I think I want to draw a stricter boundary than what's already in here.

Mainly, I think that the use of an LLM to generate C from assembly or to document existing code should be outright banned. Other use cases, such as automating tedious tasks like function name changes, are permissible.

These AI PRs only attempt to take low hanging fruit, which can be done by a skilled maintainer in a matter of minutes. The use of AI to decompile these files therefore doesn't significantly speed up the actual decompilation efforts since we would still need to go back and document everything anyways.
The alternative is saving the smaller files for cleanup towards the end of the project, or getting to them when we are already working on related code, and in the meantime leaving them available to newer contributors who need an easy entry point to learn the basics of asm -> C and the style of the code in the repository.

I'm also concerned after viewing one of these accounts that made an AI contribution that these "contributors" are doing reputation farming via opening easy and inoffensive PRs that don't have any glaring problems with them. Not to mention that one of them outright stole the code of a separate, already open PR.

Then in terms of documentation, it should be obvious why an AI shouldn't document code that is meant to be used by humans, especially since from what I've seen it only recognizes how to document generic functions (alloc, accessors, setters) and not anything more intricate than that.

@PikalaxALT PikalaxALT dismissed stale reviews from tgsm and red031000 via b359e4d April 12, 2026 04:22
adrienntindall previously approved these changes Apr 12, 2026
Comment thread .github/pull_request_template.md
Comment thread .github/pull_request_template.md Outdated
-->
- [ ] This work adheres to the [AI policy](CONTRIBUTING.md#ai-policy).

## **Discord contact info**
Collaborator

I think that discord verification is important but I'm hesitant to require it as part of a PR for everyone since it'll be public information, and discord stalkers are real (speaking from experience). It might just be worth having them join and verify by sending a message in #pokeheartgold

We can add the following to the above checklist instead

[ ] The author has joined the pret discord community and sent a message in #pokeheartgold to verify that they are human

Reasonably though any human contributor would do this anyways, so it might be redundant.

Comment thread CONTRIBUTING.md Outdated
7 participants