Commit title authored by GitHub Copilot; the change itself, and this description, were authored by the signing contributor.
I don't want to play devil's advocate, but what is considered LLM usage? I think it's more of a spectrum than a boolean. Is asking a conversational AI service to help you with an instruction LLM usage? If the answer to any of the above is no, how do you actually prove empirically that something was written directly by an AI and was not simply AI-adjacent (AI helped but didn't write the file directly)?

generally speaking anything that uses an llm in any way constitutes llm usage
Just to be clear, if at any point during my PR I ask an LLM about anything related to the PR, e.g
That would automatically mark the entire PR as LLM-assisted? That seems a bit harsh.

I believe the point is about llm generated content being used
The policy covers any WRITE operations on the repository, including contents, commits, issues, and pull requests. Simply asking a question does not fall into that scope.
adrienntindall left a comment
Will merge on approval from @tgsm
This should be stated in the policy.
Expanded AI policy to clarify disclosure requirements and exceptions. darn copilot prefilling commit messages and descriptions... where do i turn this off?
37f3ae0
- Pull request reviews
- Issues
- Issue comments
- Messages to maintainers
A couple thoughts on the implementation side of this policy. It might help to even further clarify the boundary of what counts as LLM usage. The policy already distinguishes between contributing text and “asking questions about the code”, but there are a few potential gray areas that might still come up in practice. For example:
Explicitly stating whether those fall under disclosure could help avoid confusion.

The disclosure requirement itself might benefit from being a bit more specific. Right now it says LLM usage must be disclosed in full, but it doesn’t say where or how. For example, should that go in the PR description, the commit message, or somewhere else? It is also unclear how much detail contributors need to provide about the extent of LLM involvement. A small guideline there could make compliance easier.

Since detection may not always be straightforward, it might be worth considering a warning for first-time violations where intent to conceal isn’t clear. That would give contributors a chance to correct their behavior while still allowing maintainers to take stronger action in cases of deliberate nondisclosure.

Overall the intent of the policy makes sense; these are just a few areas where a little extra clarity might help with implementation.

this is already pretty clear imo

I agree with red's interpretation. Basically, if we imagine copyright could be assigned to AI-generated content, then the content subject to that hypothetical copyright is what must be disclosed. So copy-pasting AI-generated code must be disclosed; modifying AI-generated code produces a derivative work, so it must also be disclosed; and letting AI modify your own code produces an AI-generated derivative work, which must also be disclosed. When you ask the AI how to do something and then do it yourself, the AI has no authorship over your implementation, so it doesn't need to be disclosed. The same applies to text in issues as to code.

I agree with tkolarik that first-time violations should result in a warning instead of an instant ban, and that the form of disclosure should be specified. I believe it makes more sense for disclosure about code to be in the commit description as well as the PR. Pull requests are a GitHub thing separate from the actual Git repository, whereas the commit description is guaranteed to remain for as long as the repository exists.

Though I'm personally not too fond of accepting usage of AI, for what it's worth. And from what I've seen, AI-written PR summaries and the like are kind of weird.

I've been mulling over what the AI policy should be, and I think I want to draw a stricter boundary than what's already in here. Mainly, I think that the use of an LLM to generate C from assembly or to document existing code should be outright banned. Other use cases, like using it to automate tedious tasks such as function name changes, are permissible.

These AI PRs only attempt to take low-hanging fruit, which can be done by a skilled maintainer in a matter of minutes. The use of AI to decompile these files therefore doesn't significantly speed up the actual decompilation effort, since we would still need to go back and document everything anyway. I'm also concerned, after viewing one of the accounts that made an AI contribution, that these "contributors" are reputation farming by opening easy and inoffensive PRs that don't have any glaring problems. Not to mention that one of them outright stole the code of a separate, already open PR.

Then in terms of documentation, it should be obvious why an AI shouldn't document code that is meant to be used by humans, especially since from what I've seen it only recognizes how to document generic functions (alloc, accessors, setters) and not anything more intricate than that.
-->
- [ ] This work adheres to the [AI policy](CONTRIBUTING.md#ai-policy).

## **Discord contact info**
I think that Discord verification is important, but I'm hesitant to require it as part of a PR for everyone since it'll be public information, and Discord stalkers are real (speaking from experience). It might just be worth having them join and verify by sending a message in #pokeheartgold.
We can add the following to the above checklist instead:
- [ ] The author has joined the pret Discord community and sent a message in #pokeheartgold to verify that they are human
Reasonably, though, any human contributor would do this anyway, so it might be redundant.
Commit and PR title authored by GitHub Copilot; the change itself, and this description, were authored by the signing contributor.