Add codespell: config + action (to detect new typos) and make it fix a good number of them #11195
yarikoptic wants to merge 3 commits into
humitos left a comment
Thanks for the PR 👍🏼
I didn't do a full review, but I noted a few issues that I think will generate conflicts. In particular, those that change Python code and require some manual interaction (e.g. a Django migration to update the help_text).
I'm not opposed to this work, but I find it hard for the core team to prioritize it at this point.
We use pre-commit for all other linting checks. I feel that if we implement this, it should be in pre-commit, not GHA.
Hmm, interesting, my script should've detected the presence of a pre-commit config... Will have a look whenever I get a chance. You do have a CI job running pre-commit, I assume? (I don't see a report from the pre-commit service.)
That explains it:
❯ ls -ld .pre-commit-config.yaml
lrwxrwxrwx 1 yoh yoh 29 Mar 7 18:00 .pre-commit-config.yaml -> common/pre-commit-config.yaml
❯ cat .pre-commit-config.yaml
cat: .pre-commit-config.yaml: No such file or directory
❯ git submodule
-4af0fffd2cbeeb40f0a71b875beb99d6dc88a355 common
Let's do the submodules dance now...
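For readers following along: the leading `-` in the `git submodule` output means the `common` submodule is not initialised, so the symlink points at a file that is not there yet. A minimal Python sketch of that situation follows; the helper name `is_dangling_symlink` and the scratch directory are illustrative only and not part of this PR.

```python
import os
import tempfile


def is_dangling_symlink(path: str) -> bool:
    """Return True when `path` is a symlink whose target does not exist."""
    # islink() looks at the link itself; exists() follows it to the target.
    return os.path.islink(path) and not os.path.exists(path)


# Recreate the layout from the session above in a throwaway directory.
with tempfile.TemporaryDirectory() as repo:
    # An empty `common/` stands in for the uninitialised submodule.
    os.mkdir(os.path.join(repo, "common"))
    link = os.path.join(repo, ".pre-commit-config.yaml")
    os.symlink("common/pre-commit-config.yaml", link)
    dangling_before = is_dangling_symlink(link)

    # Creating the target stands in for `git submodule update --init`.
    open(os.path.join(repo, "common", "pre-commit-config.yaml"), "w").close()
    dangling_after = is_dangling_symlink(link)

print(dangling_before, dangling_after)
```

A detection script that only tests whether the config file can be opened would miss the config in the first state, which is consistent with what happened here.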
51b4bb1 to ef04ef4
ef04ef4 to 120e693
This PR is great! Thanks for opening it! I'd like to move forward with it. I've already reviewed readthedocs/common#212. We need to make some adjustments here and then we can move forward 👍🏼. @yarikoptic are you able to continue with this work?
Let's exclude the changelog for now. It has a lot of noise.
We should implement this, if it's not already.
I am able to, as soon as there are cycles on your end, so the PR doesn't amass conflicts to address again.
120e693 to 40091fd
FWIW, I have rebased and redone the codespell fixing (automated and interactive). Let's see if there are no side effects. Please review the diff.
humitos left a comment
This is great! I suppose there may be a few tests failing because there are some words that shouldn't return search results.
  "project": project.slug,
  "version": version.slug,
- "q": "indx",
+ "q": "index",
This was probably an invalid word on purpose.
So we need to annotate this line to be skipped, marking it as OK to contain indx?
I replaced it with an explicit "abracadabra", or does it have to be "indx"? (Let's see if the tests feel better.)
The test is failing. I think it probably needs to be indx for fuzzy search testing. I would leave it as it was to avoid this type of issue, and mark that line to be skipped 👍🏼
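To illustrate why the exact query string could matter for a fuzzy search test, here is a small sketch using plain Levenshtein distance. It is an assumption that the test relies on the search backend tolerating a small number of edits; `MAX_EDITS = 2` and the helper `edit_distance` are illustrative only and not taken from the test suite.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


MAX_EDITS = 2  # assumed fuzziness tolerance, for illustration only

for query in ("indx", "abracadabra"):
    distance = edit_distance(query, "index")
    print(query, distance, distance <= MAX_EDITS)
```

Under that assumption, "indx" is one edit away from "index" and can still match fuzzily, while "abracadabra" is too far away to match anything, so the two queries exercise different behaviour.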
40091fd to c5202bd
This is indeed a typo that I made when I created the model's field.
[codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = .git,*.svg,locale,package-lock.json,*.css,*.min.*,vendor,*.ai,setup.cfg,migrations,CHANGELOG.rst,common
check-hidden = true
# some names and abbreviations and very long lines (minimized?)
# TODO: fixup help_text in readthedocs/builds/models.py : "to perfom an"
ignore-regex = \b(Manuel|DED|Wile E. Coyote|Couldn\\u2019t|to perfom an)\b|.{300,}|"pyton\b|\|(ative|ment)\||"Hel" will match\b|ative: ''|help_text *=.*
# TODO: fix syntaxt -- would require transition?
ignore-words-list = fo,te,astroid,requestor,syntaxt,ore
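As a rough way to reason about what the `ignore-regex` above hides from the checker, the sketch below compiles the same expression with Python's `re` and blanks out whatever it matches. That codespell skips exactly the matched spans is an assumption made for illustration; `mask_ignored` is a hypothetical helper, not codespell's API.

```python
import re

# The expression copied verbatim from the ignore-regex setting above.
IGNORE_REGEX = re.compile(
    r"""\b(Manuel|DED|Wile E. Coyote|Couldn\\u2019t|to perfom an)\b|.{300,}|"pyton\b|\|(ative|ment)\||"Hel" will match\b|ative: ''|help_text *=.*"""
)


def mask_ignored(line: str) -> str:
    """Replace every span matched by the ignore regex with spaces."""
    return IGNORE_REGEX.sub(lambda m: " " * len(m.group()), line)


samples = [
    'help_text = _("Command to perfom an action")',  # hidden by help_text *=.*
    "Manuel fixed it",                                # the name is hidden
    "a typo like perfom stays visible",               # nothing matches here
]
for sample in samples:
    print(repr(mask_ignored(sample)))
```

Note how broad `help_text *=.*` is: it hides the rest of any line that assigns `help_text`, which is worth keeping in mind when deciding whether the TODO above should be resolved with a migration instead.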
Is it possible to put all this configuration on a specific codespell file? Like .codespell.cfg or similar?
We will need to put this file into the common/ repository (readthedocs/common#212) because we want to share this configuration across multiple repositories, e.g. https://github.com/readthedocs/ext-theme
It seems that would be .codespellrc: https://github.com/codespell-project/codespell#using-a-config-file
We want that file to be in common/ and symlinked from here 👍🏼
- .codespellrc symlinked from common/.codespellrc
- codespell added to common/pre-commit-config.yaml
- Updated common submodule to include codespell config
- Skips: locale, migrations, CHANGELOG, media/javascript, search test data, SVG/CSS/AI/minified files
- Regex ignores: camelCase/PascalCase, URLs, hyphenated compounds, long lines, test fixture patterns
- Word ignores: fo/te (ISO 639), astroid, DED, syntaxt (DB migration), requestor, indx, ore

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
Fixed typos where codespell offered multiple suggestions:
- neeed → needed/need (model_views.py, tasks/search.py)
- tha → that (permissions.py)
- explotation → exploitation (middleware.py)
- interating → iterating (utils/__init__.py)
- behing → being (sphinx.py)
- achived → achieved (sphinx.py)
- clonned → cloned (sphinx.py, models.py)
- tupe → tuple (version_handling.py)
- toke → token (test_oauth.py)
- Thi → This, fom → from (git.py)
- parm → param (rclone.py)
- containe → container (test_api.py)
- Custome → Custom (settings/base.py)
- trailling → trailing, Stripe → Strip (querysets.py)
- Bulds → Builds, verion → version (tasks.py)
- treshold → threshold (views/base.py)

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
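The split between fixes that need a human and fixes that `codespell -w` can apply unattended comes down to how many suggestions a dictionary entry carries. The sketch below assumes the `typo->fix1, fix2,` line format for dictionary entries; the `SAMPLE` entries and both helper functions are hypothetical and only illustrate the idea.

```python
def parse_entry(line: str) -> tuple[str, list[str]]:
    """Split a 'typo->fix1, fix2,' style line into the typo and its suggestions."""
    typo, _, rest = line.partition("->")
    fixes = [fix.strip() for fix in rest.split(",") if fix.strip()]
    return typo.strip(), fixes


def split_by_ambiguity(lines: list[str]) -> tuple[dict, dict]:
    """Separate entries with one suggestion from entries with several."""
    automatic, manual = {}, {}
    for line in lines:
        typo, fixes = parse_entry(line)
        (automatic if len(fixes) == 1 else manual)[typo] = fixes
    return automatic, manual


# Hypothetical entries in the assumed format, not copied from the real dictionary.
SAMPLE = [
    "neeed->need, needed,",
    "tha->than, that, the,",
    "acccess->access",
    "becuase->because",
]
automatic, manual = split_by_ambiguity(SAMPLE)
print("automatic:", automatic)
print("manual:", manual)
```

Entries that land in `manual` correspond to the kind of typos listed in the commit above, where the right word depends on the sentence.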
… typos
=== Do not change lines below ===
{
"chain": [],
"cmd": "codespell -w",
"exit": 0,
"extra_inputs": [],
"inputs": [],
"outputs": [],
"pwd": "."
}
^^^ Do not change lines above ^^^
Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
c5202bd to 4c99a8e
@@ -0,0 +1 @@
+common/.codespellrc
\ No newline at end of file
FWIW: frankly, I personally would have delayed moving it to common until making sure all is good here
original
codespell was used occasionally on some files, so not a new tool here. But now it would guard RTD from having typos introduced to begin with.
Note that some fixes could affect (fix) functionality.
There is an odd "variable" syntaxt which I didn't fix, since it seems it would require a transition and smells like it is on purpose:
❯ git grep syntaxt -- readthedocs/
readthedocs/projects/migrations/0106_add_addons_config.py: ("syntaxt", models.CharField(max_length=128)),
readthedocs/projects/models.py: syntaxt = models.CharField(max_length=128)
readthedocs/projects/tasks/builds.py: grab them by using glob syntaxt between other files that could be garbage.
Or was it intended to be syntax?
📚 Documentation previews 📚
docs: https://docs--11195.org.readthedocs.build/en/11195/
dev: https://dev--11195.org.readthedocs.build/en/11195/
Add codespell configuration and fix existing typos.
More about codespell: https://github.com/codespell-project/codespell
I have personally introduced it to over a hundred projects already, mostly with positive feedback
(see the "improveit-dashboard").
Rebased and reworked from #11195, addressing all review feedback:
- .codespellrc (not setup.cfg): in common/ submodule, symlinked from root
- syntaxt left as-is: Django model field, changing requires DB migration
- indx in search test left as-is: intentional non-matching search query, added to ignore-words-list
Depends on the common submodule update (branch enh-codespell on yarikoptic/common fork). See also readthedocs/common#212.
Changes
Configuration & Infrastructure (20efc84)
- .codespellrc in common/, symlinked as .codespellrc in root
- common/pre-commit-config.yaml as pre-commit hook
- … re-used, re-declare, pre-selected), long lines, test fixture patterns (pyton, Hel, Ore, Wile E. Coyote)
- fo/te (ISO 639 language codes), astroid (library), DED (Django Elasticsearch DSL), syntaxt (DB migration), requestor, indx, ore
Ambiguous Typo Fixes (f217b0b)
Manually reviewed and fixed 21 typos where codespell offered multiple suggestions:
- neeed → needed/need, explotation → exploitation, interating → iterating
- behing → being, achived → achieved, clonned → cloned (×3)
- tupe → tuple, toke → token, Thi → This, fom → from
- parm → param, containe → container, Custome → Custom
- trailling → trailing (and Stripe → Strip in same docstring)
- Bulds → Builds, verion → version, treshold → threshold, tha → that
Non-ambiguous Typo Fixes (4c99a8e)
50 automatic fixes via codespell -w across 38 files, including:
- acccess → access, sligtly → slightly, becuase → because
- innecessary → unnecessary, pacakge → package (template variable bug fix!)
- intergrations → integrations, susbcription → subscription
- runnig → running, overwritting → overwriting, settigs → settings
Potential Bug Fix
{{pacakge}} → {{package}} in config/notifications.py:161. The template variable name was misspelled, likely causing it to not render. Line 149 already uses the correct {{package}} spelling.
Historical Context
This project has had ~289 prior commits mentioning typos/spelling, demonstrating the value of automated spell-checking.
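Relating back to the {{pacakge}} item under Potential Bug Fix: a misspelled placeholder is the kind of problem a small check can surface independently of spell-checking. The sketch below is illustrative only. It assumes `{{name}}`-style placeholders and a plain dict as the rendering context; the template strings are made up, and `unknown_placeholders` is a hypothetical helper, not code from this repository.

```python
import re

# Matches {{name}} placeholders, allowing optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def unknown_placeholders(template: str, context: dict) -> set[str]:
    """Return placeholder names used in `template` that `context` does not provide."""
    return {name for name in PLACEHOLDER.findall(template) if name not in context}


context = {"package": "example-package"}  # made-up rendering context
broken = "Install {{pacakge}} and then pin {{package}}."
fixed = "Install {{package}} and then pin {{package}}."

print(unknown_placeholders(broken, context))
print(unknown_placeholders(fixed, context))
```

A check along these lines would flag the misspelled name even when the misspelling happens to be a valid dictionary word.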
Testing
✅ Codespell passes with zero errors after all fixes
🤖 Generated with Claude Code
TODOs