Add codespell: config + action (to detect new typos) and make it fix a good number of them#11195

Draft
yarikoptic wants to merge 3 commits into
readthedocs:mainfrom
yarikoptic:enh-codespell

Conversation

@yarikoptic
Contributor

@yarikoptic yarikoptic commented Mar 7, 2024

original

codespell has been used occasionally on some files, so it is not a new tool here. But now it would guard RTD against typos being introduced in the first place.

Note that some fixes could affect (fix) functionality.

There is an odd "variable" syntaxt which I didn't fix since seems would require transition and smells like it is on purpose:

❯ git grep syntaxt -- readthedocs/
readthedocs/projects/migrations/0106_add_addons_config.py:                ("syntaxt", models.CharField(max_length=128)),
readthedocs/projects/models.py:    syntaxt = models.CharField(max_length=128)
readthedocs/projects/tasks/builds.py:        grab them by using glob syntaxt between other files that could be garbage.

or was it intended to be syntax ?


📚 Documentation previews 📚

Add codespell configuration and fix existing typos.

More about codespell: https://github.com/codespell-project/codespell

I have personally introduced it to over a hundred projects already, mostly with positive feedback
(see the "improveit-dashboard").

Rebased and reworked from #11195, addressing all review feedback:

  • Config in .codespellrc (not setup.cfg) — in common/ submodule, symlinked from root
  • Pre-commit integration instead of standalone GitHub Actions workflow
  • syntaxt left as-is — Django model field, changing requires DB migration
  • indx in search test left as-is — intentional non-matching search query, added to ignore-words-list

Depends on the common submodule update (branch enh-codespell on yarikoptic/common fork).
See also readthedocs/common#212.

Changes

Configuration & Infrastructure (20efc84)

  • .codespellrc in common/, symlinked as .codespellrc in root
  • Codespell added to common/pre-commit-config.yaml as pre-commit hook
  • Comprehensive skip patterns: locale, migrations, CHANGELOG, media/javascript, search test data, SVG/CSS/AI/minified files
  • Regex ignores: camelCase/PascalCase identifiers, URLs, hyphenated compounds (re-used, re-declare, pre-selected), long lines, test fixture patterns (pyton, Hel, Ore, Wile E. Coyote)
  • Word ignores: fo/te (ISO 639 language codes), astroid (library), DED (Django Elasticsearch DSL), syntaxt (DB migration), requestor, indx, ore
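
To illustrate how regex ignores and word ignores interact, here is a minimal Python sketch. It is not codespell's implementation: the tiny misspelling table, the `find_typos` helper, and both patterns are assumptions made up for this example. Only the general idea (blank out whatever an ignore pattern matches, then skip listed words) follows the configuration described above.

```python
import re

# Toy misspelling table (hypothetical; codespell ships a much larger dictionary).
MISSPELLINGS = {"sligtly": "slightly", "becuase": "because", "te": "the", "ore": "or"}

# Illustrative patterns only, not the ones in the real .codespellrc.
IGNORE_REGEX = re.compile(
    r"\b[a-z]+[A-Z]\w*\b"  # camelCase identifiers
    r"|https?://\S+"       # URLs
)
IGNORE_WORDS = {"fo", "te", "ore", "syntaxt", "indx"}


def find_typos(line: str) -> list[tuple[str, str]]:
    """Return (word, suggestion) pairs for one line of text."""
    # Blank out everything an ignore pattern matches before tokenizing.
    masked = IGNORE_REGEX.sub(" ", line)
    typos = []
    for word in re.findall(r"[A-Za-z']+", masked):
        key = word.lower()
        if key in IGNORE_WORDS:
            continue  # explicitly allowed, e.g. an ISO 639 language code
        if key in MISSPELLINGS:
            typos.append((word, MISSPELLINGS[key]))
    return typos


print(find_typos("becuase te value is sligtly off, see https://example.com/becuase"))
```

In this sketch the misspelling inside the URL and the allowed word `te` are not reported, while the two misspellings in running prose are.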

Ambiguous Typo Fixes (f217b0b)

Manually reviewed and fixed 21 typos where codespell offered multiple suggestions:

  • neeed → needed/need, explotation → exploitation, interating → iterating
  • behing → being, achived → achieved, clonned → cloned (×3)
  • tupe → tuple, toke → token, Thi → This, fom → from
  • parm → param, containe → container, Custome → Custom
  • trailling → trailing (and Stripe → Strip in same docstring)
  • Bulds → Builds, verion → version, treshold → threshold, tha → that

Non-ambiguous Typo Fixes (4c99a8e)

50 automatic fixes via codespell -w across 38 files, including:

  • acccess → access, sligtly → slightly, becuase → because
  • innecessary → unnecessary, pacakge → package (template variable bug fix!)
  • intergrations → integrations, susbcription → subscription
  • runnig → running, overwritting → overwriting, settigs → settings

Potential Bug Fix

  • {{pacakge}} → {{package}} in config/notifications.py:161: the template variable name was misspelled, likely causing it to not render. Line 149 already uses the correct {{package}} spelling.
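
Why a misspelled placeholder can fail silently is easy to see with a small Python sketch. The `render` helper and the message text below are hypothetical and are not the Read the Docs notification code. The sketch assumes unknown names render as an empty string, which is similar to how Django templates treat undefined variables by default.

```python
import re

# Matches {{name}} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, context: dict[str, str]) -> str:
    """Replace {{name}} placeholders; unknown names become an empty string."""
    return PLACEHOLDER.sub(lambda m: context.get(m.group(1), ""), template)


context = {"package": "sphinx"}
print(render("Could not install {{package}}.", context))  # Could not install sphinx.
print(render("Could not install {{pacakge}}.", context))  # Could not install .
```

No exception is raised for the misspelled name, so a typo like this can survive until someone reads the rendered message.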

Historical Context

This project has had ~289 prior commits mentioning typos/spelling, demonstrating the value of automated spell-checking.

Testing

✅ Codespell passes with zero errors after all fixes


🤖 Generated with Claude Code

TODOs

@yarikoptic yarikoptic requested review from a team as code owners March 7, 2024 23:19
@sentry

sentry Bot commented Mar 7, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: readthedocs/api/v2/views/integrations.py

Function: get_closed_external_version_response
Unhandled Issue: TypeError: 'NoneType' object is not subscriptable /api/v2/webhook/{project_slug}/{integration_pk}...
Event Count: 1

Did you find this useful? React with a 👍 or 👎

Member

@humitos humitos left a comment

Thanks for the PR 👍🏼

I didn't do a full review, but I noted a few issues that I think will generate some conflicts. In particular, those that change Python code and require some manual intervention (e.g. a Django migration to update the help_text).

I'm not opposed to this work, but I find it hard for the core team to prioritize at this point.

Comment thread readthedocs/builds/migrations/0007_add-automation-rules.py Outdated
Comment thread CHANGELOG.rst Outdated
Comment thread .github/workflows/codespell.yml Outdated
Comment thread .github/workflows/codespell.yml Outdated
@agjohnson
Contributor

We use pre-commit for all other linting checks. I feel if we implement this it should be at pre-commit, not GHA.

@yarikoptic
Contributor Author

Hmm, interesting, my script should've detected the presence of a pre-commit config... Will have a look whenever I get a chance. You do have a CI job running pre-commit, I assume? (I don't see a report from the pre-commit service.)

@yarikoptic
Contributor Author

that explains it:

❯ ls -ld .pre-commit-config.yaml
lrwxrwxrwx 1 yoh yoh 29 Mar  7 18:00 .pre-commit-config.yaml -> common/pre-commit-config.yaml
❯ cat .pre-commit-config.yaml
cat: .pre-commit-config.yaml: No such file or directory
❯ git submodule
-4af0fffd2cbeeb40f0a71b875beb99d6dc88a355 common

Let's do the submodules dance now...
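
The listing above shows a symlink whose target lives in a submodule that has not been checked out (the leading `-` in the `git submodule` output marks it as uninitialized). A minimal Python sketch of detecting that situation follows; the `is_dangling_symlink` helper is a hypothetical name, and the temporary directory only stands in for a repository checkout.

```python
import os
import tempfile


def is_dangling_symlink(path: str) -> bool:
    """True when path is a symlink whose target does not exist."""
    # os.path.exists follows symlinks, so it is False for a broken link.
    return os.path.islink(path) and not os.path.exists(path)


with tempfile.TemporaryDirectory() as repo:
    link = os.path.join(repo, ".pre-commit-config.yaml")
    # Relative target, as in the listing above; "common/" does not exist yet.
    os.symlink("common/pre-commit-config.yaml", link)
    before = is_dangling_symlink(link)

    # Simulate the submodule being populated.
    os.makedirs(os.path.join(repo, "common"))
    with open(os.path.join(repo, "common", "pre-commit-config.yaml"), "w") as fh:
        fh.write("repos: []\n")
    after = is_dangling_symlink(link)

print(before, after)  # True False
```

In a real checkout the target would typically be populated with `git submodule update --init` rather than created by hand.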

@humitos
Member

humitos commented Jul 24, 2025

This PR is great! Thanks for opening it!

I'd like to move forward with it. I've already reviewed readthedocs/common#212

We need to make some adjustments here and we can move forward 👍🏼 . @yarikoptic are you able to continue with this work?

Final decision on fixing or not CHANGELOG.rst

Let's exclude the changelog for now. It has a lot of noise.

let me exclude all such lines with help_text *=.*

We should implement this, if it's not already.

@yarikoptic
Contributor Author

We need to make some adjustments here and we can move forward 👍🏼 . @yarikoptic are you able to continue with this work?

I am able to, as soon as there are cycles on your end, so the PR doesn't amass conflicts to address again.

yarikoptic added a commit to yarikoptic/readthedocs.org that referenced this pull request Jul 24, 2025
@read-the-docs-community

read-the-docs-community Bot commented Jul 24, 2025

Documentation build overview

📚 dev | 🛠️ Build #32080278 | 📁 Comparing 4c99a8e against latest (7c6865c)

  🔍 Preview build  

No files changed.

@read-the-docs-community

read-the-docs-community Bot commented Jul 24, 2025

Documentation build overview

📚 docs | 🛠️ Build #32080280 | 📁 Comparing 4c99a8e against latest (7c6865c)

  🔍 Preview build  

Show files changed (1 files in total): 📝 1 modified | ➕ 0 added | ➖ 0 deleted
File Status
custom-script.html 📝 modified

@yarikoptic
Contributor Author

FWIW, I have rebased and redone the codespell fixing (automated and interactive). Let's see if there are no side effects. Please review the diff.

Member

@humitos humitos left a comment

This is great! I suppose there may be a few tests failing because there are some words that shouldn't return search results.

Comment thread readthedocs/config/notifications.py
Comment thread readthedocs/search/tests/test_api.py Outdated
  "project": project.slug,
  "version": version.slug,
- "q": "indx",
+ "q": "index",
Member

This was probably an invalid word on purpose.

Contributor Author

so we need to annotate this line to be skipped and ok to contain indx?

Contributor Author

I replaced it with an explicit "abracadabra"; or does it have to be "indx"? (let's see if tests feel better)

Member

The test is failing. I think it needs to be indx, probably for fuzzy search testing. I would leave it as it was to avoid this type of issue and probably mark that line to be skipped 👍🏼
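
If the test does exercise fuzzy matching, as guessed above, the difference between the two queries can be sketched with plain edit distance. This is only an illustration using the classic Levenshtein recurrence, not Elasticsearch's matching logic, and `edit_distance` is a helper written for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


print(edit_distance("indx", "index"))  # 1
print(edit_distance("abracadabra", "index"))
```

"indx" is a single insertion away from "index", so a matcher that tolerates one edit could still find it, whereas "abracadabra" is far from any such term and would change what the test checks.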

yarikoptic added a commit to yarikoptic/common that referenced this pull request Jul 24, 2025
@humitos
Member

humitos commented Jul 25, 2025

There is an odd "variable" syntaxt which I didn't fix since seems would require transition and smells like it is on purpose:

This is indeed a typo that I made when I created the model's field AddonSearchFilter.syntaxt, yes 😓 . Let's leave it as-is for now because it will require a migration. We can do this in a different PR. It should be easy since this field is not being used yet.

Comment thread setup.cfg Outdated
Comment on lines +48 to +56
[codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = .git,*.svg,locale,package-lock.json,*.css,*.min.*,vendor,*.ai,setup.cfg,migrations,CHANGELOG.rst,common
check-hidden = true
# some names and abbreviations and very long lines (minimized?)
# TODO: fixup help_text in readthedocs/builds/models.py : "to perfom an"
ignore-regex = \b(Manuel|DED|Wile E. Coyote|Couldn\\u2019t|to perfom an)\b|.{300,}|"pyton\b|\|(ative|ment)\||"Hel" will match\b|ative: ''|help_text *=.*
# TODO: fix syntaxt -- would require transition?
ignore-words-list = fo,te,astroid,requestor,syntaxt,ore
Member

Is it possible to put all this configuration on a specific codespell file? Like .codespell.cfg or similar?

We will need to put this file into common/ repository (readthedocs/common#212) because we want to share this configuration across multiple repositories, eg. https://github.com/readthedocs/ext-theme

Member

It seems it is .codespellrc: https://github.com/codespell-project/codespell#using-a-config-file

We want that file to be in common/ and symlinked from here 👍🏼
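
Since the `[codespell]` section is plain INI, the same block can live in `setup.cfg` or in a dedicated `.codespellrc`. A minimal sketch with Python's `configparser` shows the format; it only demonstrates how the section parses and makes no claim about how codespell itself loads or interprets these options. The text is trimmed from the section quoted above, with the long `ignore-regex` line omitted.

```python
import configparser

# Trimmed from the [codespell] section quoted above; ignore-regex is omitted.
CONFIG_TEXT = """
[codespell]
skip = .git,*.svg,locale,package-lock.json,*.css,*.min.*,vendor,*.ai,setup.cfg,migrations,CHANGELOG.rst,common
check-hidden = true
ignore-words-list = fo,te,astroid,requestor,syntaxt,ore
"""

# Disable interpolation so a literal "%" in a value would not be special.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)
section = parser["codespell"]

skip = section["skip"].split(",")
ignored_words = section["ignore-words-list"].split(",")
check_hidden = section.getboolean("check-hidden")

print(len(skip), ignored_words, check_hidden)
```

Keeping the section in its own file means the file can be shared through common/ and symlinked, without touching each repository's setup.cfg.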

yarikoptic and others added 3 commits April 1, 2026 00:11
- .codespellrc symlinked from common/.codespellrc
- codespell added to common/pre-commit-config.yaml
- Updated common submodule to include codespell config
- Skips: locale, migrations, CHANGELOG, media/javascript, search test
  data, SVG/CSS/AI/minified files
- Regex ignores: camelCase/PascalCase, URLs, hyphenated compounds,
  long lines, test fixture patterns
- Word ignores: fo/te (ISO 639), astroid, DED, syntaxt (DB migration),
  requestor, indx, ore

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
Fixed typos where codespell offered multiple suggestions:
- neeed → needed/need (model_views.py, tasks/search.py)
- tha → that (permissions.py)
- explotation → exploitation (middleware.py)
- interating → iterating (utils/__init__.py)
- behing → being (sphinx.py)
- achived → achieved (sphinx.py)
- clonned → cloned (sphinx.py, models.py)
- tupe → tuple (version_handling.py)
- toke → token (test_oauth.py)
- Thi → This, fom → from (git.py)
- parm → param (rclone.py)
- containe → container (test_api.py)
- Custome → Custom (settings/base.py)
- trailling → trailing, Stripe → Strip (querysets.py)
- Bulds → Builds, verion → version (tasks.py)
- treshold → threshold (views/base.py)

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
… typos

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "codespell -w",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
Comment thread .codespellrc
@@ -0,0 +1 @@
common/.codespellrc No newline at end of file
Contributor Author

FWIW: frankly, I personally would have delayed moving it to common until making sure all is good here

3 participants