Add ignoreScopeForBehaviorLinks to crawlconfig #3430

Open
tw4l wants to merge 3 commits into main from issue-3369-add-behavior-links-backend

tw4l commented Jun 23, 2026

Member

Fixes #3369

Backend support for the new crawler --ignoreScopeForBehaviorLinks option, with test coverage for updates.

The second commit adds a minimum crawler version check for this flag and sets the default minimum version to 1.14.0. Feel free to drop this commit if it seems excessive, but since we may set this value to true in the frontend (at least for social media sites, if not more broadly), and that would break crawls running crawler versions <1.14.0, it seemed like a wise precaution.
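
For illustration only, the kind of check described above can be sketched as follows. This is not the code in this PR: the function names, the plain-dict config, and the logger name are all hypothetical, and the sketch assumes the crawler version is already available as an `X.Y.Z` string (how a tag such as `latest` gets resolved to a version is not covered here).

```python
import logging
import re

# Hypothetical logger name; the real operator may log differently.
logger = logging.getLogger("operator")

# Default minimum crawler version for the flag, as stated in the description.
MIN_VERSION = "1.14.0"


def parse_version(version: str) -> tuple[int, ...]:
    """Parse the leading numeric X.Y.Z part of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.groups())


def drop_unsupported_flag(
    config: dict, crawler_version: str, min_version: str = MIN_VERSION
) -> dict:
    """Return a copy of config without the flag if the crawler is too old."""
    result = dict(config)
    too_old = parse_version(crawler_version) < parse_version(min_version)
    if result.get("ignoreScopeForBehaviorLinks") and too_old:
        # Warn rather than fail, so the crawl still runs without the option.
        logger.warning(
            "ignoreScopeForBehaviorLinks requires crawler >= %s, found %s; dropping",
            min_version,
            crawler_version,
        )
        del result["ignoreScopeForBehaviorLinks"]
    return result
```

With this sketch, a config with the flag set to true keeps it for a 1.14.0 crawler and loses it (with a warning logged) for 1.13.2. Comparing parsed integer tuples rather than raw strings avoids mis-ordering versions such as 1.9.0 and 1.14.0.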

Testing

  1. Check out this branch locally and set the default for the new field to True in the RawCrawlConfig model (since we don't have a way to set it from the frontend yet)
  2. Add a crawler channel that will fail the minimum version check to your local config, e.g. (assuming you also have the latest crawler image built from main locally):
crawler_channels:
  - id: default
    image: "docker.io/webrecorder/browsertrix-crawler:latest"
    imagePullPolicy: Never
  - id: previous
    image: "docker.io/webrecorder/browsertrix-crawler:1.13.2"
    imagePullPolicy: IfNotPresent
  3. Run a crawl with the default crawler channel and verify the setting is applied to the crawl
  4. Run a crawl with the previous crawler channel and verify the setting is not applied to the crawl and that the warning message is in the operator logs

tw4l requested review from SuaYoo, ikreymer and mistydemeo June 23, 2026 20:50

mistydemeo left a comment

Contributor

Looks good, and good call logging when this gets dropped.

SuaYoo commented Jun 24, 2026

Member

Per the Discord thread, we'll be renaming this field to alwaysAddBehaviorLinks so that it doesn't give the impression that scopeType will be ignored if this is true.

Development

Successfully merging this pull request may close these issues.

[Task]: API support for add links from behaviors