- Install prek: `uv tool install prek`
- Enable commit hooks: `prek install`
- Never run pytest, python, or airflow commands directly on the host — always use breeze.
- Place temporary scripts in `dev/` (mounted as `/opt/airflow/dev/` inside Breeze).
- `<PROJECT>` is the folder containing the `pyproject.toml` of the package you want to test — for example, `airflow-core` or `providers/amazon`.
- `<target_branch>` is the branch the PR will be merged into — usually `main`, but could be `v3-1-test` when creating a PR for the 3.1 branch.
- Run a single test: `uv run --project <PROJECT> pytest path/to/test.py::TestClass::test_method -xvs`
- Run a test file: `uv run --project <PROJECT> pytest path/to/test.py -xvs`
- Run all tests in a package: `uv run --project <PROJECT> pytest path/to/package -xvs`
- If uv tests fail with missing system dependencies, run the tests with breeze: `breeze run pytest <tests> -xvs`
- Run a Python script: `uv run --project <PROJECT> python dev/my_script.py`
- Run the core or provider test suite in parallel: `breeze testing <test_group> --run-in-parallel` (test groups: `core-tests`, `providers-tests`)
- Run core or provider db tests in parallel: `breeze testing <test_group> --run-db-tests-only --run-in-parallel` (test groups: `core-tests`, `providers-tests`)
- Run core or provider non-db tests in parallel: `breeze testing <test_group> --skip-db-tests --use-xdist` (test groups: `core-tests`, `providers-tests`)
- Run a single provider's complete test suite: `breeze testing providers-tests --test-type "Providers[PROVIDERS_LIST]"` (e.g., `Providers[google]`, `Providers[amazon]`, or `"Providers[amazon,google]"`)
- Run Helm tests in parallel with xdist: `breeze testing helm-tests --use-xdist`
- Run Helm tests with a specific K8s version: `breeze testing helm-tests --use-xdist --kubernetes-version 1.35.0`
- Run a specific Helm test type: `breeze testing helm-tests --use-xdist --test-type <type>` (types: `airflow_aux`, `airflow_core`, `apiserver`, `dagprocessor`, `other`, `redis`, `security`, `statsd`, `webserver`)
- Run other test suites: `breeze testing <test_group>` (test groups: `airflow-ctl-tests`, `docker-compose-tests`, `task-sdk-tests`)
- Run scripts tests: `uv run --project scripts pytest scripts/tests/ -xvs`
- Run the Airflow CLI: `breeze run airflow dags list`
- Type-check (non-providers): first run `uv sync --frozen --project <PROJECT>` to align the local virtualenv with `uv.lock` (the dependency set CI uses), then `uv run --frozen --project <PROJECT> --with "apache-airflow-devel-common[mypy]" mypy path/to/code`
- Type-check (providers): `breeze run mypy path/to/code`
- Lint with ruff only: `prek run ruff --from-ref <target_branch>`
- Format with ruff only: `prek run ruff-format --from-ref <target_branch>`
- Run regular (fast) static checks: `prek run --from-ref <target_branch> --stage pre-commit`
- Run manual (slower) checks: `prek run --from-ref <target_branch> --stage manual`
- Build docs: `breeze build-docs`
- Determine which tests to run based on changed files: `breeze selective-checks --commit-ref <commit_with_squashed_changes>`
SQLite is the default backend. Use `--backend postgres` or `--backend mysql` for integration tests that need those databases. If Docker networking fails, run `docker network prune`.
UV workspace monorepo. Key paths:
- `airflow-core/src/airflow/` — core scheduler, API, CLI, models
  - `models/` — SQLAlchemy models (DagModel, TaskInstance, DagRun, Asset, etc.)
  - `jobs/` — scheduler, triggerer, Dag processor runners
  - `api_fastapi/core_api/` — public REST API v2, UI endpoints
  - `api_fastapi/execution_api/` — task execution communication API
  - `dag_processing/` — Dag parsing and validation
  - `cli/` — command-line interface
  - `ui/` — React/TypeScript web interface (Vite)
- `task-sdk/` — lightweight SDK for Dag authoring and task execution runtime
  - `src/airflow/sdk/execution_time/` — task runner, supervisor
- `providers/` — 100+ provider packages, each with its own `pyproject.toml`
- `airflow-ctl/` — management CLI tool
- `chart/` — Helm chart for Kubernetes deployment
- `dev/` — development utilities and scripts used to bootstrap the environment, releases, breeze dev env
- `scripts/` — utility scripts for CI, Docker, and prek hooks (workspace distribution `apache-airflow-scripts`)
  - `ci/prek/` — prek (pre-commit) hook scripts; shared utilities in `common_prek_utils.py`
  - `tests/` — pytest tests for the scripts; run with `uv run --project scripts pytest scripts/tests/`
The `uv.lock` file is generated by `uv lock` / `uv sync` and is committed to the repo; it contains a snapshot of the versions of all dependencies used for Airflow development. If at any point you have a conflict with `uv.lock`, simply delete it and run `uv lock` to regenerate it.
- Users author Dags with the Task SDK (`airflow.sdk`).
- Dag File Processor parses Dag files in separate processes and stores serialized Dags in the metadata DB. Software guards prevent individual parsing processes from accessing the database directly and enforce use of the Execution API, but these guards do not protect against intentional bypassing by malicious or misconfigured code.
- Scheduler reads serialized Dags — never runs user code — and creates Dag runs / task instances.
- Workers execute tasks via Task SDK and communicate with the API server through the Execution API — never access the metadata DB directly. Each task receives a short-lived JWT token scoped to its task instance ID.
- API Server serves the React UI and handles all client-database interactions.
- Triggerer evaluates deferred tasks/sensors in separate processes. As with the Dag File Processor, software guards steer it through the Execution API rather than direct database access, but these guards do not protect against intentional bypassing by malicious or misconfigured code.
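The per-task token scoping described above can be illustrated with a simplified, stdlib-only sketch. This is a conceptual illustration of binding a short-lived credential to a task instance ID, not Airflow's actual implementation (which issues real JWTs from the API server); all names here are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # hypothetical signing key; Airflow uses configured JWT secrets/keys


def issue_task_token(task_instance_id: str, ttl_seconds: int = 600) -> str:
    """Issue a short-lived token bound to one task instance (conceptual sketch)."""
    claims = {"sub": task_instance_id, "exp": time.time() + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig


def verify_task_token(token: str, expected_task_instance_id: str) -> bool:
    """Reject tokens that are tampered with, expired, or scoped to another task."""
    payload, _, sig = token.rpartition(".")
    expected_sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected_sig):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return claims["sub"] == expected_task_instance_id and claims["exp"] > time.time()
```

A worker holding a token for one task instance cannot use it to act on behalf of another; the Execution API server performs the equivalent scope check on every request.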
- Shared libraries that are symbolically linked into different Python distributions live in the `shared` folder.
- Airflow uses the uv workspace feature so that all distributions share dependencies and a single venv.
- Each distribution should declare the other distributions it needs.
- `uv --project <FOLDER> sync` acts on the selected project in the monorepo with only the dependencies that project declares.
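A uv workspace is declared in the root `pyproject.toml`. A simplified, hypothetical fragment (member names here are illustrative, not a copy of Airflow's actual file) looks like:

```toml
[tool.uv.workspace]
members = ["airflow-core", "task-sdk", "airflow-ctl", "providers/*"]

# Workspace members depend on each other via sources pointing back at the workspace:
[tool.uv.sources]
apache-airflow-core = { workspace = true }
```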
When reviewing code, writing security documentation, or performing security research, keep in
mind the following aspects of Airflow's security model. The authoritative reference is
airflow-core/docs/security/security_model.rst
and airflow-core/docs/security/jwt_token_authentication.rst.
In particular, intentional design choices that are not security vulnerabilities (and should not be reported as such) are described in the "What is NOT considered a security vulnerability" chapter of the security model.
When flagging security concerns, distinguish between:
- Actual vulnerabilities — code that violates the documented security model (e.g., a worker gaining database access it shouldn't have, a Scheduler executing user code, an unauthenticated user accessing protected endpoints).
- Known limitations — documented gaps where the current implementation doesn't provide full isolation (e.g., DFP/Triggerer database access, shared Execution API resources, multi-team not enforcing task-level isolation). These are tracked for improvement in future versions and should not be reported as new findings.
- Deployment hardening opportunities — measures a Deployment Manager can take to improve isolation beyond what Airflow enforces natively (e.g., per-component configuration, asymmetric JWT keys, network policies). These belong in deployment guidance, not as code-level issues.
- Shared libraries provide implementations of common utilities (e.g., logging, configuration) whose code should be reused across distributions (potentially in different versions).
- We have a number of shared libraries that are separate, small Python distributions located under the `shared` folder.
- Each of the libraries has its own `src`, `tests`, `pyproject.toml`, and dependencies.
- The sources of those libraries are symbolically linked into the distributions that use them (`airflow-core` and `task-sdk`, for example).
- Internal tests for the libraries live in the shared distribution's `tests` and can be run from the shared distributions.
- Tests of the consumers using the shared libraries live in the distributions that use the libraries and can be run from there.
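A hypothetical layout sketch of this arrangement (library and symlink path names are illustrative; check the repo for the actual ones):

```text
shared/
  mylib/                       # hypothetical shared library name
    pyproject.toml             # its own distribution metadata and dependencies
    src/airflow_shared/mylib/  # library sources
    tests/                     # internal tests, run from the shared distribution
airflow-core/src/.../mylib  -> symlink into shared/mylib/src/...
task-sdk/src/.../mylib      -> symlink into shared/mylib/src/...
```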
- Always format and check Python files with ruff immediately after writing or editing them: `uv run ruff format <file_path>` and `uv run ruff check --fix <file_path>`. Do this for every Python file you create or modify, before moving on to the next step.
- No `assert` in production code.
- Use `time.monotonic()` for durations, not `time.time()`.
- In `airflow-core`, functions with a `session` parameter must not call `session.commit()`. Use keyword-only `session` parameters.
- Imports at the top of the file. Valid exceptions: circular imports, lazy loading for worker isolation, `TYPE_CHECKING` blocks.
- Guard heavy type-only imports (e.g., `kubernetes.client`) with `TYPE_CHECKING` in multi-process code paths.
- Define dedicated exception classes or use existing exceptions such as `ValueError` instead of raising the broad `AirflowException` directly. Each error case should have a specific exception type that conveys what went wrong.
- Apache License header on all new files (prek enforces this).
- Newsfragments are only added for major or breaking changes. This is usually coordinated during review. Please do not add newsfragments by default, as in most cases this requires a revert during review.
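Several of these rules can be shown in one small sketch: a keyword-only `session` parameter that never commits, `time.monotonic()` for durations, and a dedicated exception type instead of a broad one. The names are illustrative, not real Airflow APIs:

```python
import time


class DagNotFoundError(ValueError):
    """Hypothetical dedicated exception instead of a broad AirflowException."""


def fetch_dag_id(dag_name: str, *, session) -> str:
    """`session` is keyword-only; the caller owns the transaction, so no commit here."""
    start = time.monotonic()  # monotonic clock for durations, never time.time()
    row = session.get(dag_name)  # placeholder for a real SQLAlchemy query
    elapsed = time.monotonic() - start
    if row is None:
        raise DagNotFoundError(f"Dag {dag_name!r} not found (lookup took {elapsed:.3f}s)")
    return row
```

The caller (or a `@provide_session`-style wrapper) decides when to commit, which keeps transaction boundaries in one place.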
- Add tests for new behavior — cover success, failure, and edge cases.
- Use pytest patterns, not `unittest.TestCase`.
- Use `spec`/`autospec` when mocking.
- Use `time_machine` for time-dependent tests. Do not use `datetime.now()`.
- Use `@pytest.mark.parametrize` for multiple similar inputs.
- Use `@pytest.mark.db_test` for tests that require database access.
- Test fixtures: `devel-common/src/tests_common/pytest_plugin.py`.
- Test location mirrors source: `airflow/cli/cli_parser.py` → `tests/cli/test_cli_parser.py`.
- Do not use `caplog` in tests; prefer checking logic, not log output.
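A sketch of the parametrize and autospec conventions above (the helper under test and all names are illustrative, not real Airflow code):

```python
from unittest import mock

import pytest


def normalize_dag_id(raw: str) -> str:
    """Hypothetical helper under test."""
    return raw.strip().lower()


@pytest.mark.parametrize(
    ("raw", "expected"),
    [
        ("My_Dag", "my_dag"),      # success case
        ("  spaced  ", "spaced"),  # edge case: surrounding whitespace
    ],
)
def test_normalize_dag_id(raw, expected):
    assert normalize_dag_id(raw) == expected


def test_mock_with_autospec():
    # autospec preserves the real signature, so mismatched calls raise TypeError
    mocked = mock.create_autospec(normalize_dag_id, return_value="stub")
    assert mocked("anything") == "stub"
    with pytest.raises(TypeError):
        mocked()  # missing required positional arg is rejected thanks to autospec
```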
Write commit messages focused on user impact, not implementation details.
- Good: `Fix airflow dags test command failure without serialized Dags`
- Good: `UI: Fix Grid view not refreshing after task actions`
- Bad: `Initialize DAG bundles in CLI get_dag function`
Add a newsfragment for user-visible changes:
`echo "Brief description" > airflow-core/newsfragments/{PR_NUMBER}.{bugfix|feature|improvement|doc|misc|significant}.rst`
- NEVER add a Co-Authored-By line naming yourself as co-author of the commit. Agents cannot be authors; humans can. Agents are assistants.
Always push to the user's fork, not to the upstream apache/airflow repo. Never push
directly to main.
Before pushing, determine the fork remote. Check git remote -v — if origin does not
point to apache/airflow, use origin (it's the user's fork). If origin points to
apache/airflow, look for another remote that points to the user's fork. If no fork remote
exists, create one:
`gh repo fork apache/airflow --remote --remote-name fork`
Before pushing, perform a self-review of your changes following the Gen-AI review guidelines
in contributing-docs/05_pull_requests.rst and the
code review checklist in .github/instructions/code-review.instructions.md:
- Review the full diff (`git diff main...HEAD`) and verify every change is intentional and related to the task — remove any unrelated changes.
- Read `.github/instructions/code-review.instructions.md` and check your diff against every rule — architecture boundaries, database correctness, code quality, testing requirements, API correctness, and AI-generated code signals. Fix any violations before pushing.
- Confirm the code follows the project's coding standards and architecture boundaries described in this file.
- Run regular (fast) static checks (`prek run --from-ref <target_branch> --stage pre-commit`) and fix any failures. This includes mypy checks for non-provider projects (airflow-core, task-sdk, airflow-ctl, dev, scripts, devel-common).
- Run manual (slower) checks (`prek run --from-ref <target_branch> --stage manual`) and fix any failures.
- Run relevant individual tests and confirm they pass.
- Find which tests to run for the changes with selective-checks and run those tests in parallel to confirm they pass and check for CI-specific issues.
- Check for security issues — no secrets, no injection vulnerabilities, no unsafe patterns.
Before pushing, always rebase your branch onto the latest target branch (usually main)
to avoid merge conflicts and ensure CI runs against up-to-date code:
git fetch <upstream-remote> <target_branch>
git rebase <upstream-remote>/<target_branch>
If there are conflicts, resolve them and continue the rebase. If the rebase is too complex, ask the user for guidance.
Then push the branch to the fork remote and open the PR creation page in the browser with the body pre-filled (including the generative AI disclosure already checked):
git push -u <fork-remote> <branch-name>
gh pr create --web --title "Short title (under 70 chars)" --body "$(cat <<'EOF'
Brief description of the changes.
closes: #ISSUE (if applicable)
---
##### Was generative AI tooling used to co-author this PR?
- [X] Yes — <Agent Name and Version>
Generated-by: <Agent Name and Version> following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
EOF
)"
The `--web` flag opens the browser so the user can review and submit. The `--body` flag
pre-fills the PR template with the generative AI disclosure already completed.
Remind the user to:
- Review the PR title — keep it short (under 70 chars) and focused on user impact.
- Add a brief description of the changes at the top of the body.
- Reference related issues when applicable (`closes: #ISSUE` or `related: #ISSUE`).
- Ask first:
  - Large cross-package refactors.
  - New dependencies with broad impact.
  - Destructive data or migration changes.
- Never:
  - Commit secrets, credentials, or tokens.
  - Edit generated files by hand when a generation workflow exists.
  - Use destructive git operations unless explicitly requested.