feat: make conda-pypi-map additive and add pypi-conda-map build overrides #6333

Draft
tdejager wants to merge 10 commits into prefix-dev:main from tdejager:feat-additive-conda-pypi-map

tdejager commented Jun 10, 2026

Description

We have a way to map conda packages to PyPI packages (used during the solve to figure out which conda packages already satisfy pypi-dependencies) and the other way around in pixi-build-python. The problem was that the user could only replace the whole mapping: as soon as you set conda-pypi-map for a channel, you lost the entire default prefix.dev mapping for it and had to replicate thousands of entries just to fix one name. The reverse direction had no override at all, only an on/off toggle.

The tricky part is that the two mappings have different shapes (the default forward mapping is even keyed on repodata sha256 hashes), so this deliberately does not merge any data. Instead, the user mapping is consulted first, and on a miss we fall through to the default chain (hash → compressed → conda-forge verbatim). "Additive" is therefore implemented as resolver layering rather than data merging.
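
The layering described above can be illustrated with a minimal Python sketch. This is not the PR's Rust implementation: `resolve_pypi_name`, the `Layer` type and the three stand-in layers are hypothetical names, and the real hash layer is keyed on repodata sha256 rather than on the package name. The sketch only shows the lookup order and that an explicit `false` counts as a hit.

```python
from typing import Callable, Optional, Union

# A user entry is a PyPI name, or False for "this package is not on PyPI".
UserEntry = Union[str, bool]
# Each default layer answers with a PyPI name, or None on a miss.
Layer = Callable[[str], Optional[str]]


def resolve_pypi_name(
    conda_name: str,
    user_map: dict[str, UserEntry],
    default_chain: list[Layer],
) -> Optional[str]:
    """Return the PyPI name for a conda package, or None for "no purl"."""
    key = conda_name.lower()
    if key in user_map:
        entry = user_map[key]
        # An explicit False is a hit meaning "no PyPI counterpart";
        # it must not fall through to the default layers.
        return entry if isinstance(entry, str) else None
    for layer in default_chain:
        found = layer(key)
        if found is not None:
            return found
    return None


# Hypothetical stand-ins for the hash, compressed and verbatim layers.
hash_layer: Layer = lambda name: None  # pretend the hash lookup misses
compressed_layer: Layer = {"pytables": "tables"}.get
verbatim_layer: Layer = lambda name: name

chain = [hash_layer, compressed_layer, verbatim_layer]
user = {"pytorch": "torch", "not-on-pypi": False}
```

With these stand-ins, `pytorch` resolves through the user map, `not-on-pypi` yields no purl, and `pytables` falls through to the compressed layer. No data is merged at any point.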

Before, fixing one name meant hosting a full mapping file:

[workspace]
conda-pypi-map = { conda-forge = "9000-line-mapping.json" }

After:

[workspace.conda-pypi-map]
conda-forge = { mapping = { pytorch = "torch", not-on-pypi = false } }
my-company = { location = "https://internal.example.com/map.json", cache-ttl = "24h" }
internal = false   # no lookups for this channel

[package.build.config]            # pixi-build-python
ignore-pypi-mapping = false
pypi-conda-map = { torch = "pytorch", my-internal-pkg = false }

Concretely this PR:

  1. Makes conda-pypi-map entries per-channel configurable: a bare location string, false, or a table with location, inline mapping entries (inline wins over the file), mode = "extend" | "replace" and cache-ttl.
  2. Adds conda-pypi-map = false as the canonical global disable. conda-pypi-map = {} still works but is soft-deprecated with a warning.
  3. Adds cache-ttl for URL locations: the fetched map is cached on disk and only re-fetched once it is older than the TTL. If the re-fetch fails we fall back to the stale copy with a warning, so solves keep working offline. This also makes it practical to pin the full parselmouth mapping from the raw GitHub URL (documented).
  4. Adds pypi-conda-map to pixi-build-python: overrides consulted before the mapping service in both passes (project.dependencies → run and build-system.requires → host), false drops a dependency silently, per-key merge for target-specific config.
  5. Gives network failures an actionable error that points at mode = "replace", <channel> = false or conda-pypi-map = false, so firewalled setups know how to opt out.
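
The cache-ttl behavior from item 3 can be sketched as a small decision function. This is a hedged Python illustration, not the PR's code: `CachedMap`, `load_mapping` and the `fetch`/`warn` callbacks are hypothetical, writing the fetched file back to disk is omitted, and the clamp to age zero follows the clock-skew handling mentioned in the commit notes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CachedMap:
    body: str           # contents of the cached mapping file
    age_seconds: float  # now minus the file's mtime; negative on clock skew


def load_mapping(
    cached: Optional[CachedMap],
    ttl_seconds: float,
    fetch: Callable[[], str],
    warn: Callable[[str], None],
) -> str:
    """Return mapping contents, preferring a fresh cache over the network."""
    if cached is not None:
        # A future mtime (clock skew) is treated as age zero.
        age = max(cached.age_seconds, 0.0)
        if age <= ttl_seconds:
            return cached.body  # fresh: no network access at all
    try:
        return fetch()
    except OSError as err:
        if cached is None:
            # No cached copy to fall back to: this is a hard failure.
            raise
        warn(f"re-fetch failed ({err}); using the stale cached mapping")
        return cached.body
```

The key design point is that a stale copy is never discarded before a successful re-fetch, which is what keeps offline solves working.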

Breaking behavior

This is breaking on purpose, in two ways:

| Existing manifest | Before | After | To get the old behavior |
| --- | --- | --- | --- |
| conda-forge = "mapping.json" (bare string) | Exclusive: only packages in your file got purls, everything else from that channel got none. | Additive: your entries win, everything else falls back to the default mapping. | conda-forge = { location = "mapping.json", mode = "replace" } |
| A mapping for channel A, while also using channel B | Configuring any mapping suppressed the conda-forge verbatim fallback for all channels, including B. | Channel B behaves exactly as if no mapping were configured. | B = false if you really want B's lookups off |
| conda-pypi-map = {} | Disabled all mapping. | Same, but emits a deprecation warning. | conda-pypi-map = false |

The second one I'm fairly convinced was accidental: the suppression was keyed on the global mode instead of the record's channel, so mapping one channel silently degraded purl coverage everywhere else.
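
To make the channel-scoping fix concrete, here is a hedged Python sketch of choosing a fallback per record. The variant names mirror the Fallback enum mentioned in the commit notes, but which setting maps to which variant is an assumption of this sketch, and `fallback_for` is a hypothetical helper rather than a function from the PR.

```python
from enum import Enum
from typing import Union


class Fallback(Enum):
    PREFIX_THEN_VERBATIM = "prefix-then-verbatim"
    VERBATIM = "verbatim"
    NONE = "none"


# Per-channel setting: False (lookups off) or a mode string.
ChannelSetting = Union[bool, str]


def fallback_for(
    record_channel: str,
    configured: dict[str, ChannelSetting],
    globally_disabled: bool = False,
) -> Fallback:
    """Pick the fallback from the record's own channel, not a global mode."""
    if globally_disabled:
        # conda-pypi-map = false: no lookups, verbatim heuristic only.
        return Fallback.VERBATIM
    setting = configured.get(record_channel)
    if setting is None or setting == "extend":
        # Unmapped channels behave as if nothing were configured.
        return Fallback.PREFIX_THEN_VERBATIM
    if setting == "replace":
        return Fallback.NONE
    # Assumed: <channel> = false turns lookups off but keeps verbatim.
    return Fallback.VERBATIM
```

Because the decision reads `configured.get(record_channel)`, a `replace` entry for channel A cannot change the outcome for a record from channel B.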

How Has This Been Tested?

Automated tests for the whole behavior matrix: extend/replace/disabled per channel, inline-overrides-file, the cache-ttl fresh/expired/stale/no-cache paths (the offline ones are proven network-free with a blocking middleware), parse errors with snapshots, and the reverse-direction override/skip/marker/merge cases. The extend-miss fallthrough and the unmapped-channel-keeps-verbatim cases are online tests that I verified against the live mapping. Schema is regenerated and pixi run test-schema passes.

I still want to user-test this end-to-end, so the PR stays in draft for now.

User-test checklist

  • A Inline override + explicit false: pixi lock with manifest A, check numpy has pkg:pypi/my-renamed-numpy, boltons has no purl, and python still has its normal purl (proves extend falls through).
  • B The breaking flip: manifest B with the bare string → boltons still gets a purl (old behavior: none). Switch the entry to mode = "replace" → boltons loses it.
  • C Global disable: manifest C → pytables gets the verbatim pkg:pypi/pytables instead of the real pkg:pypi/tables; remove the false line → pkg:pypi/tables comes back. Also try conda-pypi-map = {} and check the deprecation warning.
  • D cache-ttl + parselmouth pin: manifest D → first pixi lock fetches (file appears under <cache>/conda-pypi-mapping/project-defined/), delete pixi.lock and lock again with networking off → still works; set cache-ttl = "0s" with networking off → stale-copy warning, still works.
  • Firewall story: with no warm cache, block conda-mapping.prefix.dev and lock a plain manifest → the error suggests the escape hatches.
  • Same channel by name and URL in the map → clear "configured more than once" error.
  • http:// location → warning about tampering shows.
  • E pixi-build-python: manifest E with PIXI_BUILD_BACKEND_OVERRIDE="pixi-build-python=/path/to/locally/built/pixi-build-python" pixi build, check the rendered recipe has pytorch in run deps and my-internal-helper is gone; add the linux-64 target table and check the per-key merge.
  • pypi-conda-map without ignore-pypi-mapping = false → inert warning shows.

Test manifests

A — inline override + explicit false (extend)

[workspace]
name = "map-test-inline"
channels = ["conda-forge"]
platforms = ["osx-arm64"]

[workspace.conda-pypi-map]
conda-forge = { mapping = { numpy = "my-renamed-numpy", boltons = false } }

[dependencies]
python = "3.12.*"
numpy = "*"
boltons = "*"

# any pypi dep, just so the purl amending runs
[pypi-dependencies]
rich = "*"

pixi lock && grep -B 3 "my-renamed-numpy" pixi.lock; grep -A 5 "name: boltons" pixi.lock

B — the breaking bare-string flip vs replace

echo '{ "numpy": "my-renamed-numpy" }' > mapping.json

[workspace]
name = "map-test-flip"
channels = ["conda-forge"]
platforms = ["osx-arm64"]

[workspace.conda-pypi-map]
# additive now; boltons still gets a purl from the default mapping.
conda-forge = "mapping.json"
# then switch to the old exclusive behavior; boltons loses its purl:
# conda-forge = { location = "mapping.json", mode = "replace" }

[dependencies]
python = "3.12.*"
numpy = "*"
boltons = "*"

[pypi-dependencies]
rich = "*"

C — global disable keeps the verbatim heuristic

pytables is a nice probe because its real PyPI name is tables: with lookups disabled the verbatim fallback yields pkg:pypi/pytables, with the default mapping you get pkg:pypi/tables.

[workspace]
name = "map-test-disable"
channels = ["conda-forge"]
platforms = ["osx-arm64"]
conda-pypi-map = false   # also try {} to see the deprecation warning

[dependencies]
python = "3.12.*"
pytables = "*"

[pypi-dependencies]
rich = "*"

D — pin the parselmouth mapping with cache-ttl

[workspace]
name = "map-test-ttl"
channels = ["conda-forge"]
platforms = ["osx-arm64"]

[workspace.conda-pypi-map]
conda-forge = { location = "https://raw.githubusercontent.com/prefix-dev/parselmouth/main/files/compressed_mapping.json", mode = "replace", cache-ttl = "24h" }

[dependencies]
python = "3.12.*"
pytables = "*"

[pypi-dependencies]
rich = "*"

E — pixi-build-python overrides

# pixi.toml
[workspace]
name = "map-test-build"
channels = ["conda-forge"]
platforms = ["osx-arm64"]
preview = ["pixi-build"]

[dependencies]
map-test-build = { path = "." }

[package]
name = "map-test-build"
version = "0.1.0"

[package.build]
backend = { name = "pixi-build-python", version = "*" }

[package.build.config]
ignore-pypi-mapping = false
pypi-conda-map = { torch = "pytorch", my-internal-helper = false }

# for the per-key merge check:
# [package.build.target.linux-64.config]
# pypi-conda-map = { nvidia-cublas-cu12 = false }

[package.host-dependencies]
python = "*"

# pyproject.toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "map-test-build"
version = "0.1.0"
dependencies = ["torch>=2.0", "requests", "my-internal-helper"]

PIXI_BUILD_BACKEND_OVERRIDE="pixi-build-python=$PWD/target/debug/pixi-build-python" pixi build
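
What manifest E is expected to produce can be sketched in a few lines of Python. This is an illustration under assumptions, not the backend's code: `merge_overrides` and `map_requirements` are hypothetical helpers, version specifiers are ignored, and keeping the PyPI name when the mapping service misses is a simplification of this sketch.

```python
from typing import Callable, Optional, Union

Override = Union[str, bool]  # conda name, or False to drop the dependency


def merge_overrides(
    base: dict[str, Override], target: dict[str, Override]
) -> dict[str, Override]:
    """Per-key merge: target-specific keys win, other base keys survive."""
    return {**base, **target}


def map_requirements(
    pypi_names: list[str],
    overrides: dict[str, Override],
    service: Callable[[str], Optional[str]],
) -> list[str]:
    """Translate PyPI requirement names into conda dependency names."""
    conda_deps: list[str] = []
    for name in pypi_names:
        if name in overrides:
            choice = overrides[name]
            if isinstance(choice, str):
                conda_deps.append(choice)
            # A False override drops the dependency silently.
            continue
        # No override: ask the mapping service (assumed: keep name on a miss).
        conda_deps.append(service(name) or name)
    return conda_deps
```

With the overrides from manifest E and a service that always misses, `["torch", "requests", "my-internal-helper"]` becomes `["pytorch", "requests"]`. Merging the linux-64 target table adds the `nvidia-cublas-cu12` entry without discarding the base keys.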

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.

Tools: Claude Code with Fable 5

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added sufficient tests to cover my changes.
  • I have verified that changes that would impact the JSON schema have been made in schema/model.py.

tdejager added 10 commits June 10, 2026 12:25
…eplace modes, inline mappings and cache-ttl

- conda-pypi-map now accepts false (global disable), and per-channel
  values can be a bare string, false, or a table with location,
  inline mapping entries, mode = "extend"|"replace" and cache-ttl.
- BREAKING: bare location strings now use the additive extend mode
  (overlay over the prefix.dev chain) instead of the exclusive
  replace mode. The old behavior is available via mode = "replace".
- conda-pypi-map = {} is soft-deprecated in favor of false and emits
  a deprecation warning.
- pixi_core wires the manifest entries into the per-channel mapping
  configuration; inline keys are lowercased to match normalized conda
  names, and cache-ttl is validated to require an http(s) location.
- Move all mapping/purl tests from solve_group_tests.rs into a new
  conda_pypi_map_tests.rs integration module.
- Split CondaPypiMapEntry::Map into CondaPypiMapSpec with a dedicated
  MappingLocationSpec { location, cache_ttl } so the TTL is structurally
  tied to the location source it applies to.
- Clarify in the Disabled doc comments that the offline conda-forge
  verbatim fallback still applies when lookups are disabled.
- Deduplicate the offline help text into a shared MAPPING_OFFLINE_HELP
  const used by both the prefix.dev and project-defined fetch errors, and
  mention pointing at a custom mapping location (with cache-ttl) as an
  escape hatch.
- Document why the TTL cache cannot reuse the http-cache middleware
  (header-driven freshness, client-global max_ttl, no stale-on-error).
- Docs: add a parselmouth raw-URL pinning recipe (and a note that blob
  URLs serve HTML).
- pixi_toml: add a custom_error(message, span) constructor and use it for
  the conda-pypi-map validation errors.
- pixi_core: extract the conda-pypi-map manifest conversion out of
  workspace/mod.rs into a workspace::conda_pypi_map module with named,
  unit-testable functions (incl. the channel-membership validation).
- pixi_core: classify mapping locations with rattler_lock::UrlOrPath
  instead of hand-rolled starts_with checks; file:// urls normalize to
  paths and non-http(s) remote schemes are rejected with a clear error.
- pypi_mapping: make the per-record fallback policy explicit with a
  Fallback enum (PrefixThenVerbatim | Verbatim | None) instead of a
  mutable suppression flag.
- pixi-build-python: dedupe the requirement version conversion into
  convert_requirement_version, shared by the user-map and service paths.
- test: pin that a mapping for one channel no longer suppresses the
  verbatim fallback for records from other, unmapped channels (online).
- TTL cache: treat a future mtime (clock skew) as age zero instead of
  making the cached copy invisible to the freshness check and the stale
  fallback; write cache files atomically via tempfile + persist; unit
  tests for the age computation.
- pypi-conda-map: an invalid conda name in an override now falls through
  to the mapping service instead of silently dropping the dependency.
- Split the offline help text: failures fetching a user-configured
  location now suggest checking the URL / adding cache-ttl instead of
  the firewall-framed prefix.dev advice; clearer HTTP status error.
- Warn when a mapping location uses plain http://, since a tampered
  mapping influences dependency resolution.
- Encode the manifest-mode to MappingMode conversion in a documented
  convert_mode function (a From impl is impossible: neither crate
  depends on the other, so the orphan rule forces it into pixi_core).
- Error wording: cache-ttl duration errors show example values;
  cache-ttl-without-location message no longer implies location must be
  a URL; {} deprecation help reworded; stale Disabled doc hedge fixed;
  duplicated doc comment removed.
- Docs: warning box now also covers the verbatim-fallback scope change
  for unmapped channels; cache-ttl docs state the no-cache hard-failure;
  inline-mapping example no longer reuses 'pytorch' as both channel and
  package name.
- New tests: mixed-case inline keys, cache-ttl on a local path rejected,
  file:// table-form location works (pins UrlOrPath normalization),
  Skip entries with markers, vacuous purls assertion fixed, unit tests
  for parse_mapping_location/convert_entry; re-documented what the
  fresh-cache TTL test actually pins (cache layout + no network).
- typos: reword 'mis-mapped' in the conda/PyPI concepts page.
- basedpyright: no implicit string concatenation in the new
  schema/model.py field descriptions (schema output unchanged).