IPIP-518: URIs in Routing V1 API via Generic Schema #518
base: main
Changes from 2 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,345 @@ | ||
| --- | ||
| title: "IPIP-0518: URIs in Routing V1 API via Generic Schema" | ||
| date: 2026-02-11 | ||
| ipip: proposal | ||
| editors: | ||
| - name: Marcin Rataj | ||
| github: lidel | ||
| url: https://lidel.org/ | ||
| affiliation: | ||
| name: Shipyard | ||
| url: https://ipshipyard.com | ||
| thanks: | ||
| - name: Adin Schmahmann | ||
| github: aschmahmann | ||
| affiliation: | ||
| name: Shipyard | ||
| url: https://ipshipyard.com | ||
| relatedIssues: | ||
| - https://github.com/ipfs/specs/issues/192 | ||
| - https://github.com/ipfs/specs/issues/496 | ||
| - https://github.com/multiformats/multiaddr/issues/63 | ||
| - https://github.com/multiformats/multiaddr/issues/87 | ||
| - https://github.com/ipshipyard/roadmaps/issues/15 | ||
| - https://github.com/ipfs/specs/pull/518 | ||
| order: 518 | ||
| tags: ['ipips'] | ||
| xref: | ||
| - rfc3986 | ||
| --- | ||
|
|
||
| ## Summary | ||
|
|
||
| Introduce a `generic` record schema for the Delegated Routing V1 HTTP API that supports URIs alongside multiaddrs in the `Addrs` field. Unlike the `peer` schema, which is tied to libp2p PeerIDs and multiaddrs, `generic` supports arbitrary identifiers and address formats including HTTP(S) URLs and other URI schemes. This enables HTTP-only providers, WebSeeds, and other non-libp2p use cases without breaking existing clients. | ||
|
|
||
| ## Motivation | ||
|
|
||
| The Delegated Routing V1 HTTP API currently requires all provider records to use the `peer` schema, which mandates a libp2p PeerID as the identifier and multiaddrs as the address format. | ||
|
|
||
| Many IPFS services are primarily accessible via HTTP(S) and do not use libp2p: | ||
|
|
||
| - IPFS Gateways (path and subdomain) | ||
| - HTTP-based content providers and pinning services | ||
| - WebSeed providers | ||
|
|
||
| Converting HTTP(S) URLs to multiaddrs is lossy and error-prone: | ||
|
|
||
| - HTTP URLs must be encoded as `/dns4/example.com/tcp/80/http` or `/dns4/example.com/tcp/443/https` | ||
| - URL-to-multiaddr round-trips are not lossless (see [multiaddr#63](https://github.com/multiformats/multiaddr/issues/63)) | ||
| - Multiple implementations handle edge cases differently (default ports, paths, fragments, HTTP basic-auth) | ||
| - A single `https://example.com` URL supports HTTP/1.1, HTTP/2, and HTTP/3, but multiaddr requires separate entries per transport | ||
| - Requiring multiaddr libraries raises the barrier for lightweight HTTP-only clients | ||
|
|
||
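To make the lossiness concrete, here is a minimal sketch of a naive URL-to-multiaddr converter. The function `naive_url_to_multiaddr` is hypothetical and written only for illustration; it does not reproduce the behavior of any particular multiaddr library. It assumes the `/dns4/.../tcp/<port>/<scheme>` encoding shown in the list above.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URL does not carry an explicit one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def naive_url_to_multiaddr(url):
    """Hypothetical converter: keeps host, port, and scheme, drops the rest."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"/dns4/{parts.hostname}/tcp/{port}/{parts.scheme}"


# Three different URLs collapse into the same multiaddr, so the original
# path, query, fragment, and credentials cannot be recovered afterwards.
urls = [
    "https://example.com",
    "https://example.com/some/path?format=car#frag",
    "https://user:pass@example.com",
]
converted = {naive_url_to_multiaddr(u) for u in urls}
```

In this sketch `converted` ends up holding a single multiaddr, which is the round-trip problem described above.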
| A new schema decouples provider records from libp2p, allowing the ecosystem to experiment with HTTP-only providers, WebSeeds, alternative protocols, and other novel concepts without vendor lock-in: new protocols need no explicit entry in [multicodec table.csv](https://github.com/multiformats/multicodec/blob/master/table.csv) and are not blocked on ecosystem-wide adoption of a new addressing scheme. Existing clients remain unaffected. | ||
|
|
||
| ## Detailed design | ||
|
|
||
| ### Generic Schema | ||
|
|
||
| A new `generic` schema is added to the [Known Schemas](https://specs.ipfs.tech/routing/http-routing-v1/#known-schemas) section of the Routing V1 spec. | ||
|
|
||
| ```json | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "did:key:z6Mkm1...", | ||
| "Addrs": ["https://trustless-gateway.example.com", "/ip4/1.2.3.4/tcp/5000"], | ||
| "Protocols": ["transport-ipfs-gateway-http"] | ||
| } | ||
| ``` | ||
|
|
||
| Fields: | ||
|
|
||
| - `ID`: a string identifier for the provider. Unlike the `peer` schema, this is not restricted to libp2p PeerIDs. Implementations SHOULD use identifiers that are self-authenticating (e.g. `did:key`), sufficiently unique, and less than 100 bytes. | ||
| - `Addrs`: an optional list of addresses as strings. Addresses are duck-typed by prefix: | ||
| - If a string starts with `/`, it is parsed as a [multiaddr](https://github.com/multiformats/multiaddr) | ||
| - Otherwise, it is parsed as a URI per :cite[rfc3986] | ||
| - Clients MUST skip addresses they cannot parse or do not support and continue with remaining entries. This includes URIs with unrecognized schemes, unsupported multiaddrs, or all multiaddrs if the client only supports URIs. | ||
| - `Protocols`: an optional list of transfer protocol names associated with this record. Protocol names are opaque strings with a max length of 63 characters, established by rough consensus across compatible implementations per the [robustness principle](https://specs.ipfs.tech/architecture/principles/#robustness). This is a deliberate departure from the `peer` schema, which suggested protocol names require registration in [multicodec table.csv](https://github.com/multiformats/multicodec/blob/master/table.csv), creating an IANA-like chokepoint for adopting new protocols. The `generic` schema removes this gatekeeping: anyone can return novel addresses and protocol names without external approval, and clients that do not recognize them simply skip them without breaking. | ||
|
|
||
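The duck-typing and skip rules for `Addrs` can be sketched as follows. This is an illustrative sketch, not a reference implementation: `classify_address` and `usable_addresses` are hypothetical names, multiaddrs are only recognized by their leading `/` (a real client would pass them to a multiaddr parser), and URIs are parsed with Python's standard `urllib.parse.urlsplit`.

```python
import re
from urllib.parse import urlsplit

# RFC 3986 scheme syntax: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*$")


def classify_address(addr):
    """Return ("multiaddr", addr), ("uri", scheme), or None if unparseable."""
    if not isinstance(addr, str) or not addr:
        return None
    if addr.startswith("/"):
        # Assumption: a real client would validate this with a multiaddr parser.
        return ("multiaddr", addr)
    try:
        parts = urlsplit(addr)
    except ValueError:
        return None
    if not _SCHEME_RE.match(parts.scheme):
        # No scheme means a relative reference, which is not a valid entry.
        return None
    return ("uri", parts.scheme.lower())


def usable_addresses(addrs, uri_schemes=("https",), multiaddrs=False):
    """Keep only addresses this client supports; skip everything else."""
    kept = []
    for addr in addrs or []:
        kind = classify_address(addr)
        if kind is None:
            continue  # unparseable entry: skip and keep going
        if kind[0] == "multiaddr" and multiaddrs:
            kept.append(addr)
        elif kind[0] == "uri" and kind[1] in uri_schemes:
            kept.append(addr)
    return kept
```

With the defaults above, the sketch models an HTTP-only client: it keeps `https://` entries and skips every multiaddr and every URI with another scheme, without failing the whole record.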
| Servers and caching proxies MUST act as pass-through and return `Addrs` and `Protocols` as-is, unless explicitly filtered by the client via `?filter-addrs` or `?filter-protocols` query parameters. | ||
|
|
||
| To allow for protocol-specific fields and future-proofing, the parser MUST allow unknown fields, and clients MUST ignore fields they do not recognize. | ||
|
|
||
| The total serialized size of a single `generic` record MUST be less than 10 KiB. | ||
|
|
||
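The limits stated so far (record size, `ID` length, protocol name length) and the tolerance for unknown fields can be checked with a small helper. The sketch below is an assumption-laden illustration: `check_generic_record` is a hypothetical name, and "serialized size" is taken to mean the byte length of the JSON record as received, which the text above does not spell out.

```python
import json

MAX_RECORD_BYTES = 10 * 1024  # record MUST be less than 10 KiB
MAX_ID_BYTES = 100            # ID SHOULD be less than 100 bytes
MAX_PROTOCOL_CHARS = 63       # protocol names: max length of 63 characters


def check_generic_record(raw):
    """Return (record, problems) for one serialized `generic` record."""
    problems = []
    if len(raw) >= MAX_RECORD_BYTES:
        problems.append("record is 10 KiB or larger")
    record = json.loads(raw)
    if not isinstance(record, dict):
        return None, ["record is not a JSON object"]
    if record.get("Schema") != "generic":
        problems.append("not a generic record")
    record_id = record.get("ID")
    if not isinstance(record_id, str):
        problems.append("ID is not a string")
    elif len(record_id.encode("utf-8")) >= MAX_ID_BYTES:
        problems.append("ID is 100 bytes or longer")
    for name in record.get("Protocols") or []:
        if not isinstance(name, str) or len(name) > MAX_PROTOCOL_CHARS:
            problems.append(f"bad protocol name: {name!r}")
    # Unknown fields are deliberately left untouched in `record`.
    return record, problems
```

Whether a client rejects a record over a SHOULD-level problem (such as a long `ID`) is a policy choice; the sketch only reports it.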
| ### Supported URI Schemes | ||
|
|
||
| Initially, `https://` SHOULD be supported as the primary URI scheme. | ||
|
|
||
| Other URI schemes (e.g. `magnet:`, `foo://`, or any future scheme) MAY appear in `Addrs`. Clients MUST skip URIs with schemes they do not support. This ensures new URI schemes can be introduced over time without breaking existing clients or requiring central coordination. | ||
|
|
||
| ### URI Requirements | ||
|
|
||
| URIs in the `Addrs` field: | ||
|
|
||
| - MUST be absolute URIs (not relative references) | ||
| - MUST include the scheme (e.g. `https://`, `magnet:`) | ||
| - MAY include paths, query parameters, or fragments, but clients MUST handle their presence defensively | ||
| - SHOULD point to endpoints that support protocols listed in the `Protocols` field | ||
|
|
||
| ### Interaction with `filter-addrs` | ||
|
|
||
| The `filter-addrs` query parameter from [IPIP-0484](https://specs.ipfs.tech/ipips/ipip-0484/) applies to `generic` records the same way it applies to `peer` records: | ||
|
|
||
| - Multiaddr addresses (strings starting with `/`) are filtered by multiaddr protocol name. | ||
| - URI addresses (strings not starting with `/`) are filtered by URI scheme name. For example, `?filter-addrs=https` matches `https://example.com`. | ||
| - This is naturally consistent: `https` is both a multiaddr protocol name (matching `/dns/example.com/tcp/443/https`) and a URI scheme (matching `https://example.com`). | ||
|
Contributor
Might be a good idea to be explicit here about how filtering previously related to multiaddr components would apply to others (e.g. does must |
||
| - `?filter-addrs=unknown` includes `generic` records with no known addresses. | ||
| - If no addresses remain after filtering, the `generic` record is omitted from the response. | ||
|
|
||
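The filtering rules listed above can be sketched in a few lines. This is one reading of the rules, not the normative algorithm: `address_labels` and `filter_record` are hypothetical names, the handling of positive and negative terms follows my understanding of IPIP-0484, and multiaddr protocol names are approximated by splitting on `/` instead of using a real multiaddr parser.

```python
from urllib.parse import urlsplit


def address_labels(addr):
    """Names a filter term can match: multiaddr protocols or the URI scheme."""
    if addr.startswith("/"):
        # Simplification: every path segment is a candidate name, so values
        # like "1.2.3.4" or "443" are included but never match a real filter.
        return {part.lower() for part in addr.split("/") if part}
    try:
        scheme = urlsplit(addr).scheme
    except ValueError:
        return set()
    return {scheme} if scheme else set()


def filter_record(record, filter_addrs):
    """Return a filtered copy of the record, or None if it must be omitted."""
    terms = [t.strip().lower() for t in filter_addrs.split(",") if t.strip()]
    if not terms:
        return dict(record)  # no filter: pass through unchanged
    negative = {t[1:] for t in terms if t.startswith("!")}
    positive = {t for t in terms if not t.startswith("!")}
    addrs = record.get("Addrs") or []
    if not addrs:
        # Assumption: address-less records are kept only for "unknown".
        return dict(record) if "unknown" in positive else None
    kept = []
    for addr in addrs:
        labels = address_labels(addr)
        if labels & negative:
            continue  # excluded by a negative term
        if positive and not (labels & positive):
            continue  # positive terms given, none matched
        kept.append(addr)
    return {**record, "Addrs": kept} if kept else None
```

Because URI schemes and multiaddr protocol names share one label set per address, a single term such as `https` matches both address styles without special cases.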
| ### Relationship to Peer Schema | ||
|
|
||
| The `peer` schema remains unchanged. It represents a libp2p node identified by PeerID with multiaddr addresses. The `generic` schema is complementary: | ||
|
|
||
| | | `peer` schema | `generic` schema | | ||
| |---|---|---| | ||
| | `ID` | libp2p PeerID | any string (e.g. `did:key`) | | ||
| | `Addrs` | multiaddrs only | multiaddrs and/or URIs | | ||
|
Contributor
Might be worth calling out that in the peer schema the multiaddrs all had an implied |
||
| | Use case | libp2p-native providers | HTTP-only, WebSeed, custom protocols | |
|
|
||
| Routing servers MAY emit both schema types for the same provider: | ||
|
|
||
| ```json | ||
| { | ||
| "Providers": [ | ||
| { | ||
| "Schema": "peer", | ||
| "ID": "12D3KooW...", | ||
| "Addrs": ["/ip4/192.168.1.1/tcp/4001"], | ||
| "Protocols": ["transport-bitswap"] | ||
| }, | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "did:key:z6Mkm1...", | ||
| "Addrs": ["https://trustless-gateway.example.com"], | ||
| "Protocols": ["transport-ipfs-gateway-http"] | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| ## Design rationale | ||
|
|
||
| ### Why a new schema instead of modifying Peer | ||
|
|
||
| The `peer` schema has a hard dependency on libp2p: `ID` is a PeerID and `Addrs` are multiaddrs. Existing clients parse every entry in `Addrs` as a multiaddr. Introducing URIs into the `Addrs` field of the `peer` schema would cause parse errors in all third-party clients that have not been updated, breaking backward compatibility. | ||
|
|
||
| Previous rollouts of new multiaddr protocols (`/quic-v1`, `/webtransport`, `/webrtc-direct`) did not break clients because those strings still parsed as valid multiaddrs, even when the client could not dial them. URIs are not multiaddrs and will fail multiaddr parsing. | ||
|
|
||
| By introducing a new schema, we leverage the existing requirement that clients MUST skip records with unknown schemas: | ||
|
|
||
| - Existing clients continue to work, only seeing `peer` records they already understand | ||
| - Updated clients opt in to `generic` records at their own pace | ||
| - No flag day or coordinated upgrade required | ||
|
|
||
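The opt-in behavior relies on clients skipping records whose `Schema` they do not know. A minimal sketch, assuming a parsed JSON response shaped like the examples in this document (`records_by_schema` is a hypothetical helper name):

```python
def records_by_schema(response, known_schemas):
    """Group provider records by schema, skipping schemas the client ignores."""
    grouped = {}
    for record in response.get("Providers") or []:
        schema = record.get("Schema") if isinstance(record, dict) else None
        if schema not in known_schemas:
            continue  # unknown schema: skip the record, do not fail
        grouped.setdefault(schema, []).append(record)
    return grouped


response = {
    "Providers": [
        {"Schema": "peer", "ID": "12D3KooW...",
         "Addrs": ["/ip4/192.168.1.1/tcp/4001"],
         "Protocols": ["transport-bitswap"]},
        {"Schema": "generic", "ID": "did:key:z6Mkm1...",
         "Addrs": ["https://trustless-gateway.example.com"],
         "Protocols": ["transport-ipfs-gateway-http"]},
    ]
}

# An existing client only knows "peer"; an updated client knows both.
legacy_view = records_by_schema(response, {"peer"})
updated_view = records_by_schema(response, {"peer", "generic"})
```

Here `legacy_view` contains only the `peer` record, while `updated_view` contains both, which is the no-flag-day property described above.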
| ### Incremental migration | ||
|
|
||
| Libp2p-native peers continue using the `peer` schema as-is. The migration only impacts providers that are not actual libp2p peers -- such as HTTP-only Trustless Gateways that today must be shoehorned into the `peer` schema with a synthetic PeerID. During the transition period, routing servers can return both `peer` and `generic` records for the same provider. Clients that understand `generic` use the richer address information; others fall back to `peer` records with the synthetic PeerID. | ||
|
|
||
| ### Decoupling from libp2p | ||
|
|
||
| The `generic` schema removes the hard requirement on libp2p PeerIDs and multiaddrs. This lowers the barrier for building lightweight IPFS clients that only speak HTTP, and enables experimentation with new provider types (WebSeeds, S3-backed storage) without requiring changes to the libp2p specification or multiaddr registry. | ||
|
|
||
| ## User benefit | ||
|
|
||
| ### For developers | ||
|
|
||
| - HTTP-only providers and HTTP-only stacks can be built without multiaddr encoding/decoding libraries. Lower cognitive overhead: everyone familiar with `https://` URIs knows how to work with them. | ||
| - Alternative URI schemes are also easier to integrate than new multiaddr protocols | ||
| - Lightweight HTTP-only IPFS clients become feasible without re-implementing libp2p concepts | ||
|
|
||
| ### For service providers | ||
|
|
||
| - HTTP(S) endpoints advertised directly as URLs | ||
| - Custom address formats supported without multiaddr registry changes | ||
| - Protocol-specific metadata via extra fields | ||
|
|
||
| ### For end users | ||
|
|
||
| - Lower barrier for new client implementations increases ecosystem diversity | ||
| - HTTP-only providers improve compatibility with web-based IPFS implementations | ||
|
|
||
| ## Compatibility | ||
|
|
||
| ### Backward compatibility | ||
|
|
||
| Fully backward compatible. Existing clients skip `generic` records because they use an unknown schema. The `peer` schema is unchanged. | ||
|
|
||
| ### Forward compatibility | ||
|
|
||
| Unknown fields MUST be ignored by clients. New address formats and protocol-specific fields can be added without breaking existing implementations. | ||
|
|
||
| URIs in `Addrs` are not limited to a specific scheme. Clients parsing a `generic` record MUST skip addresses with unrecognized URI schemes, which allows the ecosystem to introduce addressing beyond `https://` without requiring coordination or simultaneous upgrades. | ||
|
|
||
| ### Migration path | ||
|
|
||
| 1. Routing servers emit `generic` records alongside existing `peer` records | ||
| 2. Clients add support for `generic` schema at their own pace | ||
|
|
||
| 3. HTTP-only providers that previously required multiaddr conversion can switch to `generic` with native URI addresses | ||
|
Comment on lines +197 to +198
Contributor
Note: The providers don't necessarily have to switch anything; the routing-v1 servers would need to switch something. The providers might only need to if the routing-v1 server, or the routing system(s) behind it, care |
||
|
|
||
| ## Security | ||
|
|
||
| ### URI validation | ||
|
|
||
| Implementations SHOULD validate URIs: | ||
|
|
||
| - Verify the URI scheme is supported (e.g. `https://`) | ||
| - Enforce URI length limits (practical limit: 2048 to 8192 characters) | ||
| - Apply scheme-specific rate limits where appropriate (e.g. rate-limiting HTTP requests to URIs returning non-success responses) | ||
|
|
||
| ### HTTPS preference | ||
|
|
||
| For HTTP-based URIs, implementations SHOULD prefer `https://`. The `http://` scheme SHOULD only be allowed for testing and private LAN deployments, gated behind an explicit opt-in flag. | ||
|
|
||
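The URI validation and HTTPS preference rules can be combined into one acceptance check. The sketch below is illustrative only: `accept_uri` and its `allow_insecure_http` flag are hypothetical names, and the 2048-character default is just the lower end of the range mentioned above.

```python
from urllib.parse import urlsplit

MAX_URI_CHARS = 2048  # assumption: lower end of the 2048-8192 range


def accept_uri(uri, allow_insecure_http=False, max_chars=MAX_URI_CHARS):
    """Return True if an HTTP(S) URI passes basic validation."""
    if len(uri) > max_chars:
        return False
    try:
        parts = urlsplit(uri)
        host = parts.hostname
    except ValueError:
        return False
    if not host:
        return False  # no authority component to connect to
    if parts.scheme == "https":
        return True
    if parts.scheme == "http":
        # Plain HTTP only behind an explicit opt-in (testing, private LAN).
        return bool(allow_insecure_http)
    return False  # scheme not supported by this client
```

A client that supports more schemes would extend the final branch; the point of the sketch is that `http://` is rejected unless the caller opts in.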
| ### DNS considerations | ||
|
|
||
| HTTP(S) URIs rely on DNS resolution. The same security considerations that apply to `/dns`, `/dns4`, and `/dns6` multiaddrs apply here: | ||
|
|
||
| - DNS responses can be spoofed without DNSSEC | ||
| - Clients SHOULD use secure DNS transports where available | ||
| - Certificate validation MUST be performed for HTTPS URIs on the public internet | ||
|
|
||
| ### ID trust | ||
|
|
||
| The `generic` schema `ID` field is self-reported. Clients SHOULD use self-authenticating identifiers (e.g. `did:key`) and verify signatures where applicable. Reputation and resource allocation decisions SHOULD be tied to `ID`. | ||
|
Contributor
Might be worth clarifying what's even being trusted here. IIUC from the perspective of routing-v1 clients no trust has changed at all.
|
||
|
|
||
| ## Alternatives | ||
|
|
||
| ### URIs in Peer Schema Addrs field | ||
|
|
||
| Adding URIs directly to the `Addrs` field of the existing `peer` schema was considered. The `peer` schema was introduced in [IPIP-0337](https://specs.ipfs.tech/ipips/ipip-0337/) and has been used in production by multiple independent implementations for years. Changing the semantics of `Addrs` from multiaddr-only to a mixed format would break all third-party clients that parse entries as multiaddrs. Unlike new multiaddr protocols which still parse as valid multiaddrs, URIs are a fundamentally different format and cause parse errors. A new schema avoids this by leveraging the existing unknown-schema-skipping behavior. | ||
|
|
||
| ### URI-to-multiaddr conversion | ||
|
|
||
| The status quo requires converting HTTP URLs to multiaddrs like `/dns4/example.com/tcp/443/https`. This conversion is lossy: URI paths, fragments, query parameters, and HTTP/3 transport information are lost. Multiple implementations handle edge cases differently, leading to interoperability issues (see [multiaddr#63](https://github.com/multiformats/multiaddr/issues/63)). It also means libp2p-specific address libraries and parsers have to be implemented by every new client, increasing complexity and raising the barrier for new implementations. | ||
|
|
||
| ### Custom multiaddr keyword arguments | ||
|
|
||
| Adding keyword arguments to multiaddr protocols was proposed in [multiaddr#87](https://github.com/multiformats/multiaddr/issues/87). This would increase complexity for all multiaddr implementers without addressing the fundamental desire to use standard URIs. | ||
|
|
||
| ### Separate URI field in Peer Schema | ||
|
|
||
| Adding a separate `URIs` field to the `peer` schema would complicate the schema and create ambiguity about which field to check for addresses. A new schema is a cleaner separation: `peer` stays focused on libp2p peers, `generic` handles everything else. | ||
|
|
||
| ## Test fixtures | ||
|
|
||
| ### HTTPS-only provider | ||
|
|
||
| ```json | ||
| { | ||
| "Providers": [ | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "did:key:z6Mkm1...", | ||
| "Addrs": ["https://trustless-gateway.example.com"], | ||
| "Protocols": ["transport-ipfs-gateway-http"] | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| ### Provider with protocol-specific metadata and custom URI scheme | ||
|
|
||
| ```json | ||
| { | ||
| "Providers": [ | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "did:key:z6Mkm1...", | ||
| "Addrs": ["foo://custom-storage.example.com/bucket"], | ||
| "Protocols": ["example-future-protocol"], | ||
| "example-future-protocol": {"version": 2, "features": ["foo"]} | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| Clients that do not recognize the `foo://` URI scheme MUST skip that address. | ||
|
|
||
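A client reading this fixture might look up protocol-specific metadata as sketched below. The lookup rule is an assumption drawn from the fixture itself, where the extra field is keyed by the protocol name; `protocol_metadata` is a hypothetical helper, and unrelated unknown fields are simply ignored.

```python
def protocol_metadata(record):
    """Map each listed protocol to its same-named extra field, if present."""
    found = {}
    for name in record.get("Protocols") or []:
        if isinstance(name, str) and name in record:
            found[name] = record[name]
    return found


record = {
    "Schema": "generic",
    "ID": "did:key:z6Mkm1...",
    "Addrs": ["foo://custom-storage.example.com/bucket"],
    "Protocols": ["example-future-protocol"],
    "example-future-protocol": {"version": 2, "features": ["foo"]},
}

metadata = protocol_metadata(record)
```

A client that does not know `example-future-protocol` never calls into this path and is unaffected by the extra field.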
| ### Provider with opaque identifier | ||
|
|
||
| The `ID` field is not restricted to `did:key`. Any string identifier can be used: | ||
|
|
||
| ```json | ||
| { | ||
| "Providers": [ | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "550e8400-e29b-41d4-a716-446655440000", | ||
| "Addrs": ["https://cdn.example.com"], | ||
| "Protocols": ["transport-ipfs-gateway-http", "example-future-protocol"] | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| ### Mixed response with both schemas | ||
|
|
||
| ```json | ||
| { | ||
| "Providers": [ | ||
| { | ||
| "Schema": "peer", | ||
| "ID": "12D3KooW...", | ||
| "Addrs": [ | ||
| "/ip4/192.168.1.1/tcp/4001", | ||
| "/ip4/192.168.1.1/udp/4001/quic-v1" | ||
| ], | ||
| "Protocols": ["transport-bitswap"] | ||
| }, | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "did:key:z6Mkm1...", | ||
| "Addrs": ["https://trustless-gateway.example.com"], | ||
| "Protocols": ["transport-ipfs-gateway-http"] | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| ### Filtering with `filter-addrs` | ||
|
|
||
| Given a response containing: | ||
|
|
||
| ```json | ||
| { | ||
| "Providers": [ | ||
| { | ||
| "Schema": "generic", | ||
| "ID": "did:key:z6Mkm1...", | ||
| "Addrs": ["https://provider.example.com", "/ip4/1.2.3.4/tcp/443/https"], | ||
| "Protocols": ["transport-ipfs-gateway-http"] | ||
| } | ||
| ] | ||
| } | ||
| ``` | ||
|
|
||
| A request with `?filter-addrs=https` returns both addresses, because `https` matches the URI `https://provider.example.com` by URI scheme and the multiaddr `/ip4/1.2.3.4/tcp/443/https` by multiaddr protocol name. | ||
|
|
||
| A request with `?filter-addrs=tcp` returns only the multiaddr `/ip4/1.2.3.4/tcp/443/https`, because `tcp` does not match the URI scheme `https`. | ||
|
|
||
| A request with `?filter-addrs=!https` omits the record entirely, because all addresses are removed by the negative filter. | ||
|
|
||
| ## Copyright | ||
|
|
||
| Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). | ||
Should we add a comment about allowing people to define known protocols here similar to how we have known schemas (either way we'll need a place to specify the names, metadata, and meaning associated with different protocol names)?