Releases: roboflow/inference
v1.3.1
🚀 Added
- Workflows: point prompting —
labeled_pointskind + SAM 3 Interactive block, and SAM3 PVS multi-prompt bug fixes — #2436 (@hansent) - Models: SAM 3 streaming video concept tracker (model class + Workflow block + docs) — #2439 (@hansent)
- Workflows: Track Class Lock block — #2447 (@alexeialexandrovich)
- Workflows: Claude Fable 5 as a Workflow model — #2433 (@Erol444)
- Workflows: enable
"best"confidence for semantic segmentation — #2425 (@leeclemnet) - WebRTC SDK: surface per-frame server errors to the client — #2438 (@sberan)
- Serverless: enable SAM 3 3D and
embed_imageroutes — #2424 (@hansent) - Batch Processing: Asset Library adjustments — #2399 (@PawelPeczek-Roboflow)
- Gateway: support scheme-qualified
SECURE_GATEWAYURLs — #2429 (@alexnorell)
🐛 Fixes
- Workflows (OBB): shift oriented bounding box corners in
detections_stitch— #2427 (@kounelisagis) - Workflows (OBB): translate oriented bounding box corners in
dynamic_crop— #2430 (@kounelisagis) - SAM 3: stop
IndexErroron single-point visual prompts inserialize_prompt— #2423 (@bigbitbus) - WebRTC: accept
strorlist[str]for ICE server URLs — #2418 (@ecarrara) - Depth estimation: honor query API key for model loads — #2445 (@hansent)
- Depth estimation: route depth model id through the inference path — #2446 (@hansent)
- Core: convert masks to
boolbeforenp.arraywrapping (avoids uint8 deprecation warnings) — #2432 (@lou-roboflow) - Core: remove mutable default arguments — #2441 (@kounelisagis)
- Caching: hash-truncate long cache path segments — #2279 (@hansent)
- UI: fix mobile drawer/menu showing on the index page — #2444 (@Erol444)
🔒 Security
- Address security issues — #2449 (@PawelPeczek-Roboflow)
📦 Build & Platform
- Shrink JetPack 6.2 image ~20% (6.56 → 5.22 GB); fix Orin (sm87) flash-attn + bitsandbytes — #2440 (@alexnorell)
📚 Docs
- RF-DETR Keypoint Detection docs — #2421 (@sergii-bond)
- SAM 3: document exemplar and PVS prompting — #2431 (@leeclemnet)
🔧 Dependencies
- Bump
shell-quote1.8.2 → 1.8.4 in/theme_build(npm/yarn group) — #2426 (@dependabot)
👋 New Contributors
- @kounelisagis — first contribution in #2427
- @alexeialexandrovich — first contribution in #2447
Full Changelog: v1.3.0...v1.3.1
v1.3.0
🚀 Added
🦾 RF-DETR Keypoints — pose estimation joins the RF-DETR family
The big one this release: thanks to @sergii-bond, RF-DETR now supports keypoint detection alongside the existing detection head — a single model architecture across detection and pose. Thanks to the contribution (#2401, #2416) you can pull a fine-tuned RF-DETR keypoints model and run it through the standard inference + workflows path with no extra plumbing.
🧬 YOLO26 Semantic Segmentation — fine-tuned models + binary head
Following YOLO26's earlier landing, @leeclemnet rounded out the segmentation story this release: fine-tuned YOLO26 sem-seg models are now first-class in inference (#2407, #2419).
🔥 New Workflows blocks
| Block | Type Slug | What it does |
|---|---|---|
| current_time/v1.py | roboflow_core/current_time@v1 |
Inject the current wall-clock time into the workflow graph as a typed step output |
| Vision Events (local mode) | enterprise | Run the Vision Events block in an in-process event-store mode instead of round-tripping through Roboflow infra |
roboflow_core/current_time@v1— by @patricknihranz in #2410. Drop it before any block that needs a timestamp (audit trails, time-windowed aggregations, freshness gates) without writing a custom block.- Vision Events block — local event-store mode (ENT-1192) — by @rvirani1 in #2402. Useful for on-prem and isolated-network deployments where the central event sink isn't reachable.
🧰 Workflow block improvements
A theme this release: a handful of existing blocks gained selector inputs so you can drive their parameters from upstream step outputs instead of hard-coding at the block level.
- GLM-OCR — accepts a selector for
task_type(@nathan-marraccini, #2409). Switch OCR mode dynamically based on prior workflow signals. - Qwen3.5-VL — accepts selectors for
promptandsystem_prompt(@nathan-marraccini, #2408). Compose prompts from prior steps without an intermediate Python block. - NumberInRange operator is now exposed in the Workflow Builder UI (@patricknihranz, #2229) — previously only reachable by hand-editing the YAML.
🌟 Other additions
- Gemini 2.5 native object-detection format is now parsed by
vlm_as_detector, so you can route Gemini 2.5 outputs through the same downstream blocks as any other detector (@dkosowski87, #2400). - Volume support added by @nkuneman in #2413 — see the PR for the mount conventions.
roboflow/inference-server-experimentalimage published (@grzegorz-roboflow, #2406) — an opt-in track for early bits before they hit the main image.
🔒 Security — please review your deployment
This release ships security enhancements for local deployments (#2417 by @PawelPeczek-Roboflow) and, alongside it, a new dedicated documentation page that walks through how to harden a self-hosted Inference server:
👉 inference.roboflow.com/install/security
Important
If you run Inference outside of localhost — in a container, on a shared host, on a private network, or anywhere reachable beyond a single developer machine — please take a few minutes to read the new guide. You own the security posture of your deployment. A default-configured server is adjusted to work in development-friendly mode and should not be deployed as is in production grade environments, due to exposing unauthenticated endpoints and ability to run Custom Python Blocks in Workflows Execution Engine.
The guide covers, in short:
- Restrict network access — bind to localhost, keep on a private network, or place behind a firewall. Never expose the inference port directly to the public internet without authentication and TLS.
- Enforce authentication — use
WORKSPACES_WHITELISTED_FOR_LOCAL_DEPLOYMENTto require valid API keys, or place your own auth layer (OAuth, mTLS) in front. - Enable TLS — terminate HTTPS at a reverse proxy or set
ENABLE_HTTPS=trueon the server itself. - Disable custom Python execution — set
ALLOW_CUSTOM_PYTHON_EXECUTION_IN_WORKFLOWS=falseunless you specifically need it.
If you have a public-facing or multi-tenant deployment, these are not optional. The new docs page is the canonical reference going forward.
🔧 Fixed
- Core models — forward
countinference/service_secretwhen downloading weights by @iurisilvio in #2398 — keeps usage attribution and gated-weights flows working when models are pulled at runtime. - Batch processing fix by @digaobarbosa in #2411.
- Workflows / Data Aggregator — corrected
values_differenceaggregation by @madhavcodez in #2388. First-time contribution — thank you! - Graceful fallback on ephemeral cache failure by @dkosowski87 in #2387 — the cache layer no longer takes the whole request down when its store is unavailable.
- Server-side TTL on model-monitoring zset writes by @bigbitbus in #2390 — model-monitoring entries now expire on the cache server even if a client never cleans up.
🚧 Maintenance
- Bump
inferenceto 1.2.13 by @dkosowski87 in #2396. - Update dependencies to fix main by @PawelPeczek-Roboflow in #2415.
- CI: concurrency cancellation on PR-triggered test workflows by @bigbitbus in #2392 — newer pushes to a PR cancel stale CI runs.
- Updated runtime-compatibility docs by @rafel-roboflow in #2391.
- Docs build sets
LOAD_ENTERPRISE_BLOCKS=TRUEby @rvirani1 in #2386 — enterprise blocks now show up in the rendered docs.
👋 New contributors
A warm welcome to two first-time contributors landing in this release:
- @nkuneman — Volume support (#2413)
- @madhavcodez —
values_differenceaggregation fix in Data Aggregator (#2388)
Full Changelog: v1.2.13...v1.3.0
v1.2.13
What's Changed
- fix(workflows): select v0 API for hosted semantic-segmentation remote execution by @leeclemnet in #2393
- fix(aliases): resolve public yolo26-sem model aliases by @leeclemnet in #2394
- refactor(gemini): remove deprecated model versions and update default to 2.5-flash by @dkosowski87 in #2395
Full Changelog: v1.2.12...v1.2.13
v1.2.12
What's Changed
- fix: emit RLE masks from instance segmentation v4 block by @leeclemnet in #2381
- feat(ent-1188): Brenner camera_focus for mono and multi-channel frames by @NVergunst-ROBO in #2368
- Add EditImageMetadata workflow block (DATAMAN-338) by @digaobarbosa in #2353
- Fix inner workflow dynamic block setup by @dkosowski87 in #2380
- Feat/workflow block runtime compat by @rafel-roboflow in #2374
- Bump the Execution Engine to v1.10.1 by @dkosowski87 in #2383
- Add change to enforce dense representation of instance segmentation mask when used in inference models, enabled by default for old versions of IS block by @PawelPeczek-Roboflow in #2384
- feat(yolo26-sem): YOLO26 semantic segmentation via inference_models (ONNX + TorchScript + TRT) by @leeclemnet in #2372
- Enforce transformers < 5.9 for jetsons 6 and 7 by @PawelPeczek-Roboflow in #2385
- [codex] Add configurable API proxy base URL by @hansent in #2366
Full Changelog: v1.2.11...v1.2.12
v1.2.11
What's Changed
- Update requirements on inference_models by @grzegorz-roboflow in #2370
- fix: serialize TorchScript load/script behind a global lock by @grzegorz-roboflow in #2373
- fix: harden auth middleware against Starlette BadHost (CVE-2026-48710) by @yeldarby in #2375
- fix: drop flat point sentinel in SAM3 visual_segment adapter by @grzegorz-roboflow in #2376
Full Changelog: v1.2.10...v1.2.11
v1.2.10
What's Changed
- Fix wrong step name rewrite when inner and outer workflow have the same step name, and outer step provided in bindings by @dkosowski87 in #2352
- Add Gemini 3.5 Flash and 3.1 Flash-Lite to Gemini block by @Erol444 in #2355
- Only push usage payloads when WebRTC connection was established by @rafel-roboflow in #2359
- Detections list rollup optimizations by @lou-roboflow in #2356
- Port SAM3 from inference/models to inference-models by @grzegorz-roboflow in #1946
- Log serverless request receipt by @hansent in #2303
- Feature/segmentation overlap box by @PawelPeczek-Roboflow in #2362
- Fix RLE mask parsing for instance segmentation remote execution by @SkalskiP in #2361
- eject leaked bf16 autocast from Sam3TrackerPredictor by @grzegorz-roboflow in #2363
- feat(batch-processing-cli): add --max-image-failure-rate flag by @maxschridde1494 in #2360
- Record modal usage with workflow_id as resource_id by @grzegorz-roboflow in #2364
Full Changelog: v1.2.9...v1.2.10
v1.2.9
What's Changed
- Fix/usage collector cache miss by @grzegorz-roboflow in #2345
- Add Qwen3.5 4b, Disable thinking, fix RGB bug by @Matvezy in #2319
- docs: modernize Models tab to use inference-sdk by default by @Erol444 in #2340
- [INC-309] Build CPU and GPU images once and push to both registries by @dkosowski87 in #2326
- fix(dynamic_crop): missing output by @PawelPeczek-Roboflow in #2346
- Fix issue with inference-models docs build by @PawelPeczek-Roboflow in #2347
- Add OpenRouter passthrough + unified Qwen-VL workflow block by @Erol444 in #2330
- YOLOs clamp boxes to image dims (ultralytics parity) by @leeclemnet in #2344
- Jetson dockerfiles: OpenCV with GStreamer on JP62, add BuildKit cache mounts on JP62/JP71, bump JP51 OpenCV by @alexnorell in #2321
- Implement defensive safe-guard against empty class name by @PawelPeczek-Roboflow in #2348
- fix: handle empty SDK request headers by @immanuwell in #2343
- Try to fix inference docs CI by @PawelPeczek-Roboflow in #2342
- chore(logging): add codeql comments for sensitive data handling acros… by @dkosowski87 in #2327
- feat/core: BoT-SORT block by @AlexBodner in #2349
- Fix/lack of init by @PawelPeczek-Roboflow in #2351
New Contributors
- @immanuwell made their first contribution in #2343
- @AlexBodner made their first contribution in #2349
Full Changelog: v1.2.8...v1.2.9
v1.2.8
⚠️ Deprecated
🚫 Gaze (L2CS-Net / MediaPipe) detection
Gaze detection — the roboflow_core/gaze@v1 workflow block, the POST /gaze/gaze_detection HTTP route, and the InferenceHTTPClient.detect_gazes() / detect_gazes_async() SDK helpers — has been deprecated in this release. Calling any of these now raises a new FeatureDeprecatedError that surfaces as HTTP 410 Gone with error_type: "FeatureDeprecatedError".
The Gaze pipeline was built on top of MediaPipe, which has dropped support for parts of the hardware matrix Roboflow ships against (notably some Linux/aarch64 and Jetson configurations no longer have compatible wheels), and a transitive protobuf CVE2026-0994 meant we could no longer carry the dependency alongside the rest of the inference stack.
# Old (still loads, returns 410 at execute time):
from inference_sdk import InferenceHTTPClient
client = InferenceHTTPClient(api_url="...", api_key="...")
client.detect_gazes(inference_input="image.jpg")
# → raises inference_sdk.http.errors.FeatureDeprecatedErrorImportant
The last release supporting Gaze is v1.2.7. If you currently rely on Gaze detection locally — pin to inference==1.2.7 (or the matching Docker image tag) as a short-term bridge while you plan ahead, keeping in mind that vulnerability exists in the build.
Note
The POST /gaze/gaze_detection route remains registered as a 410-Gone stub through end of Q2 2026 so existing client integrations get a structured error rather than a 404. Set CORE_MODEL_GAZE_ENABLED=False to disable it immediately.
If Gaze is important to your workflow and you'd like to discuss bringing it back paired with a different face detector, reach out at support@roboflow.com — we're happy to chat.
PR: #2334
💪 Added
🖼️ Image Stack workflow block
A new Image Stack workflow block lets you accumulate frames across executions — useful for building temporal pipelines, sliding-window inference, or any flow that needs to reason over a buffer of recent images rather than a single frame.
PR: #2307
🤖 OpenAI-compatible LLM block for custom endpoints
A new workflow block that lets you point at any OpenAI-compatible HTTP endpoint — your own self-hosted vLLM/Ollama/LM Studio deployment, a private gateway, or a third-party provider that mirrors the OpenAI API. Combined with the extra_body follow-up, you can also pass provider-specific extensions (reasoning effort, sampling tweaks, custom routing flags) through to the upstream call.
🔐 Opt-in HTTPS for the inference server
The HTTP server now supports terminating TLS directly via SSL environment variables — no reverse proxy required for simple self-hosted setups. Off by default; opt in by setting the relevant SSL env vars.
PR: #2308
🔒 Security & Hardening
- 🚫 Block legacy fine-tuned SAM3 loads — tighten the SAM3 model loading boundary so legacy fine-tuned artefacts that no longer match the supported package shape are rejected up front. #2278
- 📦 OpenTelemetry stack bumped to
1.41.x/0.62b1to clear a transitiveprotobufCVE and unlock alignment with the newerinference-modelsrelease. #2335 - 🛡️ Additional security fixes across the stack. #2336
🚧 Maintenance
- Align rfdetr preproc/postproc to rf-detr-internal by @leeclemnet in #2298
- Fix workflow block display names for acronym segments by @SolomonLake in #2320
- Fix owlv2 / rf-instant forward for best/default confidence modes by @leeclemnet in #2324
- Drop default/best confidence parametrization from OWLv2HF test by @leeclemnet in #2337
- Fix import for Gaze model by @PawelPeczek-Roboflow in #2338
- Constraint
transformerson Jetson with JP5 by @PawelPeczek-Roboflow in #2339 - Update serverless docs to v2 and prune deprecated pages by @Erol444 in #2311
- chore(docs): add GEO/SEO schema, robots.txt, git-revision-date plugin by @Borda in #2312
👋 New Contributors
Full Changelog: v1.2.7...v1.2.8
v1.2.7
💪 Added
- Add structured JSON output task to GLM-OCR workflow block by @Erol444 in #2276
- Add gpt-5.5 to OpenAI workflow block by @Erol444 in #2272
- Add Google Gemma workflow block via OpenRouter by @Erol444 in #2274
- Mask edge snap v1 by @lou-roboflow in #2286
- Add Qwen 3.5, Qwen 3.6, and MoonshotAI Kimi workflow blocks via OpenRouter by @Erol444 in #2280
- Add Per-Class Confidence Filter workflow block by @dcaroboflow in #2283
🔌 Improved
- Accelerate mask conversion to RLE by @sergii-bond in #2269
📖 Documentation improvements
🔧 Bug fixes
- Fix naming of github workflow by @dkosowski87 in #2268
- Fix spec for workflows to 1.9.0 by @dkosowski87 in #2267
- Fix missing CI *.so libs by @PawelPeczek-Roboflow in #2294
- Fix: widen confidence type in legacy HTTP API by @leeclemnet in #2281
- Fix lambda builds by @PawelPeczek-Roboflow in #2301
- Fix workflow iframe embed collapsing on resize by @Erol444 in #2295
- Catch RoboflowAPITimeoutError and RoboflowAPIConnectionError in pipeline init by @jeku46 in #2285
- Forward service_secret and countinference from InferenceConfig to v1 API by @leeclemnet in #2282
- [codex] Align preloaded model aliases by @hansent in #2284
- perf(polygon-annotator): crop mask to bbox before findContours to reduce scan area by @jeku46 in #2275
🚧 Maintenance
- Make linters happy by @PawelPeczek-Roboflow in #2273
- Try to address security vulnerabilities with versions bump by @PawelPeczek-Roboflow in #2290
- npm dependency updates (theme_build, landing) by @grzegorz-roboflow in #2293
- bump postcss to 8.5.10 (security) by @grzegorz-roboflow in #2299
- npm update inference/landing by @grzegorz-roboflow in #2300
- Resolve remaining vulnerabilities by @PawelPeczek-Roboflow in #2297
🏅 New Contributors
- @dcaroboflow made their first contribution in #2283
Full Changelog: v1.2.5...v1.2.7
v1.2.6
What's Changed
- Fix: widen confidence type in legacy HTTP API by @leeclemnet in #2281
Full Changelog: v1.2.5...v1.2.6