feat: YOLO object detection support (v8/9/10/11/12/26, TFLite, AutoNmsCalculator)#6279

Open
luolingfengflare wants to merge 15 commits into google-ai-edge:master from luolingfengflare:dev/extend-capabilities
@luolingfengflare
Summary

  • TfLiteModelMetadata proto — new side packet emitted by TfLiteInferenceCalculator after Open(), carrying input/output tensor shapes, types, and quantization params for downstream consumers
  • YoloTensorsToDetectionsCalculator — decodes both Family A (Ultralytics cx/cy/w/h + argmax, v8/9/11/12) and Family B (end-to-end x1/y1/x2/y2/score/class_id, v10-e2e/v26-e2e) with AUTO shape inference from tensor dims
  • AutoNmsCalculator — reads MODEL_METADATA or resolves at graph build time; zero-cost NMS pass-through for end-to-end models, greedy IoU NMS for Ultralytics models
  • YoloObjectDetectorGraph — full ModelTaskGraph wiring ImagePreprocessingGraph → InferenceCalculator → YoloTensorsToDetectionsCalculator → AutoNmsCalculator → DetectionProjection/Transformation
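
As an illustration of the two decode paths listed above, here is a minimal Python sketch (not the calculator's C++ code). It assumes each candidate arrives as one flat row, boxes-first; the `Detection` type and both function names are hypothetical, and score thresholding is left out.

```python
from typing import NamedTuple, Sequence


class Detection(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    class_id: int


def decode_family_a(row: Sequence[float]) -> Detection:
    """Family A row (assumed layout): [cx, cy, w, h, s_0, ..., s_{C-1}]."""
    cx, cy, w, h = row[:4]
    class_scores = row[4:]
    # argmax over the per-class scores selects the label.
    class_id = max(range(len(class_scores)), key=lambda i: class_scores[i])
    # Convert center/size to corner coordinates.
    return Detection(cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2,
                     class_scores[class_id], class_id)


def decode_family_b(row: Sequence[float]) -> Detection:
    """Family B row: [x1, y1, x2, y2, score, class_id], already post-NMS."""
    x1, y1, x2, y2, score, class_id = row
    return Detection(x1, y1, x2, y2, score, int(class_id))
```

Family A rows still need NMS afterwards; Family B rows are final detections.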

Architecture

The two YOLO output families are handled by a single decode_mode enum (AUTO, ULTRALYTICS_DETECTION_HEAD, END_TO_END). In AUTO mode the calculator infers the family at runtime from the output tensor shape (dim==6 → END_TO_END). NMS is then applied or skipped accordingly, so models that already include NMS pay no extra NMS cost.
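
A rough Python sketch of the AUTO rule described above. It assumes the check applies to the last non-singleton output dimension (the description only says dim==6); the enum's numeric values and the helper names are hypothetical.

```python
from enum import Enum
from typing import Sequence


class DecodeMode(Enum):
    AUTO = 0
    ULTRALYTICS_DETECTION_HEAD = 1
    END_TO_END = 2


def infer_decode_mode(output_dims: Sequence[int]) -> DecodeMode:
    """Resolve AUTO to a concrete mode from the output tensor dims."""
    # Drop singleton dims (e.g. batch == 1) before inspecting the shape.
    dims = [d for d in output_dims if d != 1]
    # Assumption: six values per detection in the last dim marks an
    # end-to-end head (x1, y1, x2, y2, score, class_id).
    if dims and dims[-1] == 6:
        return DecodeMode.END_TO_END
    return DecodeMode.ULTRALYTICS_DETECTION_HEAD


def needs_nms(mode: DecodeMode) -> bool:
    """Only the Ultralytics detection head still requires NMS."""
    return mode is DecodeMode.ULTRALYTICS_DETECTION_HEAD
```

For example, `infer_decode_mode([1, 300, 6])` resolves to `END_TO_END`, while `[1, 84, 8400]` falls through to `ULTRALYTICS_DETECTION_HEAD`.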

Test Plan

  • YoloTensorsToDetectionsCalculator: 9 unit tests (Family A in FEATURES_FIRST and BOXES_FIRST layouts, Family B, AUTO detection, INT8 quantized, error paths)
  • AutoNmsCalculator: 5 unit tests (metadata-driven skip, metadata-driven apply, no-metadata fallback, explicit mode, empty input)
  • YoloObjectDetectorGraph utils: 8 unit tests (InferDecodeMode, ExtractModelInputShape/OutputDims logic)
  • Integration tests: DISABLED pending TFLite model files in testdata/. To enable, place yolov8n_float32.tflite and yolo26n_e2e_float32.tflite there and remove the DISABLED_ prefix
  • Bazel build verification: blocked by pre-existing macOS/zlib toolchain issue in this environment

Future PRs

  • Multi-backend inference (ONNX/PyTorch/TensorRT) via InferenceRunnerFactory
  • Batch inference (B > 1) — MODEL_METADATA.inputs[0].shape[0] already carries B
  • SAHI / tiling
  • Python Task API wrapper
  • ByteTrack / BoTSORT tracking (consumes DETECTIONS output)

🤖 Generated with Claude Code

magiclingfeng and others added 15 commits April 24, 2026 09:07
Covers v8/9/10/11/12/26 support with TFLite backend, TfLiteInferenceCalculator
MODEL_METADATA enhancement, AutoNmsCalculator for zero-cost NMS bypass on
end-to-end models, and YoloObjectDetectorGraph architecture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers TfLiteModelMetadata proto, TfLiteInferenceCalculator enhancement,
YoloTensorsToDetectionsCalculator, AutoNmsCalculator, and
YoloObjectDetectorGraph with full test scaffolding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add optional MODEL_METADATA output side packet to TfLiteInferenceCalculator
that emits tensor names, shapes, types, and quantization parameters after
model load and delegate initialization are complete.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
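
To illustrate what a downstream consumer might read from such a side packet, here is a hedged Python model of the metadata. The `inputs` and `shape` names follow the PR description (`MODEL_METADATA.inputs[0].shape[0]`); every other field and type name is a hypothetical stand-in for the actual proto.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TensorInfo:
    name: str
    shape: List[int]
    dtype: str               # hypothetical: tensor type as a string
    scale: float = 0.0       # hypothetical quantization fields
    zero_point: int = 0


@dataclass
class ModelMetadata:
    inputs: List[TensorInfo] = field(default_factory=list)
    outputs: List[TensorInfo] = field(default_factory=list)


def batch_size(meta: ModelMetadata) -> int:
    """Batch dimension B, read from the first input tensor's shape."""
    return meta.inputs[0].shape[0]
```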
…or MODEL_METADATA

- Use MP_ASSERT_OK_AND_ASSIGN for GetOutputSidePacket (returns StatusOr<Packet>)
- Guard metadata emit block with interpreter_ != nullptr to prevent null dereference on advanced GPU path
- Add ABSL_CHECK for null tensor pointer returned by interpreter_->tensor()
- Add linkstatic = 1 to metadata test target to match sibling test target

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s comment

- Fix Auto_DetectsUltralyticsFromShape: shape [7,5] resolves to BOXES_FIRST
  (not FEATURES_FIRST); correct data population and assert label_id==0
- Add Int8_EndToEnd_QuantizationScaleOverride test covering INT8 quantized
  tensor path with scale=1.0/zero_point=0
- Add MismatchedQuantizationOverride_ReturnsError test asserting Open() fails
  with kInvalidArgument when scale is set without zero_point
- Add bbox assertions for dets[1] in FamilyB_EndToEnd_TwoDetections
- Add 512-class ceiling comment in ResolveTensorLayout explaining the AUTO
  heuristic limit and requirement for explicit layout on >512-class models

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
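
The quantization override behavior exercised by these tests can be sketched in Python as follows. This is an illustration, not the calculator's code: `ValueError` stands in for kInvalidArgument, the function names are hypothetical, and rejecting zero_point without scale is an assumption (the commit only mentions scale without zero_point). The dequantization formula is the standard affine one, real = scale * (q - zero_point).

```python
from typing import List, Optional, Sequence, Tuple


def resolve_quantization(
    scale_override: Optional[float],
    zero_point_override: Optional[int],
    tensor_scale: float,
    tensor_zero_point: int,
) -> Tuple[float, int]:
    """Pick override params if both are set, else the tensor's own params."""
    # A half-specified override is rejected rather than silently mixed
    # with the tensor's own parameters.
    if (scale_override is None) != (zero_point_override is None):
        raise ValueError("scale and zero_point overrides must be set together")
    if scale_override is None:
        return tensor_scale, tensor_zero_point
    return scale_override, zero_point_override


def dequantize(values: Sequence[int], scale: float, zero_point: int) -> List[float]:
    """Affine dequantization: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in values]
```

With scale=1.0 and zero_point=0, as in the INT8 test above, dequantization leaves the integer values numerically unchanged.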
… models)

Reads MODEL_METADATA side packet in Open(), sets skip_nms_ once (dim==6 →
skip), then either forwards detections unchanged or applies greedy IoU NMS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
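
A minimal Python sketch of the skip-or-suppress behavior described in this commit. It assumes class-agnostic suppression and a default IoU threshold of 0.5, neither of which is stated in the PR; the function names and tuple layout are hypothetical.

```python
from typing import List, Sequence, Tuple

# (x1, y1, x2, y2, score)
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def auto_nms(dets: Sequence[Box], skip_nms: bool,
             iou_threshold: float = 0.5) -> List[Box]:
    """Forward detections unchanged, or apply greedy IoU NMS."""
    if skip_nms:
        # End-to-end models already ran NMS: pure pass-through.
        return list(dets)
    kept: List[Box] = []
    # Greedy: visit boxes by descending score, keep a box only if it does
    # not overlap any already-kept box above the threshold.
    for det in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) <= iou_threshold for k in kept):
            kept.append(det)
    return kept
```

Deciding `skip_nms` once in Open() keeps the per-frame path to a single branch.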
Adds the options proto defining DecodeMode/TensorLayout/PostprocessMode
enums, and standalone graph utils (ExtractModelInputShape,
InferDecodeMode) with unit tests covering singleton-squeeze and
end-to-end vs ultralytics heuristic.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@google-cla
google-cla Bot commented Apr 25, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
