
fix(image-redactor): return rendered image when no text is detected in verify#2040

Open
AlexanderSanin wants to merge 3 commits into
microsoft:mainfrom
AlexanderSanin:fix/verify-empty-image-no-text-detected


@AlexanderSanin

Summary

  • Bug fixed: DicomImagePiiVerifyEngine.verify_dicom_instance() returned an empty/blank image when the DICOM file contained no burnt-in text PII (closes #1034, "Returned verification image is empty if no text is detected").
  • Root cause: ImageAnalyzerEngine.add_custom_bboxes() had an early-return branch for len(bboxes) == 0 that skipped the matplotlib rendering pipeline and returned the raw PIL image directly. This also ignored the use_greyscale_cmap parameter, so greyscale DICOM images were never rendered with cmap="gray".
  • Fix: Removed the early-return branch. An empty for-loop over bboxes is a no-op, so both the zero-bbox and non-zero-bbox paths now share the same matplotlib rendering code path, producing a consistent return type.
  • Bonus: Added plt.close(fig) after rendering to release the matplotlib figure and prevent a memory leak that existed in both paths.
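
As a rough illustration of the points above, the following is a minimal sketch of a single rendering path that handles both empty and non-empty bbox lists and closes the figure afterwards. It is not the Presidio source: `render_with_bboxes` is a hypothetical stand-in for `add_custom_bboxes`, and the bbox dictionary keys (`left`, `top`, `width`, `height`) are assumed for the example.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.patches as patches
import matplotlib.pyplot as plt
from PIL import Image


def render_with_bboxes(image, bboxes, use_greyscale_cmap=False):
    """Hypothetical stand-in for add_custom_bboxes (assumed shape, not the real code)."""
    fig, ax = plt.subplots()
    try:
        # cmap only affects single-channel (greyscale) input.
        ax.imshow(image, cmap="gray" if use_greyscale_cmap else None)
        for box in bboxes:  # zero iterations when bboxes is empty
            ax.add_patch(
                patches.Rectangle(
                    (box["left"], box["top"]),
                    box["width"],
                    box["height"],
                    linewidth=1,
                    edgecolor="red",
                    facecolor="none",
                )
            )
        ax.set_axis_off()
        buffer = io.BytesIO()
        fig.savefig(buffer, format="png")
        buffer.seek(0)
        rendered = Image.open(buffer)
        rendered.load()  # read the pixel data while the buffer is still alive
    finally:
        plt.close(fig)  # release the figure whether or not rendering succeeded
    return rendered
```

Calling `render_with_bboxes(Image.new("L", (40, 30), 128), [], use_greyscale_cmap=True)` goes through the same code as a call with boxes, and `plt.get_fignums()` reports no additional open figure afterwards.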

Test plan

  • Added two new parametrised cases to test_add_custom_bboxes_happy_path in tests/test_image_analyzer_engine.py:
    • Empty bboxes list with use_greyscale_cmap=False
    • Empty bboxes list with use_greyscale_cmap=True
    • Both assert a non-None PIL.Image.Image of the correct dimensions is returned.
  • All 56 existing unit tests in test_image_analyzer_engine.py and test_dicom_image_pii_verify_engine.py continue to pass.

…n verify

When `add_custom_bboxes` was called with an empty bboxes list (i.e., no
text detected in the DICOM image), it returned the raw PIL image directly,
bypassing the matplotlib rendering pipeline. This caused two problems:

1. The `use_greyscale_cmap` parameter was ignored, so greyscale DICOM
   images were not rendered with `cmap="gray"`, resulting in a blank or
   colour-mapped image instead of the expected greyscale view.

2. The returned type was an unrendered PIL Image rather than the
   matplotlib-figure-derived PIL Image returned in every other code path,
   leading to inconsistent output and visually empty verification images.

Fix: remove the early-return branch for `len(bboxes) == 0`. The `for`
loop over bboxes is a no-op when the list is empty, so both paths now
share the same matplotlib rendering code. Also add `plt.close(fig)` to
release the figure after rendering to prevent a memory leak.

Add two new parametrised test cases (empty bboxes, greyscale and RGB)
that assert a valid, non-None PIL Image of the correct dimensions is
returned.

Closes microsoft#1034

Signed-off-by: Oleksandr Sanin <alexaaander.sanin@gmail.com>
@AlexanderSanin
Author

Hey @omri374 @HammadSiddiqui @max-tarlov-infinitusai, could you please take a look at this?

@AlexanderSanin
Author

@AlexanderSanin please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree

@SharonHart
Contributor

@AlexanderSanin can you take a look at the failing tests?

…oxes found

The previous attempt always routed through matplotlib for empty bboxes,
which broke the integration test
(test_given_image_without_text_and_pii_verify_then_image_does_not_change)
because the RGBA matplotlib output is not pixel-identical to the original
RGB input after resize.

Targeted fix: only bypass the raw-image-return path when
use_greyscale_cmap=True (i.e. greyscale DICOM images).  For those,
ax.imshow must receive cmap="gray" and go through the full matplotlib
rendering pipeline so the image is displayed correctly instead of as a
blank/wrong-colourmap image (the root cause of microsoft#1034).  For regular RGB
images with no bboxes the original PIL image is returned directly,
preserving backward compatibility and keeping the integration test green.

Also guard the pixel-colour loop in test_add_custom_bboxes_happy_path so
it only runs when there are bboxes (the loop expects multi-channel pixels
which a mode-L image does not provide).

Signed-off-by: Oleksandr Sanin <alexaaander.sanin@gmail.com>
Copilot AI review requested due to automatic review settings May 29, 2026 09:47
@AlexanderSanin
Author

@SharonHart Thanks for the heads-up! The integration test failure was caused by my first attempt always routing through matplotlib even for RGB images with no bboxes, which made the returned image pixel-incompatible with the original.

I've pushed a corrected fix:

  • RGB images with no bboxes (e.g. no_ocr.png) → return image_custom directly, preserving the original behavior and keeping the integration test green.
  • Greyscale images with no bboxes (the DICOM case from #1034, "Returned verification image is empty if no text is detected") → render through matplotlib with cmap="gray", which is what fixes the blank/wrong-colormap display bug.
  • Also added plt.close(fig) in all paths to prevent a matplotlib figure memory leak.
  • Updated the unit test to guard the per-pixel color loop so it only runs when bboxes are present (mode-"L" images have scalar pixels, not multi-channel tuples).
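
For illustration, the no-bbox branching described in the list above could be sketched roughly as follows. This is an assumption-laden sketch rather than the actual diff: `verify_image_without_bboxes` and `_render_via_matplotlib` are hypothetical names, and only the empty-bboxes case is modelled.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from PIL import Image


def _render_via_matplotlib(image, use_greyscale_cmap):
    """Hypothetical helper: draw the image on a figure and return it as a PIL image."""
    fig, ax = plt.subplots()
    try:
        ax.imshow(image, cmap="gray" if use_greyscale_cmap else None)
        ax.set_axis_off()
        buffer = io.BytesIO()
        fig.savefig(buffer, format="png")
        buffer.seek(0)
        rendered = Image.open(buffer)
        rendered.load()
    finally:
        plt.close(fig)  # closed on every path to avoid leaking figures
    return rendered


def verify_image_without_bboxes(image, use_greyscale_cmap=False):
    """Hypothetical sketch of the empty-bboxes case only."""
    if not use_greyscale_cmap:
        # RGB case: hand back the original object, so it stays pixel-identical.
        return image
    # Greyscale case: render with cmap="gray" so the image displays correctly.
    return _render_via_matplotlib(image, use_greyscale_cmap=True)


rgb = Image.new("RGB", (8, 8), color=(10, 20, 30))
grey = Image.new("L", (8, 8), color=128)

same_object = verify_image_without_bboxes(rgb) is rgb
rendered_grey = verify_image_without_bboxes(grey, use_greyscale_cmap=True)

# Mode "L" pixels are scalars, while RGB pixels are tuples, which is why a
# per-pixel colour loop written for multi-channel images needs a guard.
rgb_pixel = rgb.getpixel((0, 0))
grey_pixel = grey.getpixel((0, 0))
```

Here `same_object` is true because the RGB input is returned untouched, while `rendered_grey` is a new image produced by matplotlib; `rgb_pixel` is a 3-tuple and `grey_pixel` is a plain integer.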

The Analyzer test failures are CI timeouts during dependency install — unrelated to this change.


Development

Successfully merging this pull request may close these issues.

Returned verification image is empty if no text is detected

2 participants