[Quality-547] Update bert_tiny model to the retrained classifier#10529
evelyn-with-warp wants to merge 1 commit into master
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Powered by Oz
Overview
This PR swaps the bert_tiny.onnx model with a retrained version (bert_tiny_2026May.onnx) to reduce input-classifier misfires reported in Quality-547. The change renames the binary model file and updates the single path reference in Model::model_path(). The model is embedded at compile time via RustEmbed, and no other code references the old filename.
Concerns
- The model file rename bakes the training date into the filename (bert_tiny_2026May.onnx). This is fine as a one-off, but if models are swapped regularly a versioning scheme (e.g. a monotonic version number or hash suffix) would be easier to maintain. This is non-blocking.
- The PR description sentence "Simply use retrained classifier to replace the previously" appears truncated. Minor, but worth fixing for posterity.
Security
No security concerns. The change swaps a binary model asset and updates a static path literal. No new inputs, dependencies, secrets, or network surfaces are introduced.
Verdict
Found: 0 critical, 0 important, 1 suggestions
Approve
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
  fn model_path(&self) -> &'static str {
      match self {
-         Model::BertTiny => "bert_tiny.onnx",
+         Model::BertTiny => "bert_tiny_2026May.onnx",
💡 [SUGGESTION] Consider a versioning scheme that doesn't encode calendar dates (e.g. bert_tiny_v2.onnx). Date-suffixed names become ambiguous when multiple retrains happen in the same month and make it harder to correlate with training metadata.
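As a rough illustration of the suggestion above, the filename could be derived from an explicit version number rather than a date. This is only a sketch under assumptions: the `version` and `base_name` helpers are hypothetical and not part of this PR, the version value 2 simply mirrors the bert_tiny_v2.onnx example, and `model_path` here returns a `String` instead of the `&'static str` used in the actual code.

```rust
/// Stand-in for the enum in the diff; only the variant name comes from the PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Model {
    BertTiny,
}

impl Model {
    /// Hypothetical helper: monotonic version of the embedded weights,
    /// bumped by hand on every retrain.
    pub const fn version(&self) -> u32 {
        match self {
            Model::BertTiny => 2,
        }
    }

    /// Hypothetical helper: filename stem without version or extension.
    pub const fn base_name(&self) -> &'static str {
        match self {
            Model::BertTiny => "bert_tiny",
        }
    }

    /// Builds the asset name from the stem and version, e.g. "bert_tiny_v2.onnx".
    /// Note: the real method returns `&'static str`; a `String` is used here
    /// only to keep the sketch short.
    pub fn model_path(&self) -> String {
        format!("{}_v{}.onnx", self.base_name(), self.version())
    }
}
```

With this shape, a retrain would only change the number returned by `version`, and calling `Model::BertTiny.model_path()` would yield the new filename. The embedded asset would still need to be renamed to match.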
@szgupta @vorporeal is this a real concern (given that we won't refresh this often)? If so, could you suggest a proper model versioning strategy for the client app?
Description
Simply use the retrained classifier to replace the previous one.
The new classifier has the same size as the previous one; it is just further fine-tuned on prod data, with user quickback as the misfire ground-truth label.
It could resolve the misfires reported in Quality-547. For the file-path misfires in Quality-543, some are still pending and are to be resolved by improving heuristics.
Linked Issue
ready-to-spec or ready-to-implement.
Testing
./script/run
Screenshots / Videos
https://www.loom.com/share/94f99078f8614b2bbc05eab7f8d1a7d1
In this video, we tested the misfires listed in Quality-547.
Agent Mode
Reviewers:
@szgupta @vorporeal