fix(safaa): warn when threshold argument cannot be honored #52
Open
Valyrian-Code wants to merge 1 commit into
SafaaAgent.predict accepts a `threshold` parameter, but only the predict_proba branch consults it. The shipped SGD classifier uses loss='hinge' and therefore has no predict_proba (sklearn's @available_if descriptor raises AttributeError on access), so the hasattr() check is False at runtime and execution falls to the binary predict() path that ignores the threshold entirely. Callers tuning the sensitivity get the default SVM decision boundary with no indication that their argument had no effect.

This was working when the original model was trained with a probability-supporting loss; the regression slipped in when the SGD(hinge) model replaced it.

Emit a UserWarning when a non-default threshold is passed but the loaded model cannot honor it, and document the constraint in the docstring. Default usage stays warning-free.

Tests cover:
- no warning at default threshold
- warning at non-default threshold
- warning text mentions predict_proba
- warnings at extreme threshold values
- predictions remain valid alongside the warning
- a monkeypatched fake classifier proves the threshold actually controls output when predict_proba is available (with a boundary-inclusive check)

Signed-off-by: RAJVEER42 <irajveer.bishnoi2310@gmail.com>
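The dispatch described in this commit message can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code in `Safaa.py`: `HingeLikeModel`, `predict_with_threshold`, the toy labelling rule, and the warning text are all hypothetical names chosen for the example, and the default of `0.5` is taken from the `threshold != 0.5` check mentioned in the PR description.

```python
import warnings

DEFAULT_THRESHOLD = 0.5  # assumed default, taken from the PR's `threshold != 0.5` check


class HingeLikeModel:
    """Hypothetical stand-in for a classifier that has no predict_proba."""

    def predict(self, texts):
        # Toy rule for illustration only: label 1 when "copyright" appears.
        return [1 if "copyright" in t.lower() else 0 for t in texts]


def predict_with_threshold(model, texts, threshold=DEFAULT_THRESHOLD):
    """Apply `threshold` when the model allows it; warn when it cannot be honored."""
    if hasattr(model, "predict_proba"):
        # Probability path: index 1 of each row is treated as the positive-class score.
        return [1 if row[1] >= threshold else 0 for row in model.predict_proba(texts)]
    if threshold != DEFAULT_THRESHOLD:
        # Binary path: the caller asked for a custom threshold that has no effect here.
        warnings.warn(
            "threshold is ignored: the loaded model has no predict_proba",
            UserWarning,
            stacklevel=2,
        )
    return list(model.predict(texts))
```

With a model like `HingeLikeModel`, calling `predict_with_threshold(model, texts)` stays silent, while passing `threshold=0.9` returns the same binary labels and emits one `UserWarning`.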
Author
Hi @GMishx and @Kaushl2208! I opened a small PR with a bug fix and a regression test. I'd appreciate a review whenever you have time. Thanks for maintaining the project!

Description
`SafaaAgent.predict` documents a configurable `threshold` parameter, but the shipped classifier (`SGDClassifier(loss='hinge')`) does not implement `predict_proba`. At runtime, the `hasattr()` check for `predict_proba` returns `False`, execution falls through to the binary `predict()` branch, and `threshold` is silently ignored.

This PR emits a `UserWarning` when a non-default threshold is passed to a model that cannot honor it, and documents the limitation in the docstring. Default usage remains warning-free, so existing callers are unaffected.

Changes
[Safaa.py](https://github.com/fossology/safaa/blob/main/Safaa/src/safaa/Safaa.py?utm_source=chatgpt.com)
- Add `import warnings`
- Emit a `UserWarning` when `threshold != 0.5` but the loaded model lacks `predict_proba`
- Update the `predict()` docstring to document the constraint and workaround (`loss='log_loss'` or `'modified_huber'`)

`tests/__init__.py`, `tests/test_safaa.py`

[pyproject.toml](https://github.com/fossology/safaa/blob/main/pyproject.toml?utm_source=chatgpt.com)
- Add `pytest` to dev dependencies

Test coverage
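`TestPredictThreshold` verifies:
- A warning is emitted for non-default thresholds (`0.0`, `1.0`, `0.99`, `0.01`)
- The warning message mentions `predict_proba`
- The default `threshold=0.5` remains silent
- The threshold controls the output when `predict_proba` is available
- The `>=` threshold boundary is inclusive

How to test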
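As a rough illustration of the last two coverage items (threshold control and the inclusive `>=` boundary), the checks might look like the sketch below. It is not the contents of `tests/test_safaa.py`: `FakeProbaModel`, `labels_from_proba`, and the scores used are hypothetical, and the fake ignores its input texts entirely.

```python
class FakeProbaModel:
    """Hypothetical classifier exposing predict_proba with fixed scores."""

    def __init__(self, positive_scores):
        self.positive_scores = positive_scores

    def predict_proba(self, texts):
        # One [negative, positive] row per score; the texts themselves are ignored.
        return [[1.0 - p, p] for p in self.positive_scores]


def labels_from_proba(model, texts, threshold=0.5):
    """Label a text 1 when its positive-class score is >= threshold."""
    return [1 if row[1] >= threshold else 0 for row in model.predict_proba(texts)]


def check_threshold_controls_output():
    model = FakeProbaModel([0.2, 0.5, 0.8])
    texts = ["a", "b", "c"]
    # A score exactly equal to the threshold counts as positive (inclusive boundary).
    assert labels_from_proba(model, texts, threshold=0.5) == [0, 1, 1]
    assert labels_from_proba(model, texts, threshold=0.8) == [0, 0, 1]
    # Raising the threshold above every score turns all labels negative.
    assert labels_from_proba(model, texts, threshold=0.9) == [0, 0, 0]


check_threshold_controls_output()
```

Using fixed scores keeps the check independent of any trained model, which is the same motivation the commit message gives for monkeypatching a fake classifier.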
Manual reproduction
Output:
This closes #51.