Commit 684e738
python3-chardet: update to 7.4.1

Update package to 7.4.1.

Changes since 7.2.0:
7.3.0:
- License changed from MIT to 0BSD (no attribution required)
- New mime_type field in all detection results -- identifies binary and text
  file types via magic number matching (40+ formats supported)
- Performance: 4 additional modules compiled with mypyc; per-file detection
  capped at 16 KB (worst-case time: 62ms -> 26ms)
- Added riscv64 prebuilt wheel support
- Bug fix: null-separated ASCII data was misdetected as UTF-16-BE
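
For readers unfamiliar with magic number matching, the idea behind the new mime_type field can be sketched roughly as follows. This is only an illustration: the signature table, the `sniff_mime_type` name, and the fallback value are assumptions made for this example, not chardet's actual implementation or API.

```python
# Hypothetical, heavily abbreviated signature table (chardet reportedly
# supports 40+ formats; these four are well-known file signatures).
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
    (b"\x1f\x8b", "application/gzip"),
]


def sniff_mime_type(data: bytes, default: str = "application/octet-stream") -> str:
    """Return the MIME type whose magic number prefixes `data`, else `default`."""
    for magic, mime in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return mime
    return default


print(sniff_mime_type(b"%PDF-1.7\n..."))  # application/pdf
```
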
7.4.0:
- Accuracy improved from 98.6% to 99.3%; speed improved with new dense
  zlib-compressed model format (cold start: ~75ms -> ~13ms with mypyc)
- Training data overhauled: added MADLAD-400 and Wikipedia sources,
  eliminated train/test overlap, samples increased from 15K to 25K per
  language/encoding pair
- Bug fix: dedicated structural analyzers added for CP932, CP949, and
  Big5-HKSCS (previously sharing base encoding byte-range analyzer)
7.4.1:
- Bug fix: BOM-prefixed UTF-16/32 input now correctly returns utf-16/utf-32
  instead of endian-specific variants (utf-16-le/utf-16-be/etc.), which
  previously caused a stray U+FEFF character at the start of decoded text
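
The stray U+FEFF described in the 7.4.1 fix can be reproduced with Python's standard codecs alone, without chardet. The sketch below uses a hypothetical helper, `decode_both`, to show why the generic codec name is the right answer for BOM-prefixed input: the endian-specific decoder treats the BOM as an ordinary character, while the generic one consumes it.

```python
import codecs


def decode_both(raw: bytes) -> tuple[str, str]:
    """Decode BOM-prefixed little-endian UTF-16 bytes two ways.

    Returns (endian_specific, generic): the first keeps the BOM as a
    leading U+FEFF character, the second strips it.
    """
    return raw.decode("utf-16-le"), raw.decode("utf-16")


text = "hello"
raw = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

endian_specific, generic = decode_both(raw)
print(repr(endian_specific))  # '\ufeffhello'
print(repr(generic))          # 'hello'
```

The same behaviour applies to UTF-32: decoding BOM-prefixed bytes with "utf-32" strips the BOM, while "utf-32-le" or "utf-32-be" leaves it in the text.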
Signed-off-by: Alexandru Ardelean <alex@shruggie.ro>
1 parent 1616acb commit 684e738
1 file changed
Lines changed: 2 additions & 2 deletions