fix(feature): avoid IndexError when parsing a dtype with a trailing dot #3731

Open
devteamaegis wants to merge 1 commit into laminlabs:main from devteamaegis:fix/parse-nested-brackets-trailing-dot
@devteamaegis

What's broken

parse_nested_brackets crashes with IndexError: string index out of range on any dtype segment ending in a dot, e.g. "bionty.". It's reachable from the public parse_dtype via a malformed categorical dtype:

parse_dtype("cat[bionty.]")
  File ".../lamindb/models/feature.py", line 300, in parse_nested_brackets
    if len(parts) == 2 and parts[1][0].isupper():
IndexError: string index out of range

Why it happens

After dtype_str.split("."), the code checks len(parts) == 2 but then reads parts[1][0] without checking that parts[1] is non-empty — a trailing dot makes it "".
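The failure can be reproduced without lamindb, because it depends only on how `str.split` treats a trailing separator. A minimal sketch in plain Python follows; the variable names mirror the traceback, but this is not the library's code.

```python
# Standalone reproduction of the failure mode; no lamindb import needed.
dtype_str = "bionty."
parts = dtype_str.split(".")
print(parts)  # ['bionty', '']

# The length check passes, so the unguarded index is reached.
assert len(parts) == 2

try:
    parts[1][0].isupper()
except IndexError as exc:
    error_message = str(exc)
    print(error_message)  # string index out of range
```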

Fix

Guard parts[1] != "" before indexing. The malformed input now falls through to the bare-registry branch, so parse_dtype("cat[bionty.]") raises a clear ValidationError ("invalid dtype") instead of an opaque IndexError.
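A minimal sketch of the guard, using a simplified stand-in for the relevant branch. The helper name `looks_like_module_dot_registry` is hypothetical and not part of lamindb; only the `len(parts) == 2` and `parts[1][0].isupper()` checks are taken from the traceback above.

```python
def looks_like_module_dot_registry(dtype_str: str) -> bool:
    """Hypothetical helper: True for strings shaped like 'module.Registry'."""
    parts = dtype_str.split(".")
    # Check that the second segment is non-empty before indexing into it.
    return len(parts) == 2 and parts[1] != "" and parts[1][0].isupper()


print(looks_like_module_dot_registry("bionty.CellType"))  # True
print(looks_like_module_dot_registry("bionty."))          # False, no IndexError
print(looks_like_module_dot_registry("bionty.celltype"))  # False
```

An equivalent spelling is `parts[1][:1].isupper()`: slicing an empty string returns `""`, and `"".isupper()` is `False`, so no explicit emptiness check is needed.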

Test

test_parse_nested_brackets_trailing_dot checks that "bionty." no longer crashes parse_nested_brackets and that parse_dtype("cat[bionty.]") raises ValidationError.

parse_nested_brackets crashed with 'IndexError: string index out of
range' on inputs like 'bionty.' (e.g. via parse_dtype('cat[bionty.]'))
because it indexed parts[1][0] without checking the segment was
non-empty. Guard the access so malformed dtypes raise a clear
ValidationError instead.
@falexwolf

Member

Thank you very much for the contribution! Will merge a version of this.

Given you ran into this: Have you considered passing Python objects instead of strings?

@falexwolf

Member

I think what we should probably do is reason through a few more cases in which strings that a user passes could raise opaque errors.

This fix is good, but it feels a bit ad hoc; the testing strategy also needs some thought.

Thank you anyway, we'll mull over this for a bit more.

@devteamaegis devteamaegis force-pushed the fix/parse-nested-brackets-trailing-dot branch from e3fdb16 to c94ed3f Compare June 11, 2026 19:44