Java API: invalid file path crashes batch processing (CLI handles gracefully) #430

@bundolee

Problem

When using OpenDataLoaderPDF.processFile in a loop, an invalid file path throws an exception and stops the entire batch. The CLI handles this gracefully, but the Java API does not.

Originally reported in #375 by @pavanpai769.

Reproduction

// invalid.pdf is listed first so that the failure prevents valid.pdf from being processed
for (String pdf : new String[]{"invalid.pdf", "valid.pdf"}) {
    OpenDataLoaderPDF.processFile(pdf, config);
}

If invalid.pdf doesn't exist, the exception terminates the loop, and any files that come after it in the batch are never processed.

Expected behavior

The API should throw a clear exception (e.g. IllegalArgumentException for invalid paths) that callers can catch in order to skip the offending file. It should not swallow errors silently. The existing throws IOException contract must be preserved.
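
To illustrate the "catch and skip" pattern on the caller side, here is a minimal sketch. It does not call the real library: FileProcessor is a hypothetical stand-in for a call such as OpenDataLoaderPDF.processFile(pdf, config), and BatchSketch, processAll, and the fake processor in main are names invented for this example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class BatchSketch {

    /** Hypothetical stand-in for a call such as OpenDataLoaderPDF.processFile(pdf, config). */
    @FunctionalInterface
    interface FileProcessor {
        void process(String path) throws IOException;
    }

    /** Processes every path in order, skipping failures; returns the paths that were skipped. */
    static List<String> processAll(List<String> paths, FileProcessor processor) {
        List<String> skipped = new ArrayList<>();
        for (String path : paths) {
            try {
                processor.process(path);
            } catch (IllegalArgumentException | IOException e) {
                // One bad input should not abort the rest of the batch.
                System.err.println("Skipping " + path + ": " + e.getMessage());
                skipped.add(path);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        // Fake processor for demonstration only: rejects any path containing "invalid".
        FileProcessor fake = path -> {
            if (path.contains("invalid")) {
                throw new IllegalArgumentException("File not found: " + path);
            }
        };
        List<String> skipped = processAll(List.of("invalid.pdf", "valid.pdf"), fake);
        System.out.println("Skipped: " + skipped);
    }
}
```

With this shape, the decision to skip, retry, or abort stays with the caller, which matches the intent of keeping the exception visible instead of hiding it inside the API.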

Design direction

  • Add input validation (null/blank/non-existent/non-PDF) that throws IllegalArgumentException
  • Keep throws IOException on the signature so that callers decide how to handle errors
  • Document the batch pattern with try-catch in Javadoc
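
The validation step in the first bullet might look roughly like the sketch below. This is an assumption about the shape of the fix, not the project's actual implementation: requireReadablePdf and InputValidationSketch are hypothetical names, the "non-PDF" check is simplified to a file-extension test, and the code assumes Java 11 or later (String.isBlank, Path.of).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

public class InputValidationSketch {

    /**
     * Hypothetical helper: validates an input path before any processing starts.
     * Throws IllegalArgumentException for null, blank, missing, or non-PDF inputs.
     */
    static Path requireReadablePdf(String inputPath) {
        if (inputPath == null || inputPath.isBlank()) {
            throw new IllegalArgumentException("Input path must not be null or blank");
        }
        // Path.of itself throws InvalidPathException (an IllegalArgumentException)
        // when the string cannot be converted to a path.
        Path path = Path.of(inputPath);
        if (!Files.isRegularFile(path)) {
            throw new IllegalArgumentException(
                "Input file does not exist or is not a regular file: " + inputPath);
        }
        // Simplification: only the extension is checked, not the file contents.
        if (!inputPath.toLowerCase(Locale.ROOT).endsWith(".pdf")) {
            throw new IllegalArgumentException("Input file is not a PDF: " + inputPath);
        }
        return path;
    }

    public static void main(String[] args) {
        try {
            requireReadablePdf("   ");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Because IllegalArgumentException is unchecked, a check like this could be added at the top of processFile without changing the throws IOException clause on the signature.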

See review discussion on #375 for details.
