Problem
When using OpenDataLoaderPDF.processFile in a loop, an invalid file path throws an exception and stops the entire batch. The CLI handles this gracefully, but the Java API does not.
Originally reported in #375 by @pavanpai769.
Reproduction
for (String pdf : new String[]{"valid.pdf", "invalid.pdf"}) {
    OpenDataLoaderPDF.processFile(pdf, config);
}
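If invalid.pdf doesn't exist, the exception terminates the loop, and every file listed after it is never processed. With the order shown, valid.pdf is processed first; if the two entries were swapped, valid.pdf would be skipped entirely.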
Expected behavior
The API should throw a clear exception (e.g. IllegalArgumentException for invalid paths) so callers can catch and skip, rather than silently swallowing errors. The throws IOException contract must be preserved.
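The catch-and-skip pattern described above might look like the following sketch. The names `BatchSketch`, `PdfProcessor`, and `processAll` are hypothetical and introduced only for illustration; the interface stands in for a call such as OpenDataLoaderPDF.processFile(pdf, config) so the example runs without the library. It assumes invalid paths surface as IllegalArgumentException, as proposed here, and that I/O failures remain IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class BatchSketch {
    /** Hypothetical stand-in for a call like OpenDataLoaderPDF.processFile(path, config). */
    @FunctionalInterface
    interface PdfProcessor {
        void process(String path) throws IOException;
    }

    /** Processes every path in order, skipping failures; returns the paths that failed. */
    static List<String> processAll(List<String> paths, PdfProcessor processor) {
        List<String> failed = new ArrayList<>();
        for (String path : paths) {
            try {
                processor.process(path);
            } catch (IllegalArgumentException | IOException e) {
                // Record the failure and continue so one bad input does not end the batch.
                System.err.println("Skipping " + path + ": " + e.getMessage());
                failed.add(path);
            }
        }
        return failed;
    }
}
```

A caller could pass something like `path -> OpenDataLoaderPDF.processFile(path, config)` as the processor, assuming the method accepts a String path as in the reproduction. Returning the failed paths lets the caller decide whether to retry, log, or abort.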
Design direction
- Add input validation (null/blank/non-existent/non-PDF) that throws IllegalArgumentException
- Keep throws IOException on the signature; callers decide error handling
- Document the batch pattern with try-catch in Javadoc
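The validation step in the list above could be sketched as follows. This is not the library's implementation: `InputValidationSketch` and `requirePdf` are hypothetical names, and treating "non-PDF" as "file name does not end in .pdf" is an assumption, since the issue does not say how a non-PDF input should be detected.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

final class InputValidationSketch {
    /**
     * Hypothetical helper: returns the validated path, or throws
     * IllegalArgumentException for null, blank, missing, or non-PDF input.
     */
    static Path requirePdf(String inputPath) {
        if (inputPath == null || inputPath.trim().isEmpty()) {
            throw new IllegalArgumentException("Input path must not be null or blank");
        }
        // Paths.get throws InvalidPathException (an IllegalArgumentException) for malformed paths.
        Path path = Paths.get(inputPath);
        if (!Files.isRegularFile(path)) {
            throw new IllegalArgumentException(
                "Input file does not exist or is not a regular file: " + inputPath);
        }
        // Assumption: a PDF is recognized by its file extension only.
        String name = path.getFileName().toString().toLowerCase(Locale.ROOT);
        if (!name.endsWith(".pdf")) {
            throw new IllegalArgumentException("Input file is not a PDF: " + inputPath);
        }
        return path;
    }
}
```

Because the helper throws only unchecked exceptions, calling it at the start of processFile would leave the throws IOException clause on the signature unchanged.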
See review discussion on #375 for details.