-
Notifications
You must be signed in to change notification settings - Fork 2.1k
fix: error on CREATE EXTERNAL TABLE with no files and no explicit schema #21965
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -309,6 +309,10 @@ mod tests { | |
| #[tokio::test] | ||
| async fn test_create_using_folder_with_compression() { | ||
| let dir = tempfile::tempdir().unwrap(); | ||
| // Schema inference now requires at least one file at the location. | ||
| // The file itself can be 0-byte — it will be filtered out before the | ||
| // format-specific inference runs, leaving an empty inferred schema. | ||
| fs::File::create_new(dir.path().join("placeholder.csv.gz")).unwrap(); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. what do you think about using
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think 8e72ad8 is what you meant, lmk if not
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. perfect |
||
|
|
||
| let factory = ListingTableFactory::new(); | ||
| let context = SessionContext::new(); | ||
|
|
@@ -351,6 +355,9 @@ mod tests { | |
| #[tokio::test] | ||
| async fn test_create_using_folder_without_compression() { | ||
| let dir = tempfile::tempdir().unwrap(); | ||
| // See `test_create_using_folder_with_compression` — a placeholder file | ||
| // is required so schema inference does not error on an empty location. | ||
| fs::File::create_new(dir.path().join("placeholder.csv")).unwrap(); | ||
|
|
||
| let factory = ListingTableFactory::new(); | ||
| let context = SessionContext::new(); | ||
|
|
@@ -387,6 +394,8 @@ mod tests { | |
| let mut path = PathBuf::from(dir.path()); | ||
| path.extend(["odd.v1", "odd.v2"]); | ||
| fs::create_dir_all(&path).unwrap(); | ||
| // Placeholder so schema inference does not error on an empty location. | ||
| fs::File::create_new(path.join("placeholder.parquet")).unwrap(); | ||
|
|
||
| let factory = ListingTableFactory::new(); | ||
| let context = SessionContext::new(); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This just means we carry around memory for the ObjectMeta of zero sized files until a couple lines later. I think this is not a big problem.
The alternative is that we error even where there are 0 byte files present. I think that's an interesting discussion: e.g. a completely empty
data.csv. Or hive partitioned directories with no data. I think all of these should still require an explicit schema or error, but there are tests that check the opposite behavior: