Add Batched lookups for streaming GRPC endpoints and BigTable #5521
RustedBones merged 5 commits into spotify:main
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5521      +/-   ##
==========================================
+ Coverage   61.44%   61.50%   +0.06%
==========================================
  Files         312      312
  Lines       11105    11122      +17
  Branches      776      779       +3
==========================================
+ Hits         6823     6841      +18
+ Misses       4282     4281       -1
When getting a response for a batch request, return an UnmatchedRequestException for unmatched requests
I think we can keep the same API for that case. When receiving a batch response that is missing some entries from the batch request, we should fail fast and report those. It is then up to the user to filter out those exceptions if this is expected behavior. See #5532
The rule of thumb I tried to apply here is: fail if the client you're using is throwing an exception. BigTable not having an entry for a key is kind of something expected. But we can merge your PR. And the error handling can be added to the
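For readers following the discussion, the fail-fast behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not scio's implementation: `BatchMatchSketch`, `Result`, `match`, and the local `UnmatchedRequest` class are hypothetical stand-ins (the last one for the `UnmatchedRequestException` mentioned above), and requests are assumed to be identified by a plain string id.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: pair each request id of a batch with its response,
// and report ids that have no response as failures instead of dropping them.
final class BatchMatchSketch {

  // Stand-in for the UnmatchedRequestException discussed in this PR (assumption).
  static final class UnmatchedRequest extends RuntimeException {
    UnmatchedRequest(String id) {
      super("Unmatched batch request: " + id);
    }
  }

  // Holds either a matched response or the error explaining why there is none.
  static final class Result<T> {
    final T value;
    final RuntimeException error;

    private Result(T value, RuntimeException error) {
      this.value = value;
      this.error = error;
    }

    boolean isOk() {
      return error == null;
    }
  }

  // Returns one entry per request id, in request order.
  static <O> Map<String, Result<O>> match(
      List<String> requestIds, List<O> responses, Function<O, String> idOf) {
    // Index the responses by the id extracted from each of them.
    Map<String, O> byId = new HashMap<>();
    for (O response : responses) {
      byId.put(idOf.apply(response), response);
    }
    Map<String, Result<O>> out = new LinkedHashMap<>();
    for (String id : requestIds) {
      O response = byId.get(id);
      Result<O> result;
      if (response != null) {
        result = new Result<>(response, null);
      } else {
        // Fail fast for this element: the caller decides whether to filter it out.
        result = new Result<>(null, new UnmatchedRequest(id));
      }
      out.put(id, result);
    }
    return out;
  }
}
```

With this shape, a caller who considers missing entries normal can simply drop the results where `isOk()` is false, which is the "filter out those exceptions" option mentioned in the comment.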
46805a9 to 5dcab72
@RustedBones I've gone ahead and rebased this branch on your PR #5532 and dropped the
Add batched version of grpcLookupStream and BigTableDoFn
a8368cf to d39cba0
RustedBones left a comment
LGTM. Thanks for the contribution. It makes a lot of sense to handle missing responses in a batch leniently instead of failing the job.
import com.google.cloud.bigtable.config.BigtableOptions;
import com.google.cloud.bigtable.grpc.BigtableSession;
As a heads-up, we will probably move to the BigtableDataClient in an upcoming minor release, see here. Expect some breaking changes in that part.
Thanks!
This aims to extend the usability of AsyncBatchLookupDoFn. It adds:
- grpcLookupBatchStream for batched GRPC endpoints with a streaming response.
- BigTableBatchDoFn for batch calls to BigTable, including a cache.
To make this happen, AsyncBatchLookupDoFn had to be changed to gracefully handle partial batched responses.
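The combination of a cache and lenient handling of partial batch responses described above might look roughly like the sketch below. It is a simplified, synchronous illustration and not the code from this PR: `CachedBatchLookupSketch`, its `backend` function, and the `backendCalls` counter are hypothetical names, and the real AsyncBatchLookupDoFn works asynchronously inside a Beam pipeline.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: batch lookup with a cache that tolerates partial responses.
final class CachedBatchLookupSketch<K, V> {
  private final Map<K, V> cache = new HashMap<>();
  // Assumed batch call: may return entries for only some of the requested keys.
  private final Function<List<K>, Map<K, V>> backend;
  // Counts backend round trips, only to make the caching visible.
  int backendCalls = 0;

  CachedBatchLookupSketch(Function<List<K>, Map<K, V>> backend) {
    this.backend = backend;
  }

  // Returns the values that could be resolved; keys without a value are omitted.
  Map<K, V> lookup(List<K> keys) {
    Map<K, V> out = new LinkedHashMap<>();
    List<K> misses = new ArrayList<>();
    for (K key : keys) {
      V cached = cache.get(key);
      if (cached != null) {
        out.put(key, cached);
      } else {
        misses.add(key);
      }
    }
    if (!misses.isEmpty()) {
      backendCalls++;
      Map<K, V> fetched = backend.apply(misses);
      for (K key : misses) {
        V value = fetched.get(key);
        // Lenient handling: a key missing from the response is skipped, not an error.
        if (value != null) {
          cache.put(key, value);
          out.put(key, value);
        }
      }
    }
    return out;
  }
}
```

In this sketch a key the backend does not know is never cached, so it is requested again on the next lookup. Whether absent keys should also be cached is a design choice this sketch leaves open.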