
Implement a fallback storage configuration #335

Open

vtsao-openai wants to merge 1 commit into buildbarn:main from vtsao-openai:dev/vtsao/storage-fallback-backend

Implement a fallback storage configuration#335
vtsao-openai wants to merge 1 commit into
buildbarn:mainfrom
vtsao-openai:dev/vtsao/storage-fallback-backend

Conversation

@vtsao-openai

This configuration lets you specify a primary and a secondary storage backend; if the primary goes down, reads and writes go to the secondary.

Writes to the primary are asynchronously, best-effort replicated to the secondary. Writes to the secondary are not replicated back to the primary. This backend provides more availability at the cost of consistency, although most of the time we don't expect the secondary to actually be used.
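To make the semantics concrete, here is a minimal sketch of the read and write paths in Go. It uses a simplified, hypothetical `BlobStore` interface and a `FallbackBlobAccess` type that are illustrative only, not Buildbarn's actual blobstore interfaces or the code in this PR: reads try the primary and fall back to the secondary, writes prefer the primary, and successful primary writes are replicated to the secondary asynchronously on a best-effort basis.

```go
package fallback

import (
	"context"
	"log"
)

// BlobStore is a simplified, hypothetical stand-in for a storage backend;
// the real Buildbarn blobstore interfaces differ.
type BlobStore interface {
	Get(ctx context.Context, digest string) ([]byte, error)
	Put(ctx context.Context, digest string, data []byte) error
}

// FallbackBlobAccess prefers the primary backend and falls back to the
// secondary when the primary is unavailable.
type FallbackBlobAccess struct {
	primary   BlobStore
	secondary BlobStore
}

// Get reads from the primary; on error it retries against the secondary.
func (f *FallbackBlobAccess) Get(ctx context.Context, digest string) ([]byte, error) {
	if data, err := f.primary.Get(ctx, digest); err == nil {
		return data, nil
	}
	return f.secondary.Get(ctx, digest)
}

// Put writes to the primary and, on success, replicates the blob to the
// secondary asynchronously, best effort. If the primary is down, the write
// goes to the secondary only and is not replicated back to the primary.
func (f *FallbackBlobAccess) Put(ctx context.Context, digest string, data []byte) error {
	if err := f.primary.Put(ctx, digest, data); err != nil {
		return f.secondary.Put(ctx, digest, data)
	}
	go func() {
		// Best effort: replication failures are logged and otherwise ignored.
		if err := f.secondary.Put(context.Background(), digest, data); err != nil {
			log.Printf("best-effort replication to secondary failed: %v", err)
		}
	}()
	return nil
}
```

A consequence of the best-effort replication is the consistency caveat discussed below: a blob written while the secondary is down will be missing from the secondary if the primary later becomes unavailable.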

The main use case of this backend, at least for us, is being able to update our backend storage deployment without downtime. We currently run sharded backend storage on k8s via a StatefulSet, so any deployment currently causes downtime as k8s has to take down a pod (storage shard) to update it. This also helps in general if k8s decides to restart a pod for any reason.

@vtsao-openai force-pushed the dev/vtsao/storage-fallback-backend branch from bd91eb3 to 2618d5f on May 3, 2026 at 23:47
Regenerate proto bindings.

Co-authored-by: Codex <noreply@openai.com>
@vtsao-openai force-pushed the dev/vtsao/storage-fallback-backend branch from 2618d5f to 8849476 on May 3, 2026 at 23:57
@EdSchouten
Member

> The main use case of this backend, at least for us, is being able to update our backend storage deployment without downtime. We currently run sharded backend storage on k8s via a StatefulSet, so any deployment currently causes downtime as k8s has to take down a pod (storage shard) to update it. This also helps in general if k8s decides to restart a pod for any reason.

Out of curiosity, why are you doing this? Why not run bb-storage as a simple deployment? That way Kubernetes is capable of spinning up replacements before shutting down the old pod.

@vtsao-openai
Author

> The main use case of this backend, at least for us, is being able to update our backend storage deployment without downtime. We currently run sharded backend storage on k8s via a StatefulSet, so any deployment currently causes downtime as k8s has to take down a pod (storage shard) to update it. This also helps in general if k8s decides to restart a pod for any reason.
>
> Out of curiosity, why are you doing this? Why not run bb-storage as a simple deployment? That way Kubernetes is capable of spinning up replacements before shutting down the old pod.

Hey @EdSchouten, it's because we're using PVs for our disks, since ephemeral storage isn't enough for us. So I think we'd run into the same issue whether we're using Deployments or StatefulSets: even with a Deployment, the PVC can only be mounted to a single pod, so at some point we have to switch the PVC mount over to the new pod, which results in downtime. And the disk types we require (for latency reasons) are all ReadWriteOnce, so only one pod can mount them at a time.

So I'm happy to be wrong, but I'm not sure this can be solved purely with k8s. I think we need this kind of fallback mechanism natively in Buildbarn.

@artyrian

artyrian commented May 6, 2026

We have a similar Buildbarn setup with multiple shards (and also tried additional replicas per shard), and it doesn't provide true HA in k8s terms.
I also reviewed the ADR (https://github.com/buildbarn/bb-adrs/blob/main/0002-storage.md#adding-fault-tolerance), but didn't find a straightforward way to achieve fast shard failover with PVCs without downtime.

@vtsao-openai
Author

Yeah, the mirrored backend does not actually provide HA; I think the proto comments even document that it does not.

@EdSchouten another benefit of this fallback approach is that it isn't just for deployments: if a storage shard actually goes down for whatever reason, builds will not fail. Yes, it's at the cost of consistency, which in our case is probably fine; it should be no different than if the digests just didn't exist in the cache in the first place.

Also, this allows people not using k8s to achieve more availability if they want to use this backend.

@moroten
Contributor

moroten commented May 6, 2026

> @EdSchouten another benefit of this fallback approach is that it isn't just for deployments: if a storage shard actually goes down for whatever reason, builds will not fail. Yes, it's at the cost of consistency, which in our case is probably fine; it should be no different than if the digests just didn't exist in the cache in the first place.

Consider the case where Bazel asks the remote cluster to execute an action. The output is stored in mirror A because B is down. Five minutes later, Bazel wants to use the output, but A is down and B doesn't have it. The difference from a blob that was missing from the start is that in that case Bazel also knows it is missing; here, Bazel should be able to assume that the blob still exists, since it did five minutes ago.

@artyrian

artyrian commented May 7, 2026

> The output is stored in mirror A because B is down. Five minutes later, Bazel wants to use the output, but A is down and B doesn't have it. The difference from a blob that was missing from the start is that in that case Bazel also knows it is missing.

Isn't this inconsistency equivalent to a normal cache eviction? If a blob gets evicted between the time Bazel stores it and the time it tries to reuse it, the action cache would similarly point to a missing CAS entry. Bazel handles this by re-executing, so why is the fallback case worse? Or is the concern about Bazel clients that don't gracefully handle an AC hit with a CAS miss and fail hard?
