
Fix Spanner write mutation coder serialization #5919

Open
NA-V10 wants to merge 2 commits into spotify:main from NA-V10:fix-spanner-mutation-coder

Conversation


NA-V10 commented Apr 5, 2026

Summary

This change explicitly sets spannerMutationCoder on the SCollection[Mutation] before applying the Beam Spanner write transform.

Previously, the write path relied on Beam/Kryo fallback serialization, which could lead to UnsupportedOperationException when serializing Mutation / MutationGroup.

Changes Made

  • Materialized spannerMutationCoder using CoderMaterializer.beam
  • Applied the coder to the input SCollection[Mutation]
  • Updated the write path to use data.setCoder(coder).applyInternal(transform)

Why

A spannerMutationCoder already exists in CoderInstances, but it was not being explicitly used in SpannerWrite. This change ensures that Beam uses the intended serializer for Mutation objects during writes.

Author

NA-V10 commented Apr 5, 2026

Thanks for reporting this issue.

I updated the Spanner write path to explicitly use spannerMutationCoder before applying the Beam write transform. Since the coder already existed in CoderInstances, the fix was mainly wiring it into SpannerWrite so Beam does not fall back to Kryo serialization for Mutation objects.

Contributor

clairemcginty commented Apr 8, 2026

hey, thanks for the contribution!

I think we shouldn't have to set the Coder explicitly here. As long as com.spotify.scio.spanner.coders._ is imported, spannerMutationCoder should be selected. Without that import, I'd expect Scio to fail at compile time even when constructing the SCollection:

% sbt scio-examples/console
scala> import com.spotify.scio._
scala> val sc = ScioContext()
scala> val mutations = Seq(com.google.cloud.spanner.Mutation.newInsertBuilder("myTable").set("foo").to("bar").build())
scala> val selectedCoder = (sc.parallelize(mutations)).coder
                                ^
       error: 
       Cannot find an implicit Coder instance for type:

         >> com.google.cloud.spanner.Mutation

         This can happen for a few reasons, but the most common case is that a data
         member somewhere within this type doesn't have an implicit Coder instance in scope.

         Here are some hints:
           - For collections, ensure that a Coder instance is in scope for the element type.
           - For module specific types, you may need to explicitly import the coders, eg avro:
               import com.spotify.scio.avro._
           - For sealed traits and case classes, you can identify the missing member's coder:
               scala> com.spotify.scio.coders.Coder.gen[Foo]

                 error: magnolia: could not find Coder.Typeclass for type Bar
                   in parameter 'xxx' of product type Foo
           - For generic methods, you may need to add an implicit parameter so that:
               def foo[T](coll: SCollection[SomeClass], param: String): SCollection[T]

             may become:
               def foo[T](coll: SCollection[SomeClass],
                          param: String)(implicit c: Coder[T]): SCollection[T]
                                         ^
             Alternatively, you can use a context bound instead of an implicit parameter:
               def foo[T: Coder](coll: SCollection[SomeClass], param: String): SCollection[T]

# Add in coders import and verify spannerMutationCoder is selected
scala> import com.spotify.scio.spanner.coders._
scala> val selectedCoder = (sc.parallelize(mutations)).coder
val selectedCoder: com.spotify.scio.coders.Coder[com.google.cloud.spanner.Mutation] = Beam(SerializableCoder(com.google.cloud.spanner.Mutation))

Are you importing com.spotify.scio.coders.kryo._ in your job?

Author

NA-V10 commented Apr 11, 2026

That makes sense, thanks! I had com.spotify.scio.coders.kryo._ imported, which is probably why I wasn't seeing the missing coder error earlier.
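
To illustrate why a catch-all import can hide the missing-coder error, here is a minimal sketch in plain Scala. It does not use Scio or the Spanner client at all: ToyCoder, ToyMutation, SpecificCoders, FallbackCoders and ImplicitDemo are hypothetical stand-ins, and the assumption is only that the Kryo import behaves like a generic fallback implicit that matches any type.

```scala
// Toy stand-ins only: none of these names come from Scio or the Spanner client.
trait ToyCoder[T] { def label: String }

final class ToyMutation

// Plays the role of a module-specific coders import.
object SpecificCoders {
  implicit val toyMutationCoder: ToyCoder[ToyMutation] =
    new ToyCoder[ToyMutation] { val label = "specific" }
}

// Plays the role of a catch-all fallback import: it matches every type T.
object FallbackCoders {
  implicit def fallback[T]: ToyCoder[T] =
    new ToyCoder[T] { val label = "fallback" }
}

object ImplicitDemo {
  private def selected[T](implicit c: ToyCoder[T]): String = c.label

  // Only the fallback is in scope: this compiles without complaint,
  // so nothing signals that the specific coder is missing.
  def fallbackOnly: String = {
    import FallbackCoders._
    selected[ToyMutation]
  }

  // Only the specific instance is in scope: it is the one selected.
  def specificOnly: String = {
    import SpecificCoders._
    selected[ToyMutation]
  }

  def main(args: Array[String]): Unit = {
    println(fallbackOnly) // prints "fallback"
    println(specificOnly) // prints "specific"
  }
}
```

With neither import in scope, `selected[ToyMutation]` would not compile, which is the toy analogue of the "Cannot find an implicit Coder instance" error shown earlier in the thread.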

I'll remove the explicit coder assignment and rely on com.spotify.scio.spanner.coders._ so that spannerMutationCoder is selected automatically.

Author

NA-V10 commented Apr 11, 2026

Updated this to remove the explicit mutation coder assignment and rely on the implicit Spanner coder import instead.
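
For anyone who wants a compile-time guard in their own job, a small sketch along these lines might help. It assumes Scio and the Spanner client are on the classpath and that com.spotify.scio.spanner.coders._ is the import that provides the Mutation coder, as described above. MutationCoderCheck is a hypothetical name, and the file deliberately does not import com.spotify.scio.coders.kryo._.

```scala
import com.google.cloud.spanner.Mutation
import com.spotify.scio.coders.Coder
// Per the discussion above, this import brings the Spanner coders into implicit scope.
import com.spotify.scio.spanner.coders._

// Hypothetical helper object, not part of Scio.
object MutationCoderCheck {
  // Resolves the implicit Coder[Mutation] at compile time. With no Kryo
  // fallback imported in this file, removing the coders import above is
  // expected to turn this line into a compile error.
  val mutationCoder: Coder[Mutation] = implicitly[Coder[Mutation]]
}
```

This only checks that an implicit coder resolves in this scope. It does not confirm which coder Beam ends up using at runtime; the REPL check earlier in the thread is the way to inspect that.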
