Description
The current implementation of KryoCoder writes the class for every object on the output stream (scalding/scalding-beam/src/main/scala/com/twitter/scalding/beam_backend/KryoCoder.scala, line 16 in b0ba993):

    val bytes = kryoPool.toBytesWithClass(value)

This was done because Beam can split the stream at any point: if the registration were written only at the beginning of the stream, the later part of the stream would fail. However, we don't want to write the class name for classes that are already registered.
We can call setRegistrationRequired(true) when creating the Instantiator (scalding/scalding-beam/src/main/scala/com/twitter/scalding/beam_backend/BeamBackend.scala, line 22 in b0ba993):

    implicit val kryoCoder: KryoCoder = new KryoCoder(defaultKryoCoderConfiguration(config))

Then, in KryoCoder, we can keep a map of which classes have a registration available (for example, wrap pool.hasRegistration in a Try and cache the result per class). For registered classes we use kryoPool.toBytesWithoutClass, and for all others we use kryoPool.toBytesWithClass.
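
A minimal sketch of the caching idea above, covering the write side only. `BytesPool`, `RegistrationAwareEncoder`, and `FakePool` are hypothetical names introduced for illustration; `BytesPool` stands in for the three pool methods mentioned in this issue so that the snippet runs without a Kryo dependency. It does not address how the reader would learn which of the two formats was used for a given record.

```scala
import scala.collection.concurrent.TrieMap
import scala.util.Try

// Hypothetical stand-in for the three pool methods named in this issue.
trait BytesPool {
  def hasRegistration(cls: Class[_]): Boolean
  def toBytesWithClass(value: AnyRef): Array[Byte]
  def toBytesWithoutClass(value: AnyRef): Array[Byte]
}

// Remembers the registration lookup per class, so the pool is normally
// asked only the first time a class is seen. Under concurrent access the
// lookup may run more than once for the same class, which is harmless here.
final class RegistrationAwareEncoder(pool: BytesPool) {
  private val registered = TrieMap.empty[Class[_], Boolean]

  // A lookup that throws is treated as "not registered".
  def isRegistered(cls: Class[_]): Boolean =
    registered.getOrElseUpdate(cls, Try(pool.hasRegistration(cls)).getOrElse(false))

  def encode(value: AnyRef): Array[Byte] =
    if (isRegistered(value.getClass)) pool.toBytesWithoutClass(value)
    else pool.toBytesWithClass(value)
}

// Toy pool for demonstration only: String is the single "registered" class,
// every other lookup throws, and each lookup is counted.
final class FakePool extends BytesPool {
  var lookups = 0
  def hasRegistration(cls: Class[_]): Boolean = {
    lookups += 1
    if (cls == classOf[String]) true
    else throw new IllegalArgumentException(s"Class is not registered: ${cls.getName}")
  }
  // Marker bytes so the chosen path is visible: 1 = with class, 0 = without.
  def toBytesWithClass(value: AnyRef): Array[Byte] = Array[Byte](1)
  def toBytesWithoutClass(value: AnyRef): Array[Byte] = Array[Byte](0)
}
```

With the toy pool, encoding several strings triggers a single registration lookup and takes the without-class path, while an unregistered type such as java.lang.Integer falls back to the with-class path after one failed lookup.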
Is there a better way to achieve this?