Custom record extractor implementation for VBVR data files, should it not use rawRecordExtractor? #412

@mark-weghorst

Background [Optional]

In #338, @yruslan added a custom record extractor interface so that users could write their own record extractors. I have implemented a custom record extractor to read variable-block, variable-record (VBVR) datasets, as defined by the IBM documentation I referenced in #338.

My first attempt at this record extractor had a bug: the last block would return only a single record.

I believe that this is because when I call RawRecordContext.next(n) to read the last block of records it advances the "pointer" to EOF, which makes hasNext() return false even if we haven't processed all of the records from that block.
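A minimal sketch of the failure mode described above, not the attached code. `ByteSource` is a hypothetical stand-in for the input stream (it is not the Cobrix API), and the layout is simplified to fixed-size blocks of fixed-size records so that only the `hasNext()` problem is visible.

```scala
// Hypothetical stand-in for the input stream; NOT the Cobrix API.
final class ByteSource(data: Array[Byte]) {
  private var pos = 0
  def isEndOfStream: Boolean = pos >= data.length
  // Returns up to n bytes and advances the position.
  def next(n: Int): Array[Byte] = {
    val end = math.min(pos + n, data.length)
    val chunk = java.util.Arrays.copyOfRange(data, pos, end)
    pos = end
    chunk
  }
}

// Reads a whole block at a time but hands out one record per call.
final class NaiveExtractor(src: ByteSource, blockSize: Int, recordSize: Int)
    extends Iterator[Array[Byte]] {
  private var pending: List[Array[Byte]] = Nil

  // Bug: only looks at the stream, ignoring records still waiting in `pending`.
  def hasNext: Boolean = !src.isEndOfStream

  def next(): Array[Byte] = {
    if (pending.isEmpty) pending = src.next(blockSize).grouped(recordSize).toList
    val rec = pending.head
    pending = pending.tail
    rec
  }
}
```

With 12 bytes of input, `blockSize = 6` and `recordSize = 2`, there are six records, but the iterator stops after four: reading the second block moves the stream to EOF, so only the first record of the last block is returned.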

My second attempt, which I've attached, tries to decouple hasNext() from the RawRecordContext by adding a buffering queue and moving the block-reading logic into hasNext(). I don't regard this as an ideal solution.
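The buffering idea can be sketched roughly as follows. This is an illustration, not the attached implementation. It assumes each block starts with a 4-byte block descriptor word and each record with a 4-byte record descriptor word, both holding a big-endian 2-byte length that includes the descriptor itself. `ByteSource` and `BufferedVbExtractor` are hypothetical names, not part of Cobrix.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the input stream; NOT the Cobrix API.
final class ByteSource(data: Array[Byte]) {
  private var pos = 0
  def isEndOfStream: Boolean = pos >= data.length
  // Returns up to n bytes and advances the position.
  def next(n: Int): Array[Byte] = {
    val end = math.min(pos + n, data.length)
    val chunk = java.util.Arrays.copyOfRange(data, pos, end)
    pos = end
    chunk
  }
}

final class BufferedVbExtractor(src: ByteSource) extends Iterator[Array[Byte]] {
  private val queue = mutable.Queue.empty[Array[Byte]]

  // Big-endian unsigned 16-bit value at the given position.
  private def u16(b: Array[Byte], at: Int): Int =
    ((b(at) & 0xFF) << 8) | (b(at + 1) & 0xFF)

  // Reads one block and queues the payload of every record in it.
  private def fillFromNextBlock(): Unit = {
    val bdw = src.next(4)
    if (bdw.length == 4) {
      val blockLen = u16(bdw, 0)
      require(blockLen >= 4, s"invalid block length: $blockLen")
      val block = src.next(blockLen - 4)
      var pos = 0
      while (pos + 4 <= block.length) {
        val recLen = u16(block, pos)
        require(recLen >= 4 && pos + recLen <= block.length,
          s"invalid record length: $recLen")
        queue.enqueue(java.util.Arrays.copyOfRange(block, pos + 4, pos + recLen))
        pos += recLen
      }
    }
  }

  // True while buffered records remain, even if the stream is already at EOF.
  def hasNext: Boolean = {
    while (queue.isEmpty && !src.isEndOfStream) fillFromNextBlock()
    queue.nonEmpty
  }

  def next(): Array[Byte] = {
    if (!hasNext) throw new NoSuchElementException("no more records")
    queue.dequeue()
  }
}
```

Because `hasNext()` consults the queue before the stream, records from the final block are still returned after the stream reaches EOF. The cost is the one noted here: the block-reading side effect lives inside `hasNext()`.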

What I don't like about my current implementation is that it puts logic inside hasNext() and generally seems inelegant and complicated.

Question

Should support for VBVR record types be implemented as a separate block-aware reader type in Cobrix? If not, do you see a better and more elegant way than what I have implemented?

VariableBlockRecordExtractor.scala.zip

Labels

accepted (Accepted for implementation), enhancement (New feature or request)
