Java - Flight SQL - StackOverflowException caused by org.apache.arrow.flight.FlightStream.next #172

@amunra

Describe the bug, including details regarding any error messages, version, and platform.

We're experimenting with Arrow Flight SQL and wrote an initial basic prototype server.

I've tried querying it with the Java client library, but the client has been a little buggy so far.

The data set I'm querying is a few GBs of data across 10'000'000 rows and 21 columns.
There are 10 double columns and 10 symbol (Dictionary<Int32, Utf8>) columns.

See: https://github.com/timescale/tsbs to get an idea of the type of data being queried.

I think the Java Flight SQL client is struggling with the response sent back. It fails with the following error:

java.lang.StackOverflowError
	at java.base/java.util.Spliterator.getExactSizeIfKnown(Spliterator.java:414)
	at java.base/java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:526)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:150)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.IntPipeline.findFirst(IntPipeline.java:552)
	at java.base/java.text.DecimalFormatSymbols.findNonFormatChar(DecimalFormatSymbols.java:844)
	at java.base/java.text.DecimalFormatSymbols.initialize(DecimalFormatSymbols.java:815)
	at java.base/java.text.DecimalFormatSymbols.<init>(DecimalFormatSymbols.java:115)
	at java.base/sun.util.locale.provider.DecimalFormatSymbolsProviderImpl.getInstance(DecimalFormatSymbolsProviderImpl.java:85)
	at java.base/java.text.DecimalFormatSymbols.getInstance(DecimalFormatSymbols.java:182)
	at java.base/java.util.Formatter.zero(Formatter.java:2450)
	at java.base/java.util.Formatter$FormatSpecifier.getZero(Formatter.java:4450)
	at java.base/java.util.Formatter$FormatSpecifier.localizedMagnitude(Formatter.java:4466)
	at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:3276)
	at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:3261)
	at java.base/java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2957)
	at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:2918)
	at java.base/java.util.Formatter.format(Formatter.java:2689)
	at java.base/java.util.Formatter.format(Formatter.java:2625)
	at java.base/java.lang.String.format(String.java:4141)
	at org.apache.arrow.memory.util.HistoricalLog.recordEvent(HistoricalLog.java:82)
	at org.apache.arrow.memory.BufferLedger.retain(BufferLedger.java:182)
	at org.apache.arrow.memory.BufferLedger.retain(BufferLedger.java:169)
	at org.apache.arrow.vector.ipc.message.ArrowRecordBatch.<init>(ArrowRecordBatch.java:92)
	at org.apache.arrow.vector.ipc.message.ArrowRecordBatch.<init>(ArrowRecordBatch.java:69)
	at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeRecordBatch(MessageSerializer.java:438)
	at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeDictionaryBatch(MessageSerializer.java:514)
	at org.apache.arrow.vector.ipc.message.MessageSerializer.deserializeDictionaryBatch(MessageSerializer.java:529)
	at org.apache.arrow.flight.ArrowMessage.asDictionaryBatch(ArrowMessage.java:273)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:264)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
	at org.apache.arrow.flight.FlightStream.next(FlightStream.java:280)
        ....

Looking at FlightStream.java:280, there's a recursive call to next().

public boolean next() {
    ...
          } else if (msg.getMessageType() == HeaderType.DICTIONARY_BATCH) {
            ...
            return next();      //  <------------- culprit
    ...
  }

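As far as I can tell, each DICTIONARY_BATCH message adds another stack frame, so a long enough run of consecutive dictionary batches overflows the stack. This should be fixed, for example by looping instead of recursing, to allow querying large datasets.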
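
To illustrate the kind of change I have in mind, here is a minimal, self-contained sketch. It does not use the Arrow classes: `MessageType`, `SketchStream`, `nextRecursive`, `nextIterative` and `streamWith` are hypothetical stand-ins that only model how messages are dispatched, not the actual `FlightStream` implementation. The recursive method mirrors the shape of the excerpt above; the iterative one handles dictionary batches in a loop, so the stack depth stays constant however many arrive in a row.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DictionaryBatchLoop {
    // Hypothetical stand-in for the message header types seen in the excerpt.
    enum MessageType { DICTIONARY_BATCH, RECORD_BATCH }

    // Hypothetical stand-in for FlightStream; it only models message dispatch.
    static final class SketchStream {
        private final Deque<MessageType> incoming;
        int dictionariesLoaded = 0;

        SketchStream(Deque<MessageType> incoming) {
            this.incoming = incoming;
        }

        // Recursive shape, as in the excerpt: one extra stack frame per dictionary batch.
        boolean nextRecursive() {
            MessageType msg = incoming.poll();
            if (msg == null) {
                return false; // stream exhausted
            }
            if (msg == MessageType.DICTIONARY_BATCH) {
                dictionariesLoaded++;   // stands in for loading the dictionary
                return nextRecursive(); // stack grows with every consecutive dictionary
            }
            return true; // a record batch is ready
        }

        // Iterative shape: same dispatch, constant stack depth.
        boolean nextIterative() {
            while (true) {
                MessageType msg = incoming.poll();
                if (msg == null) {
                    return false; // stream exhausted
                }
                if (msg == MessageType.DICTIONARY_BATCH) {
                    dictionariesLoaded++; // stands in for loading the dictionary
                    continue;             // read the next message without recursing
                }
                return true; // a record batch is ready
            }
        }
    }

    // Builds a stream of `dictionaryBatches` dictionary messages followed by one record batch.
    static SketchStream streamWith(int dictionaryBatches) {
        Deque<MessageType> queue = new ArrayDeque<>();
        for (int i = 0; i < dictionaryBatches; i++) {
            queue.add(MessageType.DICTIONARY_BATCH);
        }
        queue.add(MessageType.RECORD_BATCH);
        return new SketchStream(queue);
    }

    public static void main(String[] args) {
        SketchStream stream = streamWith(1_000_000);
        boolean hasBatch = stream.nextIterative();
        System.out.println(hasBatch + " after " + stream.dictionariesLoaded + " dictionary batches");
    }
}
```

With the iterative variant, a million consecutive dictionary messages are consumed without deepening the call stack. Calling `nextRecursive()` on the same input would need a million nested frames, which would most likely exceed a default-sized thread stack.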

Before I forget, here's the version I'm using:

        <dependency>
            <groupId>org.apache.arrow</groupId>
            <artifactId>flight-sql</artifactId>
            <version>11.0.0</version>
        </dependency>

Component(s)

Java
