Skip to content

Conversation

@joellubi
Copy link
Member

@joellubi joellubi commented Oct 12, 2023

Rationale for this change

It was suggested in the discussion around supporting generic ingest for the Flight SQL ADBC driver that an "Ingest" command would be a helpful addition to the Flight SQL specification. This command would enable a Flight SQL client to provide a FlightData stream to the server without needing to know its SQL syntax, and have that stream loaded into a target table by whichever means the server deems appropriate.

What changes are included in this PR?

  • Add CommandStatementIngest message type to Flight SQL proto definition
  • Add INGEST_TRANSACTIONS_SUPPORTED option for SqlInfo

Are these changes tested?

Yes, see implementation PRs:

Are there any user-facing changes?

Yes, a new command is added to the Flight SQL spec.

@github-actions
Copy link

⚠️ GitHub issue #38255 has been automatically assigned in GitHub to PR creator.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to review .proto definition in this PR, right?
Then we don't need to generate go/arrow/flight/gen/flight/*.go in this PR. We can do it in a separated PR that implements Go version of this change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to use 1000 instead of 7 here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that options can be used for any parameters not otherwise supported by the spec, which makes it different from any one parameter. By giving this a high number, it leaves room for more explicit parameters to be added to the spec at lower numbers as the spec evolves.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Could you add a comment how to add a new explicit parameter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I've just added a comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this rationale. What would it change to add new parameters at higher numbers, rather than at lower numbers?

Comment on lines 1817 to 1822
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

target_ prefix may be redundant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was loosely following the existing conventions in StatementOptions which uses the target_ prefix. I agree though that it doesn't actually improve clarity here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the target_ prefix as it doesn't add much information in this context.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Oct 17, 2023
@joellubi
Copy link
Member Author

We want to review .proto definition in this PR, right? Then we don't need to generate go/arrow/flight/gen/flight/*.go in this PR. We can do it in a separated PR that implements Go version of this change.

Thanks for your review @kou. I would be happy to split this up into multiple PRs if that would be an acceptable path. I am new to the repo and saw other similar PRs including 2 implementations and integration tests along with the .proto changes, so I wanted to make sure I was following the standards. My intention is to add C++ and integration tests to this PR once there is some consensus around the .proto definition. If it is possible to get approval for .proto and implementations separately, that may be a smoother review process. Would that be a reasonable approach to take here?

@kou
Copy link
Member

kou commented Oct 18, 2023

We can use either approaches ((1) one PR that includes .proto change, 2 implementations and integration tests or (2) 3 PRs for (2.1) .proto, (2.2) 1 implementation and integration tests and (2.3) another implementation and integration tests). We don't have a requirement for this. See also: https://arrow.apache.org/docs/format/Changing.html

As you said, the latter will be smooth to review. The former may be easy to implement because the PR author can work on only one branch. (We may find a .proto problem when we implement the feature and integration tests. In the case, we need to fix .proto and 2 implementations in one branch or three branches.)

@joellubi joellubi force-pushed the add-flight-sql-ingestion branch from 9369655 to d962c22 Compare October 19, 2023 10:16
@github-actions github-actions bot added awaiting change review Awaiting change review and removed Component: Go awaiting changes Awaiting changes labels Oct 19, 2023
@joellubi
Copy link
Member Author

(2) 3 PRs for (2.1) .proto, (2.2) 1 implementation and integration tests and (2.3) another implementation and integration tests).

Thanks for clarifying. I've elected to go with this approach in which the format and implementation changes are in separate PRs. Just so I understand the next steps in the process, is the procedure to review and merge the 3 PRs one-by-one or will they all need to be open and reviewed at the same time?

@joellubi joellubi changed the title GH-38255: [Format][Go] Add Flight SQL Command for Bulk Ingestion GH-38255: [Format] Add Flight SQL Command for Bulk Ingestion Oct 22, 2023
@joellubi joellubi marked this pull request as ready for review October 22, 2023 10:32
@joellubi joellubi requested a review from kou October 22, 2023 10:32
@joellubi
Copy link
Member Author

@kou I've opened a second PR containing the go implementation and integration tests (#38385) which should help provide a better idea of how these changes will function. I would appreciate any feedback on the format/implementation up to this point.

@kou
Copy link
Member

kou commented Oct 22, 2023

Just so I understand the next steps in the process, is the procedure to review and merge the 3 PRs one-by-one or will they all need to be open and reviewed at the same time?

The latter.
We need at least 2 implementations and integration tests with them before we vote this proposal.
We can't merge this proposal (all 3 PRs) without a passed vote.

See also: https://arrow.apache.org/docs/format/Changing.html

@joellubi
Copy link
Member Author

joellubi commented Nov 3, 2023

Superseded by #38385. I've decided to consolidate all implementations in a single PR.

@joellubi joellubi closed this Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting change review Awaiting change review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Format][Flight SQL] Add Flight SQL Command for Bulk Ingestion

3 participants