-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-14421: [C++] Implement Flight SQL #11507
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks for opening a pull request! If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename pull request title in the following format? or See also: |
|
@lidavidm - was just about to ping you on this but see that you already noticed it. Note that the C++ Server is missing a small bit of functionality but the client is there, so we should be in good shape to be near a vote. |
|
@lidavidm , just saw the linter pointed some code style issues, already fixed them |
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for also implementing the server & a server example!
Most of these comments are just style/naming/Arrow codebase conventions. The main thing I'm hoping we can resolve is the templating of the client - I added some suggestions there. The other thing is whether we want to wrap Protobuf types, which I realize would be tedious, but so far we've avoided leaking Protobuf as part of the Arrow APIs and I think we may want to preserve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no new gRPC definitions, right? We don't need to generate gRPC sources (and we don't need to depend on grpc_cpp_plugin above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unused here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should guard these definitions behind ARROW_BUILD_TESTS or ARROW_BUILD_EXAMPLES or similar, so that we don't require everyone to have SQLite3 installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to guard these definitions because flight_sql_test_app uses Googletest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these not already in ARROW_TEST_LINK_LIBS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems odd that the test has to be separately linked to the generated Protobuf sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this go in ARROW_FLIGHT_SQL_SRCS above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The analogous files are omitted for Flight since they get #included into another file, but that doesn't apply here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A very strange bug occurred when we tried to link them together and also when we tried following the same approach as Flight (through protocol_internal.cc), the protobuf objects were not behaving correctly when a message had multiple string fields, my understanding of what went wrong is this:
The problem was that as Arrow Flight implicitly static-links Protobuf (when compiling protocol_internal.cc), Arrow Flight SQL was linking it too by the same means and this ended up causing a one definition rule violation on an internal symbol of protobuf (google::protobuf::internal::GetEmptyStringAlreadyInited())
Other person reported the exactly same issue here: protocolbuffers/protobuf#6031
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, unfortunate. Arrow can build against both static and dynamic protobuf. Looking at it, even when dynamically linking to Protobuf, the problematic GetEmptyStringAlreadyInited symbol mentioned in that issue still appears in libarrow_flight.so.
One way would be to fold FlightSQL into libarrow_flight.so/libarrow_flight.a, that way there is no need to have two separate libraries that can both link Protobuf. (I wonder how you are supposed to be able to build multiple things relying on Protobuf though…)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, or really, just link all the FlightSQL sources into libarrow_flight.
Note that the FlightSQL protobuf sources need not be #included like protocol_internal.cc does - that's only done because we need to override things. In this case, they can just be specified in CMake like all the other sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @lidavidm , I tried folding FlightSql into libarrow_flight, but it didn't work... I tried these two approaches:
1 - include FlightSql on protocol_internal:
#include "arrow/flight/Flight.grpc.pb.cc" // NOLINT
#include "arrow/flight/Flight.pb.cc" // NOLINT
+ #include "arrow/flight/FlightSql.pb.cc" // NOLINTThis doesn't compile due to redefinition errors:
error: redefinition of 'const google::protobuf::internal::MigrationSchema schemas []'
2 - include FlightSql generated sources on CMakeLists:
set(ARROW_FLIGHT_SRCS
client.cc
client_cookie_middleware.cc
client_header_internal.cc
internal.cc
protocol_internal.cc
+ "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
serialization_internal.cc
server.cc
server_auth.cc
types.cc)This compiles, but causes the same error as before (duplicate protobuf Symbol GetEmptyStringAlreadyInited)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, was this merging only the generated Protobuf into libarrow_flight, or all the new files? I think it needs to be the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, as it stands it looks like we have to compile the generated sources into the test executable - so what does that mean for future library users? Will they have to generate and compile the sources into their own executables too? (It seems so, given the example CLI and server.) I don't think we can ship something like that - so we need to figure out the root issue. I'll try to take a look at this today as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to fix it for me, by avoiding linking Protobuf in multiple times:
@rafael-telles does this work for you?
diff --git a/cpp/src/arrow/flight/CMakeLists.txt b/cpp/src/arrow/flight/CMakeLists.txt
index 309e5a968..66776424e 100644
--- a/cpp/src/arrow/flight/CMakeLists.txt
+++ b/cpp/src/arrow/flight/CMakeLists.txt
@@ -27,11 +27,14 @@ endif()
if(ARROW_TEST_LINKAGE STREQUAL "static")
set(ARROW_FLIGHT_TEST_LINK_LIBS
- arrow_flight_static arrow_flight_testing_static ${ARROW_FLIGHT_STATIC_LINK_LIBS}
+ arrow_flight_static
+ arrow_flight_testing_static
+ ${ARROW_FLIGHT_STATIC_LINK_LIBS}
+ gRPC::grpc++
${ARROW_TEST_LINK_LIBS})
else()
set(ARROW_FLIGHT_TEST_LINK_LIBS arrow_flight_shared arrow_flight_testing_shared
- ${ARROW_TEST_LINK_LIBS})
+ gRPC::grpc++ ${ARROW_TEST_LINK_LIBS})
endif()
# TODO(wesm): Protobuf shared vs static linking
@@ -165,6 +168,7 @@ add_arrow_lib(arrow_flight
${ARROW_VERSION_SCRIPT_FLAGS} # Defined in cpp/arrow/CMakeLists.txt
SHARED_LINK_LIBS
arrow_shared
+ SHARED_PRIVATE_LINK_LIBS
${ARROW_FLIGHT_LINK_LIBS}
STATIC_LINK_LIBS
arrow_static
diff --git a/cpp/src/arrow/flight/flight_sql/CMakeLists.txt b/cpp/src/arrow/flight/flight_sql/CMakeLists.txt
index fd453ccc5..8fd852909 100644
--- a/cpp/src/arrow/flight/flight_sql/CMakeLists.txt
+++ b/cpp/src/arrow/flight/flight_sql/CMakeLists.txt
@@ -36,7 +36,8 @@ set_source_files_properties(${FLIGHT_SQL_GENERATED_PROTO_FILES} PROPERTIES GENER
add_custom_target(flight_sql_protobuf_gen ALL DEPENDS ${FLIGHT_SQL_GENERATED_PROTO_FILES})
-set(ARROW_FLIGHT_SQL_SRCS server.cc client.cc client_internal.cc)
+set(ARROW_FLIGHT_SQL_SRCS server.cc client.cc client_internal.cc
+ "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
add_arrow_lib(arrow_flight_sql
CMAKE_PACKAGE_NAME
@@ -51,10 +52,13 @@ add_arrow_lib(arrow_flight_sql
"$<$<COMPILE_LANGUAGE:CXX>:arrow/flight/flight_sql/pch.h>"
DEPENDENCIES
flight_sql_protobuf_gen
+ toolchain
SHARED_LINK_FLAGS
${ARROW_VERSION_SCRIPT_FLAGS} # Defined in cpp/arrow/CMakeLists.txt
SHARED_LINK_LIBS
arrow_flight_shared
+ SHARED_PRIVATE_LINK_LIBS
+ ${ARROW_PROTOBUF_LIBPROTOBUF}
STATIC_LINK_LIBS
arrow_flight_static)
@@ -75,7 +79,6 @@ target_link_libraries(flight_sql_test_app PRIVATE arrow_flight_sql_static
add_arrow_test(flight_sql_client_test
SOURCES
client_test.cc
- "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
STATIC_LINK_LIBS
${ARROW_FLIGHT_SQL_TEST_LINK_LIBS}
LABELS
@@ -85,7 +88,6 @@ add_arrow_test(flight_sql_server_test
SOURCES
server_test.cc
client_internal.cc
- "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
STATIC_LINK_LIBS
${ARROW_FLIGHT_SQL_TEST_LINK_LIBS}
LABELS
@@ -102,8 +104,7 @@ if(ARROW_BUILD_TESTS OR ARROW_BUILD_EXAMPLES)
example/sqlite_statement.cc
example/sqlite_statement_batch_reader.cc
example/sqlite_server.cc
- example/sqlite_tables_schema_batch_reader.cc
- "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
+ example/sqlite_tables_schema_batch_reader.cc)
target_link_libraries(flight_sql_test_server
PRIVATE arrow_flight_sql_static ${GFLAGS_LIBRARIES}
${SQLite3_LIBRARIES})
diff --git a/cpp/vcpkg.json b/cpp/vcpkg.json
index 11ce5250d..1aee71eb8 100644
--- a/cpp/vcpkg.json
+++ b/cpp/vcpkg.json
@@ -17,7 +17,9 @@
"benchmark",
"boost-filesystem",
"boost-multiprecision",
+ "boost-process",
"boost-system",
+ "boost-uuid",
"brotli",
"bzip2",
"c-ares",
@@ -40,8 +42,5 @@
"zlib",
"zstd"
],
- "overrides": [
- { "name": "gtest", "version": "1.10.0", "port-version": 4 }
- ],
- "builtin-baseline": "b5865bfeba430cd44063fc54a45a8861195a715f"
+ "builtin-baseline": "85b31152df8da18be5ec2b0dbd7b39b0d2404db4"
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sqlite3_stmt* sqlite_statement() const;?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this file has extension cpp, should have extension cc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ARROW_ASSIGN_OR_RAISE is probably what you want here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also you can do something like ARROW_ASSIGN_OR_RAISE(*batch, first_batch->AddColumn(...));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ticket_parsed
|
Sorry, just a general note: I think we prefer underscores to dashes in the file naming so ideally it would be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to guard these definitions because flight_sql_test_app uses Googletest.
cpp/src/arrow/flight/flight-sql/example/sqlite_statement_batch_reader.cc
Outdated
Show resolved
Hide resolved
cpp/src/arrow/flight/flight-sql/example/sqlite_statement_batch_reader.cc
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use ARROW_FLIGHT_SQL_TEST_LINK_LIBS instead of hardcoding arrow_flight_sql_static
|
So far I've not been able to reproduce the problem with Protobuf, this is with a dynamically linked build where the generated protobuf sources are compiled into libarrow_flight_sql.so. (A fully statically linked build fails for me while building gRPC; a build where only libarrow_XXX.a are statically linked does pass tests.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This depends on CMake 3.14+; also we should set REQUIRED to prevent errors later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of our CMake files don't require anything that new, so we might want to provide cmake_modules/FindSqlite3Alt.cmake or similar. (At the very least we need to declare that dependency here.)
|
Alright, I got a fully statically linked build and can't reproduce the Protobuf issue, with this diff: diff --git a/cpp/src/arrow/flight/flight-sql/CMakeLists.txt b/cpp/src/arrow/flight/flight-sql/CMakeLists.txt
index 7491896fe..f4b6a0546 100644
--- a/cpp/src/arrow/flight/flight-sql/CMakeLists.txt
+++ b/cpp/src/arrow/flight/flight-sql/CMakeLists.txt
@@ -36,7 +36,7 @@ set_source_files_properties(${FLIGHT_SQL_GENERATED_PROTO_FILES} PROPERTIES GENER
add_custom_target(flight_sql_protobuf_gen ALL DEPENDS ${FLIGHT_SQL_GENERATED_PROTO_FILES})
-set(ARROW_FLIGHT_SQL_SRCS server.cc)
+set(ARROW_FLIGHT_SQL_SRCS server.cc "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
add_arrow_lib(arrow_flight_sql
CMAKE_PACKAGE_NAME
@@ -69,8 +69,7 @@ else()
${ARROW_TEST_LINK_LIBS})
endif()
-add_executable(flight_sql_test_app test_app_cli.cc
- ${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc)
+add_executable(flight_sql_test_app test_app_cli.cc)
target_link_libraries(flight_sql_test_app PRIVATE arrow_flight_sql_static
${GFLAGS_LIBRARIES})
@@ -78,7 +77,6 @@ add_arrow_test(flight_sql_test
SOURCES
client_test.cc
server_test.cc
- "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
STATIC_LINK_LIBS
${ARROW_FLIGHT_SQL_TEST_LINK_LIBS}
LABELS
@@ -95,8 +93,7 @@ if(ARROW_BUILD_TESTS OR ARROW_BUILD_EXAMPLES)
example/sqlite_statement.cc
example/sqlite_statement_batch_reader.cc
example/sqlite_server.cc
- example/sqlite_tables_schema_batch_reader.cpp
- "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
+ example/sqlite_tables_schema_batch_reader.cpp)
target_link_libraries(flight_sql_test_server
PRIVATE arrow_flight_sql_static ${GFLAGS_LIBRARIES}
${SQLite3_LIBRARIES}) |
|
Hey @lidavidm ! |
|
I see, thanks. I'll try with some more build configurations. Do you have the output from CMake to share (if it's not sensitive)? It might help make sure I have my setup/dependencies the same way as yours. |
|
Sure! Here it is: https://gist.github.com/rafael-telles/14e39ae4d8856e78b5517bf69960843d Thank you so much for helping us with this issue :) |
|
Hey @lidavidm , we managed to remove the templating on client! Can you please take a look? Still need to fix the protobuf issue |
|
Yes - sorry, I've been tied up this week but will get back to this ASAP. Thanks for addressing things. I also want to look at the Protobuf issue still. |
|
That's OK! Now that we addresses most of the other comments I will try to solve this too. Thank you so much! |
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the updates. I think we're getting there. Here's a second round of feedback.
I didn't look too closely at a lot of the more boilerplate-y parts - I'll give those a closer look when I can. (For instance: lots of the any.Is<Foo>() dispatching).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable doesn't appear to ever be set - is it needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Or was it meant to be ARROW_FLIGHT_LINK_LIBS?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: try not to include "arrow/api.h" in a public header as it can slow compilation significantly - either include the headers needed, using type_fwd.h where possible, or forward-declare things as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you should need only arrow/status.h from a quick glance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Though, please add <string>, <memory>, etc. for std::string, std::shared_ptr, std::unique_ptr, et. al.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should note that it's best to explicitly close the statement or else errors can't be caught.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, we should parse the schemas up front in the constructor, then make these const getters like std::shared_ptr<Schema> result_set_schema() const;. Since we need to handle errors, we could pImpl PreparedStatement and handle it in a static factory, or do the parsing in Prepare.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also if we do the parsing in Prepare I think we can get rid of having to pass the Protobuf to the constructor here, which obviates the comment about not exposing the Protobuf types in the public header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, hmm. Is this legal? It feels like we're violating ODR…though it certainly appears to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @lidavidm ! Can you please clarify?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're defining FlightClientImpl twice and just using the fact that only one definition is visible at a time, trusting that the compiler/linker do the right thing. I'm not sure we can rely on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect that the object files for client.cc and client_test.cc both contain symbols for FlightClientImpl, and it just so happens that the linker picks the mock when compiling the test binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That said, my understanding of this is shaky and maybe there is a good justification for why this works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry - I want to try some alternative approaches but haven't found the time. At this rate it may be Friday before I can get into things…)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, sorry, can we avoid using FlightClientImpl like this? I don't think this will be maintainable/I think this will be brittle. We can do something like the following:
-
Make
FlightSqlClientan abstract base class, with an implementationtemplate <typename Client> class FlightSqlClientImpl : public FlightSqlClient { ... };inclient_internal.h. Then inclient.ccwe can define a factory method that instantiates and returnsFlightSqlClientImpl<FlightClient>and inclient_test.ccwe can instantiateFlightSqlClientImpl<MockedFlightClient>.This way the parameterization on the client impl is explicit and we are not relying on linker magic.
-
Alternatively, instead of mocking, we can just implement in-process dummy servers to provide the expected reply. This is definitely more code but avoids having to parameterize the client.
-
Alternatively, we can extract a
FlightClientInterfaceor refactorFlightClientso it has virtual methods and is directly mockable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @lidavidm , thank you for the suggestions!
I think we can do similar to the first suggestion:
Make FlightSqlClient as...
class ARROW_EXPORT FlightSqlClient {
private:
std::shared_ptr<FlightClient> client_;
public:
explicit FlightSqlClient(std::shared_ptr<FlightClient> client);
~FlightSqlClient();
virtual Status GetFlightInfo(const FlightCallOptions& options,
const FlightDescriptor& descriptor,
std::unique_ptr<FlightInfo>* info) {
return client_->GetFlightInfo(options, descriptor, info);
}
// Do the same with other FlightClient methods calledThen we can mock these virtual methods(only the ones we simply forward to FlightClient) from FlightSqlClient instead of FlightClient/FlightClientImpl on client_test.cc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea! They could be made protected methods too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done @lidavidm !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^still wondering about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry if it wasn't clear - we need to do delete or use unique_ptr, since AddressSanitizer-enabled builds will fail here since we're pairing new and free.
|
It looks like we'll have to merge the Java PR first when time comes since the current Java Protobuf setup can't handle the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that Doxygen complains here since only one of the parameters is documented.
|
Regarding the Protobuf issue: I switched to vcpkg as that appears to be what you use, but I can't replicate still. What OS and compiler are you using? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: sql_client
@lidavidm , yes, I am using vcpkg. Thanks for your help @lidavidm , please let me know if any other info is helpful! Working on the other comments |
* Make FlightSqlClient::GetFlightInfo return a Result instead of Status * Make most methods on FlightSqlServerBase to return a Result instead of Status * Move private functions on server.cc to anonymous namespace * Fixes on doxygen and better readability on server_test.cc * Rename fields on client_test.cc to follow the convention
* Make other methods from SQLite server example to return arrow::Result instead of Status * Fix bug for null values in numeric columns on SQLite server example * Add comment regarding to performance on sqlite_statement_batch_reader * Separate GetSqlInfoResultMap from sqlite_server.h * Remove unused parameter on DoPutCommandStatementUpdate
…#220) * Make sure to wait for server to be ready before running tests * Start server independently for each test * Use unique_ptr for server thread on server_test.cc
* Make other methods from SQLite server example to return arrow::Result instead of Status * Fix bug for null values in numeric columns on SQLite server example * Structure catalog-schema-table tuple on a TableRef struct on client * Rename 'schema' to 'db_schema' * Use TableRef struct on server.cc * Undo renaming db_schema_filter_pattern
480c769 to
2b3839a
Compare
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think things are pretty much done here. @pitrou anything else?
| TEST_F(TestFlightSqlServer, TestCommandGetSqlInfoNoInfo) { | ||
| FlightCallOptions call_options; | ||
| ASSERT_OK_AND_ASSIGN(auto flight_info, sql_client->GetSqlInfo(call_options, {999999})); | ||
| auto result = sql_client->DoGet(call_options, flight_info->endpoints()[0].ticket); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: you can make this ASSERT_RAISES_WITH_MESSAGE_THAT(KeyError, ::testing::HasSubstr(...), sql_client->DoGet(...));
…dGetSqlInfoNoInfo
… SQLiteFlightSqlServer::Impl and remove dependency of boost-uuid (apache#221) * Use pimpl idiom on sqlite_server and add comments on protobuf file * Correctly implement impl pattern
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question from me.
| {"acos", "acosh", "asin", "asinh", "atan", "atan2", "atanh", "ceil", | ||
| "ceiling", "cos", "cosh", "degrees", "exp", "floor", "ln", "log", | ||
| "log", "log10", "log2", "mod", "pi", "pow", "power", "radians", | ||
| "sin", "sinh", "sqrt", "tan", "tanh", "trunc"}))}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious: any particular reason these are lowercase, unlike the other entries here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No particular reason! Changed to keep consistency
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some nits, then let's start the vote. Also, don't forget to sync Protobuf changes to the Java PR.
format/FlightSql.proto
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @rafael-telles this and https://github.com/apache/arrow/pull/11507/files#r756031574 should be added to the docstrings here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raising this again - iequals is used below, but not lexical_cast
|
Done @lidavidm ! Sorry I forgot those two. Will update the Java PR now |
|
Looks like all tests are passing here and on the Java PR! Thanks to Rafael & co. for iterating on things and to @pitrou, @emkornfield, &co. for reviews. Unless there are further comments, I'll submit the vote to the ML later today (EST). |
Implement Flight SQL in C++ and Java, and add the protocol definitions. This is a combination of multiple pull requests, merged into one branch before merging into master. Closes apache#10906 (ARROW-12922). Closes apache#11507 (ARROW-14421). Closes apache#11989 (ARROW-15112). Closes apache#12021 (ARROW-15187). Closes apache#12035 (ARROW-15198). Closes apache#12013 from apache/flight-sql Lead-authored-by: Rafael Telles <[email protected]> Co-authored-by: Abner Eduardo Ferreira <[email protected]> Co-authored-by: James Duong <[email protected]> Co-authored-by: Jose Almeida <[email protected]> Co-authored-by: Juscelino Junior <[email protected]> Co-authored-by: Kyle Porter <[email protected]> Co-authored-by: Ryan Nicholson <[email protected]> Co-authored-by: Vinicius Fraga <[email protected]> Co-authored-by: tifflhl <[email protected]> Signed-off-by: David Li <[email protected]>
Introduction
This experimental PR implements Flight SQL for C++, which formalizes SQL semantics on top of Flight. This follows the same functionalities as in the Java implementation (#10906)
An Overview of this PR
This PR adds a new module within Flight called flight-sql (located on
cpp/src/arrow/flight/flight-sql).flight-sql is a new Flight API that provides a standard way for clients and servers to communicate with SQL-like semantics.
Like other Flight APIs, flight-sql does not provide implementation details that dictate how a client and server communicates with each other, it simply provides the SQL semantics and apply them onto the Flight API.
A Walkthrough of the New Module
FlightSql.proto introduces new SQL protobuf objects (the same as in #10906).
FlightSqlClient (client.h) introduces a new wrapper for a FlightClient that adds the Flight SQL semantics on the client side.
FlightSqlServerBase (server.h) introduces a new FlightServerBase API that adapts classic Flight requests into SQL operations.
SQLiteFlightSqlServer (example/sqlite_server.h) is a sample Flight SQL server implementation using an in-memory SQLite as the underlying database.
FlightSqlClient is fully implemented.
FlightSqlServerBase and the example server are in progress (only missing the GetSqlInfo command) and will be finished very soon.