Skip to content

Conversation

@rafael-telles
Copy link
Contributor

Introduction
This experimental PR implements Flight SQL for C++, which formalizes SQL semantics on top of Flight. This follows the same functionalities as in the Java implementation (#10906)

An Overview of this PR
This PR adds a new module within Flight called flight-sql (located on cpp/src/arrow/flight/flight-sql).
flight-sql is a new Flight API that provides a standard way for clients and servers to communicate with SQL-like semantics.
Like other Flight APIs, flight-sql does not provide implementation details that dictate how a client and server communicates with each other, it simply provides the SQL semantics and apply them onto the Flight API.

A Walkthrough of the New Module
FlightSql.proto introduces new SQL protobuf objects (the same as in #10906).
FlightSqlClient (client.h) introduces a new wrapper for a FlightClient that adds the Flight SQL semantics on the client side.
FlightSqlServerBase (server.h) introduces a new FlightServerBase API that adapts classic Flight requests into SQL operations.
SQLiteFlightSqlServer (example/sqlite_server.h) is a sample Flight SQL server implementation using an in-memory SQLite as the underlying database.

FlightSqlClient is fully implemented.
FlightSqlServerBase and the example server are in progress (only missing the GetSqlInfo command) and will be finished very soon.

@github-actions
Copy link

Thanks for opening a pull request!

If this is not a minor PR. Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW

Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename pull request title in the following format?

ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

See also:

@rafael-telles
Copy link
Contributor Author

@kylep-dremio

@rafael-telles rafael-telles changed the title [C++] Implement Flight SQL [ARROW-14421] [C++] Implement Flight SQL Oct 21, 2021
@lidavidm lidavidm self-requested a review October 21, 2021 16:43
@kylepbit
Copy link

@lidavidm - was just about to ping you on this but see that you already noticed it. Note that the C++ Server is missing a small bit of functionality but the client is there, so we should be in good shape to be near a vote.

@rafael-telles rafael-telles changed the title [ARROW-14421] [C++] Implement Flight SQL ARROW-14421: [C++] Implement Flight SQL Oct 21, 2021
@github-actions
Copy link

@rafael-telles
Copy link
Contributor Author

@lidavidm , just saw the linter pointed some code style issues, already fixed them

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for also implementing the server & a server example!

Most of these comments are just style/naming/Arrow codebase conventions. The main thing I'm hoping we can resolve is the templating of the client - I added some suggestions there. The other thing is whether we want to wrap Protobuf types, which I realize would be tedious, but so far we've avoided leaking Protobuf as part of the Arrow APIs and I think we may want to preserve that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no new gRPC definitions, right? We don't need to generate gRPC sources (and we don't need to depend on grpc_cpp_plugin above)

Comment on lines 43 to 47
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks unused here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should guard these definitions behind ARROW_BUILD_TESTS or ARROW_BUILD_EXAMPLES or similar, so that we don't require everyone to have SQLite3 installed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to guard these definitions because flight_sql_test_app uses Googletest.

Comment on lines 89 to 90
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these not already in ARROW_TEST_LINK_LIBS?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems odd that the test has to be separately linked to the generated Protobuf sources.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this go in ARROW_FLIGHT_SQL_SRCS above?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The analogous files are omitted for Flight since they get #included into another file, but that doesn't apply here.)

Copy link
Contributor Author

@rafael-telles rafael-telles Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very strange bug occurred when we tried to link them together and also when we tried following the same approach as Flight (through protocol_internal.cc), the protobuf objects were not behaving correctly when a message had multiple string fields, my understanding of what went wrong is this:

The problem was that as Arrow Flight implicitly static-links Protobuf (when compiling protocol_internal.cc), Arrow Flight SQL was linking it too by the same means and this ended up causing a one definition rule violation on an internal symbol of protobuf (google::protobuf::internal::GetEmptyStringAlreadyInited())

Other person reported the exactly same issue here: protocolbuffers/protobuf#6031

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, unfortunate. Arrow can build against both static and dynamic protobuf. Looking at it, even when dynamically linking to Protobuf, the problematic GetEmptyStringAlreadyInited symbol mentioned in that issue still appears in libarrow_flight.so.

One way would be to fold FlightSQL into libarrow_flight.so/libarrow_flight.a, that way there is no need to have two separate libraries that can both link Protobuf. (I wonder how you are supposed to be able to build multiple things relying on Protobuf though…)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, or really, just link all the FlightSQL sources into libarrow_flight.

Note that the FlightSQL protobuf sources need not be #included like protocol_internal.cc does - that's only done because we need to override things. In this case, they can just be specified in CMake like all the other sources.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lidavidm , I tried folding FlightSql into libarrow_flight, but it didn't work... I tried these two approaches:

1 - include FlightSql on protocol_internal:

  #include "arrow/flight/Flight.grpc.pb.cc"  // NOLINT
  #include "arrow/flight/Flight.pb.cc"       // NOLINT
+ #include "arrow/flight/FlightSql.pb.cc"    // NOLINT

This doesn't compile due to redefinition errors:

error: redefinition of 'const google::protobuf::internal::MigrationSchema schemas []'

2 - include FlightSql generated sources on CMakeLists:

  set(ARROW_FLIGHT_SRCS
      client.cc
      client_cookie_middleware.cc
      client_header_internal.cc
      internal.cc
      protocol_internal.cc
+     "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
      serialization_internal.cc
      server.cc
      server_auth.cc
      types.cc)

This compiles, but causes the same error as before (duplicate protobuf Symbol GetEmptyStringAlreadyInited)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, was this merging only the generated Protobuf into libarrow_flight, or all the new files? I think it needs to be the latter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, as it stands it looks like we have to compile the generated sources into the test executable - so what does that mean for future library users? Will they have to generate and compile the sources into their own executables too? (It seems so, given the example CLI and server.) I don't think we can ship something like that - so we need to figure out the root issue. I'll try to take a look at this today as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This appears to fix it for me, by avoiding linking Protobuf in multiple times:

@rafael-telles does this work for you?

diff --git a/cpp/src/arrow/flight/CMakeLists.txt b/cpp/src/arrow/flight/CMakeLists.txt
index 309e5a968..66776424e 100644
--- a/cpp/src/arrow/flight/CMakeLists.txt
+++ b/cpp/src/arrow/flight/CMakeLists.txt
@@ -27,11 +27,14 @@ endif()
 
 if(ARROW_TEST_LINKAGE STREQUAL "static")
   set(ARROW_FLIGHT_TEST_LINK_LIBS
-      arrow_flight_static arrow_flight_testing_static ${ARROW_FLIGHT_STATIC_LINK_LIBS}
+      arrow_flight_static
+      arrow_flight_testing_static
+      ${ARROW_FLIGHT_STATIC_LINK_LIBS}
+      gRPC::grpc++
       ${ARROW_TEST_LINK_LIBS})
 else()
   set(ARROW_FLIGHT_TEST_LINK_LIBS arrow_flight_shared arrow_flight_testing_shared
-                                  ${ARROW_TEST_LINK_LIBS})
+                                  gRPC::grpc++ ${ARROW_TEST_LINK_LIBS})
 endif()
 
 # TODO(wesm): Protobuf shared vs static linking
@@ -165,6 +168,7 @@ add_arrow_lib(arrow_flight
               ${ARROW_VERSION_SCRIPT_FLAGS} # Defined in cpp/arrow/CMakeLists.txt
               SHARED_LINK_LIBS
               arrow_shared
+              SHARED_PRIVATE_LINK_LIBS
               ${ARROW_FLIGHT_LINK_LIBS}
               STATIC_LINK_LIBS
               arrow_static
diff --git a/cpp/src/arrow/flight/flight_sql/CMakeLists.txt b/cpp/src/arrow/flight/flight_sql/CMakeLists.txt
index fd453ccc5..8fd852909 100644
--- a/cpp/src/arrow/flight/flight_sql/CMakeLists.txt
+++ b/cpp/src/arrow/flight/flight_sql/CMakeLists.txt
@@ -36,7 +36,8 @@ set_source_files_properties(${FLIGHT_SQL_GENERATED_PROTO_FILES} PROPERTIES GENER
 
 add_custom_target(flight_sql_protobuf_gen ALL DEPENDS ${FLIGHT_SQL_GENERATED_PROTO_FILES})
 
-set(ARROW_FLIGHT_SQL_SRCS server.cc client.cc client_internal.cc)
+set(ARROW_FLIGHT_SQL_SRCS server.cc client.cc client_internal.cc
+                          "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
 
 add_arrow_lib(arrow_flight_sql
               CMAKE_PACKAGE_NAME
@@ -51,10 +52,13 @@ add_arrow_lib(arrow_flight_sql
               "$<$<COMPILE_LANGUAGE:CXX>:arrow/flight/flight_sql/pch.h>"
               DEPENDENCIES
               flight_sql_protobuf_gen
+              toolchain
               SHARED_LINK_FLAGS
               ${ARROW_VERSION_SCRIPT_FLAGS} # Defined in cpp/arrow/CMakeLists.txt
               SHARED_LINK_LIBS
               arrow_flight_shared
+              SHARED_PRIVATE_LINK_LIBS
+              ${ARROW_PROTOBUF_LIBPROTOBUF}
               STATIC_LINK_LIBS
               arrow_flight_static)
 
@@ -75,7 +79,6 @@ target_link_libraries(flight_sql_test_app PRIVATE arrow_flight_sql_static
 add_arrow_test(flight_sql_client_test
                SOURCES
                client_test.cc
-               "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
                STATIC_LINK_LIBS
                ${ARROW_FLIGHT_SQL_TEST_LINK_LIBS}
                LABELS
@@ -85,7 +88,6 @@ add_arrow_test(flight_sql_server_test
                SOURCES
                server_test.cc
                client_internal.cc
-               "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
                STATIC_LINK_LIBS
                ${ARROW_FLIGHT_SQL_TEST_LINK_LIBS}
                LABELS
@@ -102,8 +104,7 @@ if(ARROW_BUILD_TESTS OR ARROW_BUILD_EXAMPLES)
                  example/sqlite_statement.cc
                  example/sqlite_statement_batch_reader.cc
                  example/sqlite_server.cc
-                 example/sqlite_tables_schema_batch_reader.cc
-                 "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
+                 example/sqlite_tables_schema_batch_reader.cc)
   target_link_libraries(flight_sql_test_server
                         PRIVATE arrow_flight_sql_static ${GFLAGS_LIBRARIES}
                                 ${SQLite3_LIBRARIES})
diff --git a/cpp/vcpkg.json b/cpp/vcpkg.json
index 11ce5250d..1aee71eb8 100644
--- a/cpp/vcpkg.json
+++ b/cpp/vcpkg.json
@@ -17,7 +17,9 @@
     "benchmark",
     "boost-filesystem",
     "boost-multiprecision",
+    "boost-process",
     "boost-system",
+    "boost-uuid",
     "brotli",
     "bzip2",
     "c-ares",
@@ -40,8 +42,5 @@
     "zlib",
     "zstd"
   ],
-  "overrides": [
-    { "name": "gtest", "version": "1.10.0", "port-version": 4 }
-  ],
-  "builtin-baseline": "b5865bfeba430cd44063fc54a45a8861195a715f"
+  "builtin-baseline": "85b31152df8da18be5ec2b0dbd7b39b0d2404db4"
 }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sqlite3_stmt* sqlite_statement() const;?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this file has extension cpp, should have extension cc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ARROW_ASSIGN_OR_RAISE is probably what you want here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also you can do something like ARROW_ASSIGN_OR_RAISE(*batch, first_batch->AddColumn(...));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ticket_parsed

@lidavidm
Copy link
Member

Sorry, just a general note: I think we prefer underscores to dashes in the file naming so ideally it would be cpp/src/arrow/flight/flight_sql and not cpp/src/arrow/flight/flight-sql (there are a few exceptions due to vendored sources).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to guard these definitions because flight_sql_test_app uses Googletest.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use ARROW_FLIGHT_SQL_TEST_LINK_LIBS instead of hardcoding arrow_flight_sql_static

@lidavidm
Copy link
Member

So far I've not been able to reproduce the problem with Protobuf, this is with a dynamically linked build where the generated protobuf sources are compiled into libarrow_flight_sql.so. (A fully statically linked build fails for me while building gRPC; a build where only libarrow_XXX.a are statically linked does pass tests.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends on CMake 3.14+; also we should set REQUIRED to prevent errors later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of our CMake files don't require anything that new, so we might want to provide cmake_modules/FindSqlite3Alt.cmake or similar. (At the very least we need to declare that dependency here.)

@lidavidm
Copy link
Member

Alright, I got a fully statically linked build and can't reproduce the Protobuf issue, with this diff:

diff --git a/cpp/src/arrow/flight/flight-sql/CMakeLists.txt b/cpp/src/arrow/flight/flight-sql/CMakeLists.txt
index 7491896fe..f4b6a0546 100644
--- a/cpp/src/arrow/flight/flight-sql/CMakeLists.txt
+++ b/cpp/src/arrow/flight/flight-sql/CMakeLists.txt
@@ -36,7 +36,7 @@ set_source_files_properties(${FLIGHT_SQL_GENERATED_PROTO_FILES} PROPERTIES GENER
 
 add_custom_target(flight_sql_protobuf_gen ALL DEPENDS ${FLIGHT_SQL_GENERATED_PROTO_FILES})
 
-set(ARROW_FLIGHT_SQL_SRCS server.cc)
+set(ARROW_FLIGHT_SQL_SRCS server.cc "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
 
 add_arrow_lib(arrow_flight_sql
               CMAKE_PACKAGE_NAME
@@ -69,8 +69,7 @@ else()
                                       ${ARROW_TEST_LINK_LIBS})
 endif()
 
-add_executable(flight_sql_test_app test_app_cli.cc
-                                   ${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc)
+add_executable(flight_sql_test_app test_app_cli.cc)
 target_link_libraries(flight_sql_test_app PRIVATE arrow_flight_sql_static
                                                   ${GFLAGS_LIBRARIES})
 
@@ -78,7 +77,6 @@ add_arrow_test(flight_sql_test
         SOURCES
         client_test.cc
         server_test.cc
-        "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc"
         STATIC_LINK_LIBS
         ${ARROW_FLIGHT_SQL_TEST_LINK_LIBS}
         LABELS
@@ -95,8 +93,7 @@ if(ARROW_BUILD_TESTS OR ARROW_BUILD_EXAMPLES)
           example/sqlite_statement.cc
           example/sqlite_statement_batch_reader.cc
           example/sqlite_server.cc
-          example/sqlite_tables_schema_batch_reader.cpp
-          "${CMAKE_CURRENT_BINARY_DIR}/FlightSql.pb.cc")
+          example/sqlite_tables_schema_batch_reader.cpp)
   target_link_libraries(flight_sql_test_server
           PRIVATE arrow_flight_sql_static ${GFLAGS_LIBRARIES}
           ${SQLite3_LIBRARIES})

@rafael-telles
Copy link
Contributor Author

Hey @lidavidm !
I applied your diff here and it reproduced the issue when running server_test.cc:

[libprotobuf ERROR /home/rafael/simbiose/vcpkg/buildtrees/protobuf/src/23fa7edd52-3ba2225d30.clean/src/google/protobuf/descriptor.cc:3624] Invalid proto descriptor for file "google/protobuf/descriptor.proto":
[libprotobuf ERROR /home/rafael/simbiose/vcpkg/buildtrees/protobuf/src/23fa7edd52-3ba2225d30.clean/src/google/protobuf/descriptor.cc:3627]   google/protobuf/descriptor.proto: Unrecognized syntax: SELECT * FROM intTable
[libprotobuf ERROR /home/rafael/simbiose/vcpkg/buildtrees/protobuf/src/23fa7edd52-3ba2225d30.clean/src/google/protobuf/descriptor.cc:3624] Invalid proto descriptor for file "FlightSql.proto":
[libprotobuf ERROR /home/rafael/simbiose/vcpkg/buildtrees/protobuf/src/23fa7edd52-3ba2225d30.clean/src/google/protobuf/descriptor.cc:3627]   arrow.flight.protocol.sql.experimental: ".google.protobuf.MessageOptions" is not defined.
[libprotobuf FATAL /home/rafael/simbiose/vcpkg/buildtrees/protobuf/src/23fa7edd52-3ba2225d30.clean/src/google/protobuf/generated_message_reflection.cc:2457] CHECK failed: file != nullptr: 

@lidavidm
Copy link
Member

I see, thanks. I'll try with some more build configurations.

Do you have the output from CMake to share (if it's not sensitive)? It might help make sure I have my setup/dependencies the same way as yours.

@rafael-telles
Copy link
Contributor Author

Sure! Here it is: https://gist.github.com/rafael-telles/14e39ae4d8856e78b5517bf69960843d

Thank you so much for helping us with this issue :)

@rafael-telles
Copy link
Contributor Author

Hey @lidavidm , we managed to remove the templating on client! Can you please take a look?

Still need to fix the protobuf issue

@lidavidm
Copy link
Member

Yes - sorry, I've been tied up this week but will get back to this ASAP. Thanks for addressing things. I also want to look at the Protobuf issue still.

@rafael-telles
Copy link
Contributor Author

That's OK! Now that we addresses most of the other comments I will try to solve this too. Thank you so much!

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the updates. I think we're getting there. Here's a second round of feedback.

I didn't look too closely at a lot of the more boilerplate-y parts - I'll give those a closer look when I can. (For instance: lots of the any.Is<Foo>() dispatching).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable doesn't appear to ever be set - is it needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Or was it meant to be ARROW_FLIGHT_LINK_LIBS?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: try not to include "arrow/api.h" in a public header as it can slow compilation significantly - either include the headers needed, using type_fwd.h where possible, or forward-declare things as needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems you should need only arrow/status.h from a quick glance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Though, please add <string>, <memory>, etc. for std::string, std::shared_ptr, std::unique_ptr, et. al.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should note that it's best to explicitly close the statement or else errors can't be caught.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should parse the schemas up front in the constructor, then make these const getters like std::shared_ptr<Schema> result_set_schema() const;. Since we need to handle errors, we could pImpl PreparedStatement and handle it in a static factory, or do the parsing in Prepare.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if we do the parsing in Prepare I think we can get rid of having to pass the Protobuf to the constructor here, which obviates the comment about not exposing the Protobuf types in the public header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, hmm. Is this legal? It feels like we're violating ODR…though it certainly appears to work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lidavidm ! Can you please clarify?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're defining FlightClientImpl twice and just using the fact that only one definition is visible at a time, trusting that the compiler/linker do the right thing. I'm not sure we can rely on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect that the object files for client.cc and client_test.cc both contain symbols for FlightClientImpl, and it just so happens that the linker picks the mock when compiling the test binary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, my understanding of this is shaky and maybe there is a good justification for why this works.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry - I want to try some alternative approaches but haven't found the time. At this rate it may be Friday before I can get into things…)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sorry, can we avoid using FlightClientImpl like this? I don't think this will be maintainable/I think this will be brittle. We can do something like the following:

  • Make FlightSqlClient an abstract base class, with an implementation template <typename Client> class FlightSqlClientImpl : public FlightSqlClient { ... }; in client_internal.h. Then in client.cc we can define a factory method that instantiates and returns FlightSqlClientImpl<FlightClient> and in client_test.cc we can instantiate FlightSqlClientImpl<MockedFlightClient>.

    This way the parameterization on the client impl is explicit and we are not relying on linker magic.

  • Alternatively, instead of mocking, we can just implement in-process dummy servers to provide the expected reply. This is definitely more code but avoids having to parameterize the client.

  • Alternatively, we can extract a FlightClientInterface or refactor FlightClient so it has virtual methods and is directly mockable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lidavidm , thank you for the suggestions!
I think we can do similar to the first suggestion:

Make FlightSqlClient as...

class ARROW_EXPORT FlightSqlClient {
 private:
  std::shared_ptr<FlightClient> client_;

 public:
  explicit FlightSqlClient(std::shared_ptr<FlightClient> client);

  ~FlightSqlClient();

  virtual Status GetFlightInfo(const FlightCallOptions& options,
                       const FlightDescriptor& descriptor,
                       std::unique_ptr<FlightInfo>* info) {
    return client_->GetFlightInfo(options, descriptor, info);
  }
  
  // Do the same with other FlightClient methods called

Then we can mock these virtual methods(only the ones we simply forward to FlightClient) from FlightSqlClient instead of FlightClient/FlightClientImpl on client_test.cc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea! They could be made protected methods too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done @lidavidm !

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^still wondering about this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry if it wasn't clear - we need to do delete or use unique_ptr, since AddressSanitizer-enabled builds will fail here since we're pairing new and free.

@lidavidm
Copy link
Member

It looks like we'll have to merge the Java PR first when time comes since the current Java Protobuf setup can't handle the optional fields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that Doxygen complains here since only one of the parameters is documented.

@lidavidm
Copy link
Member

Regarding the Protobuf issue: I switched to vcpkg as that appears to be what you use, but I can't replicate still. What OS and compiler are you using?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: sql_client

@rafael-telles
Copy link
Contributor Author

rafael-telles commented Nov 1, 2021

Regarding the Protobuf issue: I switched to vcpkg as that appears to be what you use, but I can't replicate still. What OS and compiler are you using?

@lidavidm , yes, I am using vcpkg.

Linux Mint 20.2, 
cmake version 3.16.3
g++ (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0

Thanks for your help @lidavidm , please let me know if any other info is helpful! Working on the other comments

* Make FlightSqlClient::GetFlightInfo return a Result instead of Status

* Make most methods on FlightSqlServerBase to return a Result instead of Status

* Move private functions on server.cc to anonymous namespace

* Fixes on doxygen and better readability on server_test.cc

* Rename fields on client_test.cc to follow the convention
* Make other methods from SQLite server example to return arrow::Result instead of Status

* Fix bug for null values in numeric columns on SQLite server example

* Add comment regarding to performance on sqlite_statement_batch_reader

* Separate GetSqlInfoResultMap from sqlite_server.h

* Remove unused parameter on DoPutCommandStatementUpdate
…#220)

* Make sure to wait for server to be ready before running tests

* Start server independently for each test

* Use unique_ptr for server thread on server_test.cc
* Make other methods from SQLite server example to return arrow::Result instead of Status

* Fix bug for null values in numeric columns on SQLite server example

* Structure catalog-schema-table tuple on a TableRef struct on client

* Rename 'schema' to 'db_schema'

* Use TableRef struct on server.cc

* Undo renaming db_schema_filter_pattern
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think things are pretty much done here. @pitrou anything else?

TEST_F(TestFlightSqlServer, TestCommandGetSqlInfoNoInfo) {
FlightCallOptions call_options;
ASSERT_OK_AND_ASSIGN(auto flight_info, sql_client->GetSqlInfo(call_options, {999999}));
auto result = sql_client->DoGet(call_options, flight_info->endpoints()[0].ticket);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: you can make this ASSERT_RAISES_WITH_MESSAGE_THAT(KeyError, ::testing::HasSubstr(...), sql_client->DoGet(...));

… SQLiteFlightSqlServer::Impl and remove dependency of boost-uuid (apache#221)

* Use pimpl idiom on sqlite_server and add comments on protobuf file

* Correctly implement impl pattern
@rafael-telles
Copy link
Contributor Author

@pitrou @lidavidm - now I think everything was addressed!

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question from me.

{"acos", "acosh", "asin", "asinh", "atan", "atan2", "atanh", "ceil",
"ceiling", "cos", "cosh", "degrees", "exp", "floor", "ln", "log",
"log", "log10", "log2", "mod", "pi", "pow", "power", "radians",
"sin", "sinh", "sqrt", "tan", "tanh", "trunc"}))},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious: any particular reason these are lowercase, unlike the other entries here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particular reason! Changed to keep consistency

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some nits, then let's start the vote. Also, don't forget to sync Protobuf changes to the Java PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raising this again - iequals is used below, but not lexical_cast

@rafael-telles
Copy link
Contributor Author

Done @lidavidm ! Sorry I forgot those two.

Will update the Java PR now

@lidavidm
Copy link
Member

lidavidm commented Dec 7, 2021

Looks like all tests are passing here and on the Java PR! Thanks to Rafael & co. for iterating on things and to @pitrou, @emkornfield, &co. for reviews.

Unless there are further comments, I'll submit the vote to the ML later today (EST).

@lidavidm lidavidm closed this in 968e6ea Dec 23, 2021
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
Implement Flight SQL in C++ and Java, and add the protocol definitions.

This is a combination of multiple pull requests, merged into one branch before merging into master.

Closes apache#10906 (ARROW-12922).
Closes apache#11507 (ARROW-14421).
Closes apache#11989 (ARROW-15112).
Closes apache#12021 (ARROW-15187).
Closes apache#12035 (ARROW-15198).

Closes apache#12013 from apache/flight-sql

Lead-authored-by: Rafael Telles <[email protected]>
Co-authored-by: Abner Eduardo Ferreira <[email protected]>
Co-authored-by: James Duong <[email protected]>
Co-authored-by: Jose Almeida <[email protected]>
Co-authored-by: Juscelino Junior <[email protected]>
Co-authored-by: Kyle Porter <[email protected]>
Co-authored-by: Ryan Nicholson <[email protected]>
Co-authored-by: Vinicius Fraga <[email protected]>
Co-authored-by: tifflhl <[email protected]>
Signed-off-by: David Li <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants