Skip to content

Conversation

@rtpsw
Copy link
Contributor

@rtpsw rtpsw commented May 25, 2022

Replacing #13214

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like some (presumably unintentional) submodule changes snuck in with the rebase.

@rtpsw
Copy link
Contributor Author

rtpsw commented May 26, 2022

Looks like some (presumably unintentional) submodule changes snuck in with the rebase.

Sorry, I had a bit of a mess there. Hopefully the recent commit fixes this.

@rtpsw rtpsw requested a review from westonpace May 26, 2022 15:24
Copy link
Collaborator

@vibhatha vibhatha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you have committed submodule files. May be updating submodule would be better.

@rtpsw
Copy link
Contributor Author

rtpsw commented May 31, 2022

For some reason, the commit didn't come out right even though my local repo looks OK. I'll look into it a bit later.

@rtpsw
Copy link
Contributor Author

rtpsw commented Jun 1, 2022

I tried to find out, but I don't know why the "files changed" keeps showing 4 files under testing. The links for these 4 files lead to empty Github comparisons. Moreover, they are not part of my local repo nor part of the recent commit (even in full view). Generally, I couldn't find a way to get rid of them by committing to this PR, and they might be due to some GitHub weirdness. Are these files really going to be part of the contribution? If you know how to fix this, please let me know how. The only fix I can think of is opening a replacement PR.

@westonpace
Copy link
Member

From the base directory run:

git submodule update --init --recursive
# Note, at this point you will have some staged changes that you want to commit and
# some unstaged changes that you do not want to commit.  So do not run `git add`
git commit -m "Restored submodules"
git submodule update
# At this point you should have no unstaged or staged changes and the submodules should
# be in sync with master

I'm going to try and get some time to review this in the evening and I'll update this for you if you hadn't gotten a chance to get to it by then.

In the future I recommend rebasing master instead of merging master. However, that won't help avoid this situation.

What happened is someone updated the submodules on master. When you merged master it did not update the submodules (submodules are just a pain to work with). I think, after you rebase/merge master, you need to run:

git submodule update

Otherwise, if you just stage those submodule changes and commit them it will reset the submodules back to what they were before.

@vibhatha
Copy link
Collaborator

vibhatha commented Jun 2, 2022

From the base directory run:

git submodule update --init --recursive
# Note, at this point you will have some staged changes that you want to commit and
# some unstaged changes that you do not want to commit.  So do not run `git add`
git commit -m "Restored submodules"
git submodule update
# At this point you should have no unstaged or staged changes and the submodules should
# be in sync with master

I'm going to try and get some time to review this in the evening and I'll update this for you if you hadn't gotten a chance to get to it by then.

In the future I recommend rebasing master instead of merging master. However, that won't help avoid this situation.

What happened is someone updated the submodules on master. When you merged master it did not update the submodules (submodules are just a pain to work with). I think, after you rebase/merge master, you need to run:

git submodule update

Otherwise, if you just stage those submodule changes and commit them it will reset the submodules back to what they were before.

thank you @westonpace for the detailed answer. It is always best to keep the submodule up-to-date and not commit the submodule files to the PR.

@rtpsw
Copy link
Contributor Author

rtpsw commented Jun 2, 2022

Thanks @westonpace. To avoid this going forward, perhaps git submodule update, or at least a check that it is not needed, should be a pre-commit operation?

@rtpsw
Copy link
Contributor Author

rtpsw commented Jun 2, 2022

git submodule update --init --recursive
# Note, at this point you will have some staged changes that you want to commit and
# some unstaged changes that you do not want to commit.  So do not run `git add`
git commit -m "Restored submodules"
git submodule update
# At this point you should have no unstaged or staged changes and the submodules should
# be in sync with master

I tried this just in case but, as expected, git said there was nothing to commit after git submodule update --init --recursive. As noted above, the issue is not in my local repo, which is in a good state. Though not probable, the issue might be in GitHub or how it presents this PR, which doesn't look consistent (see links in my note).

To move forward, if you're not happy with the state of the testing submodule in this PR, I'd start a fresh one because I don't seem to have a way to fix this one.

@vibhatha
Copy link
Collaborator

vibhatha commented Jun 2, 2022

git submodule update --init --recursive
# Note, at this point you will have some staged changes that you want to commit and
# some unstaged changes that you do not want to commit.  So do not run `git add`
git commit -m "Restored submodules"
git submodule update
# At this point you should have no unstaged or staged changes and the submodules should
# be in sync with master

I tried this just in case but, as expected, git said there was nothing to commit after git submodule update --init --recursive. As noted above, the issue is not in my local repo, which is in a good state. Though not probable, the issue might be in GitHub or how it presents this PR, which doesn't look consistent (see links in my note).

To move forward, if you're not happy with the state of the testing submodule in this PR, I'd start a fresh one because I don't seem to have a way to fix this one.

May be this could help: https://stackoverflow.com/questions/8191299/update-a-submodule-to-the-latest-commit

Also I guess, you may need to remove the committed files first and then checkout the version of submodule used by the project. In that case you may not need to commit any thing.

@rtpsw
Copy link
Contributor Author

rtpsw commented Jun 2, 2022

May be this could help: https://stackoverflow.com/questions/8191299/update-a-submodule-to-the-latest-commit

I tried this and committed to this PR. I still see 2 files under testing.

@vibhatha
Copy link
Collaborator

vibhatha commented Jun 2, 2022

@rtpsw I created a test-PR to show this. I didn't want to commit to this branch without verifying it.

#13297

What I did was, I cd into the testing submodule.

Then checkout the commit that is used by the Arrow master branch: https://github.com/apache/arrow-testing/tree/53b498047109d9940fcfab388bd9d6aeb8c57425

Inside the testing I did the following

git checkout 53b498047109d9940fcfab388bd9d6aeb8c57425

Then commit these changes to this branch.

It should flush the files. Because in theory there is no difference between the current git diff of testing vs the checked out branch (which is the upstream's testing modules version).

cc @westonpace

@rtpsw
Copy link
Contributor Author

rtpsw commented Jun 2, 2022

I think it worked, thanks.

Copy link
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the test cases. Just a few minor nitpicks and we can merge.

ARROW_ENGINE_EXPORT Result<std::shared_ptr<Buffer>> SerializeJsonPlan(
const std::string& substrait_json);

/// \brief Makes a nested registry with the default registry as parent.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// \brief Makes a nested registry with the default registry as parent.
/// \brief Make a nested registry with the default registry as parent.

/// Note: Function support is currently very minimal, see ARROW-15538
ARROW_ENGINE_EXPORT ExtensionIdRegistry* default_extension_id_registry();

/// \brief Makes a nested registry with a given parent.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// \brief Makes a nested registry with a given parent.
/// \brief Make a nested registry with a given parent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment still needs addressed. Then we can merge.

}
Status RegisterFunction(Id id, std::string arrow_function_name) override {
return parent_->CanRegisterFunction(id, arrow_function_name) &
ExtensionIdRegistryImpl::RegisterFunction(id, arrow_function_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ExtensionIdRegistryImpl::RegisterFunction(id, arrow_function_name);
ExtensionIdRegistryImpl::RegisterFunction(id, std::move(arrow_function_name));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not make this change per above discussion.

*names_.emplace(id.name.to_string()).first};
Status RegisterType(Id id, std::shared_ptr<DataType> type) override {
return parent_->CanRegisterType(id, type) &
ExtensionIdRegistryImpl::RegisterType(id, type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ExtensionIdRegistryImpl::RegisterType(id, type);
ExtensionIdRegistryImpl::RegisterType(id, std::move(type));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this change but it causes a segmentation fault with the following gdb output:

$  gdb --args ./release/arrow-substrait-substrait-test 
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./release/arrow-substrait-substrait-test...
(gdb) run
Starting program: /mnt/user1/tscontract/github/rtpsw/arrow/cpp/build/release/release/arrow-substrait-substrait-test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff45ff700 (LWP 156004)]
Running main() from /build/googletest-j5yxiC/googletest-1.10.0/googletest/src/gtest_main.cc
[==========] Running 33 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 4 tests from ExtensionIdRegistryTest
[ RUN      ] ExtensionIdRegistryTest.RegisterTempTypes

Thread 1 "arrow-substrait" received signal SIGSEGV, Segmentation fault.
0x00007ffff65199dc in arrow::DataType::Hash() const () from /mnt/user1/tscontract/github/rtpsw/arrow/cpp/build/release/release/libarrow.so.900
(gdb) bt
#0  0x00007ffff65199dc in arrow::DataType::Hash() const () from /mnt/user1/tscontract/github/rtpsw/arrow/cpp/build/release/release/libarrow.so.900
#1  0x00007ffff7aa420c in std::_Hashtable<arrow::DataType const*, std::pair<arrow::DataType const* const, int>, std::allocator<std::pair<arrow::DataType const* const, int> >, std::__detail::_Select1st, arrow::engine::(anonymous namespace)::TypePtrHashEq, arrow::engine::(anonymous namespace)::TypePtrHashEq, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::find(arrow::DataType const* const&) const ()
   from /mnt/user1/tscontract/github/rtpsw/arrow/cpp/build/release/release/libarrow_substrait.so.900
#2  0x00007ffff7aa7b7a in arrow::engine::(anonymous namespace)::ExtensionIdRegistryImpl::CanRegisterType(arrow::engine::ExtensionIdRegistry::Id, std::shared_ptr<arrow::DataType> const&) const ()
   from /mnt/user1/tscontract/github/rtpsw/arrow/cpp/build/release/release/libarrow_substrait.so.900
#3  0x00007ffff7aa9a1c in arrow::engine::(anonymous namespace)::NestedExtensionIdRegistryImpl::RegisterType(arrow::engine::ExtensionIdRegistry::Id, std::shared_ptr<arrow::DataType>) ()
   from /mnt/user1/tscontract/github/rtpsw/arrow/cpp/build/release/release/libarrow_substrait.so.900
#4  0x00005555555953da in arrow::engine::ExtensionIdRegistryTest_RegisterTempTypes_Test::TestBody() ()
#5  0x00005555555ff3d1 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (location=0x5555556186bf "the test body", method=<optimized out>, object=0x555555670140)
    at ./googletest/src/gtest.cc:2414
#6  testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=object@entry=0x555555670140, method=<optimized out>, location=location@entry=0x5555556186bf "the test body")
    at ./googletest/src/gtest.cc:2469
#7  0x00005555555f3756 in testing::Test::Run (this=0x555555670140) at ./googletest/src/gtest.cc:2508
#8  testing::Test::Run (this=0x555555670140) at ./googletest/src/gtest.cc:2498
#9  0x00005555555f38b5 in testing::TestInfo::Run (this=0x5555556749f0) at ./googletest/src/gtest.cc:2684
#10 testing::TestInfo::Run (this=0x5555556749f0) at ./googletest/src/gtest.cc:2657
#11 0x00005555555f399d in testing::TestSuite::Run (this=0x5555556731f0) at ./googletest/src/gtest.cc:2816
#12 testing::TestSuite::Run (this=0x5555556731f0) at ./googletest/src/gtest.cc:2795
#13 0x00005555555f3ebc in testing::internal::UnitTestImpl::RunAllTests (this=0x55555565bec0) at /usr/include/c++/9/bits/stl_vector.h:1040
#14 0x00005555555ff941 in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (
    location=0x555555619a78 "auxiliary test code (environments or event listeners)", method=<optimized out>, object=0x55555565bec0) at ./googletest/src/gtest.cc:2414
#15 testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x55555565bec0, method=<optimized out>, 
    location=location@entry=0x555555619a78 "auxiliary test code (environments or event listeners)") at ./googletest/src/gtest.cc:2469
#16 0x00005555555f40ec in testing::UnitTest::Run (this=0x55555563f560 <testing::UnitTest::GetInstance()::instance>) at ./googletest/include/gtest/gtest.h:1412
#17 0x0000555555594274 in main () at /usr/include/c++/9/ext/new_allocator.h:89

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's odd. Do you know what compiler you are using?

Copy link
Contributor Author

@rtpsw rtpsw Jun 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced there is something unexpected here. Presumably, the use of std::move on a std::shared_ptr<DataType> invalidates the caller's copy, which leads to a segmentation fault when the caller tries to access it. AFAICS, passing or returning a shared_ptr by value is normal in Arrow, so it should be fine here too.

To your question, I think my build is using gcc, for which I have version gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll be the first to admit that there is a good possibility I'm just plain wrong or missing something 😆 so I apologize for the trouble in advance. However, I think you are correct, that a std::move is not allowed here. After doing some research though, I think your explanation is slightly off.

The function argument is (correctly, IMO) defined as RegisterType(Id id, std::shared_ptr<DataType> type) and not RegisterType(Id id, const std::shared_ptr<DataType>& type) so this function is explicitly requesting its own copy of the shared_ptr.

So nothing we do here should be capable of invalidating the caller's copy.

Instead I think this is a consequence of using & as the status coalescing operator and not &&. I believe it was you that pointed out a while back that this is a little odd. Perhaps it is not just odd but actually semantically incorrect as well. It turns out that && defines a "sequence point" (https://en.cppreference.com/w/cpp/language/eval_order) that prevents the compiler from reordering statements.

What I think is happening is the compiler is choosing to reorder the call to ExtensionIdRegistryImpl::RegisterType(id, type) before the call to parent_->CanRegisterType(id, type). When I first made this comment I thought that sort of reordering was illegal and thus a bug. However, after reading about sequence points, it seems that reordering is perfectly legal and so using std::move is indeed incorrect here (although this is more motivation to perhaps change the operator to && but that does not need to be a concern for this PR).

So, thank you for bearing with me, I consider this comment resolved. I'll follow up on the &/&& in a separate JIRA to see how others feel about it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, great explanation!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future reference, here is the discussion about &/&& you are referring to.

ASSERT_EQ(*e.type, *typerec.type);
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add failed lookup checks here...

ASSERT_FALSE(registry->GetType(kNonExistentId));
ASSERT_FALSE(registry->GetType(kNonExistentType));

ASSERT_EQ(id, funcrec.id);
ASSERT_EQ(name, funcrec.function_name);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add failed lookup checks here too?

@westonpace
Copy link
Member

Odd that CI isn't running all of the tests 😕

@rtpsw
Copy link
Contributor Author

rtpsw commented Jun 2, 2022

Odd that CI isn't running all of the tests 😕

Let's see if it gets back to normal on the next commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants