Commit b515b1f

Two-level hashing of std::type_info * via pointer & string hash (fixes #283)
Exchange of RTTI type information between separately compiled extension libraries tends to be fragile: depending on the compiler and platform, the C++ standard library may use one of two strategies to decide whether two ``std::type_info`` instances are equal.

The first is to perform a string comparison of the mangled type name. When types are organized in hash tables, a string hash is then also needed. This strategy yields the expected result but can be rather inefficient.

Other platforms simply compare the pointer value and rely on a pointer-based hashing scheme. This is far more efficient but requires that the linker merges duplicate RTTI information from separate shared libraries. Unfortunately, this does not always work for the following reasons:

1. Python passes the ``RTLD_LOCAL`` flag to ``dlopen()`` when it loads shared libraries. If RTTI symbols are exported by a separate shared library, then things may still be fine. But if the Python extension is in charge of exporting RTTI symbols, there is a problem.

2. It can generally be tricky to get the compiler to export RTTI symbols for non-polymorphic types.

3. Setting the right ``__attribute__ ((visibility("default")))`` flags in extension libraries is error-prone and a source of confusion for new users.

This commit changes nanobind to adopt both strategies at the same time. Type queries first go through a fast pointer-based hash table. If that lookup fails, they fall back to a secondary string-based hash table. In the latter case, nanobind also populates the faster pointer-based table with the missing information so that the fast path eventually resolves all of the queries.

This commit changes the internal representation of nanobind data structures, hence the ABI version had to be incremented.
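The two-level lookup described above can be sketched roughly as follows. This is a minimal illustration, not nanobind's actual implementation: ``type_record``, ``type_registry``, ``add()`` and ``find()`` are hypothetical names, and the slow table is keyed by a ``std::string`` copy of ``type_info::name()`` purely for simplicity.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>
#include <unordered_map>

// Hypothetical stand-in for a per-type record (not nanobind's type_data).
struct type_record {
    const char *name;
};

// Hypothetical registry illustrating the fast/slow lookup scheme.
struct type_registry {
    // Fast path: keyed by the address of the std::type_info object.
    std::unordered_map<const std::type_info *, type_record *> fast;
    // Slow path: keyed by the string returned by type_info::name().
    std::unordered_map<std::string, type_record *> slow;

    // Register a type in both tables.
    void add(const std::type_info *tp, type_record *rec) {
        assert(tp && rec);
        fast[tp] = rec;
        slow[tp->name()] = rec;
    }

    // Look up a type; returns nullptr when it is unknown.
    type_record *find(const std::type_info *tp) {
        auto it = fast.find(tp);
        if (it != fast.end())
            return it->second;              // pointer match: cheap

        auto it2 = slow.find(tp->name());   // fall back to the name
        if (it2 == slow.end())
            return nullptr;

        // Remember this std::type_info address so that the next query
        // for the same object is resolved by the fast table.
        fast[tp] = it2->second;
        return it2->second;
    }
};
```

A duplicate ``std::type_info`` object from another shared library would miss in ``fast`` on its first query, hit in ``slow`` via its name, and be served by ``fast`` from then on.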
1 parent 11a9b3c commit b515b1f

10 files changed

+279
-153
lines changed

cmake/nanobind-config.cmake

Lines changed: 2 additions & 0 deletions
@@ -152,6 +152,8 @@ function (nanobind_build_library TARGET_NAME)
     ${NB_DIR}/include/nanobind/eigen/sparse.h
 
     ${NB_DIR}/src/buffer.h
+    ${NB_DIR}/src/hash.h
+    ${NB_DIR}/src/hash.cpp
     ${NB_DIR}/src/nb_internals.h
     ${NB_DIR}/src/nb_internals.cpp
     ${NB_DIR}/src/nb_func.cpp

docs/faq.rst

Lines changed: 0 additions & 50 deletions
@@ -236,56 +236,6 @@ will:
 definition is changed, only a subset of the binding code will generally need
 to be recompiled.
 
-Nanobind cannot pass instances of my type in a multi-library/extension project
-------------------------------------------------------------------------------
-
-Suppose that nanobind unexpectedly raises a ``TypeError`` when passing or
-returning an instance of a bound type. There is usually a simple explanation:
-the type (let's call it "``Foo``") is defined in a library compiled separately
-from the main nanobind extension (let's call it ``libfoo``). The problem can
-also arise when there are multiple extension libraries that all make use of
-``Foo``.
-
-The problem is that the runtime type information ("RTTI") describing ``Foo`` is
-is not synchronized among these different libraries, at which point it appears
-to nanobind that there are multiple identically named but distinct types called
-``Foo``. The dynamic linker is normally responsible for merging the RTTI
-records, but it can only do so when the shared library exports them correctly.
-
-On Windows you must specify a DLL export/import annotation, and on other
-platforms it suffices to raise the visibility of the associated symbols.
-
-.. code-block:: cpp
-
-   /* TODO: Change 'MYLIB' to the name of your project. It's probably best to put
-      these into a common header file included by all parts of the project */
-   #if defined(_WIN32)
-   # define MYLIB_EXPORT __declspec(dllexport)
-   # define MYLIB_IMPORT __declspec(dllimport)
-   #else
-   # define MYLIB_EXPORT __attribute__ ((visibility("default")))
-   # define MYLIB_IMPORT __attribute__ ((visibility("default")))
-   #endif
-
-   #if defined(MYLIB_BUILD)
-   # define MYLIB_API MYLIB_EXPORT
-   #else
-   # define MYLIB_API MYLIB_IMPORT
-   #endif
-
-   /// Important: annotate the Class declaration with MYLIB_API
-   class MYLIB_API Foo {
-       // ... Foo definitions ..
-   };
-
-In the CMake build system, you must furthermore specify the ``-DMYLIB_BUILD``
-definition so that symbols are exported when building ``libfoo`` and imported
-by consumers of ``libfoo``.
-
-.. code-block:: cmake
-
-   target_compile_definitions(libfoo PRIVATE MYLIB_BUILD)
-
 .. _type-visibility:
 
 How can I avoid conflicts with other projects using nanobind?

include/nanobind/nb_class.h

Lines changed: 4 additions & 0 deletions
@@ -77,13 +77,17 @@ enum class type_init_flags : uint32_t {
     all_init_flags = (0x1f << 19)
 };
 
+// See internals.h
+struct nb_alias_chain;
+
 /// Information about a type that persists throughout its lifetime
 struct type_data {
     uint32_t size;
     uint32_t align : 8;
     uint32_t flags : 24;
     const char *name;
     const std::type_info *type;
+    nb_alias_chain *alias_chain;
     PyTypeObject *type_py;
     void (*destruct)(void *);
     void (*copy)(void *, const void *);

src/hash.cpp

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+#include "hash.h"
+
+#if defined(_MSC_VER)
+# define ROTL32(x,y) _rotl(x,y)
+# define ROTL64(x,y) _rotl64(x,y)
+
+#else
+inline uint32_t rotl32(uint32_t x, int8_t r) {
+    return (x << r) | (x >> (32 - r));
+}
+
+inline uint64_t rotl64(uint64_t x, int8_t r) {
+    return (x << r) | (x >> (64 - r));
+}
+
+# define ROTL32(x,y) rotl32(x,y)
+# define ROTL64(x,y) rotl64(x,y)
+#endif
+
+//-----------------------------------------------------------------------------
+
+uint64_t MurmurHash3_x64_64(const void *key, const size_t len,
+                            const uint32_t seed) {
+    const uint8_t *data = (const uint8_t *) key;
+    const size_t nblocks = len / 16;
+
+    uint64_t h1 = seed;
+    uint64_t h2 = seed;
+
+    const uint64_t c1 = (uint64_t) 0x87c37b91114253d5ull;
+    const uint64_t c2 = (uint64_t) 0x4cf5ad432745937full;
+
+    //----------
+    // body
+
+    const uint64_t * blocks = (const uint64_t *)(data);
+
+    for(size_t i = 0; i < nblocks; i++) {
+        uint64_t k1 = blocks[i*2+0];
+        uint64_t k2 = blocks[i*2+1];
+
+        k1 *= c1; k1 = ROTL64(k1,31); k1 *= c2; h1 ^= k1;
+
+        h1 = ROTL64(h1,27); h1 += h2; h1 = h1*5+0x52dce729;
+
+        k2 *= c2; k2 = ROTL64(k2,33); k2 *= c1; h2 ^= k2;
+
+        h2 = ROTL64(h2,31); h2 += h1; h2 = h2*5+0x38495ab5;
+    }
+
+    //----------
+    // tail
+
+    const uint8_t *tail = (const uint8_t *) (data + nblocks * 16);
+
+    uint64_t k1 = 0;
+    uint64_t k2 = 0;
+
+    switch(len & 15) {
+        case 15: k2 ^= ((uint64_t)tail[14]) << 48;
+        case 14: k2 ^= ((uint64_t)tail[13]) << 40;
+        case 13: k2 ^= ((uint64_t)tail[12]) << 32;
+        case 12: k2 ^= ((uint64_t)tail[11]) << 24;
+        case 11: k2 ^= ((uint64_t)tail[10]) << 16;
+        case 10: k2 ^= ((uint64_t)tail[ 9]) << 8;
+        case 9: k2 ^= ((uint64_t)tail[ 8]) << 0;
+                 k2 *= c2; k2 = ROTL64(k2,33); k2 *= c1; h2 ^= k2;
+
+        case 8: k1 ^= ((uint64_t)tail[ 7]) << 56;
+        case 7: k1 ^= ((uint64_t)tail[ 6]) << 48;
+        case 6: k1 ^= ((uint64_t)tail[ 5]) << 40;
+        case 5: k1 ^= ((uint64_t)tail[ 4]) << 32;
+        case 4: k1 ^= ((uint64_t)tail[ 3]) << 24;
+        case 3: k1 ^= ((uint64_t)tail[ 2]) << 16;
+        case 2: k1 ^= ((uint64_t)tail[ 1]) << 8;
+        case 1: k1 ^= ((uint64_t)tail[ 0]) << 0;
+                 k1 *= c1; k1 = ROTL64(k1,31); k1 *= c2; h1 ^= k1;
+    };
+
+    //----------
+    // finalization
+
+    h1 ^= len; h2 ^= len;
+
+    h1 += h2;
+    h2 += h1;
+
+    h1 = fmix64(h1);
+    h2 = fmix64(h2);
+
+    h1 += h2;
+
+    return h1;
+}

src/hash.h

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+//-----------------------------------------------------------------------------
+// Slightly adapted version of the MurmurHash3 codebase (originally by Austin
+// Appleby, in the public domain)
+//
+// The changes are as follows:
+//
+// - fmix32 and fmix64 are exported to other compilation units, since they
+// are useful has a hash function for 32/64 bit integers and pointers
+//
+// - The MurmurHash3_x64_64() function is a variant of the original
+// MurmurHash3_x64_128() that only returns the low 64 bit of the hash
+// value.
+//-----------------------------------------------------------------------------
+
+#pragma once
+
+#include <cstdint>
+#include <cstdlib>
+
+inline uint32_t fmix32(uint32_t h) {
+    h ^= h >> 16;
+    h *= 0x85ebca6b;
+    h ^= h >> 13;
+    h *= 0xc2b2ae35;
+    h ^= h >> 16;
+
+    return h;
+}
+
+inline uint64_t fmix64(uint64_t k) {
+    k ^= k >> 33;
+    k *= (uint64_t) 0xff51afd7ed558ccdull;
+    k ^= k >> 33;
+    k *= (uint64_t) 0xc4ceb9fe1a85ec53ull;
+    k ^= k >> 33;
+    return k;
+}
+
+extern uint64_t MurmurHash3_x64_64(const void *key, size_t len, uint32_t seed);
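The header comment notes that ``fmix64`` is useful as a hash function for pointers. One way this could plug into the two tables is sketched below. The functors ``ptr_hash``, ``name_hash`` and ``name_eq`` and the aliases ``fast_map`` and ``slow_map`` are hypothetical names chosen for illustration; ``std::hash<std::string>`` stands in for ``MurmurHash3_x64_64()`` to keep the example self-contained; and the ``int`` payload replaces the real per-type record.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <string>
#include <typeinfo>
#include <unordered_map>

// Same mixing function as in the hash.h hunk above.
inline uint64_t fmix64(uint64_t k) {
    k ^= k >> 33;
    k *= (uint64_t) 0xff51afd7ed558ccdull;
    k ^= k >> 33;
    k *= (uint64_t) 0xc4ceb9fe1a85ec53ull;
    k ^= k >> 33;
    return k;
}

// Hypothetical functor: hash a std::type_info by its address only.
struct ptr_hash {
    size_t operator()(const std::type_info *tp) const {
        return (size_t) fmix64((uint64_t) (uintptr_t) tp);
    }
};

// Hypothetical functor: hash a std::type_info by its name string.
// std::hash<std::string> is a stand-in for a string hash such as
// MurmurHash3_x64_64().
struct name_hash {
    size_t operator()(const std::type_info *tp) const {
        assert(tp != nullptr);
        return std::hash<std::string>()(tp->name());
    }
};

// Hypothetical functor: two entries are equal if the pointers match or
// the name strings compare equal.
struct name_eq {
    bool operator()(const std::type_info *a, const std::type_info *b) const {
        return a == b || std::strcmp(a->name(), b->name()) == 0;
    }
};

// Pointer-keyed table (default equality compares addresses).
using fast_map = std::unordered_map<const std::type_info *, int, ptr_hash>;
// Table with the same key type, but hashed and compared by name.
using slow_map =
    std::unordered_map<const std::type_info *, int, name_hash, name_eq>;
```

Keeping ``const std::type_info *`` as the key type for both tables means callers can query either one with the same argument; only the hash and equality policies differ.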

src/implicit.cpp

Lines changed: 6 additions & 14 deletions
@@ -15,14 +15,10 @@ NAMESPACE_BEGIN(detail)
 
 void implicitly_convertible(const std::type_info *src,
                             const std::type_info *dst) noexcept {
-    nb_type_map &type_c2p = internals->type_c2p;
+    type_data *t = nb_type_c2p(internals, dst);
+    check(t, "nanobind::detail::implicitly_convertible(src=%s, dst=%s): "
+             "destination type unknown!", type_name(src), type_name(dst));
 
-    nb_type_map::iterator it = type_c2p.find(std::type_index(*dst));
-    check(it != type_c2p.end(),
-          "nanobind::detail::implicitly_convertible(src=%s, dst=%s): "
-          "destination type unknown!", type_name(src), type_name(dst));
-
-    type_data *t = it->second;
     size_t size = 0;
 
     if (t->flags & (uint32_t) type_flags::has_implicit_conversions) {
@@ -47,14 +43,10 @@ void implicitly_convertible(const std::type_info *src,
 void implicitly_convertible(bool (*predicate)(PyTypeObject *, PyObject *,
                                               cleanup_list *),
                             const std::type_info *dst) noexcept {
-    nb_type_map &type_c2p = internals->type_c2p;
-
-    nb_type_map::iterator it = type_c2p.find(std::type_index(*dst));
-    check(it != type_c2p.end(),
-          "nanobind::detail::implicitly_convertible(src=<predicate>, dst=%s): "
-          "destination type unknown!", type_name(dst));
+    type_data *t = nb_type_c2p(internals, dst);
+    check(t, "nanobind::detail::implicitly_convertible(src=<predicate>, dst=%s): "
+             "destination type unknown!", type_name(dst));
 
-    type_data *t = it->second;
     size_t size = 0;
 
     if (t->flags & (uint32_t) type_flags::has_implicit_conversions) {

src/nb_func.cpp

Lines changed: 2 additions & 2 deletions
@@ -942,9 +942,9 @@ static void nb_func_render_signature(const func_data *f) noexcept {
           "nb::detail::nb_func_render_signature(): missing type!");
 
     if (!(is_method && arg_index == 0)) {
-        auto it = internals->type_c2p.find(std::type_index(**descr_type));
+        auto it = internals->type_c2p_slow.find(*descr_type);
 
-        if (it != internals->type_c2p.end()) {
+        if (it != internals->type_c2p_slow.end()) {
             handle th((PyObject *) it->second->type_py);
             buf.put_dstr((borrow<str>(th.attr("__module__"))).c_str());
             buf.put('.');

src/nb_internals.cpp

Lines changed: 5 additions & 4 deletions
@@ -17,7 +17,7 @@
 
 /// Tracks the ABI of nanobind
 #ifndef NB_INTERNALS_VERSION
-# define NB_INTERNALS_VERSION 11
+# define NB_INTERNALS_VERSION 12
 #endif
 
 /// On MSVC, debug and release builds are not ABI-compatible!
@@ -252,12 +252,13 @@ static void internals_cleanup() {
         leak = true;
     }
 
-    if (!internals->type_c2p.empty()) {
+    if (!internals->type_c2p_slow.empty() ||
+        !internals->type_c2p_fast.empty()) {
         if (internals->print_leak_warnings) {
             fprintf(stderr, "nanobind: leaked %zu types!\n",
-                    internals->type_c2p.size());
+                    internals->type_c2p_slow.size());
             int ctr = 0;
-            for (const auto &kv : internals->type_c2p) {
+            for (const auto &kv : internals->type_c2p_slow) {
                 fprintf(stderr, " - leaked type \"%s\"\n", kv.second->name);
                 if (ctr++ == 10) {
                     fprintf(stderr, " - ... skipped remainder\n");
