Describe your problem
Summary
Comparing the stable version (v0.6.0-dev3) with the nightly build, we observed much lower query latency but also lower recall: average query latency fell from 9.6 ms to 3.0 ms, while recall dropped from 0.912 to 0.842 (~7.7% relative). We would like to know whether these changes are expected behavior.
Background
During benchmarking of HNSW performance between Infinity versions, we compared:
- Stable version: v0.6.0-dev3
- Nightly build: Latest nightly version (docker id: 48932341cd18)
Performance Changes Observed
Query Performance Improvements
- Query latency: ~68.8% improvement (9.6ms → 3.0ms average)
- P95 latency: ~68.5% improvement (11.1ms → 3.5ms)
- P99 latency: ~65.1% improvement (12.6ms → 4.4ms)
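How the benchmark client aggregates latencies is not shown in this report. As a minimal sketch, assuming per-query latencies in milliseconds and nearest-rank percentiles (the function names here are illustrative, not part of the benchmark client), the summary statistics and the relative improvement can be computed like this:

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Average, P95 and P99 of a list of per-query latencies (milliseconds)."""
    return {
        "avg": sum(latencies_ms) / len(latencies_ms),
        "p95": percentile(latencies_ms, 95),
        "p99": percentile(latencies_ms, 99),
    }


def relative_reduction(old, new):
    """Fraction by which `new` is lower than `old` (0.5 means half the latency)."""
    return (old - new) / old


if __name__ == "__main__":
    # Synthetic samples, only to show the call shape.
    print(summarize([float(i) for i in range(1, 101)]))
    # Relative improvement for the P95 figures quoted above.
    print(f"{relative_reduction(11.1, 3.5):.1%}")
```

Reporting the relative reduction together with the raw before/after values makes the percentages easy to re-derive.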
Recall Accuracy Changes
- Recall drop: ~7.7% relative degradation (0.912 → 0.842, i.e. 7.0 percentage points)
- ID consistency: ~80% of returned IDs remain the same
- Similarity scores: More accurate in nightly build (old version had systematic calculation errors)
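For reference, the recall and ID-consistency figures can be reproduced with a few lines of Python. This is a sketch under assumptions: recall is taken as recall@k against the ground-truth neighbors shipped with the dataset, and ID consistency as the fraction of IDs returned by the stable version that the nightly build also returns. The helper names are hypothetical.

```python
def recall_at_k(returned_ids, true_ids, k):
    """Fraction of the true top-k neighbors present in the returned top-k."""
    return len(set(returned_ids[:k]) & set(true_ids[:k])) / k


def mean_recall(all_returned, all_true, k):
    """Average recall@k over all queries."""
    scores = [recall_at_k(r, t, k) for r, t in zip(all_returned, all_true)]
    return sum(scores) / len(scores)


def id_overlap(ids_old, ids_new):
    """Fraction of IDs from the old result that also appear in the new result."""
    old, new = set(ids_old), set(ids_new)
    return len(old & new) / len(old) if old else 1.0


def missing_ids(returned_ids, true_ids, k):
    """True top-k neighbors that the query did not return."""
    return sorted(set(true_ids[:k]) - set(returned_ids[:k]))


if __name__ == "__main__":
    truth = [1, 2, 3, 4]
    stable = [1, 2, 3, 4]
    nightly = [1, 2, 3, 9]
    print(recall_at_k(nightly, truth, k=4))
    print(id_overlap(stable, nightly))
    print(missing_ids(nightly, truth, k=4))
```

Listing the missing IDs per query is also a convenient starting point for checking that those rows actually exist in the table.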
Import Performance
- Import speed: ~40% slower in nightly build
- Memory usage: Optimized in nightly build
Technical Details
- Dataset: NYTimes 256-dimensional vectors (~290K vectors)
- Dataset download: https://ann-benchmarks.com/nytimes-256-angular.hdf5
- Index Type: HNSW with M=16, ef_construction=600, ef=600, LVQ encoding ("encode": "lvq")
- Metric: Cosine similarity
- Batch size: 8192 (same for both versions)
Test Configuration
{
  "name": "infinity_nytimes",
  "app": "infinity",
  "host": "127.0.0.1:23817",
  "data_path": "datasets/NYTimes/nytimes-256-angular.hdf5",
  "insert_batch_size": 8192,
  "topK": 100,
  "mode": "vector",
  "schema": {
    "embeddings": {"type": "vector, 256, float"}
  },
  "vector_size": 256,
  "vector_name": "embeddings",
  "metric_type": "cosine",
  "index": {
    "embeddings": {
      "type": "HNSW",
      "index_params": {
        "M": 16,
        "ef_construction": 600,
        "metric": "cosine",
        "encode": "lvq"
      },
      "query_params": {
        "ef": 600
      }
    }
  },
  "query_path": "datasets/NYTimes/nytimes-256-angular.hdf5",
  "batch_size": 8192,
  "ground_truth_path": "datasets/NYTimes/nytimes-256-angular.hdf5"
}
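The configuration keeps index-time settings (index_params) and query-time settings (query_params) in separate blocks. A minimal standalone sketch of how a client might split them, mirroring the approach of our patched client but not taken from it verbatim (the helper names find_vector_column and split_hnsw_params are hypothetical):

```python
import json

# Trimmed copy of the configuration above; only the keys used here are kept.
CONFIG_JSON = """
{
  "schema": {"embeddings": {"type": "vector, 256, float"}},
  "index": {
    "embeddings": {
      "type": "HNSW",
      "index_params": {"M": 16, "ef_construction": 600, "metric": "cosine", "encode": "lvq"},
      "query_params": {"ef": 600}
    }
  }
}
"""


def find_vector_column(schema):
    """Return the first column whose type string mentions 'vector', or None."""
    for name, spec in schema.items():
        if "vector" in spec["type"]:
            return name
    return None


def split_hnsw_params(config):
    """Return (index_params, query_params) as string-to-string dicts."""
    column = find_vector_column(config["schema"])
    if column is None:
        return {}, {}
    index_cfg = config["index"].get(column, {})
    index_params = {str(k): str(v) for k, v in index_cfg.get("index_params", {}).items()}
    query_params = {str(k): str(v) for k, v in index_cfg.get("query_params", {}).items()}
    return index_params, query_params


if __name__ == "__main__":
    index_params, query_params = split_hnsw_params(json.loads(CONFIG_JSON))
    print("index-time:", index_params)
    print("query-time:", query_params)
```

Values are converted to strings because the patched client passes them to the index and query calls as string dictionaries.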
Client Fix Applied
During our testing, we discovered and fixed an issue in the benchmark client (python/benchmark/clients/infinity_client.py):
Issue: The ef parameter from query_params was not being sent per query in the original client code.
Fix: Applied the following patch to properly handle query parameters:
diff --git a/python/benchmark/clients/infinity_client.py b/python/benchmark/clients/infinity_client.py
index 7d9ffea12..af9a72d15 100644
--- a/python/benchmark/clients/infinity_client.py
+++ b/python/benchmark/clients/infinity_client.py
@@ -37,6 +37,7 @@ class InfinityClient(BaseClient):
         self.data_mode = self.data["mode"]
         self.path_prefix = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
         self.table_objs = list()
+        self.query_options_ = {}
 
     def _parse_index_schema(self, index_schema):
         indexs = []
@@ -44,10 +45,9 @@ class InfinityClient(BaseClient):
             if value["type"] == "text":
                 indexs.append(index.IndexInfo(key, index.IndexType.FullText))
             elif value["type"] == "HNSW":
-                params = {}
-                for param, v in value["params"].items():
-                    params[param] = str(v)
-                indexs.append(index.IndexInfo(key, index.IndexType.Hnsw, params))
+                # Pass only index-time parameters
+                index_params = {str(k): str(v) for k, v in value.get("index_params", {}).items()}
+                indexs.append(index.IndexInfo(key, index.IndexType.Hnsw, index_params))
             elif value["type"] == "BMP":
                 params = {}
                 for param, v in value["params"].items():
@@ -150,6 +150,20 @@ class InfinityClient(BaseClient):
             else:
                 raise TypeError("Unsupport file type!")
 
+    def create_index(self):
+        """
+        Create indexes for the table.
+        """
+        db_obj = self.client.get_database("default_db")
+        table_obj = db_obj.get_table(self.table_name)
+        indexs = self._parse_index_schema(self.data["index"])
+        logging.info(f"Creating {len(indexs)} indexes...")
+        for i, idx in enumerate(indexs):
+            logging.info(f"Creating index: index{i} on column {idx.column_name}")
+            table_obj.create_index(f"index{i}", [idx])
+        logging.info("Finished creating indexes.")
+
+
     def setup_clients(self, num_threads=1):
         host, port = self.data["host"].split(":")
         self.clients = list()
@@ -160,12 +174,22 @@ class InfinityClient(BaseClient):
             self.clients.append(client)
             self.table_objs.append(table_obj)
+
+        # One-time processing of query parameters
+        embedding_column = ""
+        for k, v in self.data["schema"].items():
+            if "vector" in v['type']:
+                embedding_column = k
+                break
+        if embedding_column:
+            query_params_config = self.data["index"][embedding_column].get("query_params", {})
+            self.query_options_ = {str(k): str(v) for k, v in query_params_config.items()}
 
     def do_single_query(self, query_id, client_id) -> list[Any]:
         result = None
         query = self.queries[query_id]
         table_obj = self.table_objs[client_id]
         if self.data_mode == "vector":
-            res, _ = (
+            res, _, _ = (
                 table_obj.output(["_row_id"])
                 .match_dense(
                     self.data["vector_name"],
@@ -173,12 +197,13 @@ class InfinityClient(BaseClient):
                     "float",
                     self.data["metric_type"],
                     self.data["topK"],
+                    self.query_options_,
                 )
                 .to_result()
             )
             result = res["ROW_ID"]
         elif self.data_mode == "fulltext":
-            res, _ = (
+            res, _, _ = (
                 table_obj.output(["_row_id", "_score"])
                 .match_text(
                     "",
Key Changes:
- Added self.query_options_ = {} in __init__ to initialize query options
- Fixed HNSW index parameter handling to use index_params instead of params
- Added query parameter processing in setup_clients to extract query_params from config
- Added self.query_options_ parameter to match_dense calls to pass query parameters per query
- Fixed return value unpacking from to_result() calls
This ensures that the ef parameter (and other query parameters) are properly passed to each query, which is crucial for accurate HNSW performance testing.
Questions for the Development Team
- Expected Behavior: Are these performance/recall trade-offs expected in the nightly build? Is the recall drop a known side effect of the performance optimizations?
- Root Cause: What specific changes in the HNSW implementation are causing the recall degradation? Is it related to:
  - Memory layout changes affecting search paths?
  - Concurrency improvements altering insertion order?
  - Search algorithm optimizations?
- Search Path Changes: Are the recall differences due to changes in HNSW search algorithms or just different memory layouts and insertion patterns?
- Performance vs Accuracy Trade-off: Is this an intentional trade-off where we gain query speed at the cost of some recall accuracy?
- Future Plans: Are there plans to improve recall while maintaining the performance gains?
- Configuration Impact: Do different HNSW parameters (M, ef_construction, ef) affect this trade-off differently?
- Client Fix: Should the query parameter processing fix be incorporated into the main benchmark client?
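To help answer the configuration question, one option is to sweep ef on both builds and compare the recall/latency curves. Below is a rough sketch of such a harness. It is not part of the benchmark client: search_fn is a hypothetical callable that would wrap the per-query match_dense call and return a list of row IDs, and fake_search is a toy stand-in so the example runs without a server.

```python
import time


def sweep_ef(search_fn, queries, ground_truth, ef_values, k):
    """Run every query once per ef value; report recall@k and mean latency."""
    results = []
    for ef in ef_values:
        options = {"ef": str(ef)}  # query options passed as strings, as in the patch
        hits = 0
        total_seconds = 0.0
        for query, truth in zip(queries, ground_truth):
            start = time.perf_counter()
            ids = search_fn(query, k, options)
            total_seconds += time.perf_counter() - start
            hits += len(set(ids[:k]) & set(truth[:k]))
        results.append({
            "ef": ef,
            "recall": hits / (k * len(queries)),
            "avg_ms": 1000.0 * total_seconds / len(queries),
        })
    return results


def fake_search(query, k, options):
    """Toy search: exact for ef >= 100, misses the last neighbor otherwise."""
    ids = [query * 10 + i for i in range(k)]
    if int(options["ef"]) < 100:
        ids[-1] = -1
    return ids


if __name__ == "__main__":
    queries = [0, 1, 2]
    truth = [[q * 10 + i for i in range(4)] for q in queries]
    for row in sweep_ef(fake_search, queries, truth, ef_values=[50, 600], k=4):
        print(row)
```

If the nightly build only reaches the stable build's recall at a much higher ef, that would point to a change in search or index construction rather than in the client.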
Verification Results
We've verified that:
- All data is properly inserted (no data loss)
- Missing keys exist in the database
- Similarity calculations are more accurate in nightly build
- The recall drop is due to search/index changes, not data integrity issues
- Query parameters (including ef) are now properly sent per query
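The similarity-score check can be reproduced by recomputing exact cosine similarity for the returned rows and comparing it with the reported scores. The sketch below assumes the reported score is a cosine similarity (not a distance) and that the original vectors are available by row ID; both are assumptions, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def max_score_error(query, vectors_by_id, returned):
    """Largest absolute gap between reported and exact scores.

    `returned` is a list of (row_id, reported_score) pairs.
    """
    return max(
        abs(score - cosine_similarity(query, vectors_by_id[row_id]))
        for row_id, score in returned
    )


if __name__ == "__main__":
    vectors = {7: [1.0, 0.0], 8: [0.0, 1.0]}
    reported = [(7, 0.99), (8, 0.0)]
    print(max_score_error([1.0, 0.0], vectors, reported))
```

Running this for both builds on the same queries gives a single number per build for how far the reported scores deviate from the exact values.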
Current Impact
- Positive: Significant query performance improvements
- Negative: Recall degradation might affect search quality
- Neutral: More accurate similarity scores
- Fixed: Query parameters now properly applied
Environment
- Stable version: v0.6.0-dev3
- Nightly build: Latest nightly version (docker id: 48932341cd18)
- Dataset: NYTimes 256-angular (https://ann-benchmarks.com/nytimes-256-angular.hdf5)
- Index: HNSW with cosine similarity
- Config: Standard HNSW config (LSG builder not enabled)
- Client: Fixed benchmark client with proper query parameter handling
Additional Context
The nightly build shows improved similarity calculation precision, suggesting the optimizations are generally beneficial. However, the recall drop raises questions about whether this is an acceptable trade-off for the performance gains.
We're simply seeking clarification on whether these observed changes are normal and expected.
Labels: performance, hnsw, recall-accuracy, nightly-build, client-fix
Priority: Medium
Component: HNSW Index