Closed
Labels
feature request
Description
This issue tracks a finding from PR #713. With logical merge, search latency increases roughly linearly with split_num. Each graph is smaller, but the overall search time grows because every query triggers multiple searches, so there is no noticeable speedup.
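To illustrate why multiple searches add up, here is a minimal sketch of a logical-merge style search: each sub-index is searched on its own and the per-shard top-k lists are merged afterwards. It is not the cuVS implementation. The brute-force search, the helper names (`brute_force_search`, `logical_merge_search`), and the round-robin sharding are all assumptions made for illustration.

```python
import heapq
import random

def brute_force_search(shard, query, k):
    """Return the k nearest (squared_distance, global_id) pairs in one shard."""
    scored = (
        (sum((a - b) ** 2 for a, b in zip(vec, query)), gid)
        for gid, vec in shard
    )
    return heapq.nsmallest(k, scored)

def logical_merge_search(shards, query, k):
    """Hypothetical logical merge: one search per shard, then a top-k merge."""
    per_shard = [brute_force_search(s, query, k) for s in shards]
    merged = heapq.nsmallest(k, (hit for hits in per_shard for hit in hits))
    return merged, len(per_shard)

random.seed(0)
dataset = [(i, [random.random() for _ in range(8)]) for i in range(1000)]
query = [random.random() for _ in range(8)]

split_num = 4
shards = [dataset[i::split_num] for i in range(split_num)]  # round-robin split

merged, n_searches = logical_merge_search(shards, query, k=10)
single = brute_force_search(dataset, query, k=10)
print(n_searches, merged == single)
```

The merged result matches a search over the unsplit dataset, but the number of search calls per query equals split_num. If each call carries a roughly fixed cost that does not shrink with the shard size, latency scales with the number of shards.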
Setup: 1 × A6000 GPU, dataset: sift-128-euclidean, batch size = 1:
- Build
---------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations Split_num MergeStrategy UserCounters...
---------------------------------------------------------------------------------------------------------------------------
raft_cagra.dim32/process_time/real_time 7.51 s 212 s 1 1 No merge GPU=7.51419 graph_degree=32 index_size=1000k intermediate_graph_degree=48 ivf_pq_build_niter=10 ivf_pq_build_nlist=16.384k ivf_pq_build_pq_bits=8 ivf_pq_build_pq_dim=32 ivf_pq_build_ratio=10 ivf_pq_search_nprobe=30 ivf_pq_search_refine_ratio=1 split_num=1 dataset_memory_type="device"#graph_build_algo="NN_DESCENT"#ivf_pq_search_internalDistanceDtype="float"#ivf_pq_search_smemLutDtype="float"#merge_type="LOGICAL"
raft_cagra.dim32/process_time/real_time 7.55 s 211 s 1 4 Physical GPU=7.54935 graph_degree=32 index_size=1000k intermediate_graph_degree=48 ivf_pq_build_niter=10 ivf_pq_build_nlist=16.384k ivf_pq_build_pq_bits=8 ivf_pq_build_pq_dim=32 ivf_pq_build_ratio=10 ivf_pq_search_nprobe=30 ivf_pq_search_refine_ratio=1 split_num=4 dataset_memory_type="device"#graph_build_algo="NN_DESCENT"#ivf_pq_search_internalDistanceDtype="float"#ivf_pq_search_smemLutDtype="float"#merge_type="PHYSICAL"
raft_cagra.dim32/process_time/real_time 8.18 s 254 s 1 4 Logical GPU=8.17527 graph_degree=32 index_size=1000k intermediate_graph_degree=48 ivf_pq_build_niter=10 ivf_pq_build_nlist=16.384k ivf_pq_build_pq_bits=8 ivf_pq_build_pq_dim=32 ivf_pq_build_ratio=10 ivf_pq_search_nprobe=30 ivf_pq_search_refine_ratio=1 split_num=4 dataset_memory_type="device"#graph_build_algo="NN_DESCENT"#ivf_pq_search_internalDistanceDtype="float"#ivf_pq_search_smemLutDtype="float"#merge_type="LOGICAL"
- Search
----------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations Split_num MergeStrategy UserCounters...
----------------------------------------------------------------------------------------------------------------------------------
raft_cagra.dim32/process_time/real_time 1.26 ms 1.26 ms 44323 GPU=1.25917m Latency=1.26365m Recall=0.99851 end_to_end=56.009 items_per_second=791.355/s itopk=256 k=10 n_queries=1 total_queries=44.323k algo="single_cta"#dataset_memory_type="device"
raft_cagra.dim32/process_time/real_time 1.26 ms 1.26 ms 44651 GPU=1.25266m Latency=1.25714m Recall=0.99856 end_to_end=56.1326 items_per_second=795.455/s itopk=256 k=10 n_queries=1 total_queries=44.651k algo="single_cta"#dataset_memory_type="device"
raft_cagra.dim32/process_time/real_time 5.09 ms 5.09 ms 10989 GPU=5.08771m Latency=5.0925m Recall=0.99928 end_to_end=55.9615 items_per_second=196.367/s itopk=256 k=10 n_queries=1 total_queries=10.989k algo="single_cta"#dataset_memory_type="device"
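As a quick check of the "roughly linear" claim, the ratios can be computed directly from the search results above. This sketch assumes the three search rows are listed in the same order as the build rows (no merge, physical, logical), which the search output itself does not state. The dictionary keys are labels chosen for illustration.

```python
# Latency (ms) and throughput (queries/s) copied from the search results above.
# Assumption: rows appear in the same order as in the build results.
latency_ms = {"no_merge": 1.26365, "physical": 1.25714, "logical": 5.0925}
qps = {"no_merge": 791.355, "physical": 795.455, "logical": 196.367}
split_num = 4  # split_num used for the logical-merge run

slowdown = latency_ms["logical"] / latency_ms["no_merge"]
per_split_ms = latency_ms["logical"] / split_num
qps_drop = qps["no_merge"] / qps["logical"]

print(f"latency slowdown: {slowdown:.2f}x")
print(f"latency per sub-index: {per_split_ms:.3f} ms")
print(f"throughput drop: {qps_drop:.2f}x")
```

Under that assumption, the logical-merge latency is about 4 times the single-index latency for split_num=4, and the per-sub-index cost is close to the cost of searching the full unsplit index.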