SymRank is a blazing-fast Python library for top-k cosine similarity ranking, designed for vector search, retrieval-augmented generation (RAG), and embedding-based matching.
Built with a Rust + SIMD backend, it offers the speed of native code with the ease of Python.
- Fast: SIMD-accelerated cosine scoring with adaptive parallelism
- Smart: Automatically selects serial or parallel mode based on workload
- Top-K optimized: Efficient inlined heap selection (no full sort overhead)
- Pythonic: Easy-to-use Python API
- Powered by Rust: Safe, high-performance core engine
- Memory Efficient: Supports batching for speed and to reduce memory footprint
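
To make the "no full sort overhead" point concrete, here is a minimal pure-Python sketch of heap-based top-k selection. It is illustrative only: SymRank's Rust core is not shown in this README and may work differently, and the helper names `cosine` and `top_k_cosine` are hypothetical, not part of the SymRank API.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either norm is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_cosine(query, candidates, k=5):
    """Hypothetical helper: score (id, vector) pairs and keep the k best.

    A min-heap of at most k entries is maintained, so the cost is
    O(n log k) instead of the O(n log n) needed to sort every score.
    """
    if k <= 0:
        return []
    heap = []  # min-heap of (score, id); heap[0] is the weakest kept result
    for doc_id, vector in candidates:
        score = cosine(query, vector)
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # Replace the weakest kept result with the better candidate.
            heapq.heapreplace(heap, (score, doc_id))
    # Only the k survivors are sorted, never the full candidate list.
    return [{"id": d, "score": s} for s, d in sorted(heap, reverse=True)]
```

The gain comes from never ordering candidates that cannot make the final list: each new score is compared only against the weakest result currently kept.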
Install SymRank with `uv` or, alternatively, with `pip` (run one of the following):

    uv pip install symrank
    pip install symrank

Basic example using Python lists:

    import symrank as sr

    query = [0.1, 0.2, 0.3, 0.4]
    candidates = [
        ("doc_1", [0.1, 0.2, 0.3, 0.5]),
        ("doc_2", [0.9, 0.1, 0.2, 0.1]),
        ("doc_3", [0.0, 0.0, 0.0, 1.0]),
    ]

    results = sr.cosine_similarity(query, candidates, k=2)
    print(results)

Output:

    [{'id': 'doc_1', 'score': 0.9939991235733032}, {'id': 'doc_3', 'score': 0.7302967309951782}]

The same example using NumPy arrays:

    import symrank as sr
    import numpy as np

    query = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
    candidates = [
        ("doc_1", np.array([0.1, 0.2, 0.3, 0.5], dtype=np.float32)),
        ("doc_2", np.array([0.9, 0.1, 0.2, 0.1], dtype=np.float32)),
        ("doc_3", np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)),
    ]

    results = sr.cosine_similarity(query, candidates, k=2)
    print(results)

Output:

    [{'id': 'doc_1', 'score': 0.9939991235733032}, {'id': 'doc_3', 'score': 0.7302967309951782}]

Function signature:

    cosine_similarity(
        query_vector,       # List[float] or np.ndarray
        candidate_vectors,  # List[Tuple[str, List[float] or np.ndarray]]
        k=5,                # Number of top results to return
        batch_size=None     # Optional: set for memory-efficient batching
    )

| Parameter | Type | Default | Description |
|---|---|---|---|
| `query_vector` | `list[float]` or `np.ndarray` | required | The query vector to compare against the candidate vectors. |
| `candidate_vectors` | `list[tuple[str, list[float] or np.ndarray]]` | required | List of `(id, vector)` pairs. Each vector can be a list or a NumPy array. |
| `k` | `int` | `5` | Number of top results to return, sorted by descending similarity. |
| `batch_size` | `int` or `None` | `None` | Optional batch size to reduce memory usage. If `None`, uses SIMD directly. |
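
The README does not describe how `batch_size` is applied internally, so the following is only a conceptual sketch of batched top-k ranking in pure Python. It assumes candidates are scored one chunk at a time and merged into a running top-k, which keeps at most `batch_size + k` scores in memory. The names `cosine`, `chunks`, and `top_k_batched` are hypothetical and are not SymRank functions.

```python
import heapq
import math
from itertools import islice


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either norm is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def top_k_batched(query, candidates, k=5, batch_size=None):
    """Hypothetical helper: rank (id, vector) pairs one batch at a time."""
    if batch_size is not None and batch_size < 1:
        raise ValueError("batch_size must be a positive integer or None")
    # With batch_size=None, everything is scored in a single pass.
    batches = [list(candidates)] if batch_size is None else chunks(candidates, batch_size)
    best = []  # running top-k as (score, id), best first
    for batch in batches:
        scored = [(cosine(query, vec), doc_id) for doc_id, vec in batch]
        # Merge this batch into the running top-k and discard the rest.
        best = heapq.nlargest(k, best + scored)
    return [{"id": d, "score": s} for s, d in best]
```

Because only the running top-k survives between batches, `candidates` can be a generator that streams vectors from disk, and the ranking is the same as with a single pass.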
Returns a list of dictionaries with `id` and `score` (cosine similarity), sorted by descending similarity:

    [{"id": "doc_42", "score": 0.8763}, {"id": "doc_17", "score": 0.8451}, ...]

This project is licensed under the Apache License 2.0.
