Multi-Cluster-Optimisations #25

@eddableheath

For ease of implementation, the tiptoe client as adapted for multi-cluster search currently instantiates a new client for each cluster it searches, and each new client recomputes the query embedding.

This suggests two optimisations that would make querying more efficient:

  • Compute the embedding only once, during the first cluster search, and pass it on to the remaining cluster searches.
  • For the test datasets, preprocess all the queries by embedding them before the experiments run, and pass those embeddings through instead of recomputing the embedding for each query.
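A minimal sketch of both optimisations, in Python for illustration only. Every name here (`EmbeddingCache`, `search_clusters`, `toy_embed`, `toy_search`) is hypothetical and not taken from the tiptoe codebase; the real embedding and per-cluster search calls would replace the toy functions.

```python
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

Embedding = List[float]


class EmbeddingCache:
    """Hypothetical helper: computes each query's embedding at most once."""

    def __init__(self, embed_fn: Callable[[str], Embedding]) -> None:
        self._embed_fn = embed_fn
        self._cache: Dict[str, Embedding] = {}
        self.calls = 0  # number of times embed_fn was actually invoked

    def get(self, query: str) -> Embedding:
        # Only call the (expensive) embedding function on a cache miss.
        if query not in self._cache:
            self.calls += 1
            self._cache[query] = self._embed_fn(query)
        return self._cache[query]

    def preload(self, queries: Iterable[str]) -> None:
        # Second optimisation: embed all test queries before the experiments.
        for query in queries:
            self.get(query)


def search_clusters(
    query: str,
    clusters: Sequence[int],
    cache: EmbeddingCache,
    search_fn: Callable[[int, Embedding], Tuple[int, float]],
) -> List[Tuple[int, float]]:
    # First optimisation: embed once, then reuse the same embedding
    # for every cluster instead of re-embedding per cluster.
    embedding = cache.get(query)
    return [search_fn(cluster, embedding) for cluster in clusters]


# Toy stand-ins (assumptions) for the real embedding model and cluster search.
def toy_embed(query: str) -> Embedding:
    return [float(len(query))]


def toy_search(cluster: int, embedding: Embedding) -> Tuple[int, float]:
    return (cluster, sum(embedding))
```

With this shape, a single cache can be shared across all cluster searches for a query, and calling `preload` on the test queries up front means no embedding work happens inside the timed experiment loop.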

Additionally, the client is currently single-threaded because something was going wrong with the buffering of calls to the servers: it runs only one query at a time and searches only one cluster at a time. (This might be best tracked as a separate issue.)

Labels

enhancement (New feature or request)
