
Conversation


@HuanzhiMao commented on Aug 7, 2024

This PR introduces multi-threading to parallelize API calls to the hosted model endpoints, significantly speeding up model response generation.

Users can specify the number of threads to use for parallel inference by setting the --num-threads flag. The default is 1, meaning no parallel inference.
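For context, here is a minimal sketch of what threaded inference against a hosted endpoint typically looks like. The names call_endpoint and generate_responses are hypothetical stand-ins for illustration, not the actual functions changed in this PR:

```python
from concurrent.futures import ThreadPoolExecutor

def call_endpoint(prompt):
    # Placeholder for one blocking HTTP request to a hosted model API.
    return f"response to {prompt!r}"

def generate_responses(prompts, num_threads=1):
    # num_threads=1 preserves the original sequential behavior.
    if num_threads == 1:
        return [call_endpoint(p) for p in prompts]
    # The calls are I/O-bound, so a thread pool overlaps the network waits.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(call_endpoint, prompts))
```

With, say, --num-threads 8, requests fan out across eight worker threads, so wall-clock time is bounded by the slowest responses rather than the sum of all of them.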


@CharlieJCJ left a comment


Tested on hosted models (e.g. GPT, Claude, Cohere, Gorilla Openfunctions models). LGTM


@ShishirPatil left a comment


Tested by @CharlieJCJ

@ShishirPatil merged commit de8307b into ShishirPatil:main on Aug 12, 2024
@HuanzhiMao deleted the parallel-inference branch on August 12, 2024 at 06:01
@ShishirPatil commented:

Hey @HuanzhiMao, can you also update the README with this flag?
