* [S-LoRA](https://github.com/S-LoRA/S-LoRA) - Serving thousands of concurrent LoRA adapters.
* [Tempo](https://github.com/SeldonIO/tempo) - Open source SDK that provides a unified interface to multiple MLOps projects, enabling data scientists to deploy and productionise machine learning systems.
* [TensorFlow Serving](https://github.com/tensorflow/serving) - High-performance framework to serve TensorFlow models via the gRPC protocol, able to handle 100k requests per second per core.
* [text-generation-inference](https://github.com/huggingface/text-generation-inference) - Large language model text generation inference server.
* [TorchServe](https://github.com/pytorch/serve) - Flexible and easy-to-use tool for serving PyTorch models.
* [Transformer-deploy](https://github.com/ELS-RD/transformer-deploy/) - Efficient, scalable, enterprise-grade CPU/GPU inference server for Hugging Face transformer models.
* [Triton Inference Server](https://github.com/triton-inference-server/server) - High-performance open source serving software to deploy AI models from any framework on GPU and CPU while maximizing utilization.
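Several of the servers above (e.g. text-generation-inference) expose a plain HTTP/JSON inference API. As a minimal sketch of what a client looks like, the snippet below targets text-generation-inference's `/generate` endpoint, which accepts a JSON body with `inputs` and optional `parameters`; the host/port and the `max_new_tokens` value are illustrative assumptions, not defaults from any of these projects:

```python
import json
import urllib.request


def build_generate_body(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON request body for text-generation-inference's /generate
    endpoint ("inputs" plus a "parameters" object)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def generate(prompt: str, url: str = "http://localhost:8080/generate") -> str:
    """POST a prompt to a running server and return the generated text.

    The URL is a hypothetical local deployment; adjust it to wherever the
    server is actually listening.
    """
    body = json.dumps(build_generate_body(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

Other servers in the list use their own request schemas (Triton, for instance, has a structured tensor-based HTTP/gRPC API), so only the general request/response pattern carries over.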