* [S-LoRA](https://github.com/S-LoRA/S-LoRA) - Serving thousands of concurrent LoRA adapters.
* [Tempo](https://github.com/SeldonIO/tempo) - Open source SDK that provides a unified interface to multiple MLOps projects, enabling data scientists to deploy and productionise machine learning systems.
* [TensorFlow Serving](https://github.com/tensorflow/serving) - High-performance framework to serve TensorFlow models via the gRPC protocol, able to handle 100k requests per second per core.
* [text-generation-inference](https://github.com/huggingface/text-generation-inference) - Large language model text generation inference server.
* [TorchServe](https://github.com/pytorch/serve) - Flexible and easy-to-use tool for serving PyTorch models.
* [Transformer-deploy](https://github.com/ELS-RD/transformer-deploy/) - Efficient, scalable, enterprise-grade CPU/GPU inference server for Hugging Face transformer models.
* [Triton Inference Server](https://github.com/triton-inference-server/server) - High-performance open source serving software to deploy AI models from any framework on GPU and CPU while maximizing utilization.
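Several of the servers above (e.g. text-generation-inference) expose a plain HTTP/JSON inference API. As a minimal sketch of what a client looks like, the snippet below targets text-generation-inference's `/generate` endpoint, which accepts a JSON body with `inputs` and optional `parameters`; the host/port and the `max_new_tokens` value are illustrative assumptions, not defaults from any of these projects:

```python
import json
import urllib.request


def build_generate_body(prompt: str, max_new_tokens: int = 64) -> dict:
    """Build the JSON request body for text-generation-inference's /generate
    endpoint ("inputs" plus a "parameters" object)."""
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}


def generate(prompt: str, url: str = "http://localhost:8080/generate") -> str:
    """POST a prompt to a running server and return the generated text.

    The URL is a hypothetical local deployment; adjust it to wherever the
    server is actually listening.
    """
    body = json.dumps(build_generate_body(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]
```

Other servers in the list use their own request schemas (Triton, for instance, has a structured tensor-based HTTP/gRPC API), so only the general request/response pattern carries over.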