
Commit cb0ed55

feat(neutts): add backend (#6404)

* feat(neutts): add backend
* chore(ci): add images to CI
* chore(gallery): add Neutts
* Make it work with quantized versions
* Fixups
* Docs
* Fixups
* Apply suggestion from @mudler
* Apply suggestion from @mudler
* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <[email protected]>

1 parent 2fe9711 commit cb0ed55

File tree

18 files changed: +492 -2 lines changed


.github/workflows/backend.yml

Lines changed: 49 additions & 0 deletions
@@ -993,6 +993,55 @@ jobs:
           backend: "kitten-tts"
           dockerfile: "./backend/Dockerfile.python"
           context: "./backend"
+        # neutts
+        - build-type: ''
+          cuda-major-version: ""
+          cuda-minor-version: ""
+          platforms: 'linux/amd64,linux/arm64'
+          tag-latest: 'auto'
+          tag-suffix: '-cpu-neutts'
+          runs-on: 'ubuntu-latest'
+          base-image: "ubuntu:22.04"
+          skip-drivers: 'false'
+          backend: "neutts"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./backend"
+        - build-type: 'cublas'
+          cuda-major-version: "12"
+          cuda-minor-version: "0"
+          platforms: 'linux/amd64'
+          tag-latest: 'auto'
+          tag-suffix: '-gpu-nvidia-cuda-12-neutts'
+          runs-on: 'ubuntu-latest'
+          base-image: "ubuntu:22.04"
+          skip-drivers: 'false'
+          backend: "neutts"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./backend"
+        - build-type: 'hipblas'
+          cuda-major-version: ""
+          cuda-minor-version: ""
+          platforms: 'linux/amd64'
+          tag-latest: 'auto'
+          tag-suffix: '-gpu-rocm-hipblas-neutts'
+          runs-on: 'arc-runner-set'
+          base-image: "rocm/dev-ubuntu-22.04:6.4.3"
+          skip-drivers: 'false'
+          backend: "neutts"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./backend"
+        - build-type: 'l4t'
+          cuda-major-version: "12"
+          cuda-minor-version: "0"
+          platforms: 'linux/arm64'
+          skip-drivers: 'true'
+          tag-latest: 'auto'
+          tag-suffix: '-nvidia-l4t-arm64-neutts'
+          base-image: "nvcr.io/nvidia/l4t-jetpack:r36.4.0"
+          runs-on: 'ubuntu-24.04-arm'
+          backend: "neutts"
+          dockerfile: "./backend/Dockerfile.python"
+          context: "./backend"
   backend-jobs-darwin:
     uses: ./.github/workflows/backend_build_darwin.yml
     strategy:
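
Each matrix entry publishes a capability-specific image with a `-neutts` tag suffix. Once CI has pushed them, the images should be pullable directly; a quick check, using the quay.io tag that backend/index.yaml (further below) assigns to the CPU variant:

    docker pull quay.io/go-skynet/local-ai-backends:latest-cpu-neutts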

Makefile

Lines changed: 9 additions & 0 deletions
@@ -376,6 +376,9 @@ backends/llama-cpp-darwin: build
 	bash ./scripts/build/llama-cpp-darwin.sh
 	./local-ai backends install "ocifile://$(abspath ./backend-images/llama-cpp.tar)"
 
+backends/neutts: docker-build-neutts docker-save-neutts build
+	./local-ai backends install "ocifile://$(abspath ./backend-images/neutts.tar)"
+
 build-darwin-python-backend: build
 	bash ./scripts/build/python-darwin.sh
 
@@ -432,6 +435,12 @@ docker-save-kitten-tts: backend-images
 docker-save-chatterbox: backend-images
 	docker save local-ai-backend:chatterbox -o backend-images/chatterbox.tar
 
+docker-build-neutts:
+	docker build --build-arg BUILD_TYPE=$(BUILD_TYPE) --build-arg BASE_IMAGE=$(BASE_IMAGE) -t local-ai-backend:neutts -f backend/Dockerfile.python --build-arg BACKEND=neutts ./backend
+
+docker-save-neutts: backend-images
+	docker save local-ai-backend:neutts -o backend-images/neutts.tar
+
 docker-build-kokoro:
 	docker build --build-arg BUILD_TYPE=$(BUILD_TYPE) --build-arg BASE_IMAGE=$(BASE_IMAGE) -t local-ai-backend:kokoro -f backend/Dockerfile.python --build-arg BACKEND=kokoro ./backend
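
Taken together, the three new targets form the local build pipeline: docker-build-neutts builds the backend image, docker-save-neutts exports it to backend-images/neutts.tar, and backends/neutts installs the tarball into LocalAI over ocifile://. A sketch of a local invocation; BUILD_TYPE and BASE_IMAGE are the same optional overrides the docker-build recipe forwards as build args:

    # CPU build with the defaults
    make backends/neutts

    # CUDA-enabled build (assumes the cublas build type used by the CI matrix above)
    make BUILD_TYPE=cublas backends/neutts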

README.md

Lines changed: 2 additions & 1 deletion
@@ -269,6 +269,7 @@ LocalAI supports a comprehensive range of AI backends with multiple acceleration
 | **piper** | Fast neural TTS system | CPU |
 | **kitten-tts** | Kitten TTS models | CPU |
 | **silero-vad** | Voice Activity Detection | CPU |
+| **neutts** | Text-to-speech with voice cloning | CUDA 12, ROCm, CPU |
 
 ### Image & Video Generation
 | Backend | Description | Acceleration Support |
@@ -290,7 +291,7 @@ LocalAI supports a comprehensive range of AI backends with multiple acceleration
 |-------------------|-------------------|------------------|
 | **NVIDIA CUDA 11** | llama.cpp, whisper, stablediffusion, diffusers, rerankers, bark, chatterbox | Nvidia hardware |
 | **NVIDIA CUDA 12** | All CUDA-compatible backends | Nvidia hardware |
-| **AMD ROCm** | llama.cpp, whisper, vllm, transformers, diffusers, rerankers, coqui, kokoro, bark | AMD Graphics |
+| **AMD ROCm** | llama.cpp, whisper, vllm, transformers, diffusers, rerankers, coqui, kokoro, bark, neutts | AMD Graphics |
 | **Intel oneAPI** | llama.cpp, whisper, stablediffusion, vllm, transformers, diffusers, rfdetr, rerankers, exllama2, coqui, kokoro, bark | Intel Arc, Intel iGPUs |
 | **Apple Metal** | llama.cpp, whisper, diffusers, MLX, MLX-VLM, bark-cpp | Apple M1/M2/M3+ |
 | **Vulkan** | llama.cpp, whisper, stablediffusion | Cross-platform GPUs |

backend/Dockerfile.python

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ RUN apt-get update && \
     curl python3-pip \
     python-is-python3 \
     python3-dev llvm \
-    python3-venv make && \
+    python3-venv make cmake && \
     apt-get clean && \
     rm -rf /var/lib/apt/lists/* && \
     pip install --upgrade pip

backend/index.yaml

Lines changed: 62 additions & 0 deletions
@@ -427,6 +427,68 @@
     - text-to-speech
     - TTS
   license: apache-2.0
+- &neutts
+  name: "neutts"
+  urls:
+    - https://github.com/neuphonic/neutts-air
+  description: |
+    NeuTTS Air is the world’s first super-realistic, on-device, TTS speech language model with instant voice cloning. Built off a 0.5B LLM backbone, NeuTTS Air brings natural-sounding speech, real-time performance, built-in security and speaker cloning to your local device - unlocking a new category of embedded voice agents, assistants, toys, and compliance-safe apps.
+  tags:
+    - text-to-speech
+    - TTS
+  license: apache-2.0
+  capabilities:
+    default: "cpu-neutts"
+    nvidia: "cuda12-neutts"
+    amd: "rocm-neutts"
+    nvidia-l4t: "nvidia-l4t-neutts"
+- !!merge <<: *neutts
+  name: "neutts-development"
+  capabilities:
+    default: "cpu-neutts-development"
+    nvidia: "cuda12-neutts-development"
+    amd: "rocm-neutts-development"
+    nvidia-l4t: "nvidia-l4t-neutts-development"
+- !!merge <<: *neutts
+  name: "cpu-neutts"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-neutts"
+  mirrors:
+    - localai/localai-backends:latest-cpu-neutts
+- !!merge <<: *neutts
+  name: "cuda12-neutts"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-neutts"
+  mirrors:
+    - localai/localai-backends:latest-gpu-nvidia-cuda-12-neutts
+- !!merge <<: *neutts
+  name: "rocm-neutts"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-rocm-hipblas-neutts"
+  mirrors:
+    - localai/localai-backends:latest-gpu-rocm-hipblas-neutts
+- !!merge <<: *neutts
+  name: "nvidia-l4t-neutts"
+  uri: "quay.io/go-skynet/local-ai-backends:latest-nvidia-l4t-arm64-neutts"
+  mirrors:
+    - localai/localai-backends:latest-nvidia-l4t-arm64-neutts
+- !!merge <<: *neutts
+  name: "cpu-neutts-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-cpu-neutts"
+  mirrors:
+    - localai/localai-backends:master-cpu-neutts
+- !!merge <<: *neutts
+  name: "cuda12-neutts-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-nvidia-cuda-12-neutts"
+  mirrors:
+    - localai/localai-backends:master-gpu-nvidia-cuda-12-neutts
+- !!merge <<: *neutts
+  name: "rocm-neutts-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-rocm-hipblas-neutts"
+  mirrors:
+    - localai/localai-backends:master-gpu-rocm-hipblas-neutts
+- !!merge <<: *neutts
+  name: "nvidia-l4t-neutts-development"
+  uri: "quay.io/go-skynet/local-ai-backends:master-nvidia-l4t-arm64-neutts"
+  mirrors:
+    - localai/localai-backends:master-nvidia-l4t-arm64-neutts
 - !!merge <<: *mlx
   name: "mlx-development"
   uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-mlx"

backend/python/neutts/Makefile

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+.PHONY: neutts
+neutts:
+	bash install.sh
+
+.PHONY: run
+run: neutts
+	@echo "Running neutts..."
+	bash run.sh
+	@echo "neutts run."
+
+.PHONY: test
+test: neutts
+	@echo "Testing neutts..."
+	bash test.sh
+	@echo "neutts tested."
+
+.PHONY: protogen-clean
+protogen-clean:
+	$(RM) backend_pb2_grpc.py backend_pb2.py
+
+.PHONY: clean
+clean: protogen-clean
+	rm -rf venv __pycache__
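
This per-backend Makefile follows the same contract as the other Python backends: install.sh provisions the environment, while run.sh and test.sh (referenced by the recipes but not part of this diff) start and exercise the gRPC server. For example, from the repository root:

    make -C backend/python/neutts test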

backend/python/neutts/backend.py

Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
#!/usr/bin/env python3
"""
This is an extra gRPC server of LocalAI for NeuTTSAir
"""
from concurrent import futures
import time
import argparse
import signal
import sys
import os
import backend_pb2
import backend_pb2_grpc
import torch
from neuttsair.neutts import NeuTTSAir
import soundfile as sf

import grpc

def is_float(s):
    """Check if a string can be converted to float."""
    try:
        float(s)
        return True
    except ValueError:
        return False

def is_int(s):
    """Check if a string can be converted to int."""
    try:
        int(s)
        return True
    except ValueError:
        return False

_ONE_DAY_IN_SECONDS = 60 * 60 * 24

# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))

# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
    """
    BackendServicer is the class that implements the gRPC service
    """
    def Health(self, request, context):
        return backend_pb2.Reply(message=bytes("OK", 'utf-8'))

    def LoadModel(self, request, context):
        # Get device
        # device = "cuda" if request.CUDA else "cpu"
        if torch.cuda.is_available():
            print("CUDA is available", file=sys.stderr)
            device = "cuda"
        else:
            print("CUDA is not available", file=sys.stderr)
            device = "cpu"
        mps_available = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
        if mps_available:
            device = "mps"
        if not torch.cuda.is_available() and request.CUDA:
            return backend_pb2.Result(success=False, message="CUDA is not available")

        options = request.Options

        # empty dict
        self.options = {}
        self.ref_text = None

        # The options are a list of strings in this form optname:optvalue
        # We are storing all the options in a dict so we can use it later when
        # generating the images
        for opt in options:
            if ":" not in opt:
                continue
            key, value = opt.split(":")
            # if value is a number, convert it to the appropriate type
            if is_float(value):
                value = float(value)
            elif is_int(value):
                value = int(value)
            elif value.lower() in ["true", "false"]:
                value = value.lower() == "true"
            self.options[key] = value

        codec_repo = "neuphonic/neucodec"
        if "codec_repo" in self.options:
            codec_repo = self.options["codec_repo"]
            del self.options["codec_repo"]
        if "ref_text" in self.options:
            self.ref_text = self.options["ref_text"]
            del self.options["ref_text"]

        self.AudioPath = None

        if os.path.isabs(request.AudioPath):
            self.AudioPath = request.AudioPath
        elif request.AudioPath and request.ModelFile != "" and not os.path.isabs(request.AudioPath):
            # get base path of modelFile
            modelFileBase = os.path.dirname(request.ModelFile)
            # modify LoraAdapter to be relative to modelFileBase
            self.AudioPath = os.path.join(modelFileBase, request.AudioPath)
        try:
            print("Preparing models, please wait", file=sys.stderr)
            self.model = NeuTTSAir(backbone_repo=request.Model, backbone_device=device, codec_repo=codec_repo, codec_device=device)
        except Exception as err:
            return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
        # Implement your logic here for the LoadModel service
        # Replace this with your desired response
        return backend_pb2.Result(message="Model loaded successfully", success=True)

    def TTS(self, request, context):
        try:
            kwargs = {}

            # add options to kwargs
            kwargs.update(self.options)

            ref_codes = self.model.encode_reference(self.AudioPath)

            wav = self.model.infer(request.text, ref_codes, self.ref_text)

            sf.write(request.dst, wav, 24000)
        except Exception as err:
            return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
        return backend_pb2.Result(success=True)

def serve(address):
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS),
        options=[
            ('grpc.max_message_length', 50 * 1024 * 1024),  # 50MB
            ('grpc.max_send_message_length', 50 * 1024 * 1024),  # 50MB
            ('grpc.max_receive_message_length', 50 * 1024 * 1024),  # 50MB
        ])
    backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
    server.add_insecure_port(address)
    server.start()
    print("Server started. Listening on: " + address, file=sys.stderr)

    # Define the signal handler function
    def signal_handler(sig, frame):
        print("Received termination signal. Shutting down...")
        server.stop(0)
        sys.exit(0)

    # Set the signal handlers for SIGINT and SIGTERM
    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    try:
        while True:
            time.sleep(_ONE_DAY_IN_SECONDS)
    except KeyboardInterrupt:
        server.stop(0)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Run the gRPC server.")
    parser.add_argument(
        "--addr", default="localhost:50051", help="The address to bind the server to."
    )
    args = parser.parse_args()

    serve(args.addr)
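
For orientation, a minimal client sketch against this servicer — not part of the commit. It assumes the generated backend_pb2/backend_pb2_grpc stubs from LocalAI's backend.proto are importable and that the message and field names mirror the ones the servicer reads (Model, AudioPath, Options, text, dst); the model name, sample path, and transcript below are hypothetical.

    # smoke_test.py - hypothetical client for the neutts backend (sketch only)
    import grpc
    import backend_pb2
    import backend_pb2_grpc

    channel = grpc.insecure_channel("localhost:50051")
    stub = backend_pb2_grpc.BackendStub(channel)

    # Options entries use the "optname:optvalue" form parsed by LoadModel above;
    # ref_text is the transcript paired with the reference audio being cloned.
    res = stub.LoadModel(backend_pb2.ModelOptions(
        Model="neuphonic/neutts-air",               # backbone repo handed to NeuTTSAir
        AudioPath="/models/samples/reference.wav",  # absolute path, so used as-is
        Options=["ref_text:My name is Dave and I am from London"],
    ))
    assert res.success, res.message

    # The servicer encodes the reference, runs inference, and writes a 24 kHz WAV to dst.
    res = stub.TTS(backend_pb2.TTSRequest(text="Hello from NeuTTS Air!", dst="/tmp/out.wav"))
    assert res.success, res.message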

backend/python/neutts/install.sh

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
#!/bin/bash
set -e

backend_dir=$(dirname $0)
if [ -d $backend_dir/common ]; then
    source $backend_dir/common/libbackend.sh
else
    source $backend_dir/../common/libbackend.sh
fi

# This is here because the Intel pip index is broken and returns 200 status codes for every package name, it just doesn't return any package links.
# This makes uv think that the package exists in the Intel pip index, and by default it stops looking at other pip indexes once it finds a match.
# We need uv to continue falling through to the pypi default index to find optimum[openvino] in the pypi index
# the --upgrade actually allows us to *downgrade* torch to the version provided in the Intel pip index
if [ "x${BUILD_PROFILE}" == "xintel" ]; then
    EXTRA_PIP_INSTALL_FLAGS+=" --upgrade --index-strategy=unsafe-first-match"
fi

if [ "x${BUILD_TYPE}" == "xcublas" ] || [ "x${BUILD_TYPE}" == "xl4t" ]; then
    export CMAKE_ARGS="-DGGML_CUDA=on"
fi

if [ "x${BUILD_TYPE}" == "xhipblas" ]; then
    export CMAKE_ARGS="-DGGML_HIPBLAS=on"
fi

EXTRA_PIP_INSTALL_FLAGS+=" --no-build-isolation"

git clone https://github.com/neuphonic/neutts-air neutts-air

cp -rfv neutts-air/neuttsair ./

installRequirements
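
The script keys off the same BUILD_TYPE values as the CI matrix: cublas and l4t export -DGGML_CUDA=on, hipblas exports -DGGML_HIPBLAS=on, and installRequirements (from the shared libbackend.sh) then builds the Python dependencies — this native build step is why cmake was added to Dockerfile.python above. A hedged local run, assuming the shared backend/python/common/ helpers are in place as the script expects:

    BUILD_TYPE=cublas bash backend/python/neutts/install.sh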
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+datasets==4.1.1
+torchtune==0.6.1
