XRILLC/vui

vui

Small conversational speech models that can run on-device.

Installation

uv pip install -e .

Demo

Try on Gradio

python demo.py

Models

  • Vui.BASE is the base checkpoint, trained on 40k hours of audio conversations.
  • Vui.ABRAHAM is a single-speaker model that can reply with context awareness.
  • Vui.COHOST is a checkpoint with two speakers that can talk to each other.

Voice Cloning

The base model can clone voices fairly well, but the results are not perfect: it hasn't seen that much audio and wasn't trained for long.

Research

vui is a Llama-based transformer that predicts audio tokens.
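To make "predicts audio tokens" concrete, here is a minimal sketch of an autoregressive sampling loop. It is not vui's code: `toy_next_token_logits` is a hypothetical stand-in for the transformer, and `VOCAB_SIZE`, `sample`, and `generate` are illustrative names. In a real system the generated token IDs would presumably be decoded back to a waveform by an audio codec, which is outside this sketch.

```python
import math
import random

VOCAB_SIZE = 16  # hypothetical codebook size, chosen only for illustration


def toy_next_token_logits(tokens):
    """Stand-in for the transformer: one logit per codebook entry."""
    last = tokens[-1] if tokens else 0
    # Favor the entry right after the previous token so output is easy to inspect.
    return [3.0 if i == (last + 1) % VOCAB_SIZE else 0.0 for i in range(VOCAB_SIZE)]


def sample(logits, temperature, rng):
    """Draw an index from softmax(logits / temperature)."""
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def generate(prompt, n_new, temperature=1.0, seed=0):
    """Append n_new sampled tokens to the prompt, one token at a time."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(n_new):
        logits = toy_next_token_logits(tokens)  # condition on everything so far
        tokens.append(sample(logits, temperature, rng))
    return tokens
```

Calling `generate([0], 5)` returns the prompt followed by five sampled token IDs. Lowering `temperature` makes the loop pick the highest-scoring token more often, which trades variety for predictability.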

FAQ

  1. Developed on two 4090s: https://x.com/harrycblum/status/1752698806184063153
  2. Hallucinations: yes, the model does hallucinate, but this is the best I could do with limited resources! :(
  3. VAD does slow things down, but it is needed to help remove areas of silence.
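The README does not say which VAD vui uses or where it runs, so the following is only a rough sketch of the general idea in point 3: split the audio into frames, measure each frame's energy, and drop the frames that fall below a threshold. The function names, the 160-sample frame length, and the 0.01 threshold are all assumptions made for illustration. Production VADs are usually learned models rather than a plain energy gate.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square energy of each non-overlapping frame."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [math.sqrt(sum(x * x for x in frame) / len(frame)) for frame in frames]


def drop_silence(samples, frame_len=160, threshold=0.01):
    """Keep only the frames whose RMS energy reaches the threshold."""
    kept = []
    for idx, rms in enumerate(frame_rms(samples, frame_len)):
        if rms >= threshold:
            start = idx * frame_len
            kept.extend(samples[start:start + frame_len])
    return kept
```

The extra pass over every frame is also a simple way to see why VAD adds latency: the audio has to be scanned before it can be used.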

Citation

@software{vui_2025,
  author = {Coultas Blum, Harry},
  month = {01},
  title = {{vui}},
  url = {https://github.com/fluxions-ai/vui},
  version = {1.0.0},
  year = {2025}
}
