Local and Open Source Speech to Speech Assistant
AI Summary
Summary of Video Transcript
- The video is a tutorial on setting up a local voice assistant using the project Verbi.
- Verbi allows communication with LLMs (Large Language Models) through voice.
- The setup involves three local models: speech-to-text, LLM for generating responses, and text-to-speech.
- The hardware used in the tutorial is a MacBook Pro with an M2 chip and 96 GB of unified memory, but the models can also run on a CPU.
- The three API endpoints needed (see the wiring sketch after this list) are:
  - FastWhisperAPI for speech-to-text conversion.
  - A local LLM served through Ollama for generating responses.
  - MeloTTS for converting the response text back to speech.
- The tutorial provides step-by-step instructions for setting up each component.
- The setup process includes cloning repositories, installing packages, and running APIs.
- The video demonstrates Verbi in use, showing real-time response speeds and interactions with the voice assistant.
- The voice assistant can perform tasks and answer questions with varying response times based on hardware and text length.
- The video also mentions upcoming updates to Verbi, including a UI and improvements to the codebase.
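
To make the flow concrete, here is a minimal sketch of how the three local services chain together. It is not Verbi's actual code: the Ollama port and /api/generate route are Ollama's defaults, but the FastWhisperAPI URL and transcription route are assumptions (uvicorn serves on port 8000 by default; check the repository's README for the real path), and the model name is simply whatever you pulled with Ollama.

```python
# Minimal sketch of the speech-to-text -> LLM -> text-to-speech chain.
# Assumed services: FastWhisperAPI on localhost:8000 (uvicorn default) and
# Ollama on localhost:11434 (its default). The transcription route below is
# a guess in the OpenAI style -- adjust it to what the repo actually exposes.
import requests

FASTWHISPER_URL = "http://localhost:8000/v1/transcriptions"  # assumed route
OLLAMA_URL = "http://localhost:11434/api/generate"           # Ollama's standard route


def transcribe(wav_path: str) -> str:
    """Send a recorded WAV file to the local speech-to-text API."""
    with open(wav_path, "rb") as audio:
        resp = requests.post(FASTWHISPER_URL, files={"file": audio})
    resp.raise_for_status()
    return resp.json().get("text", "")


def generate(prompt: str, model: str = "llama3") -> str:
    """Ask the LLM served by Ollama for a reply (non-streaming)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    resp = requests.post(OLLAMA_URL, json=payload)
    resp.raise_for_status()
    return resp.json()["response"]


if __name__ == "__main__":
    question = transcribe("question.wav")  # 1. speech -> text
    answer = generate(question)            # 2. text  -> LLM response
    print(answer)                          # 3. pass `answer` to the TTS step (MeloTTS)
```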
Detailed Instructions and URLs
- Local LLM Setup:
  - Install Ollama and run the desired model with `ollama run <model_name>`.
- FastWhisperAPI Setup:
  - Clone the FastWhisperAPI repository (URL not provided).
  - Install the required packages with `pip install -r requirements.txt`.
  - Run the API with `uvicorn main:app --reload`.
- MeloTTS Setup:
  - Follow the installation instructions for your operating system (URL not provided).
  - Clone the MeloTTS repository (URL not provided).
  - Change into the cloned repo and install it with the provided command.
  - Download the text-to-speech model using the provided command (a usage sketch follows this list).
- Verbi Configuration:
  - Update `config.py` to use FastWhisperAPI, Ollama, and MeloTTS (a hypothetical example follows this list).
  - Run the voice assistant with `python run_voice_assistant.py`.
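
Once MeloTTS is installed, the text-to-speech stage can be exercised directly from Python. The snippet below follows the usage pattern shown in MeloTTS's README; the exact speaker key (e.g. "EN-US" or "EN-Default") depends on the English model you downloaded, so treat it as a starting point rather than Verbi's own TTS code.

```python
# Sketch of using MeloTTS directly to synthesize a reply to a WAV file.
# Requires MeloTTS to be installed and its English model downloaded first.
from melo.api import TTS

text = "Hello! Your local voice assistant is working."
model = TTS(language="EN", device="auto")   # "cpu" also works, just slower
speaker_ids = model.hps.data.spk2id         # available voices for this model

# The speaker key is an assumption -- print(speaker_ids) to see what you have.
model.tts_to_file(text, speaker_ids["EN-US"], "reply.wav", speed=1.0)
```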
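
The transcript does not show the contents of Verbi's `config.py`, so the snippet below is only a hypothetical illustration of the kind of switches such a file exposes: one provider name per stage plus the local URLs and model name. The actual option names and values come from the repository.

```python
# Hypothetical config.py -- illustrative only; the real option names are
# defined in Verbi's repository and may differ from these.
TRANSCRIPTION_MODEL = "fastwhisperapi"   # speech-to-text provider
RESPONSE_MODEL = "ollama"                # LLM provider
TTS_MODEL = "melotts"                    # text-to-speech provider

OLLAMA_LLM = "llama3"                          # whatever you pulled with `ollama run`
FASTWHISPERAPI_URL = "http://localhost:8000"   # assumed uvicorn default
OLLAMA_URL = "http://localhost:11434"          # Ollama default
```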
Additional Notes
- The video emphasizes the modularity of Verbi, allowing users to replace any of the endpoints with models of their choice (illustrated in the sketch at the end of these notes).
- The presenter advises checking out previous videos for the initial setup and an architecture overview of Verbi.
- Links to additional resources and videos are mentioned but not provided in the transcript.
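
As an illustration of the modularity point above, each stage of such an assistant can be looked up by the provider name from the config, so swapping a local endpoint for a hosted one is a one-line change. The provider names and functions here are invented for the example, not taken from Verbi's code.

```python
# Illustration of the modular design: each stage dispatches on a provider
# name, so endpoints can be swapped without touching the pipeline itself.
# All names below are invented for the example.
from typing import Callable, Dict


def fastwhisperapi_stt(wav_path: str) -> str:
    return "transcript from the local FastWhisperAPI endpoint"  # placeholder


def hosted_stt(wav_path: str) -> str:
    return "transcript from a hosted speech-to-text service"    # placeholder


STT_PROVIDERS: Dict[str, Callable[[str], str]] = {
    "fastwhisperapi": fastwhisperapi_stt,
    "hosted": hosted_stt,
}


def transcribe(provider: str, wav_path: str) -> str:
    """Pick the speech-to-text backend named in the config."""
    return STT_PROVIDERS[provider](wav_path)


print(transcribe("fastwhisperapi", "question.wav"))
```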