Introducing Agency Swarm Realtime Voice Interface
AI Summary
Video Summary
- The video introduces a voice interface project that simplifies interaction with technology through voice commands.
- The project is customizable and allows users to add their own tools and agents by dropping them into specific directories.
- The project was forked from IndyDevDan's POC Python realtime API assistant repo but has been significantly reworked for ease of use.
- The new repo makes it easy to add tools, which share a consistent BaseTool class interface, and agents, which only need to be placed in the agencies folder.
- The setup process includes cloning the repo, installing uv (a package manager), copying and configuring the `.env` file, installing PortAudio, and syncing with uv.
- Google Cloud credentials are required for tools that use Google Cloud APIs, such as checking calendars or emails.
- The video demonstrates how to enable APIs, configure OAuth consent, and obtain credentials for Google Cloud.
- Example tools are provided, and the video shows how to run the assistant with the command `uv run main`.
- The video also covers how to create new tools and agents without writing the code yourself by using Cursor.
- The process of adding a new tool to check unread Slack messages is demonstrated, including fixing errors and testing the tool.
- The video concludes with thoughts on the similarity between the Realtime API and OpenAI's Assistants API, expressing hope for future integration.
Detailed Instructions and URLs
- Clone the repo.
- Install the uv package manager:
- macOS/Linux: Run the command provided in the video to install uv.
- Copy the `.env.sample` file, rename it to `.env`, and add your OpenAI key.
- Install PortAudio:
- Run `brew install portaudio` in the terminal.
- Sync with uv by running `uv sync`.
- Enable the Gmail API and Google Calendar API:
- Go to console.cloud.google.com, create or select a project, and enable the APIs.
- Configure the OAuth consent screen with the project name and required scopes.
- Add test users’ email addresses.
- Obtain credentials:
- Create credentials in Google Cloud, download the JSON file, rename it to `credentials.json`, and place it in the voice interface repo directory.
- Run the assistant with `uv run main`.
Additional Tips
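Before the first run, it can help to confirm that the files from the setup steps are in place. The following is a minimal sketch of such a check and is not part of the repo. The function names `parse_env` and `preflight` are hypothetical, and the variable name `OPENAI_API_KEY` is an assumption (the summary only says to add your OpenAI key, so check `.env.sample` for the actual name). The `credentials.json` test relies on Google OAuth client files having a top-level `installed` or `web` key.

```python
from __future__ import annotations

import json
from pathlib import Path


def parse_env(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def preflight(repo_dir: str | Path, key_name: str = "OPENAI_API_KEY") -> list[str]:
    """Return a list of problems found; an empty list means the checks passed.

    key_name is an assumed variable name, not confirmed by the source.
    """
    repo = Path(repo_dir)
    problems: list[str] = []

    env_file = repo / ".env"
    if not env_file.is_file():
        problems.append("missing .env (copy .env.sample and fill it in)")
    elif not parse_env(env_file).get(key_name):
        problems.append(f"{key_name} is not set in .env")

    creds_file = repo / "credentials.json"
    if not creds_file.is_file():
        problems.append("missing credentials.json (needed only for Google tools)")
    else:
        try:
            data = json.loads(creds_file.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append("credentials.json is not valid JSON")
        else:
            # OAuth client files wrap their settings in "installed" or "web".
            if not isinstance(data, dict) or not any(
                k in data for k in ("installed", "web")
            ):
                problems.append("credentials.json does not look like an OAuth client file")
    return problems


if __name__ == "__main__":
    for problem in preflight("."):
        print("-", problem)
```

Run from the repo directory, the script prints one line per problem and nothing when every check passes. A missing `credentials.json` matters only if you plan to use the Google tools.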
- If you encounter issues, ask ChatGPT for help.
- To personalize the assistant, edit the assistant instructions to add information about yourself and your tasks.
- To use tools that require authentication, obtain Google Cloud credentials as described.
- To add your own tools and agents, simply drop them into the respective directories (`tools` or `agencies`).
- When creating new tools or agents, ensure they are placed in the correct folder and install any additional packages needed with `uv add <package_name>`.
- Run the assistant with `uv run main` to access your new tools.
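
The "drop a file into a directory" workflow above is a common plugin pattern. The sketch below shows one way such discovery can work, using only the standard library. It is an illustration of the general idea, not the repo's actual loading code. The function name `discover_tools` and the rule "any class with a callable `run` method counts as a tool" are assumptions made for this example.

```python
from __future__ import annotations

import importlib.util
import inspect
from pathlib import Path


def discover_tools(tools_dir: str | Path) -> dict[str, type]:
    """Import each .py file in tools_dir and collect classes that define run().

    Hypothetical helper: the real project may discover tools differently.
    """
    found: dict[str, type] = {}
    for path in sorted(Path(tools_dir).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private modules
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            continue
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for name, obj in inspect.getmembers(module, inspect.isclass):
            # Keep only classes defined in this file that expose a callable run().
            defined_here = obj.__module__ == module.__name__
            if defined_here and callable(getattr(obj, "run", None)):
                found[name] = obj
    return found
```

With a loader of this shape, adding a tool means adding one file, and no registry needs editing. That is the property the tips above rely on.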