Introducing Agency Swarm Realtime Voice Interface



AI Summary

Video Summary

  • The video introduces a voice interface project that simplifies interaction with technology through voice commands.
  • The project is customizable and allows users to add their own tools and agents by dropping them into specific directories.
  • The project was forked from IndyDevDan's proof-of-concept (POC) Python realtime API assistant repo but has been significantly reworked for ease of use.
  • The new repo makes it easy to add tools, which share a consistent BaseTool class interface, and agents, which are added by placing them in the agencies folder.
  • The setup process includes cloning the repo, installing uv (a Python package manager), copying and configuring the .env file, installing PortAudio, and running uv sync.
  • Google Cloud credentials are required for tools that use Google Cloud APIs, such as checking calendars or emails.
  • The video demonstrates how to enable APIs, configure OAuth consent, and obtain credentials for Google Cloud.
  • Example tools are provided, and the video shows how to run the assistant with the command uv run main.
  • The video also covers how to create new tools and agents without writing code by hand, using the Cursor editor.
  • The process of adding a new tool to check unread Slack messages is demonstrated, including fixing errors and testing the tool.
  • The video concludes with thoughts on the similarity between the Realtime API and OpenAI's Assistants API, expressing hope for future integration.
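
The "consistent tool class interface" mentioned in the summary can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not code from the repo: BaseTool here is a standard-library stand-in written only for illustration (the real base class comes from the Agency Swarm framework and may differ in its details), and CountUnreadMessages with its inbox field is a hypothetical example tool.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


class BaseTool(ABC):
    """Stand-in for the framework's tool base class (assumption, illustration only)."""

    @abstractmethod
    def run(self) -> str:
        """Execute the tool and return a text result the assistant can speak."""


@dataclass
class CountUnreadMessages(BaseTool):
    """Hypothetical tool: count unread messages in an in-memory inbox."""

    # Each message is a dict with a boolean "unread" flag (made-up sample shape).
    inbox: list = field(default_factory=list)

    def run(self) -> str:
        unread = sum(1 for message in self.inbox if message.get("unread"))
        return f"You have {unread} unread message(s)."
```

The point of the shared interface is that every tool exposes its inputs as fields and its behavior through a single run method, so the assistant can call any tool the same way.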

Detailed Instructions and URLs

  1. Clone the repo.
  2. Install the uv package manager:
    • macOS/Linux: Run the command provided in the video to install uv.
  3. Copy the .env.sample file, rename the copy to .env, and add your OpenAI API key.
  4. Install PortAudio:
    • Run brew install portaudio in the terminal (Homebrew on macOS).
  5. Install the project dependencies by running uv sync.
  6. Enable Gmail API and Google Calendar API:
    • Go to console.cloud.google.com, create or select a project, and enable the APIs.
  7. Configure OAuth consent screen with the project name and required scopes.
  8. Add test users’ email addresses.
  9. Obtain credentials:
    • Create credentials in Google Cloud, download the JSON file, rename it to credentials.json, and place it in the voice interface repo directory.
  10. Run the assistant with uv run main.
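
Before running the assistant, the file-related steps above (the .env file and credentials.json) can be checked with a small script. The sketch below is an illustration, not part of the repo: check_setup and read_env_keys are hypothetical helper names, and OPENAI_API_KEY is assumed to be the variable name used in .env (confirm it against .env.sample).

```python
from pathlib import Path


def read_env_keys(env_path: Path) -> dict:
    """Parse simple KEY=VALUE lines from a .env file, skipping blanks and comments."""
    values = {}
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_setup(repo_dir: str) -> list:
    """Return a list of problems; an empty list means the basic files are in place."""
    repo = Path(repo_dir)
    problems = []

    env_file = repo / ".env"
    if not env_file.is_file():
        problems.append("Missing .env (copy .env.sample and rename it)")
    elif not read_env_keys(env_file).get("OPENAI_API_KEY"):  # assumed variable name
        problems.append("OPENAI_API_KEY is not set in .env")

    # credentials.json is only needed for tools that call Google Cloud APIs.
    if not (repo / "credentials.json").is_file():
        problems.append("Missing credentials.json (only needed for Google tools)")

    return problems
```

Calling check_setup(".") from the repo directory and printing the result gives a quick list of anything still missing.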

Additional Tips

  • If you encounter issues, ask ChatGPT for help.
  • Personalize the assistant by editing its instructions to add information about yourself and your tasks.
  • To use tools that require authentication, obtain Google Cloud credentials as described.
  • To add your own tools and agents, simply drop them into the respective directories (tools or agencies).
  • When creating new tools or agents, make sure they are placed in the correct folder, and install any additional packages they need with uv add <package_name>.
  • Run the assistant with uv run main to access your new tools.
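
To see why dropping a file into a folder can be enough, here is one way such drop-in discovery could work. The summary does not describe the repo's actual loading mechanism, so this is only an illustrative sketch: discover_tools is a hypothetical function, and treating "any class with a run method" as a tool is an assumption made for the example.

```python
import importlib.util
import inspect
from pathlib import Path


def discover_tools(tools_dir: str) -> dict:
    """Load every .py file in tools_dir and collect classes that define run()."""
    found = {}
    for path in sorted(Path(tools_dir).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helper modules
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for name, obj in inspect.getmembers(module, inspect.isclass):
            # Keep only classes defined in this file that expose a callable run().
            if obj.__module__ == module.__name__ and callable(getattr(obj, "run", None)):
                found[name] = obj
    return found
```

With a scheme like this, a new tool becomes available on the next start of the assistant without editing any registry or import list.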