Realtime AI in the Browser



AI Summary

Summary of AI-Powered Translation Tool Video

Concept

  • Build an AI-powered translation tool, similar to a “Babel fish,” using client-side technologies.
  • Transcribe speech to text and broadcast it to an audience using Supabase.
  • Audience can select their preferred language for translation.

Implementation Details

  • Transcription uses the Whisper base model from Hugging Face, which runs entirely in the browser and works offline once the model has been downloaded.
  • Channel ID is used to receive data through Supabase Realtime.
  • Translation is done locally in the browser using another model.
  • The tool works better with Latin languages due to model limitations.
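
The receive-then-translate flow above can be sketched roughly as follows. This is an illustration only: the payload shape, the `handleIncoming` helper, and the injected `translate` function are hypothetical names, not taken from the project's code. In the real app, the message would arrive from a Supabase Realtime subscription and the translator would wrap the in-browser model.

```typescript
// Hypothetical payload shape for one broadcast message; the app's real
// message format is not given in the summary.
interface TranscriptPayload {
  text: string;
  sourceLang: string;
}

// Stand-in for the in-browser translation model (assumption).
type Translate = (text: string, from: string, to: string) => Promise<string>;

// Runtime check, since data coming off a channel is untyped.
function isTranscriptPayload(value: unknown): value is TranscriptPayload {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.text === "string" && typeof record.sourceLang === "string";
}

// Validate an incoming message and translate it into the listener's language.
// Malformed messages yield null instead of throwing.
async function handleIncoming(
  payload: unknown,
  targetLang: string,
  translate: Translate,
): Promise<string | null> {
  if (!isTranscriptPayload(payload)) return null;
  if (payload.sourceLang === targetLang) return payload.text; // nothing to translate
  return translate(payload.text, payload.sourceLang, targetLang);
}
```

Injecting the translator keeps the message handling testable without loading a model, which is a design choice of this sketch rather than something the video states.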

Technical Underpinnings

  • Built on Hugging Face Transformers.js and the open-source ONNX Runtime.
  • Implements WebGPU support for better performance.
  • The transcription is WebGPU-enabled, while translation is not yet WebGPU-enabled.
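
A WebGPU availability check along these lines could decide which backend to use. This is a minimal sketch under stated assumptions: `NavigatorLike`, `hasWebGpuApi`, and `pickDevice` are hypothetical names, and `"wasm"` is assumed as the fallback backend label. The only browser API relied on is `navigator.gpu.requestAdapter()`, which resolves to `null` when no adapter is available.

```typescript
// Minimal structural types so the sketch compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

type Device = "webgpu" | "wasm";

// Cheap synchronous check: is the WebGPU entry point exposed at all?
function hasWebGpuApi(nav: NavigatorLike): boolean {
  return typeof nav.gpu === "object" && nav.gpu !== null;
}

// Full check: the API can exist while no usable adapter is available,
// in which case requestAdapter() resolves to null.
async function pickDevice(nav: NavigatorLike): Promise<Device> {
  if (!hasWebGpuApi(nav)) return "wasm";
  try {
    const adapter = await nav.gpu!.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Treat any failure during adapter lookup as "WebGPU not usable".
    return "wasm";
  }
}
```

In the browser, the global `navigator` object would be passed in, and the result could then be supplied as the device setting when the transcription model is loaded.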

Development Setup

  • A React application with Vite is used, employing a hash router for GitHub Pages compatibility.
  • The app consists of a broadcaster homepage and a receiver with a channel ID for subscriptions.
  • Audio Web APIs and web workers are used for chunking and transcription tasks.
  • The transcription worker uses the Singleton pattern so that only one model instance is ever loaded.
  • The model is loaded from Hugging Face, with WebGPU support checked and utilized if available.
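
The Singleton idea for the worker's model can be sketched as a lazily initialized holder. This is not the project's actual class: `LazySingleton` and the fake loader below are hypothetical, and in the real worker the loader would be the call that downloads and initializes the model.

```typescript
// Generic lazy singleton: the loader runs at most once, and every caller
// shares the same promise (and therefore the same loaded model).
class LazySingleton<T> {
  private instance: Promise<T> | null = null;
  private readonly load: () => Promise<T>;

  constructor(load: () => Promise<T>) {
    this.load = load;
  }

  getInstance(): Promise<T> {
    if (this.instance === null) {
      // Store the promise itself so concurrent callers do not trigger
      // a second load while the first is still in flight.
      this.instance = this.load();
    }
    return this.instance;
  }
}

// Hypothetical stand-in for an expensive model load.
let loadCount = 0;
const transcriber = new LazySingleton(async () => {
  loadCount += 1;
  return { name: "fake-model" };
});

// Two requests, one load: both variables hold the same promise.
const first = transcriber.getInstance();
const second = transcriber.getInstance();
```

Caching the promise rather than the resolved value matters in a worker, where several audio chunks may arrive before the model has finished loading.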

Broadcasting and Receiving

  • Broadcasting uses a utility function to chunk the transcribed text into sentences and send them via Supabase Realtime.
  • The receiver uses a translation worker with a similar Singleton pattern.
  • The translation model is an open-source model from Meta, trained on 200 languages.
  • The receiver subscribes to the Supabase Realtime channel and listens for transcribed text to translate.
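
The broadcaster's chunk-and-send step might look something like the sketch below. The sentence splitter is deliberately naive, and `splitSentences`, `broadcastSentences`, the `"transcript"` event name, and the payload shape are all assumptions for illustration. `ChannelLike` is a structural stand-in for a Supabase Realtime channel, whose `send` method accepts a message of the form `{ type: "broadcast", event, payload }`.

```typescript
interface BroadcastMessage {
  type: "broadcast";
  event: string;
  payload: { text: string };
}

// Structural stand-in for a Realtime channel (only the method used here).
interface ChannelLike {
  send(message: BroadcastMessage): unknown;
}

// Split running text into finished sentences plus an unfinished remainder.
// Naive: any ".", "!" or "?" ends a sentence, so "3.14" would be split too.
function splitSentences(text: string): { complete: string[]; rest: string } {
  const complete: string[] = [];
  const pattern = /[^.!?]+[.!?]+/g;
  let consumed = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    complete.push(match[0].trim());
    consumed = pattern.lastIndex;
  }
  return { complete, rest: text.slice(consumed).trim() };
}

// Send each finished sentence as its own broadcast message and return the
// unfinished tail so the caller can prepend it to the next transcript chunk.
function broadcastSentences(channel: ChannelLike, buffer: string): string {
  const { complete, rest } = splitSentences(buffer);
  for (const text of complete) {
    channel.send({ type: "broadcast", event: "transcript", payload: { text } });
  }
  return rest;
}
```

Sending whole sentences rather than raw fragments gives the receiver's translation model complete units to work on, at the cost of a short delay until each sentence ends.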

Deployment and Testing

  • A Supabase project is created with a generated project name and database password.
  • The Supabase project URL and anon key are added to a local environment file.
  • The Realtime service is checked to confirm it is working.
  • Debugging can be done using the Supabase Realtime debugger to listen to the channel.
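
Reading the Supabase settings from the environment file could be guarded with a small validation helper like the one below. The variable names `VITE_SUPABASE_URL` and `VITE_SUPABASE_ANON_KEY` and the `readSupabaseConfig` function are assumptions, since the summary does not give them. What is certain is that Vite only exposes variables prefixed with `VITE_` to client code, through `import.meta.env`.

```typescript
interface SupabaseConfig {
  url: string;
  anonKey: string;
}

// Validate the two values the client needs and fail early with a clear
// message if either is missing from the environment.
function readSupabaseConfig(env: Record<string, string | undefined>): SupabaseConfig {
  const url = env.VITE_SUPABASE_URL; // hypothetical variable name
  const anonKey = env.VITE_SUPABASE_ANON_KEY; // hypothetical variable name
  if (!url || !anonKey) {
    throw new Error("Missing VITE_SUPABASE_URL or VITE_SUPABASE_ANON_KEY in the environment file");
  }
  return { url, anonKey };
}
```

In a Vite app the argument would be `import.meta.env`, and the returned values would then be handed to the Supabase client constructor.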

Final Notes

  • The entire application is client-side, with no server involved.
  • The code is available on the Supabase Community GitHub repo.
  • The demo is hosted on GitHub Pages.
  • The presenter encourages trying out the tool and providing feedback.
