Realtime AI in the Browser
AI Summary
Summary of AI-Powered Translation Tool Video
Concept
- Build an AI-powered translation tool, similar to a "Babel fish," using client-side technologies.
- Transcribe speech to text and broadcast it to an audience using Supabase.
- Audience can select their preferred language for translation.
Implementation Details
- Use the Whisper base model from Hugging Face for transcription; it runs in the browser and works offline once the model files have been downloaded.
- Channel ID is used to receive data through Supabase Realtime.
- Translation is done locally in the browser using another model.
- The tool works better with Latin languages due to model limitations.
Technical Underpinnings
- Uses Hugging Face Transformers.js and the open-source ONNX Runtime.
- Uses WebGPU, where the browser supports it, for better performance.
- The transcription is WebGPU-enabled, while translation is not yet WebGPU-enabled.
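The WebGPU check mentioned above can be sketched as follows. This is a minimal illustration, not the code from the video: the names `NavigatorLike` and `pickDevice` are hypothetical, and falling back to "wasm" is an assumption about what the app does when WebGPU is missing.

```typescript
// Minimal structural type so the sketch compiles without WebGPU typings.
// In a browser or worker, the real `navigator` object has this shape when WebGPU exists.
interface NavigatorLike {
  gpu?: { requestAdapter(): Promise<object | null> };
}

type Device = "webgpu" | "wasm";

// Resolve to "webgpu" only when the API is present AND an adapter is actually returned.
// `navigator.gpu` can exist on machines where no usable adapter is available,
// in which case requestAdapter() resolves to null.
async function pickDevice(nav: NavigatorLike | undefined): Promise<Device> {
  if (!nav?.gpu) return "wasm";
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    // Treat any failure while probing as "no WebGPU" (assumed fallback).
    return "wasm";
  }
}
```

In a worker this might be called as `pickDevice(self.navigator as NavigatorLike)`, with the result passed as the `device` option when creating the Transformers.js pipeline.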
Development Setup
- A React application with Vite is used, employing a hash router for GitHub Pages compatibility.
- The app consists of a broadcaster homepage and a receiver with a channel ID for subscriptions.
- Audio Web APIs and web workers are used for chunking and transcription tasks.
- The transcription worker follows the Singleton pattern to ensure that only one model instance is created.
- The model is loaded from Hugging Face, with WebGPU support checked and utilized if available.
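The Singleton idea for the transcription worker can be sketched roughly like this. The class name `LazySingleton` is hypothetical and the loader is injected, so the sketch does not claim to match the project's actual worker code; in the real worker the loader would presumably call `pipeline("automatic-speech-recognition", ...)` from Transformers.js.

```typescript
// Generic lazy singleton: the loader runs at most once per successful load,
// and every caller shares the same promise.
class LazySingleton<T> {
  private instance: Promise<T> | null = null;
  private readonly load: () => Promise<T>;

  constructor(load: () => Promise<T>) {
    this.load = load;
  }

  get(): Promise<T> {
    if (this.instance === null) {
      // Cache the promise, not the resolved value, so callers that arrive
      // while the model is still downloading do not trigger a second load.
      this.instance = this.load().catch((err) => {
        // Assumed policy: forget a failed load so a later call can retry.
        this.instance = null;
        throw err;
      });
    }
    return this.instance;
  }
}
```

Each message handler in the worker would then `await` the same `get()` call, so repeated audio chunks reuse one loaded model instead of loading it again.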
Broadcasting and Receiving
- Broadcasting uses a utility function to split the transcribed text into sentences and send them via Supabase Realtime.
- The receiver uses a translation worker with a similar Singleton pattern.
- The translation model is an open-source model from Meta that was trained on 200 languages.
- The receiver subscribes to the Supabase Realtime channel and listens for transcribed text to translate.
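The broadcast and receive steps above might look roughly like the sketch below. Several things here are assumptions rather than details from the video: the event name `transcript`, the payload fields, the helper names, and the use of NLLB-style language codes such as `eng_Latn` and `fra_Latn` (which would apply if the Meta model is NLLB-200). The channel and translator are passed in as small interfaces so the sketch stays independent of any library; they mirror the shape of a Supabase Realtime channel's `send` and a Transformers.js translation pipeline call.

```typescript
// Hypothetical message shape and event name.
interface TranscriptMessage { text: string; sourceLang: string }
const EVENT = "transcript";

// Structural subset of a Supabase Realtime channel.
// A real channel would come from `supabase.channel(channelId)`.
interface BroadcastSender {
  send(args: { type: "broadcast"; event: string; payload: TranscriptMessage }): Promise<unknown>;
}

// Structural subset of a translation pipeline's call signature.
type Translator = (
  text: string,
  opts: { src_lang: string; tgt_lang: string },
) => Promise<Array<{ translation_text: string }>>;

// Crude sentence splitter: break after ., ! or ? when followed by whitespace.
function splitSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Broadcaster side: send one message per sentence; returns how many were sent.
async function broadcastTranscript(
  channel: BroadcastSender,
  text: string,
  sourceLang: string,
): Promise<number> {
  const sentences = splitSentences(text);
  for (const sentence of sentences) {
    await channel.send({
      type: "broadcast",
      event: EVENT,
      payload: { text: sentence, sourceLang },
    });
  }
  return sentences.length;
}

// Receiver side: translate an incoming message into the listener's language.
async function translateMessage(
  translate: Translator,
  msg: TranscriptMessage,
  targetLang: string,
): Promise<string> {
  if (msg.sourceLang === targetLang) return msg.text; // nothing to translate
  const [first] = await translate(msg.text, {
    src_lang: msg.sourceLang,
    tgt_lang: targetLang,
  });
  return first?.translation_text ?? "";
}

// Wiring on the receiver (shown as comments, since it needs a live Supabase client):
// supabase.channel(channelId)
//   .on("broadcast", { event: EVENT }, ({ payload }) => translateMessage(translator, payload, lang))
//   .subscribe();
```

Sending sentence-sized messages keeps each translation call short, which matters because the translation model runs on the receiver's own device.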
Deployment and Testing
- A Supabase project is created with a generated project name and database password.
- The Supabase anon key and project URL are added to a local environment file.
- The Realtime service is checked to confirm that it is working.
- Debugging can be done using the Supabase Realtime debugger to listen to the channel.
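Reading the Supabase settings from the environment file could be sketched as below. The variable names `VITE_SUPABASE_URL` and `VITE_SUPABASE_ANON_KEY` and the helper `readSupabaseConfig` are assumptions for illustration; the only firm detail is that Vite exposes just the variables prefixed with `VITE_` to client code.

```typescript
interface SupabaseConfig {
  url: string;
  anonKey: string;
}

// Validate the two values early so a missing entry in the environment file
// fails with a clear message instead of a confusing connection error later.
function readSupabaseConfig(env: Record<string, string | undefined>): SupabaseConfig {
  const url = env["VITE_SUPABASE_URL"]; // assumed variable name
  const anonKey = env["VITE_SUPABASE_ANON_KEY"]; // assumed variable name
  if (!url || !anonKey) {
    throw new Error(
      "Missing VITE_SUPABASE_URL or VITE_SUPABASE_ANON_KEY in the environment file",
    );
  }
  return { url, anonKey };
}
```

In a Vite app this would typically be called as `readSupabaseConfig(import.meta.env)`, and the two values passed to `createClient` from `@supabase/supabase-js`.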
Final Notes
- The entire application is client-side, with no server involved.
- The code is available on the Supabase Community GitHub repo.
- The demo is hosted on GitHub Pages.
- The presenter encourages trying out the tool and providing feedback.