ThirdBrAIn.tech

Realtime AI in the Browser

Aug 07, 2024, 2 min read

realtime-AI
browser-AI
speech-transcription
language-translation
web-development
Hugging-Face-Transformers
WebGPU
React-Vite
client-side-AI
open-source-models
YT/2024/M08
YT/2024/W32

AI Summary

Summary of AI-Powered Babel Fish Video

Concept

Create an AI-powered Babel Fish that transcribes speech to text and broadcasts it to an audience in real time.

Audience can select their preferred language for translation.

Implementation

Run the Whisper base model from Hugging Face directly in the browser, so transcription works offline once the model has been downloaded.

Broadcast transcribed text using Supabase Realtime.

Translate text into different languages using another model in the browser.

Technical Details

Use Hugging Face Transformers.js with ONNX Runtime to execute the models in the browser.

Use WebGPU for better performance where it is supported (transcription is WebGPU-enabled, translation is not yet).

The application is a client-side JavaScript application using React and Vite.

Use React Router with a hash router for GitHub Pages compatibility.

Audio Web APIs are used for chunking audio for transcription.

Web workers handle transcription and translation tasks.
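The summary does not say how the audio is chunked, so the following is only a minimal sketch of the idea. It assumes the recorded audio has already been decoded to mono PCM samples at 16 kHz (the rate Whisper models expect). The helper name `chunkSamples` and the chunk duration are hypothetical and not taken from the video.

```typescript
// Hypothetical helper: split mono PCM samples into fixed-length chunks
// that can be posted to a transcription worker one at a time.
// Assumption: 16 kHz input, the sample rate Whisper models expect.
const SAMPLE_RATE = 16000;

function chunkSamples(
  samples: Float32Array,
  chunkSeconds: number,
  sampleRate: number = SAMPLE_RATE,
): Float32Array[] {
  const chunkSize = Math.floor(chunkSeconds * sampleRate);
  if (chunkSize <= 0) {
    throw new RangeError("chunkSeconds must yield at least one sample");
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    // subarray returns a view over the same buffer, so no samples are copied.
    // The final chunk may be shorter than chunkSize.
    chunks.push(samples.subarray(start, start + chunkSize));
  }
  return chunks;
}
```

In a real page, the samples would come from the browser's audio APIs rather than from a pre-filled array. The chunking step itself stays the same.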

Transcription Process

Create a transcription worker on the broadcaster page.

Load the Whisper model from Hugging Face once and cache it in the browser.

Use a Singleton pattern to ensure only one instance of the model is loaded.

The transcription worker builds a speech recognition pipeline.

Audio input is batch decoded into output text and sent back to the main thread.
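The Singleton idea described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the video: `LazySingleton`, `Transcriber`, and the fake loader are hypothetical names, and the fake loader stands in for the place where the worker would build the Transformers.js speech recognition pipeline.

```typescript
// A generic lazy singleton: the loader runs at most once, and every caller
// shares the same promise (and therefore the same loaded instance).
type Loader<T> = () => Promise<T>;

class LazySingleton<T> {
  private instance: Promise<T> | null = null;
  private readonly load: Loader<T>;

  constructor(load: Loader<T>) {
    this.load = load;
  }

  getInstance(): Promise<T> {
    if (this.instance === null) {
      // Cache the promise, not the resolved value, so callers that arrive
      // while loading is still in progress do not trigger a second load.
      this.instance = this.load().catch((err: unknown) => {
        // Forget a failed load so a later call can retry.
        this.instance = null;
        throw err;
      });
    }
    return this.instance;
  }
}

// Stand-in for a speech recognition pipeline: audio in, text out.
type Transcriber = (audio: Float32Array) => Promise<{ text: string }>;

// Hypothetical fake loader so the sketch runs without downloading a model.
// In the real worker, this is where the pipeline would be created.
let loadCount = 0;
const transcriberSingleton = new LazySingleton<Transcriber>(async () => {
  loadCount += 1;
  return async (audio: Float32Array) => ({
    text: `decoded ${audio.length} samples`,
  });
});
```

Each incoming worker message would call `transcriberSingleton.getInstance()` and reuse the result, so the model is only loaded on the first call.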

Broadcasting Process

A utility function broadcasts each transcribed sentence over Supabase Realtime.

Generate a random channel ID for broadcasting.

Set up a Supabase project and configure .env.local with the necessary keys and URLs.

Ensure the real-time service is running for broadcasting.
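A rough sketch of the two helpers this section implies is shown below. The names `randomChannelId` and `broadcastSentence`, the event name `transcript`, and the payload shape are all hypothetical. The `ChannelLike` interface is a stand-in that assumes the realtime channel exposes a `send` method accepting a broadcast message. The sketch is written against that interface so it runs without a Supabase project.

```typescript
// Minimal structural type for the part of a realtime channel used here.
// Assumption: the real channel object has a compatible send() method.
interface BroadcastMessage {
  type: "broadcast";
  event: string;
  payload: { message: string };
}

interface ChannelLike {
  send(message: BroadcastMessage): Promise<unknown>;
}

// Hypothetical helper: a short random ID used to name the broadcast channel.
// Math.random is not cryptographically secure; acceptable for a demo only.
function randomChannelId(length: number = 8): string {
  const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789";
  let id = "";
  for (let i = 0; i < length; i++) {
    id += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return id;
}

// Hypothetical helper: broadcast one transcribed sentence.
// Returns false (and sends nothing) when the sentence is blank.
async function broadcastSentence(
  channel: ChannelLike,
  sentence: string,
): Promise<boolean> {
  const message = sentence.trim();
  if (message.length === 0) {
    return false;
  }
  await channel.send({
    type: "broadcast",
    event: "transcript", // assumed event name
    payload: { message },
  });
  return true;
}
```

The broadcaster would share the generated channel ID with the audience (for example in a link) so that receivers know which channel to subscribe to.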

Receiver Process

The receiver subscribes to the real-time broadcast and translates the transcript.

Use a translation worker with a Singleton pattern for the translation pipeline.

The translation model is an open-source model from Meta that covers 200 languages.

Stream translations back through a callback function for a live effect.

Use Supabase Realtime to subscribe to a channel and listen for transcripts.

Disable new translation tasks until the current one is complete to avoid overloading the translation worker.
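The last two ideas in this section, streaming partial output through a callback and refusing new work while a translation is running, can be sketched together. Everything here is hypothetical: `TranslationGate`, `TranslateFn`, and the fake translator (which just upper-cases words) are stand-ins so the example runs without a model.

```typescript
// Hypothetical translate function: reports partial output through onPartial,
// then resolves with the complete translation.
type TranslateFn = (
  text: string,
  onPartial: (partial: string) => void,
) => Promise<string>;

class TranslationGate {
  private busy = false;
  private readonly translate: TranslateFn;

  constructor(translate: TranslateFn) {
    this.translate = translate;
  }

  isBusy(): boolean {
    return this.busy;
  }

  // Resolves to null when a translation is already running, so the caller
  // can drop or defer the transcript instead of piling up work.
  async submit(
    text: string,
    onPartial: (partial: string) => void,
  ): Promise<string | null> {
    if (this.busy) {
      return null;
    }
    this.busy = true;
    try {
      return await this.translate(text, onPartial);
    } finally {
      // Always reopen the gate, even if the translation throws.
      this.busy = false;
    }
  }
}

// Fake translator so the sketch runs without a model: emits word by word.
const fakeTranslate: TranslateFn = async (text, onPartial) => {
  let output = "";
  for (const word of text.split(" ")) {
    const upper = word.toUpperCase();
    output = output.length === 0 ? upper : `${output} ${upper}`;
    onPartial(output);
    await Promise.resolve(); // yield between "tokens"
  }
  return output;
};
```

In a UI, `isBusy()` would typically drive a disabled state, and `onPartial` would update the displayed text to produce the live effect described above.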

Additional Notes

The application is purely static, hosted on GitHub Pages.

The code is available on the Supabase Community GitHub repository.

The demo showcases the capabilities of Transformers.js and Supabase Realtime for AI in the browser.
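Static hosting is also why the hash router mentioned under Technical Details matters. GitHub Pages only serves files, so the route has to live in the URL fragment, which the browser never sends to the server. The sketch below illustrates this with the standard `URL` class. The URL, the route layout, and the helper `matchReceiverRoute` are made up for illustration.

```typescript
// Hypothetical deployed URL; the route lives entirely in the fragment.
const link = new URL("https://example.github.io/babelfish/#/receiver/abc123");

// What the static host is asked for: only the path, which maps to index.html.
const requestedPath = link.pathname;

// What the client-side router works with: the fragment minus the leading "#".
const clientRoute = link.hash.slice(1);

// Hypothetical helper mirroring how a hash router could extract a channel ID.
function matchReceiverRoute(route: string): string | null {
  const match = /^\/receiver\/([^/]+)$/.exec(route);
  return match?.[1] ?? null;
}
```

Here `requestedPath` is `/babelfish/` and `clientRoute` is `/receiver/abc123`, so reloading a deep link still requests a file that exists on the static host.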

Conclusion

The video demonstrates building a real-time transcription and translation service using client-side technologies.

It emphasizes the use of open-source models and the potential of AI in the browser without the need for a server backend.
