Self Hosted AI Server gateway for LLM APIs, Ollama, ComfyUI & FFmpeg servers
AI Summary
AI Server Overview
- AI Server is a free, self-hosted unified API for interacting with various AI models and services.
- It supports major LLM APIs, Open Router, Olama instances, Comfy UI workflows, and more.
- Features include typed client support for 11 languages, API key access control, live background jobs monitoring, and a flexible, easy installation process.
- Integrates with Comfy UI for simplified AI tasks like text-to-image, speech-to-text, text-to-speech, etc.
Installation Process
- Clone the AI Server repository from GitHub (
server-stack/ai-server
).- Run the installation script (
install.sh | bash
) in the cloned AI Server folder.- During installation, provide API keys for providers like Open Router, OpenAI, Mistal, Google Cloud, Gro Cloud, and Replicate.
- Set up an auth secret for accessing the AI Server’s admin UI.
- Configuration details are saved to a
.env
file for review.Features and Configuration
- AI Server acts as a single OpenAI-compatible entry point for LLM integrations.
- Supports APIs for modalities like text-to-image, image-to-image, image-to-text, image upscaling, speech-to-text, and more.
- Comfy UI agent can be deployed on separate hosts with specific hardware requirements (e.g., Nvidia GPUs).
- Admin portal accessible at
localhost:5006/for/admin
for configuring AI providers, generating API keys, monitoring tasks, and testing integrations.UIs and Testing
- Chat UI: Test different AI providers and steer chats with system prompts.
- Text to Image UI: Generate images with prompts, resolution, image count, and batch requests.
- Image to Text UI: Describe images using the Florence 2 model via Comfy UI agent.
- Image to Image UI: Transform images with prompts using an inpainting workflow.
- Upscaling UI: Double image size using an upscale model on Comfy UI agent.
- Speech to Text UI: Convert audio to text using OpenAI’s Whisper model.
- Text to Speech UI: Generate spoken audio from text using Comfy UI agent or OpenAI’s service.
API and Client Support
- OpenAI-compatible chat API accessible using service clients in languages like C#, TypeScript, Python, PHP, Java, Swift, etc.
- Provides typed end-to-end experience with language-specific DTOs.
- Offers sync and async endpoints with web callback feature for event-driven integration.
Additional Information
- AI Server is open-source, actively developed, and available on GitHub.
- Documentation and links to the project are provided in the video description.
(Note: The summary is based on the provided transcript and does not include any URLs or CLI commands as none were specified in the text.)