Private LLM Inference - One-Click Open WebUI Setup with Docker



AI Nuggets

Open WebUI Installation and Usage Instructions

Prerequisites

  • Docker installed on your machine
    • For Windows: Download and install Docker Desktop from the official website.
    • For Linux (Ubuntu): Install Docker from your distribution’s repositories or from Docker’s official apt repository; a typical sequence is shown below.
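
A minimal sketch of a Docker install on Ubuntu, assuming the docker.io package from Ubuntu’s own repositories is acceptable (the Docker documentation also describes installing from Docker’s dedicated apt repository):

sudo apt update
sudo apt install docker.io
sudo systemctl enable --now docker   # start Docker now and enable it at boot
docker --version                     # verify the install succeeded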

Installation Steps

  1. Navigate to the Open WebUI GitHub repository. The URL is provided in the video description.
  2. Copy the quick-start Docker command from the repository’s README.
  3. Open your terminal or command prompt.
  4. Run the Docker command to pull and start the Open Web UI container. The command will look something like this:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
  • -d runs the container in detached mode.
  • -p maps host port 3000 to the container’s port 8080.
  • --add-host maps host.docker.internal to the host gateway in the container’s /etc/hosts, so the container can reach services (such as Ollama) running on the host.
  • -v mounts the open-webui volume at /app/backend/data so your chat data persists across restarts.
  • --name assigns a name to the container.
  • --restart=always ensures the container restarts if it stops unexpectedly.
  • ghcr.io/open-webui/open-webui:main is the image pulled from the GitHub Container Registry.
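
To confirm the container came up cleanly, you can check its status and follow its startup logs; these are standard Docker commands, not specific to Open WebUI:

docker ps --filter name=open-webui   # should show the container as Up
docker logs -f open-webui            # follow the startup logs (Ctrl+C to stop)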

Usage Instructions

  1. After running the Docker command, wait for the image to be pulled and the container to start.
  2. Once the container is running, open a web browser and navigate to http://localhost:3000.
  3. Sign up for an account or log in if you already have one. The first account created becomes the administrator account.
  4. Customize your settings, such as changing the theme or setting system prompts.
  5. Go to the dashboard and start a new chat with the AI model.
  6. To add a model, run the ollama pull command followed by the model name in the terminal.
  7. To run a model, use the ollama run command followed by the model name (see the example after this list).
  8. Interact with the AI by typing questions or commands.
  9. Utilize additional features such as documents, prompts, tools, and functions to enhance your interaction with the AI.
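
For example, assuming Ollama is installed on the host machine, and using llama3 purely as a placeholder model name:

ollama pull llama3   # download the model
ollama run llama3    # start an interactive chat with it in the terminal

Any model from the Ollama library can be substituted; once pulled, it also appears in Open WebUI’s model selector.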

Tips

  • If you’re running on a CPU-only machine, responses will be noticeably slower than with a GPU (a GPU-enabled variant of the run command is sketched after this list).
  • You can personalize your interaction with LLMs by adding memory through the “Manage” button.
  • For a smoother experience, make sure your system has adequate compute resources (enough RAM for the model you pull and, ideally, a dedicated GPU).
  • You can create custom models and functions for specific use cases (see the Modelfile sketch after this list).
  • The system supports both speech-to-text and text-to-speech functionalities.
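
If your machine has an NVIDIA GPU and the NVIDIA Container Toolkit installed, the Open WebUI README also documents a CUDA-enabled image; a sketch of that variant:

docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda

And a minimal sketch of a custom model built with an Ollama Modelfile, where my-assistant and the system prompt are placeholders. Save these two lines as a file named Modelfile:

FROM llama3
SYSTEM """You are a concise assistant that answers in plain language."""

Then build and run it:

ollama create my-assistant -f Modelfile
ollama run my-assistant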

Additional Resources

  • Join the Discord server for support and community interaction. The link is provided in the video description.
  • Refer to the official Docker documentation for more information on Docker commands and usage.

Remember, if you have any doubts or questions, you can ask in the video’s comment section or on the Discord server.