Local UNLIMITED Memory AI Agent | Ollama RAG Crash Course



AI Nuggets

Based on the provided YouTube transcript, here are the detailed instructions, CLI commands, website URLs, and tips extracted and organized in an easy-to-follow outline form:

Prerequisites

  • Python 3.11 or 3.12 installed on your PC.
  • Install Ollama from its official site.

Setup

  1. Create a folder to store all of our program’s code.
  2. Open the folder in VS Code or your preferred code editor.

Install Python Libraries

  1. Create a file called requirements.txt.
  2. Copy the required libraries into requirements.txt.
  3. Open a terminal (Mac/Linux) or command prompt as admin (Windows).
  4. Run the command to install Python libraries:
    pip install -r requirements.txt  
    
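The transcript does not spell out the contents of requirements.txt. Based on the libraries referenced later in this outline (Ollama's Python client, ChromaDB, psycopg2, tqdm, colorama), a plausible version might look like the following — note that psycopg2-binary (rather than psycopg2) is an assumption for an easier install:

```
ollama
chromadb
psycopg2-binary
tqdm
colorama
```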

Test Ollama Models

  • Recommended models: Llama 3 and the Mixtral models.
  • For a small language model: Phi-3.
  • For a larger model: Llama 3 8B.

Create Assistant Script

  1. Create a file called assistant.py.
  2. Import Ollama in the script.
  3. Start by writing code that prompts the language model through the local Ollama API and prints the AI-generated response.

Start the Ollama API

  • Before running the Python scripts, start the API by running any Ollama model (for example, ollama run llama3) and allowing the chat app to start.

Make Program Conversational

  1. Create a while True loop for continuous interaction.
  2. Request input from the user and append it to the conversation list.
  3. Call Ollama's chat function and print the response.

Implement Streaming Response

  1. Define a function called stream_response to reduce latency.
  2. Use a for loop to print the streamed response.

Create Vector Embeddings

  1. Download Ollama's nomic-embed-text embedding model.
  2. Import ChromaDB to use as the vector database.
  3. Create a list of sample prompt responses.
  4. Define a function called create_vector_db to create a vector database from conversations.

Set Up PostgreSQL Database

  1. Install PostgreSQL for your operating system.
  2. Create a superuser for the database.
  3. Create a database named memory_agent.
  4. Set up a table called conversations to store messages and responses.

Store and Retrieve Conversations

  1. Define functions connect_db and fetch_conversations to interact with the Postgres database.
  2. Store the prompt/response pairs in the PostgreSQL database after each response is generated.

Improve Retrieval System

  1. Define a function called create_queries to generate a list of queries for context retrieval.
  2. Use multi-shot learning to teach the language model to respond correctly.
  3. Define retrieve_embeddings to retrieve the most relevant embeddings for each query.
  4. Implement a function classify_embeddings to classify the relevancy of each retrieved context.

Add User Experience Features

  1. Add a loading bar using the tqdm library.
  2. Implement /recall, /forget, and /memorize commands to control memory storage and retrieval.
  3. Colorize print statements using the colorama library.
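The user-experience features above can be sketched together; the command handling is a hypothetical dispatcher, while tqdm and colorama are used as the outline describes:

```python
# UX sketch: tqdm wraps an iterable in a loading bar, colorama colors
# terminal output, and a dispatcher routes the memory commands.
from tqdm import tqdm
from colorama import Fore, init

init(autoreset=True)  # colorama: reset color after each print

def handle_command(prompt):
    """Dispatch the /recall, /forget, and /memorize memory commands."""
    if prompt.startswith("/recall"):
        return "recall"    # pull stored memories into context
    if prompt.startswith("/forget"):
        return "forget"    # remove a stored memory
    if prompt.startswith("/memorize"):
        return "memorize"  # store the message without a reply
    return None  # normal conversation turn

def show_progress(items):
    """Wrap an iterable in a tqdm loading bar, e.g. while embedding."""
    return tqdm(items, desc="Embedding memories")

for _ in show_progress(range(3)):
    pass
print(Fore.GREEN + "ASSISTANT: ready.")
```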

Final Steps

  • Run the program and interact with your local AI agent.

Note: The exact URLs for downloading Ollama or any other specific resources were not provided in the transcript. The instructions above are based on the actions described in the transcript.