Local UNLIMITED Memory Ai Agent | Ollama RAG Crash Course
AI Nuggets
Based on the provided YouTube transcript, here are the detailed instructions, CLI commands, website URLs, and tips extracted and organized in an easy-to-follow outline form:
Prerequisites
- Python 3.11 or 3.12 installed on your PC.
- Install Ollama from its official site.
Setup
- Create a folder to store all of our program’s code.
- Open the folder in VS Code or your preferred code editor.
Install Python Libraries
- Create a file called requirements.txt.
- Copy the required libraries into requirements.txt.
- Open a terminal (Mac/Linux) or an administrator command prompt (Windows).
- Run the command to install the Python libraries:
pip install -r requirements.txt
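The transcript does not list the exact contents of requirements.txt; based on the libraries used later in the video, it plausibly looks something like this:

```text
ollama
chromadb
psycopg    # or psycopg2, depending on the Postgres driver used in the video
tqdm
colorama
```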
Test Ollama Models
- Recommended models: Llama 3 and Mixtral.
- For a small language model: Phi-3.
- For larger models: Llama 3 8B.
Create Assistant Script
- Create a file called assistant.py.
- Import ollama in the script.
- Start by writing code that prompts the language model through the local Ollama API and prints the AI-generated response.
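A minimal sketch of this first step, assuming the ollama Python package and the llama3 model tag (a sketch, not the video's exact code):

```python
def build_messages(prompt):
    """Wrap a user prompt in the message format Ollama's chat API expects."""
    return [{"role": "user", "content": prompt}]

def ask(prompt, model="llama3"):
    """Send one prompt to the local Ollama API and return the reply text."""
    import ollama  # pip install ollama; imported here so build_messages works without it
    response = ollama.chat(model=model, messages=build_messages(prompt))
    return response["message"]["content"]

# Requires the Ollama server to be running (see the next section):
# print(ask("Why is the sky blue?"))
```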
Start the Ollama API
- Before running the Python scripts, start up the API by running any Ollama model (for example, ollama run llama3) and letting the chat session start.
Make Program Conversational
- Create a while True loop for continuous interaction.
- Request input from the user and append it to the conversation list.
- Call Ollama's chat function and print the response.
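One way to sketch the loop, with the history kept as a plain list of role/content dicts (the exact structure in the video may differ):

```python
def add_message(history, role, content):
    """Append one message to the running conversation history."""
    history.append({"role": role, "content": content})
    return history

def chat_loop(model="llama3"):
    """Continuously read user input, query Ollama with the full history, print replies."""
    import ollama  # requires a running Ollama server
    history = []
    while True:
        prompt = input("USER: ")
        add_message(history, "user", prompt)
        reply = ollama.chat(model=model, messages=history)["message"]["content"]
        print(f"ASSISTANT: {reply}")
        add_message(history, "assistant", reply)
```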
Implement Streaming Response
- Define a function called stream_response to reduce latency.
- Use a for loop to print the streamed response.
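With stream=True, ollama.chat yields chunks instead of a single response; a stream_response along these lines prints each chunk as it arrives (a sketch, not the video's exact function):

```python
def stream_response(chunks):
    """Print streamed chat chunks as they arrive; return the assembled reply text."""
    parts = []
    for chunk in chunks:
        piece = chunk["message"]["content"]
        print(piece, end="", flush=True)
        parts.append(piece)
    print()
    return "".join(parts)

# Usage with a live model (requires a running Ollama server):
# import ollama
# stream_response(ollama.chat(model="llama3", messages=history, stream=True))
```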
Create Vector Embeddings
- Download Ollama's nomic-embed-text embedding model.
- Import chromadb to use as the vector database.
- Create a list of sample prompt-response pairs.
- Define a function called create_vector_db to create a vector database from the conversations.
Set Up a PostgreSQL Database
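A sketch of the create_vector_db function described above, assuming the chromadb and ollama packages and the nomic-embed-text model (the collection name and document format are guesses):

```python
def conversations_to_documents(conversations):
    """Flatten prompt/response pairs into document strings plus string IDs for Chroma."""
    docs, ids = [], []
    for i, convo in enumerate(conversations):
        docs.append(f"prompt: {convo['prompt']} response: {convo['response']}")
        ids.append(str(i))
    return docs, ids

def create_vector_db(conversations, name="conversations"):
    """Embed each conversation with nomic-embed-text and store it in a Chroma collection."""
    import chromadb, ollama  # imported here so the helper above works without them
    client = chromadb.Client()
    collection = client.get_or_create_collection(name=name)
    docs, ids = conversations_to_documents(conversations)
    for doc, doc_id in zip(docs, ids):
        embedding = ollama.embeddings(model="nomic-embed-text", prompt=doc)["embedding"]
        collection.add(ids=[doc_id], embeddings=[embedding], documents=[doc])
    return collection
```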
- Install PostgreSQL for your operating system.
- Create a superuser for the database.
- Create a database named memory_agent.
- Set up a table called conversations to store messages and responses.
Store and Retrieve Conversations
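The transcript does not give the exact schema; before the functions below can work, the database objects need to exist. A plausible DDL, with assumed column names, is:

```sql
CREATE DATABASE memory_agent;

-- run the rest against the memory_agent database:
CREATE TABLE conversations (
    id        SERIAL PRIMARY KEY,
    timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    prompt    TEXT NOT NULL,
    response  TEXT NOT NULL
);
```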
- Define functions connect_db and fetch_conversations to interact with the PostgreSQL database.
- Store the prompt-response pairs in the SQL database after each response is generated.
Improve Retrieval System
- Define a function called create_queries to generate a list of queries for context retrieval.
- Use multi-shot learning to teach the language model to respond correctly.
- Define retrieve_embeddings to retrieve the most relevant embeddings for each query.
- Implement a function classify_embeddings to classify the relevancy of each retrieved context.
Add User Experience Features
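The create_queries step described above can be sketched as a multi-shot prompt plus a defensive parser; the example exchanges and instruction wording here are invented, not taken from the video:

```python
import json

# Invented multi-shot examples showing the expected JSON-list output format.
QUERY_EXAMPLES = [
    {"role": "user", "content": "Write an email to my car insurance company."},
    {"role": "assistant",
     "content": '["What is the user\'s name?", "Which insurance company does the user use?"]'},
]

def build_query_prompt(prompt):
    """Messages for create_queries: system instruction, examples, then the real prompt."""
    system = ("Reply ONLY with a JSON list of short search queries that would "
              "retrieve context relevant to the user's prompt.")
    return [{"role": "system", "content": system}, *QUERY_EXAMPLES,
            {"role": "user", "content": prompt}]

def parse_queries(raw, fallback):
    """Parse the model's JSON output; fall back to the original prompt on bad output."""
    try:
        queries = json.loads(raw)
        return queries if isinstance(queries, list) else [fallback]
    except json.JSONDecodeError:
        return [fallback]
```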
- Add a loading bar using the tqdm library.
- Implement /recall, /forget, and /memorize commands to control memory storage and retrieval.
- Colorize print statements using the colorama library.
Final Steps
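The slash commands described above can be split off before the normal chat path; a sketch (the command names come from the transcript, the routing logic is assumed):

```python
def parse_command(user_input):
    """Split a /recall, /forget, or /memorize command from its argument text.
    Plain input comes back as ("chat", text)."""
    for command in ("/recall", "/forget", "/memorize"):
        if user_input.startswith(command):
            return command.lstrip("/"), user_input[len(command):].strip()
    return "chat", user_input

# UX touches from the transcript (both libraries are pip-installable):
#   from tqdm import tqdm          # progress bar: for q in tqdm(queries): ...
#   from colorama import Fore      # color: print(Fore.GREEN + "ASSISTANT:")
```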
- Run the program and interact with your local AI agent.
Note: The exact URLs for downloading Ollama or any other specific resources were not provided in the transcript. The instructions above are based on the actions described in the transcript.