Build RAG-Powered AI Agents on Custom Data in Google Colab
AI Summary
Video Summary: Building a RAG-Powered AI Agent with Llama Index on Google Colab
- Objective: Demonstrate how to create a RAG-powered AI agent using Llama Index and Crew AI on Google Colab for free.
- RAG-Powered AI Agent:
- RAG stands for Retrieval Augmented Generation.
- Integrates external data with AI agents to generate context-specific information.
- Tools Used:
- Llama Index: A framework for building large language model applications with custom data.
- Crew AI: A framework for creating customizable AI agents for specific tasks.
- Integration:
- Various models, including API-based ones like OpenAI and Gemini, can be integrated with Crew AI.
- Open-source models can also be run locally via Ollama (a sketch follows this summary).
- Example Data:
- A fictitious finance CSV file with product sales data by region, country, and year.
- Google Colab Setup:
- Installation of Llama Index and integrations.
- Importing necessary libraries and uploading the finance CSV file.
- Data Loading and Indexing:
- Using Llama Index’s directory reader to load the CSV file.
- Creating a vector store and index for the data.
- Large Language Model Setup:
- Using OpenAI’s GPT-4o model.
- Setting up API keys and environment variables in Google Colab.
- Building the AI Agent:
- Defining a “researcher” agent as a senior market analyst.
- Defining a “writer” agent to produce content based on research insights.
- Creating tasks for data analysis and blog post writing.
- Forming a Crew AI team with the researcher and writer agents.
- Execution and Results:
- Kicking off the process with crew.kickoff().
- Agents collaborate to analyze the data and generate a blog post.
- The process results in a markdown-formatted blog post.
- Conclusion:
- The video showcases the integration of Crew AI with custom data and Llama Index to build a RAG-powered application.
- The code will be shared in the video description for viewers to experiment with.
- Viewers are encouraged to subscribe and share the content.
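The video itself uses OpenAI, but the summary's mention of Ollama suggests a fully local variant. Below is a minimal sketch of what that swap could look like; it is not part of the video's code, and it assumes crewai and langchain-community are installed and that Ollama is serving a llama2 model locally.
# Hypothetical local-model variant (not from the video): assumes Ollama is running
# locally, `ollama pull llama2` has been done, and langchain-community is installed.
from crewai import Agent
from langchain_community.llms import Ollama

local_llm = Ollama(model="llama2")

local_researcher = Agent(
    role="Senior Market Analyst",
    goal="Uncover insights about product sales trends",
    backstory="You work at a market research firm.",
    llm=local_llm,  # pass the local model instead of the default OpenAI LLM
    verbose=True,
)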
!pip install llama-index-core
!pip install llama-index-readers-file
!pip install llama-index-embeddings-openai
!pip install llama-index-llms-openai  # needed for the OpenAI LLM import below
!pip install llama-index-llms-llama-api
!pip install 'crewai[tools]'
import os
from crewai import Agent, Task, Crew, Process
from crewai_tools import LlamaIndexTool
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI
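The summary mentions uploading the finance CSV into Colab, but the upload step itself is not shown in the code. One way to get finance.csv into the runtime, assuming you pick the file through Colab's upload widget:
# Upload finance.csv into the Colab runtime before reading it
from google.colab import files
uploaded = files.upload()  # select finance.csv from your local machine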
reader = SimpleDirectoryReader(input_files=["finance.csv"])
docs = reader.load_data()
docs[0].get_content()
from google.colab import userdata
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
llm = OpenAI(model="gpt-4o")
index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine(similarity_top_k=5, llm=llm)
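Before wrapping the query engine in a CrewAI tool, it can be worth querying it directly to confirm the CSV was indexed as expected. A quick check; the question below is made up for the fictitious finance data:
# Optional sanity check on the query engine (example question is hypothetical)
response = query_engine.query("Which region had the highest sales last year?")
print(response)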
query_tool = LlamaIndexTool.from_query_engine(
    query_engine,
    name="Finance Query Tool",
    description="Use this tool to lookup the financial data of products and their sales",
)
query_tool.args_schema.schema()
researcher = Agent(
    role="Senior Market Analyst",
    goal="Uncover insights about product sales trends",
    backstory="""You work at a market research firm.
    Your goal is to understand sales patterns across different product categories.""",
    verbose=True,
    allow_delegation=False,
    tools=[query_tool],
)
writer = Agent(
    role="Product Content Specialist",
    goal="Craft compelling content on product trends",
    backstory="""You are a renowned Content Specialist, known for your insightful and engaging articles.
    You transform complex sales data into compelling narratives.""",
    verbose=True,
    allow_delegation=False,
)
# Create tasks for your agents
task1 = Task(
    description="""Analyze the sales data of top 5 products in the last quarter.""",
    expected_output="Detailed sales report with trends and insights",
    agent=researcher,
)
task2 = Task(
    description="""Using the insights provided, develop an engaging blog
    post that highlights the top-selling products and their market trends.
    Your post should be informative yet accessible, catering to a casual audience.
    Make it sound cool, avoid complex words.""",
    expected_output="Full blog post of at least 4 paragraphs",
    agent=writer,
)
# Instantiate your crew with a sequential process
crew = Crew(
    agents=[researcher, writer],
    tasks=[task1, task2],
    process=Process.sequential,  # run tasks one after the other (uses the Process import above)
    verbose=2,  # set to 1 or 2 for different logging levels
)
result = crew.kickoff()
print("######################")
print(result)
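Because the result comes back as a markdown-formatted blog post, an optional final step (not shown in the video) is to save it to a file and download it from Colab:
# Optional: save the generated post and download it from Colab (not in the video's code)
with open("blog_post.md", "w") as f:
    f.write(str(result))

from google.colab import files
files.download("blog_post.md")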
Written by Fahd Mirza
Labels: agentic, crewai, llamaindex