Instructions for Installing and Using Guidance with Large Language Models
Prerequisites
Use Google Colab with a T4 GPU runtime.
Installation Steps
Install Guidance:
pip install guidance
Install llama-cpp-python (the Python bindings for llama.cpp, used to run Llama-family models):
pip install llama-cpp-python
This may take a few minutes to build the wheel.
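The two install steps above can be run together in a single Colab cell. The CMAKE_ARGS line below is an optional, commonly used way to request a CUDA-enabled build of llama-cpp-python so GPU offloading works; the exact flag can vary between versions, and a plain pip install also works:

```shell
# Install Guidance
pip install guidance

# Optional: build llama-cpp-python with CUDA support for GPU offloading
# (flag name may differ between versions; see the llama-cpp-python docs)
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
```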
Downloading the Model
Choose a model from the Hugging Face website:
Go to the Hugging Face models page.
Click the “Files and versions” tab to see the available files.
Select a model version based on your disk space.
Click on “Download” or right-click and copy the link.
Use the wget command in Colab to download the model:
wget <MODEL_URL>
Replace <MODEL_URL> with the actual URL of the model you want to download from Hugging Face.
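As a concrete illustration, downloading a quantized GGUF build of a Llama chat model might look like the command below. The repository and filename here are hypothetical placeholders; copy the real link from the model's “Files and versions” tab:

```shell
# Hypothetical URL — replace with the link you copied from Hugging Face
wget https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf
```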
Initializing Guidance
Import the models module from guidance:
from guidance import models
Create a model object using the LlamaCpp constructor under guidance.models:
model = models.LlamaCpp(path_to_model, n_gpu_layers=-1, n_ctx=4096)
path_to_model: The local path to the downloaded model file.
n_gpu_layers: Set to -1 to use all available GPU layers for faster processing.
n_ctx: The context size, i.e., the maximum sequence length the model can handle (e.g., 4096 tokens).
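Putting the import and the constructor together, initialization looks roughly like the sketch below. The file path is a placeholder for wherever your downloaded model lives, and running this requires the model file to be present:

```python
from guidance import models

# Placeholder path to the GGUF model file downloaded earlier
path_to_model = "llama-2-7b-chat.Q4_K_M.gguf"

# n_gpu_layers=-1 offloads all layers to the GPU; n_ctx sets the context window
model = models.LlamaCpp(path_to_model, n_gpu_layers=-1, n_ctx=4096)
```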
Using Guidance for Inference
Define your prompt as a string:
prompt = "Your prompt here"
Concatenate the model with the prompt string:
result = model + prompt
Generate output by appending a gen() call (imported with from guidance import gen) with a specified maximum number of tokens:
output = result + gen(max_tokens=150)
max_tokens: The maximum number of tokens to generate (e.g., 150).
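The inference steps above can be sketched end to end as follows. The model path and prompt are placeholders, and the name= argument (which captures the generated text under a key you can read back) is one common way to retrieve the output:

```python
from guidance import models, gen

# Placeholder path — requires the downloaded model file to run
model = models.LlamaCpp("llama-2-7b-chat.Q4_K_M.gguf",
                        n_gpu_layers=-1, n_ctx=4096)

# Appending strings and gen() to the model builds up the generation
prompt = "Q: What is the capital of France?\nA: "
result = model + prompt + gen(max_tokens=150, name="answer")

# The captured text can be read back by the name given to gen()
print(result["answer"])
```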
Additional Capabilities
Guidance supports multi-step generation, variable capture, encapsulating prompt logic in functions, and interleaving fixed text with generated text.
You can control various aspects of the generation process.
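A brief sketch of these capabilities, assuming the same placeholder model file: select() constrains the model to a fixed set of options, and named captures let you read each generated piece back afterwards.

```python
from guidance import models, gen, select

# Placeholder path — requires the downloaded model file to run
lm = models.LlamaCpp("llama-2-7b-chat.Q4_K_M.gguf",
                     n_gpu_layers=-1, n_ctx=4096)

# Interleave fixed text with constrained and free-form generation
lm += "Is the sky blue on a clear day? Answer: "
lm += select(["yes", "no"], name="verdict")      # constrained to two options
lm += "\nExplanation: " + gen(max_tokens=50, name="why")

# Each named capture is available on the resulting model object
print(lm["verdict"], lm["why"])
```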
Resources
The GitHub repo link for Guidance will be provided in the video description.
Tips
You can use any model that follows the Llama architecture with Guidance.
Guidance is designed to allow you to write code in a familiar Pythonic way while leveraging the power of large language models.
Note
The exact URLs for the models and the GitHub repository are not provided in the transcript. They should be available in the video description or by following the instructions to navigate the Hugging Face website.
Call to Action
Consider subscribing to the channel for more content.
Share the video with your network if you find it helpful.