Teaching LLMs to Use Tools at Scale - Shishir Patil | Stanford MLSys #98
AI Summary
- Speaker Introduction:
  - Shishir Patil, a fifth-year PhD student in the Sky Computing and Berkeley AI Research (BAIR) labs at UC Berkeley.
  - Focuses on large language models (LLMs) and tool use.
  - Previously worked on ML systems for inference and training on edge devices.
  - Former research fellow at Microsoft Research.
  - Completed his undergraduate degree in India.
- Talk Overview:
  - Vision for connecting LLMs to tool use.
  - Demonstrations of the work.
  - Insights and open questions relevant to the audience's projects and research directions.
- Connecting LLMs to Tool Use:
  - Current usage: the user prompts the LLM, receives a response, and then acts on that response themselves.
  - Goal: flip this process with a model called Gorilla.
  - User prompts Gorilla → Gorilla performs the action → Gorilla gathers feedback → Gorilla relays the outcome to the user.
  - Key premise: humans are good discriminators, while LLMs are good generators.
  - Example: installing software on Linux by having the LLM generate the bash commands.
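The flipped interaction loop above can be sketched in a few lines. This is purely illustrative, assuming stand-in `generate_action` and `execute` functions; it is not the actual Gorilla API.

```python
# Hypothetical sketch of the "flipped" loop: the model acts and gathers
# feedback itself, and only the outcome is relayed to the user.
# `generate_action` and `execute` are stand-ins, not real Gorilla calls.

def generate_action(prompt: str) -> str:
    """Stand-in for the LLM proposing an executable action (e.g. a bash command)."""
    return f"echo 'action for: {prompt}'"

def execute(action: str) -> str:
    """Stand-in for running the action and capturing its output as feedback."""
    return f"ran `{action}` successfully"

def gorilla_loop(prompt: str) -> str:
    action = generate_action(prompt)   # user prompts -> model proposes an action
    feedback = execute(action)         # model executes and observes the result
    return feedback                    # only the outcome reaches the user
```

The point of the loop is that the human only has to discriminate (accept or reject the outcome), while the model does the generating and executing.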
- Gorilla Model:
  - Supports the major hyperscalers (AWS, GCP, Azure) and services such as Kubernetes and Salesforce.
  - The project supports 60,000 APIs and growing.
  - Gorilla is robust, open source, and used in enterprises.
  - Other groups have adapted Gorilla's ideas for their own models.
- Demonstrations:
  - Command-line interface for listing GCP instances.
  - Jupyter notebook that translates text using a specific model.
- Key Ideas Behind Gorilla:
  - Retriever-Aware Training (RAT): fine-tuning the model to use, or ignore, retrieved API documentation placed in its context.
  - Measuring hallucination with Abstract Syntax Trees (ASTs): an API call generated by the LLM counts as a hallucination if its AST does not match any API in the database.
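The AST idea can be sketched with Python's standard `ast` module: parse the generated call and check it against a catalog of known APIs. This is a minimal sketch with an assumed toy catalog; Gorilla's actual checker matches subtrees against its full API database.

```python
import ast

# Assumed toy catalog: API name -> allowed keyword arguments.
API_CATALOG = {
    "pipeline": {"task", "model"},  # illustrative entry, HuggingFace-style
}

def is_valid_call(code: str) -> bool:
    """Return True if `code` is a single call to a known API with known kwargs."""
    try:
        tree = ast.parse(code, mode="eval")
    except SyntaxError:
        return False
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        return False
    allowed = API_CATALOG.get(call.func.id)
    if allowed is None:
        return False  # unknown API -> counted as a hallucination
    return all(kw.arg in allowed for kw in call.keywords)

# is_valid_call("pipeline(task='translation', model='t5-base')")  -> True
# is_valid_call("magic_translate(model='t5-base')")               -> False
```

Because the check is syntactic rather than string-based, it tolerates argument reordering and formatting differences while still catching invented APIs and invented parameters.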
- Performance and Insights:
  - Gorilla outperforms other models on API-calling tasks.
  - Hallucination rates can be measured with the AST check and compared across models.
  - Fine-tuning converges to similar accuracy across different base models.
- Execution Engine (GoEx):
  - Lets LLMs perform actions whose correctness is verified only after the fact (delayed verification).
  - Provides reversibility guarantees and "blast radius" containment.
  - Defines policies and abstractions for the safe execution of LLM-generated actions.
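One way to picture the reversibility idea: pair every LLM-proposed action with an undo action, execute it, and roll back if the user later rejects the outcome. The names below (`Action`, `Executor`) are illustrative assumptions, not the GoEx API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Action:
    do: Callable[[], str]    # performs the action, returns a log message
    undo: Callable[[], str]  # reverses it, returns a log message

@dataclass
class Executor:
    log: List[str] = field(default_factory=list)
    _done: List[Action] = field(default_factory=list)

    def run(self, action: Action) -> None:
        self.log.append(action.do())
        self._done.append(action)  # remember for potential rollback

    def rollback(self) -> None:
        # Undo in reverse order, restoring the pre-action state.
        while self._done:
            self.log.append(self._done.pop().undo())
```

Under this framing, "delayed verification" means the executor commits actions optimistically and keeps enough state to call `rollback()` if the user disapproves after the fact.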
- Conclusion:
  - Gorilla connects LLMs to a wide range of tools via API calls.
  - RAT and hallucination measurement are key to training effective LLMs for tool use.
  - GoEx aims to enable LLMs to act autonomously with safety measures in place.
For further details, the audience is encouraged to read the associated papers and explore the open-source projects.