AI Engineering at Jane Street - John Crepezzi
AI Summary
Summary of AI Assistant Development at Jane Street
Speaker Introduction
- Name: John Crepezzi
- Team: AI Assistant at Jane Street
- Background: Extensive experience in development tools, including GitHub.
Overview of the Team’s Focus
- Maximizing value from large language models (LLMs).
- Navigating challenges with off-the-shelf tools due to using OCaml as the primary language.
Challenges with OCaml
- OCaml is an obscure language, most often used for theorem proving and formal verification.
- Development practices include:
  - Writing OCaml and using OCaml libraries to transpile it to other languages (JavaScript, Vim script).
  - Building custom development tooling, including:
    - Monorepo management.
    - Custom distributed build and code review systems.
Need for Custom Models
- Off-the-shelf LLMs are generally weak at OCaml because little OCaml training data is available.
- The team built internal models tailored to the specifics of Jane Street's OCaml codebase.
Approach to Model Development
- Define Goals: Generate diffs based on user prompts in editors.
- Data Collection: Use workspace snapshotting to collect data on developer actions and errors.
- Training Process:
  - Supervised training with labeled data.
  - Reinforcement learning to ensure code quality (the code compiles and passes tests).
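The data collection and training steps above can be sketched roughly as follows. This is a minimal illustration, not Jane Street's actual pipeline: `Snapshot`, `Example`, `mine_examples`, and `reward` are hypothetical names, and the sketch assumes each workspace snapshot records the file contents plus whether the build succeeded. Under that assumption, a failing snapshot followed by a passing one yields a (broken state, error, fixed state) training example, and a candidate change earns reward only if it compiles and passes tests.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of a workspace at one point in time."""
    files: dict[str, str]              # path -> file contents
    build_ok: bool                     # did the build succeed at this snapshot?
    build_error: Optional[str] = None  # compiler output when the build failed


@dataclass(frozen=True)
class Example:
    """One supervised example: broken state, its error, and the fixed state."""
    before: dict[str, str]
    error: str
    after: dict[str, str]


def mine_examples(snapshots: list[Snapshot]) -> Iterator[Example]:
    """Yield one example per failing -> passing transition (assumed heuristic)."""
    for prev, curr in zip(snapshots, snapshots[1:]):
        if not prev.build_ok and curr.build_ok:
            yield Example(prev.files, prev.build_error or "", curr.files)


def reward(compiles: bool, tests_passed: int, tests_total: int) -> float:
    """Toy reward: 0 if the code does not compile, else the test pass rate."""
    if not compiles:
        return 0.0
    if tests_total == 0:
        return 1.0  # assumption: compiling code with no tests gets full credit
    return tests_passed / tests_total
```

In this sketch the mined examples would feed the supervised stage, while `reward` stands in for the signal used during reinforcement learning. A real system would need to decide how far apart snapshots are taken and how to filter noisy transitions, which this summary does not specify.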
Implementation of AI Development Environment
- Integrated LLMs into editors (VS Code, Neovim, Emacs) through a unified architecture, the AI Development Environment (AID).
- Models can be updated or swapped without changing the editor integrations themselves.
- Metrics on user experience are collected, such as latency and whether suggested diffs apply.
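One way to picture the unified architecture described above is a single service that every editor plugin calls, with the model behind a swappable interface. The sketch below is an illustration under stated assumptions, not the real AID design, which this summary does not detail: `Aid`, `Model`, `Metrics`, `request_diff`, `record_applied`, and `swap_model` are all hypothetical names.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Callable

# Assumption: a "model" is any function from a prompt string to a diff string.
Model = Callable[[str], str]


@dataclass
class Metrics:
    """Hypothetical counters for latency and diff application."""
    latencies_s: list[float] = field(default_factory=list)
    diffs_offered: int = 0
    diffs_applied: int = 0


@dataclass
class Aid:
    """Hypothetical editor-agnostic service between editors and the model."""
    model: Model
    metrics: Metrics = field(default_factory=Metrics)

    def request_diff(self, user_prompt: str, file_context: str) -> str:
        # Prompt construction lives here, so editor plugins stay thin.
        prompt = f"Context:\n{file_context}\n\nRequest:\n{user_prompt}"
        start = time.perf_counter()
        diff = self.model(prompt)
        self.metrics.latencies_s.append(time.perf_counter() - start)
        self.metrics.diffs_offered += 1
        return diff

    def record_applied(self) -> None:
        # Editors report back when a suggested diff was applied.
        self.metrics.diffs_applied += 1

    def swap_model(self, new_model: Model) -> None:
        # Changing the model requires no change to any editor plugin.
        self.model = new_model
```

With this shape, a VS Code, Neovim, or Emacs plugin would only need to call `request_diff` and `record_applied`; prompt construction, model choice, and metrics stay in one place.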
Editor Integration Examples
- VS Code: Sidebar for multifile diff suggestions.
- Emacs: Markdown buffer interface for interaction.
Future Work
- Expanding applications of RAG (retrieval-augmented generation).
- Exploring multi-agent workflows and reasoning models.
Conclusion
- Focus on building modular, pluggable systems to adapt to evolving technology in AI.
Contact
- Speaker is open to further discussions after the presentation.