Speaker: Jason Weston, Research Scientist at Facebook, NY, and Visiting Research Professor at NYU
Registration is preferred for all CUID holders. If you do not have an active CUID, registration is required and is due by 12:00 PM the day before the seminar. Unfortunately, we cannot guarantee entrance to Columbia’s Morningside campus if you register after that deadline. Thank you for understanding!
Title: Self-Improvement of LLMs
Abstract: Classically, learning algorithms were designed to improve their performance by updating their parameters (weights), while keeping other components, such as the training data, loss function, and algorithm, fixed. We argue that fully intelligent systems will be able to self-improve across all aspects of their makeup. We describe recent methods that enable large language models (LLMs) to self-improve in various ways, increasing their performance on tasks relevant to human users. In particular, we describe methods whereby models are able to create their own training data (self-challenging), train on this data using themselves as their own reward model (self-rewarding), and train themselves to better provide their own rewards (meta-rewarding). We then discuss the future of self-improvement for AI and key challenges that remain unresolved.