More compute will go to reinforcement learning than to any other phase of model training as the use of synthetic data ramps up, he predicted.
AI agents promise to supercharge that process, he added, with RL advancing from “a single-shot approach to an agent approach.” He envisions a software development environment that starts with a thin agent layer sitting on top of an editor but also includes other “surface areas” that are “headless,” with agents running the show.
