AI coding tools are advancing quickly, propelled by new models like GPT-5, Gemini 2.5, and Sonnet 4.5. Other AI skills, such as writing emails, are improving far more slowly, a widening disparity known as the reinforcement gap.
Rapid improvements in AI coding
Coding has a built-in advantage: its output can be graded automatically, since code either passes its tests or it does not. Those clear pass/fail signals are exactly what reinforcement learning (RL) needs to improve a model, and RL has driven much of the progress in AI coding over the past six months.
As labs lean more heavily on RL, a divide is opening up. Skills with objective grading criteria, like bug-fixing, improve rapidly, while hard-to-measure skills, like writing, advance only slightly.
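To make the grading idea concrete, here is a minimal Python sketch of how a test suite can be collapsed into the single scalar reward an RL trainer consumes. The function and the toy tests are invented for illustration; real pipelines sandbox execution and use far richer suites, but the core signal is the same pass fraction.

```python
def pass_fraction(candidate_fn, test_cases) -> float:
    """Fraction of (args, expected) pairs the candidate gets right.
    This single scalar is what an RL trainer would consume as reward."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash simply counts as a failed test
    return passed / len(test_cases)

# Toy grading run: two model-written attempts at absolute value.
tests = [((3,), 3), ((-4,), 4), ((0,), 0)]

buggy = lambda x: x                    # wrong on negative inputs
fixed = lambda x: x if x >= 0 else -x

print(pass_fraction(buggy, tests))  # 0.666... -> weak reward
print(pass_fraction(fixed, tests))  # 1.0      -> full reward
```

Because the grade requires no human judgment, it can be computed millions of times during training, which is precisely why bug-fixing improves faster than prose.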
Challenges in measuring some skills
Unlike code, a piece of writing or a chatbot reply has no objective pass/fail check, so these tasks are subjective and hard to evaluate at scale. Not every task falls neatly into an easy-to-test or hard-to-test bucket, though: domains such as financial reporting could become testable if someone builds the right evaluation suite.
Companies that develop effective testing kits could turn hard-to-automate processes into successful products, which makes testability itself a decisive factor in where AI advances next.
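As a hypothetical illustration of what one check in such a kit might look like, the sketch below verifies a mechanically testable property of a generated financial summary: that its line items sum to the stated total. The report format, function names, and tolerance are assumptions made for this example; a real kit would layer many such checks.

```python
import re

def extract_amounts(text: str) -> list[float]:
    """Pull dollar amounts like $1,200.50 out of free text."""
    return [float(m.replace(",", ""))
            for m in re.findall(r"\$([0-9][0-9,]*(?:\.[0-9]+)?)", text)]

def totals_reconcile(report: str) -> bool:
    """One check from a hypothetical financial-report testing kit:
    the line items must sum to the stated total. Assumes the total
    is the last dollar amount mentioned in the report."""
    amounts = extract_amounts(report)
    if len(amounts) < 2:
        return False
    items, stated_total = amounts[:-1], amounts[-1]
    return abs(sum(items) - stated_total) < 0.01  # tolerate rounding

good = "Revenue of $1,200.00 and services of $300.00 for a total of $1,500.00."
bad  = "Revenue of $1,200.00 and services of $300.00 for a total of $1,600.00."
print(totals_reconcile(good))  # True  -> the summary reconciles
print(totals_reconcile(bad))   # False -> flag for review
```

Each check of this kind converts a slice of a subjective task into a pass/fail signal, the same currency RL already trades in for code.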
Emerging technologies and future impacts
Some capabilities may prove more testable than they first appear. OpenAI's Sora 2 model, for instance, shows a marked jump in AI-generated video, with clear gains in realism and consistency, suggesting that video quality turned out to be more measurable, and therefore more trainable, than expected.
As long as RL remains central to AI development, a task's testability will largely determine whether it can be automated successfully. That dynamic could reshape the job market: if tasks in fields like healthcare prove trainable through RL, the people who perform them today will feel the shift.