Self-Reviewing Agents
A model will yield better results when thinking step by step. But can it review its own response?
I recently had the chance to test out Sweep, an AI-driven junior programmer designed to tackle tasks like fixing bugs, implementing simple features, writing tests, and filling in documentation. What struck me along the way is that Sweep is essentially a self-reviewing entity, and that's quite intriguing.
We're all fairly familiar by now with how much the results from large language models like ChatGPT can fluctuate. The output depends heavily on the prompt. At times the model produces wholly imagined answers, makes errors, displays human-like cognitive biases, or generates text that seems statistically plausible but isn't accurate. There are ways to mitigate these issues, some of which closely resemble strategies we instill in our education system. For instance, asking a model to break down its response step by step often yields superior results, much like how humans often catch their own mistakes when asked to explain their thought process.
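As a toy illustration of the difference I mean (my own wording, not taken from any particular paper or benchmark):

```python
# The same question, asked two ways. The second tends to get a more
# reliable answer, because the model lays out (and can check) its work.
plain = "Is 1001 a prime number? Answer yes or no."
step_by_step = (
    "Is 1001 a prime number? Reason through it step by step, "
    "then give your answer."
)
```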
Sweep pushes this concept further. It employs two prompts: a committer prompt, which has the model play a programmer writing the code, and a reviewer prompt, which has the model act as a code reviewer providing feedback. The committer then revises the code based on that feedback, and the cycle repeats, yielding better code each round.
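Here's a minimal sketch of what such a loop can look like. To be clear, this is my own reconstruction, not Sweep's actual implementation, and `ask` is a hypothetical stand-in for whatever chat-completion API you're using (it takes a system prompt and a user message and returns the model's reply):

```python
from typing import Callable

def self_reviewing_codegen(
    task: str,
    ask: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Draft code, have the model review it, revise, and repeat."""
    # Round zero: the "committer" writes a first draft.
    code = ask("You are a careful programmer. Write code for the task.", task)
    for _ in range(max_rounds):
        # The "reviewer" critiques the current draft.
        review = ask(
            "You are a strict code reviewer. List concrete problems, "
            "or reply with exactly APPROVED if the code is ready.",
            f"Task:\n{task}\n\nCode:\n{code}",
        )
        if review.strip() == "APPROVED":
            break
        # The committer revises based on the reviewer's feedback.
        code = ask(
            "You are a careful programmer. Revise the code to address "
            "every point in the review.",
            f"Task:\n{task}\n\nCode:\n{code}\n\nReview:\n{review}",
        )
    return code
```

The neat part is that both roles are the same model; the only thing that changes between calls is the framing.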
I find this approach fascinating because, in my experience, models often don't put the wealth of knowledge ingrained in them to use unless explicitly asked to. It's somewhat like how humans fall victim to cognitive biases unless we slow down and apply a more scientific approach to our own thinking. We, too, have to consciously engage our internal reviewer!
Philosophically speaking, this leads me to wonder if such behavior is learned—essentially mimicked—from human inputs given to the model, or if it’s inherent to the cognitive process itself. It might indicate a feature common to all intelligent beings. Well, maybe not all, but those with structurally networked minds. (Can minds exist without such networking? It’s anyone’s guess.) It could also be tied to their method of information retrieval or generation.
Intriguing thoughts to toy with. Don’t expect me to give you answers, though.