Self-Reviewing Agents

A model will yield better results when thinking step by step. But can it review its own response?

Before Growth is a weekly column about startups and their builders prior to product–market fit.

I recently had the chance to test out Sweep, an AI-driven junior programmer designed to handle tasks like fixing bugs, implementing simple features, writing tests, and completing documentation. What struck me during this process was that Sweep is essentially a self-reviewing entity, and that's quite intriguing.

We're all fairly familiar by now with how much the results from large language models like ChatGPT can fluctuate. The output depends heavily on the prompt. At times the model invents answers outright, makes errors, displays human-like cognitive biases, or generates text that is statistically plausible but inaccurate. There are ways to mitigate these issues, and some of them closely resemble techniques we teach in school. For instance, asking a model to work through its response step by step often yields better results, much as people catch their own mistakes when asked to explain their reasoning.
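
To make the step-by-step idea concrete, here is a minimal sketch of wrapping a question with such an instruction before sending it to a model. The function name `with_step_by_step` and the exact wording of the instruction are my own illustrative choices, not a recommended or standard phrasing.

```python
def with_step_by_step(question: str) -> str:
    """Append a step-by-step instruction to a question.

    The instruction text is an illustrative assumption; any wording that asks
    the model to show its reasoning before answering serves the same purpose.
    """
    instruction = (
        "Work through the problem step by step, "
        "then give the final answer on its own line."
    )
    return f"{question.strip()}\n\n{instruction}"


if __name__ == "__main__":
    print(with_step_by_step("How many weekdays are there in a 30-day month that starts on a Monday?"))
```

The returned string is what you would pass to whichever model client you use; the sketch deliberately stops short of calling any particular API.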

Sweep pushes this concept further. It uses two prompts: the committer, which has the model play the role of a programmer writing the code, and the reviewer, which has the model act as a code reviewer providing feedback. The committer then revises its code based on that feedback, and the cycle repeats, yielding better code.
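
The committer and reviewer cycle described above can be sketched roughly as follows. This is not Sweep's actual implementation: the prompt texts, the `LGTM` stop signal, the `max_rounds` limit, and the names `self_review_loop` and `fake_model` are all assumptions of mine for illustration. The model is passed in as a plain function so the sketch runs offline with a scripted stand-in.

```python
from typing import Callable

# Assumption: a "model" is any function that maps a prompt string to a reply string.
Model = Callable[[str], str]

COMMITTER_PROMPT = (
    "You are a programmer. Write code for this task.\n"
    "Task: {task}\n"
    "Previous attempt (may be empty):\n{code}\n"
    "Reviewer feedback (may be empty):\n{feedback}\n"
)
REVIEWER_PROMPT = (
    "You are a code reviewer. Review the code for this task.\n"
    "Task: {task}\n"
    "Code:\n{code}\n"
    "Reply with exactly LGTM if no changes are needed; otherwise list the problems.\n"
)


def self_review_loop(task: str, model: Model, max_rounds: int = 3) -> str:
    """Alternate committer and reviewer prompts until approval or the round limit."""
    code, feedback = "", ""
    for _ in range(max_rounds):
        # Committer role: write or revise the code using the latest feedback.
        code = model(COMMITTER_PROMPT.format(task=task, code=code, feedback=feedback))
        # Reviewer role: the same model critiques what it just wrote.
        feedback = model(REVIEWER_PROMPT.format(task=task, code=code))
        if feedback.strip() == "LGTM":
            break
    return code


def fake_model(prompt: str) -> str:
    """Scripted stand-in for a real LLM call so the loop can be run offline."""
    if prompt.startswith("You are a code reviewer"):
        if "return a + b" in prompt:
            return "LGTM"
        return "The function returns a - b; it should add."
    # Committer role: produce a buggy first draft, then fix it once feedback arrives.
    if "should add" in prompt:
        return "def add(a, b):\n    return a + b"
    return "def add(a, b):\n    return a - b"


if __name__ == "__main__":
    print(self_review_loop("Write add(a, b)", fake_model))
```

With a real model you would replace `fake_model` with a function that calls your provider's client. The round limit matters because nothing guarantees the reviewer will ever approve.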

I find this approach fascinating because, in my experience, models have a wealth of knowledge ingrained in them but often don't put it to use unless explicitly asked. This is somewhat like how we humans fall victim to certain cognitive biases unless we take a moment to slow down and apply a more scientific approach to our own thinking. We also have to consciously engage our internal reviewer!

