In a not-yet-peer-reviewed paper, a team of researchers from Northeastern University and the Massachusetts Institute of Technology suggests that large language models (LLMs) might be able to learn from their own mistakes, just like humans.

Teaching them to do so, they say, could push AI technologies into a new phase of autonomous problem-solving.

"Self-reflection allows humans to efficiently solve novel problems through a process of trial and error," the researchers write in the paper. "Building on recent research, we propose Reflexion, an approach that endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities."

In other words, their methodology, dubbed "Reflexion," is a framework for teaching AI models, via prompts, to apply a trial-and-error technique to their own outputs.

So, just like us, if at first they don't succeed, they can try, try again.

Testing their new framework was a relatively simple process. The machine, or "agent," was presented with problem-solving tasks and asked to complete them; when it messed up, it was prompted with the Reflexion technique to find those mistakes for itself, a process the researchers claim helps the program improve over successive attempts, much as a human would.
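In rough outline, that loop looks something like the sketch below. To be clear, this is not the researchers' code: it's a minimal Python illustration of a reflect-and-retry loop, in which "query_llm" stands in for whatever model API is being prompted and "run_task" for the task environment that scores an attempt, both hypothetical placeholders.

```python
# Minimal sketch of a Reflexion-style try/reflect/retry loop (illustrative only).
# query_llm and run_task are hypothetical placeholders, not a real API.

def query_llm(prompt: str) -> str:
    """Stand-in for a call to whichever language model is being prompted."""
    raise NotImplementedError("replace with an actual model call")

def run_task(attempt: str) -> tuple[bool, str]:
    """Stand-in for the task environment: returns (success, feedback)."""
    raise NotImplementedError("replace with an actual task evaluator")

def solve_with_reflexion(task: str, max_tries: int = 3) -> str:
    reflections: list[str] = []  # the agent's running memory of its own mistakes
    attempt = ""
    for _ in range(max_tries):
        # Fold earlier self-reflections back into the prompt.
        prompt = task
        if reflections:
            prompt += "\n\nLessons from previous failed attempts:\n" + "\n".join(reflections)
        attempt = query_llm(prompt)
        success, feedback = run_task(attempt)
        if success:
            return attempt
        # On failure, ask the model to diagnose its own mistake in plain language.
        reflections.append(query_llm(
            f"Task: {task}\nYour attempt: {attempt}\nFeedback: {feedback}\n"
            "Briefly explain what went wrong and how to avoid it next time."
        ))
    return attempt  # best effort after exhausting the retry budget
```

The key idea is that the model's own written post-mortems are fed back into the next prompt, giving the agent a rudimentary memory of what went wrong.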

"To achieve full automation, we introduce a straightforward yet effective heuristic that enables the agent to pinpoint hallucination instances, avoid repetition in action sequences, and, in some environments, construct an internal memory map of the given environment," the researchers write in their paper.

Using a series of standardized "decision-making tasks," the researchers found that their methodology greatly improved a model's success rate.

The scientists note that their research was conducted using GPT-3- and GPT-3.5-powered AIs, an important consideration given that OpenAI had just released the much more powerful GPT-4. However, in an accompanying blog post discussing the paper, the scientists say that when applied to GPT-4, a "slightly-improvised Reflexion-based GPT-4 agent" was correct 88 percent of the time, up from a 67 percent success rate without Reflexion.

Again, this paper isn't peer-reviewed, so definitely take the researchers' results with the usual grain of salt.

That said, AI programs mess up a lot, and as they continue to be embedded into workflows across industries and platforms, frameworks for addressing their pitfalls are certainly needed. While this research is more or less an exercise in prompt engineering — rather than addressing the problem of hallucinations from the inside out — it could help in the development of tools that can verify the infamously unreliable output of AI language models.

Besides, a little self-reflection never hurt anyone — human or machine.

More on machine hallucinations: Researchers Find GPT-4 Is Significantly Less Accurate Than GPT-3

