Researchers at Meta, the Allen Institute for AI and the University of Washington have proposed Husky, a new open-source language agent designed for complex, multi-step reasoning tasks.
The researchers claim that, unlike existing agents that focus on specific domains, Husky operates over a unified action space. This means it can handle diverse challenges such as numerical, tabular, and knowledge-based reasoning, whereas specialised agents, such as those built for coding, target a single kind of task.
Husky iterates between generating actions to solve a task and executing those actions using expert models, updating its solution state after each step. This iterative approach is a key point of distinction, and the researchers report that Husky outperforms previous agents across 14 evaluation datasets.
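The generate-then-execute loop described above can be illustrated with a toy sketch. This is not Husky's code: the names `SolutionState`, `generate_action`, `EXPERTS` and the two stub experts are hypothetical, and the action generator here is a hard-coded script standing in for what would be a trained language model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical fact table standing in for a retrieval expert's knowledge source.
FACTS = {"moons of Mars": "2", "moons of Earth": "1"}


def search_expert(query: str) -> str:
    """Stub retrieval expert: looks the query up in the tiny fact table."""
    return FACTS.get(query, "unknown")


def math_expert(expression: str) -> str:
    """Stub numerical expert: sums whitespace-separated integers."""
    return str(sum(int(token) for token in expression.split()))


# One shared registry of tools: a toy stand-in for a unified action space.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "search": search_expert,
    "math": math_expert,
}


@dataclass
class SolutionState:
    """The task plus every (tool, input, output) step taken so far."""
    task: str
    history: List[Tuple[str, str, str]] = field(default_factory=list)


def generate_action(state: SolutionState) -> Optional[Tuple[str, str]]:
    """Stub action generator (assumption: a real agent would use a trained model).

    Returns (tool, tool_input) for the next step, or None when finished.
    """
    step = len(state.history)
    if step == 0:
        return ("search", "moons of Mars")
    if step == 1:
        return ("search", "moons of Earth")
    if step == 2:
        # Feed the outputs of the earlier retrieval steps to the math expert.
        return ("math", " ".join(output for _, _, output in state.history))
    return None


def run_agent(task: str, max_steps: int = 10) -> SolutionState:
    """Alternate between choosing an action and executing it with an expert."""
    state = SolutionState(task)
    for _ in range(max_steps):
        action = generate_action(state)
        if action is None:
            break
        tool, tool_input = action
        output = EXPERTS[tool](tool_input)
        state.history.append((tool, tool_input, output))  # update the solution state
    return state


state = run_agent("How many moons do Mars and Earth have in total?")
print(state.history[-1][2])  # prints 3
```

The sketch combines two retrieval steps with one numerical step, which is the kind of mixed-tool task the article describes; everything else about how Husky selects and runs its experts is left to the paper.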
Read the full paper here.
Focus on Mixed-Tool Reasoning
One of Husky’s key innovations is its capability to manage mixed-tool reasoning. It excels in tasks that require retrieving missing knowledge and performing numerical calculations, achieving performance on par with, or exceeding, state-of-the-art models like GPT-4.
The researchers have also introduced HuskyQA, an evaluation set designed to stress-test language agents on mixed-tool reasoning, with a focus on tasks that require both retrieving missing knowledge and performing numerical reasoning.
Language agents perform complex tasks by using tools to execute each step precisely. However, most existing agents are based on proprietary models or designed to target specific tasks, such as mathematics or multi-hop question answering.
“Our experiments show that Husky outperforms prior language agents across 14 evaluation datasets,” the researchers stated.
AI agents have gained significant popularity over the last couple of years, and the introduction of an agent that can reason across a range of complex tasks suggests that agent capabilities are expanding quickly.