Researchers from Sea AI Lab, UCAS, NUS, and SJTU Introduce FlowReasoner: a Query-Level Meta-Agent for Personalized System Generation

April 27, 2025


LLM-based multi-agent systems characterized by planning, reasoning, tool use, and memory capabilities form the foundation of applications like chatbots, code generation, mathematics, and robotics. However, these systems face significant challenges because they are manually designed, leading to high human resource costs and limited scalability. Graph-based methods have tried to automate workflow design by formulating workflows as networks, but their structural complexity restricts scalability. State-of-the-art approaches represent multi-agent systems as programming code and use advanced LLMs as meta-agents to optimize workflows, but focus on task-level solutions that generate a single task-specific system. This one-size-fits-all approach lacks the capability for automatic adaptation to individual user queries.
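The difference between task-level and query-level generation can be illustrated with a toy sketch. Everything here is hypothetical: `build_workflow` stands in for a meta-agent call, and the class names are invented for illustration; none of it comes from the FlowReasoner codebase.

```python
from typing import Callable, Dict

# A workflow maps a user query to an answer string.
Workflow = Callable[[str], str]


def build_workflow(description: str) -> Workflow:
    """Hypothetical stand-in for a meta-agent that emits a workflow."""
    def workflow(query: str) -> str:
        return f"[{description}] answer to: {query}"
    return workflow


class TaskLevelMetaAgent:
    """Builds one workflow per task category and reuses it for every query."""

    def __init__(self) -> None:
        self._cache: Dict[str, Workflow] = {}

    def workflow_for(self, task: str, query: str) -> Workflow:
        # The query is ignored: all queries in a task share one system.
        if task not in self._cache:
            self._cache[task] = build_workflow(f"task={task}")
        return self._cache[task]


class QueryLevelMetaAgent:
    """Builds a fresh workflow tailored to each individual query."""

    def workflow_for(self, task: str, query: str) -> Workflow:
        return build_workflow(f"task={task}, query={query!r}")
```

In this sketch the task-level agent hands back the same workflow object for every query in a category, while the query-level agent produces a new one each time, which is the adaptation gap the paragraph above describes.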

LLM-based multi-agent systems are the foundation for various real-world applications, including code intelligence, computer use, and deep research. These systems feature LLM-based agents equipped with planning capabilities, database access, and tool function invocation that collaborate to achieve promising performance. Early approaches focused on optimizing prompts or hyperparameters through evolutionary algorithms to automate agent profiling. ADAS introduced code representation for agents and workflows, with a meta-agent to generate workflows. Separately, OpenAI has advanced reasoning in LLMs by developing the o1 model. Models like QwQ, QvQ, DeepSeek, and Kimi have followed suit, developing o1-like reasoning architectures. OpenAI’s o3 model achieves promising results on the ARC-AGI benchmark.

Researchers from the Sea AI Lab, Singapore, the University of Chinese Academy of Sciences, the National University of Singapore, and Shanghai Jiao Tong University have proposed FlowReasoner, a query-level meta-agent designed to automate the creation of query-level multi-agent systems, generating one customized system per user query. The researchers distilled DeepSeek R1 to supply FlowReasoner with the basic reasoning capabilities needed to create multi-agent systems, and then enhanced it through reinforcement learning with external execution feedback. A multi-purpose reward mechanism is developed to optimize training across three critical dimensions: performance, complexity, and efficiency. This enables FlowReasoner to generate personalized multi-agent systems through deliberative reasoning for each unique user query.
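The article does not give the reward formula, so the following is only a minimal sketch of how a reward might combine the three dimensions. The weighted-sum form, the weights, the caps, and the `ExecutionFeedback` fields are all assumptions made for illustration, not the paper's definition.

```python
from dataclasses import dataclass


@dataclass
class ExecutionFeedback:
    """Hypothetical summary of one workflow execution."""
    pass_rate: float   # performance: fraction of tests passed, in [0, 1]
    num_steps: int     # complexity proxy: number of workflow steps
    latency_s: float   # efficiency proxy: wall-clock seconds


def multi_purpose_reward(
    fb: ExecutionFeedback,
    w_perf: float = 1.0,        # assumed weight
    w_complexity: float = 0.1,  # assumed weight
    w_efficiency: float = 0.1,  # assumed weight
    max_steps: int = 10,        # assumed normalization cap
    max_latency_s: float = 60.0,  # assumed normalization cap
) -> float:
    """Reward performance; penalize complex and slow workflows."""
    if not 0.0 <= fb.pass_rate <= 1.0:
        raise ValueError("pass_rate must be in [0, 1]")
    # Normalize both penalties to [0, 1] so the weights are comparable.
    complexity_penalty = min(fb.num_steps / max_steps, 1.0)
    efficiency_penalty = min(fb.latency_s / max_latency_s, 1.0)
    return (
        w_perf * fb.pass_rate
        - w_complexity * complexity_penalty
        - w_efficiency * efficiency_penalty
    )
```

Under these assumptions, two workflows that pass the same tests are ranked by how short and fast they are, so the training signal favors simpler systems when accuracy is equal.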


The researchers select three datasets for detailed evaluation across diverse code generation scenarios: BigCodeBench for engineering-oriented tasks, and HumanEval and MBPP for algorithmic challenges. FlowReasoner is evaluated against three categories of baselines:

Single-model direct invocation using standalone LLMs
Manually designed workflows, including Self-Refine, LLM-Debate, and LLM-Blender, with human-crafted reasoning strategies
Automated workflow optimization methods like AFlow, ADAS, and MaAS that construct workflows through search or optimization

Both o1-mini and GPT-4o-mini are used as worker models for manually designed workflows. FlowReasoner is implemented with two variants of DeepSeek-R1-Distill-Qwen (7B and 14B parameters) using o1-mini as the worker model.

FlowReasoner-14B outperforms all competing approaches, achieving an overall improvement of 5 percentage points compared to the strongest baseline, MaAS. It exceeds the performance of its underlying worker model, o1-mini, by a substantial margin of 10%. These results show the effectiveness of the workflow-based reasoning framework in improving code generation accuracy. To evaluate generalization capabilities, experiments are conducted replacing the o1-mini worker with models like Qwen2.5-Coder, Claude, and GPT-4o-mini, while keeping the meta-agent fixed as either FlowReasoner-7B or FlowReasoner-14B. FlowReasoner exhibits notable transferability, maintaining consistent performance across different worker models on the same tasks.

In this paper, researchers present FlowReasoner, a query-level meta-agent designed to automate the creation of personalized multi-agent systems for individual user queries. FlowReasoner uses external execution feedback and reinforcement learning with multi-purpose rewards focusing on performance, complexity, and efficiency to generate optimized workflows without relying on complex search algorithms or carefully designed search sets. This approach reduces human resource costs while improving scalability by enabling more adaptive and efficient multi-agent systems that dynamically optimize their structure based on specific user queries rather than relying on fixed workflows for entire task categories.



Sajjad Ansari is a final-year undergraduate at IIT Kharagpur. As a tech enthusiast, he delves into the practical applications of AI, with a focus on understanding the impact of AI technologies and their real-world implications. He aims to articulate complex AI concepts in a clear and accessible manner.