Last year, the promise of data intelligence – building AI that can reason over your data – arrived with Mosaic AI, a comprehensive platform for building, evaluating, monitoring, and securing AI systems. Since then, thousands of our customers have shipped data intelligence into production, building domain-specific agents powered by their enterprise data:
- Mastercard shipped digital assistants to accelerate customer onboarding
- AT&T protects wireless customers from fraud and harm
- Crisis Text Line built AI agents specialized for mental health to train the next generation of crisis counselors
- Block shipped goose, an AI coding assistant grounded in enterprise context
However, the immaturity of generative technology meant that the journey to production was still challenging. Building high-quality agents was often too complex, for several reasons:
- Evaluation is difficult: Many enterprise AI tasks are difficult to evaluate, both for humans and for automated LLM judges. Academic benchmarks such as math exams didn’t translate to real-world use cases. Building nuanced evaluations often required expensive manual labeling. As a result, promising projects stalled in endless tuning cycles, with stakeholders losing confidence because of unclear progress.
- Too many knobs: Agents are complex AI systems with many components, each of which has its own knobs. From tuning prompts to index chunking strategies to model choices and fine-tuning parameters, every adjustment creates unknown effects across the system. What should be fast iterative improvement becomes expensive and tedious manual trial and error, slowing time to production.
- Cost and quality: Even after teams resolve the above issues and build a high-quality agent, they’re often surprised to find that the agent is too expensive to scale into production. Teams then either get stalled in a long cost optimization process or are forced to make trade-offs between cost and quality.
Agent Bricks: Auto-optimizing agents for your domain tasks
Based on our experience working with customers to ship AI into production, we’ve spent the last year rethinking how to build agents. Today, we’re introducing Agent Bricks, a new product that changes how enterprises develop domain-specific agents. Rather than managing the overwhelming complexity of agent development, teams can focus on what matters most: defining their agent’s purpose and providing strategic guidance on quality through natural language feedback. Agent Bricks handles the rest, automatically generating evaluation suites and auto-optimizing quality.
Here’s how it works:
- Declare your task: Select your task, describe in natural language what you want the agent to accomplish, and connect your data sources.
- Automatic evaluation: Agent Bricks then automatically creates evaluation benchmarks specific to your task, which may involve synthetically generating new data or building custom LLM judges. Powered by MLflow 3, Agent Bricks automatically creates evaluation datasets and custom judges tailored to your task.
- Automatic optimization: Agent Bricks intelligently searches through and combines various optimization techniques, such as prompt engineering, model fine-tuning, reward models, or test-adaptive optimization (TAO), to achieve high quality.
- Cost and quality: Agent Bricks ensures agents are not only highly effective but also cost-effective. Users can choose between cost-optimized and quality-optimized models. In many cases, the end solution is both higher quality and lower cost compared to DIY approaches.
With Agent Bricks, you eliminate guesswork through automatic evaluations. We auto-optimize the knobs, so you can trust your agent’s performance and know you’re running at peak efficiency. The end result is that you can now ship high-quality, cost-efficient agents into production. Agent Bricks is optimized for common industry use cases, including structured information extraction, reliable knowledge assistance, custom text transformation, and orchestrated multi-agent systems.
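To make the cost-versus-quality choice concrete, here is a minimal Python sketch of picking between candidate agent configurations once each has been scored on a task-specific evaluation set. It is an illustration only, not how Agent Bricks works internally: the `Candidate` type, both selection functions, and all names and numbers are hypothetical and invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical agent configuration with its evaluation results."""
    name: str
    quality: float  # score on a task-specific evaluation set, in [0, 1]
    cost: float     # illustrative cost per 1,000 requests, in dollars


def pick_quality_optimized(candidates):
    """Return the highest-quality candidate; break ties by lower cost."""
    return max(candidates, key=lambda c: (c.quality, -c.cost))


def pick_cost_optimized(candidates, min_quality):
    """Return the cheapest candidate whose quality meets the floor."""
    eligible = [c for c in candidates if c.quality >= min_quality]
    if not eligible:
        raise ValueError("no candidate meets the quality floor")
    return min(eligible, key=lambda c: (c.cost, -c.quality))


# Made-up values, for illustration only (not measured results).
CANDIDATES = [
    Candidate("large-model-prompted", quality=0.86, cost=12.0),
    Candidate("small-model-finetuned", quality=0.88, cost=1.5),
    Candidate("small-model-prompted", quality=0.71, cost=1.0),
]

if __name__ == "__main__":
    print(pick_quality_optimized(CANDIDATES).name)
    print(pick_cost_optimized(CANDIDATES, min_quality=0.80).name)
```

In this toy data, the fine-tuned small model wins under both objectives, which mirrors the point that the optimized solution can be both higher quality and lower cost. The selection only becomes meaningful once a trustworthy evaluation exists, which is why evaluation comes first in the workflow.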
Build high-quality agents with Agent Bricks
Agent Bricks is uniquely able to measure, build, and continually improve quality. For conversational agents built over documents, for example, we measured quality averaged across several Q&A benchmarks. Compared to other products in this space, Agent Bricks built significantly higher-quality agents (Figure 1). Not only that, with the ability for continual learning, performance continues to improve over time.

For document understanding, Agent Bricks builds higher-quality and lower-cost systems compared to prompt-optimized proprietary LLMs (Figure 2). We can achieve a system that is higher quality on a document parsing benchmark at up to 10x lower cost.

Beyond these benchmarks, our customers are also able to build quality agents with Agent Bricks:
“Agent Bricks enabled us to double our medical accuracy over standard commercial LLMs, while meeting Flo Health’s high internal standards for clinical accuracy, safety, privacy, and security.”
– Roman Bugaev, CTO, Flo Health
“Agent Bricks significantly outperformed our original open-source implementation in both LLM-as-judge and human evaluation accuracy metrics.”
– Joel Wasson, Enterprise Data & Analytics, Hawaiian Electric
“[Agent Bricks] accelerated our AI capabilities across the enterprise, guiding us through quality improvements in the feedback loop and identifying lower-cost options that perform just as well.”
— Chris Rishnick, Director of AI, Lippert
Powered by the latest research in agent learning
Agent Bricks is able to achieve these results because it is powered by research from our Databricks Mosaic AI Research team. There is a zoo of techniques for improving agent quality, and new research is released at a breathless pace. Our team both curates existing research and develops new innovations, which Agent Bricks then uses during the automatic evaluation and optimization phase. While we have an expansive set of techniques, today we’re excited to highlight one of our innovations – Agent Learning from Human Feedback (ALHF).
Agent Learning from Human Feedback (ALHF)
A key challenge for quality is the ability to steer agent behavior from feedback. This is particularly difficult because feedback is often provided only as a thumbs up or thumbs down, and it’s unclear which of the many components and knobs within an agent system need to be adjusted to respect the feedback. The current approach, which is to pack all the instructions into one giant LLM prompt, is brittle and doesn’t generalize to a more complex agent system.
With ALHF, we’ve solved this with two approaches. First, we’re able to receive the rich context of natural language guidance (e.g. ignore all data before May 1990). Second, based on this natural language guidance, our algorithms intelligently translate the guidance into technical optimizations – refining the retrieval algorithm, enhancing prompts, filtering the vector database, or even modifying the agentic pattern.
This approach democratizes agent development, allowing domain experts to contribute directly to system improvement without deep technical expertise in AI infrastructure.
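As a rough illustration of the second step, the sketch below turns one piece of natural language guidance into a concrete retrieval filter. This post does not describe how ALHF performs the translation, so the code is a toy, rule-based stand-in: the regular expression, the `Document` type, and the function names are all hypothetical, and a real system would need to handle far more varied guidance than this single pattern.

```python
import re
from dataclasses import dataclass
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


@dataclass(frozen=True)
class Document:
    """A hypothetical retrievable document with a publication date."""
    text: str
    published: date


def guidance_to_cutoff(guidance):
    """Extract a date cutoff from guidance like '... before May 1990'.

    Returns None when the guidance does not match this one toy pattern.
    """
    match = re.search(r"before\s+([a-z]+)\s+(\d{4})", guidance.lower())
    if match is None:
        return None
    month = MONTHS.get(match.group(1))
    if month is None:
        return None
    return date(int(match.group(2)), month, 1)


def apply_guidance(documents, guidance):
    """Drop documents older than the cutoff implied by the guidance."""
    cutoff = guidance_to_cutoff(guidance)
    if cutoff is None:
        return list(documents)  # guidance not understood: leave retrieval as is
    return [doc for doc in documents if doc.published >= cutoff]


DOCS = [
    Document("older filing", date(1989, 12, 31)),
    Document("filing from the cutoff month", date(1990, 5, 1)),
    Document("recent filing", date(2001, 1, 15)),
]

if __name__ == "__main__":
    kept = apply_guidance(DOCS, "Ignore all data before May 1990")
    print([doc.text for doc in kept])
```

The point of the sketch is the shape of the mapping: a domain expert states a rule in plain language, and the system changes a specific component (here, a retrieval filter) rather than appending the rule to one large prompt.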
“The ability to continuously evaluate and improve accuracy is a key capability for Experian, especially in a highly regulated industry.”
— James Lin, Head of AI ML Innovation, Experian
The Path Forward: From Lab to Production in Days, Not Months
Early customers are already experiencing the transformation Agent Bricks delivers – accuracy improvements that double performance benchmarks and development timelines reduced from weeks to a single day. More importantly, they’re achieving something that seemed impossible just months ago: sustainable, scalable AI systems that deliver consistent business value.
Agent Bricks represents more than an evolution in tooling – it's a fundamental shift toward mature, production-ready AI development. As agent systems become increasingly central to business operations, the “vibe check” approaches of the past simply won't scale. Organizations need a robust, systematic approach to building and optimizing intelligent agents that can handle the complexity and requirements of real-world business applications.
Customers using Agent Bricks
Many Databricks customers have already built AI agents with Agent Bricks, and we’re looking forward to seeing what they do next.
Watch the video with Experian and Flo Health
“With Agent Bricks, our teams were able to parse through more than 400,000 clinical trial documents and extract structured data points, without writing a single line of code. In just under 60 minutes, we had a working agent that can transform complex unstructured data usable for Analytics.”
– Joseph Roemer, Head of Data & AI, Commercial IT, AstraZeneca
“Agent Bricks allowed us to build a cost-effective agent we could trust in production. With custom-tailored evaluation, we confidently developed an information extraction agent that parsed unstructured legislative calendars, saving 30 days of manual trial-and-error optimization.”
– Ryan Jockers, Assistant Director of Reporting and Analytics at the North Dakota University System
Try Agent Bricks Today
Ready to bridge the gap between “demo quality” and “production quality”? Agent Bricks is now available in beta.
The future of enterprise AI isn't about managing complexity – it's about focusing on the outcomes that matter while Agent Bricks handles the rest.