
Semantic layers are suddenly a hot commodity thanks to their ability to make private business data make sense to AI models. Databricks and Snowflake are each building their own semantic layers, but if broad industry support, universal applicability, and the ability to switch data lakehouse providers are the goal, then AtScale says it’s ahead of the game.
Over the past year, the ability of large language models (LLMs) to generate good-quality SQL code has increased dramatically, which has spurred great interest in using LLMs as de facto data analysts. The big hope is that using an LLM to convert a natural language query into SQL will enable many more people, applications, and AI agents to get access to business data, thereby achieving (finally!) the longstanding goal in the BI community of democratizing access to data.
That’s the grand plan, anyway, but there are a few small details to work out, including the fact that the big LLMs have (hopefully) never seen your private database before and therefore have no idea what the columns, rows, tables, and views actually mean. That’s kind of a problem if accuracy is important to your board of directors.
And that’s where a semantic layer plays an important role, by functioning as a translator, if you will, between the specific way you’ve modeled your data in your database (including the particular measures, dimensions, and metrics that define your individual business) and the generic definitions that SQL query engines and AI models can read and understand.
AtScale co-founder and CTO David Mariani has watched demand increase for the type of semantic layer that his company builds. Originally developed a dozen years ago to support AtScale’s online analytical processing (OLAP) query engine, the company’s semantic layer itself has become a big sales driver and a focus for the company. That makes the industry activity around semantic layers both good and bad, Mariani says.
“It’s like we were alone in kind of shouting from the mountaintops how important a semantic layer was, and so now the rest of the market agrees, so that’s great. You can’t be a market of one,” Mariani tells BigDATAwire. “So we’re really encouraged that other people are investing in this area. But man, they’ve got a lot of work in front of them. A lot of hard work.”
There’s no question that a semantic layer can improve the quality of AI-generated BI queries. AtScale recently conducted a test in which it measured the accuracy of SQL queries generated by Google’s Gemini and Snowflake’s Cortex offerings. The first phase of the test measured their performance on the Transaction Processing Performance Council’s (TPC) Decision Support (DS) benchmark running as stand-alone products, and the second phase measured how they performed using the AtScale semantic layer as a translator. Without the semantic layer, Gemini and Cortex query results were in the 0% to 30% accuracy range, depending on schema and question complexity. With AtScale, the scores were 100%.
Why did the scores improve so much? It’s all about understanding how data is stored in the database, which is where the complexity lives. The TPC-DS benchmark simulates a retailer that sells to consumers in three ways: in-store, via the Web, and through a catalog. Sales in each of those channels are booked separately in the database, but to know what “total sales” means, the person or application generating the SQL query needs to know which specific part of the database has the correct number to plug into the equation.
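To make the “total sales” problem concrete, here is a minimal sketch of what a metric definition in a semantic layer might look like. It is not AtScale’s implementation or SML syntax. The `TOTAL_SALES` structure and the `compile_metric` and `demo` helpers are hypothetical, and the choice of the `*_ext_sales_price` columns from the three TPC-DS sales tables as the “correct number” is an assumption made for illustration.

```python
import sqlite3

# Hypothetical metric definition: for each sales channel, which table and
# column hold the amount to count. The column choice is an assumption.
TOTAL_SALES = {
    "name": "total_sales",
    "sources": [
        ("store_sales", "ss_ext_sales_price"),
        ("web_sales", "ws_ext_sales_price"),
        ("catalog_sales", "cs_ext_sales_price"),
    ],
}


def compile_metric(metric):
    """Expand a metric definition into one SQL query that sums every channel."""
    parts = [
        f"SELECT {column} AS amount FROM {table}"
        for table, column in metric["sources"]
    ]
    union = " UNION ALL ".join(parts)
    return f"SELECT SUM(amount) AS {metric['name']} FROM ({union})"


def demo():
    """Run the compiled query against a tiny in-memory stand-in for the schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE store_sales (ss_ext_sales_price REAL);
        CREATE TABLE web_sales (ws_ext_sales_price REAL);
        CREATE TABLE catalog_sales (cs_ext_sales_price REAL);
        INSERT INTO store_sales VALUES (100.0), (50.0);
        INSERT INTO web_sales VALUES (25.0);
        INSERT INTO catalog_sales VALUES (10.0);
        """
    )
    sql = compile_metric(TOTAL_SALES)
    (total,) = conn.execute(sql).fetchone()
    conn.close()
    return sql, total
```

The point of the sketch is that whoever asks for `total_sales` never has to know that three separate tables are involved. That knowledge lives in the definition, which is written once.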
“It’s got to look through dozens of tables, and these are not all just tables, because each of these green boxes are dimensions, which itself have a model behind it,” Mariani says. “So it’s immensely complex. And so to get it right and to get it right consistently without a map, how are you going to get to the destination without a map?”
One solution would be to simply give your proprietary database to the LLM, which would eventually be able to figure it out. But most organizations are hesitant to do that because of security and privacy concerns. The alternative, of course, is to sit a semantic layer in between the LLM and your database to function as the map or the translator.
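One way to picture that arrangement is sketched below: the LLM is shown only business-level names, and anything it asks for is checked against the catalog before any SQL is generated. This is an illustration under stated assumptions, not a description of any vendor’s product. `CATALOG`, `describe_for_llm`, and `validate_request` are hypothetical names, and the physical column names on the right-hand side are made up for the example.

```python
# Hypothetical catalog: business names on the left, physical expressions
# on the right. Only the left-hand names are ever shown to the model.
CATALOG = {
    "metrics": {"total_sales": "SUM(amount)"},
    "dimensions": {"channel": "sales_channel", "year": "d_year"},
}


def describe_for_llm(catalog):
    """Return only business-level names; physical column names stay private."""
    return {
        "metrics": sorted(catalog["metrics"]),
        "dimensions": sorted(catalog["dimensions"]),
    }


def validate_request(catalog, request):
    """Reject any metric or dimension the semantic layer does not define."""
    unknown = [m for m in request.get("metrics", []) if m not in catalog["metrics"]]
    unknown += [
        d for d in request.get("dimensions", []) if d not in catalog["dimensions"]
    ]
    if unknown:
        raise ValueError(f"unknown names: {unknown}")
    return request
```

In this design the model never sees table or column names, and a request for a metric that does not exist fails loudly rather than producing a plausible but wrong query.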
The question, then, becomes which semantic layer to use. Many BI tools, like Looker, Tableau, and Power BI, come with their own semantic layers, and data lake providers, like Snowflake and Databricks, are also building semantic layers that understand data stored on their platforms. Alternatively, customers can choose to buy an independent semantic layer that works with multiple front-end BI tools and back-end databases. This is what Mariani and AtScale are building: a universal semantic layer that works with everything.
“It’s like a Rosetta Stone that allows you to plug different things into it, but it still lives inside your firewall,” Mariani says. “The semantic layer is that firewall, that abstraction layer which allows them to have the independence to switch out the back end or switch out the front end. Because ultimately your business logic is the same and your presentation is the same regardless of what it’s talking to.”
AtScale isn’t the only vendor building a universal semantic layer. Last week we covered the work that its competitor, Cube, is doing. And dbt Labs is looking to expand from its dominant role in data transformation into semantic layers.
Mariani respects the work that these vendors are doing, but he also insists that AtScale’s semantic layer is more mature and is better positioned to become the standard for this space, if one emerges (which is no guarantee).

LLMs struggle to make sense of complex data modeling schemes on private data (Image source: AtScale)
In 2024, the company took a step toward becoming the industry standard by open sourcing the language it uses to define metrics. Dubbed Semantic Modeling Language (SML), the language is now in the open domain. In addition to defining metrics, SML can be used to translate between other semantic layers, including support for Snowflake, dbt, Power BI, and Looker. Mariani says it’s being donated to the Apache Software Foundation.
Would AtScale take the next step and open source its semantic engine, as Cube has done? That’s not in the cards at the moment, Mariani says.
“For now, no, but we’re definitely interested in establishing a standard open source semantic modeling language because we’re seeing there’s now a lot of competing languages,” he says. “We’re not the only game in town. Everybody’s gotten into it and they’re all creating their own languages. And that’s really kind of bad for the industry, I think.”
There’s one other capability in AtScale’s semantic layer that could be an ace up its sleeve: deep technical support for Microsoft’s data and analytics stack.
“The challenge to a universal semantic layer is that you have to connect to everything, and that’s where we have an advantage. Because we’re multi-dimensional, we can support the Microsoft stack through and through,” he says. “That means Excel and Power BI work natively with AtScale, just like they would work with Microsoft Analytics stack. That’s unique to us. And that’s really, really, really hard because those multidimensional languages are not meant to be translated into a tabular SQL language. And we’ve been working on that for literally 12 years. Other vendors are going to have a hard time supporting those interfaces.”
As demand for universal semantic layers picks up, vendors like AtScale will be right in the thick of it. The market hasn’t yet signaled whether universal semantic layers will be favored, or whether customers will be satisfied with semantic layers tied to particular BI tools or data platforms. In the meantime, greater investment in this area suggests that more innovation is on the way.