Change to Apache Iceberg May Streamline Queries, Open Data

June 10, 2025



The folks behind the Apache Iceberg project are considering an architectural change to the specification in the next version that would allow query engines to access metadata directly from the Iceberg catalog, thereby avoiding the need to talk to the underlying object storage system. If implemented, the change, which in some ways mirrors how the new DuckLake table format works, could have implications for how data is stored and retrieved in Iceberg-based lakehouses.

As the Iceberg specification is currently written, the metadata that describes Iceberg tables is stored in an on-disk format that is required to reside directly on object storage, such as Amazon S3 or Google Cloud Storage (GCS). When a query engine, such as Apache Spark or Trino, submits a query, the REST-based metadata catalog (such as Apache Polaris) sends the engine a path that leads back to the object storage system to get the data.

“Typically when you read an Iceberg table, the first thing you do is you get a path from the catalog and it tells you where to read a set of snapshots,” Russell Spitzer, a principal engineer at Snowflake and a member of the project management committee (PMC) for both Apache Iceberg and Apache Polaris, explained. “You start reading your snapshot. That’s another file on disk that gives you a list of manifests and each manifest has a list of data files. And then from all of that, you eventually send out that information to your workers, and they start actually reading data files.”
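The read path Spitzer describes can be sketched as a toy model. This is only an illustration under stated assumptions: the "object store" is an in-memory dictionary, and every path, field name, and function here (`OBJECT_STORE`, `CATALOG`, `plan_files`) is hypothetical rather than the actual Iceberg on-disk format or any library API. The point is simply that the engine makes several round trips to storage before it learns which data files to read.

```python
# Toy model of the current read path. All paths and field names are
# hypothetical; this is not the real Iceberg metadata layout.
OBJECT_STORE = {
    "s3://lake/t/metadata/root.json": {
        "current_snapshot": "s3://lake/t/metadata/snap-9.json",
    },
    "s3://lake/t/metadata/snap-9.json": {
        "manifests": [
            "s3://lake/t/metadata/m1.json",
            "s3://lake/t/metadata/m2.json",
        ],
    },
    "s3://lake/t/metadata/m1.json": {
        "data_files": ["s3://lake/t/data/a.parquet", "s3://lake/t/data/b.parquet"],
    },
    "s3://lake/t/metadata/m2.json": {
        "data_files": ["s3://lake/t/data/c.parquet"],
    },
}

# The catalog holds only the top of the tree: a pointer to the root file.
CATALOG = {"db.t": "s3://lake/t/metadata/root.json"}


def plan_files(table):
    """Walk catalog -> root -> snapshot -> manifests, counting storage reads."""
    reads = 0

    def read(path):
        nonlocal reads
        reads += 1  # each call stands in for one object storage request
        return OBJECT_STORE[path]

    root = read(CATALOG[table])
    snapshot = read(root["current_snapshot"])
    data_files = []
    for manifest_path in snapshot["manifests"]:
        data_files.extend(read(manifest_path)["data_files"])
    return data_files, reads


if __name__ == "__main__":
    files, reads = plan_files("db.t")
    print(f"{len(files)} data files found after {reads} metadata reads")
```

In this toy table, four metadata reads are needed before a single data file is opened, and the count grows with the number of manifests.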

Instead of storing just the top of the metadata tree within a REST catalog like Polaris, the change would allow the entire metadata tree to reside in the catalog. That would eliminate the need for query engines to go back to the object storage system to figure out what data they need, streamlining the data flow and reducing query latency.

The prevailing structure was constructed for a cause. For starters, object storage is infinitely scalable, so you’ll by no means run into an issue the place you’ll be able to’t match your whole metadata within your catalog, Spitzer mentioned. It’s additionally very straightforward for different shoppers to take care of. Nonetheless, right this moment’s question engines have extra intelligence inbuilt, and the additional layer of metadata storage and entry actually isn’t wanted. That’s main the Iceberg and Polaris initiatives to discover how they may retailer extra metadata within the catalog itself.

“One of the things that we want to move towards, or at least start thinking about, is how much of that can we cache at the catalog level?” Spitzer told BigDATAwire at the Snowflake Summit last week in San Francisco. “A lot of these systems, like Trino, Spark, and Snowflake, have a coordination system that doesn’t need to actually know the nitty gritty of every data file that’s being read, because what they actually just need is to know what portions of data are they going to assign out to their workers. And then the workers can get that with a reference to the catalog and say, ‘Hey, I’m part of scan five. I’m supposed to read task four.’ And then those data file paths will get sent straight to the worker node instead of to the coordinator. So basically you optimize away that path.”
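A minimal sketch of that division of labor follows. The `ToyCatalog` class and its `plan_scan` and `fetch_task` methods are invented for illustration and are not the Iceberg REST API; the round-robin split is likewise an assumption. The coordinator only ever sees a scan ID and task IDs, while each worker asks the catalog for the file paths belonging to its own task.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical catalog that knows every data file of every table."""

    tables: dict  # table name -> list of data file paths
    _scans: dict = field(default_factory=dict)  # scan id -> list of tasks

    def plan_scan(self, table, num_tasks):
        """Split a table's files into tasks; return only opaque identifiers."""
        files = self.tables[table]
        scan_id = len(self._scans) + 1
        # Round-robin split (an arbitrary choice for this sketch).
        self._scans[scan_id] = [files[i::num_tasks] for i in range(num_tasks)]
        return scan_id, list(range(num_tasks))

    def fetch_task(self, scan_id, task_id):
        """Called by a worker: 'I'm part of scan N, I'm supposed to read task M.'"""
        return self._scans[scan_id][task_id]


if __name__ == "__main__":
    catalog = ToyCatalog(
        tables={"db.t": ["a.parquet", "b.parquet", "c.parquet", "d.parquet", "e.parquet"]}
    )

    # Coordinator: learns how the work is divided, never sees a file path.
    scan_id, task_ids = catalog.plan_scan("db.t", num_tasks=2)

    # Workers: each fetches its own file list straight from the catalog.
    for task_id in task_ids:
        print(f"scan {scan_id}, task {task_id}:", catalog.fetch_task(scan_id, task_id))
```

Keeping the file listing out of the coordinator is the design choice Spitzer points to: the coordinator only needs to hand out task assignments, so the per-file detail can travel directly from catalog to worker.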

The good news is that the Iceberg specification already has an API for this. It’s called the scan API, and it allows query engines to access metadata directly from the REST catalog. That API had been described, but not actually developed. That development work is happening right now, according to Spitzer. The new functionality could be part of Apache Iceberg version 4.

In addition to optimizing the path, bypassing the additional metadata layer on the object storage system could also allow users to export data directly from Iceberg lakehouses into other Iceberg lakehouses, Spitzer said.


“If you have a client that knows how to read these scan tasks that are produced, you don’t actually need the underlying table to be in that representation. You just need to know how to read it into that on the catalog side, so the client doesn’t need to be familiar with all kinds of different table formats,” Spitzer said. “The client just needs to know how the Iceberg REST spec communicates, and then you can basically have support for all kinds of different table formats in your catalog transparently to your users, with no conversion of the metadata. You just give them different sets of Parquet data files.”

Enabling direct access to table format metadata and avoiding the need for a single root file that controls access to data is one of the features in the newly launched DuckLake offering from DuckDB. DuckLake, which describes a new table format and a lakehouse architecture, uses a SQL database to manage metadata, which is something that DuckDB’s Mark Raasveldt and Hannes Mühleisen discussed at length in a blog post.

Spitzer acknowledged the similarities between the proposed Iceberg changes and DuckLake. “It was interesting to me when DuckLake was announced just a little while ago, because we’re already thinking about these ideas,” he said. “I was like, okay, I guess that’s validation that what we’re thinking about is what other folks are thinking about too.”

If the new approach is implemented, it would likely be optional, according to Spitzer, and users would have the choice of allowing query engines to access metadata directly or using the existing approach.
