This is a guest post by Andries Engelbrecht, Principal Partner Solutions Engineer at Snowflake, in partnership with AWS.
AWS introduced a new catalog federation feature that lets you directly access data from Snowflake Horizon Catalog through the AWS Glue Data Catalog. This integration lets you discover and query Horizon Catalog data in Iceberg format through REST endpoints while applying fine-grained access controls using AWS Lake Formation. The new catalog federation combined with Snowflake’s catalog-linked database feature means customers can access data stored across AWS and Snowflake from a single point of access, reducing data movement and associated costs by eliminating the need to duplicate data across platforms.
In this post, we show you how to connect the AWS Glue Data Catalog to Snowflake Horizon Catalog and query the data using AWS analytics services. We cover how to set up catalogs in Horizon Catalog and configure required permissions, create and configure the federation connection in AWS Glue, implement fine-grained access controls using AWS Lake Formation, and finally, query federated tables using Amazon Athena. This step-by-step approach guides you through the complete process of establishing an integration between your Snowflake and AWS data environments.
Business use cases and key benefits
Catalog federation enables several important business scenarios while delivering key operational and strategic benefits.
Common use cases
This federation capability addresses several key business scenarios:
- Governed, cross-platform analytics: Query data across AWS and Snowflake environments to improve data-driven decision making without data movement or duplication
- Data mesh implementation: Enable secure and federated data discovery while maintaining domain-oriented ownership
- Compliance management: Implement consistent access controls and auditing across platforms
Key benefits
- Operational efficiency: Eliminate data duplication and reduce extract, transform, and load (ETL) workloads
- Enhanced security: Centralize access control through AWS Lake Formation with fine-grained permissions
- Cost optimization: Minimize data transfer and storage costs across platforms
- Improved agility: Enable faster time to insights with direct query access
- Simplified governance: Maintain a unified compliance and audit framework
Solution overview
The solution uses catalog federation in the AWS Glue Data Catalog to integrate with Snowflake Horizon Catalog. This integration supports both Snowflake Horizon, where the catalog is internal to Snowflake, and external catalogs such as Apache Polaris, Snowflake Open Catalog (a managed service that hosts Apache Polaris), and others.
The following diagram illustrates how AWS Glue Data Catalog federates with Snowflake Horizon Catalog, enabling customers to directly access Iceberg-format data managed by Snowflake Horizon Catalog through the Glue Data Catalog.

The integration works through three main components:
- Authentication: Uses OAuth2 credentials of the Snowflake principal
- Access control: AWS Lake Formation manages fine-grained permissions
- Query access: AWS analytics services such as Amazon Athena can directly query the federated tables
Next, we walk through the step-by-step process of setting up this integration.
Prerequisites
Before you begin, confirm you have the following:
Configure Snowflake Horizon Catalog for Iceberg external access
Snowflake Horizon Catalog already supports managing Iceberg tables. For this walkthrough, you need to create Snowflake-managed Iceberg tables with data stored in Amazon S3.
Follow these steps in order:
- Create an external volume for S3: First, create an external volume that points to the S3 bucket where your Iceberg table data is stored. Follow the instructions in Create External Volume(s) for the Iceberg Tables on S3.
- Create a database: Create a database to organize your tables. Refer to the Snowflake database creation documentation.
- Create a schema: Create a schema within your database following the Snowflake schema creation guide.
- Create an Iceberg table: Create your Iceberg table using the external volume. Follow the instructions to Create Iceberg Table.
After completing these steps, your Snowflake-managed Iceberg tables are ready to federate with AWS Glue Data Catalog.
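The four steps above can be sketched as Snowflake SQL statements. The following Python snippet only renders the statements as text so you can review them before running them in Snowflake. Apart from LAKEHOUSEDB, which is the catalog name used later in this post, every name here (the volume, bucket URL, role ARN, schema, table, and column list) is a placeholder assumption, and the linked Snowflake documentation remains the authority on the full syntax and options.

```python
def iceberg_setup_sql(volume, bucket_url, role_arn, database, schema, table):
    """Render the setup statements in the order the steps describe."""
    return [
        # 1. External volume that points at the S3 location for Iceberg data.
        f"CREATE EXTERNAL VOLUME {volume} "
        f"STORAGE_LOCATIONS = ((NAME = '{volume}_s3' STORAGE_PROVIDER = 'S3' "
        f"STORAGE_BASE_URL = '{bucket_url}' STORAGE_AWS_ROLE_ARN = '{role_arn}'));",
        # 2. Database to organize the tables.
        f"CREATE DATABASE {database};",
        # 3. Schema inside that database.
        f"CREATE SCHEMA {database}.{schema};",
        # 4. Snowflake-managed Iceberg table stored on the external volume.
        f"CREATE ICEBERG TABLE {database}.{schema}.{table} (id INT, name STRING) "
        f"CATALOG = 'SNOWFLAKE' EXTERNAL_VOLUME = '{volume}' "
        f"BASE_LOCATION = '{table}';",
    ]


# Placeholder values; replace them with your own before use.
statements = iceberg_setup_sql(
    volume="iceberg_vol",
    bucket_url="s3://amzn-s3-demo-bucket/iceberg/",
    role_arn="arn:aws:iam::111122223333:role/snowflake-s3-access",
    database="LAKEHOUSEDB",
    schema="MY_SCHEMA",
    table="MY_TABLE",
)
for statement in statements:
    print(statement)
```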
Configure access control and authentication
To enable AWS Glue to access your Snowflake-managed Iceberg tables, you need to configure access control and obtain authentication credentials.
Step 1: Configure access control
Create a dedicated Snowflake role for external engine access to establish clear governance boundaries. Follow the instructions in Configure Access Control for external engines and set up the appropriate permissions on your Iceberg tables.
Step 2: Obtain an access token
Generate an access token for authenticating AWS Glue to Snowflake Horizon Catalog. Snowflake supports three authentication mechanisms:
- External OAuth
- Key-pair authentication
- Programmatic Access Token (PAT)
Choose the authentication method that best fits your security requirements and follow the corresponding Snowflake documentation to generate your credentials.
Catalog federation supports OAuth or custom authentication. For details on using OAuth, refer to Federate to Snowflake Iceberg Catalog.
For this post, we use custom authentication and generate an access token using a PAT. Replace role_name with the principal role and token_value with the principal’s Programmatic Access Token.
Note down the access token that is generated.
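A minimal sketch of how the token request could be assembled, assuming the token endpoint accepts an OAuth2 client-credentials form in which the role is passed as a session:role scope and the PAT as the client secret. The form field names and the build_token_request helper are assumptions for illustration, and the endpoint URL is deliberately left as a parameter; confirm all of them against the Snowflake documentation before use. The snippet builds the request but does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url, role_name, token_value):
    """Build (but do not send) a form-encoded POST request for an access token."""
    body = urlencode({
        "grant_type": "client_credentials",        # assumed grant type
        "scope": f"session:role:{role_name}",      # assumed scope format
        "client_secret": token_value,              # the principal's PAT
    }).encode("utf-8")
    return Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# The URL below is a stand-in, not a real Snowflake endpoint.
request = build_token_request(
    "https://example.invalid/oauth/tokens", "role_name", "token_value"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request (for example with urllib.request.urlopen) would return the access token in the response body, which is the value to note down.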
Step 3: Enable catalog federation
With access control configured and authentication credentials in hand, AWS Glue catalog federation can now connect to and access Snowflake’s Horizon Catalog.
Optional: Snowflake Open Catalog configuration
If you prefer to use Snowflake Open Catalog for Iceberg external access instead, refer to Sync a Snowflake-managed table with Snowflake Open Catalog for alternative setup instructions.
Set up Glue catalog federation with Snowflake Horizon Catalog
Create a secret in AWS Secrets Manager
Log in to the AWS console using an IAM role that has access to AWS Secrets Manager. Open Secrets Manager:
- Choose Store a new secret and select Other type of secret for the secret type.
- Set the key-value pair:
- Key: BEARER_TOKEN
- Value: The access token noted earlier
- Choose Next and provide the secret name as horizon-secret.
- Complete the setup by choosing Store.
Alternatively, you can use the CLI to create the secret by running the following command.
Replace your-access-token and your-region with your actual values:
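The command itself is not reproduced in this copy of the post. As a hedged sketch, the snippet below renders an aws secretsmanager create-secret invocation that stores the same BEARER_TOKEN key-value pair under the name horizon-secret. The create_secret_command helper is a hypothetical name used for illustration; the snippet only prints the command and does not call AWS.

```python
import json
import shlex


def create_secret_command(access_token, region, name="horizon-secret"):
    """Return the AWS CLI arguments that store the token as BEARER_TOKEN."""
    secret_string = json.dumps({"BEARER_TOKEN": access_token})
    return [
        "aws", "secretsmanager", "create-secret",
        "--name", name,
        "--secret-string", secret_string,
        "--region", region,
    ]


argv = create_secret_command("your-access-token", "your-region")
# shlex.join quotes the JSON payload so the line can be pasted into a shell.
print(shlex.join(argv))
```

The command output includes the secret ARN, which is needed later when configuring the Glue connection.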
Create IAM role for catalog federation
As the catalog owner of a federated catalog in AWS Glue Data Catalog, you can use Lake Formation to implement comprehensive access controls for your data teams:
Access control options
You can implement access controls at different granularity levels depending on your governance needs:
- Coarse-grained: Table-level permissions
- Fine-grained: Column-level, row-level, and cell-level filtering
- Tag-based: Dynamic access based on data classification tags
Lake Formation requires an IAM role with permissions to access the underlying S3 locations of your external catalog.
Create an IAM role that allows the Glue connection to access AWS Secrets Manager and, optionally, VPC configurations, and that allows Lake Formation to manage credential vending for the S3 bucket or prefix.
Required permissions
- Secrets Manager access: The Glue connection requires permissions to retrieve secret values from Secrets Manager for the OAuth tokens stored for your Snowflake service connection.
- Amazon Virtual Private Cloud (VPC) access (optional): When using VPC endpoints to restrict connectivity to your Snowflake Open Catalog account, the Glue connection needs permissions to describe and use VPC network interfaces. This configuration ensures secure, controlled access to both your stored credentials and network resources while maintaining proper isolation through VPC endpoints.
- S3 bucket and AWS Key Management Service (KMS) key permission: The Glue connection requires S3 permissions to read certificates if they are used in the connection setup. Additionally, Lake Formation requires read permissions on the bucket or prefix where the remote catalog table data resides. If the data is encrypted using a KMS key, additional KMS permissions are required.
Setup steps:
Run the following commands using the AWS CLI, replacing the placeholders with your setup information:
Create a JSON file (for example, trust-policy.json) with the following structure:
Use the aws iam create-role command, referencing the trust policy file:
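The policy file and command are not reproduced in this copy of the post. The sketch below shows one plausible shape for them, assuming that both the AWS Glue and Lake Formation service principals need to assume the role; that choice of principals is an assumption to verify against the AWS documentation. The role name LFDataAccessRole is the one this post refers to later. The snippet prints the file contents and the command without calling AWS.

```python
import json
import shlex


def trust_policy():
    """Trust policy letting the Glue and Lake Formation services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    # Assumed service principals; confirm for your setup.
                    "Service": ["glue.amazonaws.com", "lakeformation.amazonaws.com"]
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Contents to save as trust-policy.json.
trust_policy_json = json.dumps(trust_policy(), indent=2)
print(trust_policy_json)

create_role = [
    "aws", "iam", "create-role",
    "--role-name", "LFDataAccessRole",
    "--assume-role-policy-document", "file://trust-policy.json",
]
print(shlex.join(create_role))
```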
Next, create a JSON file (for example, permissions-policy.json) for the permissions:
Then, attach it to the role:
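As a hedged sketch of what the permissions file could contain, the snippet below builds a policy document from the required permissions described earlier: reading the secret, reading table data in S3, and optionally KMS decryption and VPC network interface access. The statement IDs, the exact action lists, the inline policy name, and all ARNs are illustrative assumptions, so tighten them to match your environment. The snippet prints the file contents and an aws iam put-role-policy command without calling AWS.

```python
import json
import shlex


def permissions_policy(secret_arn, bucket, prefix="", kms_key_arn=None, use_vpc=False):
    """Build an IAM policy document covering the permissions described above."""
    statements = [
        {   # Lets the Glue connection read the stored access token.
            "Sid": "ReadSecret",
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"],
            "Resource": secret_arn,
        },
        {   # Lets Lake Formation read table data under the bucket or prefix.
            "Sid": "ReadTableData",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
        },
        {
            "Sid": "ListTableBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{bucket}",
        },
    ]
    if kms_key_arn:  # Only needed when the data is encrypted with a KMS key.
        statements.append({
            "Sid": "DecryptTableData",
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": kms_key_arn,
        })
    if use_vpc:  # Only needed when the connection runs through a VPC.
        statements.append({
            "Sid": "UseNetworkInterfaces",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
            ],
            "Resource": "*",
        })
    return {"Version": "2012-10-17", "Statement": statements}


# Placeholder ARNs and bucket; replace with your own values.
policy = permissions_policy(
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:horizon-secret-AbCdEf",
    bucket="amzn-s3-demo-bucket",
    prefix="iceberg/",
)
print(json.dumps(policy, indent=2))  # contents for permissions-policy.json

attach_policy = [
    "aws", "iam", "put-role-policy",
    "--role-name", "LFDataAccessRole",
    "--policy-name", "HorizonFederationAccess",  # hypothetical policy name
    "--policy-document", "file://permissions-policy.json",
]
print(shlex.join(attach_policy))
```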
Create federated catalog in Glue Data Catalog
AWS Glue supports the SNOWFLAKEICEBERGRESTCATALOG connection type for connecting Glue Data Catalog with Snowflake Horizon Catalog and Snowflake Open Catalog. This Glue connector supports OAuth2 authentication and includes additional configuration parameters such as CASING_TYPE to customize how AWS Glue Data Catalog discovers metadata in the Snowflake Horizon Catalog accounts.
Log in to your AWS console as a data lake admin and open the AWS Lake Formation console.
- Choose Catalog in the left navigation pane and select Create catalog.
- Choose the data source as Snowflake Horizon Catalog.

- Provide the following information:
- Name: Name of the federated catalog in Glue Data Catalog. For this post, we use federated_lakehousedb
- Catalog name in Snowflake: Name of the catalog in Snowflake Horizon Catalog. This must match the exact name in Horizon Catalog. For this post, we use LAKEHOUSEDB
- For Connection details, choose New connection configurations:
- Connection name: Name for the Glue connection. For this post, we use federatedconnection1.
- Workspace URL: Horizon IRC URL (an https:// address ending in .snowflakecomputing.com)
- Casing type: choose Uppercase only
- Authentication:
- Authentication type: choose Custom. Alternatively, you can select OAuth2 authentication. For Custom authentication, an access token is created, refreshed, and managed by the customer’s application or system and stored using AWS Secrets Manager.
- OAuth Secret: Provide the ARN of the Secrets Manager secret that was created in the previous step.
- If you have AWS PrivateLink, a proxy, or both set up, you can provide network details under Settings for network configurations (optional).
- For Register Glue connection with Lake Formation:
- Choose the IAM role created earlier (LFDataAccessRole) to manage data access using Lake Formation.
To test the connection, choose Run test. After the connection information is validated, it shows as successful.
You can now create the catalog by selecting Create catalog.
Alternatively, you can use the AWS CLI to create the connection and catalog using example commands:
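The example commands are not reproduced in this copy of the post. As a partial, hedged sketch, the snippet below renders an aws glue create-catalog invocation whose catalog input points the federated catalog at an existing Glue connection. It assumes the connection (federatedconnection1) has already been created, for example through the console steps above, and it does not attempt the connection input because the connector-specific property names should be taken from the AWS Glue documentation. The empty default-permission lists are also an assumption, intended to leave access control to Lake Formation.

```python
import json
import shlex


def create_catalog_command(catalog_name, snowflake_catalog, connection_name):
    """Return AWS CLI arguments that create a federated catalog over a connection."""
    catalog_input = {
        "FederatedCatalog": {
            "Identifier": snowflake_catalog,    # catalog name in Horizon Catalog
            "ConnectionName": connection_name,  # existing Glue connection
        },
        # Assumed: no default IAM-based grants, so Lake Formation governs access.
        "CreateDatabaseDefaultPermissions": [],
        "CreateTableDefaultPermissions": [],
    }
    return [
        "aws", "glue", "create-catalog",
        "--name", catalog_name,
        "--catalog-input", json.dumps(catalog_input),
    ]


catalog_argv = create_catalog_command(
    "federated_lakehousedb", "LAKEHOUSEDB", "federatedconnection1"
)
print(shlex.join(catalog_argv))
```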
After the catalog is created, the Horizon databases and tables are listed under the federated catalog.
You can implement fine-grained access control on the tables by applying row and column filters using Lake Formation.
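To illustrate what a row and column filter could look like, the sketch below builds the table-data payload for the aws lakeformation create-data-cells-filter command. The schema, table, column names, filter name, and row expression are hypothetical, and writing the catalog ID as the account ID followed by the federated catalog name is an assumption to verify for your account. The snippet prints the command without calling AWS.

```python
import json
import shlex


def data_cells_filter(account_id, catalog, database, table, name, columns, row_expression):
    """Build a Lake Formation data cells filter limiting columns and rows."""
    return {
        "TableCatalogId": f"{account_id}:{catalog}",  # assumed format for a federated catalog
        "DatabaseName": database,
        "TableName": table,
        "Name": name,
        "ColumnNames": list(columns),                        # columns visible to grantees
        "RowFilter": {"FilterExpression": row_expression},   # rows visible to grantees
    }


# All values below are placeholders for illustration.
table_data = data_cells_filter(
    account_id="111122223333",
    catalog="federated_lakehousedb",
    database="my_schema",
    table="my_table",
    name="us_rows_only",
    columns=["id", "name"],
    row_expression="country = 'US'",
)
filter_argv = [
    "aws", "lakeformation", "create-data-cells-filter",
    "--table-data", json.dumps(table_data),
]
print(shlex.join(filter_argv))
```

After a filter exists, you grant SELECT on it to a principal in Lake Formation so that the principal sees only the permitted rows and columns.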
Query the data using Athena query editor:
Open the Amazon Athena console and run the following query to access the federated Horizon table:
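The query is not reproduced in this copy of the post. A minimal sketch of its likely shape is a SELECT against a three-part name made of the federated catalog, the schema, and the table. The snippet below renders such a query with quoted identifiers; federated_lakehousedb is the catalog name used in this post, while the schema and table names are placeholders, and the casing Athena expects for them may differ in your setup.

```python
def federated_select(catalog, database, table, limit=10):
    """Render a SELECT against catalog.database.table with quoted identifiers."""
    def quote(identifier):
        # Double any embedded quotes, then wrap the identifier in double quotes.
        return '"' + identifier.replace('"', '""') + '"'

    return (
        f"SELECT * FROM {quote(catalog)}.{quote(database)}.{quote(table)} "
        f"LIMIT {int(limit)};"
    )


query = federated_select("federated_lakehousedb", "my_schema", "my_table")
print(query)
```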
Clean up
To clean up your resources, complete the following steps:
- Drop the Snowflake database with CASCADE.
- Drop the external volume created for Iceberg tables on S3.
- Drop the resources in Glue Data Catalog and Lake Formation created for this post.
- Delete the IAM roles and S3 buckets used for this post.
- Delete any VPC and KMS keys, if used for this post’s setup.
Conclusion
In this post, we demonstrated how to establish a secure connection between AWS analytics services and Snowflake Horizon Catalog, enabling you to access your data from a single connected and governed view. You learned how to:
- Configure catalog federation between AWS Glue Data Catalog and Snowflake Horizon Catalog
- Set up OAuth2 authentication for secure access
- Grant access to Iceberg tables in Snowflake Horizon Catalog using AWS Lake Formation
- Query federated tables using Amazon Athena
You can follow the same steps to establish a secure connection with open-source catalog options such as Snowflake Open Catalog, a managed service for Apache Iceberg. Remember to clean up any resources you created while following this tutorial to avoid ongoing costs.
To further explore this solution in your environment, consider the following resources:
These resources can help you implement and optimize this integration pattern for your specific use case. As you begin this journey, remember to start small, validate your architecture with test data, and gradually scale your implementation based on your organization’s needs. Stay tuned for future workshops and resources.
About the authors