Effectively integrating and analyzing Salesforce data is important in today’s enterprise environment. AWS Glue Zero-ETL (extract, transform, and load) now supports Salesforce Bulk API, delivering substantial performance gains compared with the Salesforce REST API for large-scale data integration into targets such as Amazon SageMaker lakehouse and Amazon Redshift. You can use this enhancement to process millions of Salesforce records in minutes while efficiently handling wide-column entities with hundreds of fields. In this blog post, we show you how to use Zero-ETL powered by AWS Glue with Salesforce Bulk API to accelerate your data integration processes.
Zero-ETL represents a modern approach to data integration that eliminates the need for traditional ETL processes by establishing direct connections between data sources and destinations. Rather than explicitly extracting data, transforming it, and loading it in separate steps, Zero-ETL handles these operations in the background. Zero-ETL enables direct integration with software as a service (SaaS) applications like Salesforce, automatically synchronizing data while maintaining consistency and eliminating the complexity of manual ETL pipeline development. This approach reduces development time, maintenance overhead, and the potential for errors in data movement processes.
Solution overview
Traditionally, Zero-ETL used the Salesforce REST API for data ingestion. While the REST API provides a straightforward way to interact with Salesforce data, it comes with certain limitations, especially when dealing with large datasets. These include request limits, data volume constraints, performance overhead, and concurrency limitations. As of August 2025, depending on the Salesforce edition and license type, you might be limited to between 15,000 and 100,000 API calls per 24-hour period. Retrieving large volumes of data requires multiple API calls, leading to inefficiency and extended processing times.
To address these limitations and improve performance, AWS Glue Zero-ETL now supports Salesforce Bulk API. The Bulk API is designed for processing large datasets and offers several advantages over the REST API. It uses asynchronous processing, so you can process much larger data volumes without timing out. Data is processed in batches, which can be parallelized for faster processing. As of August 2025, the Bulk API also has more generous limits: up to 15,000 batches per 24-hour period, with each batch containing up to 10,000 records, which works out to as many as 150,000,000 records per day. The following diagram shows a Salesforce Zero-ETL architecture ingesting data through the Salesforce Bulk and REST APIs and writing to Amazon SageMaker Lakehouse (in Amazon Simple Storage Service (Amazon S3) or Apache Iceberg) or Amazon Redshift.
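To make the batch arithmetic concrete, the following minimal Python sketch estimates how many Bulk API batches a given record count needs and whether that fits within the daily batch allowance. The two constants are the limits quoted in this post (as of August 2025); the function names are illustrative and are not part of any AWS or Salesforce SDK.

```python
# Limits quoted in this post (as of August 2025). Treat them as assumptions
# and check the values that apply to your own Salesforce org.
MAX_RECORDS_PER_BATCH = 10_000
MAX_BATCHES_PER_DAY = 15_000


def batches_needed(record_count: int) -> int:
    """Return the minimum number of batches needed for record_count records."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    # Integer ceiling division avoids floating-point rounding.
    return -(-record_count // MAX_RECORDS_PER_BATCH)


def fits_daily_limit(record_count: int) -> bool:
    """Return True if the records fit within one day's batch allowance."""
    return batches_needed(record_count) <= MAX_BATCHES_PER_DAY


# Upper bound on records per 24-hour period implied by the two limits.
daily_record_ceiling = MAX_RECORDS_PER_BATCH * MAX_BATCHES_PER_DAY

print(daily_record_ceiling)        # 150000000
print(batches_needed(10_000_000))  # 1000
```

Under these assumptions, the 10 million record test object described later in this post needs 1,000 full batches, a small fraction of the daily allowance.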
The diagram illustrates the Zero-ETL data flow from Salesforce to AWS analytics services. Salesforce data is ingested using smart API processing, which selects between the Bulk API for standard fields and the REST API for compound fields. This approach is necessary because the Salesforce Bulk API doesn’t currently support compound fields (such as Address), so the REST API is required in those cases for complete data extraction. The solution supports Salesforce wide-column entities containing up to 800 fields, enabling comprehensive data integration. The processed data is then staged in an S3 bucket owned by the service team before being made available in the AWS Glue Data Catalog or Amazon Redshift, ready for analytics and machine learning applications.
AWS Glue Zero-ETL now uses the Salesforce Bulk API by default for most data integration scenarios, delivering superior performance and scalability, particularly with large datasets. However, the solution automatically switches to the REST API when handling compound fields, such as addresses (which include street, city, state, postal code, and country). This hybrid approach provides the best of both worlds: the scalability and throughput of the Bulk API for most operations, with the specialized handling capabilities of the REST API where it makes the most sense. The system handles this switch automatically, so you don’t need to worry about which API to use for different scenarios.
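The routing idea can be pictured with a small Python sketch. This is an illustration of the concept only, not the service's implementation, which isn't published in this post. The `Field` type, the `COMPOUND_TYPES` set, and the `partition_fields` function are all hypothetical names, and the set of compound type names is an assumption.

```python
from dataclasses import dataclass

# Assumption: field types treated as compound for this illustration.
COMPOUND_TYPES = {"address", "location"}


@dataclass(frozen=True)
class Field:
    """A hypothetical description of one Salesforce field."""
    name: str
    type: str


def partition_fields(fields):
    """Split field names into those readable via Bulk API and those needing REST."""
    bulk, rest = [], []
    for field in fields:
        # Compound fields go to REST; everything else goes to Bulk.
        target = rest if field.type.lower() in COMPOUND_TYPES else bulk
        target.append(field.name)
    return {"BULK": bulk, "REST": rest}


example = [
    Field("Id", "id"),
    Field("Amount", "currency"),
    Field("BillingAddress", "address"),
]
print(partition_fields(example))
```

In this sketch only `BillingAddress` is routed to the REST group, while the remaining fields stay on the Bulk path.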
Performance details
With Salesforce Bulk API support in AWS Glue Zero-ETL, you can see significant performance improvements that scale with data volume. To test the performance benefits, we created a custom object in our Salesforce account and populated it with 10 million records. We then established a Zero-ETL integration between Salesforce and AWS Glue databases to measure data transfer performance. The largest gains appear in large-scale operations: processing 10 million records now completes in 6 minutes and 20 seconds compared with 28 minutes and 53 seconds with the REST API, a 4.6-fold improvement in processing time in our controlled testing environment, as shown in the following figure. Performance improvements can vary depending on factors such as data volume, field complexity, network conditions, and computational resources.
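The speedup figure follows directly from the two timings. The short Python sketch below reproduces the arithmetic using only the numbers reported in this post; the helper name `to_seconds` is illustrative.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes-and-seconds duration to whole seconds."""
    return minutes * 60 + seconds


RECORDS = 10_000_000                # size of the test object
rest_seconds = to_seconds(28, 53)   # REST API run: 1733 seconds
bulk_seconds = to_seconds(6, 20)    # Bulk API run: 380 seconds

# Ratio of elapsed times, reported in the text as roughly 4.6-fold.
speedup = rest_seconds / bulk_seconds
print(f"speedup: {speedup:.2f}x")   # speedup: 4.56x

# Average throughput implied by each run.
print(f"Bulk API: {RECORDS / bulk_seconds:,.0f} records/s")
print(f"REST API: {RECORDS / rest_seconds:,.0f} records/s")
```

The ratio is about 4.56, which rounds to the 4.6-fold improvement quoted above.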
Multi-entity processing scenarios, where four different Salesforce objects are processed concurrently, demonstrate the solution’s scalability. Even with this concurrent load, 1 million records across multiple entities complete processing in under 3 minutes, showing how the Bulk API handles real-world data integration scenarios, as shown in the following figure.
This performance pattern demonstrates that the Bulk API’s asynchronous, batch-oriented architecture delivers strong results when handling the large-scale data volumes that enterprises typically encounter in production Salesforce integrations. The performance advantage grows with data volume, making it particularly valuable for organizations processing millions of records in their daily operations. As dataset size increases, the efficiency gains become more pronounced, making the Bulk API the better choice for enterprise-scale data processing requirements.
Beyond the performance gains with large datasets, our recent enhancements have also unlocked another critical capability: efficient processing of wide-column entities. In our benchmarks, custom objects containing up to 800 columns and 226 KB record sizes processed in 2 minutes and 11 seconds, entities with 500 columns and 140 KB records completed in 2 minutes and 3 seconds, and 100-column entities with 28 KB records processed in 1 minute and 56 seconds (shown in the following figure). This consistency across varying column counts and record sizes demonstrates that Zero-ETL from SaaS applications maintains performance while ingesting and processing wide-column entities, which means that you can use your full Salesforce datasets for analytics and machine learning initiatives.
Impact
The performance improvements demonstrated by AWS Glue Zero-ETL with Salesforce Bulk API support offer tangible benefits for businesses managing large volumes of Salesforce data. As mentioned earlier, our controlled testing demonstrated a 4.6-fold improvement over the REST API when processing 10 million records. With these results, you can significantly reduce your data integration time windows. This faster processing allows for more frequent data updates, potentially enabling you to work with fresher data for your analytics and reporting needs. Additionally, the efficient handling of wide-column entities, such as processing custom objects with up to 800 columns in just over 2 minutes, means you can more readily use your full Salesforce datasets without sacrificing performance.
Prerequisites
Before implementing this solution, you must have the following in place:
- A Salesforce Enterprise, Unlimited, or Performance Edition account
- An AWS account with administrator access
- Create an AWS Glue database with a name such as zero_etl_bulk_demo_db and associate the S3 bucket zeroetl-etl-bulk-demo-bucket as a location of the database.
- Update the AWS Glue Data Catalog settings with an IAM policy for fine-grained access control of the Data Catalog for zero-ETL.
- Create an AWS Identity and Access Management (IAM) role named zero_etl_bulk_role. The IAM role will be used by Zero-ETL to access data from your Salesforce account.
- Create the secret zero_etl_bulk_demo_secret in AWS Secrets Manager to store Salesforce credentials.
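If you prefer to script the database prerequisite rather than use the console, a minimal sketch along the following lines might work. It assumes the boto3 Glue client's `create_database` call with a `DatabaseInput` payload containing `Name` and `LocationUri`. The helper names are hypothetical, and the client is passed in as a parameter so that the payload can be inspected without calling AWS.

```python
def build_database_input(name: str, bucket: str) -> dict:
    """Build the DatabaseInput payload for the Glue create_database call."""
    return {
        "Name": name,
        # The S3 bucket associated as the location of the database.
        "LocationUri": f"s3://{bucket}/",
    }


def create_demo_database(glue_client, name: str, bucket: str) -> dict:
    """Create the Glue database using the supplied client and return the payload.

    glue_client is expected to behave like boto3.client("glue").
    """
    database_input = build_database_input(name, bucket)
    glue_client.create_database(DatabaseInput=database_input)
    return database_input


# Preview the payload for the names used in this post (no AWS call is made).
print(build_database_input("zero_etl_bulk_demo_db", "zeroetl-etl-bulk-demo-bucket"))
```

With boto3 installed and credentials configured, calling `create_demo_database(boto3.client("glue"), "zero_etl_bulk_demo_db", "zeroetl-etl-bulk-demo-bucket")` would issue the request. The IAM role, the Data Catalog policy, and the secret are not covered by this sketch.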
Build and verify the zero-ETL integration
This section covers the steps required to set up a Salesforce connection and use that connection to create a Zero-ETL integration.
Step 1: Set up a connector to your Salesforce instance to enable data access
- Open the AWS Management Console for AWS Glue.
- In the navigation pane, under Data catalog, choose Connections.
- Choose Create Connection.
- In the Create Connection pane, enter Salesforce in Data Sources.
- Choose Salesforce.
- Choose Next.
- For Instance URL, enter your Salesforce instance URL.
- For IAM service role, select zero_etl_bulk_role (created as part of the prerequisites).
- For Authentication Type, select the authentication type that you’re using for Salesforce. In this example, we selected Authorization Code.
- For AWS Secret, select the secret zero_etl_bulk_demo_secret (created as part of the prerequisites).
- Choose Next.
- In the Connection Properties section, for Name, enter zero_etl_bulk_demo_conn.
- Choose Next.
Step 2: Set up Zero-ETL integration
- Open the AWS Glue console.
- In the navigation pane, under Data catalog, choose Zero-ETL integrations.
- Choose Create zero-ETL integration.
- In the Create integration pane, enter Salesforce in Data Sources.
- Choose Salesforce.
- Choose Next.
- Select the connection name that you created in the previous step.
- Select the IAM role that you created in the prerequisites.
- For Salesforce object, select the objects that you want the Zero-ETL integration to ingest. For this post, select Opportunity.
- For Namespace or Database, select the target database. In this example, we use zero_etl_bulk_demo_db (from the prerequisites).
- For Target IAM role, select zero_etl_bulk_role (from the prerequisites).
- Choose Next.
- In the Integration details section, for Name, enter zero-etl-bulk-demo-integration.
- Choose Next.
- Review the details and choose Create and launch integration.
- The newly created integration will show as Active in about a minute.
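If you want to verify the integration from a script instead of watching the console, one option is a simple polling loop, sketched below in Python. Everything here is an assumption made for illustration: `get_status` is a hypothetical callable that you would implement yourself (for example, by wrapping whichever AWS API or CLI call you use to read the integration state), and the status strings `"ACTIVE"` and `"FAILED"` are placeholders that might not match the values your tooling returns.

```python
import time


def wait_until_active(get_status, timeout_s=300.0, poll_s=10.0, sleep=time.sleep):
    """Poll get_status() until it reports ACTIVE, fails, or the timeout expires.

    get_status: hypothetical zero-argument callable returning a status string.
    sleep: injectable so that the loop can be exercised without real delays.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "ACTIVE":      # assumed name for the healthy state
            return status
        if status == "FAILED":      # assumed name for the failure state
            raise RuntimeError("integration reported a failed state")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"integration still {status!r} after {timeout_s} s")
        sleep(poll_s)


# Example with canned statuses standing in for real lookups.
statuses = iter(["CREATING", "CREATING", "ACTIVE"])
print(wait_until_active(lambda: next(statuses), sleep=lambda _: None))  # ACTIVE
```

The default five-minute timeout is deliberately longer than the roughly one minute mentioned above, to leave room for slower launches.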
Clean up
Note that following these steps will permanently delete the resources created in this post; back up any important data before proceeding.
- Delete the Zero-ETL integration zero-etl-bulk-demo-integration.
- Delete content from the S3 bucket zeroetl-etl-bulk-demo-bucket.
- Delete the Data Catalog database zero_etl_bulk_demo_db.
- Delete the Data Catalog connection zero_etl_bulk_demo_conn.
- Delete the Secrets Manager secret zero_etl_bulk_demo_secret.
Conclusion
Salesforce Bulk API support in AWS Glue Zero-ETL marks a significant advancement in our data integration capabilities. By addressing the limitations of the REST API, efficiently handling wide-column entities and compound fields, and implementing robust error handling, AWS Glue Zero-ETL can now ingest larger volumes of Salesforce data more efficiently. This enhancement improves performance and opens up new possibilities for your organization to use its Salesforce data for analytics, machine learning, and other data-driven initiatives. As we continue to evolve AWS Glue Zero-ETL, we remain committed to providing solutions that help our customers get the most out of their data integration processes.