
How Equinor Optimized Its Seismic Data Pipeline with Databricks


The oil and gas industry relies heavily on seismic data to explore for and extract hydrocarbons safely and efficiently. However, processing and analyzing large volumes of seismic data can be a daunting task, requiring significant computational resources and expertise.

Equinor, a leading energy company, has used the Databricks Data Intelligence Platform to optimize one of its exploratory seismic data transformation workflows, achieving significant time and cost savings while improving data observability.

Equinor’s goal was to enhance one of its 4D seismic interpretation workflows, focusing on automating and optimizing the detection and classification of reservoir changes over time. This process helps identify drilling targets, reducing the risk of costly dry wells and promoting environmentally responsible drilling practices. Key business expectations included:

  • Optimal drilling targets: Improve target identification to drill a number of new wells in the coming decades.
  • Faster, cost-effective analysis: Reduce the time and cost of 4D seismic analysis through automation.
  • Deeper reservoir insights: Integrate more subsurface data to unlock improved interpretations and decision-making.

Understanding Seismic Data

Seismic Cubes: 3D models of the subsurface

Seismic data acquisition involves deploying air guns to generate sound waves, which reflect off subsurface structures and are captured by hydrophones. These sensors, placed on streamers towed by seismic vessels or positioned on the seafloor, collect raw data that is later processed to create detailed 3D images of the subsurface geology.

  • File Format: SEG-Y (Society of Exploration Geophysicists) – a file format for storing seismic data, developed in the 1970s and optimized for tape storage.
  • Data Representation: The processed data is stored as 3D cubes, providing a comprehensive view of subsurface structures (a minimal reading sketch follows this list).
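To make the format concrete, here is a minimal sketch of loading a post-stack SEG-Y file into an in-memory 3D cube using the open source SegyIO library mentioned later in this post. The file path is a hypothetical placeholder, and the snippet assumes a regularly gridded survey:

```python
# Minimal sketch: read a SEG-Y file into a 3D numpy cube with segyio.
import numpy as np
import segyio

SEGY_PATH = "/dbfs/mnt/seismic/survey_base.sgy"  # hypothetical location

# strict=True requires a regular inline/crossline geometry in the headers
with segyio.open(SEGY_PATH, "r", strict=True) as f:
    cube = segyio.tools.cube(f)          # numpy array: (inlines, crosslines, samples)
    ilines, xlines = f.ilines, f.xlines  # inline / crossline numbers from the headers
    samples = f.samples                  # two-way travel time (or depth) axis

print(cube.shape, cube.dtype)            # e.g. (n_ilines, n_xlines, n_samples) float32
```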

Fig. 1: Seismic survey – acquiring seismic data. Raw data is then processed into 3D cubes. Retrieved 15-06-2015 from “Specificity of Geotechnical Measurements and Practice of Polish Offshore Operations”, Krzysztof Wróbel, Bogumił Łączyński, The International Journal on Marine Navigation and Safety of Sea Transportation, Volume 9, Number 4, December 2015.

Seismic Horizons: Mapping Geological Boundaries

Seismic horizons are interpretations of seismic data, representing continuous surfaces within the subsurface. These horizons indicate geological boundaries, tied to changes in rock properties or even fluid content. By analyzing the reflections of seismic waves at these boundaries, geologists can identify key subsurface features.

  • File Format: CSV – commonly used for storing interpreted seismic horizon data.
  • Data Representation: Horizons are stored as 2D surfaces (a minimal loading sketch follows this list).
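As an illustration, a horizon exported as CSV can be loaded into a Spark DataFrame and persisted for downstream use. The column names (inline, xline, x, y, twt_ms), the file path, and the table name below are assumptions for the sketch, not Equinor’s actual schema:

```python
# Minimal sketch: load an interpreted horizon (CSV) into Spark and persist it as Delta.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

horizon = (
    spark.read.option("header", True)
    .option("inferSchema", True)
    .csv("/mnt/seismic/horizons/top_reservoir.csv")   # hypothetical path
)

# Keep only the columns needed to index the 2D surface
horizon = horizon.select("inline", "xline", "x", "y", F.col("twt_ms").cast("double"))

# Persist as a Delta table so downstream interpretation steps can reuse it
horizon.write.format("delta").mode("overwrite").saveAsTable("seismic.horizons_top_reservoir")
```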

Fig. 2: An example of two seismic horizons. From Open Inventor Toolkit / Seismic Horizon (Height Field)

Challenges with the Current Pipeline

The current seismic data pipeline processes data to generate the following key outputs:

  1. 4D Seismic Difference Cube: Tracks changes over time by comparing two seismic cubes of the same physical area, typically acquired months or years apart.
  2. 4D Seismic Difference Maps: These maps contain attributes or features derived from the 4D seismic cubes to highlight specific changes in the seismic data, aiding reservoir analysis (a minimal sketch of both computations follows this list).
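A simple way to picture these two outputs: the difference cube is a sample-by-sample comparison of two aligned vintages, and a difference map collapses the time axis of that cube with some attribute. The sketch below assumes the base and monitor surveys are already aligned and loaded as equally shaped numpy cubes; the RMS attribute and the window indices are illustrative choices, not the pipeline’s actual attribute set:

```python
# Minimal sketch: 4D difference cube and an example 4D attribute map.
import numpy as np

def difference_cube(base: np.ndarray, monitor: np.ndarray) -> np.ndarray:
    """Sample-by-sample change between two vintages of the same survey area."""
    assert base.shape == monitor.shape, "vintages must be aligned and equally shaped"
    return monitor - base

def rms_difference_map(diff: np.ndarray, t0: int, t1: int) -> np.ndarray:
    """RMS amplitude of the difference cube within a sample window [t0, t1),
    collapsing the time axis to a 2D map (inlines x crosslines)."""
    window = diff[:, :, t0:t1]
    return np.sqrt(np.mean(window ** 2, axis=-1))

# Usage (hypothetical window indices):
# diff = difference_cube(base_cube, monitor_cube)
# amp_map = rms_difference_map(diff, t0=120, t1=160)
```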

However, several challenges limit the efficiency and scalability of the current pipeline:

  • Suboptimal Distributed Processing: Relies on multiple standalone Python jobs running in parallel on single-node clusters, leading to inefficiencies.
  • Limited Resilience: Prone to failures and lacks mechanisms for error tolerance or automatic recovery.
  • Lack of Horizontal Scalability: Requires high-configuration nodes with substantial memory (e.g., 112 GB), driving up costs.
  • High Development and Maintenance Effort: Managing and troubleshooting the pipeline demands significant engineering resources.

Proposed Solution Architecture

To address these challenges, we re-architected the pipeline as a distributed solution using Ray and Apache Spark™, governed by Unity Catalog on the Databricks Platform. This approach significantly improved scalability, resilience, and cost efficiency.

Fig. 3: Proposed architecture diagram

We used the following technologies on the Databricks Platform to implement the solution:

  • Apache Spark™: An open source framework for large-scale data processing and analytics, ensuring efficient and scalable computation.
  • Databricks Workflows: For orchestrating data engineering, data science, and analytics tasks.
  • Delta Lake: An open source storage layer that ensures reliability through ACID transactions, scalable metadata handling, and unified batch and streaming data processing. It serves as the default storage format on Databricks.
  • Ray: A high-performance distributed computing framework, used to scale Python applications and enable distributed processing of SEG-Y files by leveraging SegyIO and the existing processing logic.
  • SegyIO: A Python library for reading and writing SEG-Y files, enabling seamless handling of seismic data (see the sketch after this list).
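To illustrate how Ray and SegyIO can work together, here is a minimal sketch that distributes per-inline processing of a SEG-Y file across Ray tasks. The file path and the per-inline computation (an RMS amplitude) are placeholders, not the pipeline’s actual logic, and on Databricks the Ray cluster would typically be started with ray.util.spark.setup_ray_cluster before calling ray.init():

```python
# Minimal sketch: distribute per-inline SEG-Y processing with Ray tasks.
import numpy as np
import ray
import segyio

ray.init(ignore_reinit_error=True)

@ray.remote
def process_inline(path: str, iline_no: int) -> dict:
    # Each task opens the file independently and reads a single inline section.
    with segyio.open(path, "r") as f:
        section = f.iline[iline_no]          # 2D array: (crosslines, samples)
    return {"iline": int(iline_no), "rms": float(np.sqrt(np.mean(section ** 2)))}

SEGY_PATH = "/dbfs/mnt/seismic/survey_base.sgy"   # hypothetical location
with segyio.open(SEGY_PATH, "r") as f:
    iline_numbers = list(f.ilines)

results = ray.get([process_inline.remote(SEGY_PATH, i) for i in iline_numbers])
```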

Key Benefits

The re-architected seismic data pipeline addressed inefficiencies in the existing pipeline while introducing scalability, resilience, and cost optimization. The following are the key benefits realized:

  • Significant Time Savings: Eliminated duplicate data processing by persisting intermediate results (e.g., 3D and 4D cubes), enabling reprocessing of only the necessary datasets.
  • Cost Efficiency: Reduced costs by up to 96% on specific calculation steps, such as map generation.
  • Failure-Resilient Design: Leveraged Apache Spark’s distributed processing framework to introduce fault tolerance and automatic task recovery.
  • Horizontal Scalability: Achieved horizontal scalability to overcome the limitations of the current solution, ensuring efficient scaling as data volume grows.
  • Standardized Data Format: Adopted an open, standardized data format to streamline downstream processing, simplify analytics, improve data sharing, and enhance governance and quality (a minimal persistence sketch follows this list).
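The sketch below shows the pattern behind the first and last benefits: persisting an intermediate result (here, the per-inline attribute values produced by the Ray sketch above) to a Unity Catalog-governed Delta table so later runs can skip recomputation. The catalog, schema, and table names are assumptions for illustration:

```python
# Minimal sketch: persist intermediate results to a Delta table in Unity Catalog.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 'results' is a list of dicts like {"iline": 1200, "rms": 0.42}
attributes_df = spark.createDataFrame(
    [(r["iline"], r["rms"]) for r in results],
    schema="iline INT, rms DOUBLE",
)

(
    attributes_df.write.format("delta")
    .mode("overwrite")
    .saveAsTable("seismic_catalog.exploration.inline_attributes")
)

# Downstream jobs read the persisted table instead of re-reading the SEG-Y files:
# spark.table("seismic_catalog.exploration.inline_attributes")
```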

Conclusion

This project highlights the immense potential of modern data platforms like Databricks in transforming traditional seismic data processing workflows. By integrating tools such as Ray, Apache Spark, and Delta Lake, and leveraging the Databricks platform, we achieved a solution that delivers measurable benefits:

  • Efficiency Gains: Faster data processing and fault tolerance.
  • Cost Reductions: A more economical approach to seismic data analysis.
  • Improved Maintainability: A simplified pipeline architecture and standardized technology stack reduced code complexity and development overhead.

The redesigned pipeline not only optimized seismic workflows but also set a scalable and robust foundation for future enhancements. It serves as a valuable model for other organizations aiming to modernize their seismic data processing while driving similar business outcomes.

Acknowledgments

Special thanks to the Equinor Data Engineering, Data Science and Analytics communities, and the Equinor Research and Development teams for their contributions to this initiative.

“Amazing experience working with Professional Services – very high technical competence and communication skills. Significant achievements in quite a short time.”

— Anton Eskov

