How Octus achieved 85% infrastructure cost reduction with zero downtime migration to Amazon OpenSearch Service


As data volumes continue to grow exponentially, there’s growing pressure to optimize search infrastructure costs while maintaining the high performance and reliability that mission-critical workloads demand. Many companies find themselves managing complex, expensive search systems that require significant operational overhead and limit their ability to scale efficiently. The challenge becomes even more acute when organizations need to migrate between search systems, a process that traditionally involves substantial downtime, complex data synchronization, and significant impact on business operations. Enterprise applications cannot afford service interruptions that could impact customer experiences, business intelligence, or operational continuity. Migration strategies need to deliver cost optimization and operational improvements while maintaining zero downtime and preserving full data integrity throughout the transition process.

Founded in 2013, Octus, formerly Reorg, is the essential credit intelligence and data provider for the world’s leading buy side firms, investment banks, law firms and advisory firms. By surrounding unparalleled human expertise with proven technology, data and AI tools, Octus unlocks powerful truths that fuel decisive action across financial markets.

This post highlights how Octus migrated its Elasticsearch workloads running on Elastic Cloud to Amazon OpenSearch Service. The journey traces Octus’s shift from managing multiple systems to adopting a cost-efficient solution powered by OpenSearch Service. Along the way, we share the architecture choices and implementation strategies that made the migration successful. The result is uninterrupted service availability throughout migration, with improved performance and better cost efficiency.

Strategic requirements

We identified several requirements that made Amazon OpenSearch Service the right choice for the migration:

  • Cost efficiency: The OpenSearch Service pricing model enabled us to optimize cloud spend without compromising performance.
  • Responsive support: AWS provided dependable, high-quality support to accelerate issue resolution and instill confidence.
  • Consistent reliability: OpenSearch Service offers an SLA of up to 99.99%, providing the reliability required for Octus’s mission-critical workloads.
  • Seamless migration with no query downtime: Migration Assistant for Amazon OpenSearch Service provided Octus with a migration path that maintained uninterrupted query availability during the migration, supporting business continuity.
  • Operational simplification: Consolidating onto AWS reduced infrastructure complexity while maintaining high security standards.

Solution overview

The Migration Assistant for Amazon OpenSearch Service provides a suite of tools to help with Elasticsearch to OpenSearch Service migrations. Octus used the following capabilities for their migration:

  • Metadata migration: The tool enabled Octus to migrate dozens of indices with diverse mappings and settings. When a backward incompatibility was identified with timestamp metadata, a custom JavaScript transformation, integrated directly into the Migration Assistant tooling, was applied to automatically adjust the mappings across the indices and maintain compatibility.
  • Historical data migration: Octus used Reindex-from-Snapshot to migrate the historical documents from a point-in-time snapshot of the source cluster. Because the snapshot was stored in Amazon Simple Storage Service (Amazon S3), this process scaled without impacting the source cluster. Reindex-from-Snapshot also enabled Octus to adjust the sharding scheme during migration, helping to optimize cluster performance on the target.
  • Live Traffic Replay: Once backfill was complete, Octus used Migration Assistant’s Traffic Replayer to send the captured live traffic (from the Traffic Capture Proxy) to the target cluster with the request transformations required for OpenSearch Service compatibility. As a result, the target cluster contained the documents from the source cluster, with updates applied in real time.

The following diagram illustrates the implementation architecture for this migration.



Figure 1 – Migration Assistant architecture with migration steps

For more information about the Migration Assistant for Amazon OpenSearch Service, visit the AWS Solutions home page.

Each node in the diagram corresponds to the following steps in the migration process:

  1. Client traffic is directed to the existing cluster.
  2. An Application Load Balancer with capture proxies relays traffic to the source while replicating data to Amazon Managed Streaming for Apache Kafka (Amazon MSK).
  3. Using the migration console, a point-in-time snapshot is taken. Once the snapshot completes, the Metadata Migration Tool is used to establish indexes, templates, component templates, and aliases on the target cluster. With continuous traffic capture in place, Reindex-from-Snapshot migrates data from the source.
  4. Once Reindex-from-Snapshot is complete, captured traffic is replayed from Amazon MSK to the target cluster by Traffic Replayer.
  5. Performance and behavior of traffic sent to the source and target clusters are compared by reviewing logs and metrics.
  6. After confirming that the target cluster’s functionality meets expectations, clients are redirected to the new target.
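To make the ordering of these steps concrete, the following is a toy, in-memory sketch of the capture, snapshot, backfill, and replay sequence. It is not the Migration Assistant implementation: `Cluster`, `CaptureProxy`, `backfill`, and `replay` are hypothetical names, a Python list stands in for the Amazon MSK topic, and writes are assumed to be keyed by document ID so that replaying them is idempotent.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """In-memory stand-in for a search cluster: document ID -> document body."""
    docs: dict = field(default_factory=dict)

    def index(self, doc_id, body):
        self.docs[doc_id] = body


@dataclass
class CaptureProxy:
    """Forwards each write to the source and appends it to a capture log.

    The list plays the role of the Kafka topic in the architecture (assumption).
    """
    source: Cluster
    log: list = field(default_factory=list)

    def index(self, doc_id, body):
        self.log.append((doc_id, body))
        self.source.index(doc_id, body)


def backfill(snapshot, target):
    """Load a point-in-time snapshot into the target (the backfill step)."""
    for doc_id, body in snapshot.items():
        target.index(doc_id, body)


def replay(log, target):
    """Re-apply captured writes in their original order (the replay step)."""
    for doc_id, body in log:
        target.index(doc_id, body)


source, target = Cluster(), Cluster()
proxy = CaptureProxy(source)

proxy.index("a", {"v": 1})      # traffic before the snapshot
snapshot = dict(source.docs)    # point-in-time snapshot of the source
proxy.index("b", {"v": 1})      # new document written during backfill
proxy.index("a", {"v": 2})      # update to an already-snapshotted document

backfill(snapshot, target)      # target now holds the snapshot contents
replay(proxy.log, target)       # target catches up with captured writes
```

After the replay, `target.docs` matches `source.docs`, including the update to document `a` that arrived after the snapshot was taken. This is why capture has to be in place before the snapshot: writes that land during backfill are only recoverable from the capture log.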

Complete migration and optimization journey

Octus’s migration from Elastic Cloud to Amazon OpenSearch Service encompassed both the core migration effort and subsequent optimization phases. The goal was to successfully migrate the search infrastructure, applications, and data from Elastic Cloud to a new OpenSearch Service domain with minimal disruption, while continuously optimizing performance and costs based on real-world usage data.

Octus used their in-house custom infrastructure frameworks (their internal tooling for infrastructure automation) to build, deploy and monitor the target OpenSearch Service 1.3 domain, establishing a solid foundation for the migration. This approach used familiar internal processes while moving to the fully managed AWS service. Refer to AWS documentation to implement security best practices when using OpenSearch Service.

Pre-migration optimization

Prior to initiating the migration, Octus performed optimization activities on the source Elasticsearch cluster to streamline the migration process. This included removing unused indexes that had accumulated over time and removing large documents that would unnecessarily extend migration duration and increase storage transfer costs. These preparatory steps significantly reduced the data volume requiring migration and minimized the overall migration complexity, enabling more efficient use of the Migration Assistant tools.

Technical constraints and version considerations

The migration involved specific version compatibility challenges that influenced the technical approach. The source Elasticsearch cluster was running version 7.17, and the Python client applications were also constrained to Elasticsearch 7.17 compatibility. To support the transition, the team used Reindex-from-Snapshot (RFS), which enables cross-system migrations by reindexing data from existing snapshots into a new OpenSearch Service cluster. RFS also rewrites indices created on older versions of Lucene, simplifying future upgrades to the latest version of OpenSearch Service. While evaluating a move to OpenSearch 1 or 2, Octus selected OpenSearch 1.3 as the target to minimize client-side changes and reduce migration complexity, while positioning themselves for simpler upgrades later.

The version choice particularly impacted the R application environment, as the R language (an open-source programming language for statistical computing and data analysis) lacked native OpenSearch 1.3 client support. This constraint required Octus to develop a custom client solution using the ropensci/elastic library to integrate with the new OpenSearch Service domain. The Python environment presented similar challenges, where the Elasticsearch 7.17 client constraints necessitated careful consideration of the migration approach. These client compatibility concerns were among the factors that influenced the choice of Migration Assistant tools over traditional snapshot-based methods, as the Migration Assistant provided better support for managing version-specific client interactions during the transition.

Looking ahead, Octus plans to upgrade to newer OpenSearch versions as their application stack evolves and client library support matures, so that they can use the latest features and performance improvements while maintaining the stability achieved through this migration.

Application modernization across multiple languages

The application changes represented a significant technical undertaking across multiple programming environments:

  • Legacy PHP systems (5.6 and Laravel 4.2): Octus handled mapping type deprecation on OpenSearch requests, because specifying mapping types is not supported, while continuing to use the elasticsearch connector library with username/password authentication.
  • Modern PHP applications (8.1 and Laravel 9): These underwent more comprehensive changes, replacing the elasticsearch/elasticsearch library with the opensearch-project/opensearch-php client and using IAM authentication to connect to the clusters.
  • Python environment: Applications spanning versions 3.8, 3.10, 3.11, and 3.13 with Django frameworks 2.1, 3.2, and 5.2 required replacing the elasticsearch library with opensearch-py and transitioning to IAM authentication.
  • R applications: For R 4.5.1 applications, Octus used a custom client built on the ropensci/elastic library to maintain compatibility.
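As an illustration of what handling mapping type deprecation can involve, the sketch below unwraps a single named mapping type from a create-index request body so that the body is typeless. This is a hypothetical helper written in Python for consistency with the other examples here; it is not Octus’s PHP code, and it assumes the typed body has exactly one type whose value contains a `properties` object.

```python
def strip_mapping_type(body):
    """Return a create-index body with a single mapping type unwrapped.

    Typed:    {"mappings": {"<type>": {"properties": {...}}}}
    Typeless: {"mappings": {"properties": {...}}}
    Bodies that already look typeless are returned unchanged.
    """
    mappings = body.get("mappings")
    # Only a mapping with exactly one top-level key can be a single typed mapping.
    if not isinstance(mappings, dict) or len(mappings) != 1:
        return body
    (inner,) = mappings.values()
    # Unwrap only when the single key is a type name wrapping "properties".
    if isinstance(inner, dict) and "properties" in inner and "properties" not in mappings:
        return {**body, "mappings": inner}
    return body


typed = {
    "settings": {"number_of_shards": 1},
    "mappings": {"_doc": {"properties": {"title": {"type": "text"}}}},
}
typeless = strip_mapping_type(typed)
```

Here `typeless["mappings"]` is `{"properties": {"title": {"type": "text"}}}`, the settings are carried over untouched, and calling the helper again on `typeless` returns it unchanged.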

Traffic routing and enhanced monitoring

To facilitate the migration, Octus redirected their existing clients to route requests to the source cluster through Migration Assistant’s Traffic Capture Proxy, which allowed the data from live traffic to be migrated to their target cluster.

The monitoring infrastructure underwent significant enhancement during this process. Octus’s observability infrastructure monitors the overall health of OpenSearch Service clusters, which includes cluster manager and data nodes, network, data storage, security and IAM access. It also monitors the indexing and search performance of their applications. This removed the need for a separate monitoring cluster, as logs and metrics were shipped directly to Datadog, significantly improving observability. The Datadog monitors were defined using Infrastructure-as-Code and integrated seamlessly into their infrastructure frameworks.

Cutover and initial results

The Site Reliability Engineering team meticulously planned the release, achieving a successful migration from Elasticsearch to OpenSearch Service and cutover of the Elasticsearch clients to the OpenSearch Service clients with no downtime for the system application and zero data loss. The initial migration phase resulted in a 52% cost reduction, along with operational benefits including full Infrastructure-as-Code implementation for infrastructure and monitoring, and enhanced observability.

Post-migration optimization

Following the migration, Octus conducted comprehensive optimization based on operational data from production and other environments in the new OpenSearch Service setup. This real-world usage data provided valuable insights into actual resource consumption, enabling informed decisions regarding further cluster resizing.

Through usage metric analysis and strategic resizing, Octus aligned cluster size more precisely with operational needs, maintaining performance while minimizing expenditure. This optimization phase delivered an additional 33% cost reduction relative to the original Elastic Cloud costs, bringing the total reduction to 85% while maintaining consistent and optimal performance.
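The two percentages add up because both are expressed against the original Elastic Cloud cost. A short worked example, using a hypothetical baseline of 100 cost units (the actual spend is not stated in this post):

```python
baseline = 100.0  # hypothetical monthly Elastic Cloud cost, arbitrary units

# 52% reduction at cutover, measured against the baseline.
after_migration = baseline - baseline * 0.52

# A further 33% of the *baseline* removed by post-migration resizing.
after_resizing = after_migration - baseline * 0.33

# Combined reduction against the baseline: 0.52 + 0.33 = 0.85.
total_reduction = (baseline - after_resizing) / baseline

# The same resizing step expressed against the post-migration cost: 33 / 48.
relative_to_post_migration = (after_migration - after_resizing) / after_migration

print(f"total reduction: {total_reduction:.0%}")
```

Under this reading, the resizing step alone cut the post-migration bill by roughly two thirds (33 out of 48 units), which is a derived figure rather than one reported by Octus.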

Operational monitoring

Octus uses Datadog to monitor both search and indexing latency, providing real-time visibility into Amazon OpenSearch Service cluster performance. The following screenshot shows how custom Datadog dashboards provide a live view of the OpenSearch Service clusters. This visualization presents both a high-level overview and detailed insights into the ingestion process, helping us understand the storage and document count. The bottom half of the dashboard presents a time-series view of individual node health and performance metrics like read and write latency, throughput and IOPS.



Figure 2 – Datadog dashboards

Migration observability

Migration Assistant for Amazon OpenSearch Service provides several dashboards to observe and validate the progress of a migration. By using these observability features, customers can monitor both backfill and live capture and replay progress, building confidence before switching production workloads to the target cluster. The following graphs are an example from Octus’s migration, where approximately 4 TB of data was migrated in about 9 hours (from 08:00 to 17:00).



Figure 3 – Backfill progress by disk usage



Figure 4 – Backfill progress by searchable documents
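For a sense of scale, the backfill figures quoted above imply an average rate that can be estimated as follows. This is a rough calculation that assumes decimal terabytes and a steady rate over the whole window, neither of which is stated in the post.

```python
data_bytes = 4 * 10**12    # ~4 TB, taken as decimal terabytes (assumption)
duration_s = 9 * 3600      # 08:00 to 17:00, in seconds

# Average backfill rate in megabytes per second.
throughput_mb_s = data_bytes / duration_s / 10**6
print(f"{throughput_mb_s:.0f} MB/s")  # prints "123 MB/s"
```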

Once the backfill is complete, the captured traffic is replayed to synchronize ongoing activity between the source and target clusters.

At the time the backfill finished (around 17:00), the target cluster was approximately 467 minutes behind the source. The replay process quickly reduced this lag by processing captured traffic at a faster rate than it was originally ingested on the source.



Figure 5 – Replay lag after backfill completion

When the lag time reached 0, the target cluster was fully in sync and production traffic could safely be rerouted. Octus chose to observe replayed traffic on the target for several days before making the final switchover.
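How quickly the lag drains depends on how much faster the replay runs than the live traffic arrives. The following sketch models that relationship under a simplifying assumption of constant rates; `catch_up_minutes` is a hypothetical helper and the speedup factors are illustrative, since the actual replay rate is not given in this post.

```python
def catch_up_minutes(initial_lag_min, speedup):
    """Minutes of replay needed to drain a lag at a constant replay speedup.

    Each replay minute clears `speedup` minutes of captured traffic while one
    new minute of traffic accumulates, so the lag shrinks by (speedup - 1)
    minutes per minute of replay.
    """
    if speedup <= 1:
        raise ValueError("replay must be faster than capture to catch up")
    return initial_lag_min / (speedup - 1)


# Starting from the 467-minute lag observed when the backfill finished:
at_2x = catch_up_minutes(467, 2.0)   # 467.0 minutes
at_5x = catch_up_minutes(467, 5.0)   # 116.75 minutes
```

A replay that is only marginally faster than live traffic takes a long time to converge, which is one reason to track the lag metric rather than assume a fixed cutover time.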

Achieving excellence

Octus’s migration to Amazon OpenSearch Service has yielded remarkable results:

  • Scalability – Octus has almost doubled the number of documents available for Q&A across three environments in days instead of weeks. Their use of Amazon Elastic Container Service (Amazon ECS) with AWS Fargate, with auto scaling rules and controls, gives them elastic scalability for their services during peak usage hours.
  • Cost reduction – By moving away from Elastic Cloud to OpenSearch Service, Octus’s monthly infrastructure costs are now 85% lower.
  • Enhanced search performance – Octus maintained consistent response times throughout the migration with no negative impact on latency, while achieving a 20% improvement in query throughput and overall search performance.
  • Zero downtime – Octus experienced zero downtime during migration and 100% uptime overall for the whole application.
  • Reduced operational overhead – Post-migration, Octus’s DevOps and SRE teams see 30% less maintenance burden and overhead. Supporting SOC2 compliance is also straightforward now that they’re using one system.
  • Accelerated delivery timeline – The entire migration was completed ahead of schedule, moving from planning to full completion in under one quarter.

“Moving from Elastic Cloud to Amazon OpenSearch Service was a key component of our broader strategy to minimize third-party dependencies and strengthen the reliability of Octus’ system infrastructure. Migration Assistant for Amazon OpenSearch Service enabled us to execute a seamless transition with zero data loss and virtually no downtime for our users.” – Vishal Saxena, CTO, Octus

Conclusion

In this post, we showed you how Octus successfully migrated their Elasticsearch workloads from Elastic Cloud to Amazon OpenSearch Service using the Migration Assistant for OpenSearch Service, achieving zero downtime and significant operational improvements.

The Migration Assistant for OpenSearch Service supported this complex migration through its comprehensive suite of tools. The Metadata Migration capability migrated dozens of indices with diverse mappings and settings, with custom JavaScript transformations handling backward incompatibilities. Reindex-from-Snapshot migrated the historical documents from point-in-time snapshots without impacting the source cluster, while also optimizing the sharding scheme for improved performance. Live Traffic Replay made sure the target cluster remained synchronized with real-time updates throughout the migration process.

The migration delivered substantial results across multiple dimensions. Octus achieved an 85% reduction in monthly infrastructure costs while nearly doubling the number of documents available for search across three environments. Query throughput improved by 20%, with consistent response times and no negative impact on latency. The migration maintained zero downtime and 100% uptime for the entire application, with DevOps and SRE teams experiencing 30% less maintenance burden and operational overhead. The entire migration was completed ahead of schedule in under one quarter.

To learn more about the Migration Assistant for OpenSearch Service and how it can help you achieve similar results, visit the AWS Solutions home page.

Visit Octus to learn how we deliver rigorously verified intelligence at speed and create a complete picture for professionals across the entire credit lifecycle. Follow Octus on LinkedIn and X.


About the Authors

Harmandeep Sethi

Harmandeep is Head of SRE Engineering and Infrastructure Frameworks at Octus, with nearly 10 years of experience leading high-performing teams in the implementation of large-scale systems. He has played a pivotal role in transforming and modernizing Octus’s search engine infrastructure and services by driving best practices in observability, resilience engineering, and the automation of operational processes through Infrastructure Frameworks.

Serhii Shevchenko

Serhii is a Site Reliability Engineer at Octus. With 9 years of combined experience in software development and site reliability engineering, his expertise focuses on improving system reliability and performance. He was a key developer on the application side for the company’s critical migration from Elasticsearch Cloud to AWS OpenSearch. His planning was instrumental in executing the transition with zero client-facing downtime.

Govind Bajaj

Govind is a Senior Site Reliability Engineer at Octus, specializing in architecting and implementing scalable infrastructure that supports high-performing engineering teams and critical systems. With over 8 years of experience, he excels at breaking down complex problems and turning them into practical, well-designed solutions, with a strong focus on building secure, observable, and resilient platforms.

Virendra Shinde

Virendra is the Head of Platform at Octus, where he oversees cloud infrastructure, site reliability, and the core frameworks that power the Octus product suite. Before joining Octus, he spent two years at Grayscale Investments building an investor portal and data APIs from the ground up. Prior to that, he spent eight years at Blackstone leading multiple development teams. He holds a Master’s degree in Information Management from the University of Maryland.

Brian Presley

Brian is a Software Development Manager at OpenSearch, leading teams behind OpenSearch Migrations and OpenSearch Serverless to build scalable, high-impact search and analytics solutions.

Andre Kurait

Andre is a Software Development Engineer II at AWS, based in Austin, Texas. He’s currently working on Migration Assistant for Amazon OpenSearch Service. Prior to joining Amazon OpenSearch, Andre worked within Amazon Health Services. In his free time, Andre enjoys traveling, cooking, and playing in his church sport leagues. Andre holds Bachelor of Science degrees from the University of Kansas in Computer Science and Mathematics.

Vaibhav Sabharwal

Vaibhav is a Senior Solutions Architect at AWS based out of New York. He’s passionate about learning new cloud technologies and assisting customers in building cloud adoption strategies, designing innovative solutions, and driving operational excellence. As a member of the Financial Services and Storage Technical Field Communities at AWS, he actively contributes to the collaborative efforts within the industry.
