
Amazon MSK Replicator and MirrorMaker2: Choosing the right replication strategy for Apache Kafka disaster recovery and migrations


Customers need to replicate data from their Apache Kafka clusters for a variety of reasons, such as compliance requirements, cluster migrations, and disaster recovery (DR) implementations. However, the right replication strategy can vary depending on the application context. In this post, we walk through the considerations for using Amazon MSK Replicator over Apache Kafka’s MirrorMaker 2, and help you choose the right replication solution for your use case. We also discuss how to make applications that use Amazon Managed Streaming for Apache Kafka (Amazon MSK) resilient to disasters with a multi-Region Kafka architecture built on MSK Replicator.

Challenges with choosing DR strategies

Customers create business continuity plans and DR strategies to maximize resiliency for their applications, because downtime or data loss can result in lost revenue or halted operations. DR planning helps the business continue running in the event of a disaster impacting a subset of its application architecture. For customers using Kafka as a core streaming and messaging service in their applications, planning for DR for their Kafka infrastructure is an important part of meeting their application's Recovery Time Objective (RTO) and Recovery Point Objective (RPO) targets.

Amazon MSK is a fully managed service that makes it simple to build and run Kafka to process streaming data. Amazon MSK provides high availability by offering multi-AZ configurations to distribute brokers across multiple Availability Zones within an AWS Region. A single MSK cluster deployment provides message durability through intra-cluster data replication. Data replication with a replication factor of 3 and a min-ISR value of 2, along with the producer setting acks=all, provides the strongest availability guarantees, because it makes sure other brokers in the cluster acknowledge receiving the data before the leader broker responds to the producer. This design provides robust protection against single-broker failure as well as single-AZ failure.
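The arithmetic behind those settings can be sketched in a few lines of Python. This is a minimal illustration, not MSK tooling: the helper name `tolerated_broker_failures` is hypothetical, and the sketch assumes all replicas of a partition are in sync before the failure and that each replica lives on a different broker.

```python
def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int) -> int:
    """Replica-hosting brokers that can fail while acks=all writes still succeed.

    Illustrative assumption: all replicas start in sync. With acks=all, the
    leader rejects writes once the in-sync replica set shrinks below
    min.insync.replicas, so the headroom is the difference between the two.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("need 1 <= min_insync_replicas <= replication_factor")
    return replication_factor - min_insync_replicas


# Settings named in the paragraph above, written as plain dictionaries.
topic_settings = {"replication.factor": 3, "min.insync.replicas": 2}
producer_settings = {"acks": "all"}

headroom = tolerated_broker_failures(
    topic_settings["replication.factor"],
    topic_settings["min.insync.replicas"],
)
print(f"acks={producer_settings['acks']}: writes survive {headroom} broker failure(s)")
```

If the three replicas of each partition sit in three different Availability Zones, that headroom of one broker is what lets producers keep writing through the loss of a single AZ.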

For enhanced resilience within a single Region, Amazon MSK also offers Express brokers, which significantly improve Kafka cluster reliability, throughput, and recovery times. Express brokers include pay-as-you-go storage, automated best-practice reliability configurations, no maintenance windows, and faster broker scaling and recovery. This architecture reduces recovery time, minimizes the chance of misconfiguration errors, and increases throughput, making your Kafka clusters more resilient across Availability Zones.

However, if an unlikely issue impacts your applications or infrastructure across more than one Availability Zone, the architecture outlined in this post can help you prepare for, respond to, and recover from it.

For companies that can withstand a longer RTO but require a lower RPO on Amazon MSK, backing up data to Amazon Simple Storage Service (Amazon S3) is sufficient as a DR plan. This approach requires you to think through how to handle restarting the application after a DR failover. In this approach, you build a system to recover the data from Amazon S3 to Kafka topics (as described in Back up and restore Kafka topic data using Amazon MSK Connect). Depending on the volume of data being restored, recovery might take a long time. Additionally, you must consider how to handle consumer group offsets, and whether to allow applications to consume from the latest offset in the restored Kafka topics. Because of the high RTO, as well as the complexity and challenges associated with this approach, most streaming use cases rely on the availability of the MSK cluster itself for their business continuity plan. In these cases, setting up MSK clusters in multiple Regions and configuring data replication between clusters provides the required business resilience and continuity.

Choosing the right replication solution: MSK Replicator vs MirrorMaker 2

AWS recommends two primary solutions for cross-Region Kafka replication: MSK Replicator and MirrorMaker 2. Understanding when to use each solution is crucial for designing an effective DR strategy.

MSK Replicator: For most MSK cluster replications in the same account

MSK Replicator is a fully managed, serverless Kafka replication service that makes it simple to reliably replicate data across MSK clusters in different Regions or within the same Region. MSK Replicator is the recommended solution for application scenarios that replicate data within the same AWS account. MSK Replicator has the following benefits:

  • Replication between MSK clusters – It supports replicating between MSK clusters in the same AWS account (including active-active or active-passive DR architectures for Amazon MSK)
  • No infrastructure management – It’s fully serverless with automatic scaling and simple setup through the AWS Management Console, AWS Command Line Interface (AWS CLI), or APIs
  • Built-in monitoring – It’s integrated with Amazon CloudWatch metrics and logging
  • Built-in high availability – As a managed service, it offers built-in fault tolerance across Availability Zones

MirrorMaker 2: For migrations and complex or hybrid scenarios

MirrorMaker 2 (MM2) remains the preferred solution for specific use cases that require more flexibility or involve non-Amazon MSK environments. MM2 is a utility bundled with Kafka that replicates data between Kafka clusters using the Kafka Connect framework.

We recommend MirrorMaker 2 for the following use cases:

  • Cross-account replication – Replicating data between MSK clusters in different AWS accounts
  • Migrations to Amazon MSK – Migrating from existing Kafka clusters on premises, in other clouds, or on self-managed Amazon Elastic Compute Cloud (Amazon EC2) deployments
  • Cross-cloud or hybrid cloud scenarios – Replicating between Kafka running on premises or on different cloud providers and Amazon MSK for disaster recovery or data analytics use cases
  • Using mTLS or SASL/SCRAM authentication – When you need mutual TLS certificate-based or SASL/SCRAM authentication and can’t enable AWS Identity and Access Management (IAM) authentication on your MSK cluster (for replication from one MSK cluster to another in these scenarios, you can still use MSK Replicator by enabling IAM authentication in addition to existing authentication methods)
  • Custom replication policies – Advanced topic naming or transformation requirements
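The selection guidance in the two lists above can be condensed into a small decision helper. The following Python sketch is only a simplification of that guidance: the `ReplicationScenario` fields and the `recommend_replication_tool` function are hypothetical names introduced for illustration, and real decisions will involve factors (cost, operations, networking) that the sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class ReplicationScenario:
    """Hypothetical description of a replication requirement."""
    source_is_msk: bool
    target_is_msk: bool
    same_aws_account: bool
    iam_auth_possible: bool  # IAM auth can be enabled alongside mTLS or SASL/SCRAM
    needs_custom_replication_policy: bool = False


def recommend_replication_tool(scenario: ReplicationScenario) -> str:
    """Return the tool the guidance above points to for this scenario."""
    # Migrations, hybrid, and cross-cloud cases involve a non-MSK cluster.
    if not (scenario.source_is_msk and scenario.target_is_msk):
        return "MirrorMaker 2"
    # Cross-account replication is listed as a MirrorMaker 2 use case.
    if not scenario.same_aws_account:
        return "MirrorMaker 2"
    # MSK Replicator needs IAM authentication enabled on the MSK clusters.
    if not scenario.iam_auth_possible:
        return "MirrorMaker 2"
    # Advanced topic naming or transformations call for MirrorMaker 2.
    if scenario.needs_custom_replication_policy:
        return "MirrorMaker 2"
    return "MSK Replicator"


cross_region_dr = ReplicationScenario(
    source_is_msk=True, target_is_msk=True,
    same_aws_account=True, iam_auth_possible=True,
)
on_prem_migration = ReplicationScenario(
    source_is_msk=False, target_is_msk=True,
    same_aws_account=True, iam_auth_possible=True,
)
print(recommend_replication_tool(cross_region_dr))
print(recommend_replication_tool(on_prem_migration))
```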

In the following sections, we discuss the architecture and deployment approaches for use cases where MSK Replicator and MirrorMaker 2 are the right choices.

MSK Replicator solution overview

The following diagram illustrates the architecture for using MSK Replicator.

We create two MSK clusters: one in the primary Region, the other in the secondary Region as a standby cluster for disaster recovery. We deploy MSK Replicator in the secondary Region to replicate topics, ACLs, data, and consumer group offsets from the primary cluster. In this solution, we showcase single-direction replication for active-passive disaster recovery. This solution can also be extended for active-active disaster recovery scenarios. Our Kafka clients connect to the primary cluster and can be configured to connect to the secondary cluster in the event of a disaster recovery failover.
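As a rough sketch of what this single-direction setup looks like through the API, the Python function below assembles a request body intended for the `create_replicator` operation of the boto3 `kafka` client. The function name `build_replicator_request`, the ARNs, subnet IDs, and security group IDs are placeholders invented for illustration, and the field names reflect our reading of the CreateReplicator API, so verify them against the current API reference before use.

```python
def build_replicator_request(name, source_arn, target_arn, role_arn,
                             source_vpc, target_vpc):
    """Assemble a CreateReplicator request for one-way (active-passive) replication.

    source_vpc / target_vpc are dicts with "SubnetIds" and "SecurityGroupIds".
    Field names are an assumption based on the MSK CreateReplicator API.
    """
    return {
        "ReplicatorName": name,
        "ServiceExecutionRoleArn": role_arn,
        "KafkaClusters": [
            {"AmazonMskCluster": {"MskClusterArn": source_arn}, "VpcConfig": source_vpc},
            {"AmazonMskCluster": {"MskClusterArn": target_arn}, "VpcConfig": target_vpc},
        ],
        "ReplicationInfoList": [{
            "SourceKafkaClusterArn": source_arn,
            "TargetKafkaClusterArn": target_arn,
            "TargetCompressionType": "NONE",
            "TopicReplication": {
                "TopicsToReplicate": [".*"],              # all topics
                "CopyTopicConfigurations": True,
                "CopyAccessControlListsForTopics": True,  # replicate ACLs
                "DetectAndCopyNewTopics": True,
            },
            "ConsumerGroupReplication": {
                "ConsumerGroupsToReplicate": [".*"],      # all consumer groups
                "SynchroniseConsumerGroupOffsets": True,  # replicate offsets
                "DetectAndCopyNewConsumerGroups": True,
            },
        }],
    }


# Placeholder identifiers, not real resources.
request = build_replicator_request(
    name="primary-to-secondary",
    source_arn="arn:aws:kafka:us-east-1:111122223333:cluster/primary/EXAMPLE",
    target_arn="arn:aws:kafka:us-west-2:111122223333:cluster/secondary/EXAMPLE",
    role_arn="arn:aws:iam::111122223333:role/ExampleReplicatorRole",
    source_vpc={"SubnetIds": ["subnet-aaaa"], "SecurityGroupIds": ["sg-aaaa"]},
    target_vpc={"SubnetIds": ["subnet-bbbb"], "SecurityGroupIds": ["sg-bbbb"]},
)
# With boto3 installed and credentials configured, the call would look like:
#   boto3.client("kafka", region_name="us-west-2").create_replicator(**request)
```

The client is created in the secondary Region in the commented call because that is where the replicator is deployed in this architecture.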

For details on implementation steps, refer to Introducing Amazon MSK Replicator – Fully Managed Replication across MSK Clusters in Same or Different AWS Regions. For details on disaster recovery scenarios, refer to Use replication to increase the resiliency of a Kafka streaming application across Regions. These resources provide the following benefits:

  • Complete deployment steps – Step-by-step deployment process for MSK Replicator between Regions
  • Comprehensive examples – Multiple deployment scenarios and configurations
  • Failover process – Key steps in executing a disaster recovery failover when using MSK Replicator

MirrorMaker 2 solution overview

The following diagram illustrates the architecture for using MirrorMaker 2.

We create an MSK cluster in the primary Region, with the existing Kafka cluster on premises. This Kafka cluster is analogous to Kafka clusters running in other clouds, or to self-managed Kafka clusters on Amazon EC2. In this solution, we showcase single-direction replication for cluster migration scenarios. Our Kafka clients interact with the on-premises Kafka cluster and can be migrated to run on AWS and interact with the MSK cluster.
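To make the single-direction flow concrete, the Python sketch below renders a minimal properties file for MirrorMaker 2's dedicated mode. The function name `render_mm2_properties` and the bootstrap addresses are hypothetical, and the settings are one plausible starting point (including `IdentityReplicationPolicy`, available in recent Kafka releases, to keep topic names unchanged during a migration); the deployment resources referenced in this post may configure MM2 differently.

```python
def render_mm2_properties(source_bootstrap: str, target_bootstrap: str) -> str:
    """Render a minimal one-way MirrorMaker 2 configuration as properties text."""
    settings = {
        # Cluster aliases; "source" is the existing cluster, "target" is MSK.
        "clusters": "source, target",
        "source.bootstrap.servers": source_bootstrap,
        "target.bootstrap.servers": target_bootstrap,
        # Replicate in one direction only.
        "source->target.enabled": "true",
        "target->source.enabled": "false",
        # Mirror every topic; narrow this regex as needed.
        "source->target.topics": ".*",
        # Replication factor for the topics MM2 creates on the target.
        "replication.factor": "3",
        # Translate and sync consumer group offsets to the target cluster.
        "sync.group.offsets.enabled": "true",
        # Keep original topic names instead of prefixing them with the alias.
        "replication.policy.class":
            "org.apache.kafka.connect.mirror.IdentityReplicationPolicy",
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


# Placeholder broker addresses, not real endpoints.
properties_text = render_mm2_properties(
    "onprem-broker-1:9092",
    "b-1.example-msk-cluster:9098",
)
print(properties_text)
```

Authentication settings for each cluster are omitted here; they would be added as further `source.`- and `target.`-prefixed client properties.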

Rather than manually configuring each component, we recommend using the automated deployment resources available in the following GitHub repository. For a step-by-step walkthrough of deploying MirrorMaker 2 on Amazon ECS with Fargate using auto scaling, refer to Amazon MSK Migration Workshop: Modernizing with Express Brokers. These resources provide the following benefits:

  • Infrastructure as code – Terraform for MSK clusters and supporting infrastructure
  • Containerized Kafka Connect – Docker images optimized for AWS
  • Amazon ECS with AWS Fargate deployment – Scalable, serverless container deployment using Amazon Elastic Container Service (Amazon ECS) with AWS Fargate
  • Auto scaling configuration – Automatic scaling based on workload demands
  • Comprehensive examples – Multiple deployment scenarios and configurations
  • Migration process – Key steps in executing a Kafka migration using MM2

Conclusion

Choosing the right replication solution depends on your specific requirements. We recommend MSK Replicator when you replicate from one MSK cluster to another and want a fully managed solution for disaster recovery. We recommend MirrorMaker 2 for migrations to Amazon MSK, hybrid environments, or when you need complex custom replication policies.

For MSK Replicator deployments, refer to Introducing Amazon MSK Replicator – Fully Managed Replication across MSK Clusters in Same or Different AWS Regions and Use replication to increase the resiliency of a Kafka streaming application across Regions.

For MirrorMaker 2 deployments, refer to the GitHub repository and Amazon MSK Migration Workshop to implement production-ready solutions with automated deployment, monitoring, and scaling capabilities.

These approaches provide a customizable set of options for the data redundancy and business continuity capabilities needed to meet regulatory compliance and disaster recovery requirements, while minimizing operational overhead through automation and best practices.


About the Author

Mazrim Mehrtens


Mazrim is a Sr. Specialist Solutions Architect for messaging and streaming workloads. Mazrim works with customers to build and support systems that process and analyze terabytes of streaming data in real time, run enterprise Machine Learning pipelines, and create systems to share data across teams seamlessly with varying data toolsets and software stacks.
