AMAZON MSK – MANAGED STREAMING FOR APACHE KAFKA

Fully managed, highly available, and secure Apache Kafka service.

Efficiently scale your data streaming

Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache Kafka is an open-source platform for building real-time streaming data pipelines and applications.

With Amazon MSK, you can use Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications. Apache Kafka clusters are challenging to set up, scale, and manage in production. When you run Apache Kafka on your own, you need to provision servers, configure Apache Kafka manually, replace servers when they fail, orchestrate server patches and upgrades, architect the cluster for high availability, ensure data is durably stored and secured, set up monitoring and alarms, and carefully plan scaling events to support load changes. Amazon MSK makes it easy for you to build and run production applications on Apache Kafka without needing Apache Kafka infrastructure management expertise. That means you spend less time managing infrastructure and more time building applications.

With a few clicks in the Amazon MSK console you can create highly available Apache Kafka clusters with settings and configuration based on Apache Kafka’s deployment best practices. Amazon MSK automatically provisions and runs your Apache Kafka clusters. Amazon MSK continuously monitors cluster health and automatically replaces unhealthy nodes with no downtime to your application. In addition, Amazon MSK secures your Apache Kafka cluster by encrypting data at rest.
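The same cluster creation can also be scripted instead of clicked. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and its `kafka` client; the helper names, the Kafka version, the broker instance type, and the volume size are illustrative assumptions, not recommendations from this page. The request is built by a pure function so it can be inspected before anything is sent to AWS.

```python
from typing import Any, Dict, List


def build_create_cluster_request(
    cluster_name: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
    kafka_version: str = "3.6.0",           # assumption: pick a version MSK offers
    instance_type: str = "kafka.m5.large",  # assumption: illustrative broker size
    volume_size_gb: int = 100,              # assumption: illustrative storage size
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 `kafka` client's create_cluster call."""
    if len(subnet_ids) < 2:
        raise ValueError("provide subnets in at least two Availability Zones")
    return {
        "ClusterName": cluster_name,
        "KafkaVersion": kafka_version,
        # One broker per subnet, so brokers are spread across Availability Zones.
        "NumberOfBrokerNodes": len(subnet_ids),
        "BrokerNodeGroupInfo": {
            "InstanceType": instance_type,
            "ClientSubnets": subnet_ids,
            "SecurityGroups": security_group_ids,
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": volume_size_gb}},
        },
        # Require TLS between clients and brokers and between brokers.
        "EncryptionInfo": {
            "EncryptionInTransit": {"ClientBroker": "TLS", "InCluster": True},
        },
    }


def create_cluster(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to Amazon MSK (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    return boto3.client("kafka").create_cluster(**request)
```

A call such as `create_cluster(build_create_cluster_request("demo", ["subnet-a", "subnet-b", "subnet-c"], ["sg-1"]))` would then submit the request; the subnet and security group IDs here are placeholders.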



BENEFITS


  • Fully managed

    Amazon MSK manages the provisioning, configuration, and maintenance of Apache Kafka clusters and Apache ZooKeeper nodes. Key Apache Kafka performance metrics are also shown in the AWS console. You do not need to worry about the operational overhead of managing your Apache Kafka environment, because Amazon MSK lets you focus on creating your streaming applications.

  • Highly available

    Amazon MSK provides automatic recovery by monitoring the health of the clusters and replacing unhealthy brokers without downtime for the applications. Amazon MSK manages the availability of Apache ZooKeeper nodes, so you do not need to start, stop, or directly access the nodes yourself. Amazon MSK also deploys software patches as needed to keep the cluster up to date and running smoothly, and it uses multi-Availability Zone replication for high availability.

  • Highly secure

    Amazon MSK provides multiple levels of security for your Apache Kafka clusters, including VPC network isolation, AWS IAM for control-plane API authorization, encryption at rest, and TLS encryption in transit.

  • Fully compatible

    Amazon MSK runs and manages Apache Kafka for you. This makes it easy for you to migrate and run your existing Apache Kafka applications on AWS without changes to the application code. By using Amazon MSK, you maintain open source compatibility and can continue to use familiar custom and community-built tools such as MirrorMaker.

  • Highly scalable

    There are two kinds of scaling: broker scaling, which increases or decreases the number of brokers in the cluster, and storage scaling, which modifies the amount of storage provisioned per broker to match changes in storage requirements.
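The two kinds of scaling described above can be sketched as two separate requests. This is a minimal sketch, assuming boto3's `kafka` client and its `update_broker_count` and `update_broker_storage` operations; the helper names are hypothetical, and the cluster ARN and version values are placeholders you would read from your own cluster.

```python
from typing import Any, Dict


def build_broker_scaling_request(
    cluster_arn: str, current_version: str, target_brokers: int
) -> Dict[str, Any]:
    """Arguments for update_broker_count: change the number of brokers."""
    if target_brokers < 1:
        raise ValueError("target_brokers must be at least 1")
    return {
        "ClusterArn": cluster_arn,
        # Cluster version string as returned by describe_cluster, not the Kafka version.
        "CurrentVersion": current_version,
        "TargetNumberOfBrokerNodes": target_brokers,
    }


def build_storage_scaling_request(
    cluster_arn: str, current_version: str, volume_size_gb: int
) -> Dict[str, Any]:
    """Arguments for update_broker_storage: change storage provisioned per broker."""
    if volume_size_gb < 1:
        raise ValueError("volume_size_gb must be positive")
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        # "All" applies the same volume size to every broker in the cluster.
        "TargetBrokerEBSVolumeInfo": [
            {"KafkaBrokerNodeId": "All", "VolumeSizeGB": volume_size_gb}
        ],
    }


def apply_scaling(broker_request: Dict[str, Any], storage_request: Dict[str, Any]) -> None:
    """Send both requests to Amazon MSK (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without boto3

    client = boto3.client("kafka")
    client.update_broker_count(**broker_request)
    # Assumption: each update changes the cluster's CurrentVersion, so in practice
    # re-read it with describe_cluster before sending the second request.
    client.update_broker_storage(**storage_request)
```

Keeping the request builders separate from the call that talks to AWS makes it possible to review a planned scaling change before it is applied.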

DATA REPLY BEST PRACTICE

Data Reply, an AWS Premier Consulting Partner, has developed strong expertise in implementing Big Data platforms on AWS. Through this work, we have gained expertise in Amazon MSK, which can provide both reliability and cost savings.
Our Best Practice relies on these four main features of Amazon MSK:
  • Integration

    Deep integration with several other AWS services (IAM, KMS, Lambda, etc.).

  • Innovation

    Fast innovation and constant improvement (42+ new features in 2 years).

  • Costs

    Up to 40% cheaper compared to a self-managed Kafka cluster.

  • Compatibility

    Fully compatible with Kafka APIs, with no need to manage ZooKeeper.

DATA REPLY MIGRATION APPROACH

At Data Reply, we provide AWS migration expertise built over several years of projects across different industry sectors. We have distilled this expertise into our Migration Approach, which consists of four modules that can be adapted to the customer’s needs:

REQUIREMENTS & BUSINESS USE CASE:

Understand key business challenges and goals in order to identify gaps and opportunities and to plan the current and future state.

TECHNICAL WORKSHOP:

During the workshop phase, we perform a Technical & Opportunity assessment and plan technical deep-dive sessions in order to identify migration success criteria and business & IT Data Lake outcomes.

PILOT:

The scope of the Pilot phase is to create a simple Pilot of the target use case, giving customers a concrete way to test the solution. We define the target architecture and component-level mapping according to the requirements collected in the previous phases, and execute incremental data migration and automation. After the UAT step, the Pilot is ready to Go Live!

IMPLEMENTATION:

This is typically the last step, where we implement the final solution, split into Waves to guarantee a progressive release. We define the full migration strategy and schedule, as well as the application code migration with a Dual Target approach. We then start the waves of implementation, including bulk import/export and the validation and audit of the solution. After the Test & UAT phases, we are ready for a successful GO LIVE!


    Data Reply is the Reply group company offering a broad range of advanced analytics and AI-powered data services. We operate across different industries and business functions, working directly with executive-level professionals and Chief Officers and enabling them to achieve meaningful outcomes through effective use of data.

    We have consolidated experience in designing and building cloud solutions: we support companies with both lift-and-shift solutions and cloud-native architectures. We help our customers develop and adopt holistic Big Data architectures and implement ML and AI models in a manner that is repeatable, efficient, scalable, simple, and yet secure. We support companies in combinatorial optimization processes with Quantum and Accelerated Computing techniques that enable an engine with high computational performance.