Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service that makes
it easy for you to build and run applications that use Apache Kafka to process streaming data. Apache
Kafka is an open-source platform for building real-time streaming data pipelines and applications.
With Amazon MSK, you can use Apache Kafka APIs to populate data lakes, stream changes to and from databases, and power machine learning and analytics applications.
Apache Kafka clusters are challenging to set up, scale, and manage in production. When you run Apache Kafka on your own, you need to provision servers, configure Apache Kafka manually, replace servers when they fail, orchestrate server patches and upgrades, architect the cluster for high availability, ensure data is durably stored and secured, set up monitoring and alarms, and carefully plan scaling events to support load changes. Amazon MSK makes it easy for you to build and run production applications on Apache Kafka without needing Apache Kafka infrastructure management expertise. That means you spend less time managing infrastructure and more time building applications.
With a few clicks in the Amazon MSK console, you can create highly available Apache Kafka clusters
with settings and configuration based on Apache Kafka’s deployment best practices. Amazon MSK
automatically provisions and runs your Apache Kafka clusters. Amazon MSK continuously monitors
cluster health and automatically replaces unhealthy nodes with no downtime to your application. In
addition, Amazon MSK secures your Apache Kafka cluster by encrypting data at rest.
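Cluster creation can also be scripted through the AWS SDK instead of the console. The following is a minimal sketch using boto3's `kafka` client; the cluster name, Kafka version, instance type, subnet IDs, and security group are placeholder assumptions, not values from a real account:

```python
# Sketch: creating an MSK cluster with boto3. All identifiers below are
# placeholders -- substitute your own subnets, security groups, and version.

def build_cluster_request(name, subnets, security_groups):
    """Assemble create_cluster parameters for a three-broker, TLS-enabled cluster."""
    return {
        "ClusterName": name,
        "KafkaVersion": "3.6.0",           # assumed version; choose a supported one
        "NumberOfBrokerNodes": 3,          # one broker per Availability Zone
        "BrokerNodeGroupInfo": {
            "InstanceType": "kafka.m5.large",
            "ClientSubnets": subnets,      # one subnet per AZ
            "SecurityGroups": security_groups,
            "StorageInfo": {"EbsStorageInfo": {"VolumeSize": 100}},  # GiB per broker
        },
        "EncryptionInfo": {
            "EncryptionInTransit": {"ClientBroker": "TLS", "InCluster": True}
        },
    }


def create_msk_cluster(kafka_client, request):
    """Submit the request; kafka_client is boto3.client('kafka')."""
    return kafka_client.create_cluster(**request)["ClusterArn"]


request = build_cluster_request(
    "demo-cluster",
    ["subnet-aaa", "subnet-bbb", "subnet-ccc"],
    ["sg-0123456789abcdef0"],
)
# To actually provision: create_msk_cluster(boto3.client("kafka"), request)
```

Provisioning happens asynchronously: Amazon MSK creates the brokers and Apache ZooKeeper nodes in the background, and `describe_cluster` reports when the cluster state reaches ACTIVE.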
Provisioning, configuration, and maintenance of Apache Kafka clusters and Apache ZooKeeper nodes are managed by Amazon MSK, and key Apache Kafka performance metrics are displayed in the AWS console.
Because Amazon MSK handles the operational overhead of managing your Apache Kafka environment, you can focus on building your streaming applications.
Automatic recovery is provided by monitoring cluster health and replacing unhealthy brokers without downtime for your applications. Amazon MSK manages the availability of Apache ZooKeeper nodes, so you never need to start, stop, or directly access the nodes yourself. Amazon MSK also deploys software patches as needed to keep the cluster up to date and running smoothly, and uses multi-Availability Zone replication for high availability.
Amazon MSK provides multiple levels of security for your Apache Kafka clusters, including VPC network isolation, AWS IAM for control-plane API authorization, encryption at rest, and TLS encryption in transit.
Amazon MSK runs and manages Apache Kafka for you. This makes it easy for you to migrate and run your existing Apache Kafka applications on AWS without changes to the application code. By using Amazon MSK, you maintain open source compatibility and can continue to use familiar custom and community-built tools such as MirrorMaker.
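As an illustration of that compatibility, a producer written with the community kafka-python client typically needs only new bootstrap servers (and TLS enabled) to target an MSK cluster; the broker hostnames below are placeholders for a real MSK bootstrap-broker string:

```python
# Sketch: pointing an existing kafka-python producer at MSK. Only the client
# configuration changes; the produce/consume application logic stays the same.
# Broker hostnames are placeholders, not a real cluster endpoint.

def msk_producer_config(bootstrap_servers):
    """Client settings that move a producer from self-managed Kafka to MSK."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "security_protocol": "SSL",  # MSK's TLS listener (port 9094 by default)
    }


def send_test_message(config, topic, payload):
    """Unchanged producer logic, simply fed the new configuration."""
    from kafka import KafkaProducer  # pip install kafka-python
    producer = KafkaProducer(**config)
    producer.send(topic, payload)
    producer.flush()


config = msk_producer_config([
    "b-1.demo.abc123.c2.kafka.eu-west-1.amazonaws.com:9094",
    "b-2.demo.abc123.c2.kafka.eu-west-1.amazonaws.com:9094",
])
# send_test_message(config, "orders", b"hello from MSK")
```

The same pattern applies to consumers and to tools such as MirrorMaker: they connect to MSK exactly as they would to any Apache Kafka cluster.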
Amazon MSK supports two kinds of scaling: broker scaling, which lets you increase or decrease the number of brokers in the cluster, and storage scaling, which lets you modify the amount of storage provisioned per broker to match changes in storage requirements.
Deep integration with several other AWS services (IAM, KMS, Lambda, etc.).
Fast innovation and constant improvement (42+ new features in 2 years).
Up to 40% cheaper compared to a self-managed Kafka cluster.
Fully compatible with the Kafka APIs, with no need to manage ZooKeeper yourself.
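Both scaling operations described above are exposed through the MSK API. The following is a minimal boto3 sketch; the cluster ARN is a placeholder, and the required `CurrentVersion` token would come from `describe_cluster`:

```python
# Sketch: MSK broker scaling and storage scaling via boto3. The ARN and
# version strings used here are placeholders, not real identifiers.

def broker_count_request(cluster_arn, current_version, target_brokers):
    """Parameters for kafka.update_broker_count (broker scaling)."""
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        "TargetNumberOfBrokerNodes": target_brokers,
    }


def broker_storage_request(cluster_arn, current_version, volume_size_gb):
    """Parameters for kafka.update_broker_storage (storage scaling, all brokers)."""
    return {
        "ClusterArn": cluster_arn,
        "CurrentVersion": current_version,
        "TargetBrokerEBSVolumeInfo": [
            {"KafkaBrokerNodeId": "ALL", "VolumeSizeGB": volume_size_gb}
        ],
    }


def scale_cluster(kafka_client, cluster_arn, target_brokers, volume_size_gb):
    """Fetch the current version token, then apply both scaling updates."""
    info = kafka_client.describe_cluster(ClusterArn=cluster_arn)["ClusterInfo"]
    version = info["CurrentVersion"]
    kafka_client.update_broker_count(
        **broker_count_request(cluster_arn, version, target_brokers))
    kafka_client.update_broker_storage(
        **broker_storage_request(cluster_arn, version, volume_size_gb))

# To actually scale: scale_cluster(boto3.client("kafka"), cluster_arn, 6, 500)
```

Each update runs as a background cluster operation; clients keep producing and consuming while MSK adds brokers or expands the EBS volumes.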
REQUIREMENTS & BUSINESS USE CASE:
Understand key business challenges and goals in order to identify gaps and opportunities, and plan the current and future state.
During the workshop phase, we can perform a Technical & Opportunity assessment and plan technical deep-dive sessions in order to identify migration success criteria and business & IT Data Lake outcomes.
The scope of the Pilot phase is to create a simple pilot of the target use case, giving customers a concrete way to test the solution. We define the target architecture and component-level mapping according to the requirements collected in previous phases, and execute incremental data migration and automation. After the UAT step, the Pilot is ready to go live!
This is typically the last step, where we implement the final solution, split into waves to guarantee a progressive release. We define the full migration strategy and schedule, and plan the application code migration with a Dual Target approach. We then start the implementation waves, including bulk import/export and the validation and audit of the solution. After the Test & UAT phases, we are ready for a successful go-live!
Data Reply is the Reply group company offering a broad range of advanced analytics and AI-powered data services. We operate across different industries and business functions, working directly with executive-level professionals and Chief Officers, enabling them to achieve meaningful outcomes through the effective use of data.
We have consolidated experience in designing and building cloud solutions: we support companies both with lift-and-shift migrations and with cloud-native architectures. We help our customers develop and adopt holistic Big Data architectures and implement ML and AI models in a manner that is repeatable, efficient, scalable, simple, and secure.
We support companies in combinatorial optimization processes with Quantum and Accelerated Computing techniques that enable engines with high computational performance.