Knowing the trends, analyzing behaviors and meeting copyright contracts: businesses in the online on-demand industry must provide highly customized services. To meet this need, an online media streaming service had to implement a variety of real-time, data-driven use cases.
The most important one: a blocking feature that prevents customers from being logged into a premium account on several devices simultaneously. This is necessary because of legal contracts with content providers, but it also has a clear impact on the service business. The blocking functionality also needed to work without data from the frontend players, because at that time the frontend components did not provide the necessary information to the backend.
The client operates its infrastructure entirely on the Amazon Web Services public cloud and focuses on modern technologies and approaches for containerization, scalability and event sourcing.
Data Reply was tasked with building the first use case (blocking concurrent streams) from scratch. The blocking had to be enacted quickly, so the team decided on a near-real-time, event-driven approach. The idea: the frontend components would generate events, these events would be processed in a scalable way in the cloud environment, and the results would be made available through REST APIs.
First, the consultants had to identify which data was required to block simultaneous account access. Data Reply then developed an integration for the customer frontends that sends packages of information (so-called heartbeats) every 10 seconds to a REST API layer. These events are validated and sent to Apache Kafka, a scalable event streaming platform, from which they can be distributed to a multitude of microservices, each enabling a different use case.
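The validation step in the REST API layer can be illustrated with a minimal Python sketch. The heartbeat schema is not described in this case study, so the field names (account_id, device_id, stream_id, timestamp_ms) and the helpers parse_heartbeat and partition_key are purely hypothetical. The sketch only shows the general idea of rejecting malformed events before they reach Kafka and of keying events by account, so that all heartbeats of one account are handled together downstream.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Hypothetical heartbeat schema: these field names are assumptions made for
# illustration and are not taken from the actual project.
ID_FIELDS = ("account_id", "device_id", "stream_id")
TIMESTAMP_FIELD = "timestamp_ms"


@dataclass(frozen=True)
class Heartbeat:
    account_id: str
    device_id: str
    stream_id: str
    timestamp_ms: int


def parse_heartbeat(payload: Mapping[str, Any]) -> Optional[Heartbeat]:
    """Return a Heartbeat for a well-formed payload, otherwise None."""
    ids = [payload.get(name) for name in ID_FIELDS]
    # Every identifier must be a non-empty string.
    if not all(isinstance(value, str) and value for value in ids):
        return None
    timestamp = payload.get(TIMESTAMP_FIELD)
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(timestamp, bool) or not isinstance(timestamp, int):
        return None
    if timestamp < 0:
        return None
    return Heartbeat(ids[0], ids[1], ids[2], timestamp)


def partition_key(heartbeat: Heartbeat) -> str:
    """Assumed keying strategy: key by account so that a producer using
    key-based partitioning sends all heartbeats of an account to the
    same partition."""
    return heartbeat.account_id
```

In such a setup, a payload for which parse_heartbeat returns None would be rejected by the API layer, while valid heartbeats would be handed to a Kafka producer together with their partition key.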
For the blocking of concurrent streams, a microservice built with the Kafka Streams framework aggregates the heartbeats in real time and exposes REST APIs. These allow the frontends to check whether the video stream they are displaying needs to be blocked.
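The aggregation logic can be sketched in plain Python, independent of Kafka Streams (which is a Java library and is not used here). This is not the project's implementation: the class ConcurrentStreamTracker, the staleness threshold of three missed heartbeats, the limit of one stream per account, and the policy that the earliest-started session keeps playing are all assumptions made for illustration. Only the 10-second heartbeat interval comes from the text.

```python
from collections import defaultdict
from typing import Dict, Tuple

HEARTBEAT_INTERVAL_MS = 10_000               # from the text: one heartbeat every 10 seconds
STALE_AFTER_MS = 3 * HEARTBEAT_INTERVAL_MS   # assumption: three missed heartbeats end a session
MAX_CONCURRENT_STREAMS = 1                   # assumption: one active stream per account


class ConcurrentStreamTracker:
    """Hypothetical in-memory stand-in for the per-account aggregation."""

    def __init__(self) -> None:
        # account_id -> device_id -> (session_start_ms, last_seen_ms)
        self._sessions: Dict[str, Dict[str, Tuple[int, int]]] = defaultdict(dict)

    def record(self, account_id: str, device_id: str, timestamp_ms: int) -> None:
        """Fold one heartbeat into the state of its account."""
        devices = self._sessions[account_id]
        previous = devices.get(device_id)
        if previous is None or timestamp_ms - previous[1] > STALE_AFTER_MS:
            # First heartbeat, or the old session timed out: start a new session.
            devices[device_id] = (timestamp_ms, timestamp_ms)
        else:
            # Same session: keep the start time, advance the last-seen time.
            devices[device_id] = (previous[0], max(previous[1], timestamp_ms))

    def is_blocked(self, account_id: str, device_id: str, now_ms: int) -> bool:
        """Assumed policy: the earliest-started active sessions may play,
        any further active session on the same account is blocked."""
        devices = self._sessions.get(account_id, {})
        active = sorted(
            (start, device)
            for device, (start, last_seen) in devices.items()
            if now_ms - last_seen <= STALE_AFTER_MS
        )
        active_devices = [device for _, device in active]
        allowed = set(active_devices[:MAX_CONCURRENT_STREAMS])
        return device_id in active_devices and device_id not in allowed
```

A REST endpoint of the kind described above could call is_blocked with the account and device of the requesting frontend. In a real Kafka Streams service, the per-account state would live in a state store rather than in a Python dictionary.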
Once structured data was provided through the event streaming platform, more and more use cases could be defined and prioritized.
Accommodating the cloud-only approach of the customer, Data Reply leveraged AWS technologies and specifically serverless technologies wherever possible.
All services have been containerized and deployed on container orchestration services like Fargate and ECS. DynamoDB has been used for intermediate storage of unstructured JSON payloads for clickstream tracking. Logging is performed through CloudWatch. Moreover, all infrastructure provisioning and rolling updates of the services are performed through AWS CloudFormation with an Infrastructure-as-Code approach. AWS Secrets Manager is leveraged to securely share credentials for other systems with the container instances. Permissions are efficiently managed through IAM policies.
Best practices for AWS development and infrastructure provisioning are used to ensure efficient use of resources. Templates have been prepared to shorten the bootstrap time of rolling out a new stream processing service. This limits the effort spent on infrastructure and deployment, which can instead be spent on implementing business logic and delivering value.
The new solution is scalable by design, taking advantage of the elasticity of AWS Cloud services and of the scalability guarantees of Apache Kafka.
Moreover, the introduction of event sourcing with a high volume of granular and information-dense data points allows for a variety of new use cases and business values to be explored and brought to production. All the services and architectures have been implemented with minimal operational and maintenance effort, which decreases the total cost of ownership.
This allowed Data Reply's customer to meet contractual obligations, deliver an improved user experience and provide more accurate and timely data to the business stakeholders in a cost-efficient manner.
Data Reply is the Reply group company offering a broad range of advanced analytics and AI-powered data services. We operate across different industries and business functions, enabling them to achieve meaningful outcomes through effective use of data. We have strong competences in Big Data Engineering, Data Science and IPA; we build Big Data platforms and implement ML and AI models in a manner that is repeatable, efficient, scalable, simple and yet secure. We support companies in combinatorial optimization processes with Quantum Computing techniques that enable an engine with high computational performance.