Agency Health Check

Technology Reply designed and implemented a cloud architecture for a major Italian insurance company that needed a system capable of recognizing, categorizing and visualizing critical IT events.

Scenario

As part of its service disruption monitoring initiatives, a major Italian insurance company needed a system capable of recognizing, categorizing and visualizing critical IT events that affect the operation of any of its 1000+ insurance agencies located across Italian municipalities. To meet this need, Technology Reply designed and implemented a cloud architecture that analyzes a high-volume stream of real-time events, detects critical issues or anomalies in those events, and shows on a web map every agency that is experiencing a critical service disruption.

Advantages

  • Cloud-based infrastructure, allowing greater flexibility and scalability according to the amount of data to be processed
  • Fully managed cloud services, reducing deployment time and costs
  • A near real-time monitoring system, enabling rapid reaction to service disruptions
  • A geographic web map for a detailed overview of the status of all agencies across the national territory

Solution

The proposed solution consists of a back-end system that analyzes and classifies the captured events and a front-end web page that shows the agencies experiencing service degradation on an interactive map. To meet customer requirements, especially in terms of effectiveness, efficiency and cost, several architectures were designed using both Oracle and cloud services. After reviewing the advantages and disadvantages of each alternative, a hybrid architecture supported by Google Cloud Platform services, such as Google Cloud Composer and BigQuery, was chosen. Google Cloud Composer was used to orchestrate the data flows, and BigQuery to create a serverless, highly scalable data warehouse. Node.js and React were used to build the front-end application.


Back-end

The architecture was designed to analyze, process and manage large quantities of events generated by agencies in near real time. A number of beacons monitor agency availability; their data is pulled from an on-premises Oracle database and uploaded to the cloud data warehouse for in-depth analysis and to extract valuable information about any ongoing service disruptions. Data ingestion and subsequent processing are managed by a cloud workflow orchestrator, which makes it possible to design, implement and monitor pipelines across both cloud and on-premises data centers.

Through pipelines implemented in Python, the orchestrator uses the data warehouse to run aggregation queries on the real-time data and then calculates the metrics used to detect possible service disruptions. These metrics are compared against specific thresholds, and the relevant events are displayed on the web application, which is designed to show disruption information in near real time; historical data is visualized through a built-in dashboard. System integration enabled communication between the various components, both cloud and on-premises.
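
The threshold comparison described above can be illustrated with a minimal Python sketch. It is not the project's actual pipeline code: the metric (mean response time over a window), the threshold values and the function name `classify` are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical thresholds on mean response time, in seconds (not from the source).
# Ordered from most to least severe so that the first threshold reached wins.
THRESHOLDS = [(5.0, "severe"), (2.0, "moderate"), (1.0, "minor")]


def classify(response_times):
    """Return a severity label for one window of samples, or None if no threshold is reached."""
    if not response_times:
        return None  # no samples in the window, nothing to classify
    avg = mean(response_times)
    for limit, label in THRESHOLDS:
        if avg >= limit:
            return label
    return None  # below every threshold: the agency is operating regularly
```

In a setup like the one described, a function of this kind would run on the rows returned by the aggregation queries, once per agency and application.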


Front-end

The back-end categorizes each service disruption in near real time as severe, moderate or minor. Disruptions are shown on a geographical map of Italy, where the company's insurance agencies that are experiencing an operational impact are marked with a color code based on severity (red, yellow or white). In this way, the number of agencies experiencing disruptions and the number operating regularly can be monitored constantly on the map. If an agency is affected by several disruptions at once, it is categorized on the map according to the most serious of them. The map can be freely zoomed, navigated and filtered by application or agency. Pop-ups show detailed information on the ongoing disruptions for each individual agency: the name of the agency, the municipality to which it belongs, the affected application and statistical data on its response times. An overview section is also provided for monitoring purposes. It constantly reports the most relevant ongoing disruptions, which are shown in the foreground in a group of banners to highlight the agencies with the most critical issues.
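
The "most serious disruption wins" rule can be sketched in a few lines of Python. This is an illustrative sketch only: the `Severity` enum, the `marker_colors` function and the agency identifiers are hypothetical, and the mapping of severe, moderate and minor to red, yellow and white is assumed from the order in which the text lists them.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; a higher value means a more serious disruption."""
    MINOR = 1
    MODERATE = 2
    SEVERE = 3


# Assumed color mapping, inferred from the order given in the text.
COLORS = {
    Severity.SEVERE: "red",
    Severity.MODERATE: "yellow",
    Severity.MINOR: "white",
}


def marker_colors(disruptions):
    """Map each agency to the color of its most serious ongoing disruption.

    `disruptions` is an iterable of (agency_id, Severity) pairs.
    """
    worst = {}
    for agency_id, severity in disruptions:
        # Keep only the highest severity seen so far for this agency.
        if agency_id not in worst or severity > worst[agency_id]:
            worst[agency_id] = severity
    return {agency_id: COLORS[sev] for agency_id, sev in worst.items()}
```

For example, an agency with both a minor and a severe disruption would be marked red, while agencies with no disruptions do not appear in the result at all.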