To predict the expected turbine condition, the solution uses Random Forest regressor models. A Random Forest combines many decision trees and averages their predictions, which makes it less prone to overfitting than a single tree.
The models make it possible to determine, with high accuracy, how far the measured data deviates from the expected value. As a result, even small anomalies are easy to spot in the custom dashboard.
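As a rough illustration of this approach, the sketch below trains a Random Forest regressor on synthetic "healthy" data and flags new measurements whose deviation from the predicted value exceeds a threshold. Everything specific here is an assumption made for the example and is not taken from the project: the signal names (`wind_speed`, `ambient_temp`, `bearing_temp`), the synthetic data, the model settings, and the four-standard-deviation threshold.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)


def simulate(n):
    """Generate synthetic healthy operation (hypothetical signals, not real turbine data)."""
    wind_speed = rng.uniform(3.0, 25.0, n)      # m/s
    ambient_temp = rng.uniform(-5.0, 30.0, n)   # deg C
    features = np.column_stack([wind_speed, ambient_temp])
    # Assumed relationship between the inputs and a bearing temperature, plus sensor noise.
    bearing_temp = 40.0 + 1.5 * wind_speed + 0.5 * ambient_temp + rng.normal(0.0, 1.0, n)
    return features, bearing_temp


X_train, y_train = simulate(2000)   # used to fit the model
X_valid, y_valid = simulate(500)    # healthy hold-out data used to set the threshold
X_new, y_new = simulate(200)        # "live" data to be checked

# Inject an artificial fault: ten consecutive readings run 15 degrees too hot.
injected = np.arange(150, 160)
y_new[injected] += 15.0

model = RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=0)
model.fit(X_train, y_train)

# Threshold: four standard deviations of the residuals on healthy hold-out data.
threshold = 4.0 * np.std(y_valid - model.predict(X_valid))

# Deviation of the measured value from the expected (predicted) value.
residuals = y_new - model.predict(X_new)
flagged = np.flatnonzero(np.abs(residuals) > threshold)

print("flagged sample indices:", flagged.tolist())
```

In a dashboard, the `residuals` series would typically be plotted over time so that drifts show up before they cross the threshold.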
Data Reply's experts built the solution with open source tools, in part to avoid license fees. These include Spark, which runs on on-demand clusters in the Azure cloud.
As soon as enough data is available, an automatic trigger creates a cluster and starts data processing. After processing, the cluster is shut down again, which results in high cost savings. Although the Spark solution could also run as a 24/7 streaming job, the "On-Demand Batch Job" variant is preferred: the cluster has about 2 TB of RAM, and in batch mode it only needs to be switched on for 2 to 3 hours a day.
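The trigger-and-shutdown pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: `FakeCluster`, `on_demand`, `run_batch_if_ready`, and the `MIN_PENDING_RECORDS` threshold are hypothetical names, and a real deployment would call the cloud provider's cluster management API instead of recording events. The last helper only restates the arithmetic behind the savings: a cluster that runs 2 to 3 hours a day avoids roughly 87.5% to 91.7% of the cluster hours of an always-on job.

```python
from contextlib import contextmanager

MIN_PENDING_RECORDS = 1_000  # hypothetical threshold for "enough data"


class FakeCluster:
    """Stand-in for an on-demand cluster; it only records lifecycle events."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


@contextmanager
def on_demand(cluster):
    """Start the cluster for the duration of a job and always stop it afterwards."""
    cluster.start()
    try:
        yield cluster
    finally:
        cluster.stop()  # runs even if processing raises, so no cluster is left on


def run_batch_if_ready(pending_records, cluster, process):
    """Run one batch job if enough data has accumulated; otherwise do nothing."""
    if len(pending_records) < MIN_PENDING_RECORDS:
        return None
    with on_demand(cluster):
        return process(pending_records)


def always_on_hours_avoided(hours_per_day):
    """Fraction of cluster hours saved compared with running 24 hours a day."""
    return 1.0 - hours_per_day / 24.0


cluster = FakeCluster()
result = run_batch_if_ready(list(range(1_500)), cluster, process=sum)  # job runs
skipped = run_batch_if_ready(list(range(10)), cluster, process=sum)    # too little data
savings_range = (always_on_hours_avoided(3), always_on_hours_avoided(2))

print(cluster.events)
print([f"{s:.1%}" for s in savings_range])
```

Wrapping the job in a context manager keeps the shutdown step next to the start step, which is the part that protects the cost savings when a job fails midway.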