Case Study

A harmonised Big Data management model

Data Reply implemented a harmonised Big Data management model for RCS to enable new digital business opportunities.

A customised platform

The solution implemented by Data Reply for RCS, a leading Italian multimedia publishing Group operating across the full range of publishing sectors (daily newspapers, magazines, books, TV, radio and new media), has made it possible to launch a new monetisation channel that exploits the intrinsic value of digital information, thanks to a customised platform designed to handle the complex multitude of data relating to users and content. Today, RCS has a platform at its disposal that allows it to adapt to the Group’s diverse and growing digital business needs.

Needs and goals

The emergence of innovative techniques in the Big Data and Machine Learning fields has opened up new scenarios and business opportunities for RCS MediaGroup to exploit and monetise data related to users and content, collected directly or through third-party software. To take its first steps in this direction, RCS turned to Data Reply, which drew on its considerable experience in these areas to design and implement a platform capable of collecting and aggregating the various data sources and of producing the first promising results in advanced analysis using machine learning models. With a total of 15 million unique monthly users, RCS MediaGroup’s online offer generates a very large volume of information that can be stored and enriched, both with external sources and by exploiting synergies among existing sources, in order to generate value for the Group and for its users. In particular, the initiatives on which RCS and Data Reply collaborate aim to facilitate targeted online marketing, to significantly improve user experience and to promote customer retention. The platform has enabled RCS and Data Reply to move quickly on the first data monetisation projects:

  • Audience enrichment: predicting demographic parameters (e.g. gender, age) associated with anonymous browsing profiles in order to serve them targeted content, as is already done for registered users;

  • Category intenders: identifying consumers interested in specific product categories, in order to target them with advertising campaigns or products in line with their preferences;

  • Propensity to click: identifying users who are more likely to click on the banners of a given web campaign;

  • Churn prevention: identifying the specific behavioural, social, demographic and subscription characteristics that prompt a user not to renew their subscription and using this information to retain customers through targeted marketing campaigns;

  • Content and product recommendation: identifying products or content likely to be of interest to a given group of target users.
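As an illustration of the audience enrichment idea, the following is a minimal sketch that learns from registered users (whose demographic label is known) and applies the result to an anonymous browsing profile. It is not the model used in the project, which the case study does not describe: the naive Bayes technique, the function names, the site sections and the age brackets are all assumptions chosen for the example, and the data is invented.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(profiles):
    """Fit a multinomial naive Bayes model on (sections_visited, label) pairs."""
    label_counts = Counter()
    section_counts = defaultdict(Counter)
    vocabulary = set()
    for sections, label in profiles:
        label_counts[label] += 1
        for section in sections:
            section_counts[label][section] += 1
            vocabulary.add(section)
    return label_counts, section_counts, vocabulary


def predict(model, sections):
    """Return the most likely label for an anonymous browsing profile."""
    label_counts, section_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        # Log prior plus Laplace-smoothed log likelihood of each visited section.
        score = math.log(count / total)
        denominator = sum(section_counts[label].values()) + len(vocabulary)
        for section in sections:
            if section not in vocabulary:
                continue  # sections never seen in training carry no signal
            score += math.log((section_counts[label][section] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical registered users: visited sections and a known age bracket.
registered = [
    (["sport", "motors"], "18-34"),
    (["sport", "tech"], "18-34"),
    (["finance", "politics"], "35+"),
    (["politics", "culture"], "35+"),
]
model = train_naive_bayes(registered)

# An anonymous profile is scored from its browsing behaviour alone.
anonymous_label = predict(model, ["sport", "tech"])
```

The same pattern (train on labelled users, score unlabelled ones) would apply in spirit to the category intenders and propensity to click cases, with a different target label.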


Starting from RCS’ requirements and drawing upon the extensive expertise gained through projects completed for numerous customers, Data Reply designed and implemented a Microsoft Cloud IaaS solution based on the Hadoop platform, and in particular on the distribution from Cloudera, of which Data Reply is a Silver partner. The information stored in RCS’ systems relating to user browsing is diverse and heterogeneous:

  • Data Management Platform: metadata about pages and their contents, navigation-related events (e.g. clicks on an article) and viewing-related events (such as video playback), and the mapping of users to their interest segments at a content level;

  • Web analytics solution: real-time navigation information (app, store, C+);

  • Data warehouse: comprehensive view of users, as subscribers and buyers;

  • Semantic analysis platform: taxonomic and semantic information about textual content.

In order to integrate these data sources with each other and find synergies, a data ingestion layer interfacing with the Cloudera environment was designed and developed to keep the information synchronised with the source. Thanks to this approach, the Big Data environment becomes a natural playground for data exploration, data analysis and advanced analytics techniques, acting as a single point of access and integration for the information. The architecture and horizontal scalability typical of Hadoop, which let computational power grow as needed through the addition of commodity machines, also make it possible to handle data regardless of its volume, velocity and variety. This allows the data science team to freely explore and process the data, creating models within initiatives targeted at key business objectives. The R and Python data science tools, used through a collaborative data science platform, have allowed the data science team to achieve the first promising results quickly, tracked through business KPIs on a dedicated custom dashboard.
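The "single point of access" idea can be pictured with a small sketch that groups records from several sources under one profile per user. This is purely illustrative: the case study does not say how the ingestion layer keys or stores data, so the shared `user_id` field, the source names and the record fields below are assumptions, and an in-memory dictionary stands in for the Hadoop environment.

```python
from collections import defaultdict


def build_profiles(sources):
    """Group records from every source under a single profile per user ID.

    sources: dict mapping a source name to a list of record dicts,
             each carrying a 'user_id' key (an assumption of this sketch).
    """
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            user_id = record["user_id"]
            # Keep every field except the join key, grouped by originating source.
            fields = {k: v for k, v in record.items() if k != "user_id"}
            profiles[user_id].setdefault(source_name, []).append(fields)
    return dict(profiles)


# Hypothetical extracts from three of the sources listed above.
sources = {
    "dmp": [
        {"user_id": "u1", "segment": "sport"},
        {"user_id": "u2", "segment": "finance"},
    ],
    "web_analytics": [
        {"user_id": "u1", "event": "article_click"},
    ],
    "data_warehouse": [
        {"user_id": "u1", "subscriber": True},
    ],
}

profiles = build_profiles(sources)
```

Grouping by source name inside each profile keeps the origin of every field visible, which makes it easier to spot where two sources overlap or disagree.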


In Q2 of 2016 RCS was just embarking on this new journey and, with the help of Data Reply, the company implemented its Big Data platform and obtained the first significant results in the audience enrichment and category intenders areas. During the second half of 2016, efforts focused on consolidating the existing platform and industrialising the use cases already implemented. RCS will continue to pursue new and ambitious initiatives throughout 2017: migrating the platform to on-premises infrastructure to reduce operational costs and implementing the remaining scenarios that were originally envisaged.


Data Reply is the Reply Group company specialised in data management using Big Data & Advanced Analytics methodologies. Data Reply supports customers in the design and implementation of data platforms that aim to enhance and capitalise on corporate information assets. Data Reply is a team of Big Data Engineers and Data Scientists with extensive expertise and a large number of Big Data systems in production.