Every day, 2.5 quintillion bytes of new data are produced, much of it in the form of unstructured documents of various types written in natural language: requests, reports, complaints, medical prescriptions and claims, in many different languages. Because this data is unstructured, it has been estimated that organisations typically capitalise on no more than 8% of their information assets.
Today, Artificial Intelligence is experiencing strong growth, driven by the potential offered by Cloud computing. Natural language processing has always been one of the most closely followed areas of Artificial Intelligence. Among the Artificial Intelligence technologies available, about a dozen solutions have excelled and are now market leaders. Blue Reply chose to adopt the IBM Watson technologies: compared to major competitors, these are characterised by a high degree of maturity in terms of Machine Learning capabilities, a wide range of products (both on-premise and Cloud-based) with many out-of-the-box features, internationalisation and a high degree of flexibility in designing solutions.
Particular attention should be paid to the creation of the dataset (the sample of documents used to train the system). Performance percentages can be measured against a small set of documents that have been annotated manually. With fully manual data extraction processes, software development specialists and domain experts work in isolation and struggle to communicate with each other, both because of gaps in domain knowledge and because natural language is often ambiguous. The Watson technologies can help make this process simpler and more intuitive: by sharing a collaborative platform, cognitive specialists and domain experts can work together, integrating products and APIs to develop an automated solution designed to process large volumes of data.
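To make the idea of measuring performance against a manually annotated sample more concrete, here is a minimal sketch in plain Python. It does not use any Watson API. The annotation schema (document id, entity type, extracted text), the `evaluate` function and the sample values are all hypothetical, chosen only for illustration, and precision, recall and F1 are assumed here as the metrics of interest.

```python
from typing import Dict, Set, Tuple

# Hypothetical annotation schema: (document id, entity type, extracted text).
Annotation = Tuple[str, str, str]


def evaluate(gold: Set[Annotation], predicted: Set[Annotation]) -> Dict[str, float]:
    """Compare system output with the manually annotated reference set."""
    # An extraction counts as correct only if it matches a gold annotation exactly.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative data only: annotations produced by human experts.
gold = {
    ("doc1", "DRUG", "ibuprofen"),
    ("doc1", "DOSAGE", "200 mg"),
    ("doc2", "CLAIM_ID", "A-1042"),
    ("doc2", "DATE", "2017-03-01"),
}

# Illustrative data only: annotations produced by the automated system.
predicted = {
    ("doc1", "DRUG", "ibuprofen"),
    ("doc1", "DOSAGE", "20 mg"),  # wrong value, so it does not match
    ("doc2", "CLAIM_ID", "A-1042"),
}

scores = evaluate(gold, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.1%}")
```

Exact matching is the strictest possible criterion. A real project might also accept partial overlaps or normalise values before comparing, depending on what the domain experts consider a correct extraction.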
The creation of the dataset using Watson is therefore: