Every day, 2.5 quintillion bytes of new data are produced, much of it in the form of unstructured natural-language documents of various types: requests, reports, complaints, medical prescriptions and claims, written in many different languages. Because this data is unstructured, it has been estimated that organisations typically fail to capitalise on more than 8% of their information assets.