Data Science and Cognitive Computing with HPE Haven OnDemand: The Simple Path to Reason and Insight
HPE Haven OnDemand is a diverse collection of APIs for interacting with data. Designed with flexibility in mind, it lets developers quickly perform data tasks in the cloud. See why it is a simple path to reason and insight for data science and cognitive computing.
Data science and cognitive computing are two leading contemporary analytics computing paradigms. Though distinct areas of specialization, the two overlap considerably in technologies and tools, and often share similar goals, outcomes, and aspirations. Both are data-driven, are often large scale, and increasingly rely on high performance computing.
Data science draws on machine learning and other analytic processes, along with statistics and related branches of mathematics, to extract insight from data and use it to tell stories. It employs tools from a variety of related areas, including data mining, artificial intelligence, and Big Data.
Cognitive computing aims to make human kinds of problems computable. Cognitive computing systems are probabilistic by nature, able to deal with uncertainty and fluid circumstance, and can move beyond simply providing answers to generating hypotheses.
Both data science and cognitive computing include tasks as relatively simple as the classification of images and as incredibly complex as using unstructured input data to design a system to beat humans at Jeopardy.
HPE Haven OnDemand
There are many ways to approach problems in these realms, depending on problem definitions and goals. Given this gamut of potential exercises, is there a repository of tools that could be leveraged to solve problems in both cognitive computing and data science? This article introduces HPE Haven OnDemand for just such tasks.
HPE Haven OnDemand is a cloud services platform that simplifies how you interact with data, allowing it to be transformed into an asset anytime, anywhere. HPE Haven OnDemand provides a large collection of machine learning application programming interfaces (APIs) for interacting with structured and unstructured data in a variety of ways. Many of these APIs can be used for data science and cognitive computing tasks.
Data science and cognitive computing are certainly not synonymous. However, given their considerable overlap, as noted above, this article treats the two together and references HPE Haven OnDemand APIs that can be used in either domain. Keep in mind that this is not a complete list of available APIs, nor is it an exhaustive collection of recipes for data science or cognitive computing.
This section covers a selection of HPE Haven OnDemand APIs that are useful for both data science and cognitive computing. Given that much of the world’s data is unstructured (an estimated 80%), and that both data science and cognitive computing often rely on such data as input, much of our discussion focuses on APIs that process and otherwise make sense of unstructured data.
The Speech Recognition API creates text transcripts from audio or video files, and supports both broadcast-quality and telephony-quality content in several languages. Speech is arguably the most unstructured of all data, so the fact that Haven OnDemand can process it, and turn it into actionable insight or derive reason from it, is incredibly useful.
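To make the idea concrete, here is a minimal Python sketch of how a client might submit a media file for transcription and then poll for the result. Everything specific in it is an assumption made for illustration: the placeholder base URL, the `recognizespeech` path, the `jobID`, `status`, and `transcript` field names, and the submit-then-poll flow itself are hypothetical and are not taken from the Haven OnDemand documentation. The HTTP call is passed in as a function so the sketch stays self-contained.

```python
import time
from urllib.parse import urlencode

# Assumption: placeholder host and path layout, used only for illustration.
BASE_URL = "https://api.example.invalid/1"


def build_speech_job_url(api_key, media_url, language="en-US"):
    """Build the (hypothetical) URL that submits a transcription job."""
    query = urlencode({"url": media_url, "language": language, "apikey": api_key})
    return f"{BASE_URL}/api/async/recognizespeech/v1?{query}"


def transcribe(api_key, media_url, fetch_json, max_polls=10, poll_delay=1.0):
    """Submit a job, then poll until it reports a finished transcript.

    fetch_json is any callable that takes a URL and returns a dict.
    The response fields used here are assumed, not documented.
    """
    job = fetch_json(build_speech_job_url(api_key, media_url))
    status_query = urlencode({"apikey": api_key})
    status_url = f"{BASE_URL}/job/status/{job['jobID']}?{status_query}"

    for _ in range(max_polls):
        status = fetch_json(status_url)
        if status.get("status") == "finished":
            return status["transcript"]
        time.sleep(poll_delay)  # wait before asking again
    raise TimeoutError("transcription did not finish within max_polls attempts")
```

In practice `fetch_json` would wrap a real HTTP client. Injecting it keeps the polling logic easy to test and independent of any particular library.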
Haven OnDemand provides a collection of Format Conversion APIs, which can process a wide range of files and extract their text for use in further processing tasks. To name a few of the available APIs: text can be extracted from files using the Text Extraction API, zipped files can be expanded via the Container Expand API, and text can be extracted from an image using the OCR Document API. Of particular note, the Text Extraction API and the View Document API, which renders documents in HTML and highlights text within the document, can both handle over 500 different file formats.
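The division of labor among these APIs can be sketched as a small routing function. The Python below is only an illustration of that idea: the function name, the suffix sets, and the decision to route purely on file extension are assumptions of this sketch, not part of Haven OnDemand. It simply returns the name of the API described above that would be the natural first stop for a given file.

```python
from pathlib import Path

# Assumption: illustrative suffix lists, chosen for this sketch only.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".gif"}
ARCHIVE_SUFFIXES = {".zip"}


def choose_conversion_api(filename):
    """Pick which Format Conversion API a file should be sent to first."""
    suffix = Path(filename).suffix.lower()
    if suffix in ARCHIVE_SUFFIXES:
        # Zipped files are expanded before their contents are processed.
        return "Container Expand"
    if suffix in IMAGE_SUFFIXES:
        # Text inside an image has to be recognized optically.
        return "OCR Document"
    # Everything else is treated as a document whose text can be extracted.
    return "Text Extraction"


for name in ["report.pdf", "scan.PNG", "bundle.zip"]:
    print(name, "->", choose_conversion_api(name))
```

A real pipeline would likely inspect file contents rather than trust extensions, and would feed the files produced by the expansion step back through the same routing.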