An Introduction to OpenText Knowledge Discovery (IDOL)
The Data Mountain
As the amount of data in the world increases exponentially, it becomes ever harder to get a handle on what you have, let alone how you might use it. It’s hard enough for an individual to keep track of several years of social media, email, and photos… but what about big companies? Hundreds or even thousands of users generate content, each with their own filing and organization methods, and employee turnover means that the knowledge of where things are kept is lost. The result is an overwhelming mountain of data.
Some of it is usefully structured and searchable: the databases, the accounts. These have been carefully curated to make them easy to manage, and we rarely have difficulty finding what we need. The rest, however, is the kind of messy, unstructured data that rapidly accumulates, either because you are legally required to keep it, or because it might come in useful one day, or simply because storage is cheap and you don’t bother to delete it.
One person, devoting their entire life to the effort, would struggle to get a handle on it all, even if data production stopped today. Take into account that you are chasing a moving target, and it becomes impossible.
More importantly, it isn’t necessary for people to do this work. Most of the data mountain is the kind of dull, everyday content that can safely be ignored. We don’t need to reread a meeting invite from 2003, or watch thousands of hours of security footage of an empty corridor.
But in the midst of all that, there’s the ten minutes of security footage that shows someone breaking into the secure room, or an email that breaks insider trading regulations.
So how do you flag the important things, while consigning the ordinary to the digital equivalent of the basement archives?
OpenText Knowledge Discovery (IDOL)
OpenText Knowledge Discovery (IDOL) is a suite of products designed around a simple philosophy:
We want you to be able to find and use the valuable data that you collect and store, with as little manual input as possible.
Knowledge Discovery (IDOL) is not a single product; rather it is this unifying principle. Under this umbrella, there are four core areas that make up the Knowledge Discovery (IDOL) product suite:
- Knowledge Discovery (IDOL) Text Analytics – search, analytics, and data enrichment for unstructured text sources.
- Knowledge Discovery (IDOL) Rich Media Analytics – search, analytics, and data enrichment for image, audio, and video sources.
- Knowledge Discovery (IDOL) File Content Extraction – file format detection, text extraction, and rendering.
- Knowledge Discovery (IDOL) Ingest – data collection and enrichment.
Each of these areas forms a part of the whole, which you can use individually or in combination to solve a variety of business problems. The power of IDOL is that you can mix and match compatible products in different groupings in a way that suits the problems you need to solve.
Knowledge Discovery (IDOL) Text Analytics
Text Analytics can start with a simple keyword search, but it also includes many different tools to help you make the most of your unstructured text data.
A few commonly used examples include:
- Entity Extraction – Knowledge Discovery (IDOL) Eduction allows you to extract valuable snippets of information (entities) from your text and use them to tag documents. For example, you can use this to find Personally Identifiable Information (PII) in your documents, to help ensure your compliance with regulations such as GDPR.
- Query Analytics – Knowledge Discovery (IDOL) can take a simple search and provide insight into your data. The Find application provides visualizations so you can see at a glance the kind of results you have for a particular search, such as topic maps and timelines. You can also perform search comparisons, or create a geographical map of your results.
- Virtual Assistant – Knowledge Discovery (IDOL) Natural Language Question Answering provides a simple, automated tool to help your customers with common problems, reserving your support staff for the more unusual requests.
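To make the entity extraction idea above more concrete, here is a minimal sketch in Python. It is not Eduction and does not use any Knowledge Discovery (IDOL) API; it assumes a deliberately simple regular expression for email addresses, and the `tag_document` function and its output fields are hypothetical names chosen for illustration.

```python
import re

# Deliberately simple pattern, for illustration only. Real PII detection
# needs far more robust rules than a single regular expression.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tag_document(doc_id: str, text: str) -> dict:
    """Extract email-like entities from text and attach them as tags.

    Hypothetical helper: the field names below are assumptions,
    not a Knowledge Discovery (IDOL) document schema.
    """
    emails = sorted(set(EMAIL.findall(text)))
    return {
        "id": doc_id,
        "entities": {"email": emails},
        "contains_pii": bool(emails),
    }
```

A document tagged in this way could then be routed for review or redaction whenever `contains_pii` is true.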
Importantly, Knowledge Discovery (IDOL) also provides document security, so that the access restrictions in your original repositories apply in the same way to search results and analytics.
Knowledge Discovery (IDOL) Rich Media Analytics
Rich Media Analytics provides tools for making the most of your images, videos, and audio files. Rich Media can process media from streams (such as broadcasts, or security cameras providing continuous content) or discrete files, and it can perform analytics such as text capture from images (OCR), face detection and recognition (finding and identifying faces in images), object recognition (such as logo detection), speech-to-text, and speaker recognition.
It has a wide variety of applications, including:
- Broadcast Monitoring – retrieving and analysing content from ongoing broadcasts to find salient news stories, and process content for analysis. You can use this for many things, from keeping track of developing news stories, to checking how often a brand logo appears in a broadcast.
- Security and Surveillance – automating security systems to detect particular events, such as abandoned luggage or traffic infractions, to augment human oversight and reduce human error.
- Personal Data Protection – finding PII in media such as text in images, faces, or number plates, and redacting it.
Knowledge Discovery (IDOL) File Content Extraction
On the surface, file format detection and text extraction do not sound particularly glamorous, but in practice KeyView is one of the workaday engines that make a lot of Knowledge Discovery (IDOL)’s most powerful functionality possible.
KeyView can detect and categorize over 2000 file formats, which allows for appropriate routing for different types of files. It detects the format by using the file content, which is more accurate than using the file extension, and it can often detect different versions of a format, which might require different processing.
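As a rough illustration of why content-based detection is more reliable than trusting the file extension, the sketch below checks the leading bytes of a file against a few well-known signatures. It is not KeyView's detection logic or API; the signature table is a tiny sample, and the function names are hypothetical.

```python
from typing import Optional

# A handful of well-known file signatures ("magic bytes").
# Illustrative only: a real detector covers far more formats and
# looks deeper into the file than the first few bytes.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"PK\x03\x04", "zip"),  # also the container for DOCX, XLSX, PPTX
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]


def detect_format(data: bytes) -> str:
    """Return a format label based on the leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return "unknown"


def extension_matches(filename: str, data: bytes) -> bool:
    """Report whether the file extension agrees with the detected content."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return ext == detect_format(data)


def pdf_version(data: bytes) -> Optional[str]:
    """Read the version from a PDF header such as b'%PDF-1.7'."""
    if data.startswith(b"%PDF-"):
        return data[5:8].decode("ascii", errors="replace")
    return None
```

In this sketch a PDF that has been renamed to `report.txt` is still detected as a PDF, so it can be routed to the right processing, and the header version shows how one format can be split into variants that need different handling.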
In addition, KeyView can extract subfiles from a variety of file formats, and filter the text from hundreds more. Text filtering allows you to create a Knowledge Discovery (IDOL) index from your raw data. Moreover, KeyView supports many old file formats that no longer have a native viewer, allowing you to recover otherwise inaccessible content.
You can also use KeyView to export files to XML, HTML, or PDF, and to render a document for easy viewing in a Web browser.
Knowledge Discovery (IDOL) Ingest
Knowledge Discovery (IDOL) Ingest, which retrieves and enriches your content, is another important working part that makes the rest of IDOL possible.
Knowledge Discovery (IDOL) Connectors allow you to retrieve data from over a hundred repositories, which you can then route to KeyView for file and text extraction, and then onwards to other tools for data enrichment or indexing.
Knowledge Discovery (IDOL) NiFi Ingest provides a front-end application based on Apache NiFi to allow you to easily visualize complex ingest chains and manage your document flow. Many of Knowledge Discovery (IDOL)’s data enrichment functions are available as NiFi processors, so you can manage tasks such as OCR, speech-to-text, and Eduction all in one place as part of data ingestion.
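The ingest chain described above (retrieve, extract, enrich, index) can be pictured as a series of stages that each take a document and return an enriched copy. The sketch below is a conceptual model only: it does not use NiFi or any Knowledge Discovery (IDOL) API, and every name in it (`extract_text`, `enrich`, `run_chain`, the document fields) is a hypothetical stand-in.

```python
from typing import Callable, Dict, Iterable, List

Document = Dict[str, object]
Stage = Callable[[Document], Document]


def extract_text(doc: Document) -> Document:
    """Stand-in for file and text extraction: decode raw bytes to text."""
    raw = doc.get("raw", b"")
    text = raw.decode("utf-8", errors="replace") if isinstance(raw, bytes) else ""
    return {**doc, "text": text}


def enrich(doc: Document) -> Document:
    """Stand-in for an enrichment step: add a simple word count field."""
    return {**doc, "word_count": len(str(doc.get("text", "")).split())}


def run_chain(docs: Iterable[Document], stages: List[Stage]) -> List[Document]:
    """Pass each document through every stage in order and collect the results."""
    index: List[Document] = []
    for doc in docs:
        for stage in stages:
            doc = stage(doc)
        index.append(doc)
    return index
```

Because each stage has the same shape, stages can be added, removed, or reordered without changing the others, which is the same mix-and-match idea that a visual flow of processors expresses.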
Solving Problems
All businesses have unique problems, and Knowledge Discovery (IDOL) provides building blocks that you can connect together in your own way to solve them, so that you can make the most of all your valuable data. To see how, follow the link, choose a time that suits you, and organize a demo with us.
The OpenText Content Services team