| Functionality |
|
Structured Data | | | Semi-Structured Data | | | Unstructured Content |
|
Support for Semi-Structured Content
IDOL natively ingests XML files and fully supports searching, processing, and analyzing semi-structured content. Query operators such as WHEN (structural match), WHENn (nested structural match), and vWHEN (structural weighted search) can be used to help establish relevancy and, as in structured data queries, many other search operators are also supported.
IDOL allows organizations to eliminate the inefficiencies of creating XML tags manually by understanding the content and purpose of the tag itself, the related information, or both. Its key benefits include:
Adding Intelligence to XML
The use of XML is already widespread, but its deployment has significant limitations. Not only are tags often chosen manually in a costly and time-consuming process, but XML also has no built-in understanding of concepts that are similar to one another. In XML, for example, the tag <aircraft> and the tag <plane> are wholly unrelated items. This typically presents considerable problems, because information from different sources that has been structured using different tagging rules cannot be reconciled, even if there are important conceptual similarities. This lack of conceptual understanding is a considerable handicap to the success of XML as a standard for information exchange.
IDOL addresses both issues directly. First, its conceptual understanding enables it to automatically insert XML tags and links into documents based on the concepts contained in the information, which removes the cost of manual tagging. Second, IDOL enables XML applications to understand conceptual information independent of variations in tagging schemas or the variety of applications in use. This means, for example, that legacy data from disparate sources, tagged using different schemas, can be automatically reconciled and operated upon.
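The <aircraft> and <plane> example can be made concrete with a minimal sketch. It is not IDOL's method: a hand-written lookup table (the hypothetical CONCEPT_OF_TAG) stands in for the conceptual understanding described above, and the two sample documents, the function names, and the concept label are all assumptions made for illustration. The sketch only shows why a plain tag lookup misses related data and how mapping tags to a shared concept lets two schemas be read together.

```python
import xml.etree.ElementTree as ET

# Assumption: a hand-written tag-to-concept table. IDOL is described as deriving
# this understanding from the content itself; this table is only a stand-in.
CONCEPT_OF_TAG = {"aircraft": "aircraft", "plane": "aircraft"}

# Two hypothetical sources that describe the same kind of thing with different tags.
SOURCE_A = "<fleet><aircraft>Boeing 747</aircraft></fleet>"
SOURCE_B = "<inventory><plane>Airbus A320</plane></inventory>"


def values_by_tag(xml_text, tag):
    """Plain XML lookup: only elements with exactly this tag name are found."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(tag)]


def values_by_concept(xml_text, concept):
    """Lookup by concept: any tag mapped to the concept is accepted."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter() if CONCEPT_OF_TAG.get(el.tag) == concept]


def reconcile(sources, concept):
    """Collect the values for one concept across differently tagged sources."""
    merged = []
    for xml_text in sources:
        merged.extend(values_by_concept(xml_text, concept))
    return merged


if __name__ == "__main__":
    print(values_by_tag(SOURCE_B, "aircraft"))          # tag-only lookup on source B
    print(reconcile([SOURCE_A, SOURCE_B], "aircraft"))  # concept lookup across both
```

A tag-only lookup for "aircraft" finds nothing in the second source, while the concept lookup returns the entries from both. In a real deployment the mapping would come from the engine's analysis of the content rather than from a fixed table.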
Seamless XML Interoperability
IDOL provides an infrastructure for complete and automatic interoperability between applications using different XML tagging rules. The IDOL infrastructure is based on a conceptual understanding of XML documents, rather than on the tags themselves.
The use and nature of XML varies widely between implementations, and IDOL natively handles the full range of schemas. For example, many clients use a very large number of different tags within a schema, a situation that often causes problems for XML-handling software. Autonomy's enterprise scaling means that such data causes no problems: the servers switch to more appropriate modes of storage without any prompting.
The use of particular tags within a single schema also varies widely; some contain full text, some contain product codes or other metadata, and some contain internal information. IDOL is able to treat each of these types separately and automatically so that its statistical processing of the information adapts to the exact data provided. In this way, fields are assigned properties that determine how they are interpreted: as fields to tokenize, as fields to process numerically (whether they contain single or multiple values), as fields whose value is stored for optimized retrieval or matching, or as fields that are to be hidden or ignored.
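The four kinds of field treatment listed above can be sketched as a small data model. This is an illustrative assumption, not IDOL's configuration format or its statistical processing: the names FieldRole, FieldProperty, and infer_property are hypothetical, and the rules used here (numeric parsing, comma-separated multiple values, absence of whitespace) are deliberately simple stand-ins.

```python
from dataclasses import dataclass
from enum import Enum


class FieldRole(Enum):
    TOKENIZE = "tokenize"          # full text, broken into terms
    NUMERIC = "numeric"            # processed as numbers
    STORED_MATCH = "stored_match"  # kept whole for retrieval or exact matching
    HIDDEN = "hidden"              # internal information, hidden or ignored


@dataclass
class FieldProperty:
    role: FieldRole
    multi_valued: bool = False


def _is_number(text):
    """Return True if the text parses as a floating-point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_property(name, values, hidden_names=frozenset()):
    """Assign a property to a field from sample values (simple heuristic rules)."""
    if name in hidden_names:
        return FieldProperty(FieldRole.HIDDEN)
    # Assumption: multiple values in one field are comma-separated.
    parts = [p.strip() for v in values for p in v.split(",")]
    if parts and all(_is_number(p) for p in parts):
        multi = any("," in v for v in values)
        return FieldProperty(FieldRole.NUMERIC, multi_valued=multi)
    # Values without internal whitespace look like codes or identifiers.
    if all(" " not in v.strip() for v in values):
        return FieldProperty(FieldRole.STORED_MATCH)
    return FieldProperty(FieldRole.TOKENIZE)


if __name__ == "__main__":
    print(infer_property("description", ["A long-range passenger jet"]))
    print(infer_property("sizes", ["1, 2, 3"]))
    print(infer_property("sku", ["AB-1234"]))
    print(infer_property("internal_note", ["draft"], hidden_names={"internal_note"}))
```

Keeping the role separate from the single-or-multiple-value flag mirrors the text's distinction between how a field is processed and what shape its values take.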
Furthermore, the language-independent nature of all of Autonomy's algorithms means that widely differing XML systems can be integrated, regardless of the language, script or encoding used in the data.


















