IDOL natively ingests XML files and fully supports searching, processing, and analyzing semi-structured content. Standard Boolean operators, such as WHEN (structural match), WHENn (nested structural match), and vWHEN (structural weighted search), can be used to help establish relevancy. As in structured data queries, many other search operators are also supported.
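To make the idea of a structural match concrete, here is a minimal Python sketch. It is not IDOL query syntax and does not call any IDOL API. The sample catalog, the `structural_match` helper, and the tag names are all hypothetical. It only shows the difference between two terms appearing anywhere in a document and two terms appearing inside the same structural section.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: three <product> sections with the same fields.
SAMPLE = """
<catalog>
  <product><color>red</color><size>small</size></product>
  <product><color>blue</color><size>large</size></product>
  <product><color>red</color><size>large</size></product>
</catalog>
""".strip()


def structural_match(root, section, conditions):
    """Return the <section> elements in which every (child tag, value)
    condition holds inside the same section instance."""
    hits = []
    for node in root.iter(section):
        if all(
            any((child.text or "").strip().lower() == value.lower()
                for child in node.findall(tag))
            for tag, value in conditions.items()
        ):
            hits.append(node)
    return hits


root = ET.fromstring(SAMPLE)
matches = structural_match(root, "product", {"color": "red", "size": "large"})
print(len(matches))  # only the third product is both red and large
```

A flat keyword search for "red" and "large" would match the whole document, because both words occur somewhere in it. The structural version accepts only the section where both conditions hold together.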
IDOL allows organizations to eliminate the inefficiencies of creating XML tags manually by understanding the content and purpose of the tag itself, the related information, or both. Its key benefits include:
Removing the need to manually insert XML tags
Allowing interoperability between applications that use different XML tagging rules
Allowing applications to use idea distancing (the vital relationships between seemingly separately tagged subjects) to increase the findability of information
Automating processes that were previously performed manually
Natively indexing XML directly into the engine
Supporting XQuery as a query language
Obtaining all output from the engine in XML format
Adding Intelligence to XML
The use of XML is already widespread, but its deployment has significant limitations. Not only are tags often chosen manually in a costly and time-consuming process, but XML also has no built-in understanding of concepts that are similar to one another. In XML, for example, the tag <aircraft> and the tag <plane> are wholly unrelated items. Typically, this presents considerable problems because information from different sources that has been structured using different tagging rules cannot be reconciled, even if there are important conceptual similarities. This lack of conceptual understanding is a considerable handicap to the success of XML as the standard provider for information exchange.
IDOL addresses both issues directly. First, its conceptual understanding enables it to automatically insert XML tags and links into documents based on the concepts contained in the information, which removes the cost of manual tagging. Second, IDOL enables XML applications to understand conceptual information independent of variations in tagging schemas or the variety of applications in use. This means, for example, that legacy data from disparate sources, tagged using different schemas, can be automatically reconciled and operated upon.
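The reconciliation of differently tagged sources can be pictured with a small Python sketch. The `CONCEPT_OF_TAG` table, the `records_by_concept` function, and the two sample sources are hypothetical. The table is written by hand here purely as a stand-in for the conceptual understanding that the text attributes to IDOL, so this illustrates the outcome rather than the method.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written concept table. The text describes IDOL deriving
# such relationships from content; this lookup only stands in for that step.
CONCEPT_OF_TAG = {
    "aircraft": "aircraft",
    "plane": "aircraft",
    "airplane": "aircraft",
}


def records_by_concept(xml_sources):
    """Group element text from several XML sources under a shared concept,
    regardless of which tag each source used."""
    grouped = {}
    for source in xml_sources:
        for element in ET.fromstring(source).iter():
            concept = CONCEPT_OF_TAG.get(element.tag)
            if concept is not None:
                text = (element.text or "").strip()
                grouped.setdefault(concept, []).append(text)
    return grouped


# Two sources that describe the same kind of thing with different tags.
source_a = "<fleet><aircraft>Boeing 747</aircraft></fleet>"
source_b = "<inventory><plane>Airbus A320</plane></inventory>"

merged = records_by_concept([source_a, source_b])
print(merged)
```

Both records end up under the single concept `aircraft`, even though neither source knows about the other's schema.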
Seamless XML Interoperability
IDOL provides an infrastructure for complete and automatic interoperability between applications using different XML tagging rules. The IDOL infrastructure is based on a conceptual understanding of XML documents, rather than on the tags themselves.
The use and nature of XML vary hugely between implementations, and IDOL natively handles the full range of schemas. For example, many clients use a very large number of different tags within a schema, a situation that often causes problems for XML-handling software. Autonomy's enterprise scaling means that such data causes no problems: the servers switch into more appropriate modes of storage without any prompting.
The use of particular tags within a single schema also varies hugely: some contain full text, some contain product codes or other metadata, and some contain internal information. IDOL treats each of these types separately and automatically, so that its statistical processing of the information adapts to the exact data provided. Fields are assigned properties that determine how they are interpreted: as fields to tokenize, fields to process numerically (whether they contain single or multiple values), fields whose values are stored for optimized retrieval or matching, or fields that are to be hidden or ignored.
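As a rough illustration of assigning properties to fields, the Python sketch below sorts a field into one of the four roles described above by looking at its values. The `FieldRole` names, the `guess_role` function, and the heuristics (numeric parse, absence of spaces) are assumptions made for this example and say nothing about how IDOL actually decides.

```python
from enum import Enum


class FieldRole(Enum):
    TEXT = "tokenize and index as free text"
    NUMERIC = "process numerically"
    REFERENCE = "store for optimized retrieval or matching"
    IGNORED = "hide or ignore"


def _is_number(value):
    """True if the string parses as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def guess_role(values, hidden=False):
    """Pick a role for a field from its values (one or many).

    Assumed heuristics: all-numeric values are NUMERIC, values without
    spaces look like codes (REFERENCE), anything else is free TEXT.
    """
    cleaned = [v.strip() for v in values if v and v.strip()]
    if hidden or not cleaned:
        return FieldRole.IGNORED
    if all(_is_number(v) for v in cleaned):
        return FieldRole.NUMERIC
    if all(" " not in v for v in cleaned):
        return FieldRole.REFERENCE
    return FieldRole.TEXT


print(guess_role(["19.99", "5"]))                     # multi-valued numeric field
print(guess_role(["AB-1234"]))                        # product code
print(guess_role(["A lightweight folding bicycle"]))  # full text
print(guess_role(["internal note"], hidden=True))     # internal information
```

The function takes a list so that single-valued and multi-valued fields go through the same path.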
Furthermore, the language-independent nature of all of Autonomy's algorithms means that widely differing XML systems can be integrated, regardless of the language, script or encoding used in the data.
Summary: ...the university’s visibility and accessibility to applicants outside the U.S. Despite these gains, faculty and staff in the reviewing units sometimes had to wait eight weeks to receive the paper file containing an application and supporting materials. This delay impacted ASU’s ability to be competitive...
Summary: ...Croatian Justice System Case Study. The Customer The government in the Republic of Croatia is organized on the principle of separation of powers into legislative, executive and judicial branches. Judicial power is exercised by the courts. The judiciary is autonomous and independent. The courts administer...
Summary: ...TeleForm software, was created to make these matches.” “Before any study starts, a ‘protocol’ or formal document defining the experimental plan is completed. At OSUCCC we administer over 200 protocols, each 40 to 50 pages in length.” The Solution Physicians often find the volume of information...
Summary: ...and accessibility of its information resources, the Department can now focus on the quality of its content to make sure that the full potential of its website is realized. Objective Manage a vast amount of information and ensure its fast and accurate accessibility to users throughout Queensland. Solution...
Summary: ...document libraries, a structure that made document search and retrieval an unwieldy process. Migrating to the leading edge After evaluating the various solutions on the market, O’Hara and Hyler decided it was time to switch to Interwoven WorkSite, with its fully-integrated e-mail management, extensive...
Summary: ...external). The IT department has quotas on mailbox size (1.4GB). These are relatively large due to the nature of academia and research, as both communities have the requirement to share large files (e.g., academic and research documents). When email is used as the primary means of sharing files, the result...
Summary: ...oil. Statoil is one of the world’s largest crude oil traders. Hutchinson says: “This means that Statoil can plan ahead and be ready to use new products when they become available and benefit from the adaptable nature of Meridio and Microsoft to ever-changing customer needs. For example, Statoil has...
Summary: ...Cardiff Case Study; American Express. [CDF AMX CS] www.cardiff.com way Linmar and (Cardiff) strove to address issues and improve capabilities.” For Mark King, senior manager, Customer Process Listening, the long-term success of the solution came from the constant commitment to improve the product. “TeleForm®...
Summary: ...the latest regulations and best practice advice but will also be able to cross-reference this with the internal policies and uncover, often hidden, stores of unstructured information contained in documents around the company network. By using the inherent intelligence of Aungate (powered by the Autonomy...
Summary: ...regulations and best practice advice but will also be able to cross-reference this with the internal policies and uncover, often hidden, stores of unstructured information contained in documents around the company network. By using the inherent intelligence of Aungate (powered by the Autonomy IDOL software)...
Summary: ...pieces of information contained within HOLMES 2. During major incidents, such as unsolved murders, IDOL is used to automatically compare all data to identify hidden connections that otherwise may have gone unnoticed, enabling new lines of enquiry to be opened. The technology complements officers’ existing...
Summary: ...BAE Systems Customer Case Study. And it automatically alerts BAE SYSTEMS employees to documents in the system that relate to what they're doing, or to other employees in the company whose interests and expertise match their own. BAE SYSTEM’S CEN Clustering. This intuitive java based user interface allows...
This is a small selection of the Autonomy case studies available. Please visit our publications site at http://publications.autonomy.com/ for more information.
Summary: ...information, or both. IDOL can automatically insert XML tags and links into documents based on the concepts contained in the information. IDOL’s meaning-based technology also provides an infrastructure for complete and automatic interoperability between applications using different XML tagging rules....
Summary: ...Boolean, natural language and other retrieval methods Dashboard for personalized views Review, assemble and edit content Playlists for ordering and sequencing Create, save and reuse personalized projects for easy organization Collaborate by sharing or e-mailing content Data export options for XML, ALE,...
Summary: ...with EDL Control Automated clipping and segmentation with AutoClip™ Identification and SmartClips™ Real-time information access using Boolean, natural language and other retrieval methods Fast, scalable and language independent retrieval and data processing with IDOL Server Dashboard for personalized...
Summary: ...utmost accuracy. Autonomy’s accuracy is rooted in highly sophisticated pattern-matching process that is based on concepts to categorize documents and automatically insert tag data sets, route content or alert users to highly relevant information pertinent to the user’s profile.
...
Summary: ...builds a time synchronized index providing immediate, specific retrieval of content Media Analysis Plug-Ins – Allow content owners to enhance indexing capabilities Database Plug-Ins – Enable communication between VideoLogger and any digital asset management systems based on XML or SQL standards ControlCenter-...
Summary: ...email and attachment viewing • TIFF on- demand • Conceptual, phoneme, keyword and Boolean search • Sophisticated duplicate and near-dupe filtering options • Full support for EDRM XML load file and all legacy load file formats • Full forensic metadata extraction such as: unhiding columns, extracting...
Summary: ...document widely accessible and usable by delivering Web-ready HTML and valid XML to end-users and applications. Convert Multiple Documents Simultaneously KeyView IDOL Export can be configured to convert files to XML and HTML in the same process as the calling application (in-process) or as a separate...
Summary: ...activity logging Tracks who downloads what and when; tags assets derived in Virage MediaBin for tracing back to the core asset XML support Shares data with other applications and systems through XML export and import routines High-volume processing Accommodates high-volume, complex imaging tasks from...
Summary: ...i.e., “lettuce,” “skateboarding,” and “documents.” Using this information, the investigator can quickly refine the query to exclude the terms “lettuce” and “skateboarding” or tag them as likely non-responsive without sampling and reading irrelevant documents. Autonomy’s conceptual analytics...
Summary: ...of topics with visual navigation and cluster drill-down - Early understanding of hidden and language data • Native support for over 100 Languages & 1000 file types processed • Direct Discovery and Manage In-Place process • Full support for EDRM XML load file and all legacy load file formats • Petabyte...
Summary: ...through advanced natural language processing techniques, treating words as abstract symbols of meaning and deriving its understanding through the context of their occurrence rather than a rigid definition of the language and grammar. This means that IDOL has no problem understanding slang, industry specific...
Summary: ...either explicitly with a natural language description or Boolean expression. Most powerfully, an agent can be trained or re-trained by example, simply by being shown a document, video, or verbal conversation that matches a user’s interests. The Agent will then learn the concepts within the example and...
This is a small selection of the Autonomy Product Briefs available. Please visit our publications site at http://publications.autonomy.com/ for more information.
Summary: ...of working with native XML automatically. IDOL allows organizations to eliminate the inefficiencies introduced by many of the manual issues associated with creating XML tags by understanding the content and purpose of either the tag itself, related information, or both. IDOL provides the critical layer...
Summary: ...source or date, how often the connector downloads information from the moreover site, how much information it downloads, which words the information must contain or may not contain etc. Please note, the Moreover Fetch can only operate correctly if there is an agreement present with moreover.com to access...
Summary: ...results of the natural language retrieval, users can quickly refine their search to precisely focus on the context they require. • Cross-Language Search Autonomy delivers a language independent software infrastructure that enables content to be conceptually retrieved in any language delivering both...
Summary: ...ImportSlave, OmniSlave, BinSlave & PDFSlave • Combine data from any number of tables into a single document • Support for multiple jobs performing different actions • Schedule jobs independently of each other • Extract data as any text based format including HTML & XML • Extract binary document...
Summary: ...efficiencies never experienced before. Autonomy is capable of aggregating any form of structured, semi-structured and unstructured data. This "data agnostic" capability is facilitated through a variety of Autonomy connectors for a considerable number of proprietary data repositories and file formats....
Summary: ...quickly and easily. Features and Benefits Autonomy’s unique infrastructure, which carries out operations on unstructured content through understanding the context of the information, solves the issues of manually tagging information and the defocused approach of using legacy keyword systems to find...
Summary: ...experienced before. Autonomy is capable of aggregating any form of structured, semi-structured and unstructured data. This "data agnostic" capability is facilitated through a variety of Autonomy connectors for a considerable number of proprietary data repositories and file formats. Autonomy supports many...
Summary: ...license. [AUT DAT] 16.04.07 Architecture Client Machine DiSH Service Front End Interface DAH IDOL TM SERVER DIH Connector Data Repository Audio News Documentum Internet Internet PDF PDF SharePoint LiveLink Video XML MS Word MS Outlook MS Power Point MS Excel Lotus Notes Supports over 1000 data formats...
Summary: ...efficiencies never experienced before. Autonomy is capable of aggregating any form of structured, semi-structured and unstructured data. This "data agnostic" capability is facilitated through a variety of Autonomy Connectors (also referred to as Fetches) for a considerable number of proprietary data repositories...
Summary: ...therefore providing automated efficiencies never previously experienced. Autonomy is capable of aggregating any form of structured, semi-structured and unstructured data. This data agnostic capability is facilitated through a variety of Autonomy connectors for a considerable number of proprietary data...
Summary: ...never experienced before. Autonomy is capable of aggregating any form of structured, semi-structured and unstructured data. This "data agnostic" capability is facilitated through a variety of Autonomy Connectors (also referred to as Fetches) for a considerable number of proprietary data repositories and...
This is a small selection of the Autonomy Technical Briefs available. Please visit our publications site at http://publications.autonomy.com/ for more information.
Summary: ...itself, related information, or both. Its key benefits include: • Removing the need to manually insert XML tags • Allowing interoperability between applications that use different XML tagging rules • Allowing applications to use idea distancing (vital relationship between seemingly separately tagged...
Summary: ...insert XML tags and links into documents, based on the concepts contained in the information. This eliminates all manual cost. Secondly, IDOL server enables XML applications to understand conceptual information, independent of variations in tagging schemas or the variety of applications in use. This means,...
Summary: ...approach allows organizations to overcome problems that are typically associated with XML by: Removing the need to manually insert XML tags Allowing interoperability between applications that use different XML tagging schemes Indexing native XML directly into the engine Obtaining all output from the engine...
Summary: ...of rich media, widespread adoption of VOIP, growing use of IPTV and increased scrutiny of white collar crimes. This widespread adoption of rich media has necessitated the “findability” of such content, especially as it has seen increasing importance in eDiscovery cases. “Search of video files is...
Summary: ...powerful retrieval features, including natural language, conceptual search, refine by example, cross-language search and query by example. Autonomy also supports legacy retrieval mechanisms, such as keyword, Boolean, Proximity, Exact Phrase, Soundex and many others. Active matching Proactively...
Summary: ...network in which apparently unrelated pieces of information are automatically linked via dynamic probabilities. The second reason is that the document-matching algorithm itself within IDOL uses widespread “short-circuiting” and iterative calculation to ensure that it only performs exactly as much calculation...
Summary: ...Autonomy Internationalization White Paper 20031003. Autonomy also supports legacy retrieval mechanisms, such as keyword, Boolean, Proximity, Exact Phrase, Soundex and many others. Active matching Proactively link users with relevant information they require, accurately, in context and in real time...
Summary: ...ad-hoc searching to look for specific words within the recording or for a complete review for relevancy or privilege tagging. Summary Traditionally, multimedia content has been considered an unwieldy resource, requiring considerable man-hours to extract tangible benefits and returns. As a consequence...
Summary: ...for users. While they are intended to be a convenient tool to manage and administer corporate email, the reality is that the use of PST files has a detrimental impact on user productivity and causes increased administrative overhead. PST files often contain private and proprietary information, but their...
Summary: ...technologies to automatically: • “read” and understand all content and information, whether structured or unstructured • identify and extract relevant concepts, categories, and keywords • classify and tag each discrete piece of content in relation to all available information • perform real...
Summary: ...weak when the search involves very short words that contain only one or two syllables due to the vast numbers of potential matches. Word Spotting Word spotting is the process of recognizing isolated words by matching them to the sounds that are produced. As with phoneme matching, word spotting techniques...
This is a small selection of the Autonomy White Papers available. Please visit our publications site at http://publications.autonomy.com/ for more information.