For the last 20 years, much effort has been put into approaches to deal with unstructured information. These approaches, called parsing or semantic analysis, use rules of grammar and lexicons to try to explicitly understand textual information.
The Inherent Complexity of Language
In spite of more than two decades of research, semantic analysis is rarely used in real applications because its results and performance have yet to live up to expectations on real-world problems. The following cases illustrate the main limitation of this approach: the inability of parsing to handle ambiguity.
Example 1: 'The dog came into the room; it was white.'
It is unclear from the sentence whether it is the dog or the room that is white. On the other hand, a human being would have little problem deciphering the following examples because of his or her familiarity with both rooms and dogs:
'The dog came into the room; it was furry.'
'The dog came into the room; it was full of furniture.'
In both cases a computer would be stumped, because it lacks the real-world knowledge needed to resolve such ambiguities. Some advanced systems allow a set of rules to be constructed for the machine to follow in resolving these uncertainties. However, such a rule set would be cumbersome and difficult to maintain, and would significantly degrade the system's performance.
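To make the maintenance burden concrete, here is a minimal sketch of such a hand-written rule set. The table PLAUSIBLE and the function resolve_it are hypothetical names invented for this illustration; they do not come from any real parsing product.

```python
# Hypothetical hand-maintained knowledge base: for each description,
# the nouns it can plausibly apply to. Every new word needs a new entry.
PLAUSIBLE = {
    "furry": {"dog"},
    "full of furniture": {"room"},
    "white": {"dog", "room"},
}


def resolve_it(candidates, description):
    """Return the candidate nouns that 'it' could refer to."""
    allowed = PLAUSIBLE.get(description, set())
    return [noun for noun in candidates if noun in allowed]


candidates = ["dog", "room"]
print(resolve_it(candidates, "furry"))              # one referent survives
print(resolve_it(candidates, "full of furniture"))  # one referent survives
print(resolve_it(candidates, "white"))              # still ambiguous
print(resolve_it(candidates, "noisy"))              # no rule, no answer
```

The rules settle 'furry' and 'full of furniture', leave 'white' ambiguous, and return nothing for a description nobody anticipated, so the table has to grow with every new word and context.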
Example 2: 'The fly, it's clear to me, can fly faster than the bee.'
The computer may be confused by the word 'fly', which is used in this sentence as both a noun and a verb. That problem is comparatively easy to solve. But what about the word 'it'? How does one parse a word that refers to an abstract thought?
These problems are exacerbated when a computer attempts to extract meaning by parsing full paragraphs.
Example 3: 'The president arrived by car to meet the Chinese premier.'
Like keyword-based approaches, semantic analysis cannot determine the relative importance of ideas. In other words, the computer will assign an equal level of importance to the president, his mode of transportation and the leader he is meeting. In addition, parsing is designed to handle only a few sentences at a time, so a strict parsing mechanism has great difficulty extracting meaning from a full paragraph. Autonomy, by contrast, is able to understand the concepts underlying a large corpus of information, from a paragraph to a whole document, so that appropriate emphasis is placed on each theme within the document.
Reliability
Because semantic analysis is based on a true/false decision tree and rules structure, one incorrect decision or the occurrence of an unknown construct can derail the entire analysis.
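The brittleness described here can be sketched with a toy strict parser. LEXICON, GRAMMAR and parse are hypothetical names chosen for this illustration, and the single sentence pattern is deliberately tiny; a real parser would have many more rules but fails in the same all-or-nothing way.

```python
# Hypothetical lexicon and grammar for a toy strict parser.
LEXICON = {"the": "DET", "dog": "NOUN", "room": "NOUN", "entered": "VERB"}
GRAMMAR = {("DET", "NOUN", "VERB", "DET", "NOUN")}  # the only accepted pattern


def parse(sentence):
    """Tag each word, then accept the sentence only if a rule matches exactly."""
    tags = []
    for word in sentence.lower().rstrip(".").split():
        if word not in LEXICON:
            # A single unknown word aborts the whole analysis.
            raise ValueError(f"unknown word: {word!r}")
        tags.append(LEXICON[word])
    if tuple(tags) not in GRAMMAR:
        raise ValueError(f"no grammar rule matches {tags}")
    return tags


print(parse("The dog entered the room."))

try:
    parse("The doggo entered the room.")
except ValueError as err:
    print("analysis failed:", err)
```

One slang word is enough to stop the second sentence from being analysed at all, even though a human reader would understand it immediately.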
Language Dependent
The semantic approach is language-specific, and its reliance on the grammar of a given language makes it vulnerable to slang and grammatically incorrect constructions. Because the system needs to be taught every new word or change in meaning, it cannot scale easily. More generally, such a system will support only a very limited subset of languages, for example English, German and Dutch, and adding a new and very different language, such as Chinese, can be problematic. Autonomy is uniquely able to handle any language.
Question and Answer Systems
An increasing number of search vendors now offer users the ability to retrieve information through natural language questions. While this approach may work well for one-sentence questions or queries concerning a known universe of information, the language model simply breaks down when employed on large documents with many concepts. This occurs because question-and-answer systems rely on the simple combination of manually defined "question forms" and a corresponding structured dataset that holds the relevant answers. As a result, these systems can only recognize precise questions and the matching answers that have been stored in the database. They cannot find concepts outside this manually defined structure that might supply relevant answers to users' questions. Equally, question-and-answer systems cannot understand questions that are phrased using slang or worded slightly differently, even if these queries would make perfect sense to a human.
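A minimal sketch of the "question form" design described above shows why rephrased questions fail. QUESTION_FORMS and answer are hypothetical names, and the stored question and answer are invented sample data, not taken from any real system.

```python
# Hypothetical table of manually defined question forms and stored answers.
QUESTION_FORMS = {
    "what are your opening hours?": "We are open 9am to 5pm, Monday to Friday.",
}


def answer(question):
    """Return the stored answer for an exact question form, else None."""
    return QUESTION_FORMS.get(question.strip().lower())


print(answer("What are your opening hours?"))  # matches the stored form
print(answer("When do you guys open?"))        # same intent, no match
```

Both questions ask for the same information, but only the one that matches a stored form gets an answer; covering the second would mean adding yet another form by hand.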
Autonomy's Approach
Autonomy's pattern matching technology uses predictable statistical word patterns to represent concepts and functions independently of any given language.
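The general idea of matching on statistical word patterns rather than grammar can be illustrated with a simple word-frequency comparison. This is only a sketch and is not Autonomy's actual algorithm: the functions profile and cosine, the sample texts, and the use of cosine similarity over raw word counts are all assumptions made for illustration, and the whitespace tokenization assumes text whose words are separated by spaces.

```python
import math
from collections import Counter


def profile(text):
    """Build a word-frequency pattern; no grammar or dictionary is consulted."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-frequency patterns."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Invented sample data: a "concept" is just a pattern of words.
concept_en = profile("dog puppy bark leash dog kennel")
doc_dog = profile("the puppy pulled on the leash and began to bark")
doc_room = profile("the room was full of furniture")
print(cosine(concept_en, doc_dog) > cosine(concept_en, doc_room))

# The same code runs unchanged on German text, since it only counts symbols.
concept_de = profile("hund welpe bellen leine")
doc_de = profile("der welpe zog an der leine")
print(cosine(concept_de, doc_de) > 0.0)
```

Because the comparison treats words purely as symbols that occur together, nothing in it depends on the rules of a particular language, which is the property the paragraph above claims for the statistical approach.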