Autonomy supports all legacy methods described below. However, we recognize the limitations of the following approaches in enterprise scenarios and uniquely offer conceptual retrieval to provide users with the most accurate and complete search results with minimal manual intervention.
Keyword and Boolean Searches
Keyword and Boolean searches return only those documents that contain the terms queried by the user. Because of this limitation, the success rate of the searches is heavily reliant on the skill of the user, use of precisely the right terms, adeptness with Boolean operators, etc. Keyword searches presuppose that the user already knows exactly what they are looking for (hence the precise terms required for a successful query). However, in an enterprise search scenario, the user often only has a general idea of the file they are looking for, and many valuable documents are found serendipitously not in the initial cycle but through clicking around different related files.
In addition to its heavy dependence on the user, other major flaws prevail:
It ignores the context in which the keywords were found, and therefore cannot accurately gauge whether the keywords found in the file also represent the main concepts of the file. Weighting of keywords (e.g. found in the title vs. buried in the middle of the file; frequency of keywords) only mitigates this issue and does not remove this critical defect.
It cannot find files that are conceptually relevant to the queried terms but do not contain the keywords used in the query.
In order to maximize accuracy, it requires human intervention to manage and update keyword associations or categories.
It is unable to learn and adapt through use; it cannot retrieve files by being shown an example.
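The first two flaws above can be seen in a minimal sketch of a Boolean AND search. The corpus, document ids and function names below are hypothetical, made up purely for illustration; this is not how any particular product implements keyword search.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def boolean_and_search(query, documents):
    """Return the ids of documents that contain every query term."""
    terms = tokenize(query)
    return [doc_id for doc_id, text in documents.items()
            if terms <= tokenize(text)]

# Hypothetical three-document corpus, invented for this example.
documents = {
    "memo-1": "Laptop backup policy for remote workers",
    "memo-2": "Notebook computers must be copied to the server every week",
    "memo-3": "Cafeteria menu: the backup option is a laptop-sized pizza",
}

hits = boolean_and_search("laptop backup", documents)
print(hits)
```

In this sketch memo-2 is about the same subject but shares no terms with the query, so it is never returned, while memo-3 matches on both terms even though its topic is unrelated.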
Parsing, or semantic analysis, uses rules of grammar and lexicons in order to explicitly understand textual information. In spite of more than two decades of research into semantic approaches, parsing is rarely used in real applications because its results and performance have yet to live up to expectations in real-world problems:
Due to the inherent complexity of language, it is unable to handle ambiguity (e.g. "The dog came into the room; it was furry." What does "it" refer to?). Improving the algorithm requires the construction of a set of rules that are cumbersome and difficult to maintain.
It cannot determine the relative importance of ideas.
It is designed to handle a few sentences and has great difficulty extracting meaning from full paragraphs.
Since semantic analysis is based on a true/false decision tree and rules structure, one incorrect decision or the occurrence of an unknown construct can derail the entire analysis.
It is language-specific and its reliance on grammar makes it unable to understand slang or grammatically incorrect constructions.
It cannot scale easily since the system needs to be taught every new word or change in meaning.
An increasing number of search vendors now offer users the ability to retrieve information through natural language questions. While this approach may work well for one-sentence questions or queries concerning a known universe of information, the language model simply breaks down when employed on large documents with many concepts. This occurs because question and answer systems rely on the simple combination of manually defined 'question forms' and a corresponding structured dataset that holds the relevant answers. As a result, these systems can only recognize precise questions and the matching answers that have been stored in the database.
Manual tagging schemes are becoming an increasingly popular method of labeling and categorizing digital material. However, they suffer from the following flaws:
It is descriptively inconsistent due to its reliance on human contribution. Each person may tag a given document differently (especially when the content deals with multiple themes), and/or people can get lax in their tagging and categorize most content under "general."
XML is not a set of standard tag definitions; it is a set of definitions that allow for tags to be defined. This poses difficulty when organizations or departments with different practices interoperate.
Taxonomy creation and tagging involve costly manual labor, requiring input from librarians, users and IT staff.
Tags fail to highlight the relationships between subjects because they lack a conceptual understanding to form correlations. There are often vital relationships between seemingly separately tagged subjects such as wing design/low drag and aerofoil/efficiency, but this concept of "idea distancing" is not leveraged.
As the number of tags increases, so too does the likelihood of misclassification and the effort to maintain consistency. This approach is not scalable.
PageRank determines a web page's "importance" depending on the number of pages that link to it, and on how "important" these pages are considered. This is then used in conjunction with a keyword entered by the user to retrieve the most relevant results. It suffers from the following flaws:
Since oft-linked pages usually consist of general overviews of topics, search results list the most general pages first, and finding specific information requires users to enter very specific keywords.
PageRank relies on manually added hyperlinks, which are rare in an enterprise context and not always a good indicator of a file's importance.
"When applied to enterprise search, the effectiveness of PageRank ranges from limited to useless."
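The link-counting idea described above can be sketched as a simple power iteration. The damping factor of 0.85, the iteration count, the handling of pages with no outgoing links and the page names are all assumptions chosen for illustration; the text above does not specify them.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank scores by power iteration.

    links maps each page to the list of pages it links to.
    Every link target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its score evenly
            # over all pages (an assumed convention for this sketch).
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical intranet: nothing links to "orphan-report".
links = {
    "overview": ["spec-a"],
    "spec-a": ["overview"],
    "spec-b": ["overview"],
    "orphan-report": [],
}

rank = pagerank(links)
for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))
```

In this toy graph the general "overview" page scores highest because other pages link to it, while "orphan-report" scores low regardless of its content simply because nobody added a hyperlink to it.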
A thesaurus maintains a list of industry-specific terms and their synonyms. This can be useful in environments with a large corpus of industry-specific terms, abbreviations and jargon, such as the medical and scientific fields. However, thesauri are costly and time-consuming to create, and definitions can be inaccurate because the meaning of words varies according to context. In addition:
The lists are static and the system cannot automatically reflect changes in meaning or the addition of new words. The administrator must maintain the thesaurus manually, which is costly and labor-intensive.
The creation of the thesauri involves the painstaking and time-consuming work of an expert.
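Thesaurus-based search is often implemented as query expansion against a hand-maintained synonym list. The following is a minimal sketch under that assumption; the thesaurus entries and the function name are hypothetical.

```python
def expand_query(terms, thesaurus):
    """Return the query terms plus any synonyms listed for them."""
    expanded = set()
    for term in terms:
        key = term.lower()
        expanded.add(key)
        # Terms missing from the thesaurus contribute no synonyms.
        expanded |= thesaurus.get(key, set())
    return expanded

# Hypothetical hand-maintained thesaurus, invented for this example.
thesaurus = {
    "aerofoil": {"airfoil", "wing"},
    "drag": {"resistance"},
}

expanded = expand_query(["Aerofoil", "drag", "winglet"], thesaurus)
print(sorted(expanded))
```

Here "winglet" gains no synonyms until an administrator adds an entry for it, and "drag" is always expanded to "resistance" whether the query concerns aerodynamics or something else entirely.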
Enterprises that have embraced "socialware" have benefited from the enthusiastic wave of information creation as well as the connection it engenders between disparate members of the enterprise community. Unfortunately, social methods also have critical flaws in an enterprise context:
User-generated content and manual tags are subjective and subject to personal habits that can be exploited at the expense of accurate and reliable information. Tag spamming, in which people label their information inaccurately in order to generate interest, can also become an issue.
Their form of classification is too wild and unpredictable to deliver a stable and accessible taxonomy.
Not enough distinction is made between contributions from experts and thoughts from amateur enthusiasts who volunteer information beyond their expertise.
Granting editing access to a large number of employees on internal wikis invites greater potential for security breaches.
Social methods inherently require manual effort for creation and maintenance.
Ultimately, the enterprise has much to gain when social methods can be automated. Autonomy's IDOL platform incorporates a range of unique functions to automate and enhance social networking tools, automatically generating comprehensive user profiles, recommending appropriate tags and generating hyperlinks to related material. Autonomy can also monitor the evolution of content to alert management to vandalism and automatically repair damaged articles. At every stage, authorized administrators are able to modify entries and settings by a comprehensive range of parameters, delivering full control. In short, Autonomy's holistic approach provides all the benefits of social methods, but also negates the pitfalls of intensive maintenance and user bias.
"The biggest challenge in the information society is the fact that we are drowning in information. With Autonomy we can save time and costs that we used to spend on maintenance and information retrieval. Additionally we can support our users with personalized interfaces. When we trialled Autonomy we had already chosen a more traditional keyword-based technology, but Autonomy changed our mindset."
Peter Rasmussen, Danske Bank