The Vector method is concerned with partitioning data into categories. It represents documents as points in a multidimensional space, which is then divided into regions that correspond to categories. Categories must be taught to the system, and the more training it receives, the more accurate its categorization becomes. Many of today's search engines use a combination of Vector and Boolean methods.
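The idea of documents as points in a space can be illustrated with a minimal sketch. It assumes a bag-of-words representation, cosine similarity, and a nearest-centroid rule, none of which is specified in the text; the function names and the training sentences are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Turn a text into a sparse term-count vector (word -> count)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def train(examples):
    """Sum the training vectors of each category into one centroid."""
    centroids = {}
    for category, text in examples:
        centroids.setdefault(category, Counter()).update(vectorize(text))
    return centroids

def categorize(text, centroids):
    """Assign the text to the category whose centroid is most similar."""
    vec = vectorize(text)
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))

# Hypothetical training data: (category, example text) pairs.
training = [
    ("sports", "the team won the football match"),
    ("sports", "the striker scored a goal in the match"),
    ("finance", "the bank raised interest rates"),
    ("finance", "shares fell as the market closed"),
]
centroids = train(training)
label = categorize("a late goal decided the football match", centroids)
print(label)
```

Adding more training examples to a category moves its centroid, which is one way to picture why more training tends to improve accuracy.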
Language Dependent
The system must be trained in its target language and will recognize only the words it has been taught. It has no inherent understanding of synonyms or related words. For example, it would be unable to deduce that "Creutzfeldt-Jakob" and "mad cow" are related terms.
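A small sketch of this limitation, assuming the system's knowledge is simply the set of words seen in training (the vocabulary and sentences are hypothetical):

```python
def known_terms(text, vocabulary):
    """Return the words of the text that the system was trained on."""
    return [w for w in text.lower().split() if w in vocabulary]

# Hypothetical vocabulary learned from training documents.
vocabulary = {"creutzfeldt-jakob", "disease", "prion"}

trained_hit = known_terms("new Creutzfeldt-Jakob disease cases", vocabulary)
untrained = known_terms("mad cow outbreak reported", vocabulary)
print(trained_hit, untrained)
```

The second text shares no terms with the vocabulary, so a purely term-based system has nothing to connect it to the first.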
Inaccurate
The Vector method cannot divide categories perfectly and has particular trouble with documents that fit into more than one category: it classifies such a document under one category or another, but not both. There is also no notion of threshold or relevance, so once a document is placed in a category, nothing indicates how relevant it is within that category. Does it mention the topic only a couple of times, or is it entirely focused on it? The Vector method is unable to tell.
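The loss of information described here can be sketched as follows, assuming the classifier computes a similarity score per category and keeps only the best one. The scores and category names are made up for illustration.

```python
def assign_single(scores):
    """Return only the best-scoring category; the scores are discarded."""
    return max(scores, key=scores.get)

# Hypothetical similarity scores for two documents.
focused = {"medicine": 0.92, "agriculture": 0.05}  # clearly about one topic
mixed = {"medicine": 0.51, "agriculture": 0.49}    # fits both almost equally

focused_label = assign_single(focused)
mixed_label = assign_single(mixed)
print(focused_label, mixed_label)
```

Both documents receive the same single label, so the output alone cannot distinguish the focused document from the borderline one.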
Manual
All categories must be defined manually by administrators, and the system requires constant monitoring and maintenance to keep functioning. Whenever the categorization changes, the whole training process must begin again from scratch, because there is no way to update just one area of the system.
Ranking Discrimination
The system does not understand the importance or relevance of one word compared with another. To compensate, common words can be ignored and the focus placed on rare words, on the assumption that they give more insight into the theme of a document. However, this assumption does not always hold, and it can place weight on inappropriate words, resulting in categorization errors.
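One common way to favor rare words is inverse document frequency (IDF) weighting. The text does not name a specific scheme, so the sketch below is an assumption, and the sample documents are hypothetical. It also shows the failure mode: a misspelling is rare, so it is weighted as heavily as a genuinely informative term.

```python
import math

def idf_weights(documents):
    """Weight each term by log(N / number of documents containing it)."""
    n = len(documents)
    doc_freq = {}
    for doc in documents:
        for term in set(doc.lower().split()):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n / df) for term, df in doc_freq.items()}

# Hypothetical corpus; "teh" is a typo that appears in only one document.
docs = [
    "the cattle disease spread",
    "the disease was traced to feed",
    "the report contained a typo teh",
]
weights = idf_weights(docs)
print(weights["the"], weights["disease"], weights["teh"])
```

The word "the" appears in every document and receives zero weight, while the typo "teh" receives the maximum weight even though it says nothing about the document's theme.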
Autonomy's Approach
Autonomy's technology can understand the content of a document probabilistically, without depending on an understanding of a particular language, and create categories accordingly. Where necessary, a document can be classified in more than one category. Autonomy's automatic categorization functionality ensures that taxonomies are created and maintained with as much or as little human intervention as desired.
"We were attracted to Autonomy because it can process information from a wide range of differently structured data sources, which similar products cannot do. It can also combine internal and external information. The users like it and so far we have been impressed."