With an upswing in enterprise portal use, it is imperative to create taxonomies that address various information types including documents, structured data, HTML, XML and multimedia. Manual tagging schemes are becoming an increasingly popular method of labeling digital material.
Descriptive Inconsistency
Each person will categorize or tag a given document differently. In order to retrieve it, a user must guess the category under which it was tagged. This often results in the correct document not being found.
Another problem inherent in this approach is that humans can get lax in their tagging, leading to the large majority of content being tagged under the category 'general', making it difficult to find anything and rendering the whole taxonomy system useless.
Further complications arise when subjects incorporate multiple themes. Should an article about "technology development in Russia within the context of changing foreign policy" be classified as (i) Russian technology, (ii) Russian foreign policy, or (iii) Russian economics?
The decision process is both complex and time-consuming, and it introduces yet more inconsistency, particularly given the sheer number of options available to the person doing the tagging. For example, with over eight hundred tags for general newspaper subjects, choosing even a basic subject description within a reasonable time becomes a challenge.
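The retrieval problem can be illustrated with a minimal Python sketch. The article identifiers and the flat tag store are hypothetical, invented purely for illustration: three comparable articles were each filed under a different one of the candidate categories, and a user who guesses only one category retrieves a third of the relevant material.

```python
# Hypothetical data: three people tag the same kind of article differently.
tagged_articles = {
    "article-1": "Russian technology",
    "article-2": "Russian foreign policy",
    "article-3": "Russian economics",
}

def search_by_category(category):
    """Return the articles filed under exactly this category."""
    return sorted(a for a, tag in tagged_articles.items() if tag == category)

def recall(category):
    """Fraction of the relevant articles found by searching one category."""
    return len(search_by_category(category)) / len(tagged_articles)

found = search_by_category("Russian technology")
print(found, round(recall("Russian technology"), 2))
```

All three articles are relevant to the user, yet an exact-match lookup on a single category cannot surface the other two.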
Interoperability of Tagging
XML is not a set of standard tag definitions; it is a set of definitions that allow you to define tags. This means that if two organizations are going to interoperate and apply the same meaning to the same tags, they have to explicitly agree upon their definitions in advance.
While this may prove possible for small groups of cooperating agents working over public networks, doubts remain as to whether this will scale to support an extended network of industry trading partners.
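As a rough sketch of what "agreeing in advance" involves, consider two hypothetical organizations that describe the same order with different tag names. The XML snippets, tag names and mapping table below are assumptions made for illustration only. Without the hand-built mapping, the two records do not match even though they carry the same information.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: the same order, tagged by two organizations.
ORG_A_XML = "<order><customer>Acme</customer><total>100</total></order>"
ORG_B_XML = "<purchase><buyer>Acme</buyer><amount>100</amount></purchase>"

# The explicit agreement: organization B's child tags mapped to A's tags.
B_TO_A = {"buyer": "customer", "amount": "total"}

def to_record(xml_text, tag_map=None):
    """Flatten the root's child elements into a {tag: text} dictionary."""
    tag_map = tag_map or {}
    root = ET.fromstring(xml_text)
    return {tag_map.get(child.tag, child.tag): child.text for child in root}

record_a = to_record(ORG_A_XML)
record_b_raw = to_record(ORG_B_XML)             # no agreement applied
record_b_mapped = to_record(ORG_B_XML, B_TO_A)  # agreement applied

print(record_a == record_b_raw, record_a == record_b_mapped)
```

Every additional trading partner with its own tag vocabulary needs another such mapping, which is the scaling concern raised above.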
Idea Distancing
Tags also fail to highlight the relationships between subjects. Subjects that are tagged separately, such as wing design/low drag and aerofoil/efficiency, often share vital relationships, a concept known as "idea distancing." These categories clearly overlap, so a user interested in one is likely to be interested in the contents of both. However, a system that does not understand the meanings of the category names has no way to correlate the two.
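A small Python sketch shows why a purely tag-based system cannot see this relationship. The document identifiers are hypothetical. The two category names share no words, so neither exact matching nor simple word overlap links them.

```python
# Hypothetical documents filed under the two categories from the text.
docs = {
    "doc1": "wing design/low drag",
    "doc2": "aerofoil/efficiency",
}

def find_by_tag(tag):
    """Exact-match lookup, as a plain tagging scheme would perform it."""
    return sorted(d for d, t in docs.items() if t == tag)

def shared_words(tag_a, tag_b):
    """Words that the two category names have in common."""
    def words(tag):
        return set(tag.replace("/", " ").split())
    return words(tag_a) & words(tag_b)

hits = find_by_tag("wing design/low drag")
overlap = shared_words("wing design/low drag", "aerofoil/efficiency")
print(hits, overlap)
```

A reader interested in wing design never sees the aerofoil document, because nothing in the tag strings themselves expresses that the subjects are related.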
Not Scalable
In order to be very specific in the retrieval and processing of tagged documents, the number of tags will need to be very high. For example, tag numbers in a company such as Reuters run into the tens of thousands. However, as the number of tags increases, so too do the effort required and the likelihood of misclassification.
High Labor Costs
Taxonomy creation and tagging are still predominantly manual tasks requiring input from librarians, users and IT staff. Making sense of information therefore involves large labor costs.
Autonomy's Approach
Autonomy adds a layer of intelligence to the management of XML, understanding the content and purpose of the tag itself, the related information, or both.