
IDOL Server

At the heart of Autonomy's infrastructure software lies the Intelligent Data Operating Layer (IDOL) Server. The IDOL Server collects indexed data from connectors and stores it in its proprietary structure, optimized for fast processing and retrieval of data. As the information processing layer, IDOL forms a conceptual and contextual understanding of all content in an enterprise, automatically analyzing any piece of information from over 1,000 different content formats and even people's interests. Over 500 operations can be performed on digital content by IDOL, including hyperlinking, agents, summarization, taxonomy generation, clustering, eduction, profiling, alerting and retrieval.

IDOL enables organizations to benefit from automation without losing manual control. This complementary approach allows automatic processing to be combined with a variety of human controllable overrides, offering the best of both worlds and never requiring an "either/or" choice.

IDOL integrates with all known legacy systems, eliminating the need for organizations to cobble together multiple systems to support their disparate components.

"Security is a key differentiator for IDOL. IDOL offers "mapped security" and near real-time synchronization of security entitlements with source content repositories - making it a great fit for highly secure search scenarios"
The Forrester Wave™: Enterprise Search Platforms, Matthew Brown
Further Reference: Autonomy Technology White Paper
Further Reference: IDOL Server Technical Brief

An Open Architecture

With the unabated explosion of information, the world's largest global enterprises rely on Autonomy's technology to meet massive scalability and performance needs. Autonomy's distributed and modular architecture routinely supports millions of documents, hundreds of thousands of users, and hundreds of thousands of transactions on commodity hardware, with the largest installation exceeding 10 billion documents. Moreover, encrypted intermachine and intraprocess communication protocols are woven into the fabric of Autonomy's modular design at a fundamental level, providing secure transmission of information throughout the architecture. In fact, Autonomy has earned military-level certifications for the provision of security technology.

Autonomy Service Oriented Architecture (ASOA)

Autonomy's infrastructure product is heavily predicated on the design principles and foundations of re-use, granularity, modularity, componentization, interoperability and performance. The ASOA is a natural extension of these design principles. All Autonomy modules are discoverable services and interface using SOAP as a standard, providing a huge number of meaning-based functions as a service.

Data communication and transport scenarios are often varied and potentially unpredictable in their restrictions, whether these are operational, based on business requirements or simply dictated by the usage patterns inherent in the organization. Autonomy has developed the Autonomy Enterprise Messaging Bus (AEMB) to manage all messaging across the Autonomy infrastructure and to deliver and build the ASOA. The AEMB is based on TCP/IP and maximizes the use of the available computational resources.


Security

The world's largest and most secure intelligence organizations have deployed Autonomy's Intellectual Asset Protection System (IAS) Connectors to safeguard their most sensitive information assets. Autonomy provides all aspects of security management, including front-end user authentication, back-end entitlement checking and secure encrypted communication between the IDOL Server and its client applications with 128-bit Block Tiny Encryption Algorithm (BTEA). IDOL's mapped security model is the only empirically proven index security model that scales in the enterprise.

"One factor that has set the Autonomy search apart from the crowd for Fineagan is security. Whatever security exists on the application layer, she says, Autonomy acknowledges it."
Carol Fineagan, CIO of EnergySolutions, CIO Magazine, July 2008

There are three general security models currently available:

1. Unmapped Security

Unmapped security is the traditional method used by source repositories and search engines. For every potential match to a given query, a call is made via the native repository's API (e.g. Documentum) to ascertain the access privileges for that particular document. A single query consequently bombards the native repository with document privilege requests as the retrieval system attempts to assemble a relevant results list from thousands of candidate hits. This method presents significant performance and scalability problems.

Autonomy recommends mapped security but also offers the choice between mapped, unmapped and a hybrid of both. Autonomy also supplies plug-in sample code, so that customers, OEMs and partners are able to develop and implement their own form of security plug-in.

2. Cached Security

Cached security is the method of choice for legacy systems. Cached security only marginally relieves the scalability problem of unmapped security by storing results for queries it has already seen. Consequently, when a user repeats a query, the result set can be retrieved from the cache rather than triggering a network-mediated request. However, this approach still relies on calling out across the network directly to the repository for each new query. In addition, it misses potential results, because the result sets stored in its memory are not dynamically updated with new information.

3. Autonomy's Unique IAS Mapped Security

Only Autonomy offers mapped security - a highly configurable, secure, accurate, and fast method for respecting third party security entitlements. IDOL maps the underlying security model in the form of ACL, group, role, protective markings, etc. from all of the underlying repositories directly into the kernel of the IDOL engine itself, and stores the information in an encrypted field. As a result, IDOL does not need to send any requests across the network to the data stores when building up a results list. What the user is allowed to see is assessed "inline" within the IDOL kernel at speeds that exceed the response times of the native repository. Unlike other techniques, the security model is never out of date as the transitional signaling mechanism within the connector layer informs IDOL in real-time of any updates or changes to permissions within the underlying content.
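
The difference between the unmapped and mapped models can be illustrated with a minimal Python sketch. It is not Autonomy code: the Repository and Index classes, the sync_acl hook and the group names are all hypothetical, and the "index" is just a dictionary. The sketch only counts how many permission lookups reach the repository in each model and shows a permission change being pushed into the index.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Stand-in for a source repository that owns the real permissions."""
    acls: dict                 # doc id -> set of groups allowed to read it
    calls: int = 0             # number of permission lookups served

    def can_read(self, doc_id, groups):
        self.calls += 1        # each lookup models one network round trip
        return bool(self.acls[doc_id] & groups)


@dataclass
class Index:
    """Stand-in for a search index that can hold a copy of each ACL."""
    repo: Repository
    mapped_acls: dict = field(default_factory=dict)

    def sync_acl(self, doc_id):
        # Hypothetical connector hook: copy the current ACL into the index.
        self.mapped_acls[doc_id] = set(self.repo.acls[doc_id])

    def filter_unmapped(self, candidates, groups):
        # Unmapped model: one repository call per candidate hit.
        return [d for d in candidates if self.repo.can_read(d, groups)]

    def filter_mapped(self, candidates, groups):
        # Mapped model: entitlements are checked against the index's own copy.
        return [d for d in candidates if self.mapped_acls[d] & groups]


repo = Repository(acls={"d1": {"hr"}, "d2": {"eng"}, "d3": {"eng", "hr"}})
index = Index(repo)
for doc_id in repo.acls:
    index.sync_acl(doc_id)

candidates = ["d1", "d2", "d3"]
unmapped = index.filter_unmapped(candidates, {"eng"})
calls_unmapped = repo.calls                      # one call per candidate
mapped = index.filter_mapped(candidates, {"eng"})
calls_mapped = repo.calls - calls_unmapped       # no repository calls at all

repo.acls["d2"] = {"hr"}                         # permission change at the source
index.sync_acl("d2")                             # connector pushes the update
after_change = index.filter_mapped(candidates, {"eng"})
```

Both filters return the same documents, but only the unmapped one touches the repository at query time. A cached model would sit between the two: it avoids repeat calls for queries it has seen, at the cost of serving stale results until the cache is refreshed.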


Since IDOL's architecture is inherently modular by design, it requires multiple subsystems to communicate with each other, often across insecure networks. All communication between these processes may be encrypted (Secure Sockets Layer), so that packet sniffers who are able to break past a firewall are unable to read the content of traffic between IDOL modules. All of the system's modules are capable of operating in a secure communications mode providing, at minimal processing overhead, the protection of 128-bit encryption. Additionally, IDOL can leverage SSL for both aggregation and querying of content, including access to SSL encrypted sites.
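
As a rough illustration of encrypted communication between processes, the following Python sketch opens a client connection wrapped in TLS, the successor to SSL, using only the standard library. It does not use any Autonomy API: the function names are hypothetical, the host and port would be whatever endpoint a module listens on, and the minimum protocol version is an assumption chosen for the example.

```python
import socket
import ssl


def build_context():
    """Client-side TLS settings; certificate and hostname checks stay on."""
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def open_secure_channel(host, port, timeout=5.0):
    """Connect to a (hypothetical) module endpoint and wrap the socket in TLS."""
    raw = socket.create_connection((host, port), timeout=timeout)
    return build_context().wrap_socket(raw, server_hostname=host)


context = build_context()
```

Traffic sent over the socket returned by open_secure_channel is encrypted on the wire, so a packet sniffer on the network between the two processes sees only ciphertext.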

Further Reference: Autonomy Security White Paper
"IDOL offers a broad set of capabilities for content classification. IDOL supports advanced techniques for automatic content categorization like vector-based taxonomies, concept modeling, and entity extraction"
The Forrester Wave™: Enterprise Search Platforms, Matthew Brown

Scalability and Performance

The management of structured and unstructured content requires a platform that can meet the most rigorous performance requirements and be easily resized commensurate to business needs. IDOL scales to support the largest enterprise-wide and portal deployments in the world, with presence in virtually every vertical market. Since IDOL's scalability is based on its modular, distributed architecture, it can handle massive amounts of data on commodity dual-CPU servers. For instance, only a few hundred entry-level enterprise machines are required to support ChoicePoint's 10 billion record footprint. By comparison, a competitor uses 150,000 machines to handle the same amount of data.

A single IDOL engine can:

Support over 470 million documents on 64-bit platforms
Accurately index in excess of 110 GB/hour with guaranteed index commit times (i.e. how fast an asset can be queried after it is indexed) of sub 5ms
Execute over 2,600 queries per second, with subsecond response times on a single machine with two CPUs when used against 70 million pieces of content, while querying the entire index for relevant information
Support hundreds of thousands of enterprise users, or millions of web users, accessing hundreds of terabytes of data
Save storage space with an overall footprint of less than 15% of the original file size

This enhanced scalability results in hardware cost-savings as well as the ability to address larger volumes of content. Though IDOL scales extremely well on commodity servers, its flexible architecture can take full advantage of massive parallelism, SMP processing capabilities, 64-bit environments (such as Intel Itanium 64-bit architecture), software platforms (such as Solaris 10, Linux 64, Win64, etc.), distributed server farms, and all common forms of external disk arrays (e.g. NAS, SAN) to further improve performance. This flexibility extends to being able to leverage one or a combination of these different environments.

How It Works

Content from various repositories is aggregated by connectors and then either indexed into a single IDOL Server or distributed across multiple IDOL Servers through the Distributed Index Handler (DIH). The DIH can efficiently split and index copious quantities of data into multiple IDOL Server instances, optimizing performance by batching data, replicating all index commands and invoking dynamic load distribution. The DIH can perform data-dependent operations, such as distributing the content by date, which allows for more efficient querying. Performance is augmented by the Distributed Action Handler (DAH), a distribution server that allows the user to distribute action commands, such as querying, to IDOL Servers. Multiple copies of IDOL Servers to which the DAH propagates actions further ensure uninterrupted service in the event of server failure. For flexibility, both the DAH and the DIH can be configured to run in mirroring mode (IDOL Servers are exact copies of each other) or non-mirroring mode (each IDOL Server is configured differently and contains different data). In addition, the Distributed Service Handler (DiSH) component allows effective auditing, monitoring and alerting of all other Autonomy components.
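
The idea of date-based distribution in non-mirroring mode can be sketched in a few lines of Python. This is an illustration only, not the DIH or DAH themselves: the Engine, IndexDistributor and ActionDistributor classes are hypothetical, documents are plain tuples, and "querying" is a substring match. The point is that when content is routed by year at index time, a query restricted to a date range only needs to reach the engines that hold those years.

```python
from datetime import date


class Engine:
    """Stand-in for one index server holding a subset of the documents."""
    def __init__(self, name):
        self.name = name
        self.docs = []                       # list of (doc_id, date, text)

    def index(self, doc):
        self.docs.append(doc)

    def query(self, term):
        return [doc_id for doc_id, _, text in self.docs if term in text.lower()]


class IndexDistributor:
    """Non-mirroring sketch: route each document to an engine chosen by year."""
    def __init__(self, engines_by_year):
        self.engines_by_year = engines_by_year

    def index(self, doc):
        _, doc_date, _ = doc
        self.engines_by_year[doc_date.year].index(doc)


class ActionDistributor:
    """Fan a query out to the engines covering the requested years and merge."""
    def __init__(self, engines_by_year):
        self.engines_by_year = engines_by_year

    def query(self, term, years):
        hits = []
        for year in years:
            engine = self.engines_by_year.get(year)
            if engine is not None:
                hits.extend(engine.query(term))
        return sorted(hits)


engines = {2007: Engine("engine-2007"), 2008: Engine("engine-2008")}
dih = IndexDistributor(engines)
dah = ActionDistributor(engines)
for doc in [
    ("a", date(2007, 3, 1), "Security white paper"),
    ("b", date(2008, 7, 9), "Scalability report"),
    ("c", date(2008, 1, 2), "Security audit"),
]:
    dih.index(doc)

recent_security = dah.query("security", years=[2008])       # one engine queried
all_security = dah.query("security", years=[2007, 2008])    # both engines queried
```

In mirroring mode the routing step would instead send every document to every engine, and the query side could pick any one of them.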

Linear Scalability

Performance and capacity can be doubled by simply replicating the existing machine. This allows scaling predictions to be made without worry about bottlenecks.

Load Balancing

Data is automatically replicated across multiple servers and user requests are load-balanced across these replicas, guaranteeing performance, reducing latency and improving user-experience.

Mirroring / Failover

Automatically generated replicas provide a pool of servers. The primary resource is selected automatically, and if it fails the system switches to a secondary so that service continues uninterrupted.
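
Load balancing and failover over a pool of replicas can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Autonomy's mechanism: the Replica and ReplicaPool classes are hypothetical, requests are rotated round-robin, and a failed replica is simulated by raising ConnectionError.

```python
import itertools


class Replica:
    """Stand-in for one server in the pool."""
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}:{request}"


class ReplicaPool:
    """Round-robin over replicas, skipping any that fail."""
    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._cycle = itertools.cycle(self.replicas)

    def send(self, request):
        # Try each replica at most once per request.
        for _ in range(len(self.replicas)):
            replica = next(self._cycle)
            try:
                return replica.handle(request)
            except ConnectionError:
                continue                     # fail over to the next replica
        raise RuntimeError("no healthy replica available")


primary, secondary = Replica("primary"), Replica("secondary")
pool = ReplicaPool([primary, secondary])
first = pool.send("q1")       # served by the primary
second = pool.send("q2")      # served by the secondary
primary.healthy = False       # simulate a failure of the primary
third = pool.send("q3")       # primary is tried, fails, secondary answers
```

While both replicas are healthy, requests alternate between them. Once the primary is marked as failed, callers still get an answer because the pool falls through to the secondary.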

Distribution

For organizations that are geographically distributed, local replicas are automatically created and utilized where possible. Remote copies are only used when a local system fails, thereby building fault tolerance, the benefits of local performance and a reduction of resource overhead into a single, seamless service.

Adaptive Probabilistic Concept Caching

Frequently used concepts are maintained in memory and query results are returned as quickly and efficiently as possible.

Multi-dimensional Index & Query Throttling

By using a multi-dimensional index to provide valuable information to the distribution components, IDOL precludes bottlenecks and unbalanced peak loads during the indexing and query process.

Autonomy provides prioritized throttling based on:

Time: maximize index/query performance based on the time of day (e.g. work hours)
Location: prioritize activity based on the server landscape
Status: arbitrarily assign prioritized status for processing
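
One way to picture prioritized throttling is a priority queue whose ordering depends on the three factors listed above. The Python sketch below is purely illustrative: the Job fields, the site names, the work-hours window and the numeric weights are all assumptions made for the example, not Autonomy settings.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    kind: str              # "query" or "index"
    site: str              # where the job originated
    urgent: bool = False   # explicitly assigned status


def priority(job, hour, local_site="london"):
    """Lower number = served sooner. The weights are illustrative assumptions."""
    score = 10
    in_work_hours = 9 <= hour < 17
    # Time: favour queries during work hours and indexing outside them.
    if (job.kind == "query") == in_work_hours:
        score -= 3
    # Location: favour jobs from the local site.
    if job.site == local_site:
        score -= 2
    # Status: an explicitly assigned status outranks everything else.
    if job.urgent:
        score -= 10
    return score


def schedule(jobs, hour):
    counter = itertools.count()            # tie-breaker keeps submission order
    heap = [(priority(j, hour), next(counter), j) for j in jobs]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


jobs = [
    Job("index", "tokyo"),
    Job("query", "london"),
    Job("index", "london", urgent=True),
    Job("query", "tokyo"),
]
daytime = [(j.kind, j.site) for j in schedule(jobs, hour=11)]
night = [(j.kind, j.site) for j in schedule(jobs, hour=2)]
```

With these weights, the urgent job is always served first. During the day the two queries come ahead of the remaining indexing job, and at night the order flips so that indexing runs before queries.
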
"We have worked with Autonomy for a number of years due to their ability to offer a next-generation enterprise search platform that doesn't necessitate a trade-off between performance, security and scalability."
Mr. K. Sriram, Senior Vice President, Satyam Consulting and Enterprise Solutions Practice, 2007
Further Reference: Autonomy Performance and Scalability White Paper

Global Language Support

Today's enterprise is dynamic, with operations across the globe that conduct business in numerous local languages. The need to manage content in varied languages has never been more acute. It is no longer ideal for the enterprise to restrict communications to a single language. To this end, Autonomy provides extensive language support within a single platform (IDOL Server).

Autonomy supports over 30 languages on the BBC websites

The IDOL Server develops a statistical understanding of the patterns of any language, such as German, Spanish, Arabic, Japanese and Norwegian, using sophisticated probabilistic modeling and pattern matching techniques. Although IDOL currently supports an impressive 106 languages, it is trivial to add support for more because the technology is fundamentally language independent. IDOL uses nonlinear adaptive digital signal processing that exploits high performance probabilistic modeling to extract a document's meaning. Since the technology is mathematically based and free from linguistic restraints, it need not use any language-dependent parsing or dictionaries to extract meaning. While many enterprise platforms rely on pre-existing knowledge of grammar and linguistic rules, IDOL allows indexed content to dictate the model as it develops a statistical understanding of patterns that occur in the content over time. Hence even slang or industry-specific jargon does not pose a barrier for processing. True to its Bayesian roots, the more content IDOL collects about an industry - e.g. legal terms, pharmaceutical developments - the more understanding it will form about that domain.

A new language can be thought of as simply another type of information from which IDOL needs content to learn. It is therefore possible to mix more than one language in IDOL as long as enough content from those languages is available. Moreover, cross-lingual systems can be configured in which a user can query a subject in English and automatically be provided with similar information in both English and another language such as Spanish. Automatic language detection of incoming content as well as language identification of queries are both offered.

Autonomy's technology is founded on the principle of learning and adapting to an influx of new information. It automates processes that were once labor intensive such as metadata tagging and taxonomy creation. Where other solutions need to be taught new words, phrases or concepts and shown how to categorize them, IDOL can automatically deduce the significance of these new units of meaning, add them to relevant categories and create new categories where necessary.

The core IDOL technology is fully data agnostic and can support all forms of single and multibyte languages, providing equal functionality worldwide, in any idiom, and automatically adapt to the dynamic evolution of language itself. This capability unlocks an enormous potential for a global enterprise, as its brightest talents can now collaborate and provide expertise to everyone in the organization. Fully modal and scalable, Autonomy's software is fundamentally internationalized and is able to function anywhere, at any capacity, without any degradation of functionality. Autonomy provides optional language packs to further enhance localization including stemming, stoplists, transliteration, multiple encoding support and term decomposition.

Further Reference: Autonomy Internationalization White Paper

Cross Platform

Autonomy supports the following operating systems:

AIX
FreeBSD
HP-UX
IBM AS/400
IBM System/38
Intel Solaris
IRIX
Linux
Mac OS X
OpenBSD
OS/400
Sun Solaris
SUSE 9
Tru64
Unixware
Other POSIX
Windows 2000
Windows 2003
Windows XP
Windows Vista

IDOL K2

Following the acquisition of Verity, Autonomy is committed to continuing the development and support of Verity K2. K2 7 unites K2 and the IDOL Server by delivering K2's robust enterprise search capabilities on the IDOL kernel, improving performance and scalability.

Highlights of K2 7 include:

Existing K2 applications run unchanged; no need to migrate
Linear scalability through a multi-threaded, 64-bit architecture
Access to more than 500 advanced IDOL functions simply by adding API calls to their existing K2 application
Out-of-the-box support for automatic hyperlinking, agents and alerting, automatic clustering, Implicit Query, Automatic Query Guidance, video and audio analysis

Figures: Secure query performance (single instance); Index time (single instance)

The Options for K2 Customers

Customers can choose to continue to use their K2 applications as they do today or migrate to the new IDOL-based kernel in v7.

Seamless Upgrade to K2 7

Organizations can continue using their K2 application as it is today, but with the benefits of enhanced scalability and performance.

Users Work with Familiar User Interfaces

The supported client-side APIs protect the K2 users' existing investments in user interface development. With API support for C, PHP, COM/+, HTTP, Java, and .NET, all users can continue to work with familiar user interfaces.

Investments in Queries are Protected

All of the K2 users' custom queries will work transparently with K2 7. All Verity Query Language (VQL) calls remain operational; no changes are required. Performance will also improve since queries will be executed natively.

Accelerate Deployment with Sentient Configurations Wizard

K2 7 is deployed rapidly with an intuitive and interactive wizard. This "environment aware" framework can step through the scan, ingest and upgrade of an existing K2 footprint for direct use in K2 7. In addition, the upgrade of K2 database files is native and automatic. K2 7 can elect to query existing collections or automatically import all relevant metadata and security information from within a collection. In all cases, the collections can automatically be rendered into the new K2 7 format to allow full use of the extended operation set that the upgrade provides.

Further Reference: IDOL K2 Roadmap White Paper