InfoQ

Compass 2.0: Simplification, integration, and performance improvements

Posted by Ryan Slobojan on May 14, 2008 12:00 PM

Community: Java
Topics: Search
Tags: GigaSpaces, Terracotta, Coherence, Lucene, ORM, Compass

The Compass project, an open source project based on Lucene which aims to simplify the integration of search into Java applications, recently released version 2.0. InfoQ spoke with Compass founder Shay Banon to learn more about this release and about what Compass provides to the Java community.

Banon identified the major features of this release as:

  • Simplification of O/R Mapping (ORM) integrations -- All of the integration features, such as real-time mirroring of ORM changes in the search index and complete mapping-based indexing of the database, are now available through configuration properties in the Hibernate/JPA configuration file
  • Distributed data grid integration -- Integration with GigaSpaces, Terracotta and Coherence is now supported, allowing the Lucene index to be stored as part of a data grid while remaining transparently usable by Lucene libraries and tools
  • Searchable data grid capability -- Changes which occur on the data grid are automatically mirrored to the Lucene index through the Object to Search Engine mapping and the integrations with Coherence CacheStore and the GigaSpaces mirror service
  • Performance improvements -- Major internal enhancements in Compass, combined with the improvements in Lucene 2.3, have resulted in a major increase in Compass performance
  • Easy upgrade from Compass 1.2 -- The main API remains the same, with configuration tweaks and minor API changes documented in the upgrade notes. A reindex is also needed due to internal changes, but overall the upgrade process should be fairly simple
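
The mirroring idea behind the ORM and data grid integrations listed above can be illustrated with a small, language-neutral sketch. This is not the Compass API: `InvertedIndex`, `MirroredStore` and `mirror_to` are hypothetical names, and the toy in-memory index stands in for Lucene. It only shows the general pattern in which every save or delete on the data store is forwarded to the search index, so the two stay in sync.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy in-memory index (stand-in for Lucene): term -> set of document ids."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._terms_by_doc = {}

    def index(self, doc_id, text):
        # Re-indexing replaces any earlier version of the document.
        self.remove(doc_id)
        terms = set(text.lower().split())
        self._terms_by_doc[doc_id] = terms
        for term in terms:
            self._postings[term].add(doc_id)

    def remove(self, doc_id):
        for term in self._terms_by_doc.pop(doc_id, set()):
            self._postings[term].discard(doc_id)

    def search(self, term):
        return sorted(self._postings.get(term.lower(), set()))


class MirroredStore:
    """Hypothetical stand-in for an ORM session or data grid that notifies listeners."""

    def __init__(self):
        self._rows = {}
        self._listeners = []

    def add_listener(self, listener):
        self._listeners.append(listener)

    def save(self, doc_id, text):
        self._rows[doc_id] = text
        for listener in self._listeners:
            listener("save", doc_id, text)

    def delete(self, doc_id):
        self._rows.pop(doc_id, None)
        for listener in self._listeners:
            listener("delete", doc_id, None)


def mirror_to(index):
    """Build a listener that applies store events to the index."""
    def on_event(kind, doc_id, text):
        if kind == "save":
            index.index(doc_id, text)
        else:
            index.remove(doc_id)
    return on_event


index = InvertedIndex()
store = MirroredStore()
store.add_listener(mirror_to(index))

store.save(1, "Compass simplifies search")
store.save(2, "Lucene powers search")
store.save(1, "Compass simplifies indexing")  # update replaces the old terms
store.delete(2)                               # delete is mirrored as well

print(index.search("search"))    # []
print(index.search("indexing"))  # [1]
```

The application code only ever talks to the store; the index is updated as a side effect of the registered listener, which is the property the release notes describe as real-time mirroring.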

Compass also has a new project website at http://www.compass-project.org, and a complete listing of changes is available.

Banon also described the Compass core features, and how Compass compares with Solr, Nutch and base Lucene:

Compass, at its core, aims to simplify the integration of search into any Java application. Compass tries to simplify the API when working with a search engine. The API should be very familiar for people who are used to ORM libraries. Another main feature of Compass is the ability to easily map a Java object model into the search engine as well as other formats such as XML and Map-like structures. On top of that, Compass steps even further and provides seamless integration with ORM libraries, data grids, and others.

Regarding Lucene, Compass is built on top of Lucene. All of Lucene's features are exposed and can be used with Compass, but Compass tries to simplify its usage, especially within your typical Java application. Regarding Solr, I guess it also aims at simplifying Lucene, but in a different way. It exposes an HTTP service for indexing and searching, but I heard that an "embedded" version of it will be available as well. I guess the main difference stems from a different viewpoint on how search is integrated into an application. I will just note that creating an HTTP search service on top of Compass is very simple and many users have done just that.
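
To make the idea of mapping an object model into a search engine behind an ORM-style API more concrete, here is a minimal conceptual sketch. It is an assumption-laden illustration, not Compass code: `searchable`, `Book` and `SearchSession` are hypothetical names, and the `field:term` query form only loosely echoes Lucene query syntax. Fields marked as searchable are tokenized into an index on `save`, and `find` returns the original objects rather than raw index entries.

```python
from dataclasses import dataclass, field, fields


def searchable(**kwargs):
    """Mark a dataclass field as indexed (hypothetical helper)."""
    return field(metadata={"searchable": True}, **kwargs)


@dataclass
class Book:
    id: int
    title: str = searchable()
    summary: str = searchable()
    price: float = 0.0  # not marked, so never indexed


class SearchSession:
    """Hypothetical ORM-like session: save objects, find them by term."""

    def __init__(self):
        self._objects = {}   # id -> object
        self._postings = {}  # (field name, term) -> set of ids

    def save(self, obj):
        self.delete(obj.id)  # saving again replaces the earlier version
        self._objects[obj.id] = obj
        for f in fields(obj):
            if f.metadata.get("searchable"):
                for term in str(getattr(obj, f.name)).lower().split():
                    self._postings.setdefault((f.name, term), set()).add(obj.id)

    def delete(self, obj_id):
        if self._objects.pop(obj_id, None) is not None:
            for ids in self._postings.values():
                ids.discard(obj_id)

    def find(self, query):
        """Accept 'term' (any field) or 'field:term'."""
        name, _, term = query.rpartition(":")
        term = term.lower()
        hits = set()
        for (f_name, f_term), ids in self._postings.items():
            if f_term == term and name in ("", f_name):
                hits |= ids
        return [self._objects[i] for i in sorted(hits)]


session = SearchSession()
session.save(Book(1, "Lucene in Action", "A guide to search", 39.0))
session.save(Book(2, "Hibernate Basics", "Mapping objects with Lucene search", 29.0))

print([b.id for b in session.find("lucene")])        # [1, 2]
print([b.id for b in session.find("title:lucene")])  # [1]
```

The mapping lives on the class itself, so calling code never builds index documents by hand, which is the kind of simplification over working with a search engine directly that the quote describes.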

When asked about future plans for Compass, Banon stated that most of the features are driven by user demand. Potential future ideas include looking at different indexing formats such as JSON, more comprehensive and full-featured data grid integration to enable colocated indexing and searching, and a UI layer search integration to create a better out-of-the-box experience. Banon also added that all Compass feedback and help is greatly appreciated.

1 comment

Kewl! by ARI ZILKA, posted May 14, 2008 10:09 PM

    Great work Shay! --Ari
