
Apache Lucene and Lucene.Net – Full Text Search Servers

by Jonathan Allen on Nov 06, 2008

Ten years ago, relying on open source projects was unimaginable in most Windows shops. These days, .NET programmers are awakening to the world of enterprise-class software developed and proven on the Java platform. Today we look at two popular full-text search engines, Apache Lucene and Lucene.Net.

Apache Lucene and its port, Lucene.Net, are battle-tested products used to provide search capabilities for big-name sites such as Wikipedia, CNET, and Monster.com. With references like that, their capabilities and future are not in doubt.

Lucene is not a crawling search engine, nor does it automatically index content. The text of each document has to be extracted before it is loaded into a Lucene index. The standard pattern is to instantiate an Analyzer, open an IndexWriter, and then add the documents one by one. Once that is done, the index can optionally be optimized before it is closed and the changes are committed. This process is probably more hands-on than most developers are used to, but it gives you a lot of flexibility over what data is indexed.
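To make the shape of that workflow concrete, here is a minimal sketch in Python using a toy in-memory inverted index. It is not Lucene's API: `analyze`, `ToyIndexWriter`, and its methods are hypothetical names invented for illustration, and "optimize" is modeled, as an assumption, as merging per-document segments into a single one.

```python
import re


def analyze(text):
    """Toy analyzer: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndexWriter:
    """Hypothetical stand-in for an index writer; not Lucene's IndexWriter."""

    def __init__(self, analyzer):
        self.analyzer = analyzer
        self.segments = []   # each segment maps term -> set of document ids
        self.doc_count = 0
        self.closed = False

    def add_document(self, text):
        """Index one already-extracted document as its own small segment."""
        if self.closed:
            raise ValueError("writer is closed")
        doc_id = self.doc_count
        self.doc_count += 1
        segment = {}
        for term in self.analyzer(text):
            segment.setdefault(term, set()).add(doc_id)
        self.segments.append(segment)
        return doc_id

    def optimize(self):
        """Optional step: merge all segments into a single segment."""
        merged = {}
        for segment in self.segments:
            for term, ids in segment.items():
                merged.setdefault(term, set()).update(ids)
        self.segments = [merged]

    def close(self):
        """Stop accepting documents and hand back the finished segments."""
        self.closed = True
        return self.segments


# The caller supplies the extracted text; nothing is crawled automatically.
writer = ToyIndexWriter(analyze)
for text in ["Lucene is a search library", "Lucene.Net is a port of Lucene"]:
    writer.add_document(text)
writer.optimize()
segments = writer.close()
```

The point of the sketch is the division of labor: the application decides which text goes in, and the writer only tokenizes and stores it.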

Searching can be done via an object model, with the query built up term by term. Alternatively, a plain text search string, perhaps entered by an end user, can be parsed and executed. Developers using .NET 3.5 and later also have a third option, LINQ to Lucene. The LINQ to Lucene project page has a nice map between Lucene's search syntax and the corresponding LINQ to Lucene syntax.
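The difference between the first two approaches can be sketched with a toy query model in Python. Everything here is hypothetical and is not Lucene's API or query grammar: `Term`, `And`, `Or`, `parse`, and `search` are illustrative names, the parser understands only the words `AND` and `OR`, and the index is a hand-written dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    word: str


@dataclass(frozen=True)
class And:
    clauses: tuple


@dataclass(frozen=True)
class Or:
    clauses: tuple


def parse(query_string):
    """Toy parser: 'a AND b OR c' becomes Or(And(a, b), And(c))."""
    groups = []
    for group in query_string.split(" OR "):
        terms = tuple(Term(w.strip().lower()) for w in group.split(" AND "))
        groups.append(And(terms))
    return Or(tuple(groups))


def search(query, index):
    """Evaluate a query tree against an index of term -> set of doc ids."""
    if isinstance(query, Term):
        return set(index.get(query.word, set()))
    results = [search(clause, index) for clause in query.clauses]
    if not results:
        return set()
    if isinstance(query, And):
        return set.intersection(*results)
    return set.union(*results)


# Hand-written toy index (assumed data, for illustration only).
index = {"lucene": {0, 1, 2}, "java": {0}, "net": {1}, "linq": {2}}

# Option 1: build the query term by term through the object model.
built = Or((And((Term("lucene"), Term("java"))), And((Term("linq"),))))

# Option 2: parse a plain text string, such as one typed by an end user.
parsed = parse("Lucene AND Java OR LINQ")

hits = search(parsed, index)
```

Both routes end in the same query tree, so the choice is mostly about where the query comes from: code that knows the structure up front, or text that has to be interpreted.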

If you want to try it out, Andrew Smith has an Introduction to Lucene.NET. And regardless of whether you choose the .NET or Java version, also take a look at Erik Hatcher's Lucene Intro.

InfoQ.com and all content copyright © 2006-2014 C4Media Inc.