[Quick Start] Lucene

Lucene is a full-text search library written in Java. It works by adding content to a full-text index and then letting you run queries against that index, returning results either ranked by relevance to the query or sorted by an arbitrary field such as a document’s last-modified date. The content you add to Lucene can come from various sources, such as a SQL/NoSQL database, a filesystem, or even websites.

Searching & Indexing:

Lucene is fast because it searches an index instead of scanning the text directly. It uses an inverted index, which turns page-centric lookup (page -> words) into keyword-centric lookup (word -> pages). This is similar to looking up a keyword in the index at the back of a book instead of reading every page to find it.
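To make the word -> pages idea concrete, here is a minimal sketch of an inverted index in plain Java. It only illustrates the concept and is not how Lucene stores its index internally; the class and method names (InvertedIndexSketch, addPage, pagesFor) are made up for this example.

```java
import java.util.*;

// Toy inverted index: maps each word to the sorted set of page ids that contain it.
// Illustration only; names are hypothetical and unrelated to Lucene's API.
class InvertedIndexSketch {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Index one page: split the text into lowercase words and record the page under each word.
    void addPage(int pageId, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (word.isEmpty()) continue;
            postings.computeIfAbsent(word, k -> new TreeSet<>()).add(pageId);
        }
    }

    // Look a word up directly instead of scanning every page.
    SortedSet<Integer> pagesFor(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addPage(1, "Lucene is a search library");
        index.addPage(2, "An index makes search fast");
        System.out.println(index.pagesFor("search")); // prints [1, 2]
        System.out.println(index.pagesFor("lucene")); // prints [1]
    }
}
```

The lookup cost depends on the number of distinct words rather than on the total amount of text, which is the reason the back-of-the-book analogy holds.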

A Document is the unit of indexing and search. It consists of one or more fields (key-value pairs). Indexing involves adding Documents to an IndexWriter, and searching involves retrieving Documents from the index via an IndexSearcher.
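The document-and-fields model can be sketched without Lucene itself. The toy class below treats a document as a map of field names to text, records each document under a field:term key when it is added, and returns the stored documents on search. Every name here (FieldIndexSketch, addDocument, search) is hypothetical; Lucene's real Document, IndexWriter and IndexSearcher classes do far more, including analysis and relevance scoring.

```java
import java.util.*;

// Toy model of documents with named fields, indexed per (field, term).
// Names are hypothetical; this does not use or mirror Lucene's API.
class FieldIndexSketch {
    private final List<Map<String, String>> stored = new ArrayList<>();
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // "Indexing": store the document and record its id under every field:term key.
    int addDocument(Map<String, String> fields) {
        int docId = stored.size();
        stored.add(fields);
        for (Map.Entry<String, String> field : fields.entrySet()) {
            for (String term : field.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (term.isEmpty()) continue;
                List<Integer> ids = postings.computeIfAbsent(field.getKey() + ":" + term, k -> new ArrayList<>());
                // Skip the id if this document was already recorded for this key.
                if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) ids.add(docId);
            }
        }
        return docId;
    }

    // "Searching": return the stored documents whose given field contains the term.
    List<Map<String, String>> search(String field, String term) {
        List<Map<String, String>> hits = new ArrayList<>();
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        for (int docId : postings.getOrDefault(key, Collections.emptyList())) {
            hits.add(stored.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        FieldIndexSketch index = new FieldIndexSketch();
        Map<String, String> doc = new HashMap<>();
        doc.put("title", "Lucene quick start");
        doc.put("body", "Indexing and searching text");
        index.addDocument(doc);
        System.out.println(index.search("title", "lucene").size()); // prints 1
        System.out.println(index.search("body", "lucene").size());  // prints 0
    }
}
```

Keying the postings by field as well as term is what lets a search be restricted to one field, as the second lookup shows.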

[Figure: index process]

Query:

Queries are written in Lucene's own mini-language. The query language lets the user specify which field(s) to search, which fields to give more weight to (boosting), and how to combine terms with boolean operators (AND, OR, NOT), among other features.
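In the classic query syntax, a query such as title:lucene AND body:java restricts each term to a field, and a suffix like title:lucene^2 boosts a term. As a rough sketch of what the boolean operators mean, they can be viewed as set operations over the document ids matched by each term. The code below assumes those id sets are already known; the ids, variable names and helper methods are invented for illustration, and scoring and boosting are not modelled.

```java
import java.util.*;

// Sketch: AND, OR and NOT evaluated as set operations over matching document ids.
// Illustration only; Lucene's query classes and scoring are not modelled here.
class BooleanQuerySketch {
    // Documents matching both terms.
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.retainAll(b);
        return out;
    }

    // Documents matching either term.
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.addAll(b);
        return out;
    }

    // Documents matching the first term but not the second (a AND NOT b).
    static Set<Integer> andNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> out = new TreeSet<>(a);
        out.removeAll(b);
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical document ids matched by two field-specific terms.
        Set<Integer> titleLucene = new TreeSet<>(Arrays.asList(1, 2, 3)); // title:lucene
        Set<Integer> bodyJava = new TreeSet<>(Arrays.asList(2, 3, 4));    // body:java

        System.out.println(and(titleLucene, bodyJava));    // prints [2, 3]
        System.out.println(or(titleLucene, bodyJava));     // prints [1, 2, 3, 4]
        System.out.println(andNot(titleLucene, bodyJava)); // prints [1]
    }
}
```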

[Figure: search process]

More on how to use Lucene in programs can be found on tutorialspoint.

 
