[Quick Start] Lucene

Lucene is a full-text search library written in Java. You add content to a full-text index and then run queries against that index. Results are returned either ranked by relevance to the query or sorted by an arbitrary field, such as a document’s last-modified date. The content you add to Lucene can come from various sources, such as a SQL/NoSQL database, a filesystem, or even websites.

Searching & Indexing:

Lucene is fast because it searches an index instead of scanning the text directly. It uses an inverted index, which turns page-centric lookup (page -> words) into keyword-centric lookup (word -> pages). This is similar to looking up a keyword in the index at the back of a book instead of reading every page to find it.
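The book-index analogy can be sketched in a few lines of plain Java. This is only a toy model of the idea, not Lucene's actual data structure: the class name `InvertedIndexSketch` and its methods are made up for illustration, and the tokenization (lowercase, split on non-word characters) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy inverted index: maps each word to the sorted set of page ids containing it. */
final class InvertedIndexSketch {
    // word -> ids of the pages that contain it
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    /** Splits the page text into lowercase words and records the page under each word. */
    public void addPage(int pageId, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, w -> new TreeSet<>()).add(pageId);
            }
        }
    }

    /** Looks the word up directly; no page text is scanned at query time. */
    public List<Integer> pagesContaining(String word) {
        Set<Integer> pages = postings.get(word.toLowerCase(Locale.ROOT));
        return pages == null ? new ArrayList<>() : new ArrayList<>(pages);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addPage(1, "Lucene is a search library");
        index.addPage(2, "An inverted index maps words to pages");
        index.addPage(3, "Search the index, not the pages");
        System.out.println(index.pagesContaining("index"));  // [2, 3]
        System.out.println(index.pagesContaining("search")); // [1, 3]
    }
}
```

The cost of the work moves to indexing time: each page is tokenized once when it is added, and a query becomes a single map lookup.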

A Document is the unit of search and indexing. It consists of one or more fields (key-value pairs). Indexing involves adding Documents to the index through an IndexWriter, and searching involves retrieving Documents from the index through an IndexSearcher.
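To make the roles concrete, here is a rough plain-Java model of the add-then-retrieve flow. It deliberately does not use Lucene's API: `ToyDocumentIndex`, `addDocument`, and `search` are hypothetical names, a document is modeled as a simple map of field names to text, and the single class plays the part that IndexWriter and IndexSearcher play separately in Lucene.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of fielded documents; NOT Lucene's API, only the same idea in miniature. */
final class ToyDocumentIndex {
    // "field:term" -> ids of the documents whose field contains the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // document id -> its fields, so a hit can be returned as a whole document
    private final List<Map<String, String>> documents = new ArrayList<>();

    /** Indexing: store the document and register every term of every field. */
    public int addDocument(Map<String, String> fields) {
        int docId = documents.size();
        documents.add(fields);
        for (Map.Entry<String, String> field : fields.entrySet()) {
            for (String term : field.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    String key = field.getKey() + ":" + term;
                    postings.computeIfAbsent(key, k -> new TreeSet<>()).add(docId);
                }
            }
        }
        return docId;
    }

    /** Searching: return the documents whose given field contains the term. */
    public List<Map<String, String>> search(String field, String term) {
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        Set<Integer> ids = postings.getOrDefault(key, Collections.emptySet());
        List<Map<String, String>> hits = new ArrayList<>();
        for (int docId : ids) {
            hits.add(documents.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyDocumentIndex index = new ToyDocumentIndex();
        index.addDocument(Map.of("title", "Lucene in Action", "body", "Indexing and searching text"));
        index.addDocument(Map.of("title", "Cooking Basics", "body", "Nothing about Lucene here"));
        // The same term matches different documents depending on the field searched.
        System.out.println(index.search("title", "lucene").get(0).get("title")); // Lucene in Action
        System.out.println(index.search("body", "lucene").get(0).get("title"));  // Cooking Basics
    }
}
```

Keying the postings by field and term together is what lets a query target one field without touching the others.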



Queries are written in Lucene's own query mini-language. It lets the user specify which field(s) to search, which fields to weight more heavily (boosting), and how to combine terms with boolean operators (AND, OR, NOT), among other features.
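In the classic query syntax, a query such as `title:lucene^2 AND body:search AND NOT body:solr` combines all three features: field selection, a boost on the title field, and boolean operators. The sketch below models what those operators mean in terms of sets of matching document ids. It is a simplified illustration, not how Lucene evaluates or scores queries: the class `BooleanQuerySketch`, its methods, and the additive scoring rule are assumptions made for this example.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of boolean operators and boosting over sets of matching document ids. */
final class BooleanQuerySketch {
    /** AND: documents present in both result sets. */
    public static Set<Integer> and(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.retainAll(right);
        return result;
    }

    /** OR: documents present in either result set. */
    public static Set<Integer> or(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.addAll(right);
        return result;
    }

    /** NOT: documents in the left set that are absent from the right set. */
    public static Set<Integer> andNot(Set<Integer> left, Set<Integer> right) {
        Set<Integer> result = new TreeSet<>(left);
        result.removeAll(right);
        return result;
    }

    /** Boosting, simplified: each matching field adds its weight (default 1.0) to the score. */
    public static double score(Map<String, Double> fieldBoosts, Set<String> matchingFields) {
        double total = 0.0;
        for (String field : matchingFields) {
            total += fieldBoosts.getOrDefault(field, 1.0);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical per-term matches, as an inverted index would return them.
        Set<Integer> lucene = Set.of(1, 2, 4);
        Set<Integer> search = Set.of(2, 3, 4);
        Set<Integer> solr = Set.of(4);
        System.out.println(and(lucene, search));               // [2, 4]
        System.out.println(or(lucene, search));                // [1, 2, 3, 4]
        System.out.println(andNot(and(lucene, search), solr)); // [2]

        // A title match counts double, so a document matching in both fields outranks a body-only match.
        Map<String, Double> boosts = Map.of("title", 2.0);
        System.out.println(score(boosts, Set.of("title", "body"))); // 3.0
        System.out.println(score(boosts, Set.of("body")));          // 1.0
    }
}
```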


More on how to use Lucene in programs can be found on tutorialspoint.