What is Solr

Solr is a web-based search engine application that you can use to build your own full-text search REST web service. It is designed to run as a standalone web application, and your application interacts with the Solr server over HTTP.

It is the supporting service behind the search box in your application. Search is a fundamental component of most web applications, and even of many desktop ones. Google is a general search engine whose target is the whole Web, but there are also a lot of vertical search applications. An eCommerce site like Amazon, for example, needs to provide search over its products and books.

Other vertical search applications include job search, hotel search, rental housing search, enterprise search, and even search over your own private documents, such as your recipes.

Lucene, the core engine of Solr

Solr is a wrapper around Lucene, a full-text search library. Lucene provides the core features such as indexing, querying, and scoring. Solr adds a user-friendly layer on top for high-level applications: a web administration GUI, XML-based configuration files, and a REST API. So you don't have to write Java code against Lucene to use it.

Instead of writing Java code to manipulate the index in your application, you exchange HTTP requests and responses between your application and the Solr server. Solr handles the complexity and calls the Lucene API for you. You configure Solr through high-level abstractions to make it do what you want.
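As a rough illustration of that HTTP exchange, the sketch below builds a query URL for Solr's /select handler. The host, port, and the core name "recipes" are assumptions made up for this example, and the search() helper only works if a Solr server is actually running at that address.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed for illustration: a local Solr server and a core named "recipes".
SOLR_BASE = "http://localhost:8983/solr"
CORE = "recipes"


def build_select_url(query, rows=10):
    """Return the URL of a /select request for the given query string."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{SOLR_BASE}/{CORE}/select?{urlencode(params)}"


def search(query, rows=10):
    """Send the query to Solr and return the parsed JSON response.

    Needs a running Solr server, so it is not called in this sketch.
    """
    with urlopen(build_select_url(query, rows)) as response:
        return json.load(response)


print(build_select_url("title:pancake"))
# http://localhost:8983/solr/recipes/select?q=title%3Apancake&rows=10&wt=json
```

Your application never touches the Lucene index directly. It only builds a URL like this one and reads back the JSON response.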

ElasticSearch is a product very similar to Solr. Both are built on Lucene but provide different interfaces.

Solr was the first widely adopted Lucene wrapper, although ElasticSearch seems to draw more attention these days. Still, Solr is a powerful and mature solution with active development and an active user community.

Where and when Solr started

Solr started in 2004 and was later contributed to the Apache Software Foundation. New versions keep coming out, and new features and improvements are added constantly. At the time of writing, the newest version is Solr 6.

It was originally designed for CNET Networks, which needed a new search engine to replace its old one. It was made open source in January 2006, when CNET donated the code to Apache.

The Lucene and Solr projects have since been merged, so Solr can always use the newest Lucene.

What Solr can do

Solr can perform full-text search on all kinds of data: web pages, PDF and Word documents, product records, and more, for anything from a small content management system to a large corporation. It is reliable, scalable, and fault tolerant, and it supports distributed index storage and load balancing.

It can index data that you send in XML or JSON format over HTTP, and it can index data directly from databases like MySQL. It can also extract text from HTML, stripping the HTML tags for you. Each of these features needs only a few lines of XML configuration.
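To make the JSON-over-HTTP path concrete, here is a minimal sketch that prepares a POST request for Solr's /update handler. The server address, the core name "recipes", and the field names "title" and "body" are assumptions for illustration. In a real setup the fields have to match your own schema.

```python
import json
from urllib.request import Request

# Assumed for illustration: a local Solr server and a core named "recipes".
# commit=true asks Solr to make the documents searchable right away.
SOLR_UPDATE_URL = "http://localhost:8983/solr/recipes/update?commit=true"


def build_update_request(docs):
    """Return a POST request that sends a list of documents as JSON."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical documents; "title" and "body" are made-up field names.
docs = [
    {"id": "1", "title": "Pancakes", "body": "Flour, eggs and milk."},
    {"id": "2", "title": "Omelette", "body": "Eggs, butter and salt."},
]
request = build_update_request(docs)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. That step is left out here because it needs a running Solr server.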

It can be used in a public web application or privately within an enterprise.

Solr and NoSQL

NoSQL databases, MongoDB for example, are usually used to process big data. The data is usually unstructured, with no strong relations between records.

Full-text search is a perfect domain for NoSQL. The data is a large corpus of text-centric documents (millions of them, for example). It comes from various sources that can't be easily normalized, and the system needs to serve a heavy load of user queries.

Solr is a NoSQL technology that handles problems like this. It accepts text data, builds an index for it, and returns results ranked by relevance to the keywords in the query.
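The idea of indexing text and ranking by relevance can be sketched with a toy inverted index. This is not how Solr or Lucene are implemented. Their text analysis and scoring are far more sophisticated, and the plain term counting below is only a stand-in chosen for this example.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase terms (a stand-in for real text analysis)."""
    return text.lower().split()


def build_index(docs):
    """Map each term to the documents containing it, with occurrence counts."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index


def search(index, query):
    """Return document ids ordered by how often they contain the query terms."""
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Highest score first; ties are broken by document id.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


docs = {
    "a": "apple pie with apple sauce",
    "b": "apple juice",
    "c": "banana bread",
}
index = build_index(docs)
print(search(index, "apple pie"))  # ['a', 'b']
```

Document "a" ranks first because it matches both query terms and mentions "apple" twice, while "c" does not appear at all because it contains none of the keywords.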


For users and developers, Solr can be used to implement many features, from the search box on your web page, to generating related posts for your blog posts, to a complex search interface equipped with auto-suggestion, spelling correction, faceting, geospatial search, and so on.