What is a Solr core

A Solr core represents a Lucene index together with the set of configurations that control how Solr accesses and uses that index. It's the main object you interact with when working with Solr: you create it, configure it, index data into it, and perform queries on it. In other words, a Solr core is a Lucene index wrapped in Solr-specific configuration.

In Solr 4.4 and earlier, you define Solr cores in solr.xml. This is a centralized approach: a single file lists every core.

The solr.xml file can be found in the Solr home directory and looks like the following. Its main content is the list of core elements.

 
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1" host="${host:}" hostPort="${jetty.port:}" hostContext="${hostContext:}" zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" />    
  </cores>
</solr>
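
To make the structure of this file concrete, here is a minimal Python sketch that reads a legacy-format solr.xml and lists the cores it registers. The `list_cores` helper and the trimmed `SOLR_XML` string are illustrative assumptions, not part of Solr; only the standard-library XML parser is used.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the legacy solr.xml shown above (assumed sample input).
SOLR_XML = """
<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>
"""

def list_cores(solr_xml_text):
    """Return (name, instanceDir) pairs for every core element under cores."""
    root = ET.fromstring(solr_xml_text)
    return [(core.get("name"), core.get("instanceDir"))
            for core in root.findall("./cores/core")]

print(list_cores(SOLR_XML))  # [('collection1', 'collection1')]
```

Each pair is a core name plus the directory, relative to the Solr home, that holds that core's files.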
 

A core consists of index data and a set of configuration files. Simply put, a Solr core is a set of indexed documents plus the configuration that describes them.

Theoretically you can put any kind of document into a single index, because the Solr schema can be made very flexible (for example with dynamic fields), much as a MongoDB collection can hold documents of different shapes. In practice you put different kinds of documents into different cores, which makes both management and querying more convenient.

Suppose you have a set of web pages crawled from the web, and you also have a set of PDF documents on the file system. You will usually want to search them separately.

For example, in Google you can search web pages, search books, or search only blog content. Each of these kinds of search should have its own index data.

You can think of a Solr core as a table in an RDBMS: the documents in it have a similar structure, although, unlike table rows, they don't have to be identical.

Solr core and Solr instance

A Solr instance can contain many Solr cores. A Solr instance is much like a database that contains many tables.

Create new core

To create a new core, you prepare a directory for the core and register it in solr.xml by adding a core element inside the cores element:

 
<core name="collection2" instanceDir="collection2" />
 

A core directory should contain a conf directory and a data directory. The data directory holds the index data, and conf contains configuration files such as schema.xml and solrconfig.xml.
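
The two steps above (prepare the directory, then register the core) can be sketched in Python as follows. This is a minimal sketch, assuming the legacy solr.xml format shown earlier; `add_core` is a hypothetical helper rather than a Solr API, and the demo runs against a temporary directory instead of a real Solr home.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def add_core(solr_home, name):
    """Create <solr_home>/<name>/{conf,data} and register the core in solr.xml."""
    home = Path(solr_home)
    core_dir = home / name
    (core_dir / "conf").mkdir(parents=True, exist_ok=True)
    (core_dir / "data").mkdir(parents=True, exist_ok=True)

    solr_xml = str(home / "solr.xml")
    tree = ET.parse(solr_xml)
    cores = tree.getroot().find("cores")
    # Skip registration if a core with this name is already listed.
    if not any(c.get("name") == name for c in cores.findall("core")):
        ET.SubElement(cores, "core", {"name": name, "instanceDir": name})
        tree.write(solr_xml)
    return core_dir

# Demo against a throwaway Solr home, so no real installation is touched.
with tempfile.TemporaryDirectory() as home:
    Path(home, "solr.xml").write_text(
        '<solr persistent="true"><cores adminPath="/admin/cores">'
        '<core name="collection1" instanceDir="collection1" />'
        '</cores></solr>'
    )
    core_dir = add_core(home, "collection2")
    print(sorted(p.name for p in core_dir.iterdir()))  # ['conf', 'data']
    print(Path(home, "solr.xml").read_text())
```

The sketch leaves conf empty: schema.xml and solrconfig.xml still have to be copied or written into it before the core is usable.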