How to sort Lucene search results

By default, Lucene sorts the results of any query by score, and this is what the user wants in most cases. But there are exceptions. Suppose the user wants to search news: freshness is then more important than relevance, and you may need to sort the matched documents by date in descending order.

In fact, Lucene can sort the matched documents by any criteria you specify; sorting by score is just the default case. Even the calculation of the scores is optional if you don't need them for sorting.

The overloaded search method

To sort by custom criteria, you need to use an overloaded search method of IndexSearcher:

 
  public final TopFieldDocs search(
      Query query,
      int n,
      Sort sort,
      boolean doDocScores,
      boolean doMaxScore
) throws IOException 
 

You should already be familiar with the first and second parameters, so we focus on the remaining ones.

Sort

The Sort object encapsulates the sorting criteria for matched documents. To make a field sortable, the field should not be analyzed, which means the field contains exactly one term. The value of the term is either a number or a string, and it is sorted numerically or alphabetically. Storing the field is optional. In Lucene 5, the sort field is added as a doc values field instead, as in the date example later in this post.

See also: Lucene index option analyzed vs not analyzed

Something like this:

 
document.add (new Field ("byNumber", Integer.toString(x), Field.Store.NO, Field.Index.NOT_ANALYZED));
 

doDocScores

Whether to compute the score for matched documents. When you sort by other criteria, you may not need the score any more, and disabling it can improve performance.

Enable it if you want to combine the score with other sorting criteria, for example, ordering first by date and then by relevance.
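
To make "first by date, then by relevance" concrete, here is a minimal sketch in plain Java that models a two-key ordering with a Comparator. It is not Lucene code: the Hit class, its sample dates and its scores are hypothetical and exist only for illustration. In Lucene terms, this ordering would correspond to listing the date SortField before SortField.FIELD_SCORE when building the Sort.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Plain-Java model of a two-key sort: date descending, then score descending. */
final class MultiKeySortSketch {

    /** Hypothetical stand-in for a matched document; not a Lucene class. */
    static final class Hit {
        final int doc;
        final String date;
        final float score;

        Hit(int doc, String date, float score) {
            this.doc = doc;
            this.date = date;
            this.score = score;
        }
    }

    // Primary key: date, newest first. Secondary key: score, highest first.
    // The secondary key only matters when two hits have the same date.
    static final Comparator<Hit> DATE_THEN_SCORE =
            Comparator.comparing((Hit h) -> h.date).reversed()
                    .thenComparing(Comparator.comparingDouble((Hit h) -> h.score).reversed());

    /** Returns the doc ids in sorted order without modifying the input list. */
    static List<Integer> sortedDocIds(List<Hit> hits) {
        List<Hit> copy = new ArrayList<>(hits);
        copy.sort(DATE_THEN_SCORE);
        List<Integer> ids = new ArrayList<>();
        for (Hit h : copy) {
            ids.add(h.doc);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(0, "2016-03-16", 0.5f),
                new Hit(1, "2016-03-16", 0.9f),
                new Hit(2, "2016-12-12", 0.1f),
                new Hit(3, "2015-04-30", 0.7f));
        System.out.println(sortedDocIds(hits)); // prints "[2, 1, 0, 3]"
    }
}
```

Docs 0 and 1 share a date, so the score decides their order. Without the score available, that tie could not be broken by relevance, which is why doDocScores has to stay enabled for this kind of sort.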

doMaxScore

Whether the max score over all matched documents should be calculated. Tracking the max score is more costly than tracking document scores, because every matched document has to be scored. Disable it if you want a further performance gain.

This is how the max score is tracked:

 
          float score = Float.NaN;
          if (trackMaxScore) {
            score = scorer.score();
            if (score > maxScore) {
              maxScore = score;
            }
          }
 
 

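Every time the collector encounters a document, it computes the score and compares it with the current max score; if the score is bigger, the max score is updated to it.

When doMaxScore is false and doDocScores is true, the number of matched documents that actually need to be scored is small, because only the documents that are competitive for the top results are scored.

Set these flags according to your requirements.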
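
The cost difference can be illustrated with a simplified model in plain Java. This is a sketch, not Lucene's collector: the score method is a hypothetical stand-in for scorer.score(), the sort keys are made-up numbers, and the top n hits are kept in a small priority queue ordered by sort key. The sketch only counts how often the score is computed in each mode.

```java
import java.util.PriorityQueue;

/** Simplified model (not Lucene code) of how max score tracking affects scoring cost. */
final class ScoreCostSketch {
    static int scoreCalls = 0;

    // Hypothetical stand-in for scorer.score(); it only counts its invocations.
    static float score(int doc) {
        scoreCalls++;
        return 1.0f / (doc + 1);
    }

    /**
     * Collects the top n docs (n >= 1) with the smallest sort keys and
     * returns how many times score() was called along the way.
     */
    static int collect(long[] sortKeys, int n, boolean trackMaxScore) {
        scoreCalls = 0;
        float maxScore = Float.NEGATIVE_INFINITY;
        // Max-heap on the sort key: the head is the worst entry of the current top n.
        PriorityQueue<long[]> queue =
                new PriorityQueue<long[]>((a, b) -> Long.compare(b[0], a[0]));
        for (int doc = 0; doc < sortKeys.length; doc++) {
            float s = Float.NaN;
            if (trackMaxScore) {
                // Every matched document is scored so the maximum can be tracked.
                s = score(doc);
                if (s > maxScore) {
                    maxScore = s;
                }
            }
            boolean competitive = queue.size() < n || sortKeys[doc] < queue.peek()[0];
            if (!competitive) {
                continue; // not in the top n, so no score is needed for it
            }
            if (!trackMaxScore) {
                s = score(doc); // scored only because it enters the top n
            }
            if (queue.size() == n) {
                queue.poll();
            }
            queue.add(new long[] {sortKeys[doc], doc});
        }
        return scoreCalls;
    }

    public static void main(String[] args) {
        long[] keys = {5, 1, 9, 3, 7, 8, 6};
        System.out.println(collect(keys, 2, true));  // prints "7"
        System.out.println(collect(keys, 2, false)); // prints "3"
    }
}
```

With seven matched documents and a top 2, tracking the max score scores all seven, while scoring only competitive documents scores three. On a real index with many matches and a small n, the gap would be correspondingly larger.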

Sort by date

Below is an example that uses Lucene 5.3.0 to sort by date, descending or ascending. The date is indexed as a string, because for a fixed-width format such as yyyy-MM-dd HH:mm:ss the alphabetical order of the date strings is the same as the chronological order of the dates.
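
The claim about string order can be checked with a short plain-Java sketch. It uses the five date strings from the example in this post, sorts them once as strings and once as parsed dates, and compares the results. The class and method names are hypothetical, and String's natural order stands in for the string sort; for plain ASCII digits and punctuation this is assumed to give the same order as a byte-wise comparison.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Checks that sorting fixed-width date strings matches sorting the parsed dates. */
final class DateStringOrderSketch {
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Sorts alphabetically, the way a string sort would. */
    static List<String> sortAsStrings(List<String> dates) {
        List<String> copy = new ArrayList<>(dates);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    /** Sorts chronologically by parsing each string into a date-time value. */
    static List<String> sortAsDates(List<String> dates) {
        List<String> copy = new ArrayList<>(dates);
        copy.sort(Comparator.comparing((String s) -> LocalDateTime.parse(s, FORMAT)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> dates = Arrays.asList(
                "2016-12-12 20:19:57",
                "2016-03-16 16:57:44",
                "2016-03-16 17:47:38",
                "2015-04-30 11:44:25",
                "2015-04-10 20:33:53");
        System.out.println(sortAsStrings(dates).equals(sortAsDates(dates))); // prints "true"
    }
}
```

This only holds because every component is zero-padded and ordered from most to least significant. A format such as 3/16/2016 would not sort chronologically as a string.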

Now let's add a date field to the document and sort by the field.

 
package com.makble.luceneexample;
 
import java.io.IOException;
 
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.SortField.Type;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.BytesRef;
 
public class SortExample {
    public static Analyzer analyzer = new StandardAnalyzer();
    public static IndexWriterConfig config = new IndexWriterConfig(analyzer);
    public static RAMDirectory ramDirectory = new RAMDirectory();
    public static IndexWriter indexWriter;
 
    public static void main(String args []) throws ParseException {
        createIndex();
        searchSingleTerm("title","lucene");
        ramDirectory.close();
    }
 
    public static void createDoc(String author, String title, String date) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("author", author, Field.Store.YES));
        doc.add(new TextField("title", title, Field.Store.YES));
        //doc.add(new Field ("date", date, Field.Store.YES, Field.Index.NOT_ANALYZED));
        doc.add(new SortedDocValuesField ("date", new BytesRef(date) ));
        doc.add(new StoredField("date", date));
 
        indexWriter.addDocument(doc);
    }
 
    public static void createIndex() {
        try {
                indexWriter = new IndexWriter(ramDirectory, config);    
                createDoc("Sam", "Lucece index option analyzed vs not analyzed", "2016-12-12 20:19:57");    
                createDoc("Sam", "Lucene field boost and query time boost example", "2016-03-16 16:57:44");    
                createDoc("Jack", "How to do Lucene search highlight example", "2016-03-16 17:47:38");
                createDoc("Smith","Lucene BooleanQuery is depreacted as of 5.3.0" , "2015-04-30 11:44:25");
                createDoc("Smith","What is term vector in Lucene", "2015-04-10 20:33:53" );
 
                indexWriter.close();
        } catch (IOException | NullPointerException ex) {
            System.out.println("Exception : " + ex.getLocalizedMessage());
        } 
    }
 
 
    // Searches with the default ordering (by score) and prints the hits.
    public static void searchIndexNoSortAndDisplayResults(Query query) {
        // try-with-resources closes the reader once the search is done
        try (IndexReader idxReader = DirectoryReader.open(ramDirectory)) {
            IndexSearcher idxSearcher = new IndexSearcher(idxReader);
 
            TopDocs docs = idxSearcher.search(query, 10);
            System.out.println("length of top docs: " + docs.scoreDocs.length);
            for (ScoreDoc doc : docs.scoreDocs) {
                Document thisDoc = idxSearcher.doc(doc.doc);
                System.out.println(doc.doc + "\t" + thisDoc.get("author")
                        + "\t" + thisDoc.get("title"));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
 
    // Searches sorted by score first, then by the "date" doc values field, and prints the hits.
    public static void searchIndexAndDisplayResults(Query query) {
        // try-with-resources closes the reader once the search is done
        try (IndexReader idxReader = DirectoryReader.open(ramDirectory)) {
            IndexSearcher idxSearcher = new IndexSearcher(idxReader);
 
            Sort sort = new Sort(SortField.FIELD_SCORE,
                    new SortField("date", Type.STRING));
 
            // doDocScores and doMaxScore are both enabled here
            TopDocs docs = idxSearcher.search(query, 10, sort, true, true);
            System.out.println("length of top docs: " + docs.scoreDocs.length + " sort by: " + sort);
            for (ScoreDoc doc : docs.scoreDocs) {
                Document thisDoc = idxSearcher.doc(doc.doc);
                System.out.println(doc.doc + "\t" + thisDoc.get("author")
                        + "\t" + thisDoc.get("title")
                        + "\t" + thisDoc.get("date"));
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
 
    public static void searchSingleTerm(String field, String termText){
        Term term = new Term(field, termText);
        TermQuery termQuery = new TermQuery(term);
 
        searchIndexAndDisplayResults(termQuery);
        searchIndexNoSortAndDisplayResults(termQuery);
    }
 
 }
 
 

To add the field:

 
        doc.add(new SortedDocValuesField ("date", new BytesRef(date) ));
        doc.add(new StoredField("date", date));
 

In Lucene 5, a sort field added as doc values is not stored. To also store the value, add a second field with the same name and make it a stored field. See Lucene sort unexpected docvalues type NONE for field 'date'.

And this is how to call the search method of IndexSearcher:

 
Sort sort = new Sort(SortField.FIELD_SCORE,
                    new SortField("date", Type.STRING));
 
            TopDocs docs = idxSearcher.search(query, 10, sort,true, true);
 

It sorts the results first by score and then by the date field. Both doDocScores and doMaxScore are true.

Output

 
length of top docs: 4 sort by: <score>,<string: "date">
4    Smith    What is term vector in Lucene    2015-04-10 20:33:53
3    Smith    Lucene BooleanQuery is depreacted as of 5.3.0    2015-04-30 11:44:25
1    Sam    Lucene field boost and query time boost example    2016-03-16 16:57:44
2    Jack    How to do Lucene search highlight example    2016-03-16 17:47:38
 
 
length of top docs: 4
3    Smith    Lucene BooleanQuery is depreacted as of 5.3.0
4    Smith    What is term vector in Lucene
1    Sam    Lucene field boost and query time boost example
2    Jack    How to do Lucene search highlight example
 
 

The first search is sorted by score and then by date, so documents with the same score appear in date order; the second is sorted only by score.

The default order is ascending. To make it descending, pass true as the third parameter (reverse) of SortField:

 
            Sort sort = new Sort(SortField.FIELD_SCORE,
                    new SortField("date", Type.STRING, true));
 
 
length of top docs: 4 sort by: <score>,<string: "date">!
3    Smith    Lucene BooleanQuery is depreacted as of 5.3.0    2015-04-30 11:44:25
4    Smith    What is term vector in Lucene    2015-04-10 20:33:53
2    Jack    How to do Lucene search highlight example    2016-03-16 17:47:38
1    Sam    Lucene field boost and query time boost example    2016-03-16 16:57:44
length of top docs: 4
3    Smith    Lucene BooleanQuery is depreacted as of 5.3.0
4    Smith    What is term vector in Lucene
1    Sam    Lucene field boost and query time boost example
2    Jack    How to do Lucene search highlight example