Create an AEM index utilizing Solr

Last month I discussed the options for using Elasticsearch as a search engine for AEM content. The approach presented there required a custom implementation of a replication agent, which is available on GitHub[^1].

As an alternative, I’ll demonstrate how the built-in functionality of Jackrabbit Oak[^2] for indexing into Solr[^3] can be used. Just like Elasticsearch[^elasticsearch], Solr is a search platform based on Lucene[^4].

By default, Jackrabbit Oak uses the embedded Lucene index in AEM, both for internal queries and for custom, application-specific queries (using XPath and JCR-SQL2).

Both Adobe[^5] and the Jackrabbit documentation[^6] provide good guides to the required configuration and the use cases of a Solr index. That is why I’ll keep this part rather short and only list the key points.

I assume that you already have a running Solr server (I currently use 6.4.1) and a core named oak.
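If you want to double-check that assumption before touching AEM, a small script along these lines could ask Solr's CoreAdmin API whether the core is there. This is only a sketch: the host is the placeholder used in this post, and the helper names (`core_status_url`, `core_exists`, `check_core`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Placeholder host from this post; replace it with your own Solr server.
SOLR_BASE = "http://your.solr:8983/solr"


def core_status_url(core, base=SOLR_BASE):
    """Build the CoreAdmin STATUS URL for a single core."""
    query = urllib.parse.urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base}/admin/cores?{query}"


def core_exists(status_response, core):
    """Return True if the parsed STATUS response describes the core.

    Solr reports an empty object for a core it does not know.
    """
    return bool(status_response.get("status", {}).get(core))


def check_core(core="oak"):
    """Query the live Solr server (needs network access to SOLR_BASE)."""
    with urllib.request.urlopen(core_status_url(core), timeout=10) as resp:
        return core_exists(json.load(resp), core)
```

Calling `check_core()` against a reachable server should return `True` once the oak core has been created.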

Basic setup

First of all, you need to decide whether you want to use the embedded Solr server (AEM 6.2 provides Solr 4.7) or a remote Solr server. Given that a remote Solr server has multiple advantages (e.g. easier configuration, scalability through clustering/sharding, an admin UI, …), you’ll most likely prefer this setup.

The complete configuration, i.e. which Solr server is used and how the index fields are built, is done in the System Console[^7], where you can find multiple Apache Jackrabbit Oak Solr configurations.

For example, the following settings should give you a good starting point:

Apache Jackrabbit Oak Solr remote server configuration

Solr HTTP Url: http://your.solr:8983/solr/oak

Zookeeper Host: leave empty

Apache Jackrabbit Oak Solr server provider

ServerType: Remote Solr
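If you would rather script these two settings than click through the console, the Felix web console also accepts them as form posts. The following is a rough sketch, not a tested recipe: the AEM URL and the admin:admin credentials are assumptions for a local instance, and the PIDs and property keys (`solr.http.url`, `solr.zk.host`, `server.type`) are quoted from memory of the Oak documentation, so please verify them in your own System Console first.

```python
import base64
import urllib.parse
import urllib.request

AEM_BASE = "http://localhost:4502"  # assumption: local author instance

# Assumption: PIDs and property keys as I remember them from the Oak docs.
REMOTE_PID = "org.apache.jackrabbit.oak.plugins.index.solr.osgi.RemoteSolrServerConfigurationProvider"
PROVIDER_PID = "org.apache.jackrabbit.oak.plugins.index.solr.osgi.SolrServerProviderService"


def config_request(pid, props, user="admin", password="admin"):
    """Build (but do not send) a POST that applies props to an OSGi configuration."""
    fields = {
        "apply": "true",
        "action": "ajaxConfigManager",
        "propertylist": ",".join(props),  # names of the properties being set
    }
    fields.update(props)
    req = urllib.request.Request(
        f"{AEM_BASE}/system/console/configMgr/{pid}",
        data=urllib.parse.urlencode(fields).encode("utf-8"),
        method="POST",
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    return req


# The two settings listed above; the Zookeeper host stays empty.
remote_cfg = config_request(
    REMOTE_PID,
    {"solr.http.url": "http://your.solr:8983/solr/oak", "solr.zk.host": ""},
)
provider_cfg = config_request(PROVIDER_PID, {"server.type": "remote"})

# To apply them against a running instance:
# for req in (remote_cfg, provider_cfg):
#     urllib.request.urlopen(req, timeout=10)
```

The requests are only built here, so nothing changes until you uncomment the last lines.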

By now Jackrabbit knows about your Solr server but will not use it yet. To trigger the creation of an index, you need to create a node in /oak:index:

Open CRXDE Lite[^8], browse to oak:index and create a new node with the primary type oak:QueryIndexDefinition and the following properties:

| Property | Type    | Value |
|----------|---------|-------|
| type     | String  | solr  |
| async    | String  | async |
| reindex  | Boolean | true  |
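The same node can also be created without CRXDE Lite by posting the properties to the Sling POST servlet. Here is a minimal sketch under a few assumptions: a local AEM instance with admin:admin credentials, and a node name of `solr`, which I picked for illustration because the name itself is up to you.

```python
import base64
import urllib.parse
import urllib.request

AEM_BASE = "http://localhost:4502"  # assumption: local author instance


def index_definition_request(name="solr", user="admin", password="admin"):
    """Build (but do not send) a Sling POST that creates /oak:index/<name>."""
    fields = [
        ("jcr:primaryType", "oak:QueryIndexDefinition"),
        ("type", "solr"),
        ("async", "async"),
        ("reindex", "true"),
        ("reindex@TypeHint", "Boolean"),  # store reindex as Boolean, not String
    ]
    req = urllib.request.Request(
        f"{AEM_BASE}/oak:index/{name}",
        data=urllib.parse.urlencode(fields).encode("utf-8"),
        method="POST",
    )
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {token}")
    return req


request = index_definition_request()

# To create the node on a running instance:
# urllib.request.urlopen(request, timeout=10)
```

The `@TypeHint` suffix is what keeps `reindex` from being stored as a plain string; the other two properties are strings anyway.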

As soon as you hit Save, Jackrabbit will reindex your complete repository into Solr.
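To see whether documents actually arrive, you can ask the core for its total document count. The sketch below assumes the core URL from the settings above; the function names are made up for illustration. Because the index is asynchronous, expect the number to grow over several runs rather than jump to its final value.

```python
import json
import urllib.parse
import urllib.request

# Core URL from the remote server configuration in this post.
SOLR_CORE_URL = "http://your.solr:8983/solr/oak"


def count_url(core_url=SOLR_CORE_URL):
    """Build a match-all query that returns no rows, only the hit count."""
    query = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return f"{core_url}/select?{query}"


def num_found(select_response):
    """Extract the total hit count from a parsed Solr select response."""
    return select_response["response"]["numFound"]


def indexed_documents():
    """Query the live Solr core (needs network access to SOLR_CORE_URL)."""
    with urllib.request.urlopen(count_url(), timeout=10) as resp:
        return num_found(json.load(resp))
```

A count that stays at zero for a long time usually means the configuration or the index definition is worth a second look.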

The next post will show the limitations of this index and how to tweak it by setting up a custom schema.xml in Solr.

Footnotes
