As an alternative, I’ll demonstrate how the built-in functionality of Jackrabbit Oak[^2] for indexing into Solr[^3] can be used. Just like Elasticsearch[^elasticsearch], Solr is a search platform based on Lucene[^4].
By default, Jackrabbit uses the embedded Lucene index in AEM to build an index for both internal queries and custom, application-specific queries (using XPath and JCR-SQL2).
Both Adobe[^5] and the Jackrabbit documentation[^6] provide good guides to the required configuration and the use cases of a Solr index. That is why I’ll keep this part rather short and only list the key points.
I assume that you already have a running Solr server (I currently use 6.4.1) and a core named
First of all, you need to decide whether you want to use the embedded Solr server (AEM 6.2 provides Solr 4.7) or a remote Solr server. Since a remote Solr server has multiple advantages (e.g. easier configuration, scalability through clustering/sharding, an admin UI, …), you’ll most likely prefer this setup.
Which Solr server is used and how the index fields are built is configured entirely in the System Console[^7], where you can find multiple Apache Jackrabbit Oak Solr configurations.
For example, the following settings should give you a good starting point:

Apache Jackrabbit Oak Solr remote server configuration

- Solr HTTP Url: http://your.solr:8983/solr/oak
- Zookeeper Host: leave empty

Apache Jackrabbit Oak Solr server provider

- ServerType: Remote Solr
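
Before pointing Jackrabbit at the remote server, it can help to confirm that the core behind the configured URL actually answers. The following is a minimal sketch, assuming the core's standard ping handler (`/admin/ping`) is available and that the URL matches the "Solr HTTP Url" setting; the function names are made up for illustration.

```python
import json
import urllib.request

# Assumption: the same core URL that is entered as "Solr HTTP Url".
SOLR_CORE_URL = "http://your.solr:8983/solr/oak"


def ping_url(core_url: str) -> str:
    """Build the URL of the core's ping handler, asking for a JSON response."""
    return core_url.rstrip("/") + "/admin/ping?wt=json"


def is_ok(ping_response: dict) -> bool:
    """A healthy core reports the status "OK" in its ping response."""
    return ping_response.get("status") == "OK"


def ping(core_url: str, timeout: float = 5.0) -> bool:
    """Send the ping request; this needs network access to the Solr server."""
    with urllib.request.urlopen(ping_url(core_url), timeout=timeout) as resp:
        return is_ok(json.load(resp))
```

Calling `ping(SOLR_CORE_URL)` from a machine that can reach the Solr host should return `True` if the core is up; a connection error or HTTP error usually points to a wrong host, port, or core name.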
By now Jackrabbit knows about your Solr server but will not use it. To trigger the creation of an index, you need to create an index definition node: open CRXDE Lite[^8], browse to `oak:index`, and create a new node with the primary type `oak:QueryIndexDefinition` and the following properties:
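
As a hedged sketch of what such a definition can look like: the Oak Solr documentation describes an index definition whose `type` is `solr` and whose `async` property is `async`, with `reindex` set to `true` to request a reindex. Treat these values as assumptions taken from that documentation rather than the exact list used here. The node name `solr`, the AEM address, and the helper names below are hypothetical. The snippet only prepares a Sling POST request that would create the node, as a scripted alternative to CRXDE Lite.

```python
import base64
import urllib.parse
import urllib.request

# Assumptions: a local AEM author instance and an illustrative node name.
AEM_URL = "http://localhost:4502"
INDEX_PATH = "/oak:index/solr"


def index_definition_fields() -> dict:
    """Form fields describing a Solr index definition node."""
    return {
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "type": "solr",                 # use the Solr index implementation
        "async": "async",               # update the index in the background
        "reindex": "true",              # request a full reindex
        "reindex@TypeHint": "Boolean",  # store "reindex" as a boolean property
    }


def build_request(user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that creates the node."""
    body = urllib.parse.urlencode(index_definition_fields()).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        AEM_URL + INDEX_PATH,
        data=body,
        method="POST",
        headers={"Authorization": "Basic " + token},
    )
```

Passing the result of `build_request(...)` to `urllib.request.urlopen` would send the request. Depending on the security configuration of the instance, such a POST may be rejected, in which case creating the node by hand in CRXDE Lite remains the simpler route.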
As soon as you hit Save, Jackrabbit will reindex your complete repository into Solr.
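
One way to watch the reindexing make progress is to ask Solr how many documents the core currently holds. This is a small sketch using Solr's standard `select` handler with `rows=0`, so only the hit count comes back; the core URL is the one from the configuration above, and the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the same core URL that is entered as "Solr HTTP Url".
SOLR_CORE_URL = "http://your.solr:8983/solr/oak"


def count_url(core_url: str) -> str:
    """Build a match-all query that returns no rows, only the total count."""
    params = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    return core_url.rstrip("/") + "/select?" + params


def num_found(select_response: dict) -> int:
    """Extract the total number of matching documents from a select response."""
    return int(select_response["response"]["numFound"])


def indexed_documents(core_url: str, timeout: float = 5.0) -> int:
    """Query the core; this needs network access to the Solr server."""
    with urllib.request.urlopen(count_url(core_url), timeout=timeout) as resp:
        return num_found(json.load(resp))
```

Calling `indexed_documents(SOLR_CORE_URL)` a few times while the asynchronous indexer runs should show the count growing; a count that stays at zero suggests that the index definition or the server configuration is not being picked up.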
The following post will show the limitations of this index and how to tweak it by setting up a custom schema.xml in Solr.