Overview
Zoie Solr Plugin enables real-time update functionality for Apache Solr 1.4+. (http://lucene.apache.org/solr/)
Configuration
Build:
Check out Zoie's latest code from http://github.com/javasoze/zoie , e.g.:
git clone git://github.com/javasoze/zoie.git
Then, from the Zoie top-level directory, run:
ant zoie-solr
Copy the following jars to the solr/lib directory:
- dist/zoie-<version>.jar
- dist/zoie-solr-<version>.jar
- lib/master/fastutil.jar
- lib/log4j.jar
solrconfig.xml:
Configure the Zoie Solr Plugin in solrconfig.xml as follows:
IndexReaderFactory:
Class: proj.zoie.solr.ZoieSolrIndexReaderFactory:
<indexReaderFactory class="proj.zoie.solr.ZoieSolrIndexReaderFactory" />
UpdateHandler:
Class: proj.zoie.solr.ZoieUpdateHandler:
<updateHandler class="proj.zoie.solr.ZoieUpdateHandler" />
ZoieSystem:
A single global ZoieSystem is created for the plugins above. It can be configured with the following properties, also in solrconfig.xml:
<zoie.batchSize>25000</zoie.batchSize> <!-- default: 1000 -->
<zoie.batchDelay>200000</zoie.batchDelay> <!-- default: 300000, i.e. 5 min -->
<zoie.realtime>true</zoie.realtime> <!-- default: true -->
These properties are optional; any property that is omitted takes its default value.
Zoie's default Analyzer and Similarity are the ones defined in schema.xml.
Unique Key:
Zoie assumes that every record to be indexed has a unique key of type long. Thus, schema.xml must declare a unique key, e.g.:
<uniqueKey>id</uniqueKey>
Also make sure that every record's unique key can be parsed as a long.
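The requirement above can be checked before indexing. The following is a minimal sketch, not part of Zoie or Solr: the class UniqueKeyCheck and its methods are hypothetical, and it assumes that "can be parsed as a long" means what Java's Long.parseLong accepts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Zoie or Solr.
final class UniqueKeyCheck {

    /** Returns true if the value is accepted by Long.parseLong (assumed rule). */
    static boolean isLongKey(String value) {
        if (value == null) {
            return false;
        }
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Collects the keys that would not parse as a long, in input order. */
    static List<String> invalidKeys(List<String> keys) {
        List<String> bad = new ArrayList<>();
        for (String key : keys) {
            if (!isLongKey(key)) {
                bad.add(key);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Sample ids: plain digits pass, ids with dashes or letters do not.
        List<String> sample = Arrays.asList("1", "0553573403", "978-0641723445", "abc");
        System.out.println("Invalid keys: " + invalidKeys(sample));
    }
}
```

Ids that contain dashes or letters are reported as invalid and would have to be rewritten before they are sent to Solr.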
JMX support:
Zoie JMX MBean is registered with the name:
zoie-solr:name=zoie-system
Zoie's configuration can be changed at runtime through your favorite JMX console.
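If no JMX console is at hand, the MBean can also be inspected programmatically. The sketch below uses only the standard javax.management API and lists whatever attributes the MBean exposes. The class ZoieJmxInspector is hypothetical, and so are two assumptions it makes: that remote JMX is enabled on the Solr JVM when a service URL is passed, and that the MBean lives on the platform MBean server when the code runs inside the Solr JVM. The attribute names themselves are not documented here, so the code discovers them instead of guessing.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper for illustration only; not part of Zoie or Solr.
final class ZoieJmxInspector {

    // MBean name as given in this document.
    static final String ZOIE_MBEAN = "zoie-solr:name=zoie-system";

    /** Lists the attribute names that the given MBean exposes. */
    static List<String> attributeNames(MBeanServerConnection conn, ObjectName name)
            throws Exception {
        List<String> names = new ArrayList<>();
        for (MBeanAttributeInfo attr : conn.getMBeanInfo(name).getAttributes()) {
            names.add(attr.getName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        ObjectName zoie = new ObjectName(ZOIE_MBEAN);
        if (args.length > 0) {
            // args[0] is a JMX service URL for the Solr JVM (assumption: remote JMX is on).
            try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(args[0]))) {
                report(connector.getMBeanServerConnection(), zoie);
            }
        } else {
            // Assumption: running inside the Solr JVM, MBean on the platform server.
            report(ManagementFactory.getPlatformMBeanServer(), zoie);
        }
    }

    private static void report(MBeanServerConnection conn, ObjectName name) throws Exception {
        if (!conn.isRegistered(name)) {
            System.out.println(name + " is not registered on this MBean server");
            return;
        }
        System.out.println(name + " attributes: " + attributeNames(conn, name));
    }
}
```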
Example:
The plugin can be tried out with the Solr example application:
- Set up the Zoie plugin as described above.
- In the solr/example/exampledocs directory, edit books.csv so that every id can be parsed as a long.
- Go to the URL http://localhost:8983/solr/select/?q=game ; you will see 0 hits.
- Now index the example docs from the exampledocs directory:
curl 'http://localhost:8983/solr/update/csv?commit=true' --data-binary @books.csv -H 'Content-type:text/plain; charset=utf-8'
- Go back to the URL http://localhost:8983/solr/select/?q=game ; you will now see 2 hits.
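The before/after check in the steps above can also be scripted. The following sketch reads the hit count from the numFound attribute of the result element in the query response. The class SolrHitCount is hypothetical, and the code assumes that the example server is running on localhost:8983 and returns Solr's XML response format.

```java
import java.io.InputStream;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only; not part of Zoie or Solr.
final class SolrHitCount {

    /** Extracts the numFound attribute of the first result element in a Solr XML response. */
    static long parseNumFound(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        Element result = (Element) doc.getElementsByTagName("result").item(0);
        if (result == null) {
            throw new IllegalStateException("no <result> element in response");
        }
        return Long.parseLong(result.getAttribute("numFound"));
    }

    public static void main(String[] args) throws Exception {
        // Same query URL as in the steps above; assumes the example server is up.
        String query = "http://localhost:8983/solr/select/?q=game";
        try (InputStream in = URI.create(query).toURL().openStream()) {
            System.out.println("hits for q=game: " + parseNumFound(in));
        }
    }
}
```

Running it once before and once after the curl update should show the hit count change without a manual browser refresh.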
Future work:
With the plugin installed, you will see the following warning from Solr:
WARNING: The update handler being used is not an instance or sub-class
of DirectUpdateHandler2. Replicate on Startup cannot work. ...
Because Zoie handles updates as a stream and manages its IndexWriters internally, it does not make sense to force ZoieUpdateHandler to derive from DirectUpdateHandler2.
The following issue has been filed for the Solr team:
