- What is it?
- Optimistic concurrency / document-centric versioning constraints
- Solr rejects an update whose version field is older than the indexed document's with a 409 (Conflict) error.
- Why?
- You are ingesting multiple versions of a document but have no guarantee of the order in which they will arrive.
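The effect of the constraint can be illustrated without Solr. The sketch below is a toy in-memory model, not Solr code: `VersionedIndex` and its `Conflict` error are hypothetical names, and the rule it encodes (accept a document only when its version field is strictly newer than the stored one) is a simplified reading of the behavior described above.

```ruby
# Hypothetical in-memory stand-in for a Solr index, keyed by document id.
class VersionedIndex
  # Plays the role of Solr's 409 response in this toy model.
  Conflict = Class.new(StandardError)

  def initialize(version_field)
    @version_field = version_field
    @docs = {}
  end

  # Store the document unless an equal or newer version is already indexed.
  def add(doc)
    id = doc.fetch("id")
    existing = @docs[id]
    if existing && doc.fetch(@version_field) <= existing.fetch(@version_field)
      raise Conflict, "409: version conflict for #{id}"
    end
    @docs[id] = doc
  end

  def get(id)
    @docs[id]
  end
end

index = VersionedIndex.new("record_update_date")
newer = { "id" => "doc1", "title" => "second edit", "record_update_date" => Time.utc(2019, 10, 10) }
older = { "id" => "doc1", "title" => "first edit",  "record_update_date" => Time.utc(2019, 10, 1) }

# The newer version happens to arrive first; the late, older one is rejected.
index.add(newer)
begin
  index.add(older)
rescue VersionedIndex::Conflict => e
  puts e.message
end
puts index.get("doc1")["title"]
```

Whatever order the two versions arrive in, the index ends up holding the one with the latest `record_update_date`.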
- How?
- Add an update request processor chain to your `solrconfig.xml`:
    <config>
      <!-- ... -->
      <updateRequestProcessorChain name="DocCentricVersioningOnDate">
        <processor class="solr.ParseDateFieldUpdateProcessorFactory">
          <str name="defaultTimeZone">Etc/UTC</str>
          <arr name="format">
            <str>yyyy-MM-dd HH:mm:ss Z</str>
            <str>yyyy-MM-dd HH:mm:ss</str>
          </arr>
        </processor>
        <processor class="solr.DocBasedVersionConstraintsProcessorFactory">
          <str name="versionField">record_update_date</str>
          <bool name="ignoreOldUpdates">false</bool>
        </processor>
        <processor class="solr.LogUpdateProcessorFactory" />
        <processor class="solr.RunUpdateProcessorFactory" />
      </updateRequestProcessorChain>
    </config>
- Make sure to index the configured version field (`record_update_date`) in one of the configured date formats.
- This is a good opportunity to fine-tune what counts as the latest update time.
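As a rough sketch of the indexing side, the snippet below builds a `record_update_date` value in the first configured pattern (`yyyy-MM-dd HH:mm:ss Z`). The helper name and the idea of taking the latest of several candidate timestamps are assumptions made for illustration; which source fields count as "the latest update" is up to you.

```ruby
require "time"

# strftime equivalent of the configured pattern "yyyy-MM-dd HH:mm:ss Z".
SOLR_VERSION_FORMAT = "%Y-%m-%d %H:%M:%S %z"

# Hypothetical helper: pick the most recent of several candidate timestamps
# and render it in UTC. Returns nil when no timestamp is available.
def record_update_date(*timestamps)
  latest = timestamps.compact.map { |t| Time.parse(t) }.max
  latest && latest.utc.strftime(SOLR_VERSION_FORMAT)
end

puts record_update_date("2019-10-01T12:00:00Z", "2019-10-10T17:07:00Z")
# prints "2019-10-10 17:07:00 +0000"
```

Normalizing to UTC before formatting keeps the values comparable regardless of the time zone the source records were written in.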
- ✨ ✨ Demo ✨ ✨
- Caveats and mitigations:
- Indexing errors out due to max skipped: each rejected (409) document counts as a skipped record.
- Allow unlimited skips:
provide "solr_writer.max_skipped", -1
- But there's still a performance hit, and traject still exits with -1.
- Or use custom SolrJsonWriter:
provide "writer_class_name", "CobIndex::SolrJsonWriter"
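The contents of `CobIndex::SolrJsonWriter` are not shown here, so the following is only a guess at the kind of bookkeeping a custom writer could do, written as a standalone toy class rather than against traject's API. `SkipTracker` and its method names are hypothetical. The idea is to count 409 responses separately so that expected version conflicts never push the run over its skip limit.

```ruby
# Hypothetical stand-in for a writer's skip bookkeeping; not traject's actual API.
class SkipTracker
  attr_reader :skipped, :stale

  # max_skipped follows the convention used above: -1 means "no limit".
  def initialize(max_skipped)
    @max_skipped = max_skipped
    @skipped = 0
    @stale = 0
  end

  # Record the HTTP status returned for one document.
  def record(status)
    case status
    when 200
      # Indexed successfully; nothing to count.
    when 409
      # Expected: an older version arrived late. Not treated as a failure.
      @stale += 1
    else
      @skipped += 1
      if @max_skipped >= 0 && @skipped > @max_skipped
        raise "Exceeded maximum number of skipped records (#{@max_skipped})"
      end
    end
  end
end

tracker = SkipTracker.new(0)
[200, 409, 409, 200].each { |status| tracker.record(status) }
puts "stale: #{tracker.stale}, skipped: #{tracker.skipped}"
```

With this split, a limit of 0 still aborts on genuine failures while version conflicts are tallied for reporting only.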
Created
October 10, 2019 17:07