My MARC indexing does the following processing (working with marc4j records):
- normal fields. 29 fields that can be described using nothing but the normal solrmarc syntax. A description might be extensive (15-20 tags, plus lookups in either hashes or sets of regexp matches for transformation) but doesn't require custom code. The generic processor is written in Ruby.
- custom fields. 10 (or 14 if it's a serial) fields that require custom code. These are also all Ruby.
- all fields. The single "get all the text in all the fields numbered 10 to 999" method. Ruby.
- xml. Turn the record into a MARC-XML string. This uses JRuby to get a java.io.StringWriter and javax.xml.transform.stream, then uses marc4j's MarcXmlWriter to write to it. That, I just noticed, isn't exactly how solrmarc does it, so I'll have to benchmark it both ways. Both Ruby and Java.
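
To make the "normal fields" and "all fields" steps a bit more concrete, here is a rough sketch of what that kind of processing might look like. It uses plain Ruby structs as stand-ins for marc4j records so it runs without JRuby. The spec format and the names `extract`, `translate`, and `all_text` are invented for illustration (they are not solrmarc syntax or the actual indexing code), and the sample record is made up.

```ruby
# Plain-Ruby stand-ins for marc4j's record objects (hypothetical, illustration only).
Subfield  = Struct.new(:code, :data)
DataField = Struct.new(:tag, :subfields)
Record    = Struct.new(:fields)

# Apply an optional transformation to one extracted value.
# A Hash is treated as a direct lookup; an Array is treated as a list of
# [regexp, output] pairs, and every matching pair contributes its output.
def translate(value, map)
  case map
  when Hash  then Array(map[value])
  when Array then map.select { |re, _| re =~ value }.map { |_, out| out }
  else [value]
  end
end

# "Normal" field: the spec names the tags and subfield codes to read,
# plus an optional :map used to transform the raw values.
def extract(record, spec)
  values = record.fields.flat_map do |field|
    codes = spec[:tags][field.tag]
    next [] unless codes
    field.subfields.select { |sf| codes.include?(sf.code) }.map(&:data)
  end
  values = values.flat_map { |v| translate(v, spec[:map]) } if spec[:map]
  values.uniq
end

# "All fields": every subfield value from tags 010 through 999, space-joined.
def all_text(record)
  record.fields
        .select { |f| f.tag.to_i.between?(10, 999) }
        .flat_map { |f| f.subfields.map(&:data) }
        .join(' ')
end

# Made-up sample record and specs.
record = Record.new([
  DataField.new('008', [Subfield.new(nil, '920219s1993')]),
  DataField.new('041', [Subfield.new('a', 'eng')]),
  DataField.new('245', [Subfield.new('a', 'Moby Dick'), Subfield.new('c', 'Herman Melville')]),
  DataField.new('650', [Subfield.new('a', 'Whaling'), Subfield.new('v', 'Fiction')])
])

title_spec    = { tags: { '245' => %w[a] } }
language_spec = { tags: { '041' => %w[a] }, map: { 'eng' => 'English' } }
genre_spec    = { tags: { '650' => %w[v] }, map: [[/fiction/i, 'Fiction'], [/\Abio/i, 'Biography']] }

title    = extract(record, title_spec)
language = extract(record, language_spec)
genre    = extract(record, genre_spec)
fulltext = all_text(record)
```

The point of keeping each spec as plain data is that one generic `extract` can serve every "normal" field, while anything that doesn't fit that shape falls through to custom code.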