lucene - How to extract Solr index docs


I need to apply some transformations to docs before indexing them in Solr. The texts come from various sources, and it's difficult to do the transformations before indexing because I would have to adapt several programs that parse the files. I'm thinking of indexing them in Solr, extracting the text fields, applying the transformations, and reindexing them again.

I tried:

curl 'http://localhost:8983/solr/collection1/select?q=*&rows=20000&wt=xml&indent=true'  

But the output is an XML results file, while I'm looking for a way to extract the docs' fields in posting format. Is this possible? How should I do it?

Thanks

I recommend using one of the Solr clients listed on the Integrating Solr page. They allow you to use the programming language of your choice to extract and transform the Solr documents and reload them into the index.
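As a rough illustration of that extract, transform, and reload loop, here is a minimal Python sketch that uses only the standard library rather than a dedicated Solr client. Several things in it are assumptions and not taken from the question: that the unique key field is called `id`, that the field to clean up is called `text`, that the whitespace normalisation in `transform` stands in for your real transformation, and that your Solr version accepts a JSON array of documents on `/update` (older versions may need a different update path). Note also that only stored fields come back from a query, so anything indexed but not stored would be lost on reindexing.

```python
import json
import urllib.parse
import urllib.request

# Core URL taken from the question; adjust to your setup.
SOLR_URL = "http://localhost:8983/solr/collection1"


def build_select_url(base_url, start=0, rows=500):
    """Build a /select URL that returns documents as JSON, one page at a time."""
    params = {
        "q": "*:*",          # match all documents
        "sort": "id asc",    # assumption: unique key is `id`; keeps paging stable
        "start": start,
        "rows": rows,
        "wt": "json",
    }
    return base_url + "/select?" + urllib.parse.urlencode(params)


def fetch_docs(base_url, start=0, rows=500):
    """Fetch one page of stored documents from Solr."""
    with urllib.request.urlopen(build_select_url(base_url, start, rows)) as resp:
        return json.load(resp)["response"]["docs"]


def transform(doc, text_field="text"):
    """Hypothetical transformation: collapse whitespace in one text field.

    `_version_` is dropped so the re-post is treated as a plain overwrite.
    """
    out = {k: v for k, v in doc.items() if k != "_version_"}
    if isinstance(out.get(text_field), str):
        out[text_field] = " ".join(out[text_field].split())
    return out


def post_docs(base_url, docs):
    """Send a list of documents back to Solr as a JSON array and commit."""
    req = urllib.request.Request(
        base_url + "/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def reindex(base_url=SOLR_URL, batch=500):
    """Page through the whole index, transform each doc, and post it back."""
    start = 0
    while True:
        docs = fetch_docs(base_url, start, batch)
        if not docs:
            break
        post_docs(base_url, [transform(d) for d in docs])
        start += batch
```

Calling `reindex()` against a running core would walk the index in batches of 500. Paging in batches avoids pulling 20000 rows in a single response, and because documents are re-posted with the same `id`, each one replaces its earlier version instead of creating a duplicate.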

