I have a permanent plain index with a sql_file_field that holds very large PDF and Word documents. The index also has a bunch of other fields. I’d like to incrementally update the permanent plain index without re-indexing the sql_file_field, for performance reasons. The PDF and Word contents never change, but the other fields in the index do. So far I have:
The problem is the above command overwrites the sql_file_field with the value in incremental_index; I want to update all the fields, but keep the pre-existing value of sql_file_field in the permanent index.
Any help and guidance would be greatly appreciated.
But fields need to be ‘stored’ for it to work. The whole document is deleted and reinserted (the stored data is used to reinsert the fields that are not being updated!)
… so ultimately it is not going to be any more ‘efficient’ than just reindexing the entire PDF documents, as in a merge.
The partial REPLACE would just allow you to do it ‘piecemeal’, one document at a time, rather than as one big merge operation.
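To make the cost argument concrete, here is a toy model in Python of the delete-and-reinsert behaviour described above. It is only a sketch: ToyIndex, its methods and the field names are made up for illustration and are not Manticore's API or internals. The only thing it shows is that changing one small field still re-tokenizes the large one.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Toy stand-in for a full-text table (hypothetical, not Manticore internals)."""
    stored: dict = field(default_factory=dict)  # doc id -> {field name: text}
    tokens_indexed: int = 0                     # running count of tokenized words

    def insert(self, doc_id, doc):
        # Every field of the document is tokenized on insert.
        self.tokens_indexed += sum(len(text.split()) for text in doc.values())
        self.stored[doc_id] = dict(doc)

    def partial_replace(self, doc_id, changes):
        # Read the old document back from the stored fields, overlay the
        # changed fields, then delete and reinsert the whole document.
        merged = {**self.stored.pop(doc_id), **changes}
        self.insert(doc_id, merged)


index = ToyIndex()
index.insert(1, {"title": "old title", "content": "word " * 1000})

before = index.tokens_indexed
index.partial_replace(1, {"title": "new title"})
reindexed = index.tokens_indexed - before

print(reindexed)  # 1002: the 2-word title plus the unchanged 1000-word content
```

So in this model the saving is only in how the work is scheduled (one document at a time), not in how much text gets indexed per document.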
You could use stored_only_fields for the big documents, so that their content is only stored and returned to the client. Also, use attributes for all the data that changes, then use an UPDATE statement to update the attribute values.
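A minimal sketch of the attribute-update idea, in Python. The helper build_attr_update, the table name docs and the attributes status and updated_at are all hypothetical. It only builds the statement text and the parameter list; the %s placeholders assume a MySQL-protocol client library that substitutes parameters on the client side, and sending the statement to the server is left out.

```python
import re

# Plain identifiers only, so table and attribute names cannot inject SQL.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_attr_update(table, doc_id, attrs):
    """Return (sql, params) for an UPDATE that touches only attributes."""
    if not attrs:
        raise ValueError("no attributes to update")
    for name in (table, *attrs):
        if not _IDENT.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    assignments = ", ".join(f"{name} = %s" for name in attrs)
    sql = f"UPDATE {table} SET {assignments} WHERE id = %s"
    # Attribute values first, in the same order as the SET list, then the id.
    return sql, [*attrs.values(), doc_id]


# Hypothetical table and attribute names, for illustration only.
sql, params = build_attr_update("docs", 42, {"status": 2, "updated_at": 1700000000})
print(sql)     # UPDATE docs SET status = %s, updated_at = %s WHERE id = %s
print(params)  # [2, 1700000000, 42]
```

Since only attributes are named in the SET list, the full-text fields holding the PDF and Word contents are not part of the statement at all.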
I might be worrying too much about performance; I indexed an entire Staging database of documents in only 20 minutes. So I think I should be okay with the way I have it. I noticed the newest beta version of Manticore supports relations between indexes, so maybe in the future I can migrate to that and have a searchable index for just the PDF contents, and the main index for all the other fields.