For customers who do not use LibreOffice, the java/poi extractor can extract content from the most common office document formats. The supported formats are: doc, docx, ppt, pptx, xls, xlsx.
Enable the java/poi extractor for office documents:
In HQ, select the Formats menu on the left
Search for doc
Select Microsoft Word Document
Select the Extractors tab
For java/poi, select the move up option until it is at the top (if java/poi is missing, see below)
Repeat this process for the following formats: docx, ppt, pptx, xls, xlsx
Select the System menu on the left
Select Restart/Shutdown
Restart HQ
NOTE: If the java/poi option is missing from the menu, use the following procedure:
On the file system, navigate to HQ_HOME/config
With HQ stopped, delete or rename mimes.json
Start HQ and enable java/poi
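The rename step can also be scripted. The following is a minimal Python sketch, not part of HQ itself: the function name set_aside_mimes and the timestamped backup name are illustrative assumptions, and the script should only be run while HQ is stopped. Renaming rather than deleting keeps the old file available as a reference.

```python
import time
from pathlib import Path
from typing import Optional


def set_aside_mimes(config_dir: str) -> Optional[Path]:
    """Rename mimes.json in config_dir to a timestamped backup name.

    Returns the backup path, or None if config_dir has no mimes.json.
    Run this only while HQ is stopped.
    """
    source = Path(config_dir) / "mimes.json"
    if not source.is_file():
        return None

    # The backup naming scheme is an assumption made for this sketch.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = source.with_name(f"mimes.json.{stamp}.bak")
    if backup.exists():
        raise FileExistsError(f"backup already exists: {backup}")

    source.rename(backup)
    return backup
```

Call it with the path to the HQ_HOME/config directory of the installation, then start HQ and enable java/poi as described above.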
Additional information / troubleshooting:
If the format settings have been changed previously, deleting mimes.json will cause those changes to be lost. If losing those changes is not an option, the java/poi extractor can be added to the mimes.json file manually instead. Please note that this is an advanced procedure and can prevent HQ from loading if done improperly. Before deleting mimes.json, it is recommended to copy the original file to a new location. Then, using the old mimes.json file as a reference, re-apply the settings in HQ using the UI as described above.
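Before deciding whether a manual edit is needed, it can help to check whether a given mimes.json (for example, the saved copy) refers to java/poi at all. The sketch below is a hedged illustration in Python: the layout of mimes.json is not documented here, so the code makes no assumption about its schema and simply walks every key and string value in the parsed JSON. The function names are hypothetical.

```python
import json


def mentions_value(node, wanted):
    """Return True if `wanted` occurs in any key or string value of parsed JSON."""
    if isinstance(node, str):
        return wanted in node
    if isinstance(node, dict):
        return any(
            wanted in key or mentions_value(value, wanted)
            for key, value in node.items()
        )
    if isinstance(node, list):
        return any(mentions_value(item, wanted) for item in node)
    # Numbers, booleans, and null cannot contain the extractor name.
    return False


def file_mentions_extractor(path, extractor="java/poi"):
    """Parse the JSON file at `path` and search it for the extractor name."""
    with open(path, encoding="utf-8") as handle:
        return mentions_value(json.load(handle), extractor)
```

Parsing the file instead of searching the raw text means an escaped slash (java\/poi) is still matched, and a file that fails to parse raises an error, which is itself a useful warning before HQ tries to load it.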
In some cases, a rebuild of the mimes index may be required. This can be done using the mimes API REST endpoint.