Adding Natural Language Processing to a Pipeline Step

If your data includes text records of any kind, such as PDFs, you can use Natural Language Processing (NLP) in a Pipeline step to fine-tune Voyager's ability to identify the most relevant content.

Before you begin, make sure that the NLP Service is enabled. If you have not already done so, you may need to download the NLP Python library and configure the NLP Service.

To add NLP to the Indexing Pipeline for a location:

  1. Go to Manage Voyager > Discovery > Locations

  2. Click the Edit icon to the right of the location to which you'd like to add NLP

  3. Select the Pipeline tab

  4. Uncheck Use Default Pipeline Configuration

  5. Click Add under First Steps

  6. Select nlp-worker from the Step drop-down menu

  7. Click Add

  8. Confirm that First Steps now displays voyager.pipeline.steps.RunExternalPythonStep nlp_worker

  9. Click Save at the bottom of the dialog to close the Pipeline tab

After saving, rescan or reindex the location so that the NLP results are incorporated into Voyager's relevancy calculations.
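
As a rough mental model of what the numbered steps above change, the sketch below treats a location's pipeline settings as a small data structure. It is a toy illustration only: PipelineConfig, use_default, first_steps, and add_first_step are hypothetical names invented for this example and are not part of Voyager's API or configuration format. The only value taken from the procedure is the step name nlp_worker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineConfig:
    """Hypothetical stand-in for a location's pipeline settings (not Voyager's API)."""
    use_default: bool = True                               # "Use Default Pipeline Configuration"
    first_steps: List[str] = field(default_factory=list)   # entries listed under "First Steps"


def add_first_step(config: PipelineConfig, step: str) -> PipelineConfig:
    """Mirror the UI procedure: turn off the default pipeline, then add the step once."""
    config.use_default = False
    if step not in config.first_steps:   # avoid adding the same step twice
        config.first_steps.append(step)
    return config


# Assumed usage: start from the default settings and add the NLP step.
config = add_first_step(PipelineConfig(), "nlp_worker")
print(config)
```

The point of the sketch is that the location stops using the default pipeline configuration and gains one extra entry under First Steps. Already-indexed records are not affected until the location is rescanned or reindexed.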