Document Transformers

Document Transformers are field-mapping functions that are part of the Indexing Pipeline, along with Geotagging and Metadata Extraction. Document Transformers allow you to manage and configure metadata during the indexing process. You can create, modify or duplicate fields as well as set field values using transformation functions.

IMPORTANT


Document Transformers are case-sensitive. When you configure a Transformer, each field name must match the text and case of the existing field EXACTLY. A mismatched case can create duplicate field names and cause other serious errors in subsequent queries. These errors can be extremely difficult to diagnose after the fact and may require re-indexing your data once the issue has been resolved, so it is critical that you verify the field name and case before proceeding.
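The effect of a case mismatch can be illustrated with a minimal sketch. It uses a plain JavaScript object as a stand-in for an indexed document; the field names and the setField helper are hypothetical and are not part of Voyager.

```javascript
// Plain object standing in for an indexed document (hypothetical field names).
const doc = { title: "Roads 2020" };

// Hypothetical helper that sets a field by name, matching names exactly.
function setField(record, field, value) {
  return { ...record, [field]: value };
}

// "Title" does not match the existing "title", so the original field is left
// untouched and a second, differently cased field is created.
const result = setField(doc, "Title", "Roads (updated)");
const fieldNames = Object.keys(result);
console.log(fieldNames); // [ 'title', 'Title' ]
```

Both fields now exist side by side, which is the kind of duplicate that is hard to trace later.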


Configuring Document Transformers

To configure Document Transformers:

  • Go to Manage Voyager > Discovery > Pipeline

  • Select Document Transformers

  • Choose a Transformer from the Document Transformer drop-down menu

Currently configured Transformers are processed in order from top to bottom during indexing. For each entry, you can move it up or down, test it, edit it or delete it.
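Because Transformers run from top to bottom, their order can change the result. The following is a minimal sketch of that idea, assuming each Transformer can be modeled as a function from document to document. The runPipeline function and the field names are hypothetical, not Voyager's implementation.

```javascript
// Two transformers modeled as plain functions (hypothetical field names).
const setStatus = (doc) => ({ ...doc, status: "reviewed" });
const copyStatus = (doc) =>
  "status" in doc ? { ...doc, status_copy: doc.status } : { ...doc };

// Apply a list of transformers from first (top) to last (bottom).
function runPipeline(doc, transformers) {
  return transformers.reduce((current, transform) => transform(current), doc);
}

const source = { status: "draft" };
const setThenCopy = runPipeline(source, [setStatus, copyStatus]);
const copyThenSet = runPipeline(source, [copyStatus, setStatus]);

console.log(setThenCopy.status_copy); // reviewed
console.log(copyThenSet.status_copy); // draft
```

The same two entries produce different copies depending on which one is listed first.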

Copy Field

This transformer copies the contents of an existing index field (source) into another field (target). The source field is not modified.

Parameters

  • Field – The name of the source field that will be copied. This field should already exist in the index.

  • Destination – The name of the target field that the source will be copied into. This field may or may not exist in the index. If it does not exist, the field will be created during indexing. The field name should not contain spaces, and should have the appropriate prefix for the target field data type.

  • Skip If Exists – This will prevent copying data to a target field if it already contains data. 

  • Append Field – This appends values from the source field to the values in the existing target field.

  • Warn on Replace – Displays warnings when target field values are replaced.
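The parameters above can be sketched as a small function over a plain JavaScript object. This is an illustrative model only: the copyField function, its option names, the field names, and the choice to check Skip If Exists before Append Field are all assumptions, not Voyager's implementation.

```javascript
// Treat a single value and a list of values uniformly.
function toArray(value) {
  return Array.isArray(value) ? value : [value];
}

// Hypothetical model of Copy Field; returns a new document plus any warnings.
function copyField(doc, field, destination, options = {}) {
  const { skipIfExists = false, append = false, warnOnReplace = false } = options;
  const out = { ...doc };
  const warnings = [];
  if (!(field in doc)) return { doc: out, warnings }; // nothing to copy
  const targetExists = destination in doc;
  // Assumption: Skip If Exists takes precedence over Append Field.
  if (targetExists && skipIfExists) return { doc: out, warnings };
  if (targetExists && append) {
    out[destination] = [...toArray(doc[destination]), ...toArray(doc[field])];
  } else {
    if (targetExists && warnOnReplace) {
      warnings.push(`replaced value of ${destination}`);
    }
    out[destination] = doc[field];
  }
  return { doc: out, warnings }; // the source field is never modified
}

const record = { name: "parcels", title: "Old title" };
const copied = copyField(record, "name", "title", { warnOnReplace: true });
console.log(copied.doc.title);       // parcels
console.log(copied.warnings.length); // 1
```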

Move Field

Moves the value from one field (source) into another (target). The source field is then removed from the index. In essence, this is equivalent to renaming a field.

Parameters

  • Field – The name of the source field that will be moved. This field should already exist in the index.

  • Destination – The name of the target field that the source will be moved into. This field may or may not exist in the index. If it does not exist, the field will be created during indexing. The field name should not contain spaces, and should have the appropriate prefix for the target field data type.

  • Skip If Exists – This will prevent moving data to a target field if it already contains data.

  • Append Field – This appends values from the source field to the values in the existing target field.

  • Warn on Replace – Displays warnings when target field values are replaced.
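The rename-like behavior of Move Field can be sketched on a plain JavaScript object. The moveField function and the field names are hypothetical, and the sketch leaves out the Skip If Exists, Append Field and Warn on Replace options.

```javascript
// Hypothetical model of Move Field: copy the value to the target,
// then drop the source field from the document.
function moveField(doc, field, destination) {
  if (!(field in doc)) return { ...doc }; // nothing to move
  const { [field]: value, ...rest } = doc;
  return { ...rest, [destination]: value };
}

const moved = moveField({ name: "parcels", format: "shp" }, "name", "title");
console.log(moved); // { format: 'shp', title: 'parcels' }
```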

Set Field

Set Field sets the value of a field in the index (target).

Parameters

  • Field – The name of the target field that will have its value set. If this field does not already exist in the index, it will be created.

  • Value – The value to assign to the field.

  • Skip If Exists – If checked and values already exist in the target field, the Value entry will NOT be applied.

  • Append Field – If checked, the Value entry will be appended to the existing values in the target field.

  • Warn on Replace – If checked, and values in the target field are replaced (not appended), the warnings “indexWarning” (indicating that a replacement happened) and “indexingErrorTrace” (tracing the nature of the replacement) are added as facets to the index.
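The interaction of these options can be sketched on a plain JavaScript object. The setField function, its option names and the field names are hypothetical. The text stored in indexWarning and indexingErrorTrace here is a placeholder, since this page does not describe what those fields contain.

```javascript
// Hypothetical model of Set Field; returns a new document.
function setField(doc, field, value, options = {}) {
  const { skipIfExists = false, append = false, warnOnReplace = false } = options;
  const exists = field in doc;
  // Assumption: Skip If Exists takes precedence over Append Field.
  if (exists && skipIfExists) return { ...doc };
  if (exists && append) {
    const current = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    return { ...doc, [field]: [...current, value] };
  }
  const out = { ...doc, [field]: value };
  if (exists && warnOnReplace) {
    // Placeholder values; the real content of these fields is not documented here.
    out.indexWarning = "field value replaced";
    out.indexingErrorTrace = `Set Field replaced ${field}`;
  }
  return out;
}

const original = { category: "draft" };
console.log(setField(original, "category", "final", { skipIfExists: true }).category); // draft
console.log(setField(original, "category", "final", { append: true }).category);       // [ 'draft', 'final' ]
console.log("indexWarning" in setField(original, "category", "final", { warnOnReplace: true })); // true
```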

Split Field

Split Field splits the value of a string field (target) into an array of string values at each occurrence of one or more delimiting characters.

Parameters

  • Field – The name of the target field that will be split at each character in the list of delimiters. This field must exist in the index for this transformer to be effective.

  • Delimiters – A raw list of delimiter characters. The target field is split at every character in the list.
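Splitting on every character in a delimiter list can be sketched as follows. The splitField function and the field name are hypothetical, and this sketch keeps empty pieces and surrounding whitespace as they are, which may differ from what Voyager does.

```javascript
// Hypothetical model of Split Field: every character in `delimiters`
// is treated as a separate split point.
function splitField(doc, field, delimiters) {
  const value = doc[field];
  if (typeof value !== "string") return { ...doc }; // only string fields are split
  const parts = [];
  let current = "";
  for (const ch of value) {
    if (delimiters.includes(ch)) {
      parts.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  parts.push(current);
  return { ...doc, [field]: parts };
}

const split = splitField({ keywords: "roads;rivers,parcels" }, "keywords", ";,");
console.log(split.keywords); // [ 'roads', 'rivers', 'parcels' ]
```

Here the delimiter list ";," contains two characters, so the value is split at both the semicolon and the comma.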

Drop Field

Drop Field removes a field (source) from the index.

Parameters

  • Field – The name of the field to be removed from the index. This field must exist in the index for this transformer to be effective.

Add Data Quality Warning

Add Data Quality Warning is a transformer that populates the empty_file and zero_rows filters for the Debug Properties index field. This allows a user to quickly find source content that needs additional review for potential removal from the index. 

Normalize Spatial Reference

Normalize Spatial Reference can translate multiple spatial reference naming conventions into a common spatial reference system name (e.g. WGS84) in the target spatial reference system field. For example, one dataset's spatial reference naming convention uses GCS WGS 84 and the next dataset uses the convention WGS-84. This transformer would rename both to WGS84 in the target spatial reference field.
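The renaming described above amounts to a lookup from known variants to one common name. A minimal sketch, assuming a hand-written alias table that contains only the two variants mentioned in the example; the table and the normalizeSpatialReference function are hypothetical and do not reflect how Voyager stores its mappings.

```javascript
// Hypothetical alias table: variant name -> common name.
const SPATIAL_REFERENCE_ALIASES = new Map([
  ["GCS WGS 84", "WGS84"],
  ["WGS-84", "WGS84"],
]);

// Return the common name for a known variant; leave unknown names unchanged.
function normalizeSpatialReference(name) {
  const key = name.trim();
  return SPATIAL_REFERENCE_ALIASES.has(key)
    ? SPATIAL_REFERENCE_ALIASES.get(key)
    : key;
}

console.log(normalizeSpatialReference("GCS WGS 84")); // WGS84
console.log(normalizeSpatialReference("WGS-84"));     // WGS84
console.log(normalizeSpatialReference("NAD83"));      // NAD83
```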

Remove HTML Tags

Remove HTML Tags removes hypertext markup tags from a target field. Removing markup can make rich content easier to read in Voyager result sets. If this transformer is used after a Copy Field transformer, the original HTML is preserved in the Copy Field source for display where appropriate.

Parameters

  • Field – The field that contains the hypertext markup to be stripped. This field must exist in the index for this transformer to be effective.
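The copy-then-strip pattern can be sketched on a plain JavaScript object. The removeHtmlTags function and the field names are hypothetical, and the regular expression is a deliberately simple stand-in that removes anything between angle brackets; it is not a full HTML parser and is not Voyager's implementation.

```javascript
// Hypothetical model of Remove HTML Tags using a simple tag pattern.
function removeHtmlTags(doc, field) {
  if (typeof doc[field] !== "string") return { ...doc };
  return { ...doc, [field]: doc[field].replace(/<[^>]*>/g, "") };
}

const page = { description: "<p>Road <b>centerlines</b></p>" };

// Copy the field first, then strip markup from the copy only.
const withCopy = { ...page, description_text: page.description };
const cleaned = removeHtmlTags(withCopy, "description_text");

console.log(cleaned.description_text); // Road centerlines
console.log(cleaned.description);      // <p>Road <b>centerlines</b></p>
```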

Append All User Tags

Append All User Tags is a transformer that reassociates all tagging updates made to the index. If all or part of the index is cleared and rebuilt, this transformer reapplies user-applied Tags, Flags and Field Editing so that no tag information is lost.

Transform with Javascript

Applies JavaScript as the transformer action. Enter your JavaScript in the box provided.

Document Transformer Profiles

You can also combine one or more of these Transformers in profiles that can be applied globally or to one or more specific locations. See Document Transformer Profiles for more information.

See Also