Voyager supports metadata extraction from standard XML documents using XPath queries. It supports many standard metadata specifications out of the box, and it also lets you enter your own XPath queries for specific metadata elements and map them to searchable field names in Voyager's index. These field names can already exist or be created on the fly. This topic gives an overview of the Voyager Metadata Extraction page and explains how to define XPath queries for metadata elements and how to specify field mapping parameters.

To access the Metadata Extraction page, open Voyager Server’s Manage UI and go to Manage > Discovery > Pipeline > Metadata.

Testing Your Mapping

To map the fields, configure these parameters:

  1. Choose the Selector: This specifies the XPath query that selects a specific metadata record element.

  2. Enter the Field Name: This is the target field in Voyager that gets mapped to the specified metadata output.

  3. Confirm the Type: This is the data type of the field. For example, if the field name is set to “name”, the data type is automatically set to “text.”

  4. Choose an Action: Users can select from five different functions:

  5. Converter: Converter settings are optional. If you do not specify one, Voyager assigns an appropriate converter to the field by default.

  6. Properties

Using the XML Box

The XML box lets you enter an XML document so that you can test your XPath queries against its elements.

Step 1: Click the XML tab and paste in the contents of a valid XML document. Click Save to save the XML contents.

In this case, the element we want to extract from the XML document is City.

Step 2: Specify values for Selector, Field Name and Action.  

Because we want to extract the City element, we copy the XPath query for that element from the XML document into the Selector box: "/metadata/metainfo/metc/cntinfo/cntaddr/city"

Step 3: Specify the corresponding Field Name that the queried element is mapped to. Voyager automatically detects the (Data) Type for the Field Name.

For example, here the Field Name is City, whose data type is String.

NOTE: When choosing a field name, either select an existing field name or enter a custom field name. A custom field name must use the prefix "meta_" or "id_".

Step 4: Click Test. The extractor searches the XML document for the queried metadata element, and retrieves the value for the field City. The results are presented in the Output tab.

In this example, "Washington D.C.", the value returned by the City query, is retrieved from the document in the XML tab and displayed in the Output tab. Once a value is included in the index in this way, users can search on it to find XML documents through Voyager's search UI.

Step 5: Click Save to add the XPath query to the list.
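
To make the walkthrough concrete, the following Python sketch imitates what the Test step does: it runs the City selector against a small XML document and pairs the result with a field name. It is an illustration only, not Voyager's implementation. The sample XML, the function names, and the custom field name "meta_city" are assumptions made for this example; only the selector path, the "Washington D.C." value, and the "meta_" and "id_" prefixes come from the steps above. Python's standard xml.etree.ElementTree module does not accept absolute paths on an element, so the sketch strips the root step before searching.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Only the element path and the city value
# are taken from the walkthrough; the rest is made up for illustration.
SAMPLE_XML = """<metadata>
  <metainfo>
    <metc>
      <cntinfo>
        <cntaddr>
          <city>Washington D.C.</city>
        </cntaddr>
      </cntinfo>
    </metc>
  </metainfo>
</metadata>"""

# Prefixes required for custom field names, per the NOTE above.
CUSTOM_PREFIXES = ("meta_", "id_")


def is_allowed_custom_name(field_name):
    """Return True if a custom field name starts with a required prefix."""
    return field_name.startswith(CUSTOM_PREFIXES)


def extract_values(xml_text, selector):
    """Return the text of every element matched by a simple absolute path.

    Handles only plain paths such as /a/b/c (no predicates or attributes).
    """
    root = ET.fromstring(xml_text)
    steps = [s for s in selector.split("/") if s]
    # The first step must name the root element, otherwise nothing matches.
    if not steps or steps[0] != root.tag:
        return []
    if len(steps) == 1:
        elements = [root]
    else:
        # ElementTree paths are relative to the root element.
        elements = root.findall("./" + "/".join(steps[1:]))
    return [(el.text or "").strip() for el in elements]


def map_field(xml_text, selector, field_name):
    """Pair a custom field name with the values its selector extracts."""
    if not is_allowed_custom_name(field_name):
        raise ValueError("custom field names must start with meta_ or id_")
    return {field_name: extract_values(xml_text, selector)}


selector = "/metadata/metainfo/metc/cntinfo/cntaddr/city"
print(map_field(SAMPLE_XML, selector, "meta_city"))
# prints {'meta_city': ['Washington D.C.']}
```

A selector that matches nothing returns an empty list, which is roughly what an empty Output tab would indicate: check the path against the document before saving the mapping.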