Indexing Esri Geodatabases and Shapefiles

Dependencies

Indexing Esri Geodatabase items requires ArcGIS and Python. NOTE: If ArcGIS is installed, the matching version of Python is usually installed with it. If Python is not installed, install the version that your ArcGIS release supports:

ArcGIS 10.0 requires Python 2.6
ArcGIS 10.1 and 10.2 require Python 2.7
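
The version pairing above can be checked with a few lines of Python. This is a minimal sketch: `required_python` and `python_matches` are hypothetical helper names, not part of ArcGIS or Voyager, and the lookup table covers only the releases listed above.

```python
import sys

# ArcGIS release -> required Python (major, minor), taken from the list above.
ARCGIS_PYTHON = {
    "10.0": (2, 6),
    "10.1": (2, 7),
    "10.2": (2, 7),
}


def required_python(arcgis_version):
    """Return the (major, minor) Python version for a release, or None if unlisted."""
    return ARCGIS_PYTHON.get(arcgis_version)


def python_matches(arcgis_version, running=None):
    """Return True if the interpreter version matches what the release requires."""
    if running is None:
        # Default to the interpreter that is executing this script.
        running = sys.version_info[:2]
    return required_python(arcgis_version) == tuple(running)
```

Running `python_matches("10.2")` with the interpreter that ArcGIS uses shows whether the installed Python is the expected one.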

Indexing Data

After confirming the dependencies, add a new location. The steps below create a new location, and sample configurations follow.

Go to Manage Voyager
Click Locations
Click New Location
Click Databases (Advanced)
Select ArcGIS Geodatabase in the Connections drop-down list
Edit the configuration

Configuration Examples

Example 1

This configuration indexes all the tables and feature classes in an Esri File Geodatabase. It includes all fields and maps a single field, NAME, for every table. The asterisk (*) in this configuration means "all": include all fields, include all tables, and apply the mapping to all tables. Setting the multiprocessing option to true is recommended only when the Geodatabase contains many tables and those tables contain many records; for such large Geodatabases, multiprocessing may improve performance.

{
    "name": "TemplateGDB",
    "type": "python",
    "config": {
        "fields": {
            "include": ["*"]
        },
        "tables": [
            {
                "name": "*",
                "action": "INCLUDE"
            },
            {
                "name": "*",
                "map": {
                    "NAME": "name"
                }
            }
        ],
        "multiprocessing": "false",
        "path": "C:\\GISData\\TemplateData.gdb"
    }
}
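
The effect of the asterisk can be approximated with shell-style pattern matching. The sketch below uses Python's standard `fnmatch` module. It assumes that the wildcard behaves like a shell-style glob and that matching is case-sensitive, neither of which is confirmed here. `select_names` is a hypothetical helper, and the table names are invented for illustration.

```python
from fnmatch import fnmatchcase


def select_names(names, patterns):
    """Return the names that match at least one wildcard pattern, keeping order."""
    # Assumption: "*" works like a shell-style glob and is case-sensitive.
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


# Hypothetical table names, for illustration only.
tables = ["City", "Country", "Lakes"]

all_tables = select_names(tables, ["*"])   # "*" alone matches every table
c_tables = select_names(tables, ["C*"])    # only names that start with "C"
```

The same function applies to field names, which is how a single `"include": ["*"]` entry can select every field.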

Example 2

There are two ways to index a single table or feature class.

Option 1: Set the path to the full catalog path:

{
    "name": "WORLD_CITIES",
    "type": "python",
    "config": {
        "fields": {
            "include": ["*"]
        },
        "tables": [
            {
                "name": "*",
                "action": "INCLUDE"
            },
            {
                "name": "*",
                "map": {
                    "NAME": "name"
                }
            }
        ],
        "multiprocessing": "false",
        "path": "C:\\GISData\\TemplateData.gdb\\World\\City"
    }
}

Option 2: Set the table name in the tables section:

{
    "name": "WORLD_CITIES",
    "type": "python",
    "config": {
        "fields": {
            "include": ["*"]
        },
        "tables": [
            {
                "name": "City",
                "action": "INCLUDE"
            },
            {
                "name": "*",
                "map": {
                    "NAME": "name"
                }
            }
        ],
        "multiprocessing": "false",
        "path": "C:\\GISData\\TemplateData.gdb"
    }
}
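
The two options differ only in where the table is named: Option 1 folds the feature dataset and table name into the path, while Option 2 keeps the path at the Geodatabase and names the table in the tables section. The sketch below splits a full catalog path at its `.gdb` component to show how the two relate. `split_catalog_path` is a hypothetical helper, and it assumes a File Geodatabase whose folder name ends in `.gdb`.

```python
def split_catalog_path(catalog_path):
    """Split a catalog path into (geodatabase path, [names inside the geodatabase])."""
    parts = catalog_path.split("\\")
    for i, part in enumerate(parts):
        # Assumption: the geodatabase is the first component ending in ".gdb".
        if part.lower().endswith(".gdb"):
            return "\\".join(parts[: i + 1]), parts[i + 1:]
    raise ValueError("no .gdb component in %r" % catalog_path)


# The Option 1 path: geodatabase, then feature dataset, then table.
gdb, inner = split_catalog_path("C:\\GISData\\TemplateData.gdb\\World\\City")
table_name = inner[-1]  # the value Option 2 puts in the tables section
```

Here `gdb` is the value used for path in Option 2, and `table_name` is the value used for the table name.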

Example 3

This example demonstrates the following:

Index only tables starting with the prefix STATES or CITIES
Include only fields that are prefixed with STATE or CITY
Map the STATE_NAME field for all tables and the CITY_NAME field for only the CITIES table
Use queries and constraints to limit the number of rows that are indexed. See the Usage Notes below for further explanation.

{
    "name": "USA_FDS",
    "type": "python",
    "config": {
        "fields": {
            "include": ["STATE*", "CITY*"]
        },
        "tables": [
            {
                "name": "STATES*",
                "action": "INCLUDE"
            },
            {
                "name": "CITIES*",
                "action": "INCLUDE"
            },
            {
                "name": "*",
                "map": {
                    "STATE_NAME": "name"
                },
                "query": "STATE_NAME = 'California'"
            },
            {
                "name": "CITIES",
                "map": {
                    "CITY_NAME": "name"
                },
                "constraint": "POP1990 > 100000"
            }
        ],
        "multiprocessing": "false",
        "path": "C:\\GISData\\TemplateData.gdb\\USA"
    }
}
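
The way the query and constraint in this configuration combine can be sketched in Python. This is an illustration of the rule, not Voyager's implementation: `effective_expression` is a hypothetical function, wildcard matching is assumed to be shell-style, and a later matching query is assumed to replace an earlier one.

```python
from fnmatch import fnmatchcase


def effective_expression(table, entries):
    """Build the row filter for a table from every entry whose name matches it."""
    query = None
    constraints = []
    for entry in entries:
        # Assumption: entry names are shell-style wildcard patterns.
        if not fnmatchcase(table, entry["name"]):
            continue
        if "query" in entry:
            query = entry["query"]  # assumption: a later query replaces an earlier one
        if "constraint" in entry:
            constraints.append(entry["constraint"])  # constraints are ANDed on
    parts = ([query] if query else []) + constraints
    return " AND ".join(parts)


# The query and constraint entries from the configuration above.
entries = [
    {"name": "*", "query": "STATE_NAME = 'California'"},
    {"name": "CITIES", "constraint": "POP1990 > 100000"},
]

cities_filter = effective_expression("CITIES", entries)
states_filter = effective_expression("STATES", entries)
```

With these entries, the CITIES table is filtered by both conditions, while the STATES table is filtered by the query alone.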

Usage Notes

In Example 3, the query applies to all tables, and the constraint is combined with the query when indexing the CITIES table. The expression used when indexing the CITIES table is therefore "STATE_NAME = 'California' AND POP1990 > 100000". If a query were used for CITIES instead of a constraint, the expression would be only "POP1990 > 100000".
A table entry cannot have both a query and a constraint.
The asterisk (*) is a wildcard that can be used to limit tables, fields and field mappings.
Geographic information is included in the index for feature data. For point features, the X and Y coordinates are indexed. For other geometries, the extent coordinates (bounding box) are recorded.
An Esri Shapefile can also be indexed by setting the path option to the path of the shapefile. Other data types such as dBASE, CAD and SDC can also be indexed.
After you add a location, you can edit the generalization value for geometries. This value determines how much a geometry is simplified before it is indexed. The default value is 0.5, which preserves each geometry's shape as much as possible while reducing the number of vertices, and therefore the size of the feature in the index.
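
The geographic rule in the notes above (a coordinate pair for points, a bounding box for everything else) can be illustrated as follows. This is a sketch only: `indexed_geometry` is a hypothetical function that works on plain coordinate tuples rather than the ArcGIS geometry API, the returned dictionary layout is invented for illustration, and a single vertex is treated as a point for simplicity.

```python
def indexed_geometry(vertices):
    """Return the location summary for a feature given its (x, y) vertices."""
    if not vertices:
        raise ValueError("feature has no vertices")
    if len(vertices) == 1:
        # Simplification: one vertex is treated as a point feature.
        x, y = vertices[0]
        return {"type": "point", "x": x, "y": y}
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Any other geometry is summarized by its extent (bounding box).
    return {
        "type": "extent",
        "xmin": min(xs),
        "ymin": min(ys),
        "xmax": max(xs),
        "ymax": max(ys),
    }
```

For example, `indexed_geometry([(0, 0), (4, 1), (2, 5)])` summarizes a three-vertex shape by the box from (0, 0) to (4, 5).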