Data Catalog Vocabulary (DCAT) Connector
The Data Catalog Vocabulary (DCAT) is a standard developed by the W3C (World Wide Web Consortium) designed to facilitate interoperability between data catalogs on the web. It enables the discovery and reuse of datasets by providing a common vocabulary for describing datasets and data catalogs.
See the full DCAT specification: https://www.w3.org/TR/vocab-dcat-3/
How to index DCAT:
A DCAT catalog can be indexed by creating a DCAT Repository. The connector creates entries for the following DCAT elements:
Catalog: Represents a data catalog, which is a collection of datasets. It includes metadata about the catalog itself.
Dataset: Represents a collection of data, published or curated by a single agent, and available for access or download.
Distribution: Represents an accessible form of a dataset, such as a downloadable file, an API, or a web service.
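To illustrate how the three element types nest, here is a minimal Python sketch that walks a catalog and lists one entry per Catalog, Dataset, and Distribution. It assumes the catalog is a data.json document in the DCAT-US layout (a top-level "dataset" array, each dataset with an optional "distribution" array). The function name iter_dcat_entries and the sample data are hypothetical, and this is not a description of how the connector itself parses catalogs.

```python
from typing import Iterator, Tuple


def iter_dcat_entries(catalog: dict) -> Iterator[Tuple[str, str]]:
    """Yield (element_type, label) pairs for a catalog given as a dict.

    Assumes the DCAT-US data.json layout: "dataset" and "distribution" arrays.
    """
    # The catalog itself is one entry.
    yield ("Catalog", catalog.get("title", "(untitled catalog)"))
    for dataset in catalog.get("dataset", []):
        yield ("Dataset", dataset.get("title", "(untitled dataset)"))
        for dist in dataset.get("distribution", []):
            # A distribution points at a file (downloadURL) or a service (accessURL).
            label = dist.get("downloadURL") or dist.get("accessURL") or dist.get("title", "")
            yield ("Distribution", label)


# Made-up sample data, used only to show the nesting.
sample_catalog = {
    "title": "Example Open Data",
    "dataset": [
        {
            "title": "Parks",
            "distribution": [
                {"title": "CSV", "downloadURL": "https://example.org/parks.csv"},
                {"title": "API", "accessURL": "https://example.org/api/parks"},
            ],
        },
        {"title": "Trails"},
    ],
}

entries = list(iter_dcat_entries(sample_catalog))
for kind, label in entries:
    print(f"{kind}: {label}")
```

In this sample, one catalog with two datasets and two distributions yields five entries in total.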
DCAT Repository provides these parameters:
Name - Name of the repository
URL - URL of the catalog, e.g. https://ct-deep-gis-open-data-website-ctdeep.hub.arcgis.com/data.json
Limit - Maximum number of documents to index
Advanced parameters:
File Size Limit - Maximum size, in megabytes, of a file that can be downloaded and further extracted by HQ.
Index Distributions - Whether to index each DCAT Distribution object (e.g. linked files). If unchecked, these files are not extracted further and only the DCAT Dataset objects are indexed.
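The following Python sketch shows one way the parameters above could interact. It is an illustration under stated assumptions, not the connector's actual logic. The class DcatRepositoryConfig, its field names, and the function select_documents are hypothetical. The sketch assumes that Limit counts datasets and distributions together, that one megabyte is 1024 * 1024 bytes, that a distribution may report its size in a "byteSize" field (often absent in real catalogs), and that oversized files are skipped entirely.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DcatRepositoryConfig:
    # Hypothetical field names mirroring the repository parameters.
    name: str
    url: str
    limit: Optional[int] = None                 # Limit; None means no limit
    file_size_limit_mb: Optional[float] = None  # File Size Limit, in megabytes
    index_distributions: bool = True            # Index Distributions checkbox


def select_documents(catalog: dict, config: DcatRepositoryConfig) -> List[str]:
    """Return labels of the documents that would be indexed (illustrative only)."""
    selected: List[str] = []
    for dataset in catalog.get("dataset", []):
        selected.append("dataset:" + dataset.get("title", ""))
        if not config.index_distributions:
            continue  # only Dataset objects are indexed
        for dist in dataset.get("distribution", []):
            size = dist.get("byteSize")  # assumed field; unknown size is never "too big"
            too_big = (
                config.file_size_limit_mb is not None
                and size is not None
                and size > config.file_size_limit_mb * 1024 * 1024
            )
            if not too_big:
                selected.append("distribution:" + dist.get("downloadURL", ""))
    if config.limit is not None:
        selected = selected[: config.limit]  # assumed: Limit caps the total count
    return selected


# Made-up catalog: dataset A has a 5 MB file and a 50 MB file.
MB = 1024 * 1024
sample = {
    "dataset": [
        {
            "title": "A",
            "distribution": [
                {"downloadURL": "small.csv", "byteSize": 5 * MB},
                {"downloadURL": "large.zip", "byteSize": 50 * MB},
            ],
        },
        {"title": "B"},
    ]
}

config = DcatRepositoryConfig(
    name="CT DEEP open data",
    url="https://ct-deep-gis-open-data-website-ctdeep.hub.arcgis.com/data.json",
    file_size_limit_mb=10,
)
with_distributions = select_documents(sample, config)

config.index_distributions = False
datasets_only = select_documents(sample, config)
```

With a 10 MB file size limit, the 50 MB file is left out. With Index Distributions turned off, only the two datasets remain.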