Metadata Harvesting
Metadata harvesting is the process of automatically discovering, extracting, and cataloging metadata from data assets across an organization’s ecosystem. It creates a comprehensive view of the data landscape, capturing critical details such as schema, relationships, data lineage, and usage patterns. This process ensures that metadata remains current and accessible, enabling organizations to identify, classify, and understand their data assets more effectively.
Metadata Bridges
Metadata bridges provide an unrivaled array of 230+ deep-linking connectors to databases, file systems, streams, BI tools, dictionaries, repositories, and data integration tools. Explore the full list of connections.
Profiling and Sampling
Harvesting includes column-level counts, patterns, inferred data types, and value frequencies. Model-level summarization includes statistics and charts.
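To make the column-level statistics concrete, here is a minimal sketch of profiling a sampled column in Python. It is an illustration only, not the product's implementation: the function names (`value_pattern`, `infer_type`, `profile_column`), the pattern notation (`9` for a digit, `A` for a letter), and the sample values are all assumptions made for this example.

```python
from collections import Counter
import re


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: every digit becomes 9, every letter becomes A."""
    digits_masked = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", digits_masked)


def infer_type(values):
    """Guess the narrowest type that every non-empty value can be cast to."""
    def fits(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if not values:
        return "unknown"
    if fits(int):
        return "integer"
    if fits(float):
        return "decimal"
    return "string"


def profile_column(values):
    """Compute counts, an inferred type, value frequencies, and pattern frequencies."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "inferred_type": infer_type(non_null),
        "top_values": Counter(non_null).most_common(3),
        "top_patterns": Counter(value_pattern(v) for v in non_null).most_common(3),
    }


# Hypothetical sample of a postal-code column with one empty and one odd value.
sample = ["10115", "80331", "", "10115", "2000A"]
profile = profile_column(sample)
```

In this sample, the single value `2000A` is enough to push the inferred type from integer to string, while the pattern frequencies show that most values still share the five-digit shape. That contrast is the kind of signal profiling is meant to surface.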
Logic Parsing
Parsers for SQL, Python, Scala, and data integration tools (such as PowerCenter, DataStage, and Talend) are built into the bridges. The logic they parse contributes to the harvested metadata, which is used to build lineage.
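As a rough illustration of how parsed logic turns into lineage, the Python sketch below extracts source-to-target edges from a single `INSERT ... SELECT` statement. It is a deliberately simplified stand-in: a production parser would build a full syntax tree and handle subqueries, CTEs, and dialect differences, whereas this example uses two regular expressions. The function name `extract_lineage` and the table names are hypothetical.

```python
import re

# Simplifying assumption: one INSERT ... SELECT per statement, with plain
# table references (no subqueries, CTEs, or quoted identifiers).
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_lineage(sql: str):
    """Return (source_table, target_table) edges found in the statement."""
    target = TARGET_RE.search(sql)
    if target is None:
        return []  # nothing is written, so there is no lineage edge to record
    sources = sorted(set(SOURCE_RE.findall(sql)))
    return [(source, target.group(1)) for source in sources]


statement = """
INSERT INTO mart.daily_sales
SELECT o.order_date, SUM(o.amount)
FROM staging.orders o
JOIN staging.customers c ON c.id = o.customer_id
GROUP BY o.order_date
"""

edges = extract_lineage(statement)
# edges == [("staging.customers", "mart.daily_sales"),
#           ("staging.orders", "mart.daily_sales")]
```

Each edge states that a source table feeds the target table. A lineage graph is the union of such edges across all harvested statements.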
Incremental Harvesting
Source change detection is built into the bridges, so repeat harvests can pick up only what has changed, making it possible to keep metadata in sync at massive scale.
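One common way to detect source changes is to store a fingerprint of each object's metadata and compare it on the next run. The Python sketch below shows that idea under stated assumptions: the source does not describe how the bridges detect changes, so the hashing approach, the function names (`fingerprint`, `plan_incremental_harvest`), and the sample schemas are all hypothetical.

```python
import hashlib
import json


def fingerprint(schema: dict) -> str:
    """Hash a canonical JSON form of the schema so that key order does not matter."""
    canonical = json.dumps(schema, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def plan_incremental_harvest(previous: dict, current_schemas: dict):
    """Compare stored fingerprints with the current state of the source.

    previous:        object name -> fingerprint saved by the last run
    current_schemas: object name -> schema as seen now
    Returns (changed_or_new, removed, fingerprints_to_store).
    """
    current = {name: fingerprint(schema) for name, schema in current_schemas.items()}
    changed = sorted(name for name, fp in current.items() if previous.get(name) != fp)
    removed = sorted(name for name in previous if name not in current)
    return changed, removed, current


# First run: nothing is stored yet, so every object counts as new.
first_run = {
    "orders": {"id": "int", "amount": "decimal"},
    "customers": {"id": "int"},
    "legacy_codes": {"code": "string"},
}
initial_changed, _, stored = plan_incremental_harvest({}, first_run)

# Second run: one column added to orders, legacy_codes dropped.
second_run = {
    "orders": {"id": "int", "amount": "decimal", "currency": "string"},
    "customers": {"id": "int"},
}
changed, removed, stored = plan_incremental_harvest(stored, second_run)
# changed == ["orders"], removed == ["legacy_codes"]
```

Only `orders` would need to be re-harvested on the second run. The unchanged `customers` table is skipped, which is what keeps repeat harvests cheap as the number of objects grows.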
Auto Classification
A learning-based data classification methodology identifies data assets, auto-tags them, and proposes classifications for them.
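The general idea of learning from tagged data and proposing tags for new data can be sketched in Python as follows. The source does not describe the actual methodology, so this example substitutes a very simple technique: it remembers the value shapes seen in columns that were already tagged, then proposes the tag whose known shapes cover the largest share of a new column's values. The class `ShapeClassifier`, its methods, the 0.6 threshold, and the sample data are all assumptions for illustration.

```python
from collections import Counter, defaultdict
import re


def shape(value: str) -> str:
    """Mask digits as 9 and letters as A so similar values share one shape."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"[0-9]", "9", value))


class ShapeClassifier:
    """Toy classifier: learns value shapes per tag, proposes tags for new columns."""

    def __init__(self):
        self.shape_counts = defaultdict(Counter)  # tag -> Counter of shapes seen

    def learn(self, tag, values):
        """Record the shapes of values from a column a steward has tagged."""
        self.shape_counts[tag].update(shape(v) for v in values)

    def propose(self, values, min_score=0.6):
        """Return (tag, score); tag is None when no tag reaches min_score."""
        if not values:
            return None, 0.0
        shapes = [shape(v) for v in values]
        best_tag, best_score = None, 0.0
        for tag, counts in self.shape_counts.items():
            # Share of sampled values whose shape was seen under this tag.
            score = sum(1 for s in shapes if s in counts) / len(shapes)
            if score > best_score:
                best_tag, best_score = tag, score
        if best_score < min_score:
            return None, best_score
        return best_tag, best_score


classifier = ShapeClassifier()
classifier.learn("email", ["ann@example.com", "bob@example.org"])
classifier.learn("us_phone", ["555-010-1234", "555-010-9876"])

# Two of the three sampled values match a known phone shape.
tag, score = classifier.propose(["555-010-0000", "555-010-4321", "n/a"])
```

Returning the score alongside the tag lets a proposal be presented for review rather than applied blindly, which matches the "proposal of classification" wording above.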
Remote Harvesting Agents
Agents are a configuration option that supports hybrid and cloud architectures by separating the collection agent and bridge from the MetaKarta server.