The 6 Types of Metadata and their Use Cases (+ Intro to Active vs Passive Metadata)
Metadata is the “what, where, when, how, and who” of data.
According to Gartner, metadata is information that describes various facets of an information asset to improve its usability throughout its life cycle. It unlocks the value of data by helping to answer the “what, where, when, how, and who” of data.
There are several ways to classify metadata. For instance, NISO (National Information Standards Organization) classifies metadata as descriptive, administrative, structural, and markup languages. Ralph Kimball, the author of The Data Warehouse Toolkit, identifies three types of metadata: descriptive, structural, and administrative.
Each type of metadata maps to a use case. That is why we will focus on six types of metadata that translate into essential use cases for modern data organizations.
Types of metadata
These are the six main types of metadata:
- Technical metadata
- Governance metadata
- Operational metadata
- Collaboration metadata
- Quality metadata
- Usage metadata
Table of contents
- Types of metadata
- 1. Technical metadata
- 2. Governance metadata
- 3. Operational metadata
- 4. Collaboration metadata
- 5. Quality metadata
- 6. Usage metadata
- Summing up on types of metadata
- Types of metadata: Related reads
1. Technical metadata
According to the University of Warwick,
Technical metadata provides information on the technical properties of a digital file or the particular hardware and software environments required in order to render or process digital information.
So, technical metadata can include information about the rules, structure, and format for storing data, such as the location, data source, row or column count, data type, and schema.
For data to be discoverable and useful, it must have a logical structure and properties such as data type, name, size, owner, etc. Schemas, an example of technical metadata, store this information.
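As a rough illustration of the idea above, the sketch below reads technical metadata (column names, data types, and row count) from a table's schema. It assumes an in-memory SQLite database and a made-up `customers` table; the `describe_table` helper is hypothetical and not part of any catalog product.

```python
import sqlite3


def describe_table(conn, table):
    """Collect basic technical metadata for one SQLite table."""
    # PRAGMA table_info yields one row per column:
    # (cid, name, type, notnull, dflt_value, pk)
    columns = [
        {"name": row[1], "data_type": row[2], "nullable": not row[3]}
        for row in conn.execute(f'PRAGMA table_info("{table}")')
    ]
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    return {
        "table": table,
        "column_count": len(columns),
        "row_count": row_count,
        "columns": columns,
    }


# Hypothetical sample table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER NOT NULL, email TEXT, signup_date TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "a@example.com", "2024-01-05"), (2, None, "2024-02-11")],
)

technical_metadata = describe_table(conn, "customers")
print(technical_metadata)
```

A real warehouse would expose the same kind of information through its own system tables or information schema rather than a SQLite pragma.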
2. Governance metadata
Governance metadata provides information on how data is created, stored, accessed, and used. It includes governance terms, data classification, ownership information, and so on. With governance metadata, you can control how your data can be used, who can access it, and for what purpose.
For instance, you can set up user restrictions for PII data stored in Snowflake and define who can access what, and how the data can be used. This helps you ensure the security and credibility of data while guaranteeing regulatory compliance.
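To make this concrete, here is a minimal sketch of governance metadata attached to a single column, plus a check that uses it. Everything here is an assumption for illustration: the `GovernanceMetadata` class, the role names, and the purposes are hypothetical, and the check runs in plain Python rather than through Snowflake's own access controls.

```python
from dataclasses import dataclass, field


@dataclass
class GovernanceMetadata:
    """Hypothetical governance record for one data asset."""
    owner: str
    classification: str  # e.g. "public", "internal", "pii"
    allowed_roles: set = field(default_factory=set)
    allowed_purposes: set = field(default_factory=set)


def can_access(meta, role, purpose):
    """Allow access only when both the role and the purpose are approved."""
    return role in meta.allowed_roles and purpose in meta.allowed_purposes


# Illustrative policy for a PII column; names are made up.
customer_email = GovernanceMetadata(
    owner="data-platform",
    classification="pii",
    allowed_roles={"support_agent", "privacy_officer"},
    allowed_purposes={"customer_support"},
)

print(can_access(customer_email, "support_agent", "customer_support"))
print(can_access(customer_email, "support_agent", "marketing_campaign"))
```

In practice the enforcement would live in the warehouse or catalog; the point of the sketch is only that ownership, classification, and permitted use are themselves data that can be stored and queried.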
3. Operational metadata
According to IBM,
Operational metadata describes the events and processes that occur and the objects that are affected.
So, operational metadata tracks everything related to the flow of data throughout its lifecycle. It can include dependencies, code, lineage, ETL logs, and runtime. It adds an extra level of detail to data repositories and ETL processes.
Data provides value only if you can trust it. For that to happen, you need end-to-end visibility into everything from how data was ingested to the various forms in which it was used. That’s where operational metadata can help.
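One way to picture operational metadata is as a record written every time a pipeline step runs. The sketch below wraps a transformation and captures its upstream dependencies, start time, runtime, row counts, and status. The `run_with_metadata` helper, the step, and the `raw.customers` source name are all hypothetical.

```python
import time
from datetime import datetime, timezone


def run_with_metadata(step_name, func, rows, upstream):
    """Run one pipeline step and return its output plus a run record."""
    started_at = datetime.now(timezone.utc)
    t0 = time.perf_counter()
    status, output, error = "success", [], None
    try:
        output = func(rows)
    except Exception as exc:  # record the failure instead of hiding it
        status, error = "failed", repr(exc)
    record = {
        "step": step_name,
        "upstream": upstream,  # simple lineage: where the input came from
        "started_at": started_at.isoformat(),
        "runtime_seconds": time.perf_counter() - t0,
        "rows_in": len(rows),
        "rows_out": len(output),
        "status": status,
        "error": error,
    }
    return output, record


def drop_missing_emails(rows):
    """Example transformation: keep only rows that have an email."""
    return [r for r in rows if r.get("email")]


raw = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
clean, run_record = run_with_metadata(
    "drop_missing_emails", drop_missing_emails, raw, upstream=["raw.customers"]
)
print(run_record)
```

Collected across every step, records like this are what make it possible to trace how a dataset was produced and where a failure occurred.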
4. Collaboration metadata
Collaboration metadata is social metadata that contains insights on conversations around data. This includes data-related comments, discussions, chat transcripts, tags, bookmarks, and issue tickets.
For instance, several people may have asked via Slack about an unknown column in a table in the regional customer’s report. Collaboration metadata chronicles such conversations so that your data team can update the column within the report and notify everyone who reported the issue.
5. Quality metadata
According to an exploratory study on data quality metadata for decision-making,
Quality metadata is “information about the quality level of stored data in organization databases, and is measured along different dimensions such as accuracy, currency, and completeness.”
So, quality metadata can include quality metrics and measures, such as dataset status, freshness, tests run, and their statuses.
For instance, if you’re studying multiple tables on customer order-related data, you can look at quality metadata such as last test status, runtime, and percentage of test success to determine whether the data is credible.
Modern data catalogs also attribute a certificate of verification to tables, making it easier for data consumers to pick the right data and rely on it for decision-making.
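The quality signals mentioned above (last test status, percentage of test success, and freshness) can be derived from raw test results. The sketch below is one possible roll-up, with assumed names throughout: `summarize_quality`, the test names, and the 24-hour freshness threshold are illustrative choices, not a standard.

```python
from datetime import datetime, timedelta, timezone


def summarize_quality(test_results, last_loaded_at, now, max_age=timedelta(hours=24)):
    """Roll test results and load time up into quality metadata for one table."""
    total = len(test_results)
    passed = sum(1 for t in test_results if t["passed"])
    if total:
        last_status = "passed" if test_results[-1]["passed"] else "failed"
    else:
        last_status = None
    return {
        "tests_run": total,
        "pass_rate_pct": round(100 * passed / total, 1) if total else None,
        "last_test_status": last_status,
        # Fresh means the table was loaded within the allowed age window.
        "is_fresh": (now - last_loaded_at) <= max_age,
    }


# Hypothetical test results for an orders table, in run order.
now = datetime(2024, 4, 1, 12, 0, tzinfo=timezone.utc)
results = [
    {"name": "order_id_not_null", "passed": True},
    {"name": "order_id_unique", "passed": True},
    {"name": "amount_non_negative", "passed": False},
    {"name": "customer_id_exists", "passed": True},
]

quality_metadata = summarize_quality(
    results, last_loaded_at=now - timedelta(hours=6), now=now
)
print(quality_metadata)
```

A summary like this is the kind of evidence a data consumer could compare across several candidate tables before deciding which one to rely on.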
6. Usage metadata
Usage metadata records information about how much a dataset is used. So, it can include an asset’s view count, popularity, top users, frequency of use, and more. Usage metadata can be used to understand how an asset gets consumed, look for access patterns, and get rid of data that isn’t used much but is taking up warehouse space.
For instance, you can look at access patterns (how many times an asset was accessed, when, and by whom) to gauge whether that asset is popular. Let’s assume that two members of your marketing team have run several queries for the table containing customer information.
After studying usage metadata, you realize that your marketing team analyzes this table at the beginning of each quarter. You can work on making the data more accessible and comprehensive for the marketing team, and get their input to optimize it further. This promotes greater efficiency, productivity, and cost savings.
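The marketing example can be sketched as a small aggregation over an access log. The log entries, user names, and the `summarize_usage` helper below are invented for illustration; a real warehouse would supply this information through its own query history.

```python
from collections import Counter

# Hypothetical access log: one entry per query against an asset.
access_log = [
    {"asset": "customers", "user": "maya", "accessed_on": "2024-01-03"},
    {"asset": "customers", "user": "maya", "accessed_on": "2024-01-04"},
    {"asset": "customers", "user": "liam", "accessed_on": "2024-01-05"},
    {"asset": "orders", "user": "noor", "accessed_on": "2024-02-10"},
    {"asset": "customers", "user": "maya", "accessed_on": "2024-04-02"},
    {"asset": "customers", "user": "liam", "accessed_on": "2024-04-03"},
]


def summarize_usage(log, asset):
    """Aggregate access events for one asset into usage metadata."""
    events = [e for e in log if e["asset"] == asset]
    by_user = Counter(e["user"] for e in events)
    # "YYYY-MM-DD"[:7] gives the year and month, e.g. "2024-01".
    by_month = Counter(e["accessed_on"][:7] for e in events)
    return {
        "asset": asset,
        "access_count": len(events),
        "top_users": by_user.most_common(2),
        "accesses_by_month": dict(sorted(by_month.items())),
    }


usage_metadata = summarize_usage(access_log, "customers")
print(usage_metadata)
```

In this made-up log, accesses to `customers` cluster in January and April, which is the sort of start-of-quarter pattern described above.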
While managing metadata, you can think of the different types of metadata listed above as active or passive metadata.
Types of metadata management:
- Active metadata
- Passive metadata
Passive metadata is the metadata collected and managed via manual processes. It is static and requires human effort to curate and document. As a result, it doesn’t offer complete visibility into what’s happening inside data pipelines in real time.
Meanwhile, active metadata describes data along with everything that happens or is done to it. Unlike passive metadata, active metadata is captured from sources in real time, so data practitioners and business leaders can easily identify, track, manage, trust, and understand data assets.
Summing up on types of metadata
Understanding the various types of metadata that exist within your organization is the first step to documenting its use cases and setting up an ecosystem that leverages them.
After documenting the types of metadata, the next step is to ensure effective handling with an active metadata management platform like Atlan.
Atlan is built for the modern data stack, marking a shift from traditional metadata management to a new era: “Collaboration-focused, intelligent, and open by default”.