By: Imad Qureshi
What is data governance?
In an increasingly data-driven world, organizations use data to remain competitive, find new opportunities, and provide better service to their customers. To extract valuable insights from this data, they must also comply with the laws and regulations that govern data privacy, storage, and processing. Data governance comprises the set of rules and processes that an organization adopts to manage the data that supports its business operations. This includes:
- People (how the data organization is structured): centralized or decentralized, data ownership and stewardship, data users, data custodians
- Processes: automation, reporting, audits
- Technology: data security, auditing, monitoring, and alerts
Why is data governance important?
Whether it is a country, a large enterprise, or a small business, every organization strives to achieve good governance. This involves ensuring the efficient use of resources to maximize value to citizens or customers and to generate an optimal return for investors or taxpayers. To that end, we assign responsibilities, implement policies, use effective tools and technologies, and provide incentives to the people in charge to achieve the desired goals. Without governance, there would be chaos.
Similarly, data governance is critical for organizations that want to maximize the value of their data by unlocking its insights. Effective data governance ensures that data resources are used to produce a maximum return on investment. However, only very recently have leaders in both business and IT organizations begun to think about the role effective data governance can play in transforming their companies into modern, data-driven enterprises.
Elements of effective data governance
Data governance is more than just data security. It encompasses the entire data lifecycle, from acquisition through storage, processing, and serving to retirement.
To implement effective data governance, organizations need to assign responsibilities to the right people at the right level in the organization. Companies need to create incentives to extract insights from data while remaining compliant with regulations, and they need the right tools and technologies to make it easier to find and use data in a secure environment.
Roles in data governance
Effective data governance begins with clearly defined data governance roles:
- Domain (e.g., customer, product, provider) data owners and stewards: own and manage data
- Data custodians (e.g., DevOps, database administrators, IT): manage data on behalf of data owners
- Data users (e.g., analysts, scientists): extract insights from data
Data owners and stewards are responsible for determining what data to collect, store, and share in the organization to drive business value, while ensuring compliance with all the regulations.
IT serves as the data custodian rather than the data owner. This distinction is important, because one of the most common challenges organizations face today is the lack of proper data ownership, which results in IT owning the data by default and using ad hoc criteria to grant data access. This creates continuous friction between the business and IT organizations, which leads to data silos, increased risk of breaches or unauthorized access, and compliance violations that could negatively impact brand image and shareholder value.
Without clearly defined data owners and stewards, organizations end up spending more time on data storage and processing and replicating the same work in multiple business units. A common complaint among data users, such as data scientists and analysts, is that they spend more time looking for data than analyzing it to find insights. Because the data is scattered across the organization, this leads to missed opportunities to share data, inconsistent data quality standards, and increased risk of compliance violations. Data owners and stewards need allocated budgets to drive digitization efforts. These leaders should be rewarded and held accountable for transforming the business into a data-driven enterprise.
Under the leadership of data owners and stewards, acceptable data quality standards are established. Data that does not meet the quality standards is quarantined, then cleansed, enriched, and revalidated until it does. Only data that meets these standards is stored in the data lake or data warehouse that drives analytics. High-quality data ensures that insights can be trusted to make informed business decisions.
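The quality gate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule names, the record fields (`customer_id`, `email`), and the `quality_gate` function are hypothetical and not part of any particular product.

```python
from dataclasses import dataclass, field
from typing import Callable

Record = dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical quality rules a data steward might define for a customer domain.
RULES: dict[str, Rule] = {
    "has_customer_id": lambda r: bool(r.get("customer_id")),
    "email_has_at_sign": lambda r: "@" in str(r.get("email", "")),
}


@dataclass
class GateResult:
    accepted: list[Record] = field(default_factory=list)
    # Each quarantined record is paired with the names of the rules it failed.
    quarantined: list[tuple[Record, list[str]]] = field(default_factory=list)


def quality_gate(records: list[Record], rules: dict[str, Rule] = RULES) -> GateResult:
    """Accept records that pass every rule; quarantine the rest."""
    result = GateResult()
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        if failed:
            result.quarantined.append((record, failed))
        else:
            result.accepted.append(record)
    return result
```

Only the records in `accepted` would move on to the data lake or warehouse. Quarantined records keep the list of failed rules, so they can be cleansed and passed through the gate again.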
Metadata collection and lineage
Collecting metadata and tracking lineage enables data stewards to easily share data, create audit reports, and trace the root causes of issues that can corrupt data. Mature data organizations collect comprehensive metadata and store it in a searchable catalog so that users can easily find the data they need to do their jobs.
Securing data starts the moment it is acquired and continues throughout the data lifecycle until the data is retired safely. Data security includes data encryption at rest and in transit, data access control, data sharing agreements with partners and vendors, and monitoring and alerting on suspicious activity.
Auditing is required to ensure compliance with regulatory requirements. Organizations must perform internal and external audits regularly to ensure they are in compliance and to report on the state of their data management. Internal audits prevent lapses in required processes, identify unauthorized data access, report suspicious activity, and close security gaps left by loopholes in data authorization policies.
An effective data governance platform needs to support an organization’s analytic requirements while ensuring robust security and governance. This platform needs to provide the capability to collect metadata and trace lineage, search data via a data dictionary or catalog, secure data access, encrypt data, and implement data quality standards. It also needs to provide reporting that can be used by the internal governance team and shared with auditors to satisfy compliance requirements.
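To make the idea of a searchable catalog with lineage concrete, here is a minimal in-memory sketch. The `DatasetEntry` and `Catalog` names and their fields are assumptions made for illustration. Real catalog products expose much richer models and their own APIs.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    name: str
    owner: str
    description: str
    tags: set[str] = field(default_factory=set)
    upstream: list[str] = field(default_factory=list)  # datasets this one is derived from


class Catalog:
    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        """Return names of datasets whose name, description, or tags contain term."""
        term = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if term in e.name.lower()
            or term in e.description.lower()
            or any(term in t.lower() for t in e.tags)
        )

    def lineage(self, name: str) -> list[str]:
        """Return every upstream ancestor of a dataset, without duplicates."""
        seen: list[str] = []
        stack = list(self._entries[name].upstream) if name in self._entries else []
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # already visited; also guards against cycles
            seen.append(current)
            if current in self._entries:
                stack.extend(self._entries[current].upstream)
        return seen
```

A steward investigating a corrupted report could call `lineage` on it to list every dataset it was derived from, then check each one in turn.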
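As a rough sketch of one internal audit check, the snippet below compares an access log against the permissions that data owners have granted and flags anything not covered. The `AccessEvent` type, the shape of the `grants` mapping, and the `find_unauthorized` function are hypothetical. A real audit would read from the platform's own logs and policy store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    user: str
    dataset: str
    action: str  # for example "read" or "write"


def find_unauthorized(
    events: list[AccessEvent],
    grants: dict[tuple[str, str], set[str]],
) -> list[AccessEvent]:
    """Return events whose action is not granted for that (user, dataset) pair."""
    return [
        e for e in events
        if e.action not in grants.get((e.user, e.dataset), set())
    ]
```

Events returned by this check would feed the audit report and point to authorization policies that need tightening.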
Given the broad landscape of data governance, multiple technologies are used to implement security and governance requirements. For example, companies like Collibra, Informatica, and Alation provide tools to store and share metadata in order to make it easier to find data. Privacera, Thales, Micro Focus, and IBM provide platforms to encrypt data at rest, while companies like Talend and Informatica offer products to standardize and automate data quality.
The Privacera Platform simplifies and centralizes this process for hybrid and multi-cloud infrastructures by offering automated data discovery, fine-grained access controls, a sensitive data catalog, field-level encryption, and comprehensive audit reports to ensure compliance, all as part of a unified platform.
Stay tuned: in our next blog, we will discuss the challenges of implementing effective data governance in the cloud and how Privacera helps accelerate this process securely.