An Overview of Unity Catalog Data Governance Software
Unity Catalog is a powerful governance solution for all AI and data assets, including files, tables, and machine learning models, in your Databricks lakehouse on any cloud. With Databricks Unity Catalog access control, users can access the same database and table from any workspace, using both Databricks SQL Endpoints and Databricks Spark clusters. Unity Catalog access control also provides rich APIs to automate the management of access policies.
Security and Access Governance
Within the enterprise, Security and Access Governance needs to be implemented consistently across your Databricks Lakehouse, Unity Catalog, and all your data and analytics sources. Any such implementation also needs to keep up with the ever-changing regulatory compliance landscape. Building and maintaining a secure yet accessible data environment yourself (DIY) can be daunting. Privacera makes all this possible by centralizing the management of permissions and policies across all your data sources and tools, providing uniform management for data access, security, and auditing.
Simplified Data Access Management for Data Stewards, Owners, and Project Managers
Business teams need to make timely and confident decisions when sharing data. Privacera’s Governed Data Sharing (GDS) changes how access management and compliance are implemented to support this: access is managed at the Dataset level rather than at the individual resource (table/column) level. Various resources can be grouped under a single Dataset for simplified policy creation and application, giving Data Stewards, Data Owners, and Project Managers the ability to grant Data Analysts, Data Scientists, and other users discrete access. Policies for users and groups across platforms and data silos are managed from a central UI, which significantly speeds up user onboarding and the application of policies to new data.
Behind the scenes, GDS automates the entire process by breaking down rules at the data level into tag-based, attribute-based, and resource-based policies to enforce complex compliance requirements. Examples include restricting access to data records in tables and masking or de-identifying sensitive or PII fields that users don’t have the privilege to view. In addition to building policies, Privacera keeps track of who has changed them and who has accessed the data, so audit reports that meet compliance and security requirements are available immediately.
PolicyOps and Automation Support
Privacera is designed to work as an intuitive user interface to manage access, and also provides a rich API based on the power of open source Apache Ranger to codify the management of access policies. Thousands of customers have used Apache Ranger’s REST APIs to manage access policies through automation using YAML, Git and other tools to implement a robust and highly scalable PolicyOps solution.
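To make the PolicyOps idea more concrete, here is a minimal sketch of a policy expressed as data that could be kept in Git and pushed through a REST API. The helper name build_access_policy, the service name, and the database, table, and group names are all hypothetical, and the field layout only loosely follows Apache Ranger's policy JSON model, so check it against the Ranger or Privacera version you run. JSON is used instead of YAML here only to keep the example within Python's standard library.

```python
import json


def build_access_policy(service, name, database, table, groups, accesses=("select",)):
    """Build a resource-based access policy as a plain dict.

    Hypothetical helper: the field names are modeled on Apache Ranger's
    policy JSON and should be verified against your deployment.
    """
    return {
        "service": service,
        "name": name,
        "isEnabled": True,
        "resources": {
            "database": {"values": [database]},
            "table": {"values": [table]},
            "column": {"values": ["*"]},
        },
        "policyItems": [
            {
                "groups": list(groups),
                "accesses": [{"type": a, "isAllowed": True} for a in accesses],
            }
        ],
    }


# All names below are illustrative placeholders.
policy = build_access_policy(
    service="databricks_unity",
    name="sales-orders-read",
    database="sales",
    table="orders",
    groups=["analysts"],
)

# Stable key order keeps Git diffs small when policies are version controlled.
policy_json = json.dumps(policy, indent=2, sort_keys=True)
print(policy_json)
```

Keeping policies as plain data like this is what allows them to be reviewed, diffed, and rolled back with the same tooling used for application code.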
How does Privacera Integrate with Unity Catalog Access Control and Data Governance Software?
Privacera supports dozens of integrations with our open standards based platform. As one of the first Databricks partner integrations, Privacera integrates well with Unity Catalog, translating key high-level access policies into complex Unity Catalog constructs. This reduces the complexity of managing these constructs and eases the implementation of access governance as you apply these policies across your data sources. With the current integration with Unity Catalog, Privacera supports the following:
- Attribute-Based Access Control (ABAC)
- Tag-Based Access Control (TBAC)
- Role-Based Access Control (RBAC)
- Resource-Based Access Control
- Column Level Access Control using ABAC, TBAC, RBAC, and Resource-Based Policies
- Dynamic Unity Catalog Row Level Security and Access Control using ABAC, TBAC, RBAC, and Resource-Based Policies
- Dynamic Unity Catalog Data Masking using ABAC, TBAC, RBAC, and Resource-Based Policies
- Simplified Approval Workflow
- Integration with Active Directory, Azure AD, and Okta
- Centralized Auditing and Reporting
Use Case #1: Dynamic Column Masking using Tag-Based Access Controls
Due to compliance reasons, your data analysts may not have the privilege to view a customer’s email address regardless of which table or column contains it. This use case can be implemented using the combination of Unity Catalog and Privacera.
- Privacera’s Discovery tool scans and tags all columns which have email addresses.
- The governance team creates a tag-based access policy in Privacera where only certain groups from Active Directory have access to email addresses, while for others it is hashed.
- Privacera creates a single Secure View in Unity Catalog that translates the tag-based policy for the column, using case statements to apply Unity Catalog’s built-in User Defined Functions (UDFs) for users who don’t have the privilege to view email addresses.
- When users query any table that contains email columns, the appropriate UDF is applied automatically as needed.
In this case, we are defining the policies in the Privacera UI and running the query using a Databricks SQL Endpoint.
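The secure-view approach in the steps above can be sketched as a small generator that wraps tagged columns in a case statement. This is an illustration only, not the SQL Privacera actually emits: the function masked_view_sql and the view, table, column, and group names are hypothetical. It assumes the Databricks SQL functions is_account_group_member and sha2 for the group check and the hashing, and that all identifiers come from a trusted source (no escaping is done).

```python
def masked_view_sql(view, table, columns, tagged_columns, allowed_group):
    """Return a CREATE VIEW statement that hashes tagged columns
    for users outside allowed_group (hypothetical helper)."""
    select_items = []
    for col in columns:
        if col in tagged_columns:
            # Privileged users see the raw value; everyone else sees a hash.
            select_items.append(
                f"CASE WHEN is_account_group_member('{allowed_group}') "
                f"THEN {col} ELSE sha2({col}, 256) END AS {col}"
            )
        else:
            select_items.append(col)
    body = ",\n  ".join(select_items)
    return f"CREATE OR REPLACE VIEW {view} AS\nSELECT\n  {body}\nFROM {table}"


# Illustrative names; "email" stands for a column tagged by discovery.
sql = masked_view_sql(
    view="main.sales.customers_secure",
    table="main.sales.customers",
    columns=["id", "email"],
    tagged_columns={"email"},
    allowed_group="pii_readers",
)
print(sql)
```

Because the masking decision is driven by the set of tagged columns rather than by table names, the same policy applies to any table in which discovery finds an email column.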
Use Case #2: Dynamic Filtering of Records using Attribute-Based Access Controls
In this case, there is a global sales table with sales data from across the world. Data Analysts are only supposed to see data from the country they belong to. This policy can be applied more appropriately by using attribute-based access controls. Privacera extends Unity Catalog to enforce this policy by following these steps:
- In the corporate Active Directory, there is an attribute for users which contains the country the user belongs to. This could also be a group attribute.
- The Data Steward creates a Dynamic Row Level Filter policy using User Attribute.
- For the table, Privacera either reuses an existing secure view or creates a new secure view if needed and adds the complex predicate which filters only sales records of the country the users belong to.
- When the users run the query on the table, only the sales records of the customer(s) associated with that user (by country) will be retrieved.
In this case, we are defining the policies in Ranger YAML format, which is used by Privacera’s PolicyOps to automate the management of the policies.
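The effect of an attribute-based row filter can be illustrated with a small in-memory simulation. This is a sketch of the filtering logic only, not of how Privacera or Unity Catalog enforce it; the User class, the filter_rows function, and the sample data are hypothetical. It assumes a user without the attribute sees no rows (deny by default).

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A user with directory attributes (hypothetical model)."""
    name: str
    attributes: dict = field(default_factory=dict)


def filter_rows(rows, user, attribute="country"):
    """Keep only rows whose attribute value matches the user's attribute."""
    value = user.attributes.get(attribute)
    if value is None:
        return []  # assumption: deny by default when the attribute is missing
    return [row for row in rows if row.get(attribute) == value]


# Illustrative global sales table.
sales = [
    {"order_id": 1, "country": "US", "amount": 120},
    {"order_id": 2, "country": "DE", "amount": 80},
    {"order_id": 3, "country": "US", "amount": 45},
]

analyst_de = User("dana", {"country": "DE"})
contractor = User("sam")  # no country attribute set

visible_to_de = filter_rows(sales, analyst_de)
visible_to_contractor = filter_rows(sales, contractor)
print(visible_to_de)
print(visible_to_contractor)
```

The policy itself never names a country: the predicate is resolved per user at query time, which is why one policy covers every analyst.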
What’s Next?
Unity Catalog will be releasing native enforcement of attribute-based access control (ABAC), tag-based policies, dynamic column masking, external functions for UDFs, and other low-level enforcement integration hooks. As these features are made available to organizations, Privacera will further simplify the management of these policies by translating the enterprise’s global policies into Unity Catalog’s native enforcement layer.
Conclusion
Using Privacera, you can simplify and automate the management of policies with Unity Catalog access control while meeting complex Security and Governance requirements across your entire data landscape. Streamline compliance and accelerate data sharing to better enable data science projects with Databricks + Privacera.
Want to learn more? Schedule a demo with us to see Unity Catalog and Privacera in action.