PRIVACERA FOR DATABRICKS

Continuous Security, Privacy and Governance for Databricks Environments

Enable consistent governance and security across your machine learning and artificial intelligence workloads.

Deep Data Discovery & Classification

Privacera automatically profiles and scans data in Amazon S3 and Azure Data Lake Store (ADLS), as well as in tables and schemas created in Databricks. Files and tables are tagged, and the resulting data classifications are stored in the Privacera catalog.

Access Management and Fine Grained Access Control

Privacera leverages Apache Ranger to enable row, column and file-level access control and to enforce centralized access policies across Spark SQL and other workloads in Databricks.

Data Anonymization and Masking

Privacera de-identifies sensitive data with masking or encryption methods. Data can be anonymized before it lands in cloud storage, and Privacera can dynamically de-anonymize it, based on user-level policies, when it is accessed in Databricks or other services.
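As a rough illustration of the idea only, the sketch below anonymizes a value before it is stored and returns the clear text only to users named in a policy. It is not Privacera's implementation or API: `TokenVault`, `UNMASK_POLICY`, and the user names are hypothetical, and the in-memory token lookup stands in for whatever masking or encryption method a real deployment would use.

```python
import hashlib

# Hypothetical policy: which users may see each sensitive column in clear text.
UNMASK_POLICY = {"ssn": {"alice"}}


class TokenVault:
    """Toy reversible tokenizer (assumption: stands in for real encryption)."""

    def __init__(self):
        self._clear_by_token = {}

    def anonymize(self, value: str) -> str:
        # Replace the value with an opaque token before it is written to storage.
        token = "tok_" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]
        self._clear_by_token[token] = value
        return token

    def read(self, column: str, token: str, user: str) -> str:
        # De-anonymize only for users the policy names; everyone else sees the token.
        if user in UNMASK_POLICY.get(column, set()):
            return self._clear_by_token[token]
        return token


vault = TokenVault()
stored = vault.anonymize("123-45-6789")
print(vault.read("ssn", stored, "alice"))  # clear text for an authorized user
print(vault.read("ssn", stored, "bob"))    # token for everyone else
```

The point of the sketch is the split between write-time anonymization and read-time, policy-driven de-anonymization; the storage layer only ever holds the token.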

Databricks Technology Partner

Row, column and file-level access control in Apache Spark.

Deep data discovery and classification for data governance.

Comply with privacy and security regulations such as CCPA, GDPR and others.

Balance Governance and Security with the Need to Use Data for Analytics

Privacera integrates seamlessly with Databricks at the infrastructure level and provides continuous security and privacy across the stack, including enabling data anonymization and masking for analytics.

Integrate with Databricks and Cloud Services

Privacera natively integrates with Databricks, Amazon S3, Azure Data Lake Store and other cloud services.

Row, Column, and File-Level Access Control in Spark

Privacera leverages Apache Ranger-based plugins to provide column, row and file-level access control across Spark functions.
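To make row-level and column-level control concrete, here is a minimal sketch in plain Python. It does not use Apache Ranger or Spark, and `AccessPolicy` and `apply_policy` are hypothetical names chosen for illustration; it only shows the effect such a policy has on a result set.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AccessPolicy:
    """Hypothetical policy: which columns a user may see and which rows qualify."""
    user: str
    allowed_columns: frozenset
    row_filter: Callable[[dict], bool]


def apply_policy(rows, policy):
    # Row-level control: drop rows the filter rejects.
    # Column-level control: keep only the columns the policy allows.
    return [
        {col: val for col, val in row.items() if col in policy.allowed_columns}
        for row in rows
        if policy.row_filter(row)
    ]


rows = [
    {"name": "Ann", "region": "EU", "salary": 100},
    {"name": "Bob", "region": "US", "salary": 90},
]
eu_analyst = AccessPolicy(
    user="analyst",
    allowed_columns=frozenset({"name", "region"}),
    row_filter=lambda row: row["region"] == "EU",
)
visible = apply_policy(rows, eu_analyst)
print(visible)
```

With this policy the analyst sees only EU rows and never sees the `salary` column.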

Scan and Profile All Data in Databricks and Cloud Storage

Privacera scans and profiles any new data landing in cloud storage and across databases created in Databricks. Privacera runs on a Databricks cluster and uses machine learning and rules to accurately identify specific data types and apply tags.
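The rule-based half of this process can be sketched as pattern matching over sampled column values. The example below is an assumption-laden illustration, not Privacera's scanner: the tag names, the regular expressions, the `classify_column` helper, and the 80% threshold are all hypothetical, and the machine learning component is not modeled.

```python
import re

# Hypothetical rules: tag name -> pattern a value must match.
TAG_RULES = {
    "EMAIL": re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),
    "US_SSN": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(values, threshold=0.8):
    """Return the tags whose rule matches at least `threshold` of the non-empty values."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return set()
    tags = set()
    for tag, pattern in TAG_RULES.items():
        matches = sum(1 for v in non_empty if pattern.match(v))
        if matches / len(non_empty) >= threshold:
            tags.add(tag)
    return tags


email_tags = classify_column(["a@example.com", "b@example.org", ""])
ssn_tags = classify_column(["123-45-6789", "987-65-4321"])
print(email_tags, ssn_tags)
```

Using a match threshold rather than requiring every value to match keeps a few malformed entries from hiding a sensitive column.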

Anonymize and De-Anonymize Data in Databricks

Privacera enables compliance with privacy and security regulations by anonymizing sensitive data as it is stored in the cloud and accessed using Databricks. The data can be de-anonymized for select users based on policies.

Frequently asked questions

Does Privacera work with Databricks?

Privacera plugins, based on Apache Ranger, can enforce fine-grained access control in Databricks and Apache Spark. Privacera plugins are automatically initiated when a Databricks cluster is started.

Does Privacera access management add any performance overhead?

Privacera differs from solutions that intercept data requests from Apache Spark and access the data on behalf of the service. Instead, Privacera’s lightweight access enforcement points quickly check each request and let it proceed if a corresponding policy grants access.

Is Privacera integrated with the Hive metastore and AWS Glue?

Privacera works with any metadata store used with Databricks, including Hive metastores and AWS Glue. Privacera can also enable tag-based access policies based on data classifications.
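A tag-based policy attaches permissions to a classification rather than to individual tables or columns. The following is a minimal sketch of that idea under stated assumptions: `COLUMN_TAGS`, `TAG_POLICIES`, `is_allowed`, and the group names are hypothetical, and treating untagged columns as open and unknown columns as denied is a choice made for the example, not documented Privacera behavior.

```python
# Hypothetical catalog: column -> classification tags found by scanning.
COLUMN_TAGS = {
    "customers.email": {"PII"},
    "customers.country": set(),
}

# Hypothetical tag-based policies: tag -> groups allowed to read data with that tag.
TAG_POLICIES = {"PII": {"privacy_officers"}}


def is_allowed(user_groups, column):
    """Allow access only if the user holds a permitted group for every tag on the column."""
    tags = COLUMN_TAGS.get(column)
    if tags is None:
        return False  # assumption: columns missing from the catalog are denied
    return all(user_groups & TAG_POLICIES.get(tag, set()) for tag in tags)


print(is_allowed({"analysts"}, "customers.email"))
print(is_allowed({"privacy_officers"}, "customers.email"))
```

Because the policy refers to the `PII` tag, any newly scanned column that receives the same tag is covered without writing a new policy.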

Resources & Latest News

Whitepaper

Security and Privacy for Modern Data Platforms

Learn how to enable comprehensive security, privacy and governance in big data and cloud environments using Privacera.

Get Started Today

Contact us to learn more about Privacera for Databricks and get a FREE risk assessment.