As companies migrate their analytic workloads to shared cloud environments, there is an active debate over how to implement data access controls, specifically whether attribute-based access control (ABAC) offers better security and governance than role-based access control (RBAC).
Because modern data infrastructures are becoming increasingly sophisticated and complex, the effectiveness of IT solutions cannot be based on just one criterion — leaving the ABAC versus RBAC debate misguided for many practitioners.
Which data access control solution is the best?
The answer to this question depends on which approach is most efficient for your business. The solutions currently on the market can only be meaningfully compared by how well they empower administrators to make data widely available in their organizations and comply with industry and privacy regulations without adversely affecting the performance of the data platform.
Organizations’ use cases, existing infrastructure, experience with various cloud environments, and the user learning curve are only a few of the variables that make the RBAC vs. ABAC debate one-dimensional. Companies in the middle of digital transformation projects are less concerned about whether the solution is based on role- or attribute-based controls and more concerned with how quickly they can migrate existing data and access control policies to the cloud.
But let’s start with the current state of the market and a few misconceptions. The majority of companies migrating to the cloud are implementing hybrid architectures. These companies either have an on-premises implementation of a traditional database, like Teradata, DB2, or Oracle, or they have a Hadoop-based data lake they want to migrate to the cloud to take advantage of reduced operational burden, increased agility, and elasticity.
It may not be common knowledge that Apache Ranger is the original ABAC solution for heterogeneous data services; a blog post published by the Apache Software Foundation in 2017 attests to this. Ranger started its journey as an open-source ABAC solution. In addition to empowering data administrators to define access policies based on roles and users, Ranger also offers the flexibility to authorize access based on a combination of subject, action, resource, and environment. Using descriptive attributes such as AD group, Apache Atlas-based tags or classifications, and geo-location, Ranger provides a holistic approach to data governance that encompasses both ABAC and RBAC approaches.
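To make the subject, action, resource, and environment combination concrete, here is a minimal Python sketch of how such a policy check could work. It is not Ranger's API or its policy model; the `Request` and `Policy` classes, the `is_allowed` function, and the group, tag, and country values are all hypothetical names chosen for illustration. The sketch assumes a deny-by-default model in which any matching policy grants access.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One access attempt, described by four kinds of attributes."""
    subject: dict      # who is asking, e.g. {"user": "ana", "groups": {"analysts"}}
    action: str        # what they want to do, e.g. "select"
    resource: dict     # what they want to touch, e.g. {"table": "orders", "tags": {"SALES"}}
    environment: dict  # request context, e.g. {"country": "US"}


@dataclass
class Policy:
    """A hypothetical allow policy; every populated condition must match."""
    name: str
    groups: frozenset                           # subject must be in at least one group
    actions: frozenset                          # permitted actions
    resource_tags: frozenset = frozenset()      # resource must carry all of these tags
    allowed_countries: frozenset = frozenset()  # empty means any location

    def permits(self, req: Request) -> bool:
        if not (self.groups & set(req.subject.get("groups", ()))):
            return False  # role-style check: group membership
        if req.action not in self.actions:
            return False
        if not self.resource_tags <= set(req.resource.get("tags", ())):
            return False  # attribute check: resource classification tags
        if self.allowed_countries and req.environment.get("country") not in self.allowed_countries:
            return False  # attribute check: geo-location of the request
        return True


def is_allowed(policies, req: Request) -> bool:
    """Deny by default; allow if any policy permits the request."""
    return any(p.permits(req) for p in policies)


# Illustrative policies: one group-based, one that adds tag and location conditions.
POLICIES = [
    Policy("analysts-read-sales", frozenset({"analysts"}), frozenset({"select"}),
           resource_tags=frozenset({"SALES"})),
    Policy("privacy-read-pii-us", frozenset({"privacy-team"}), frozenset({"select"}),
           resource_tags=frozenset({"PII"}), allowed_countries=frozenset({"US"})),
]

analyst = {"user": "ana", "groups": {"analysts"}}
officer = {"user": "raj", "groups": {"privacy-team"}}
sales_table = {"table": "orders", "tags": {"SALES"}}
pii_table = {"table": "customers", "tags": {"PII"}}

print(is_allowed(POLICIES, Request(analyst, "select", sales_table, {"country": "DE"})))
print(is_allowed(POLICIES, Request(officer, "select", pii_table, {"country": "DE"})))
```

The first policy behaves like a classic role check, while the second layers resource tags and location on top of group membership. That is one way a single engine can cover both RBAC-style and ABAC-style rules.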
Data access control must integrate easily with existing business processes
As data accumulates, it becomes increasingly difficult to move; this is also true for data access policies. Companies invest significant effort and resources to build unique access control policies over a long period of time. In fact, these access policies are a part of, and reflect, the way they do business. Therefore, it is unreasonable to expect companies migrating to the cloud to drop all of their existing data governance policies and recreate them from scratch in the cloud. On the contrary, when we speak with customers, they routinely ask how to leverage their on-premises data lake access policies for their cloud environments.
Can your choice of access control accelerate secure data sharing?
A more practical way for data administrators to think about this debate is to ask whether the solution they are using helps accelerate access to data for data analysts and data scientists, with proper controls in place.
Among data lakes built on platforms from big data vendors like Cloudera, Hortonworks, and MapR, Apache Ranger has been the predominant mechanism for administering access control. Ranger has been effectively deployed at thousands of companies to define, authorize, and administer access control policies from a single pane across various open-source compute engines, including Apache Spark, Apache Hive, and Apache Kafka.
According to a McKinsey survey, 69% of organizations say that “implementing stringent security guidelines and code review processes can slow developers significantly.” This is especially troubling because one of the primary drivers of cloud migration is the speed and agility of data access and analysis.
To avoid delaying data access for data analysts and scientists, rewriting data access policies from scratch, or retraining users on a new platform, companies should use their existing Ranger policies in the cloud. By doing so, they will enable faster user onboarding; fast, secure access to data for analysis; and, most importantly, faster access to the benefits of digital transformation.
Unlocking business acceleration in a hybrid cloud world, McKinsey Digital, Aug 5th, 2019.