The United States federal government holds some of the largest data stores of any entity in the world. Data is one of its greatest assets and, by the same token, one of its biggest liabilities. Much of this data is extremely sensitive, and the potential for harm is vast if it gets into the wrong hands. Additionally, the public sector is highly regulated and constantly audited. Non-compliance results in costly penalties and potential public security issues.
The Contradictory Mandates
The tension between analyzing this data to fulfill mission-critical objectives and the inherent risks of accessing it creates conflicting objectives across the federal space, and these are typical of the public sector as a whole. On the one hand, data democratization is realized through a centralized data platform: a single place where all users access data and accompanying resources (like compute) for any application. It’s designed to eliminate silos and make it easy for analytics users to pull data securely. Almost every federal agency has one; at the Centers for Medicare & Medicaid Services (CMS) it’s called EDL, and at the Department of Defense (DoD) it’s called Advana.
On the other hand, because the data is so sensitive, many data security and control requirements work against the central platform concept. The Centers for Medicare & Medicaid Services, for example, must protect personally identifiable information (PII) and protected health information (PHI). The DoD holds highly sensitive data. Thus, no matter how much federal entities want to migrate to the cloud for a unified platform, security and compliance mandates conflict directly with doing so.
Balancing these opposing interests so that both are met, with data readily accessible via the cloud and equally secure, requires combining centralized data access governance with decentralized enforcement: policies are created once, yet consistently implemented and enforced in any source or setting.
Resolving the Conflicts
Privacera’s centralized data access governance and privacy framework secures data assets at the data level, which resolves the government’s conflicting data interests in several ways. The first is a flexible architecture designed for cloud services. It resonates with this space because many federal entities are familiar with Apache Ranger, an open source security and authorization solution adopted by Fortune 500 companies, and already have Ranger policies in place. Because of Ranger’s robustness and proven scalability, Privacera chose it as the underlying engine for access control and has extended its capabilities with many out-of-the-box features.
With native integrations for modern cloud data platforms such as Databricks, Snowflake, BigQuery, Azure Synapse, and Redshift, the Privacera architecture is well suited to layering into federal compute platforms. It enables single-pane, centralized policy management with distributed, native enforcement of data policies and access in individual cloud data systems. The distributed policy enforcement swiftly authenticates users, supporting thousands of users simultaneously accessing and querying data at terabyte-to-petabyte scale. This fortifies authentication, access controls, and data governance within individual data sources, operating at the data level instead of the perimeter. This is key to satisfying the dual mandate: a light architectural footprint in each source supports the performance that data democratization needs, while delivering the security required for dependable access controls and regulatory compliance.
Cost Effective Compliance
With other approaches, scaling to meet the government’s expansive repositories and data volumes is cost-prohibitive. Scalability should be affordable enough to deploy across all systems while sustaining performance requirements. Traditional approaches involving customized cross-domain security solutions aren’t scalable because they cost upwards of $10 million; many also result in silos. Privacera is more cost-effective because of its automation, which is founded on building policies once in a centralized solution and automatically applying them to any source. For example, platforms like Databricks, S3, and Snowflake each have their own way of administering security. With Privacera, users can efficiently enforce the same policy in each of them according to its native mechanism, rather than building security separately for each one.
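The build-once, apply-everywhere idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Privacera’s or Apache Ranger’s actual API: the `Policy` class, the `to_*` functions, and every table, bucket, and role name are hypothetical. It shows a single central policy rendered into each platform’s native form, a SQL `GRANT` for Snowflake and Databricks and an IAM-style policy document for S3.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """One centrally defined read-access rule (hypothetical model)."""
    principal: str  # role or group that receives read access
    dataset: str    # logical dataset name


# Hypothetical mapping from a logical dataset to each platform's location.
LOCATIONS = {
    "claims": {
        "snowflake": "HEALTH.PUBLIC.CLAIMS",
        "databricks": "health.public.claims",
        "s3": "arn:aws:s3:::example-health-bucket/claims/*",
    }
}


def to_snowflake(p: Policy) -> str:
    table = LOCATIONS[p.dataset]["snowflake"]
    return f"GRANT SELECT ON TABLE {table} TO ROLE {p.principal.upper()};"


def to_databricks(p: Policy) -> str:
    table = LOCATIONS[p.dataset]["databricks"]
    return f"GRANT SELECT ON TABLE {table} TO `{p.principal}`;"


def to_s3(p: Policy) -> str:
    # IAM-style policy document; attaching it to a role is out of scope here.
    doc = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": LOCATIONS[p.dataset]["s3"],
        }],
    }
    return json.dumps(doc, indent=2)


TRANSLATORS = {"snowflake": to_snowflake, "databricks": to_databricks, "s3": to_s3}


def apply_everywhere(p: Policy) -> dict:
    """Render the single policy into every platform's native form."""
    return {name: render(p) for name, render in TRANSLATORS.items()}


if __name__ == "__main__":
    rendered = apply_everywhere(Policy("analysts", "claims"))
    for platform, statement in rendered.items():
        print(f"-- {platform}\n{statement}\n")
```

The policy is written once as data, and each platform contributes only a small translator. Adding another source means adding one function instead of redefining the policy.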
Additionally, regulatory requirements in the federal space are unparalleled. The sheer number of data systems and regulations means audits, which require reports and justification of actions, are always occurring. Privacera significantly reduces the time and cost of audits by simplifying compliance measures. It creates a log entry every time anyone attempts to access data across its central platform, whether or not access was granted. The reports generated from these logs document the policies that applied to each approval or denial, giving administrators, for instance, real-time visibility into their clusters.
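To make the audit idea concrete, here is a minimal Python sketch of an access check that records every attempt, allowed or denied, together with the policy that decided it. It is an assumption-laden illustration and not Privacera’s log format or API: the `POLICIES` table, the field names, and the in-memory `AUDIT_LOG` are all hypothetical stand-ins.

```python
import json
from datetime import datetime, timezone

# Hypothetical policy table: each policy lets one role read one dataset.
POLICIES = {
    "claims-read": {"role": "analyst", "dataset": "claims"},
}

AUDIT_LOG = []  # in-memory stand-in for a durable audit store


def check_access(user: str, role: str, dataset: str) -> bool:
    """Decide the request and record it, whether it is allowed or denied."""
    matched = next(
        (name for name, rule in POLICIES.items()
         if rule["role"] == role and rule["dataset"] == dataset),
        None,
    )
    allowed = matched is not None
    AUDIT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "dataset": dataset,
        "result": "ALLOWED" if allowed else "DENIED",
        "policy": matched,  # None means no policy granted access
    })
    return allowed


def audit_report() -> str:
    """Serialize the log as JSON lines, one record per access attempt."""
    return "\n".join(json.dumps(entry) for entry in AUDIT_LOG)


if __name__ == "__main__":
    check_access("alice", "analyst", "claims")
    check_access("bob", "intern", "claims")
    print(audit_report())
```

Because denials are logged alongside approvals, an auditor can see both who reached the data and who tried and was stopped, without anyone assembling the evidence by hand.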
Closing Remarks
The public sector’s data security issues are characterized by massive quantities and sources of data that either are already accessible via the cloud or soon will be. To learn more about how Privacera helps our public sector customers overcome the various challenges in their cloud transformation, please read our latest white paper.
Learn more about Privacera here, or contact us to schedule a call to discuss how we can help your organization meet its dual mandate of balancing data democratization with security to maximize business insights while ensuring privacy and compliance.