Wednesday, April 17, 2024

Data Governance In Google Cloud Platform

Google Cloud's approach to data security and compliance is inclusive and ethical, and organizations can rely on it to extend their use of Google Cloud beyond BigQuery. A successful data governance policy lets organizations institute control over their data resources and retain visibility into them, creating a competitive edge over their peers. It also generates business value by instituting a data-driven practice that improves decision-making, risk management, and regulatory compliance.

Data Discovery, Classification And Data Sharing

The ability to find data easily is crucial to an effective data-driven organization. Data governance programs leverage data catalogs to create an enterprise repository of all metadata. These catalogs allow data stewards and data users to add custom metadata and create business glossaries, and they allow data analysts and scientists to search across the organization for data to analyze. Certain data catalogs also let users request access to data from within the catalog; requests can be approved or denied based on policies created by data stewards.

Google Cloud offers a fully managed and scalable Data Catalog to centralize metadata and support data discovery. Data Catalog adheres to the same access controls the user has on the underlying data. Further, Data Catalog is natively integrated into the GCP data fabric: there is no need to manually register new datasets in the catalog, because the same search technology that scours the web auto-indexes newly created data.

In addition, Google partners with major data governance platforms, such as Collibra and Informatica, to provide unified support for your on-prem and multi-cloud data ecosystem.

What Are The Benefits Of Data Governance

Make better, more timely decisions

Users throughout your organization get the data they need to reach and service customers, design and improve products and services, and seize opportunities for new revenues.

Improve cost controls

Data helps you manage resources more effectively. Because you can eliminate data duplication caused by information silos, you don't overbuy, and then have to maintain, expensive hardware.

Enhance regulatory compliance

An increasingly complex regulatory climate has made it even more important for organizations to establish robust data governance practices. You avoid risks associated with noncompliance while proactively anticipating new regulations.

Earn greater trust from customers and suppliers
Manage risk more easily

With strong governance, you can allay concerns about exposure of sensitive data to individuals or systems that lack proper authorization, security breaches from malicious outsiders, or even insiders accessing data they don't have the right to see.

Allow more personnel access to more data

Strong data governance allows more personnel access to more data, with the confidence that these personnel get access to the right data and that this democratization of data does not negatively impact the organization.

Cloud Migration And Data Governance

A data governance approach is a set of processes and procedures that enable organizations to manage their data in a secure and useful way throughout its life cycle, from acquisition to usage to disposal. It is important for the organization to consider the entire data process chain, which includes data acquisition, analysis, insights generation, and deployment.

In fact, this entire process needs to adhere to the highest levels of governance across the organization to be truly effective. An effective data governance program integrates a central body or commission, a defined set of practices, and a plan to carry out those practices. It therefore covers the means by which people, processes, and technology act together to enable auditable compliance with defined and agreed-upon data policies.

Good data governance has several benefits. It can build customer trust and significantly improve the user experience. However, with organizations rapidly moving to a cloud-based architecture, there are new data governance challenges. Because organizations generate and use data at extraordinary rates, the diversity of information and sources requires them to deal with data access, security, control, and governance, along with regulatory compliance.

Data Classification And Management

In an enterprise data warehouse, the data volume, velocity, and variety being ingested can create challenges for data classification and management.

Data classification is the process of categorizing data into types, forms, or categories by using their distinct characteristics. Being able to classify your data effectively is essential for you to apply appropriate governance policies to different types of data.

For example, depending on the content of a business document, you can classify it with a sensitivity level such as unsecured or confidential. Each of these types can then have specific policies for their management. For instance, confidential documents are accessible only to a certain group of users and are retained for 7 years.
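The sensitivity levels and per-level policies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group name `finance-leads`, the `Policy` type, and the helper functions are hypothetical, and only the "confidential, restricted group, 7 years" rule comes from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Policy:
    allowed_groups: frozenset        # empty set means "no group restriction"
    retention_years: Optional[int]   # None means "no fixed retention period"


# Hypothetical policy table keyed by sensitivity level.
POLICIES = {
    "unsecured": Policy(allowed_groups=frozenset(), retention_years=None),
    "confidential": Policy(allowed_groups=frozenset({"finance-leads"}),
                           retention_years=7),
}


def can_read(sensitivity, user_groups):
    """Return True if a user in `user_groups` may read data at this level."""
    policy = POLICIES[sensitivity]
    if not policy.allowed_groups:
        return True
    return bool(policy.allowed_groups & set(user_groups))


def expiry_date(sensitivity, created):
    """Return the date on which the retention period ends, or None."""
    policy = POLICIES[sensitivity]
    if policy.retention_years is None:
        return None
    target_year = created.year + policy.retention_years
    try:
        return created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return created.replace(year=target_year, day=28)
```

Keeping the rules in one table means a new sensitivity level is a data change rather than a code change, which is the main point of classifying data before applying governance policies.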

Within data management, you have several aspects to consider:

  • Managing data change: Controlling the effects when data dimensions change. Although they change infrequently, the modification can have a ripple effect because the data in fact tables might no longer be true for the updated dimensions.
  • Managing reference data: Ensuring that all systems across your organization have an accurate and consistent view of your reference data.
  • Retention and deletion: How to prevent users from modifying or deleting data, and how to expire data automatically.

Data Governance In The Cloud

As cloud adoption accelerates, questions inevitably arise about how it impacts data governance. Enterprises have concerns that:

  • Their data will be secure: Businesses may be concerned about storing data in the public cloud. They are still responsible for controlling data governance on the data for their on-premises systems, but need to know that their cloud provider will protect against its exposure or theft when it is stored in the cloud.
  • The cloud vendor will comply with regulations: Enterprise compliance officers and data stewards responsible for adhering to regulations and standards need to feel confident that their cloud vendor will also adhere to GDPR, CCPA, PCI DSS, HIPAA, and other regulations, and that the vendor will provide them with tools to help them stay compliant when their data is in the cloud.
  • They will have visibility and control: Public cloud providers know that their ability to help with data governance can inspire customer trust and greatly enhance the customer experience. As a result, leading cloud vendors offer tools for data assessment, metadata cataloging, access control management, data quality, and information security as core competencies to companies that use their platforms.

Our Shared Security Partnership

Looker can connect to your organization's database and is designed to leave your data in that database, or as otherwise instructed by you. Because Looker connects to technology that you are responsible for maintaining, security is a shared responsibility between Looker and you. We publish our security posture and best practices publicly. If you use embedded analytics functionality, Looker has developed security best practices you can leverage to help mitigate security concerns.

Application data shared by Looker

While there is no permanent storage of your data in the Looker application, Looker utilizes a number of first- and third-party tools in order to provide and improve the service. Unless stated, all services share data with locations in the United States.

Application services include:

NOTE: We regularly review both our internal services and third-party service providers to ensure that the data we collect is aligned with the services intent, and that the security measures employed meet our high security standards.

Google responsibilities

Your responsibilities

Cloud security

You are responsible for configuring secure access between the Looker application and your database. Google provides multiple recommendations on how to configure this access, including:

Product security

You are also responsible for controlling access and permissions for users of your Looker instance. Google recommends:

Cloud security architecture

Corporate security

Data Masking And Tokenization

While data encryption ensures that data is stored and transmitted in encrypted form, end users can still see sensitive data when they query the database or read a file. Several compliance regulations require de-identifying or tokenizing sensitive data. For example, GDPR recommends data pseudonymization to reduce the risk to data subjects. De-identified data reduces the organization's obligations on data processing and usage. Tokenization, another data obfuscation method, makes it possible to perform data processing tasks, such as verifying credit card transactions, without knowing the real credit card number. Tokenization replaces the original value of the data with a unique token. The difference between tokenization and encryption is that data encrypted using keys can be deciphered using those keys, while tokens are mapped to the original data only in the tokenization server. Without access to the token server, a bad actor who obtains a token cannot recover the original value.
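To make the distinction concrete, here is a minimal in-memory sketch of a token server. It is an illustration only, not how Cloud DLP or any production tokenization service is implemented: the `TokenVault` class and the `tok_` prefix are hypothetical. The point it shows is that a token is random and carries no information about the original value, so recovery requires a lookup in the vault rather than a key.

```python
import secrets


class TokenVault:
    """Toy token server: maps random tokens to original values in memory."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        """Return a stable random token for `value`, creating one if needed."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._token_to_value:   # regenerate on a rare collision
            token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token):
        """Return the original value; only callers with vault access can do this."""
        return self._token_to_value[token]


if __name__ == "__main__":
    vault = TokenVault()
    token = vault.tokenize("4111 1111 1111 1111")
    print(token)                    # random token, different on every run
    print(vault.detokenize(token))  # original value, available only via the vault
```

Because the same value always maps to the same token within one vault, downstream systems can still join and count on the tokenized column without ever seeing the card number.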

Google's Cloud Data Loss Prevention (Cloud DLP) automatically detects, obfuscates, and de-identifies sensitive information in your data using methods like data masking and tokenization. When building data pipelines or migrating data into the cloud, integrate Cloud DLP to automatically detect and de-identify or tokenize sensitive data, so that data scientists and users can build models and reports while minimizing the risk of compliance violations.

Automatically Provision Clusters And Grant Permissions

With the addition of endpoints for both clusters and permissions, the Databricks REST API 2.0 makes it easy to both provision and grant permission to cluster resources for users and groups at any scale. You can use the Clusters API 2.0 to create and configure clusters for your specific use case.

You can then use the Permissions API 2.0 to apply access controls to the cluster.

The following is an example of a configuration that might suit a new analytics project team.

The requirements are:

  • Support the interactive workloads of this team, who are mostly SQL and Python users.
  • Provision a data source in object storage with credentials that give the team access to the data tied to the role.
  • Ensure that users get an equal share of the cluster's resources.
  • Provision larger, memory optimized instance types.
  • Grant permissions to the cluster such that only this new project team has access to it.
  • Tag this cluster to make sure you can properly do chargebacks on any compute costs incurred.
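The requirements above could translate into request bodies along the following lines. This is a sketch under stated assumptions rather than the original example: the team name, instance type, worker counts, tag keys, and permission level are illustrative, the instance profile ARN and runtime version are placeholders you must supply, and the two `spark_conf` keys reflect the legacy high-concurrency cluster settings as best recalled, so verify them against the current Databricks documentation. The code only builds the JSON bodies and does not call the API.

```python
import json


def build_cluster_spec(team, instance_profile_arn, spark_version):
    """Build a body for the Clusters API 2.0 create endpoint (values are assumptions)."""
    return {
        "cluster_name": f"{team}-interactive",
        "spark_version": spark_version,
        # Larger, memory-optimized instance type (assumed AWS example).
        "node_type_id": "r5.2xlarge",
        "autoscale": {"min_workers": 2, "max_workers": 8},
        "autotermination_minutes": 60,
        "spark_conf": {
            # Assumed legacy high-concurrency settings: shared resources,
            # restricted to the languages this team uses.
            "spark.databricks.cluster.profile": "serverless",
            "spark.databricks.repl.allowedLanguages": "sql,python",
        },
        # Credentials for the object-storage data source, tied to a role.
        "aws_attributes": {"instance_profile_arn": instance_profile_arn},
        # Tag used for chargebacks on compute costs.
        "custom_tags": {"team": team},
    }


def build_cluster_permissions(group_name):
    """Build a body for the Permissions API 2.0 cluster endpoint."""
    return {
        "access_control_list": [
            {"group_name": group_name, "permission_level": "CAN_RESTART"}
        ]
    }


if __name__ == "__main__":
    spec = build_cluster_spec(
        team="new-project-team",
        instance_profile_arn="<instance-profile-arn>",   # placeholder
        spark_version="<runtime-version>",               # placeholder
    )
    print(json.dumps(spec, indent=2))
    print(json.dumps(build_cluster_permissions("new-project-team"), indent=2))
```

Building the bodies in code keeps the tag and the group name derived from one `team` value, so the chargeback tag and the access grant cannot drift apart.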

Where Does Royal Cyber Step In

Cloud governance is the primary role of the cloud service partner. As partners of , we have the expertise to leverage the tools necessary to ensure that cloud governance is a smooth and dynamic process. Most importantly, our experts have the essential business perspective. Cloud governance is the intersection of IT departments, cybersecurity concerns, and business concerns; therefore, it is only natural for companies to use the services of a trusted third-party organization that helps them with this aspect of cloud computing. Some of the cloud governance best practices our team engages in include:

  • Preventing cost increases through constant optimization of cloud services.
  • Consulting with the organization's team to ensure a planned, automated cloud infrastructure pipeline process.
  • Providing 24×7 support and constant monitoring of the cloud environment, which eases governance and allows a focus on core business development.
  • Developing governance plans continuously, in line with the latest innovations and compliance requirements in business.

For more information, contact us at or visit us at www.royalcyber.com

Fine Grained Access Control

BigQuery supports fine-grained access control for your data in Google Cloud. BigQuery access control policies can limit access at the column level and at the row level. Combining column-level and row-level access control with DLP allows you to create datasets that have a safe version of the data and a clear version of the data. This promotes data democratization: the CDO can trust the guardrails of Google Cloud to allow access correctly according to the user identity, accompanied by audit logs to ensure a system of record. Data can be shared across the organization to run analysis and build machine learning models while ensuring that sensitive data remains inaccessible to unauthorized users.
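As a rough illustration of row-level control, the helper below assembles a BigQuery `CREATE ROW ACCESS POLICY` statement. The helper itself, the policy, dataset, table, and column names, and the group address are all hypothetical; the sketch only builds the DDL text and does not run it against BigQuery.

```python
import re

# Conservative check for policy, dataset, table, and column names.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def row_access_policy_ddl(policy, dataset, table, grantee, column, value):
    """Return DDL that lets `grantee` see only rows where `column` equals `value`."""
    for name in (policy, dataset, table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    for literal in (grantee, value):
        if '"' in literal or "\\" in literal:
            raise ValueError(f"unsafe literal: {literal!r}")
    return (
        f"CREATE ROW ACCESS POLICY {policy}\n"
        f"ON {dataset}.{table}\n"
        f'GRANT TO ("{grantee}")\n'
        f'FILTER USING ({column} = "{value}")'
    )


if __name__ == "__main__":
    print(row_access_policy_ddl(
        policy="us_rows_only",
        dataset="sales",
        table="orders",
        grantee="group:sales-us@example.com",
        column="region",
        value="US",
    ))
```

With a policy like this in place, members of the named group who query the table see only the matching rows, while the table itself is stored once.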

How Google Cloud Keeps Cloud Governance Proactive

Public cloud service providers also give access to resources that ease the processes of cloud governance. For instance, Google Cloud Platform has a product known as Policy Intelligence. With this product, businesses can access policy controls that reduce security risks and data loss without increasing workloads. For enforcing role-based access, the IAM Recommender uses machine learning to provide recommendations for resource accessibility, thus reducing the attack surface. When access to a resource is missing, Policy Troubleshooter helps admins quickly diagnose the root cause and find a way to remediate it if needed. In addition, with Policy Analyzer, admins have visibility into access data and a quick means to analyze various issues around access to services. At the same time, Policy Simulator helps admins and personnel understand the consequences of changing access for specific users.

Infrastructure as code (IaC) is key to maintaining consistency and automating configuration processes. IaC ensures that administrators can focus on aspects of the cloud governance program that help meet business objectives without excessively focusing on the setup of cloud assets. Furthermore, with the help of tools such as Recommender for sizing VM instances, one can automate the pipeline and keep track of changes within the organization. With these tools, a business can stay ahead of the curve with ease.

Implement Table Access Control

You can enable table access control on Databricks to programmatically grant, deny, and revoke access to your data from the Spark SQL API. You can control access to securable objects like databases, tables, views and functions. Consider a scenario where your company has a database to store financial data. You might want your analysts to create financial reports using that data. However, there might be sensitive information in another table in the database that analysts should not access. You can provide the user or group the privileges required to read data from one table, but deny all privileges to access the second table.

In the following illustration, Alice is an admin who owns the shared_data and private_data tables in the Finance database. Alice then provides Oscar, an analyst, with the privileges required to read from shared_data but denies all privileges to private_data.

Alice grants SELECT privileges to Oscar to read from shared_data:

Alice denies all privileges to Oscar to access private_data:
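The two statements can be sketched as follows. The helpers are hypothetical and only build the SQL text, following the `GRANT` and `DENY` syntax of Databricks table access control as best recalled; the principal `oscar@example.com` is a made-up address, and in a notebook each string would be passed to `spark.sql(...)` by a user with the right to manage the tables.

```python
def grant_select(table, principal):
    """SQL that lets `principal` read from `table`."""
    return f"GRANT SELECT ON TABLE {table} TO `{principal}`"


def deny_all(table, principal):
    """SQL that blocks `principal` from every privilege on `table`."""
    return f"DENY ALL PRIVILEGES ON TABLE {table} TO `{principal}`"


if __name__ == "__main__":
    oscar = "oscar@example.com"  # hypothetical principal
    print(grant_select("finance.shared_data", oscar))
    print(deny_all("finance.private_data", oscar))
```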

You can take this one step further by defining fine-grained access controls to a subset of a table or by setting privileges on derived views of a table.

Key Considerations For Building A Google Cloud Data Governance Practice

As more organizations move their data resources to Google Cloud, the demand for auditable systems that assure data quality and protection will continue to grow. To meet this demand, organizations should frame their data governance practices around certain components, including:

  • A framework that empowers people to define, agree on, and implement data policies
  • Effective systems for control, oversight, and stewardship over every data resource across on-premises systems and storage in the cloud
  • The right tools and technologies for managing data policy compliance

Efficient ACL'd Search In BigQuery

As mentioned, Data Catalog is now integrated into the main BigQuery UI. This means you can now access the powerful search capabilities of Data Catalog in context, while working within the BigQuery UI. Enable the search/autocomplete preview to allow Data Catalog search into the resources panel. When a partial table name is specified, Data Catalog will helpfully show a list of matches, without the need to explicitly run a search and wait for results.

Data Catalog search works with the resource permissions the user is assigned, and can surface tables shared explicitly with the user even if the containing dataset is not shared with that user. This new method of surfacing search results interactively should streamline work for data analysts, and in addition requires less compute compared to listing out all the tables in a project or dataset.

What Is Data Governance

Data governance is a principled approach to managing data during its life cycle, from acquisition to use to disposal.

Every organization needs data governance. As businesses throughout all industries proceed on their digital-transformation journeys, data has quickly become the most valuable asset they possess.

Senior managers need accurate and timely data to make strategic business decisions. Marketing and sales professionals need trustworthy data to understand what customers want. Procurement and supply-chain-management personnel need accurate data to keep inventories stocked and to minimize manufacturing costs. Compliance officers need to prove that data is being handled according to both internal and external mandates. And so on.

Achieving Data Assurance With Google Cloud Platform

Moving workloads to Google Cloud Platform requires adherence to the structure that controls all cloud resources and to the respective strategies for each product. Resources in Google Cloud are generally organized in a hierarchy. Because the dynamics of data management change in many fundamental ways when a business creates and stores more information in the cloud, risk management and privacy must be considered.

Accordingly, the data access layer in Google Cloud Platform is set up based on data sensitivity and privacy. The approach also provides a well-defined change management process and communication to all stakeholders to ease a large change. Data quality expectations, systems, and performance are recorded, which assists the data validation and monitoring process.

As adoption of Google Cloud continues to grow rapidly, questions about the risks of handling data in the cloud also emerge, covering the protection and security of information, regulations, visibility, and management. These risk factors emphasize the need for better data assessment, metadata classification, data quality, and information security as fundamental data governance components.

Figure 1: Sample illustration of HCL Data Governance framework on Google Cloud
