Security Principles of Google Cloud Platform

While studying new material in my own time I like to take notes, both to memorize things better and to have neat reference material for the future. I often end up polishing my notes on a specific subject and releasing them to the infosec community. This piece dates from last year, when I was intensely studying the security concepts of Google Cloud Platform.

I’m NOT a DevOps/GCP expert by any means – just wanted to share something to add a building block to our community knowledge base and make it easier for others to learn the ropes of cloud security engineering.

Table of Contents includes:

Resources hierarchies and policies
Cloud IAM Overview
IAM roles
Service Accounts
Cloud Identity
Cloud IAM Best Practices
Network Security
VPC Network details
Firewall rules
Load Balancers
VPC Best Practices
Cloud Interconnect and Cloud VPN
Cloud DNS and DNSSEC
Encryption
Cloud Key Management Service (KMS)
Cloud Identity-Aware Proxy
Data Loss Prevention
Cloud Security Command Center
Forseti
DDOS Mitigation
Cloud Security Scanner
Compute Engine Best Practices
Google Kubernetes Engine(GKE) Security
Secrets Management
Cloud Storage and Storage Types
Cloud Storage Permissions and Access Control Lists
Data Retention Policies using Bucket Lock
BigQuery Security
Stackdriver
Cloud Responsibility Model

Resources hierarchies and policies:

IAM policies are inherited from the top down. Policies are additive, so access granted by a permissive parent policy cannot be taken away by a more restrictive child policy
1. Super Admin User best practices
  1. It’s the user set up when you first create a Google Cloud account, and it has full access to the Organisation
  2. It’s recommended to link it to a @gmail.com or another email account that’s not your GSuite user / Cloud Identity user
  3. Enable 2FA
  4. Don’t use this user for daily activities, instead create an ‘Organisation admin’ group for day to day administrative activities on the Organisation – But keep the super admin user outside of this group.
  5. Discourage usage of this account by:
    1. Enabling 2FA with a physical device
    2. Not sharing the password/credentials
    3. Setting up Stackdriver alerts that are sent to a group of people whenever the super admin user is used, so that nobody wants to be the cause of those alerts
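
To make the inheritance rule above concrete, here is a minimal Python sketch. It is not the real Cloud IAM evaluation engine: it assumes a simplified model in which each node (organisation, folder, project) holds a mapping of member to roles, and the effective access on a resource is the union of its own bindings and those of all its ancestors. The node names and members are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Node:
    """A resource hierarchy node: organisation, folder or project."""
    name: str
    parent: Optional["Node"] = None
    # member -> roles granted directly on this node
    bindings: Dict[str, Set[str]] = field(default_factory=dict)


def effective_roles(node: Node, member: str) -> Set[str]:
    """Union of the roles granted on the node and on every ancestor."""
    roles: Set[str] = set()
    current: Optional[Node] = node
    while current is not None:
        roles |= current.bindings.get(member, set())
        current = current.parent
    return roles


# Hypothetical hierarchy: organisation -> folder -> project
org = Node("example-org", bindings={"alice@example.com": {"roles/editor"}})
folder = Node("team-folder", parent=org)
project = Node("team-project", parent=folder,
               bindings={"alice@example.com": {"roles/viewer"}})

# The Viewer binding on the project does not take away the Editor role
# inherited from the organisation.
print(sorted(effective_roles(project, "alice@example.com")))
```

The point of the sketch is that a binding lower in the tree can only add roles, which is why broad roles granted at the Organisation level deserve the most scrutiny.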

Cloud IAM Overview:

Defines who can do what on which resources
Lets you define granular access to specific GCP resources, so you can follow the principle of least privilege and prevent unwanted access to other resources

IAM roles:

Primitive roles – very generic roles offering three options: Owner, Editor and Viewer. They lack granularity, which is why their use is discouraged. You can still combine them with other roles, e.g. make it easier for yourself by giving most users the Viewer role and then defining granular custom roles for higher levels of access to specific types of resources
Predefined roles – granular roles defined and maintained by Google, that allow you to truly follow the principle of least privilege
Custom roles – you can define your own roles with as limited a set of permissions as you wish. Predefined roles are good and a common practice, but they often bundle access to multiple APIs, while in a custom role you can grant as little as access to a single API.
Beware the allUsers group, which grants access to your resources to all users including unauthenticated ones.
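
A quick way to act on the allUsers warning is to scan exported policies for public members. The sketch below assumes a policy in the usual JSON shape of a list of bindings, each with a role and a list of members. The sample policy is hypothetical, and the check for allAuthenticatedUsers (any signed-in Google account) is my own addition to the note above.

```python
from typing import Dict, List

PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def find_public_bindings(policy: Dict) -> List[str]:
    """Return the roles in a policy that are granted to public members."""
    public_roles = []
    for binding in policy.get("bindings", []):
        if PUBLIC_MEMBERS & set(binding.get("members", [])):
            public_roles.append(binding["role"])
    return public_roles


# Hypothetical policy used only for illustration.
policy = {
    "bindings": [
        {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
        {"role": "roles/storage.admin", "members": ["group:admins@example.com"]},
    ]
}

print(find_public_bindings(policy))
```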

Service Accounts:

Service accounts are both identities and resources: another user can be bound to the Service Account User role (roles/iam.serviceAccountUser) on a service account and then access the final resource with the roles of that service account
Service accounts authenticate with keys, not passwords
When you SSH into the instance, you’re actually using the service account bound to the instance rather than using your Cloud Identity user
Google default service accounts define their permissions through access scopes, which is a legacy way of setting up permissions for service accounts. It’s recommended to at least customize the API access scopes to get more granular access to specific APIs instead of using the defaults
Create custom service accounts and define custom IAM policies with the most granular roles, so you can actually follow the principle of least privilege
If you’re going to use the service account in your apps/code outside of GCP, generate a user-managed key pair for it. Otherwise just let Google generate and manage the keys for you behind the scenes

Cloud Identity:

You can deploy SSO through a 3rd party IdP
You can synchronize with your AD or LDAP using Google Cloud Directory Sync (GCDS)
You can deploy a variety of 2FA options
You can utilize Mobile Device Management, enforce policies for personal and corporate devices, define a whitelist of approved apps and set requirements for company-managed apps

Cloud IAM Best Practices:

Grant roles at smallest scope necessary.
While using Service Accounts treat each app component as a separate trust boundary.
Create a separate service account for each service
Restrict service account access and who can create/manage service accounts
Beware the Owner role which has access to all settings in GCP, including billing
Rotate user-managed service account keys
Name service keys to reflect use and permissions
Use Cloud Audit logs to regularly audit IAM policy changes
Audit who can edit IAM policies on projects
Export audit logs to GCS for long-term retention
Restrict log access with logging roles
As a rule of thumb grant roles to a Google group instead of individual users

Network Security:

VPC Network details:

When you create a default VPC, a set of default FW rules is created. Make sure to review those to confirm you’re exposing only the ports you truly need, as opposed to leaving e.g. RDP and SSH wide open to the Internet, which is what the default rules for VPCs in GCP do. Whatever you do network-wise, reducing the attack surface comes before permissive access for the sake of convenience
VPC Network Peering allows private connections across two VPCs regardless of whether they’re in the same project/organisation or not. It allows you to connect multiple networks without making the traffic traverse the public Internet, while remaining in full independent control over FW rules for each subnet
You can connect your GCP VPC network with on-premises through Google VPN or Interconnect
Shared VPC allows you to connect resources from multiple projects to a common VPC network, to communicate securely within internal Google network. It allows you to enable networking while keeping the administration and billing management separate across different departments
If you have multiple VPCs in one project, you can’t set up an IAM role that limits a user’s access to vpc-1 and blocks their access to vpc-2 in the same project. VPCs are meant to separate resources, not user access.

Firewall rules:

Enable you to allow/deny traffic to and from your VM based on your configuration
Defined at the VPC level but enforced at the instance level
Rules can be set to be enforced between instances and other networks as well as between instances on the same network
By default, the implied rules deny all ingress traffic and allow all egress traffic
Firewall rules inner workings:
1. The lowest priority number has the highest priority
2. You need to define whether the rule applies to ingress or egress traffic
3. Every FW rule must have a target: instances, network tags or service accounts
4. Define the source (ingress) or destination (egress) in the rule
5. You can specify the protocol and port
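
The evaluation order described above can be sketched in a few lines of Python. This is a simplified model and not GCP's implementation: targets (tags and service accounts) are left out, each rule carries a single CIDR range and port, and the `Rule` class and its field names are made up for the example. It does reflect two documented behaviours: the lowest priority number wins, and when two matching rules share a priority, deny takes precedence over allow.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    priority: int          # lower number = higher priority
    direction: str         # "INGRESS" or "EGRESS"
    action: str            # "allow" or "deny"
    cidr: str              # source range (ingress) or destination range (egress)
    protocol: str          # e.g. "tcp"
    port: Optional[int]    # None means any port


def evaluate(rules: List[Rule], direction: str, peer_ip: str,
             protocol: str, port: int) -> str:
    """Return "allow" or "deny" for a single connection attempt."""
    addr = ipaddress.ip_address(peer_ip)
    matching = [
        r for r in rules
        if r.direction == direction
        and r.protocol == protocol
        and (r.port is None or r.port == port)
        and addr in ipaddress.ip_network(r.cidr)
    ]
    if not matching:
        # Implied rules: deny all ingress, allow all egress.
        return "deny" if direction == "INGRESS" else "allow"
    # Lowest priority number wins; on a tie, deny beats allow.
    best = min(matching, key=lambda r: (r.priority, r.action != "deny"))
    return best.action


rules = [
    Rule(1000, "INGRESS", "allow", "0.0.0.0/0", "tcp", 22),
    Rule(900, "INGRESS", "deny", "203.0.113.0/24", "tcp", 22),
]

print(evaluate(rules, "INGRESS", "203.0.113.7", "tcp", 22))     # deny
print(evaluate(rules, "INGRESS", "198.51.100.4", "tcp", 22))    # allow
print(evaluate(rules, "INGRESS", "198.51.100.4", "tcp", 3389))  # deny (implied)
```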
Network tags:
1. Using Network Tags for Compute Engine instances is a good idea. Use meaningful text attributes to name your rules, e.g. apache-http-plaintext for a rule which opens port 80
2. They allow you to apply FW rules and routes to an individual instance as well as to a set of instances
Private Google Access:
1. You can enable it at the subnet level to allow instances with only internal IPs to reach certain APIs and services within GCP
2. It doesn’t affect external IPs

Load Balancers:

In GCP you can distribute load among instances in single or multiple regions
A load balancer sits in front of your instances using an IP frontend and intelligently relays traffic to multiple backend targets
HTTPS Load Balancer:
1. Layer7 – cross region and external
2. Supports HTTPS for encryption in transit
3. Traffic can be distributed by location or content
4. Forwarding rules are defined to distribute defined targets to target pool of instance groups
5. URL maps redirect requests based on defined rules
6. You can have Google manage your SSL certificates or manage your own
SSL Proxy Load Balancer
1. Network layer
2. Support for TCP with SSL offload(non-HTTPs traffic)
3. Traffic is distributed by location
4. Client SSL sessions are terminated at the load balancer
5. End-to-end encryption is supported by configuring backend services to accept traffic over SSL
6. Can be used for services such as Secure WebSockets, IMAP over SSL
7. Cloud SSL is used for non-HTTP(S) traffic
TCP Proxy Load Balancer
1. Network Layer, Cross-Region External
2. Intended for non-HTTP traffic
3. Intelligent routing that routes to locations that have capacity
4. Support many common ports
5. Is able to forward traffic as TCP or SSL
Network Load Balancer
1. Network Layer LB, region-external
2. Supports either TCP or UDP, can’t do both
3. Supports UDP, TCP, and SSL LB on ports which aren’t supported by the TCP proxy and SSL Proxy in GCP
4. SSL traffic is decrypted by backends and not the load balancer itself
5. Distributes traffic depending on the protocols, scheme and scope
6. No TLS offloading or proxying
7. Forwarding rules in place to distribute defined targets to instance groups – applies for TCP and UDP only as other protocols use target instances
8. Enforces self-managed SSL certificates

VPC Best Practices:

Use internal IP and Private Google access when possible
Start with a single VPC for resources that have common requirements
Create a VPC for each team, connected to a shared services VPC to maintain granular level control for each VPC
Isolate sensitive data in its own VPC, e.g. for HIPAA/PCI compliance
Consider using VPC Flow Logs for network monitoring and forensics

Cloud Interconnect and Cloud VPN

Cloud VPN:
1. Connects on-premises network to VPC or two VPCs in GCP
2. IPSec tunnel over the public internet
3. Traffic is encrypted by one gateway and decrypted by the other
4. Site-to-site VPN only; it doesn’t support client-to-site
5. Supports up to 3 Gbps per tunnel, with 8 tunnels max
6. Supports both static and dynamic routing
7. Supports IKEv1 or IKEv2 using a shared secret
Cloud Interconnect:
1. Physical link between Google and on-premise networks
2. Doesn’t traverse the public internet
3. Located in a colocation facility owned by Google or a Google partner, with speeds from 50 Mbps to 200 Gbps
When to use which:
1. Interconnect should be used to:
  1. Prevent traffic from going through public internet
  2. to extend your VPC network
  3. when you need low latency and high-speed connection
  4. enables private google access for on-premises hosts
2. VPN should be used when:
  1. Public internet access is needed
  2. peering location isn’t available
  3. you have budget constraints as this option is much cheaper than Interconnect
  4. You don’t need very high speed and low latency

Cloud DNS and DNSSEC

Each domain has its own zone
It’s built around projects; each project can have managed zones, which are either:
1. Public – publicly facing
2. Private – visible only within specified VPC networks
DNSSEC is built into Cloud DNS. To enable it, turn on DNSSEC for the managed zone in the Google console and then add a DS record for the domain at your registrar

Encryption:

Encryption at rest:
1. Encryption by default –
  1. Each data chunk has a separate encryption key(DEK)
  2. Backups are encrypted using a separate DEK
2. Cloud Key Management Service – when you choose to manage your own encryption key for sensitive data
  1. The DEK is additionally encrypted with a key encryption key (KEK), so the envelope encryption process is followed – KEKs are not exportable from KMS
  2. Provides an audit trail
  3. Allows you to set up an ACL for each key – a policy per key
  4. Every use of a key is authenticated and tracked
  5. Keys are automatically rotated every 90 days
3. Customer-Supplied Encryption Keys, with HSM
  1. to match your on-prem encryption setup and achieve even more granular control over encryption keys
Encryption in transit:
1. Google maintains strict security measures to protect the physical boundaries of the network
2. Google Front End
  1. Globally distributed system that Google Cloud services accept requests from, with presence all around the globe
  2. Includes features such as:
    1. terminating traffic for incoming HTTP, HTTPS, TCP, TLS proxy traffic
    2. Provides DDOS attack prevention
    3. routing and load balancing traffic to Google Cloud services
3. Types of routing requests within the GCP infrastructure – GCP encrypts and authenticates all data in transit at one or more network layers when data moves outside the physical boundaries controlled by Google
  1. User <-> Google Front End Encryption
  2. User <-> Customer application hosted in GCP
  3. VM <-> VM
  4. VM <-> Google Cloud service
  5. Google Cloud service <-> Google Cloud service
4. By default traffic encryption is performed at the network layer with AES-128
  1. Session keys are established on hosts and protected by ALTS
  2. Security tokens are used for authentication and generated for every flow, consisting of token key and host secret
  3. It protects host from spoofing packets on the network
  4. Google sets up IPSec Tunnels for communication between two networks. It’s encrypted by one VPN gateway and decrypted by the VPN gateway on the other end, with IKE v1(AES-128) and IKE v2(AES-256) supported
  5. Any data sent to GFE is encrypted in transit using TLS, including the API interactions
  6. GFE employs BoringSSL for TLS, which is a fork of OpenSSL. It’s configured to automatically negotiate the highest version of the protocol
  7. Google Certificate Authority enables identity verification in TLS through the use of a certificate. The certificate holds the DNS hostname of the server and its public key
  8. Root key migration and key rotation:
    1. Google is responsible for the rotation of keys and certificates
    2. TLS certificates are rotated every 2 weeks with lifetime of 3 months
    3. Keys are rotated daily with lifetime of 3 days
  9. Application Layer Transport Security(ALTS)
    1. Layer 7 Traffic
    2. Mutual authentication and transport encryption system developed by Google
    3. used for securing Remote Procedure Call(RPC) communications within Google’s infrastructure
    4. Identities are bound to entities (user, machine, service) instead of to a specific server name or host
    5. relies on both the handshake protocol and the record protocol
    6. governs how sessions are established, authenticated, encrypted and resumed
    7. GFE <-> service | service <-> service

Cloud Key Management Service (KMS):

KMS is a service that lets you manage cryptographic keys for every service within GCP
Generate, use, rotate, destroy symmetric encryption keys
Automatic or at-will key rotation
Asymmetric and symmetric key support
Used to encrypt all data on Google Cloud
Integrated with Cloud IAM and Cloud Audit Logs
Every use of a key is authenticated and logged
Permissions are handled by ACLs on a per-key basis
Can be used with Cloud HSM
DEKs are encrypted with a KEK
This process is known as envelope encryption, where you encrypt one key with another key
Central repository for storing KEKs
KEKs are not exportable from KMS
Automatically rotates KEKs at regular intervals
The standard rotation period is 90 days
KMS belongs to the project, and the best practice is to actually run KMS in a separate project
You can choose the location where KMS keys are stored and receive requests
You can group keys for your purposes in a key ring, and the keys in a key ring inherit the permissions set on it
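
The envelope encryption idea (a DEK encrypts the data, a KEK encrypts the DEK) can be shown with a small self-contained Python sketch. Big caveat: the cipher here is a toy keystream built from HMAC-SHA256 so that the example runs on the standard library alone. It is for illustration only and must not be used to protect real data. A real setup would use an authenticated cipher such as AES-GCM for the data and ask Cloud KMS to wrap the DEK. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (HMAC-SHA256 in counter mode). Illustration only."""
    out = bytearray()
    for block_start in range(0, len(data), 32):
        counter = (block_start // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[block_start:block_start + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


def envelope_encrypt(kek: bytes, plaintext: bytes) -> dict:
    """Encrypt data with a fresh DEK, then wrap the DEK with the KEK."""
    dek = secrets.token_bytes(32)
    data_nonce = secrets.token_bytes(16)
    wrap_nonce = secrets.token_bytes(16)
    return {
        "ciphertext": _keystream_xor(dek, data_nonce, plaintext),
        "data_nonce": data_nonce,
        "wrapped_dek": _keystream_xor(kek, wrap_nonce, dek),
        "wrap_nonce": wrap_nonce,
    }


def envelope_decrypt(kek: bytes, envelope: dict) -> bytes:
    """Unwrap the DEK with the KEK, then decrypt the data with the DEK."""
    dek = _keystream_xor(kek, envelope["wrap_nonce"], envelope["wrapped_dek"])
    return _keystream_xor(dek, envelope["data_nonce"], envelope["ciphertext"])


# In Cloud KMS the KEK would stay inside the service; here it is a local value.
kek = secrets.token_bytes(32)
envelope = envelope_encrypt(kek, b"customer record")
print(envelope_decrypt(kek, envelope) == b"customer record")
```

What gets stored next to the data is the ciphertext plus the wrapped DEK. Rotating the KEK then only requires re-wrapping the small DEKs, not re-encrypting all the data.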
Separation of duties:
1. Ensuring that one individual does not have all necessary permissions to be able to complete a malicious action
2. users normally shouldn’t have access to decryption keys
3. helps prevent security or privacy incidents and programmatic errors
4. Move KMS to its own project
Secrets Management:
1. Cloud KMS doesn’t directly store secrets
2. Encrypts secrets that you store elsewhere
3. Use the default encryption built into Cloud Storage buckets
4. use application layer encryption using a key in Cloud KMS

Cloud Identity-Aware Proxy:

Establishes a central authorization layer for applications accessed by HTTPS, enabling you to use application-level access controls instead of relying on network-level firewalls.
Controls HTTPS access to your applications and VMs on GCP
Central authorization layer for application-level access control
Enforces access control policies for applications and resources
Allows employees to work from untrusted networks without the use of VPN
The flow of IAP is as follows:
1. User hits Cloud IAP Proxy
2. Cloud IAP enabled app or backend service validates credentials and performs user authentication
3. Then it checks if user has been authorized access to the resource
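
The flow above boils down to "authenticate first, then authorize". Here is a minimal sketch of that ordering in Python. It does not use the IAP API: the session table, the access list and the status codes returned are all hypothetical stand-ins for Google sign-in and IAM.

```python
from typing import Dict, Optional, Set

# Hypothetical data: session tokens mapped to identities, and a per-resource
# access list. In the real service these come from Google sign-in and IAM.
SESSIONS: Dict[str, str] = {"token-abc": "alice@example.com"}
ACCESS: Dict[str, Set[str]] = {"/admin": {"alice@example.com"}}


def handle_request(token: Optional[str], resource: str) -> int:
    """Return an HTTP-like status code for a proxied request."""
    identity = SESSIONS.get(token) if token else None
    if identity is None:
        return 401  # authentication failed: the user has to sign in
    if identity not in ACCESS.get(resource, set()):
        return 403  # authenticated, but not authorized for this resource
    return 200      # request is forwarded to the backend


print(handle_request(None, "/admin"))          # 401
print(handle_request("token-abc", "/admin"))   # 200
print(handle_request("token-abc", "/other"))   # 403
```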
Allows you to access web apps and infrastructure from any device without VPN
it’s built into GCP infrastructure and GSuite
It’s integrated with Cloud Identity and Cloud Armor
Supports both cloud and on-premise
Supports IAP TCP Forwarding:
1. Control who can access administrative services such as SSH/RDP over the public internet; putting them behind Cloud IAP protects them from being exposed to the internet
2. Require user to pass authentication and authorization checks before they gain access to the target resource
IAP Best Practices
1. Don’t use a third-party CDN, to avoid insecure caching
2. To secure your app, use signed headers
3. Ensure that all requests to CE or GKE are routed through the load balancer
4. Configure source traffic to be routed through GFE whenever possible

Data Loss Prevention:

Allows you to manage and redact sensitive data such as credit card numbers, names, phone numbers, credentials etc
Will alert you when sensitive data is discovered, with information on the likelihood that each finding is legitimate
Results can be imported into BigQuery (BQ) for analysis
There are 90+ predefined detectors but you can also define custom ones
Can work on text files as well as images
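
To give a feel for what a detector does, here is a small Python sketch of a custom credit card detector with redaction. It is not the Cloud DLP API, just a regular expression combined with the Luhn checksum, which plays the role of the "likelihood" signal mentioned above. The pattern and the placeholder text are my own choices.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Replace Luhn-valid card numbers with a placeholder."""
    def _replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[CREDIT_CARD_NUMBER]" if luhn_valid(digits) else match.group()
    return CARD_PATTERN.sub(_replace, text)


print(redact_cards("Paid with 4111 1111 1111 1111, order 1234567890123456"))
# Paid with [CREDIT_CARD_NUMBER], order 1234567890123456
```

The 16-digit order number is left alone because it fails the checksum, which is the kind of false positive a plain regex would produce.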

Cloud Security Command Center:

Single pane of glass dashboard that allows you to gather data, and identify threats to easily act on them.
It allows you to:
1. View and monitor inventory of cloud assets
2. scan for sensitive data
3. detect vulnerabilities and anomalous behavior
4. review access rights
5. inspect your current and past asset states
6. gain insights on your resources, allowing you to understand your attack surface
7. use it natively in Google Cloud and integrate it with many 3rd party tools
General modules:
1. Asset Discovery and Inventory
  1. Allows you to see all assets in the Organization, such as projects, asset types(new/current/changed), change types and IAM policies
2. Sensitive Data Identification
  1. Integrates with Cloud DLP solution to provide you all the findings in the SCC
3. Application vulnerability detection:
  1. Integration with Cloud Security Scanner, enabling you to ship CSS findings to the SCC dashboard
4. Access control monitoring
  1. ensure the right access control policies are in place
  2. alerts when policies are misconfigured or changed
  3. natively integrates with Forseti
5. Anomaly Detection from Google
  1. Identify Threats with built-in anomaly detection
    1. anomalies include botnets, crypto mining, generally suspicious network traffic, outbound DDOS traffic, unexpected reboots etc
6. Third party integrations
  1. Allows you to integrate 3rd party tools with SCC, providing you a single pane of glass dashboard to manage security risks and threats of your GCP environment
7. Allows you to setup real time notifications via cloud pub/sub notification integration

Forseti:

Collection of community driven open source security tools, that allow you to pick and choose any or multiple modules independently of each other.
Designed for security at scale
Allows you to create rule based policies and codify the process
Ensures that the security of your environment is governed by consistent set of predefined rules
You can monitor the resources of your choosing and have Forseti notify you when anything changes
You can use enforcer mode, so that when someone makes a change that’s against the policies you’ve setup in Forseti, it’ll automatically revert the changes and enforce your best practices
Snapshots of the inventory are saved into Cloud SQL

DDOS Mitigation:

There are mechanisms in place to protect the GCP cloud
CSP responsibility:
1. Ensure that no single service can overwhelm the shared infrastructure
2. Provide isolation among customers using the shared infrastructure
3. Deployed detection systems
4. Implemented barriers
5. Absorbing attacks by up-scaling
Customer responsibility:
1. Reduce the attack surface:
  1. Isolate and secure your network subnets, FW rules, tags and IAM
  2. Use FW rules and protocol forwarding
  3. Anti-spoofing protection is provided for the private network by default
  4. Automatic isolation between virtual networks
2. Isolate Internal traffic from the external world:
  1. Deploy instances without public IPs unless necessary
  2. Setup a NAT gateway or SSH bastion host to limit the number of instances exposed to the internet
  3. Deploy internal LB on internal client instances that access internally deployed services to avoid exposure to the external world
3. Enable proxy-based LB
  1. HTTPS or SSL proxy load balancing allows Google infrastructure to mitigate many L4 and below attacks, such as SYN floods, IP fragment floods, port exhaustion etc
  2. Disperse attack across instances around the globe with HTTP/S load balancing to instances in multiple regions
4. Scale to absorb the attacks
  1. Protection by GFE Infrastructure
    1. Global load balancing
    2. Scales to absorb certain types of attacks such as SYN floods
  2. Anycast-based LB
    1. HTTP/S LB and SSL proxy enable a single anycast IP to front-end
  3. Autoscaling
5. Protection with CDN Offloading, where Google Cloud CDN acts as a proxy
6. Deploy 3rd party DDOS protection solutions that integrate with GCP out of the box
7. App Engine deployment:
  1. Fully multi-tenant system
  2. Safeguards in place
  3. Sits behind the GFE
  4. Specify a set of IPs / networks
8. Google Cloud Storage:
  1. Use signed URLs to access Google Cloud Storage
9. API Rate Limiting:
  1. Define the number of allowed requests to Compute Engine API
  2. API rate limits apply on per-project basis
  3. Projects are limited to an API rate limit of 20 requests/second
10. Resource Quotas:
  1. Quotas help prevent unexpected spikes in usage
11. Cloud Armor – works with global HTTP/S LB to provide defense against DDoS attacks, by using security policies that are made up of rules that allow or prohibit traffic from IP addresses or ranges defined in the rule:
  1. Is implemented at the edge of Google networks, and supports HTTP, HTTPS and HTTP/2
  2. Security policies are allow/deny list type of rules and can be specified for backend services
  3. You can deny/allow both precise IPs as well as whole CIDR ranges
  4. You can test and preview the rules without going live
  5. Logging module allows you to see triggered policy, associated action and related information
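
The API rate limiting item in the list above is easy to picture with a token bucket. The sketch below is a generic limiter in Python, not a description of how Google enforces its limits. The burst capacity of 20 is an assumption, and the clock is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Token bucket limiter with an injectable clock for deterministic use."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.updated = now

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# 20 requests/second, matching the per-project figure quoted above.
bucket = TokenBucket(rate=20, capacity=20)
admitted = sum(bucket.allow(now=0.0) for _ in range(30))
print(admitted)                 # 20 of the 30 simultaneous requests pass
print(bucket.allow(now=0.5))    # True: half a second refills 10 tokens
```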

Cloud Security Scanner:

Is a web security scanner for common vulnerabilities in App Engine, Compute Engine and GKE applications
Provides automatic vulnerability scanning testing for issues such as XSS, mixed content, cleartext passwords, insecure JS libraries etc
Available at no extra cost and has very low rates of false positives
You can perform an immediate scan or schedule it to run on periodic basis
You should run it in a QA environment, as running it in prod can cause unexpected issues and hinder the UX for legitimate application users. Cloud Security Scanner has a fuzzer-like engine: it’ll try to play with your APIs, post comments and add posts, which is something you most likely don’t want to become visible in a publicly available instance of your application

Compute Engine Best Practices:

All instances should run with service accounts, instead of giving users direct access to the instance
1. Create new service accounts and do not use the default service accounts, so you can follow the principle of least privilege
2. Grant users the serviceAccountUser role at the project level, to provide them the ability to create/manage instances
  1. Set permissions that allow them to create an instance, attach a disk, set instance metadata, use SSH and reconfigure an instance to run as a service account
Track how your CE resources are modified and accessed, so you always have an audit trail of who did what and when
Networking:
1. Instances that don’t need intra-network communication should be put into separate VPC networks
Image Management:
1. Restrict the use of public images
2. Allow only approved images, which are hardened with software approved by the security team
3. Utilize the Trusted Image feature
4. Set the configuration for images management on the Organization level
Patch Management:
1. In modern day cloud infrastructure you want to strive to achieve immutable infrastructure, so when you need to perform a patch or upgrade, you should consider replacing the CE instance instead of updating it

Google Kubernetes Engine(GKE) Security

GKE is a managed environment for deploying containerized applications, by grouping them into easily manageable units
GKE handles:
1. deployment
2. auto-scaling
3. updates
4. load balancing
5. auto recovery
The shortened description of Google Kubernetes’ architecture is as follows:
1. Cluster – consists of one cluster master and one/multiple worker machines called nodes
  1. Cluster Master runs the Kubernetes control plane processes such as Kubernetes API server, scheduler and core resource controllers
2. Node – worker machine that runs containerized apps and other types of workloads. Each node is a Compute Engine instance provisioned by GKE during cluster creation
3. Pod – smallest, simplest deployable object in Kubernetes. A pod represents a single instance of a running process in the cluster. It’s running on the Kubernetes nodes
GKE Networking:
1. Internal Cluster networking:
  1. Cluster IP is an IP address assigned to a service, and is stable for its lifetime
  2. Node IP – IP address assigned to a given node, comes from cluster’s VPC network and each node has a pool of IP addresses to assign to its pods
  3. Pod IP – IP address assigned to a given pod, shared with all containers in that pod and pods IPs are ephemeral by their nature
  4. Label – arbitrary key/value pair attached to an object
  5. Service – grouping of multiple related pods into a logical unit using labels
  6. Stable IP address, DNS entry and ports
  7. Provides load balancing among the set of pods whose labels match all the labels defined in the label selector when the service is created
  8. kube-proxy – a component running on each node that manages connectivity between pods and services
    1. egress-based LB controller
    2. continually maps the cluster IP to healthy pods
  9. Namespace – virtual clusters backed by the same physical cluster
    1. Intended for use in environments with many users spread across multiple projects or teams such as dev, qa, production
    2. a way to divide cluster resources between multiple users
    3. unique name within the cluster
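
The way a Service picks its pods, as described above, is by label matching: a pod belongs to the Service when its labels contain every key/value pair in the selector. A tiny Python sketch of that matching rule follows, with hypothetical pod names and labels.

```python
from typing import Dict, List

Labels = Dict[str, str]


def select_pods(pods: Dict[str, Labels], selector: Labels) -> List[str]:
    """Return names of pods whose labels contain every selector pair."""
    return sorted(
        name for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    )


# Hypothetical pods and labels.
pods = {
    "web-1": {"app": "web", "env": "prod"},
    "web-2": {"app": "web", "env": "qa"},
    "db-1": {"app": "db", "env": "prod"},
}

print(select_pods(pods, {"app": "web", "env": "prod"}))  # ['web-1']
print(select_pods(pods, {"app": "web"}))                 # ['web-1', 'web-2']
```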
Kubernetes Security:
1. Authentication and Authorization
  1. Service accounts pod level
  2. Disable Attribute based access control(ABAC) and use RBAC – role based access control
    1. rbac allows you to grant permissions to resources at the cluster/namespace level
  3. Follow the principle of least privilege and reduce node service account scopes
2. Control Plane Security
  1. Components are managed and maintained by Google
  2. Disable Kubernetes Web UI
  3. Disable authentication with client certs
  4. Rotate credentials on regular basis
3. Node Security
  1. Use container-optimized OS for enhanced security
    1. i.e. locked down firewall
    2. read-only filesystem wherever possible
    3. limited user accounts and disabled root login
  2. Patch OS on regular basis, ideally with enabled automatic upgrades
  3. Protect the OS on the node from untrusted workloads running in the pods
  4. Use metadata concealment to ensure pods do not have access to sensitive data
4. Network Security principles:
  1. All pods in the cluster can communicate with each other
  2. Restrict ingress/egress traffic of pods using network policies in a namespace
  3. Load balance pods with a service of type LoadBalancer
  4. Restrict which IP address ranges can access endpoints
  5. Filter authorized traffic using kube-proxy
  6. Use Cloud Armor / IAP Proxy when using external HTTPS LB
5. Securing workloads:
  1. Limit pod container process privileges using PodSecurityPolicies
  2. To give pods access to GCP resources
    1. workload identity
    2. node service account
6. Best practices for container security:
  1. Package a single application per container
  2. Create a process for managing zombie processes
  3. Optimize for the docker build cache, to allow accelerated building later on
  4. Remove unnecessary tools
  5. Build the smallest, most lightweight image possible
  6. Properly tag your images
  7. Use trusted images and be careful when using public images
  8. Use container registry vulnerability scanning to analyze container images

Secrets Management:

Common concerns while using secrets such as passwords, token, API keys, private keys etc:
1. Authorization and access management
2. Auditability on per-secret level
3. Encryption at rest and protection in case of unauthorized access
4. Rotation of secrets automatically or on-demand
5. Isolation, separation, management vs usage of secrets, separation of duties
Using secrets encrypted in code with Cloud KMS
1. Encrypt secrets at the application layer
2. Limit the scope of access to the secret
3. Restricted to all developers with access to the code
4. Must have access to both the code and access key
5. Audited for those who do have access
Cloud Storage bucket, encrypted at rest
1. Limits access to smaller set of developers
2. Auditable
3. Separation of systems. separate from code repository
4. Able to rotate secrets easily
Third party solutions
1. Dedicated secret management tools
2. Auditable
3. Separation of systems
4. Rotate secrets automatically
Changing secrets:
1. Rotating secrets
  1. Rotate secrets regularly
  2. Store few versions of a secret
  3. Rotate/rollback if needed
2. Caching secrets locally
  1. May be required by the application
  2. Can be rotated frequently, even several times per hour
  3. Can refresh secrets quickly
3. Separate solution or platform
  1. platform agnostic
  2. automatic and scheduled secret rotation
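
The "store a few versions, rotate, roll back if needed" idea above can be sketched as a small in-memory class. This is only a model of the bookkeeping: the class name and the default of three retained versions are assumptions, and a real store would keep the values encrypted and outside the application.

```python
from collections import deque
from typing import Deque


class VersionedSecret:
    """Keeps the newest `keep` versions of one secret, newest last."""

    def __init__(self, initial: str, keep: int = 3):
        self._versions: Deque[str] = deque([initial], maxlen=keep)

    @property
    def current(self) -> str:
        return self._versions[-1]

    def rotate(self, new_value: str) -> None:
        """Add a new version; the oldest is dropped once `keep` is exceeded."""
        self._versions.append(new_value)

    def rollback(self) -> str:
        """Discard the current version and return to the previous one."""
        if len(self._versions) < 2:
            raise ValueError("no earlier version to roll back to")
        self._versions.pop()
        return self.current


secret = VersionedSecret("v1")
for value in ("v2", "v3", "v4"):
    secret.rotate(value)
print(secret.current)     # v4
print(secret.rollback())  # v3
```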
Managing Access:
1. Limiting Access:
  1. Create two projects, one for Cloud Storage to store secrets and one for Cloud KMS to manage encryption keys
  2. Assign roles to access secrets, you can use service accounts for that
  3. Store each secret as an encrypted object in Cloud Storage, group them as needed
  4. Rotate secrets and encryption keys regularly
  5. Protect each bucket by using encryption. Although buckets have default encryption, it’s recommended to use Cloud KMS at the application layer
  6. Enable Cloud Audit Logs for activity monitoring
2. Restricting and enforcing access:
  1. Access controls on the bucket in which the secret is stored
    1. Support for multiple secrets/objects per bucket
    2. single secret per bucket
  2. Access controls on the key that encrypted the bucket in which the secret is stored
    1. Support for multiple secrets/objects per key
    2. single secret per key
  3. Usage of service accounts is encouraged
3. Best Practices:
  1. Limit the amount of data that one encryption key protects:
    1. Cryptographic isolation
    2. Allows for more granular control over secret access
    3. Helps prevent accidental permissions
    4. Supports more granular auditing
  2. Store each secret as its own object
  3. Store similar secrets in the same bucket
  4. One encryption key per bucket
  5. Regularly rotate keys and secrets to limit the lifecycle of each
  6. Enable Cloud Audit logging
4. Kubernetes Secret Management:
  1. Generic – local file, directory or literal value
  2. dockercfg secret for use with a Docker registry
  3. TLS secret from an existing KMS public/private keypair
  4. Secret values are encoded in base64
  5. Encrypt secrets at the application layer using KMS keys
  6. Use secrets by:
    1. Specifying environment variables that reference the secrets value
    2. Mounting a volume containing the secret
  7. Third party solutions can be used
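To see why the notes recommend encrypting at the application layer on top of Kubernetes secrets, it helps to look at what base64 encoding actually does. Below is a minimal sketch that builds a Secret manifest as a plain dictionary and then reverses the encoding without any key. The secret name and value are hypothetical.

```python
import base64
import json


def build_secret_manifest(name: str, values: dict) -> dict:
    """Return a Kubernetes Secret manifest with base64-encoded values."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Hypothetical secret, for illustration only.
manifest = build_secret_manifest("db-credentials", {"password": "s3cr3t"})
print(json.dumps(manifest, indent=2))

# Anyone who can read the manifest can reverse the encoding; no key is involved.
recovered = base64.b64decode(manifest["data"]["password"]).decode("utf-8")
```

Base64 is an encoding, not encryption, so the manifest should be treated as sensitive as the plaintext value itself.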

Cloud Storage and Storage Types:

Cloud Storage offers four storage classes:
1. Multi-regional
2. Regional
3. Nearline
4. Coldline
All storage classes provide:
1. Low latency
2. High durability
Storage classes differ by availability, minimum storage duration and pricing
The storage class set for an object affects its availability and pricing
Object’s existing storage class can be changed:
1. Rewriting the object
2. Object lifecycle management
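As a rough illustration of the lifecycle management option, the snippet below generates a lifecycle configuration in the JSON shape Cloud Storage accepts (for example via `gsutil lifecycle set`). The 30 and 90 day thresholds and the helper name `storage_class_transition` are assumptions picked for the example, not recommendations.

```python
import json


def storage_class_transition(target_class: str, min_age_days: int) -> dict:
    """One lifecycle rule: move objects older than min_age_days to target_class."""
    return {
        "action": {"type": "SetStorageClass", "storageClass": target_class},
        "condition": {"age": min_age_days},
    }


# Hypothetical thresholds: colder storage as objects get older.
lifecycle_config = {
    "rule": [
        storage_class_transition("NEARLINE", 30),
        storage_class_transition("COLDLINE", 90),
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```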

Cloud Storage Permissions and Access Control Lists

IAM:
1. Grant access to buckets as well as bulk access to bucket’s objects
2. Can be added to project or bucket
3. Broad control over buckets
  1. no fine-grained control
4. set the minimum permissions needed
5. recommended to set permissions for buckets
ACLs:
1. Customize access to individual objects within a bucket
2. Can be added to bucket or object
3. Fine-grained control over individual objects
4. Supplement each other with IAM
5. Public access can be granted to objects
6. Defined by permissions and scope
7. Permission can be:
  1. Owner
  2. Writer
  3. Reader
8. Scope can be:
  1. Google account
  2. Google groups
  3. Convenience values for projects (such as viewers-project)
  4. GSuite/Cloud Identity domain
  5. All Google account holders
  6. AllUsers
9. Default ACLs
  1. All new buckets are assigned a default ACL
  2. When the default object ACL for a bucket is changed, the new default applies to objects added afterwards; existing objects keep their current ACLs
10. Signed URLs
  1. A URL that provides time-limited read/write/delete access to an object in Cloud Storage
  2. Those who have access to the URL can access the object for the duration of time specified
  3. No Google account is needed for access
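The signed URL idea (whoever holds the URL gets access until it expires) can be sketched with a keyed hash. Note that this is a simplified stand-in: real Cloud Storage signed URLs are signed with a service account key using Google's own signing scheme, while the HMAC key, the host `storage.example.com` and the function names below are assumptions made for illustration.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SIGNING_KEY = b"demo-key-kept-on-the-server"  # hypothetical key


def _signature(method: str, path: str, expires: int) -> str:
    """Sign the HTTP method, object path and expiry time together."""
    message = f"{method}\n{path}\n{expires}".encode("utf-8")
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


def sign_url(method: str, path: str, expires: int) -> str:
    """Build a URL that carries its own expiry time and signature."""
    query = urlencode({"expires": expires, "signature": _signature(method, path, expires)})
    return f"https://storage.example.com{path}?{query}"


def verify_url(method: str, url: str, now: int) -> bool:
    """Accept the request only if the signature matches and the URL has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        provided = params["signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(method, parsed.path, expires)
    return now <= expires and hmac.compare_digest(provided, expected)


# Expiry is a Unix timestamp; the value here is arbitrary.
url = sign_url("GET", "/my-bucket/report.pdf", expires=1_700_000_600)
```

Because the method, path and expiry are all covered by the signature, changing any of them invalidates the URL, which is what makes it safe to hand to someone without a Google account.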

Data Retention Policies using Bucket Lock

Allows you to configure a data retention policy for Cloud Storage bucket to govern how long objects in the bucket must be retained. The feature also allows you to lock the data retention policy, permanently preventing the policy from being reduced or removed
Used for Write Once Read Many (WORM) storage
Prevents deletion or modification of data for a specified time period
Helps meet compliance, legal and regulatory requirements for data retention
Works with all tiers of Cloud Storage
Lifecycle policies can be applied to automatically move locked data to colder storage classes
Retention policies
1. Can be included when creating a new bucket
2. Add a retention policy to an existing bucket
3. Ensures that all current and future objects in the bucket cannot be deleted or overwritten until they reach the age defined in the policy
4. Tracked by retention expiration time metadata
Retention periods:
1. Measured in seconds
2. Can be set in days, months or years
3. Maximum is 100 years
Retention policy locks:
1. Prevent the policy from ever being removed and retention period from ever being reduced
2. Once a retention policy is locked, you cannot delete the bucket until every object has met the retention period
3. Locking a retention policy is irreversible
Object holds:
1. metadata flags that are placed on individual objects
2. Objects with holds cannot be deleted
  1. Event-based holds
  2. temporary holds
3. Event-based holds can be used in conjunction with retention policies to control retention based on event occurrences
4. Temporary holds can be used for regulatory or legal investigation purposes
5. Objects can have one, both or neither
Compliance:
1. Can be used to comply with financial institution regulatory requirements for electronic record retention such as SEC, FINRA etc
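The interaction between retention periods and object holds can be sketched as a simple check. This is a simplified model, assuming retention is counted from the object's creation time; it ignores details such as what happens to the retention clock when an event-based hold is released. The names `StoredObject` and `can_delete` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class StoredObject:
    created: datetime
    event_based_hold: bool = False
    temporary_hold: bool = False


def retention_expiration(obj: StoredObject, retention_seconds: int) -> datetime:
    """Earliest time at which the retention policy stops protecting the object."""
    return obj.created + timedelta(seconds=retention_seconds)


def can_delete(obj: StoredObject, retention_seconds: int, now: datetime) -> bool:
    """An object is deletable only with no holds and after its retention period."""
    if obj.event_based_hold or obj.temporary_hold:
        return False
    return now >= retention_expiration(obj, retention_seconds)


THIRTY_DAYS = 30 * 24 * 60 * 60  # retention periods are measured in seconds
report = StoredObject(created=datetime(2020, 1, 1, tzinfo=timezone.utc))
too_early = can_delete(report, THIRTY_DAYS, datetime(2020, 1, 15, tzinfo=timezone.utc))
after_expiry = can_delete(report, THIRTY_DAYS, datetime(2020, 2, 1, tzinfo=timezone.utc))
```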

BigQuery Security:

Integrates with DLP, Cloud Storage and Stackdriver
Authorized Views:
1. View access to a dataset
2. Cannot assign access controls directly to tables or views
3. Lowest level is the dataset level
4. Allows you to share query results with users/groups
5. Restricts access to the underlying tables
6. Allows you to use the view’s SQL query to restrict the columns users are able to query
7. Must be created in a separate dataset
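The column-restricting part of authorized views can be illustrated with any SQL engine. The sketch below uses SQLite from the Python standard library as a stand-in for BigQuery, so it only shows how a view's query hides columns; it does not model datasets or BigQuery access controls. Table, view and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Underlying table with a sensitive column (email).
conn.execute("CREATE TABLE customers (id INTEGER, country TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "PL", "a@example.com"), (2, "DE", "b@example.com")],
)

# The view's SQL decides which columns are exposed to its readers.
conn.execute("CREATE VIEW customers_public AS SELECT id, country FROM customers")

cursor = conn.execute("SELECT * FROM customers_public ORDER BY id")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
print(columns, rows)
```

In BigQuery the same idea is combined with dataset-level permissions: users get access to the dataset holding the view, not to the dataset holding the underlying table.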
Exporting data:
1. Can be exported to CSV, JSON, Avro
2. Up to 1GB data to a single file
3. Can only export to Cloud Storage
Datasets can be scanned for PII with DLP

Stackdriver:

A set of tools for logging, debugging and monitoring.
Available for GCP and AWS
Provides VM monitoring with agents
Stackdriver products:
1. Stackdriver Monitoring – metrics, time series, health checks, alerts
2. Stackdriver Logging – central aggregation of all log activity
3. Stackdriver Error Reporting – Identify and understand application errors
4. Stackdriver Debugger – identify code errors in production
5. Stackdriver Trace – find performance bottlenecks in production
6. Stackdriver Profiler – identify CPU, memory and time consumption patterns
Integration with 3rd party products within one view
Stackdriver Logging:
1. Central repository for log data from multiple sources
2. Real-time log management and analysis
3. Tight integration with monitoring
4. Platform, system and application logs
5. Export logs to other sources for long-term storage and analysis
6. General ideas:
  1. associated primarily with GCP projects
    1. Logs Viewer only shows logs from one project
  2. Log Entry records a status or an event
    1. Project receives log entries when services being used produce log entries
  3. Logs are a named collection of log entries within a GCP resource
    1. Each log entry includes the name of its log
    2. Logs only exist if they have log entries
  4. Retention period – length of time for which logs are kept
7. Types of logs:
  1. Audit Logs:
    1. who did what, where and when
    2. Admin activity
    3. Data access
    4. System events
  2. Access Transparency Logs
    1. Actions taken by Google staff when accessing your data
  3. Agent logs:
    1. Collected by logging agents that run on VMs
    2. The agent sends system and third-party logs from the VM instance to Stackdriver Logging
8. Audit log types:
  1. Admin activity logs:
    1. API calls or other administrative actions
    2. always written
    3. cannot disable or configure them
    4. no charge
  2. Data Access Logs
    1. API calls that create, modify, read resource data provided by the user
    2. Disabled by default
    3. Must be explicitly enabled
    4. Charges apply
  3. Audit logs:
    1. System event Audit Logs:
      1. GCP administrative actions
      2. Generated by Google, not by user action
      3. Always written
      4. Cannot disable/configure them
      5. No charge
    2. Access Transparency Logs
      1. Actions taken by Google staff when accessing your data, for example:
        1. Investigations into your support requests
        2. Investigations recovering from an outage
      2. Enabled for the entire Organization
      3. Requires Enterprise support, as that is the context in which such access happens
    3. Agent Logs:
      1. Sends system and 3rd party logs on the VM to Stackdriver Logging
      2. Charges apply
9. IAM Roles:
  1. Logging Admin – full control and able to add other members
  2. Logs Viewer – view logs only
  3. Private Logs Viewer – view logs, including private logs
  4. Logs Writer – grants service accounts permission to write logs
  5. Logs Configuration Writer – create metrics and export sinks (for extended storage, big data analytics, streaming to other apps/systems)
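Since each log entry includes the name of its log, the audit log types listed above can be told apart by that name. Here is a minimal sketch that maps a `logName` value to its audit log type. The project ID in the usage line is hypothetical, and the function name is made up for the example.

```python
from urllib.parse import unquote

# Log IDs of the audit logs, after URL-decoding the "%2F" separator.
AUDIT_LOG_TYPES = {
    "cloudaudit.googleapis.com/activity": "Admin Activity",
    "cloudaudit.googleapis.com/data_access": "Data Access",
    "cloudaudit.googleapis.com/system_event": "System Event",
}


def audit_log_type(log_name: str):
    """Map a full logName to its audit log type, or None for any other log."""
    _, _, log_id = log_name.partition("/logs/")
    return AUDIT_LOG_TYPES.get(unquote(log_id))


# Hypothetical project ID.
kind = audit_log_type("projects/my-project/logs/cloudaudit.googleapis.com%2Factivity")
```

A check like this can be handy when triaging exported logs, for instance to separate always-on Admin Activity entries from Data Access entries that had to be enabled explicitly.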
VPC Flow Logs – record a sample of network flows sent from/received by VM instances. It’s useful for network monitoring, forensics and real-time security analysis
1. Can be viewed through Stackdriver Logging
2. Aggregated by connection from VMs and exported in real time
3. Subscribing to Cloud Pub/Sub enables streaming so that flow logs can be analyzed in real time
4. Enable/disable per VPC subnet
5. Each flow record covers all TCP and UDP flows
6. Filters can be applied to select which flow logs should be excluded from Stackdriver Logging and exported to external APIs
7. No delay in monitoring, as Flow Logs are native to the GCP network stack
8. Collected for each VM at specified intervals
9. All packets are collected for a given interval and aggregated into a single flow log entry
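The aggregation described in the last two points can be sketched as grouping packets by connection and time window. This is a toy model, not the actual VPC Flow Logs implementation (which also samples packets); the 5-second interval, the `Packet` fields and the sample addresses are assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp: float  # seconds since some start time
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    size_bytes: int


def aggregate_flows(packets, interval_seconds=5):
    """Collapse packets of the same connection and time window into one entry."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        window_start = int(p.timestamp // interval_seconds) * interval_seconds
        key = (window_start, p.src_ip, p.src_port, p.dst_ip, p.dst_port, p.protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size_bytes
    return dict(flows)


# Hypothetical traffic: three packets of one TCP connection across two windows.
sample = [
    Packet(0.5, "10.0.0.2", 44321, "10.0.0.3", 443, "TCP", 100),
    Packet(1.2, "10.0.0.2", 44321, "10.0.0.3", 443, "TCP", 250),
    Packet(6.0, "10.0.0.2", 44321, "10.0.0.3", 443, "TCP", 40),
]
flows = aggregate_flows(sample)
```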
Stackdriver Monitoring
1. Full stack monitoring for GCP, AWS and 3rd party apps
2. Provides single pane of glass dashboarding, integrates with Stackdriver Logging
3. Monitoring agent:
  1. gathers system and application metrics from VM
  2. Without the agent on VM, only CPU/disk traffic/network traffic and uptime metrics are collected
  3. Can monitor many 3rd party apps
4. Can monitor GKE clusters starting from general cluster metrics to inspection of services, nodes, pods and containers
5. Alerting:
  1. Policies can be defined to alert you when a service is considered unhealthy (depending on the criteria you’ve specified)
  2. Allows notification through email, PagerDuty, Slack, SMS
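What "unhealthy according to your criteria" means in an alerting policy can be sketched as a threshold that has to be breached for several samples in a row. This is a simplified model of the concept rather than Stackdriver's actual policy engine; the function name, threshold and sample values are assumptions.

```python
def is_unhealthy(samples, threshold, min_consecutive):
    """True if the metric stayed above threshold for min_consecutive samples in a row."""
    streak = 0
    for value in samples:
        # A single sample at or below the threshold resets the streak.
        streak = streak + 1 if value > threshold else 0
        if streak >= min_consecutive:
            return True
    return False


# Hypothetical CPU utilisation samples, one per minute.
cpu = [0.50, 0.90, 0.95, 0.92]
should_alert = is_unhealthy(cpu, threshold=0.80, min_consecutive=3)
```

Requiring a sustained breach instead of a single high sample is a common way to avoid alerting on short spikes.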
Stackdriver APM – set of tools that work with code/apps running on cloud and on-premise infrastructure. Helps monitor and manage application performance
1. Consists of Stackdriver [Trace/Debugger/Profiler]
2. It’s a set of tools used by Google’s Site Reliability Engineering Team
3. Stackdriver Trace – helps understand how long it takes the application to handle incoming requests
4. Stackdriver Debugger:
  1. debug a running app without slowing it down thanks to an option to create a snapshot i.e. capture and inspect the call stack and local variables in the application
  2. inject logging into running services at available logpoints
5. Stackdriver Profiler:
  1. continuously gathers CPU usage and memory allocation information from your applications
  2. helps discover patterns of resource consumption
6. Stackdriver Error Reporting:
  1. Real time error monitoring and alerting
  2. Counts, analyzes and aggregates the crashes in the GCP environment
  3. Alerts when a new application error happens
Logs exports:
1. You can export the logs (defined by your query) to:
  1. Cloud Storage
  2. BigQuery
  3. Cloud Pub/Sub – useful for exporting to a SIEM-like system
2. Logs exports aren’t charged
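An export is defined by a sink: a name, a destination and a query (filter) selecting which entries to export. The sketch below builds such definitions as plain dictionaries for the three destination types. The project, bucket, dataset and topic names are hypothetical, and the helper functions are made up for this example; they are not part of any Google client library.

```python
def sink_destination(kind: str, project: str, resource: str) -> str:
    """Build the destination string for a log sink."""
    if kind == "storage":
        return f"storage.googleapis.com/{resource}"
    if kind == "bigquery":
        return f"bigquery.googleapis.com/projects/{project}/datasets/{resource}"
    if kind == "pubsub":
        return f"pubsub.googleapis.com/projects/{project}/topics/{resource}"
    raise ValueError(f"unsupported destination kind: {kind}")


def build_sink(name: str, destination: str, log_filter: str) -> dict:
    """A sink pairs a query (filter) with the place matching entries are sent."""
    return {"name": name, "destination": destination, "filter": log_filter}


# Hypothetical sink: stream audit log entries to a Pub/Sub topic read by a SIEM.
siem_sink = build_sink(
    name="audit-to-siem",
    destination=sink_destination("pubsub", "my-project", "siem-export"),
    log_filter='logName:"cloudaudit.googleapis.com"',
)
```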

Cloud Responsibility Model

Security of the cloud – Google
Security in the cloud – User

A big credit for this contribution goes to the bloggers and platforms from which I’ve learnt a ton, to name a few: pluralsight, cybrary, pentesteracademy, linuxacademy, coursera, udemy, infosecacademy.

Hope this is helpful.

2 thoughts on “Security Principles of Google Cloud Platform”

Emanuel Pabis says:

Thursday, 16/April/2020 at 8:08 PM

Considering the value conveyed in your materials and your enormous commitment, I can say with a clear conscience that you are among the top, if not the best, cybersecurity bloggers/YouTubers in Poland.


1. Dawid Balut says:
  
  Tuesday, 28/April/2020 at 7:30 AM
  
  Thank you very much for such a positive and motivating comment. I appreciate it.
  


Published by Dawid Balut
