Security Principles of Google Cloud Platform

While studying new material in my own time, I like to take notes to memorize things better and to have neat reference material for the future. I often end up polishing my notes on a specific subject and releasing them to the infosec community, and I recently found such a piece of work from last year, when I was intensively studying the security concepts of Google Cloud Platform.

I’m NOT a DevOps/GCP expert by any means – just wanted to share something to add a building block to our community knowledge base and make it easier for others to learn the ropes of cloud security engineering.

Table of Contents includes:

  • Resources hierarchies and policies
  • Cloud IAM Overview
  • IAM roles
  • Service Accounts
  • Cloud Identity
  • Cloud IAM Best Practices
  • Network Security
  • VPC Network details
  • Firewall rules
  • Load Balancers
  • VPC Best Practices
  • Cloud Interconnect and Cloud VPN
  • Cloud DNS and DNSSEC
  • Encryption
  • Cloud Key Management Service (KMS)
  • Cloud Identity-Aware Proxy
  • Data Loss Prevention
  • Cloud Security Command Center
  • Forseti
  • DDOS Mitigation
  • Cloud Security Scanner
  • Compute Engine Best Practices
  • Google Kubernetes Engine (GKE) Security
  • Secrets Management
  • Cloud Storage and Storage Type
  • Cloud Storage Permissions and Access Control Lists
  • Data Retention Policies using Bucket Lock
  • BigQuery Security
  • Stackdriver
  • Cloud Responsibility Model

 

 
Resources hierarchies and policies:
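  1. IAM policies are inherited from the top down (Organisation, folders, projects, resources). The effective policy on a resource is the union of its own policy and the policies of its ancestors, so a permissive parent policy overrides a more restrictive child policy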
    1. Super Admin User best practices
      1. It’s the user set up when you first sign up for a Google Cloud account, and it has full access to the Organisation
      2. It’s recommended to link it to a @gmail.com or another email account that’s not your GSuite user / Cloud Identity user
      3. Enable 2FA
      4. Don’t use this user for daily activities. Instead, create an ‘Organisation admin’ group for day-to-day administrative activities on the Organisation, but keep the super admin user outside of this group.
      5. Discourage usage of this account by:
        1. Enabling 2FA with a physical device
        2. Not sharing the password/credentials
        3. Setting up Stackdriver alerts that are sent to a group of people whenever the super admin user is used, so that nobody wants to be the reason for those alerts
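To make the inheritance rule from item 1 more concrete, here is a minimal sketch of how an effective policy could be computed by walking up the hierarchy and taking the union of the bindings. The hierarchy, resource names and members below are made up for illustration, and this is a simplified model rather than how Cloud IAM is implemented internally.

```python
from collections import defaultdict

# Hypothetical hierarchy: child -> parent (None marks the Organisation root).
PARENTS = {
    "org": None,
    "folder-eng": "org",
    "project-web": "folder-eng",
}

# Hypothetical policies set directly on each node: role -> set of members.
POLICIES = {
    "org": {"roles/viewer": {"group:everyone@example.com"}},
    "folder-eng": {"roles/editor": {"group:eng@example.com"}},
    "project-web": {"roles/viewer": {"user:alice@example.com"}},
}


def effective_policy(resource):
    """Union the bindings of a resource with those of all its ancestors."""
    merged = defaultdict(set)
    node = resource
    while node is not None:
        for role, members in POLICIES.get(node, {}).items():
            merged[role] |= members
        node = PARENTS[node]
    return dict(merged)


policy = effective_policy("project-web")
print(sorted(policy["roles/viewer"]))
print(sorted(policy["roles/editor"]))
```

Note that the project cannot take away the editor access granted on the folder: a child can only add bindings, which is why permissive grants high up in the hierarchy deserve extra care.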
Cloud IAM Overview:
  1. Defines who can do what on which resources
  2. Allows you to define granular access to specific GCP resources, so you can follow the principle of least privilege and prevent unwanted access to other resources
IAM roles:
  1. Primitive roles – very generic roles that offer only three options: Owner, Editor and Viewer. They lack granularity, which is why their use is discouraged. However, you can still combine them with other roles, e.g. make it easier for yourself by granting the Viewer role to most users and then defining granular custom roles for a higher level of access to specific types of resources
  2. Predefined roles – granular roles defined and maintained by Google that allow you to truly follow the principle of least privilege
  3. Custom roles – you can define your own roles with as limited a set of permissions as you wish. While predefined roles are good and a common practice, they often bundle access to multiple APIs, whereas a custom role can grant as little as a single permission.
  4. Beware the allUsers identifier, which grants access to your resources to everyone, including unauthenticated users.
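A quick way to act on the points above is to scan a policy for primitive roles and public members. The sketch below assumes the policy bindings have already been exported into a list of dictionaries with "role" and "members" keys (the shape used in IAM policy JSON); the sample bindings are invented for illustration.

```python
PRIMITIVE_ROLES = {"roles/owner", "roles/editor", "roles/viewer"}
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}


def audit_bindings(bindings):
    """Return a list of human-readable findings for risky bindings."""
    findings = []
    for binding in bindings:
        role = binding["role"]
        if role in PRIMITIVE_ROLES:
            findings.append(f"primitive role in use: {role}")
        for member in binding["members"]:
            if member in PUBLIC_MEMBERS:
                findings.append(f"{role} is granted to {member}")
    return findings


# Hypothetical bindings, as they might appear in an exported policy.
sample_bindings = [
    {"role": "roles/storage.objectViewer", "members": ["allUsers"]},
    {"role": "roles/editor", "members": ["user:bob@example.com"]},
    {"role": "roles/compute.viewer", "members": ["group:ops@example.com"]},
]

for finding in audit_bindings(sample_bindings):
    print(finding)
```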
Service Accounts:
  1. Service accounts are both identities and resources: a user can be granted the Service Account User role (roles/iam.serviceAccountUser) on a service account and then access resources with the roles of that service account
  2. Service accounts authenticate with keys, not passwords
  3. When you SSH into an instance, API calls made from that instance use the service account bound to it rather than your Cloud Identity user
  4. Google’s default service accounts define their permissions through access scopes, which is a legacy way of setting up permissions for service accounts. If you use them, it’s recommended to at least customize the API access scopes to get more granular access to specific APIs instead of relying on the defaults
  5. Create custom service accounts and define custom IAM policies with the most granular roles, so you can actually follow the principle of least privilege
  6. If you’re going to use the service account in your apps/code outside of GCP, create a user-managed key pair for it. Otherwise just let Google generate and manage the keys for you behind the scenes
 
 
Cloud Identity:
  1. You can deploy SSO through a third-party IdP
  2. You can synchronize with your AD or LDAP directory using Google Cloud Directory Sync (GCDS)
  3. You can deploy a variety of 2FA options
  4. You can utilize Mobile Device Management to enforce policies for personal and corporate devices, define a whitelist of approved apps and set requirements for company-managed apps
Cloud IAM Best Practices:
  1. Grant roles at smallest scope necessary.
  2. While using Service Accounts treat each app component as a separate trust boundary.
  3. Create a separate service account for each service
  4. Restrict service account access and who can create/manage service accounts
  5. Beware the Owner role which has access to all settings in GCP, including billing
  6. Rotate user-managed service account keys
  7. Name service keys to reflect use and permissions
  8. Use Cloud Audit logs to regularly audit IAM policy changes
  9. Audit who can edit IAM policies on projects
  10. Export audit logs to GCS for long-term retention
  11. Restrict log access with logging roles
  12. As a rule of thumb grant roles to a Google group instead of individual users
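The key rotation advice above is easy to check automatically once you have a list of keys and their creation times. Below is a minimal sketch; the 90-day threshold, the key names and the dictionary layout are my own assumptions for illustration, not values taken from GCP.

```python
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=90)  # assumed rotation window, pick your own


def stale_keys(keys, now):
    """Return the names of keys created more than MAX_KEY_AGE before `now`."""
    return [k["name"] for k in keys if now - k["created"] > MAX_KEY_AGE]


# Hypothetical inventory of user-managed service account keys.
inventory = [
    {"name": "deploy-key", "created": datetime(2020, 1, 15, tzinfo=timezone.utc)},
    {"name": "ci-key", "created": datetime(2020, 5, 1, tzinfo=timezone.utc)},
]

print(stale_keys(inventory, now=datetime(2020, 6, 1, tzinfo=timezone.utc)))
```

Passing `now` in explicitly keeps the check deterministic, which is handy if you run it from a scheduled job and want to unit test it.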
Network Security:
VPC Network details:
  1. When you create a default VPC, a set of default FW rules is created with it. Review them to confirm you’re exposing only the ports you truly need, as opposed to leaving e.g. the RDP and SSH ports wide open to the Internet, which is what the default rules in GCP do. Whatever you do network-wise, reducing the attack surface comes before permissive access for the sake of convenience
  2. VPC Network Peering allows private connections across two VPCs regardless of whether they’re in the same project/organisation or not. It lets you connect multiple networks without making the traffic traverse the public Internet, while each side remains in full, independent control of its own FW rules
  3. You can connect your GCP VPC network with on-premises networks through Cloud VPN or Cloud Interconnect
  4. Shared VPC allows you to connect resources from multiple projects to a common VPC network, so they can communicate securely over Google’s internal network. It lets you share networking while keeping administration and billing management separate across different departments
  5. If you have multiple VPCs in one project, you can’t set up an IAM role that allows a user to access vpc-1 but blocks their access to vpc-2 in the same project. VPCs are meant to separate resources, not user access.
Firewall rules:
  1. Enable you to allow/deny traffic to and from your VM based on your configuration
  2. Defined at the VPC level but enforced at the instance level
  3. Rules can be set to be enforced between instances and other networks as well as between instances on the same network
  4. By default, implied rules deny all ingress traffic and allow all egress traffic
  5. Firewall rules inner workings:
    1. The lowest priority number has the highest priority
    2. You need to define whether the rule applies to ingress or egress traffic
    3. Every FW rule must have a target: all instances in the network, instances with specific network tags, or instances running as specific service accounts
    4. Define the source (ingress) or destination (egress) in the rule
    5. You can specify the protocol and port
  6. Network tags:
    1. Using network tags for Compute Engine instances is a good idea. Use meaningful text attributes as names, e.g. apache-http-plaintext for a rule that opens port 80
    2. They allow you to apply FW rules and routes to an individual instance as well as to a set of instances
  7. Private Google Access:
    1. It’s enabled at the subnet level and allows instances that have only internal IPs to reach certain Google APIs and services
    2. It doesn’t affect instances with external IPs
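The evaluation order described under "Firewall rules inner workings" can be sketched in a few lines. This is a simplified model I wrote for illustration: the `Rule` fields are my own naming, each rule matches a single protocol/port and a single CIDR range, and targets are modelled only as an optional network tag.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    priority: int          # lower number = higher priority
    direction: str         # "ingress" or "egress"
    action: str            # "allow" or "deny"
    cidr: str              # source (ingress) or destination (egress) range
    protocol: str          # e.g. "tcp"
    port: int
    target_tag: str = ""   # empty string = applies to all instances


def evaluate(rules, direction, peer_ip, protocol, port, instance_tags):
    """Return "allow" or "deny" for one connection attempt."""
    peer = ipaddress.ip_address(peer_ip)
    # Lowest priority number first; on a tie, deny rules are checked before allow rules.
    ordered = sorted(rules, key=lambda r: (r.priority, r.action != "deny"))
    for rule in ordered:
        if rule.direction != direction:
            continue
        if rule.target_tag and rule.target_tag not in instance_tags:
            continue
        if rule.protocol != protocol or rule.port != port:
            continue
        if peer in ipaddress.ip_network(rule.cidr):
            return rule.action
    # No rule matched: fall back to the implied deny-ingress / allow-egress rules.
    return "deny" if direction == "ingress" else "allow"


# Hypothetical rule set using the tag name from the notes above.
rules = [
    Rule(1000, "ingress", "allow", "0.0.0.0/0", "tcp", 80, "apache-http-plaintext"),
    Rule(900, "ingress", "deny", "203.0.113.0/24", "tcp", 80),
]

tags = {"apache-http-plaintext"}
print(evaluate(rules, "ingress", "198.51.100.7", "tcp", 80, tags))
print(evaluate(rules, "ingress", "203.0.113.9", "tcp", 80, tags))
print(evaluate(rules, "ingress", "198.51.100.7", "tcp", 22, tags))
```

The third call shows why the implied rules matter: nothing mentions port 22, so ingress SSH is denied unless a rule explicitly opens it.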
Load Balancers:
  1. In GCP you can distribute load among instances in single or multiple regions
  2. A load balancer sits in front of your instances using an IP frontend and intelligently relays traffic to multiple backend targets
  3. HTTPS Load Balancer:
    1. Layer 7 – cross-region and external
    2. Supports HTTPS for encryption in transit
    3. Traffic can be distributed by location or content
    4. Forwarding rules direct incoming traffic to the defined targets, which distribute it to instance groups
    5. URL maps route requests based on defined rules
    6. You can have Google manage your SSL certificates or manage your own
  4. SSL Proxy Load Balancer
    1. Layer 4 proxy – cross-region and external
    2. Supports TCP with SSL offload (non-HTTPS traffic)
    3. Traffic is distributed by location
    4. Client SSL sessions are terminated at the load balancer
    5. End-to-end encryption is supported by configuring backend services to accept traffic over SSL
    6. Can be used for services such as Secure WebSockets or IMAP over SSL
    7. Intended for non-HTTP(S) traffic; use the HTTPS Load Balancer for HTTP(S)
  5. TCP Proxy Load Balancer
    1. Layer 4 – cross-region and external
    2. Intended for non-HTTP traffic
    3. Intelligent routing that routes to locations that have capacity
    4. Supports many common ports
    5. Able to forward traffic as TCP or SSL
  6. Network Load Balancer
    1. Layer 4 – regional and external
    2. A forwarding rule supports either TCP or UDP, not both
    3. Can load balance UDP, TCP and SSL traffic on ports that aren’t supported by the TCP Proxy and SSL Proxy load balancers
    4. SSL traffic is decrypted by the backends and not by the load balancer itself
    5. Distributes traffic depending on the protocol, scheme and scope
    6. No TLS offloading or proxying
    7. Forwarding rules distribute traffic to the defined targets (instance groups) – this applies to TCP and UDP only, as other protocols use target instances
    8. Requires self-managed SSL certificates, since TLS is terminated on the backends
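To illustrate the content-based distribution that URL maps provide for the HTTPS Load Balancer, here is a small sketch that picks a backend by the longest matching path prefix. It’s a deliberate simplification (real URL maps also have host rules and their own matching syntax), and the backend names are hypothetical.

```python
def pick_backend(path_rules, default_backend, request_path):
    """Return the backend whose path prefix is the longest match for the request."""
    best_prefix, best_backend = "", default_backend
    for prefix, backend in path_rules.items():
        if request_path.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_backend = prefix, backend
    return best_backend


# Hypothetical path rules: prefix -> backend service name.
path_rules = {
    "/video/": "video-backend",
    "/video/hd/": "video-hd-backend",
    "/static/": "static-backend",
}

print(pick_backend(path_rules, "web-backend", "/video/hd/clip.mp4"))
print(pick_backend(path_rules, "web-backend", "/index.html"))
```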
VPC Best Practices:
  1. Use internal IP and Private Google access when possible
  2. Start with a single VPC for resources that have common requirements
  3. Create a VPC for each team, connected to a shared services VPC to maintain granular level control for each VPC
  4. Isolate sensitive data in its own VPC, e.g. for HIPAA/PCI compliance
  5. Consider using VPC Flow Logs for network monitoring and forensics
Cloud Interconnect and Cloud VPN
  1. Cloud VPN:
    1. Connects an on-premises network to a VPC, or two VPCs in GCP
    2. IPSec tunnel over the public internet
    3. Traffic is encrypted by one gateway and decrypted by the other
    4. Site-to-site VPN only; it doesn’t support client-to-site
    5. Supports up to 3 Gbps per tunnel, with 8 tunnels max
    6. Supports both static and dynamic routing
    7. Supports IKEv1 and IKEv2 using a shared secret
  2. Cloud Interconnect:
    1. Physical link between Google and on-premises networks
    2. Doesn’t traverse the public internet
    3. Located in a colocation facility owned by Google or a Google partner, with speeds from 50 Mbps to 200 Gbps
  3. When to use which:
    1. Interconnect should be used when:
      1. You want to prevent traffic from going through the public internet
      2. You want to extend your VPC network
      3. You need a low-latency, high-speed connection
      4. You want to enable Private Google Access for on-premises hosts
    2. VPN should be used when:
      1. Public internet access is needed
      2. A peering location isn’t available
      3. You have budget constraints, as this option is much cheaper than Interconnect
      4. You don’t need very high speed and low latency
Cloud DNS and DNSSEC
  1. Each domain has its own zone
  2. It’s built around projects, and each project can have managed zones of two types:
    1. Public – publicly facing
    2. Private – visible only within specified VPC networks
  3. DNSSEC is built into Cloud DNS. To enable it, turn on DNSSEC for the zone in the Google console and add the resulting DS record for your domain at your registrar
Encryption:
  1. Encryption at rest:
    1. Encryption by default:
      1. Each data chunk has a separate data encryption key (DEK)
      2. Backups are encrypted using a separate DEK
    2. Cloud Key Management Service – when you choose to manage your own encryption keys for sensitive data
      1. The DEK is additionally encrypted with a key encryption key (KEK), so the envelope encryption process is followed – KEKs are not exportable from KMS
      2. Provides an audit trail
      3. Allows you to set up an ACL for each key (one policy per key)
      4. Every use of a key is authenticated and tracked
      5. Keys are automatically rotated every 90 days
    3. Customer-Supplied Encryption Keys, with HSM
      1. to match your on-prem encryption setup and achieve even more granular control over encryption keys
  2. Encryption in transit:
    1. Google maintains strict security measures to protect the physical boundaries of its network
    2. Google Front End (GFE)
      1. Globally distributed system through which Google Cloud services accept requests, with points of presence all around the globe
      2. Includes features such as:
        1. Terminating traffic for incoming HTTP, HTTPS, TCP and TLS proxy traffic
        2. Providing DDoS attack prevention
        3. Routing and load balancing traffic to Google Cloud services
    3. Types of request routing within the GCP infrastructure – GCP encrypts and authenticates all data in transit at one or more network layers when data moves outside the physical boundaries controlled by Google
      1. User <-> Google Front End Encryption
      2. User <-> Customer application hosted in GCP
      3. VM <-> VM
      4. VM <-> Google Cloud service
      5. Google Cloud service <-> Google Cloud service
    4. By default traffic encryption is performed at the network layer with AES-128
      1. Session keys are established on hosts and protected by ALTS
      2. Security tokens are used for authentication and generated for every flow; each consists of a token key and a host secret
      3. This protects against hosts spoofing packets on the network
      4. Google sets up IPSec tunnels for communication between two networks. Traffic is encrypted by one VPN gateway and decrypted by the VPN gateway on the other end, with IKEv1 (AES-128) and IKEv2 (AES-256) supported
      5. Any data sent to the GFE is encrypted in transit using TLS, including API interactions
      6. The GFE employs BoringSSL for TLS, which is a fork of OpenSSL. It’s configured to automatically negotiate the highest version of the protocol
      7. Google’s Certificate Authority enables identity verification in TLS through the use of a certificate. The certificate holds the DNS hostname of the server and its public key
      8. Root key migration and key rotation:
        1. Google is responsible for the rotation of keys and certificates
        2. TLS certificates are rotated every 2 weeks with lifetime of 3 months
        3. Keys are rotated daily with lifetime of 3 days
      9. Application Layer Transport Security (ALTS)
        1. Layer 7 traffic
        2. Mutual authentication and transport encryption system developed by Google
        3. Used for securing Remote Procedure Call (RPC) communications within Google’s infrastructure
        4. Identities are bound to entities (user, machine, service) instead of to a specific server name or host
        5. Relies on both the handshake protocol and the record protocol
        6. Governs how sessions are established, authenticated, encrypted and resumed
        7. GFE <-> service | service <-> service
Cloud Key Management Service (KMS):
  1. KMS is a service that lets you manage cryptographic keys for every service within GCP
  2. Generate, use, rotate and destroy symmetric encryption keys
  3. Automatic or at-will key rotation
  4. Asymmetric and symmetric key support
  5. Used to encrypt all data on Google Cloud
  6. Integrated with Cloud IAM and Cloud Audit Logs
  7. Every use of a key is authenticated, tracked and logged
  8. Permissions are handled by ACLs on a per-key basis
  9. Can be used with Cloud HSM
  10. DEKs are encrypted with a KEK
  11. This process is known as envelope encryption, where you encrypt one key with another key
  12. Central repository for storing KEKs
  13. KEKs are not exportable from KMS
  14. Automatically rotates KEKs at regular intervals
  15. Standard rotation period is 90 days
  16. KMS belongs to a project, and the best practice is to run KMS in a separate project
  17. You can choose the location where KMS keys are stored and receive requests
  18. You can group keys through a key ring, and keys in a key ring inherit the key ring’s permissions
  19. Separation of duties:
    1. Ensuring that one individual does not have all necessary permissions to be able to complete a malicious action
    2. users normally shouldn’t have access to decryption keys
    3. helps prevent security or privacy incidents and programmatic errors
    4. Move KMS to its own project
  20. Secrets Management:
    1. Cloud KMS doesn’t directly store secrets
    2. Encrypts secrets that you store elsewhere
    3. Use the default encryption built into Cloud Storage buckets
    4. use application layer encryption using a key in Cloud KMS
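Envelope encryption is easier to follow with the data flow written out. The sketch below shows only the structure: a fresh DEK encrypts the data, and the KEK encrypts (wraps) the DEK. The cipher is a toy keystream built from HMAC-SHA256 so the example runs with the standard library alone; it is a stand-in and must not be used as real cryptography. In Cloud KMS the KEK would never leave the service, whereas here it’s just a local variable for illustration.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key, nonce, data):
    """Toy stream cipher (XOR with an HMAC-SHA256 keystream). NOT for production use."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        block = hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        stream.extend(block)
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_envelope(kek, plaintext):
    """Encrypt data with a fresh DEK, then wrap the DEK with the KEK."""
    dek = secrets.token_bytes(32)
    data_nonce = secrets.token_bytes(16)
    wrap_nonce = secrets.token_bytes(16)
    return {
        "ciphertext": _keystream_xor(dek, data_nonce, plaintext),
        "data_nonce": data_nonce,
        "wrapped_dek": _keystream_xor(kek, wrap_nonce, dek),
        "wrap_nonce": wrap_nonce,
    }


def decrypt_envelope(kek, envelope):
    """Unwrap the DEK with the KEK, then decrypt the data with the DEK."""
    dek = _keystream_xor(kek, envelope["wrap_nonce"], envelope["wrapped_dek"])
    return _keystream_xor(dek, envelope["data_nonce"], envelope["ciphertext"])


kek = secrets.token_bytes(32)  # stand-in for a key held by a KMS
envelope = encrypt_envelope(kek, b"db-password=hunter2")
print(decrypt_envelope(kek, envelope))
```

Only the wrapped DEK is stored next to the ciphertext, so rotating the KEK means re-wrapping small DEKs rather than re-encrypting all of the data.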
Cloud Identity-Aware Proxy:
  1. Establishes a central authorization layer for applications accessed by HTTPS, enabling you to use an application-level access controls instead of using network-level firewalls.
  2. Controls HTTPS access to your applications and VMs on GCP
  3. Central authorization layer for application-level access control
  4. Enforces access control policies for applications and resources
  5. Allows employees to work from untrusted networks without the use of VPN
  6. The flow of IAP is as follows:
    1. User hits Cloud IAP Proxy
    2. Cloud IAP enabled app or backend service validates credentials and performs user authentication
    3. Then it checks if user has been authorized access to the resource
  7. Allows you to access web apps and infrastructure from any device without VPN
  8. It’s built into GCP infrastructure and GSuite
  9. It’s integrated with Cloud Identity and Cloud Armor
  10. Supports both cloud and on-premises
  11. Supports IAP TCP Forwarding:
    1. Lets you control who can access administrative services such as SSH/RDP; putting them behind Cloud IAP protects them from being openly exposed to the internet
    2. Requires users to pass authentication and authorization checks before they gain access to the target resource
  12. IAP Best Practices
    1. Don’t use a third-party CDN, to avoid insecure caching
    2. To secure your app, use signed headers
    3. Ensure that all requests to CE or GKE are routed through the load balancer
    4. Configure source traffic to be routed through GFE whenever possible
 
 
Data Loss Prevention:
  1. Allows you to manage and redact sensitive data such as credit card numbers, names, phone numbers, credentials etc
  2. Will alert you when sensitive data is discovered, with information on the likelihood that each finding is legitimate
  3. Results can be imported into BigQuery for analysis
  4. There are 90+ predefined detectors but you can also define custom ones
  5. Can work on text files as well as images
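To show what a detector with a likelihood rating might look like conceptually, here is a minimal sketch that finds candidate credit card numbers with a regular expression, uses the Luhn checksum to rate them, and redacts the likely ones. This is my own illustration and not the Cloud DLP API; the regular expression, the two likelihood labels and the `[CREDIT_CARD]` placeholder are assumptions.

```python
import re

# 13 to 16 digits, optionally separated by single spaces or dashes.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def inspect_text(text):
    """Return (match, likelihood) pairs for card-like numbers in the text."""
    return [
        (m.group(), "LIKELY" if luhn_valid(m.group()) else "UNLIKELY")
        for m in CARD_PATTERN.finditer(text)
    ]


def redact(text):
    """Replace card-like numbers that pass the Luhn check with a placeholder."""
    return CARD_PATTERN.sub(
        lambda m: "[CREDIT_CARD]" if luhn_valid(m.group()) else m.group(), text
    )


sample = "Card 4111 1111 1111 1111 was charged; ref 1234 5678 9012 3456."
print(inspect_text(sample))
print(redact(sample))
```

The second number in the sample has the right shape but fails the checksum, which is the kind of signal a likelihood score is meant to capture.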
 
 
Cloud Security Command Center:
  1. Single pane of glass dashboard that allows you to gather data, and identify threats to easily act on them.
  2. It allows you to:
    1. View and monitor inventory of cloud assets
    2. scan for sensitive data
    3. detect vulnerabilities and anomalous behavior
    4. review access rights
    5. inspect your current and past asset states
    6. provides insights on your resources, allowing you to understand your attack surface
    7. it’s native to Google Cloud and supports many 3rd party integrations
  3. General modules:
    1. Asset Discovery and Inventory
      1. Allows you to see all assets in the Organization, such as projects, asset types(new/current/changed), change types and IAM policies
    2. Sensitive Data Identification
      1. Integrates with Cloud DLP solution to provide you all the findings in the SCC
    3. Application vulnerability detection:
      1. Integration with Cloud Security Scanner, enabling you to ship CSS findings to the SCC dashboard
    4. Access control monitoring
      1. ensure the right access control policies are in place
      2. alerts when policies are misconfigured or changed
      3. natively integrates with Forseti
    5. Anomaly Detection from Google
      1. Identify Threats with built-in anomaly detection
        1. anomalies include botnets, crypto mining, generally suspicious network traffic, outbound DDOS traffic, unexpected reboots etc
    6. Third party integrations
      1. Allows you to integrate 3rd party tools with SCC, providing you a single pane of glass dashboard to manage security risks and threats of your GCP environment
    7. Allows you to setup real time notifications via cloud pub/sub notification integration
 
 
Forseti:
  1. Collection of community-driven, open source security tools, from which you can pick and choose one or multiple modules independently of each other.
  2. Designed for security at scale
  3. Allows you to create rule based policies and codify the process
  4. Ensures that the security of your environment is governed by consistent set of predefined rules
  5. You can monitor the resources of your choosing and have Forseti notify you when anything changes
  6. You can use enforcer mode, so that when someone makes a change that goes against the policies you’ve set up in Forseti, it’ll automatically revert the change and enforce your best practices
  7. Snapshots of the inventory are saved into Cloud SQL
 
 
 
 
DDOS Mitigation:
  1. There are mechanisms in place to protect the GCP cloud
  2. CSP responsibility:
    1. Ensure that no single service can overwhelm the shared infrastructure
    2. Provide isolation among customers using the shared infrastructure
    3. Deployed detection systems
    4. Implemented barriers
    5. Absorbing attacks by up-scaling
  3. Customer responsibility:
    1. Reduce the attack surface:
      1. Isolate and secure your network subnets, FW rules, tags and IAM
      2. Use FW rules and protocol forwarding
      3. Anti-spoofing protection is provided for the private network by default
      4. Automatic isolation between virtual networks
    2. Isolate Internal traffic from the external world:
      1. Deploy instances without public IPs unless necessary
      2. Setup a NAT gateway or SSH bastion host to limit the number of instances exposed to the internet
      3. Deploy internal LB on internal client instances that access internally deployed services to avoid exposure to the external world
    3. Enable proxy-based LB
      1. HTTPS or SSL proxy load balancing allows Google infrastructure to mitigate many L4 and below attacks, such as SYN floods, IP fragment floods, port exhaustion etc
      2. Disperse attack across instances around the globe with HTTP/S load balancing to instances in multiple regions
    4. Scale to absorb the attacks
      1. Protection by GFE Infrastructure
        1. Global load balancing
        2. Scales to absorb certain types of attacks such as SYN floods
      2. Anycast-based LB
        1. HTTP/S LB and SSL proxy enable a single anycast IP to front-end
      3. Autoscaling
    5. Protection with CDN Offloading, where Google Cloud CDN acts as a proxy
    6. Deploy 3rd party DDOS protection solutions that integrate with GCP out of the box
    7. App Engine deployment:
      1. Fully multi-tenant system
      2. Safeguards in place
      3. Sits behind the GFE
      4. Specify a set of IPs / networks
    8. Google Cloud Storage:
      1. Use signed URLs to access Google Cloud Storage
    9. API Rate Limiting:
      1. Define the number of allowed requests to Compute Engine API
      2. API rate limits apply on per-project basis
      3. Projects are limited to an API rate limit of 20 requests/second
    10. Resource Quotas:
      1. Quotas help prevent unexpected spikes in usage
    11. Cloud Armor – works with global HTTP/S LB to provide defense against DDoS attacks, by using security policies that are made up of rules that allow or prohibit traffic from IP addresses or ranges defined in the rule:
      1. Is implemented at the edge of Google networks, and supports HTTP, HTTPS and HTTP/2
      2. Security policies are allow/deny list type of rules and can be specified for backend services
      3. You can deny/allow both precise IPs as well as whole CIDR ranges
      4. You can test and preview the rules without going live
      5. Logging module allows you to see triggered policy, associated action and related information
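The API rate limiting item above mentions a limit of 20 requests/second per project. A token bucket is one common way to enforce or respect such a limit, so here is a minimal sketch of one. I’m not claiming this is how Google enforces its limits; the burst capacity of 20 and the explicit `now` argument (used instead of a real clock so the example stays deterministic) are my own choices.

```python
class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) may proceed."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=20, capacity=20)
burst = [bucket.allow(now=0.0) for _ in range(25)]
print(burst.count(True), "allowed,", burst.count(False), "rejected")
```

A client that throttles itself like this stays under the quota instead of burning requests on rejected calls.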
 
 
Cloud Security Scanner:
  1. Is a web security scanner for common vulnerabilities in App Engine, Compute Engine and GKE applications
  2. Provides automatic vulnerability scanning, testing for issues such as XSS, mixed content, cleartext passwords, insecure JS libraries etc
  3. Available at no extra cost and has very low rates of false positives
  4. You can perform an immediate scan or schedule it to run on a periodic basis
  5. You should run it in a QA environment, as running it in prod can cause unexpected issues and hinder the UX for legitimate application users. Cloud Security Scanner has a fuzzer type of engine: it’ll try to play with your APIs, post comments and add posts, which is something you most likely don’t want to become visible in a publicly available instance of your application
 
 
 
Compute Engine Best Practices:
  1. All instances should run with service accounts, instead of giving users direct access to the instance
    1. Create new service accounts and do not use the default service accounts, so you can follow the principle of least privilege
    2. Grant users the serviceAccountUser role at the project level, to give them the ability to create/manage instances
      1. The permissions allow them to create an instance, attach a disk, set instance metadata, use SSH and reconfigure an instance to run as a service account
  2. Track how your CE resources are modified and accessed, so you always have an audit trail of who did what and when
  3. Networking:
    1. Instances that don’t need to communicate with each other should be placed in different VPC networks
  4. Image Management:
    1. Restrict the use of public images
    2. Allow only approved images, which are hardened with software approved by the security team
    3. Utilize the Trusted Image feature
    4. Set the configuration for images management on the Organization level
  5. Patch Management:
    1. In modern-day cloud environments you should strive for immutable infrastructure, so when you need to apply a patch or upgrade, consider replacing the CE instance instead of updating it
Google Kubernetes Engine (GKE) Security
  1. GKE is a managed environment for deploying containerized applications, grouping them into easily manageable units
  2. GKE handles:
    1. deployment
    2. auto-scaling
    3. updates
    4. load balancing
    5. auto recovery
  3. A short description of the GKE architecture:
    1. Cluster – consists of one cluster master and one or more worker machines called nodes
      1. Cluster Master runs the Kubernetes control plane processes such as Kubernetes API server, scheduler and core resource controllers
    2. Node – worker machine that runs containerized apps and other types of workloads. Each node is a Compute Engine instance provisioned by GKE during cluster creation
    3. Pod – smallest, simplest deployable object in Kubernetes. A pod represents a single instance of a running process in the cluster. It’s running on the Kubernetes nodes
  4. GKE Networking:
    1. Internal Cluster networking:
      1. Cluster IP is an IP address assigned to a service, and is stable for its lifetime
      2. Node IP – IP address assigned to a given node, comes from cluster’s VPC network and each node has a pool of IP addresses to assign to its pods
      3. Pod IP – IP address assigned to a given pod, shared with all containers in that pod and pods IPs are ephemeral by their nature
      4. Label – arbitrary key/value pair attached to an object
      5. Service – grouping of multiple related pods into a logical unit using labels
      6. Stable IP address, DNS entry and ports
      7. Provides load balancing among the set of pods whose labels match all the labels defined in the label selector when the service is created
      8. kube-proxy – a component running on each node that manages connectivity between pods and services
        1. egress-based LB controller
        2. continually maps the cluster IP to healthy pods
      9. Namespace – virtual clusters backed by the same physical cluster
        1. Intended for use in environments with many users spread across multiple projects or teams such as dev, qa, production
        2. a way to divide cluster resources between multiple users
        3. unique name within the cluster
  5. Kubernetes Security:
    1. Authentication and Authorization
      1. Use service accounts at the pod level
      2. Disable attribute-based access control (ABAC) and use role-based access control (RBAC)
        1. RBAC allows you to grant permissions to resources at the cluster/namespace level
      3. Follow the principle of least privilege and reduce node service account scopes
    2. Control Plane Security
      1. Components are managed and maintained by Google
      2. Disable the Kubernetes Web UI
      3. Disable authentication with client certs
      4. Rotate credentials on a regular basis
    3. Node Security
      1. Use container-optimized OS for enhanced security
        1. i.e. locked down firewall
        2. read-only filesystem wherever possible
        3. limited user accounts and disabled root login
      2. Patch OS on regular basis, ideally with enabled automatic upgrades
      3. Protect the OS on the node from untrusted workloads running in the pods
      4. Use metadata concealment to ensure pods do not have access to sensitive data
    4. Network Security principles:
      1. All pods in the cluster can communicate with each other
      2. Restrict ingress/egress traffic of pods using network policies in a namespace
      3. Load balance pods with a service of type LoadBalancer
      4. Restrict which IP address ranges can access endpoints
      5. Filter authorized traffic using kube-proxy
      6. Use Cloud Armor / IAP Proxy when using external HTTPS LB
    5. Securing workloads:
      1. Limit pod container process privileges using PodSecurityPolicies
      2. To give pods access to GCP resources
        1. workload identity
        2. node service account
    6. Best practices for container security:
      1. Package a single application per container
      2. Create a process for managing zombie processes
      3. Optimize for the docker build cache, to allow accelerated building later on
      4. Remove unnecessary tools
      5. Build the smallest, most lightweight image possible
      6. Properly tag your images
      7. Use trusted images and be careful when using public images
      8. Use container registry vulnerability scanning to analyze container images
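The way a Service picks its pods (described under GKE Networking: a pod is selected when its labels match all the labels in the selector) can be sketched as a simple filter. The pod names and labels below are invented for illustration, and the sketch covers only equality-based selectors.

```python
def select_pods(pods, selector):
    """Return names of pods whose labels contain every key/value pair in the selector."""
    return [
        pod["name"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods in a cluster.
pods = [
    {"name": "web-1", "labels": {"app": "web", "env": "prod"}},
    {"name": "web-2", "labels": {"app": "web", "env": "qa"}},
    {"name": "db-1", "labels": {"app": "db", "env": "prod"}},
]

print(select_pods(pods, {"app": "web", "env": "prod"}))
print(select_pods(pods, {"env": "prod"}))
```

Because matching is purely label-based, a mislabelled pod quietly joins a Service’s backends, which is one more reason to keep labels under review.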
Secrets Management:
  1. Common concerns while using secrets such as passwords, tokens, API keys, private keys etc:
    1. Authorization and access management
    2. Auditability on per-secret level
    3. Encryption at rest and protection in case of unauthorized access
    4. Rotation of secrets automatically or on-demand
    5. Isolation, separation, management vs usage of secrets, separation of duties
  2. Using secrets encrypted in code with Cloud KMS
    1. Encrypt secrets at the application layer
    2. Limit the scope of access to the secret
    3. Access is restricted to developers who have access to the code
    4. Reading a secret requires access to both the code and the encryption key
    5. Usage is audited for those who do have access
  3. Cloud Storage bucket, encrypted at rest
    1. Limits access to smaller set of developers
    2. Auditable
    3. Separation of systems: secrets are stored separately from the code repository
    4. Able to rotate secrets easily
  4. Third party solutions
    1. Dedicated secret management tools
    2. Auditable
    3. Separation of systems
    4. Rotate secrets automatically
  5. Changing secrets:
    1. Rotating secrets
      1. Rotate secrets regularly
      2. Store a few versions of a secret
      3. Rotate/rollback if needed
    2. Caching secrets locally
      1. May be required by the application
      2. Can be rotated frequently, even several times per hour
      3. Can refresh secrets quickly
    3. Separate solution or platform
      1. platform agnostic
      2. automatic and scheduled secret rotation
  6. Managing Access:
    1. Limiting Access:
      1. Create two projects, one for Cloud Storage to store secrets and one for Cloud KMS to manage encryption keys
      2. Assign roles to access secrets; you can use service accounts for that
      3. Store each secret as an encrypted object in Cloud Storage, group them as needed
      4. Rotate secrets and encryption keys regularly
      5. Protect each bucket by using encryption. Although buckets have default encryption, it’s recommended to use Cloud KMS at the application layer
      6. Enable Cloud Audit Logs for activity monitoring
    2. Restricting and enforcing access:
      1. Access controls on the bucket in which the secret is stored
        1. Support for multiple secrets/objects per bucket
        2. single secret per bucket
      2. Access controls on the key that encrypted the bucket in which the secret is stored
        1. Support for multiple secrets/objects per key
        2. single secret per key
      3. Usage of service accounts is encouraged
    3. Best Practices:
      1. Limit the amount of data that one encryption key protects:
        1. Cryptographic isolation
        2. Allows for more granular control over secret access
        3. Helps prevent accidental permissions
        4. Supports more granular auditing
      2. Store each secret as its own object
      3. Store similar secrets in the same bucket
      4. One encryption key per bucket
      5. Regularly rotate keys and secrets to limit the lifecycle of each
      6. Enable Cloud Audit logging
    4. Kubernetes Secret Management:
      1. Generic – local file, directory or literal value
      2. dockercfg secret for use with a Docker registry
      3. TLS secret from an existing public/private key pair
      4. Secret values are encoded in base64
      5. Encrypt secrets at the application layer using KMS keys
      6. Use secrets by:
        1. Specifying environment variables that reference the secrets value
        2. Mounting a volume containing the secret
      7. Third party solutions can be used
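
A quick illustration of why "secret values are encoded in base64" does not mean they are protected, and why application-layer encryption is still recommended. This is a minimal sketch: the secret name, namespace and value are invented, and `build_secret_manifest` is a hypothetical helper, not part of any library.

```python
import base64
import json


def build_secret_manifest(name, namespace, values):
    """Build a generic (Opaque) Kubernetes Secret manifest from plain strings."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,
    }


# Example values are assumptions made up for this sketch.
manifest = build_secret_manifest(
    "db-credentials", "payments", {"password": "s3cr3t-example"}
)
print(json.dumps(manifest, indent=2))

# base64 is an encoding, not encryption: no key is needed to reverse it.
recovered = base64.b64decode(manifest["data"]["password"]).decode("utf-8")
print(recovered)
```

Anyone who can read the manifest can recover the value, so access to the manifest needs to be controlled as tightly as access to the secret itself.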
Cloud Storage and Storage Types:
  1. Cloud Storage offers four storage classes:
    1. Multi-regional
    2. Regional
    3. Nearline
    4. Coldline
  2. All storage classes provide:
    1. Low latency
    2. High durability
  3. Storage classes differ by availability, minimum storage duration and pricing
  4. The storage class set for an object affects its availability and pricing
  5. Object’s existing storage class can be changed:
    1. Rewriting the object
    2. Object lifecycle management
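
As an example of changing an object’s storage class through object lifecycle management, the sketch below generates a lifecycle configuration that moves objects to Nearline and later to Coldline based on age. The JSON layout follows the format I understand `gsutil lifecycle set` to accept; the day thresholds and the helper name `build_lifecycle_config` are assumptions, so check the result against the current documentation before using it.

```python
import json


def build_lifecycle_config(nearline_after_days, coldline_after_days):
    """Build a lifecycle configuration with two SetStorageClass rules."""
    if not 0 <= nearline_after_days < coldline_after_days:
        raise ValueError("expected 0 <= nearline_after_days < coldline_after_days")
    return {
        "rule": [
            {
                # Move objects out of the "hot" classes once they are old enough.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {
                    "age": nearline_after_days,
                    "matchesStorageClass": ["MULTI_REGIONAL", "REGIONAL"],
                },
            },
            {
                # Later, move Nearline objects on to Coldline.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {
                    "age": coldline_after_days,
                    "matchesStorageClass": ["NEARLINE"],
                },
            },
        ]
    }


# 30 and 90 days are illustrative thresholds only.
config = build_lifecycle_config(30, 90)
print(json.dumps(config, indent=2))
```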
Cloud Storage Permissions and Access Control Lists
  1. IAM:
    1. Grant access to buckets as well as bulk access to bucket’s objects
    2. Can be added to project or bucket
    3. Broad control over buckets
      1. no fine-grained control
    4. set the minimum permissions needed
    5. recommended to set permissions for buckets
  2. ACLs:
    1. Customize access to individual objects within a bucket
    2. Can be added to bucket or object
    3. Fine-grained control over individual objects
    4. Work alongside IAM; the two supplement each other
    5. Public access can be granted to objects
    6. Defined by permissions and scope
    7. Permission can be:
      1. Owner
      2. Writer
      3. Reader
    8. Scope can be:
      1. Google account
      2. Google groups
      3. Convenience values for projects (such as viewers-project)
      4. GSuite/Cloud Identity domain
      5. All Google account holders
      6. AllUsers
    9. Default ACLs
      1. All new buckets assigned with a default ACL
      2. When default ACL for bucket is changed, it’s propagated to all objects
    10. Signed URLs
      1. a URL that provides time-limited read/write/delete access to an object in Cloud Storage
      2. those who have access to the URL can access the object for the duration of time specified
      3. no google account is needed for access
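
The idea behind signed URLs (time-limited access for whoever holds the URL, no Google account needed) can be sketched with a generic HMAC signature. Note that this is NOT the signing scheme Cloud Storage actually uses; it is a simplified stand-in to show the concept. The host name, key and function names are all made up.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def _signature(method, path, expires_at, secret_key):
    """HMAC-SHA256 over the method, object path and expiry timestamp."""
    payload = f"{method}\n{path}\n{expires_at}".encode("utf-8")
    return hmac.new(secret_key, payload, hashlib.sha256).hexdigest()


def sign_url(base_url, method, expires_at, secret_key):
    """Return base_url with an expiry timestamp and signature appended."""
    signature = _signature(method, urlsplit(base_url).path, expires_at, secret_key)
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{base_url}?{query}"


def verify_url(url, method, now, secret_key):
    """Accept the request only if the URL is unexpired and untampered."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(method, parts.path, expires_at, secret_key)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


# Hypothetical key, host and timestamps (seconds since the epoch).
KEY = b"demo-signing-key"
url = sign_url("https://storage.example.test/my-bucket/report.pdf", "GET", 1_700_000_600, KEY)
print(verify_url(url, "GET", 1_700_000_000, KEY))     # True: before expiry
print(verify_url(url, "GET", 1_700_000_601, KEY))     # False: expired
print(verify_url(url, "DELETE", 1_700_000_000, KEY))  # False: method not covered
```

The signature covers the method, the path and the expiry, so changing any of them invalidates the URL. That is the property that makes it safe to hand the URL to someone without an account.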
Data Retention Policies using Bucket Lock
  1. Allows you to configure a data retention policy for a Cloud Storage bucket to govern how long objects in the bucket must be retained. The feature also allows you to lock the data retention policy, permanently preventing the policy from being reduced or removed
  2. Used for Write Once Read Many (WORM) storage
  3. Prevents deletion or modification of data for a specified time period
  4. Helps meet compliance, legal and regulatory requirements for data retention
  5. Works with all tiers of Cloud Storage
  6. Lifecycle policies can be applied to automatically move locked data to colder storage classes
  7. Retention policies
    1. Can be included when creating a new bucket
    2. Add a retention policy to an existing bucket
    3. Ensures that all current and future objects in the bucket cannot be deleted or overwritten until they reach the age defined in the policy
    4. Tracked by retention expiration time metadata
  8. Retention periods:
    1. Measured in seconds
    2. Can be set in days, months or years
    3. Maximum is 100 years
  9. Retention policy locks:
    1. Prevent the policy from ever being removed and retention period from ever being reduced
    2. Once a retention policy is locked, you cannot delete the bucket until every object has met the retention period
    3. Locking a retention policy is irreversible
  10. Object holds:
    1. metadata flags that are placed on individual objects
    2. Objects with holds cannot be deleted
      1. Event-based holds
      2. temporary holds
    3. Event-based holds can be used in conjunction with retention policies to control retention based on event occurrences
    4. Temporary holds can be used for regulatory or legal investigation purposes
    5. An object can have one type of hold, both, or neither
  11. Compliance:
    1. Can be used to comply with financial institution regulatory requirements for electronic record retention such as SEC, FINRA etc
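
The interaction between retention periods and object holds can be modelled in a few lines. This is a simplified sketch of the rules listed above, not the service’s actual implementation: it treats an object as deletable once its age reaches the retention period, treats any hold as blocking deletion, and ignores any effect that releasing an event-based hold may have on the retention clock. All names and dates are illustrative.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86_400


def retention_expiration(created_at, retention_seconds):
    """Earliest time at which the retention policy allows deletion."""
    return created_at + timedelta(seconds=retention_seconds)


def can_delete(created_at, retention_seconds, now,
               event_based_hold=False, temporary_hold=False):
    """Return True if neither a hold nor the retention policy blocks deletion."""
    if event_based_hold or temporary_hold:
        return False
    return now >= retention_expiration(created_at, retention_seconds)


# A 30-day retention period, expressed in seconds as the policy stores it.
created = datetime(2020, 1, 1, tzinfo=timezone.utc)
period = 30 * SECONDS_PER_DAY

print(retention_expiration(created, period).isoformat())
print(can_delete(created, period, datetime(2020, 1, 15, tzinfo=timezone.utc)))
print(can_delete(created, period, datetime(2020, 2, 1, tzinfo=timezone.utc)))
print(can_delete(created, period, datetime(2020, 2, 1, tzinfo=timezone.utc),
                 temporary_hold=True))
```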
BigQuery Security:
  1. Integrates with DLP, Cloud Storage and Stackdriver
  2. Authorized Views:
    1. View access to a dataset
    2. Cannot assign access controls directly to tables or views
    3. Lowest level is the dataset level
    4. Allows you to share query results with users/groups
    5. Restricts access to the underlying tables
    6. Allows you to use the view’s SQL query to restrict the columns users are able to query
    7. Must be created in a separate dataset
  3. Exporting data:
    1. Can be exported to CSV, JSON, Avro
    2. Up to 1 GB of data per exported file
    3. Can only export to Cloud Storage
  4. Datasets can be scanned for PII with DLP
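
To show how a view’s SQL query restricts the columns users can see, here is a small sketch that generates the `CREATE VIEW` statement for a view living in a separate dataset from its source table. The project, dataset, table and column names are invented, and `build_view_ddl` is a hypothetical helper. Creating the view is only half of the job: the view still has to be authorized on the source dataset, which this sketch does not do.

```python
import re

# Conservative pattern for dataset, table and column names in this sketch.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_view_ddl(project, view_dataset, view_name,
                   source_dataset, source_table, columns):
    """Return a CREATE VIEW statement exposing only the listed columns."""
    if not columns:
        raise ValueError("at least one column is required")
    for identifier in (view_dataset, view_name, source_dataset, source_table, *columns):
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"unexpected identifier: {identifier!r}")
    if view_dataset == source_dataset:
        # Per the notes above, the view must be created in a separate dataset.
        raise ValueError("the view must live in a different dataset than its source")
    column_list = ", ".join(columns)
    return (
        f"CREATE VIEW `{project}.{view_dataset}.{view_name}` AS\n"
        f"SELECT {column_list}\n"
        f"FROM `{project}.{source_dataset}.{source_table}`"
    )


# All names below are assumptions made up for illustration.
ddl = build_view_ddl(
    "my-project", "shared_views", "customer_summary",
    "raw_data", "customers", ["customer_id", "country"],
)
print(ddl)
```

Users are then given access to the `shared_views` dataset only, so sensitive columns in `raw_data.customers` stay out of reach.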
Stackdriver:
  1. A set of tools for logging, debugging and monitoring
  2. Available for GCP and AWS
  3. Provides VM monitoring with agents
  4. Stackdriver products:
    1. Stackdriver Monitoring – metrics, time series, health checks, alerts
    2. Stackdriver Logging – central aggregation of all log activity
    3. Stackdriver Error Reporting – Identify and understand application errors
    4. Stackdriver Debugger – identify code errors in production
    5. Stackdriver Trace – find performance bottlenecks in production
    6. Stackdriver Profiler – identify CPU, memory and time consumption patterns
  5. Integration with 3rd party products within one view
  6. Stackdriver Logging:
    1. Central repository for log data from multiple sources
    2. Real-time log management and analysis
    3. Tight integration with monitoring
    4. Platform, system and application logs
    5. Export logs to other sources for long-term storage and analysis
    6. General ideas:
      1. associated primarily with GCP projects
        1. Logs Viewer only shows logs from one project
      2. Log Entry records a status or an event
        1. Project receives log entries when services being used produce log entries
      3. Logs are a named collection of log entries within a GCP resource
        1. Each log entry includes the name of its log
        2. Logs only exist if they have log entries
      4. Retention period – length of time for which logs are kept
    7. Types of logs:
      1. Audit Logs:
        1. who did what, where and when
        2. Admin activity
        3. Data access
        4. System events
      2. Access Transparency Logs
        1. Actions taken by Google staff when accessing your data
      3. Agent logs:
        1. Logging agents run on VMs
        2. They send system and third-party logs from the VM instance to Stackdriver Logging
    8. Audit log types:
      1. Admin activity logs:
        1. API calls or other administrative actions
        2. always written
        3. cannot disable or configure them
        4. no charge
      2. Data Access Logs
        1. API calls that create, modify or read resource data provided by the user
        2. Disabled by default
        3. Must be explicitly enabled
        4. Charges apply
      3. Audit logs:
        1. System event Audit Logs:
          1. GCP administrative actions
          2. Generated by Google, not by user action
          3. Always written
          4. Cannot disable/configure them
          5. No charge
        2. Access Transparency Logs
          1. Actions taken by Google staff when accessing your data
            1. Investigations into your support requests
            2. Investigations recovering from an outage
          2. Enabled for entire Organization
          3. Enterprise support is needed, as that is when such activity can happen
        3. Agent Logs:
          1. Sends system and 3rd party logs on the VM to Stackdriver Logging
          2. Charges apply
    9. IAM Roles:
      1. Logging Admin – full control, and able to add other members
      2. Logs Viewer – only view logs
      3. Private Logs Viewer – view logs, including private logs
      4. Logs Writer – grants a service account permission to write logs
      5. Logs Configuration Writer – create metrics and export sinks (for extended storage, big data analytics, streaming to other apps/systems)
  7. VPC Flow Logs – record a sample of network flows sent from/received by VM instances. It’s useful for network monitoring, forensics and real-time security analysis
    1. Can be viewed through Stackdriver Logging
    2. Aggregated by connection from VMs and exported in real time
    3. Subscribing to Cloud Pub/Sub enables streaming so that flow logs can be analyzed in real time
    4. Enable/disable per VPC subnet
    5. Each flow record covers all TCP and UDP flows
    6. Filters can be applied to select which flow logs should be excluded from Stackdriver Logging and exported to external APIs
    7. No delay in monitoring, as Flow Logs are native to the GCP network stack
    8. Collected for each VM at specified intervals
    9. All packets are collected for a given interval and aggregated into a single flow log entry
  8. Stackdriver Monitoring
    1. Full stack monitoring for GCP, AWS and 3rd party apps
    2. Provides single pane of glass dashboarding, integrates with Stackdriver Logging
    3. Monitoring agent:
      1. gathers system and application metrics from VM
      2. Without the agent on VM, only CPU/disk traffic/network traffic and uptime metrics are collected
      3. Can monitor many 3rd party apps
    4. Can monitor GKE clusters starting from general cluster metrics to inspection of services, nodes, pods and containers
    5. Alerting:
      1. Policies can be defined to alert you when a service is considered unhealthy (depends on the criteria you’ve specified)
      2. Allows notification through email, PagerDuty, Slack, SMS
  9. Stackdriver APM – set of tools that work with code/apps running on cloud and on-premise infrastructure. Helps monitor and manage application performance
    1. Consists of Stackdriver [Trace/Debugger/Profiler]
    2. It’s a set of tools used by Google’s Site Reliability Engineering Team
    3. Stackdriver Trace – helps understand how long it takes the application to handle incoming requests
    4. Stackdriver Debugger:
      1. debug a running app without slowing it down thanks to an option to create a snapshot i.e. capture and inspect the call stack and local variables in the application
      2. inject logging into running services at available logpoints
    5. Stackdriver Profiler:
      1. continuously gathers CPU usage and memory allocation information from your applications
      2. helps discover patterns of resource consumption
    6. Stackdriver Error Reporting:
      1. Real time error monitoring and alerting
      2. Counts, analyzes and aggregates the crashes in the GCP environment
      3. Alerts when a new application error happens
  10. Logs exports:
    1. You can export the logs (defined by your query) to:
      1. Cloud Storage
      2. BigQuery
      3. Cloud Pub/Sub – useful for exporting to SIEM-like systems
    2. Logs exports aren’t charged
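
Log exports are driven by a query (filter), so here is a minimal sketch of building one that selects audit logs for a project. The log IDs are the audit log names as I know them (`cloudaudit.googleapis.com/activity`, `/data_access`, `/system_event`, with the slash URL-encoded inside `logName`); the project ID and helper names are assumptions, and the filter syntax is worth double-checking against the Logging documentation.

```python
from urllib.parse import quote

# Audit log IDs, keyed by a short label used only in this sketch.
AUDIT_LOG_IDS = {
    "admin_activity": "cloudaudit.googleapis.com/activity",
    "data_access": "cloudaudit.googleapis.com/data_access",
    "system_event": "cloudaudit.googleapis.com/system_event",
}


def audit_log_name(project_id, kind):
    """Return the full log name, with the log ID URL-encoded."""
    log_id = quote(AUDIT_LOG_IDS[kind], safe="")
    return f"projects/{project_id}/logs/{log_id}"


def build_export_filter(project_id, kinds):
    """Build a filter matching any of the requested audit log kinds."""
    clauses = [f'logName="{audit_log_name(project_id, kind)}"' for kind in kinds]
    return " OR ".join(clauses)


# "my-project" is a placeholder project ID.
export_filter = build_export_filter("my-project", ["admin_activity", "data_access"])
print(export_filter)
```

A filter like this would be attached to an export sink pointing at Cloud Storage, BigQuery or Cloud Pub/Sub, depending on where the logs should end up.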
Cloud Responsibility Model
  1. Security of the cloud – Google
  2. Security in the cloud – User

A big credit for this contribution goes to the bloggers and platforms from which I’ve learnt a ton, to name a few: Pluralsight, Cybrary, Pentester Academy, Linux Academy, Coursera, Udemy, infosecacademy.

Hope this is helpful.
