SaFi Bank Space : Cloud Storage Bucket data lifecycles

Cloud Storage provides globally unified, scalable, and highly durable object storage. Cloud Storage buckets can be created in a single region, dual regions, or multi-regions within a continent.

If a zone experiences an outage, data in the unavailable zone is automatically and transparently served from elsewhere in the region. Data and metadata are stored redundantly across zones, starting with the initial write. No writes are lost upon a zone becoming unavailable.

In the case of a regional outage, regional buckets in that region are offline until the region becomes available again.

When higher availability is required, you should consider storing data in a dual-region or multi-region configuration. Cloud Storage uses Cloud Load Balancing to serve dual-region and multi-region buckets from different regions. In the case of a regional outage, serving is not interrupted.

Cloud Storage dual-region and multi-region configurations replicate written data synchronously to another zone within the same region and asynchronously to another region or regions. For more information, see Bucket locations in the Cloud Storage documentation.

During a regional outage, data that was recently written to the affected region may not have been replicated to other regions. As a result, that data may not be accessible during the outage, and could be lost in the case of physical destruction of the data in the affected region.
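As a minimal sketch, the three location types could be declared in Terraform as follows. The resource and bucket names are hypothetical, and the specific locations (ASIA-SOUTHEAST1, the predefined dual-region ASIA1, and the ASIA multi-region) are only example choices, not a recommendation from this page.

```hcl
# Hypothetical names; bucket names must be globally unique.

# Regional bucket: unavailable while its region is unavailable.
resource "google_storage_bucket" "regional_example" {
  name     = "bucket-example-regional"
  location = "ASIA-SOUTHEAST1"

  uniform_bucket_level_access = true
}

# Predefined dual-region bucket (ASIA1 = asia-northeast1 and asia-northeast2).
resource "google_storage_bucket" "dual_region_example" {
  name     = "bucket-example-dual-region"
  location = "ASIA1"

  uniform_bucket_level_access = true
}

# Multi-region bucket: data is stored in several regions within Asia.
resource "google_storage_bucket" "multi_region_example" {
  name     = "bucket-example-multi-region"
  location = "ASIA"

  uniform_bucket_level_access = true
}
```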

Note: Cloud Storage buckets are often used as backup storage for other GCP services. Cloud Storage has unlimited storage with no minimum object size.

Available storage classes

The following table summarizes the primary storage classes offered by Cloud Storage.

Storage class    | Name for APIs and gsutil | Minimum storage duration | Typical monthly availability
-----------------|--------------------------|--------------------------|---------------------------------------------------------------
Standard storage | STANDARD                 | None                     | >99.99% in multi-regions and dual-regions; 99.99% in regions
Nearline storage | NEARLINE                 | 30 days                  | 99.95% in multi-regions and dual-regions; 99.9% in regions
Coldline storage | COLDLINE                 | 90 days                  | 99.95% in multi-regions and dual-regions; 99.9% in regions
Archive storage  | ARCHIVE                  | 365 days                 | 99.95% in multi-regions and dual-regions; 99.9% in regions

See the class descriptions for the availability SLA for each storage class.
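A bucket's default storage class can be set with the storage_class argument, using the API names from the table above. The following is a minimal sketch; the resource and bucket names are hypothetical, and NEARLINE is only an example choice for infrequently read backups.

```hcl
resource "google_storage_bucket" "backup_example" {
  name     = "bucket-example-backups" # hypothetical name
  location = "ASIA"

  # Default class for new objects. Nearline has a 30-day minimum
  # storage duration, so it suits data that is kept at least that long.
  storage_class = "NEARLINE"

  uniform_bucket_level_access = true
}
```

If storage_class is omitted, the bucket defaults to STANDARD.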

Bucket retention policies and policy locks

By default you can delete and replace objects in the bucket at any time.

  • Retention policies allow you to configure a data retention policy for a Cloud Storage bucket that governs how long objects in the bucket must be retained.

    • You can add a retention policy to a bucket to specify a retention period.

      • If a bucket has a retention policy, objects in the bucket can only be deleted or replaced once their age is greater than the retention period.

      • A retention policy retroactively applies to existing objects in the bucket as well as new objects added to the bucket.

  • Retention policy locks allow you to lock the data retention policy, permanently preventing the policy from being reduced or removed.
    This feature can provide immutable storage on Cloud Storage. In conjunction with Detailed audit logging mode, which logs Cloud Storage request and response details, Bucket Lock can help with regulatory and compliance requirements, such as those associated with FINRA, SEC, and CFTC. Bucket Lock may also help you address certain health care industry retention regulations.

    • You can lock a retention policy to permanently set it on the bucket.
      Once you lock a retention policy, you cannot remove it or reduce the retention period it has.

    • You cannot delete a bucket with a locked retention policy unless every object in the bucket has met the retention period.

    • You can increase the retention period of a locked retention policy.

    • Locking a retention policy can help your data comply with record retention regulations.

      Warning: Locking a retention policy is an irreversible action. Once locked, you must delete the entire bucket in order to "remove" the bucket's retention policy. However, before you can delete the bucket, you must be able to delete all the objects in the bucket, which itself is only possible if all the objects have reached the retention period set by the retention policy.

You can configure your bucket retention policies according to your desired RTO & RPO.

Retention policies

You can include a retention policy when creating a new bucket, or you can add a retention policy to an existing bucket. Placing a retention policy on a bucket ensures that all current and future objects in the bucket cannot be deleted or replaced until they reach the age you define in the retention policy. Attempts to delete or replace objects whose age is less than the retention period fail with a 403 - retentionPolicyNotMet error.

To help track when individual objects are eligible for deletion, objects in a bucket with a retention policy each have retention expiration time metadata. This piece of metadata shows the date and time when an object fulfills the retention period.

When working with retention policies, keep in mind the following:

  • Unless the retention policy is locked, you can increase, decrease, or remove the retention policy from a bucket.

  • Changing a retention policy is considered a single Class A operation, regardless of the number of objects affected.

  • An object's editable metadata is not subject to the retention policy and can be modified even when the object itself cannot be.

  • A retention policy contains an effective time, the time after which all objects in the bucket are guaranteed to be in compliance with the retention period.

  • To see the earliest date when a given object is eligible for deletion in a bucket with a retention policy, view the retention expiration date portion of the object's metadata.

The following are interactions that retention policies have with other Cloud Storage features:

  • Retention policies and Object Versioning are mutually exclusive features in Cloud Storage: for a given bucket, only one of these can be enabled at a time. Any versioned objects remaining in a bucket when you apply a retention policy are also protected by the retention policy.

  • You can use Object Lifecycle Management to automatically delete objects in a bucket, including in a bucket with a locked policy. A lifecycle rule won't delete an object until after the object fulfills the retention policy.

  • You should not perform parallel composite uploads if your bucket has a retention policy, because the component pieces cannot be deleted until each has met the bucket's minimum retention period.

  • Attempting to complete an XML API multipart upload fails if the resulting object would overwrite an object that has not yet met its retention period.

  • You can use the retention policy constraint in your organization policies to require that retention policies with specific retention periods be included when creating a new bucket or when adding or updating a retention policy on an existing bucket.

Retention periods

Retention periods are measured in seconds; however, some tools, such as the Google Cloud console and gsutil, allow you to set and view retention periods in other units of time for convenience. The following conversions apply in such cases:

  • A day is considered to be 86,400 seconds.

  • A month is considered to be 31 days, which is 2,678,400 seconds.

  • A year is considered to be 365.25 days, which is 31,557,600 seconds.

You can set a maximum retention period of 3,155,760,000 seconds (100 years).

For gsutil, when specifying a retention period, you specify an integer and a unit, where the unit can be s, d, m, or y to signify seconds, days, months, or years, respectively. Only one unit of time can be used in a command. For example, you can use 86400s or 1d, but you cannot use 1d30s.
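Terraform's retention_period argument takes seconds only, so the conversions above can be captured once in locals instead of hard-coding large numbers. This is a sketch; all local names are hypothetical, and the two example periods are illustrative rather than a policy stated on this page.

```hcl
locals {
  # Conversion factors as defined by Cloud Storage.
  seconds_per_day   = 86400
  seconds_per_month = 31 * local.seconds_per_day # 2,678,400 (31 days)
  seconds_per_year  = 31557600                   # 365.25 days

  # Example retention periods, in seconds.
  retention_30_days = 30 * local.seconds_per_day # 2,592,000
  retention_7_years = 7 * local.seconds_per_year # 220,903,200
}
```

A bucket could then use retention_period = local.retention_30_days inside its retention_policy block.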

Configure bucket retention policy via Terraform
resource "google_storage_bucket" "bucket_example" {
  
  name          = "bucket-example"
  location      = "ASIA"

  uniform_bucket_level_access = true

  ## Configuration of the bucket's data retention policy 
  ## for how long objects in the bucket should be retained.
  retention_policy {
    ## The period of time, in seconds, that objects in the bucket must be retained and 
    ## cannot be deleted, overwritten, or archived. The value must be less than 2,147,483,647 seconds.
    ## 2592000 sec = 30 days
    retention_period = 2592000
  }
  <...>
}

Retention policy locks

When you lock a retention policy on a bucket, you prevent the policy from ever being removed or the retention period from ever being reduced (although you can still increase the retention period). If you try to remove or reduce the policy duration of a locked bucket, you get a 400 BadRequestException error. Once a retention policy is locked, you cannot delete the bucket until every object in the bucket has met the retention period.

Locking a retention policy is irreversible, and you should be familiar with the implications of doing so prior to using this feature. When you use an unlocked retention policy, you have the ability to remove the policy, allowing you to still delete objects when desired. When you lock a retention policy, you must delete the entire bucket in order to "remove" the policy. However, you can't delete the bucket if there are objects in it that haven't fulfilled their retention period. Thus, to "remove" a locked retention policy, you have to wait until every object in the bucket has fulfilled its retention period, at which point you can delete the bucket.

Additionally, when you lock a retention policy, Cloud Storage automatically applies a lien to the projects.delete permission for the project that contains the bucket. While in place, the lien prevents the project from being deleted. To delete the project, you must first remove all such liens. Note that removing a lien requires the resourcemanager.projects.updateLiens permission, which is part of the roles/owner and roles/resourcemanager.lienModifier roles.

Configure bucket retention policy lock via Terraform
resource "google_storage_bucket" "bucket_example" {
  
  name          = "bucket-example"
  location      = "ASIA"

  uniform_bucket_level_access = true

  ## Configuration of the bucket's data retention policy 
  ## for how long objects in the bucket should be retained.
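  retention_policy {
    ## retention_period is required whenever a retention_policy block is set.
    ## Example value: 2592000 sec = 30 days
    retention_period = 2592000
    ## The bucket will be locked and permanently restrict edits to the bucket's retention policy. 
    ## Caution: Locking a bucket is an irreversible action.
    is_locked = true
  }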
  <...>
}

Object versioning

Object Versioning can be enabled on a bucket in order to retain older versions of objects. When the live version of an object is deleted or replaced, it becomes noncurrent if versioning is enabled on the bucket. If you accidentally delete a live object version, you can restore the noncurrent version of it back to the live version.

Caution: Object Versioning does not protect your data if you delete the entire bucket.

Object Versioning increases storage costs, but this can be partially mitigated by configuring Object Lifecycle Management to delete older object versions. For one possible setup, see the lifecycle configuration example for deleting objects.

Configure bucket with object versioning via Terraform
resource "google_storage_bucket" "bucket_example" {
  
  name          = "bucket-example"
  location      = "ASIA"

  uniform_bucket_level_access = true

  versioning {
    ## While set to true, versioning is fully enabled for this bucket.
    enabled = true
  }
  <...>
}
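
The storage cost of Object Versioning can be limited by pairing it with lifecycle rules that delete noncurrent versions. The following is one possible setup, not the only one; the resource and bucket names are hypothetical, and the thresholds (3 newer versions, 30 days) are arbitrary example values.

```hcl
resource "google_storage_bucket" "versioned_example" {
  name     = "bucket-example-versioned" # hypothetical name
  location = "ASIA"

  uniform_bucket_level_access = true

  versioning {
    enabled = true
  }

  # Delete a noncurrent version once at least 3 newer versions exist.
  lifecycle_rule {
    action {
      type = "Delete"
    }
    condition {
      with_state         = "ARCHIVED" # noncurrent versions only
      num_newer_versions = 3
    }
  }

  # Delete a noncurrent version 30 days after it became noncurrent.
  lifecycle_rule {
    action {
      type = "Delete"
    }
    condition {
      days_since_noncurrent_time = 30
    }
  }
}
```

Because both rules use the Delete action, a noncurrent version is removed as soon as it matches either rule.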

Object Lifecycle Management

Object Lifecycle Management can be configured for a bucket, which gives you more automated control over deleting objects. When you define a lifecycle configuration, Cloud Storage performs a specified action on an object only if the object meets your criteria.

Lifecycle configuration

Each lifecycle management configuration contains a set of rules. Each rule contains one action and one or more conditions.

  • An object has to match all of the conditions specified in a rule for the action in the rule to be taken.

  • If you specify multiple rules that contain the same action, the action is taken on an object when that object matches the conditions in any of the rules.

  • If multiple rules have their conditions satisfied simultaneously for a single object, Cloud Storage performs the action associated with only one of the rules, based on the following considerations:

    • The Delete action takes precedence over any SetStorageClass action.

    • The SetStorageClass action that switches the object to the storage class with the lowest at-rest storage pricing takes precedence.

    For example, if you have one rule that changes the object's class to Nearline storage and another rule that changes the object's class to Coldline storage, but both rules use the exact same condition, the object's class always changes to Coldline storage when the condition is met.

  • You should test your lifecycle rules on development data before applying to production to ensure your rules don't perform actions under unintended sets of conditions. If that's not possible, you should test on a small subset of your production data by using the MatchesPrefix or MatchesSuffix conditions in your rules.

  • Changes to a bucket's lifecycle configuration can take up to 24 hours to go into effect, and Object Lifecycle Management might still perform actions based on the old configuration during this time.

    For example, if you change an Age condition from 10 days to 20 days, an object that is 11 days old could be deleted by Object Lifecycle Management up to 24 hours later, due to the criteria of the old configuration.

For use cases, see Configuration examples for Object Lifecycle Management.
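
To illustrate how rules and conditions combine, here is a minimal Terraform sketch. The resource name, bucket name, the "tmp/" prefix, and all age values are hypothetical example values.

```hcl
resource "google_storage_bucket" "lifecycle_example" {
  name     = "bucket-example-lifecycle" # hypothetical name
  location = "ASIA"

  uniform_bucket_level_access = true

  # An object must match ALL conditions in a rule: here it must be
  # at least 365 days old AND its name must start with "tmp/".
  # Scoping a rule by prefix is also a way to trial it on a subset of data.
  lifecycle_rule {
    action {
      type = "Delete"
    }
    condition {
      age            = 365
      matches_prefix = ["tmp/"]
    }
  }

  # The next two rules share the same condition. Only one action is
  # applied: the class with the lowest at-rest price (Coldline) wins.
  lifecycle_rule {
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"
    }
    condition {
      age = 90
    }
  }

  lifecycle_rule {
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
    condition {
      age = 90
    }
  }
}
```

In practice the Nearline rule in this sketch is redundant; it is included only to show the precedence behavior.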

Lifecycle actions

A lifecycle rule specifies exactly one of the following actions:

  • Delete: deletes an object when the object meets all conditions specified in the lifecycle rule.

    Exception: In buckets with Object Versioning enabled, deleting the live version of an object causes it to become a noncurrent version, while deleting a noncurrent version deletes that version permanently. See the configuration for deleting objects for an example of using the Delete action along with Object Versioning.

    The Delete action does not take effect on an object while the object has an object hold placed on it or a retention policy that it has not yet fulfilled. As long as the conditions in the Delete action remain satisfied for the object, the Delete action occurs after any object hold is removed and any retention policy is fulfilled.
    Warning: Once an object is deleted, it cannot be undeleted. You should take care when setting up your lifecycle rules so that you do not cause more data to be deleted than you intend.

  • SetStorageClass: changes the storage class of an object and updates the object's modification time when the object meets all conditions specified in the lifecycle rule.
    SetStorageClass supports the following storage class transitions:

Original storage class                                         | New storage class
---------------------------------------------------------------|-------------------------------------------------------------------------------------------
Durable Reduced Availability (DRA) storage                     | Nearline storage, Coldline storage, Archive storage, Multi-Regional storage/Regional storage
Standard storage, Multi-Regional storage, or Regional storage  | Nearline storage, Coldline storage, Archive storage
Nearline storage                                               | Coldline storage, Archive storage
Coldline storage                                               | Archive storage

Cloud Storage does not validate correctness of the storage class transition. This means that you can specify a storage class transition not listed in the above table, but the transition will not occur. You should verify that your lifecycle rules use one of the listed storage class transitions.

  • AbortIncompleteMultipartUpload: aborts an incomplete multipart upload and deletes the associated parts when the multipart upload meets the conditions specified in the lifecycle rule.

    Only the following lifecycle conditions can be used with this action:

    • Age

    • MatchesPrefix

    • MatchesSuffix

    Attempting to create a rule that uses the AbortIncompleteMultipartUpload action in combination with other conditions results in an error.
    For more information, see the lifecycle conditions section of the Cloud Storage documentation.

To configure bucket lifecycle rules via Terraform, see the lifecycle_rule block in the documentation for the google_storage_bucket resource.
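
As a rough sketch of how the lifecycle actions described in this section might look in Terraform, the following bucket moves objects through progressively colder storage classes and cleans up abandoned multipart uploads. The resource name, bucket name, and all age thresholds are hypothetical example values; each transition uses only pairs listed in the supported transitions table.

```hcl
resource "google_storage_bucket" "tiering_example" {
  name     = "bucket-example-tiering" # hypothetical name
  location = "ASIA"

  uniform_bucket_level_access = true

  # Standard -> Nearline after 30 days.
  lifecycle_rule {
    action {
      type          = "SetStorageClass"
      storage_class = "NEARLINE"
    }
    condition {
      age                   = 30
      matches_storage_class = ["STANDARD"]
    }
  }

  # Nearline -> Coldline after 90 days.
  lifecycle_rule {
    action {
      type          = "SetStorageClass"
      storage_class = "COLDLINE"
    }
    condition {
      age                   = 90
      matches_storage_class = ["NEARLINE"]
    }
  }

  # Coldline -> Archive after 365 days.
  lifecycle_rule {
    action {
      type          = "SetStorageClass"
      storage_class = "ARCHIVE"
    }
    condition {
      age                   = 365
      matches_storage_class = ["COLDLINE"]
    }
  }

  # Abort multipart uploads left incomplete for 7 days.
  # Only age, matches_prefix, and matches_suffix are valid conditions here.
  lifecycle_rule {
    action {
      type = "AbortIncompleteMultipartUpload"
    }
    condition {
      age = 7
    }
  }
}
```

Keep in mind that a changed lifecycle configuration can take up to 24 hours to take effect, so a terraform apply does not change behavior immediately.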