melkorm's blog

Hosting Single Page Applications on GCP

With every Single Page Application, there is the problem of hosting it without any backend while making sure that SEO, caching and routing work correctly, which means:

The last point on the list is the hardest to implement correctly.

Now, let me show you how to do it properly on Google Cloud using Cloud Storage, Cloud Load Balancing and Cloud CDN. We will use Terraform as Infrastructure as Code; the same configuration can be applied as-is or clicked through in the GCP console.

Google storage bucket setup

This post assumes that you have already set up a google_compute_global_forwarding_rule with an HTTP to HTTPS redirect and a global IP address.
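If you do not have that part yet, here is a minimal sketch of what it could look like. It is not taken from a real setup: the resource names ("website", "https_redirect" and so on), the managed certificate and the example.org domain are placeholders chosen for illustration. It references google_compute_url_map.url_map, which is defined later in this post. To my knowledge, path_template_match and custom_error_response_policy need the EXTERNAL_MANAGED load balancing scheme, but verify this against the current GCP documentation.

```hcl
// Global IP address shared by the HTTP and HTTPS forwarding rules.
resource "google_compute_global_address" "website" {
  project = var.project
  name    = "website-ip" // placeholder name
}

// Google-managed certificate; the domain is a placeholder.
resource "google_compute_managed_ssl_certificate" "website" {
  project = var.project
  name    = "website-cert"

  managed {
    domains = ["example.org"]
  }
}

// HTTPS: proxy pointing at the url map defined later in this post.
resource "google_compute_target_https_proxy" "website" {
  project          = var.project
  name             = "website-https-proxy"
  url_map          = google_compute_url_map.url_map.id
  ssl_certificates = [google_compute_managed_ssl_certificate.website.id]
}

resource "google_compute_global_forwarding_rule" "https" {
  project               = var.project
  name                  = "website-https"
  target                = google_compute_target_https_proxy.website.id
  ip_address            = google_compute_global_address.website.id
  port_range            = "443"
  load_balancing_scheme = "EXTERNAL_MANAGED"
}

// HTTP: a separate url map whose only job is to redirect to HTTPS.
resource "google_compute_url_map" "https_redirect" {
  project = var.project
  name    = "website-https-redirect"

  default_url_redirect {
    https_redirect = true
    strip_query    = false
  }
}

resource "google_compute_target_http_proxy" "redirect" {
  project = var.project
  name    = "website-http-proxy"
  url_map = google_compute_url_map.https_redirect.id
}

resource "google_compute_global_forwarding_rule" "http" {
  project               = var.project
  name                  = "website-http"
  target                = google_compute_target_http_proxy.redirect.id
  ip_address            = google_compute_global_address.website.id
  port_range            = "80"
  load_balancing_scheme = "EXTERNAL_MANAGED"
}
```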

Let’s configure our bucket using google_storage_bucket resource:

resource "google_storage_bucket" "bucket" {
  project = var.project // your GCP project ID

  name          = "example-bucket-name" // unique across GCP
  location      = "EU"
  force_destroy = false

  uniform_bucket_level_access = true
}

This bucket serves as the origin for all traffic. The actual routing to index.html will be configured in the Load Balancer setup below. To allow users to access it, we need to set up a google_storage_bucket_iam_member resource.

Warning: all files uploaded to website’s bucket are public!

Setting up access policy:

resource "google_storage_bucket_iam_member" "public_rule" {
  bucket = google_storage_bucket.bucket.name
  role   = "roles/storage.objectViewer"
  member = "allUsers"
}

This way we allow everyone to view the files in our bucket.

Now we need to define a backend that our load balancer will use to route traffic to the bucket. For this we can use the google_compute_backend_bucket resource:

resource "google_compute_backend_bucket" "bucket" {
  project = var.project

  name = "website-backend-bucket"

  description      = "Backend bucket for SPA"
  bucket_name      = google_storage_bucket.bucket.name
  enable_cdn       = true
  compression_mode = "AUTOMATIC" // enable compression over the wire
  cdn_policy {
    default_ttl = 3600 * 24 * 30 // one month cache
    max_ttl     = 3600 * 24 * 30 // one month cache
  }
}

This default_ttl (one month) applies primarily to static assets that do not have their own explicit Cache-Control metadata set when uploaded. We will override this for index.html in the deployment step. This also assumes that static assets get a random hash added to their file names during the build.

Load balancer setup

resource "google_compute_url_map" "url_map" {
  project = var.project

  name = "website-url-map"

  default_service = google_compute_backend_bucket.bucket.id

  host_rule {
    hosts        = ["example.org"]
    path_matcher = "website"
  }

  path_matcher {
    name            = "website"
    default_service = google_compute_backend_bucket.bucket.id

    dynamic "route_rules" {
      for_each = ["/images", "/public"] // static assets paths
      content {
        priority = 10 + route_rules.key
        service  = google_compute_backend_bucket.bucket.id

        match_rules {
          prefix_match = route_rules.value
        }
      }
    }

    route_rules {
      priority = 20
      service  = google_compute_backend_bucket.bucket.id

      match_rules {
        prefix_match = "/404.html"
      }

      custom_error_response_policy {
        error_response_rule {
          match_response_codes   = ["404"]
          path                   = "/index.html"
          override_response_code = 404
        }
        error_service = google_compute_backend_bucket.bucket.id
      }
    }

    route_rules {
      priority = 30
      service  = google_compute_backend_bucket.bucket.id

      match_rules {
        ignore_case         = true
        path_template_match = "/**"
      }

      route_action {
        url_rewrite {
          path_template_rewrite = "/index.html"
        }
      }
    }
  }
}

There is a lot going on in this snippet, so let’s break it down.

route_rules are evaluated in priority order, from the lowest number (highest priority) to the highest number (lowest priority). This means priority 1 is evaluated before priority 2. We space the priorities ten apart so that other rules can be added in between later without renumbering the existing ones.

The first rule is:

dynamic "route_rules" {
  for_each = ["/images", "/public"] // static assets paths
  content {
    priority = 10 + route_rules.key
    service  = google_compute_backend_bucket.bucket.id

    match_rules {
      prefix_match = route_rules.value
    }
  }
}

The rule is dynamic so that it covers every path we use for static files. We need to make sure that requests for assets go straight to the storage bucket and are not rewritten to index.html. This also assumes that the website doesn’t host any static files on the root path. If it does, for example robots.txt or a favicon, those paths need to be added to the list as well.

Next we set up our 404 rule. We need an explicit rule on the load balancer that returns a 404 HTTP status code when needed while still rendering our SPA.
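As a sketch of that last point, the list could be extended with root-level files like this. The two extra entries are hypothetical examples, not files from the original setup. Since the priority is computed as 10 plus the list index, the list has to stay under ten entries so it doesn't collide with the rule at priority 20.

```hcl
dynamic "route_rules" {
  // Directories plus root-level files that must be served as-is.
  // "/robots.txt" and "/favicon.ico" are illustrative additions.
  for_each = ["/images", "/public", "/robots.txt", "/favicon.ico"]
  content {
    priority = 10 + route_rules.key // 10, 11, 12, 13
    service  = google_compute_backend_bucket.bucket.id

    match_rules {
      prefix_match = route_rules.value
    }
  }
}
```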

route_rules {
  priority = 20
  service  = google_compute_backend_bucket.bucket.id

  match_rules {
    prefix_match = "/404.html" // path in our SPA app to render not found page
  }

  custom_error_response_policy {
    error_response_rule {
      match_response_codes   = ["404"]
      path                   = "/index.html"
      override_response_code = 404
    }
    error_service = google_compute_backend_bucket.bucket.id
  }
}

Important! A 404.html file CAN’T exist in our bucket; otherwise the backend bucket will return 200 OK to the load balancer and the custom_error_response_policy section will never be triggered.

In this rule we set up custom_error_response_policy, which works as follows:

CDN setup

We have enabled CDN on our backend bucket, but there is still a problem. When we deploy a new version of our app, nothing changes for users: the site keeps being served from the CDN cache until we purge it. The good news is that there is a solution to this problem too. We can set additional metadata on our GCS objects, and the CDN will respect it. One of these metadata fields is Cache-Control, which lets us say how long each object should be cached, overriding the default settings. In our case we only want to make sure index.html is not cached at all. With the gcloud CLI we can do this as follows:

gcloud storage cp --cache-control='no-cache, no-store, max-age=0' \
    ./dist/index.html \
    gs://example-bucket-name/
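If you would rather manage index.html from Terraform than from the gcloud CLI, the same metadata can be set on a google_storage_bucket_object. This is only a sketch of an alternative, not the deployment flow used in this post; the resource name "index" and the dist/ path relative to the module are assumptions.

```hcl
resource "google_storage_bucket_object" "index" {
  bucket = google_storage_bucket.bucket.name
  name   = "index.html"
  source = "${path.module}/dist/index.html" // assumed build output location

  content_type  = "text/html; charset=utf-8"
  cache_control = "no-cache, no-store, max-age=0" // same value as the gcloud example
}
```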

The full process of deployment is:

Doing this ensures that users get the new version of the page without broken CSS, images, etc. This again assumes that our build adds a hash to the names of all static assets.


#seo #google #gcp #gcs #frontend #spa #terraform