Hosting Single Page Applications on GCP
With any Single Page Application, there is the problem of hosting it without a backend while making sure that SEO, caching and routing work correctly, which means:
- all non-asset requests go to index.html,
- all assets are cached except index.html,
- when releasing a new version, everything is refreshed and clients won’t receive stale content,
- CDN works correctly including the above,
- Not found pages return the correct 404 HTTP status code.
The last point on the list is the hardest to implement correctly.
Now, let me show you how to do it properly on Google Cloud using Cloud Storage, Cloud Load Balancer and CDN. We will use Terraform as Infrastructure as Code; the same configuration can also be clicked through in the GCP console.
Google storage bucket setup
This post assumes that you have already set up a google_compute_global_forwarding_rule with an HTTP-to-HTTPS redirect and a global IP address.
Let’s configure our bucket using google_storage_bucket resource:
resource "google_storage_bucket" "bucket" {
  project                     = var.project           // your GCP project ID
  name                        = "example-bucket-name" // unique across GCP
  location                    = "EU"
  force_destroy               = false
  uniform_bucket_level_access = true
}
This bucket serves as the origin for all traffic. The actual routing to index.html will be configured in the Load Balancer setup below. To allow users to access it, we need to set up a google_storage_bucket_iam_member resource.
Warning: all files uploaded to the website’s bucket are public!
Setting up access policy:
resource "google_storage_bucket_iam_member" "public_rule" {
  bucket = google_storage_bucket.bucket.name
  role   = "roles/storage.objectViewer"
  member = "allUsers"
}
This way we allow everyone to view the files in our bucket.
Now we need to define a service that our load balancer can use to route traffic to the bucket. For this we can use the google_compute_backend_bucket resource:
resource "google_compute_backend_bucket" "bucket" {
  project          = var.project
  name             = "website-backend-bucket"
  description      = "Backend bucket for SPA"
  bucket_name      = google_storage_bucket.bucket.name
  enable_cdn       = true
  compression_mode = "AUTOMATIC" // enable compression over the wire
  cdn_policy {
    default_ttl = 3600 * 24 * 30 // one month cache
    max_ttl     = 3600 * 24 * 30 // one month cache
  }
}
This default_ttl (one month) applies primarily to static assets that do not have explicit Cache-Control metadata set at upload time. We will override it for index.html in the deployment step. This also assumes that static assets get a hash added to their file names during build.
Load balancer setup
resource "google_compute_url_map" "url_map" {
  project         = var.project
  name            = "website-url-map"
  default_service = google_compute_backend_bucket.bucket.id
  host_rule {
    hosts        = ["example.org"]
    path_matcher = "website"
  }
  path_matcher {
    name            = "website"
    default_service = google_compute_backend_bucket.bucket.id
    dynamic "route_rules" {
      for_each = ["/images", "/public"] // static assets paths
      content {
        priority = 10 + route_rules.key
        service  = google_compute_backend_bucket.bucket.id
        match_rules {
          prefix_match = route_rules.value
        }
      }
    }
    route_rules {
      priority = 20
      service  = google_compute_backend_bucket.bucket.id
      match_rules {
        prefix_match = "/404.html"
      }
      custom_error_response_policy {
        error_response_rule {
          match_response_codes   = ["404"]
          path                   = "/index.html"
          override_response_code = 404
        }
        error_service = google_compute_backend_bucket.bucket.id
      }
    }
    route_rules {
      priority = 30
      service  = google_compute_backend_bucket.bucket.id
      match_rules {
        ignore_case         = true
        path_template_match = "/**"
      }
      route_action {
        url_rewrite {
          path_template_rewrite = "/index.html"
        }
      }
    }
  }
}
There is a lot going on in this snippet, so let’s see what is what.
route_rules are evaluated from highest priority (lowest number) to lowest priority (highest number), so priority 1 is evaluated before priority 2. We space priorities by ten so that new rules can be added in between without renumbering the other rules.
The first rule is:
dynamic "route_rules" {
  for_each = ["/images", "/public"] // static assets paths
  content {
    priority = 10 + route_rules.key
    service  = google_compute_backend_bucket.bucket.id
    match_rules {
      prefix_match = route_rules.value
    }
  }
}
The rule is dynamic so that it covers all the paths we have for static files. We need to make sure that asset requests go directly to the storage bucket and are not rewritten to index.html. Because for_each iterates over a list, route_rules.key is the element index, so these rules get priorities 10, 11 and so on; keep the list shorter than ten entries so it does not collide with the 404 rule at priority 20. This also assumes that the website doesn’t host any static files on the root path; if it does, they need to be added to the list as well, e.g. robots.txt or favicon.
Next we set up our 404 rule. We need an explicit rule on the load balancer that returns the 404 HTTP status code when needed and still renders our SPA.
route_rules {
  priority = 20
  service  = google_compute_backend_bucket.bucket.id
  match_rules {
    prefix_match = "/404.html" // path in our SPA app to render not found page
  }
  custom_error_response_policy {
    error_response_rule {
      match_response_codes   = ["404"]
      path                   = "/index.html"
      override_response_code = 404
    }
    error_service = google_compute_backend_bucket.bucket.id
  }
}
Important! The 404.html file CAN’T exist in our bucket; otherwise the backend bucket will return 200 OK to our load balancer and the custom_error_response_policy section won’t be triggered.
In this rule we set up custom_error_response_policy, which works as follows:
- when our SPA encounters an unknown resource, it hard-redirects the user to the /404.html page. Changing the path on the client side is not enough; we need to force a full navigation, using window.location.href = "/404.html", so that a new request reaches our load balancer and triggers the 404 status code,
- the Google load balancer tries to fetch this file from the storage bucket, which results in a 404 Not Found status code,
- it then serves index.html, our SPA, and returns the 404 HTTP code to the user.
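Once the URL map is applied, the behaviour described above can be checked from the command line. The following is a minimal sketch, assuming curl is installed and the site is served from example.org as in the URL map; the helper names http_status and expect_status are made up for this post.

```shell
# Print the HTTP status code returned for a URL (redirects are not followed).
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' "$1"
}

# Succeed only when the URL answers with the expected status code.
expect_status() {
  local expected="$1" url="$2" actual
  actual="$(http_status "$url")"
  if [ "$actual" = "$expected" ]; then
    echo "ok   ${url} -> ${actual}"
  else
    echo "FAIL ${url} -> ${actual} (expected ${expected})"
    return 1
  fi
}

# Usage (hypothetical routes, adjust to your application):
#   expect_status 200 https://example.org/some/client/route
#   expect_status 404 https://example.org/404.html
```

A client-side route should come back as 200 because it is rewritten to index.html, while /404.html should come back as 404 with the SPA still rendered in the response body.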
CDN setup
In our backend bucket we have enabled CDN, but there is still a problem. When we deploy a new version of our app, nothing changes for users: the site is still served from the CDN until we purge it. The good news is that there is a solution to this problem too. We can set additional metadata on our GCS objects, which the CDN will respect. One of these metadata fields is Cache-Control, which lets us say which objects should be cached and for how long, overriding the default settings. In our case we only want to say that index.html must not be cached at all. With the gcloud CLI we can do it like this:
gcloud storage cp --cache-control='no-cache, no-store, max-age=0' \
  ./dist/index.html \
  gs://example-bucket-name/
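If an index.html uploaded earlier without this metadata is already cached, the new setting alone will not evict it. The sketch below shows one way to fix the metadata in place and purge the cached copy. It assumes the bucket and URL map names used in this post; the function names are made up, and depending on your gcloud configuration the invalidation command may also need --global.

```shell
# GCLOUD can be overridden (for example GCLOUD=echo) to print the commands
# instead of running them.
GCLOUD="${GCLOUD:-gcloud}"

# Disable caching on an index.html object that is already in the bucket.
disable_index_cache() {
  local bucket="$1"
  "$GCLOUD" storage objects update "gs://${bucket}/index.html" \
    --cache-control='no-cache, no-store, max-age=0'
}

# Purge a single path from Cloud CDN for the given URL map.
purge_cdn_path() {
  local url_map="$1" cdn_path="$2"
  "$GCLOUD" compute url-maps invalidate-cdn-cache "$url_map" --path "$cdn_path"
}

# Usage:
#   disable_index_cache example-bucket-name
#   purge_cdn_path website-url-map /index.html
```

This should only be needed once; after that, index.html is never stored by the CDN and hashed assets never change under the same name.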
The full process of deployment is:
- deploy all static assets using gcloud storage cp --recursive ./dist/assets/* gs://example-bucket-name/assets/
- and then deploy index.html.
Doing this ensures that users get the new page without broken CSS, images, etc. This also assumes that our application adds a hash to the names of all static assets during build.
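The two steps can be wrapped in a small script so the order is never mixed up. This is a sketch under the assumptions of this post: the build output lives in ./dist with hashed assets in ./dist/assets, and deploy_spa is a made-up helper name.

```shell
# GCLOUD can be overridden (for example GCLOUD=echo) to print the commands
# instead of running them.
GCLOUD="${GCLOUD:-gcloud}"

# Upload hashed assets first, then index.html with caching disabled.
deploy_spa() {
  local dist_dir="$1" bucket="$2"

  # 1. Static assets: they have no explicit Cache-Control, so the backend
  #    bucket's default TTL applies. Stop if the upload fails.
  "$GCLOUD" storage cp --recursive "${dist_dir}"/assets/* \
    "gs://${bucket}/assets/" || return 1

  # 2. index.html last, so it never references assets that are not uploaded yet.
  "$GCLOUD" storage cp --cache-control='no-cache, no-store, max-age=0' \
    "${dist_dir}/index.html" "gs://${bucket}/" || return 1
}

# Usage:
#   deploy_spa ./dist example-bucket-name
```

Returning early when the asset upload fails keeps the old index.html in place, so users keep seeing the previous, still consistent version.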