Setting up a new cluster

These are notes from winter 2021 about setting up a new cluster from scratch.

Short version

  • have a gcloud project ea-jupyter and the gcloud command line tools set up

  • have kubectl and helm 3 installed

Long version

What follows is “this is what I did, including mistakes that needed to be fixed” rather than “follow these steps exactly as a tutorial”.

Creating a new kubernetes cluster

Create a new cluster called jhub2 in the ea-jupyter project:

gcloud container clusters create jhub2 \
    --num-nodes=1 --machine-type=n1-standard-2 \
    --zone=us-central1-b --image-type=cos_containerd \
    --enable-autoscaling --max-nodes=3 --min-nodes=1


Changes from the cluster-creation commands in our existing docs:

  • use the default version of kubernetes rather than specifying a version. You can see the default version of kubernetes using gcloud container get-server-config. At this moment, the defaultVersion on the RELEASE channel is 1.17.13-gke.2600.

  • use the cos_containerd image type rather than the default cos type, because the latter is being deprecated.
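The version check mentioned above can be run like this (the --format filter is optional and just trims the output):

```shell
# Show the zone's default cluster version and release-channel defaults
gcloud container get-server-config --zone us-central1-b \
    --format "yaml(defaultClusterVersion, channels)"
```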

Output from cluster creation includes the following warnings:

WARNING: Starting in January 2021, clusters will use the Regular release channel by default when `--cluster-version`, `--release-channel`, `--no-enable-autoupgrade`, and `--no-enable-autorepair` flags are not specified.
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting with version 1.18, clusters will have shielded GKE nodes by default.
WARNING: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).

and this status:

jhub2  us-central1-b  1.16.15-gke.4901  n1-standard-2  1.16.15-gke.4901  1          RUNNING

Then create a static IP:

gcloud compute addresses create jhub2-ip --region us-central1

List the ip addresses for the cluster(s):

$ gcloud compute addresses list
jhub-ip   EXTERNAL  us-central1  IN_USE
jhub2-ip  EXTERNAL  us-central1  RESERVED

This is a single-node cluster with the n1-standard-2 node type. In the current jhub cluster, the core-pool is a custom node type and there is also a separate user node pool. There does not seem to be documentation about setting this up in our notes, but the z2jh docs have notes on setting up a node pool on GKE (see step 7).

Create the user pool (using n1-standard-8 nodes with the autoscaling maximum at 5 nodes):

gcloud beta container node-pools create user-pool \
  --machine-type n1-standard-8 \
  --num-nodes 0 \
  --enable-autoscaling \
  --min-nodes 0 \
  --max-nodes 5 \
  --node-labels hub.jupyter.org/node-purpose=user \
  --node-taints hub.jupyter.org_dedicated=user:NoSchedule \
  --zone us-central1-b \
  --cluster jhub2

Forgot to set the image type to cos_containerd to match the core-pool, so I changed that using the web console UI.
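The same fix can also be made from the command line rather than the web console; something like this (untested here) rolls the pool's nodes onto the containerd image:

```shell
# Switch the user-pool node image to cos_containerd
gcloud container clusters upgrade jhub2 \
    --node-pool user-pool \
    --image-type cos_containerd \
    --zone us-central1-b
```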

Note that creating a cluster changes the default context for kubectl.

When you create a cluster using gcloud container clusters create, an entry is automatically added to the kubeconfig in your environment, and the current context changes to that cluster (from the GKE docs).

Other users will need to change their context using gcloud container clusters get-credentials cluster-name.
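For example, to see which cluster kubectl currently points at, and to point it at the new cluster:

```shell
# Show the active kubeconfig context
kubectl config current-context
# Add/update the kubeconfig entry for jhub2 and make it the current context
gcloud container clusters get-credentials jhub2 --zone us-central1-b
```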

Install JupyterHub


$ helm repo add jupyterhub https://jupyterhub.github.io/helm-chart/
$ helm repo update

Add the secret token from secrets/staginghub.yaml to config.yaml as specified in the z2jh docs (we will set up decryption of secrets later, and before committing this file!), then install the helm chart:

helm upgrade staginghub jupyterhub/jupyterhub --install \
--cleanup-on-fail --namespace staginghub --create-namespace \
--version 0.10.6 --values config.yaml
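To watch the installation come up (namespace and release name as above):

```shell
# Hub and proxy pods should reach Running state
kubectl get pods -n staginghub
# proxy-public is the service that exposes the hub
kubectl get service proxy-public -n staginghub
```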

Set up docker image and gitpuller

Added the following to config.yaml:

singleuser:
  image:
    name: earthlabhubops/ea-k8s-user-staginghub
    tag: 9d034c2
  lifecycleHooks:
    postStart:
      exec:
        command: ["gitpuller", "", "master", "ea-bootcamp-shared"]

Remove the token from config.yaml and provide it on the command line when we upgrade (also add a timeout to allow for downloading the image):

helm upgrade --cleanup-on-fail staginghub jupyterhub/jupyterhub --namespace staginghub --version 0.10.6 --timeout 600s --debug -f config.yaml -f ../../secrets/staginghub.yaml

Ingress and https


In order to have multiple hubs at the same domain (e.g., a staging hub and a production hub served under different paths), we need to set up an ingress controller. As recommended by the z2jh team, we use kubernetes/ingress-nginx. Following the ingress-nginx Helm installation instructions:

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

kubectl create namespace ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx

The output includes the following info:

An example Ingress that makes use of the controller:

  apiVersion: networking.k8s.io/v1beta1
  kind: Ingress
  metadata:
    annotations:
      kubernetes.io/ingress.class: nginx
    name: example
    namespace: foo
  spec:
    rules:
      - host: www.example.com
        http:
          paths:
            - backend:
                serviceName: exampleService
                servicePort: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
        - hosts:
            - www.example.com
          secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls
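To confirm the controller is running and has been assigned an external IP (a check not in the original notes):

```shell
# Controller pod should be Running
kubectl get pods -n ingress-nginx
# The EXTERNAL-IP column of the LoadBalancer service shows the assigned
# address; it may sit at <pending> for a minute or two
kubectl get service ingress-nginx-controller -n ingress-nginx
```

If the controller should use the reserved jhub2-ip address rather than an ephemeral one, the chart's controller.service.loadBalancerIP value can be set to that address at install time.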


Now we need a TLS certificate manager for https. Here, we deviate from the z2jh documentation and use cert-manager rather than the (deprecated) kube-lego. Following the cert-manager installation guide, specifically the parts about installing with Helm:

kubectl create namespace cert-manager
helm repo add jetstack https://charts.jetstack.io
helm repo update

Then install the custom resource definitions (CRDs):

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.1.0/cert-manager.crds.yaml

And install the helm chart:

helm install cert-manager jetstack/cert-manager --namespace cert-manager  --version v1.1.0

Check the installation:

kubectl get pods --namespace cert-manager

Now you need to install a ClusterIssuer resource (this is very poorly documented in the cert-manager docs, presumably because they assume their users know more about k8s than I do).

Create a cluster-issuer.yaml file based on the ACME template in the cert-manager docs.
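A sketch of what that file ends up looking like, following the cert-manager ACME tutorial (the email address here is a placeholder, not the value actually used):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Production Let's Encrypt endpoint
    server: https://acme-v02.api.letsencrypt.org/directory
    # Placeholder: contact email for certificate expiry notices
    email: admin@example.org
    privateKeySecretRef:
      # Secret that stores the ACME account private key
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
```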


And create (and check) the clusterissuer:

kubectl create -f cluster-issuer.yaml
kubectl describe clusterissuer letsencrypt-prod

Updating values.yaml

Add the following setup to your values.yaml file:

proxy:
  service:
    type: ClusterIP

hub:
  baseUrl: /staginghub/

ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
  tls:
    - secretName: cert-manager-tls

Then upgrade helm:

helm upgrade --cleanup-on-fail staginghub jupyterhub/jupyterhub --namespace staginghub --version 0.10.6 --timeout 600s --debug -f config.yaml -f ../../secrets/staginghub.yaml

I had to delete the proxy-public service that got created before switching over to manual ingress setup:

kubectl delete service proxy-public -n staginghub

and upgrade helm.

GKE version updating

In the GCloud console UI, find the jhub2 GKE cluster and its release-channel option. Change the setting from Static version to Release channel and choose the Stable channel. This ensures that the kubernetes version will be updated automatically. Note that this will not be true for the core-pool, since there is only one node.
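The same change can presumably be made with gcloud (cluster name and zone as above; older gcloud releases may need the beta track for this flag):

```shell
# Enroll the cluster in the Stable release channel
gcloud container clusters update jhub2 \
    --release-channel stable \
    --zone us-central1-b
```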


Continuous integration

We are using GitHub Actions for continuous integration - building and pushing docker images, and deploying updates to the cluster.

Create a gcloud service account and assign it the Kubernetes Engine Admin role (roles/container.clusterAdmin). See the gcloud IAM docs for details about permissions, or run gcloud iam roles describe roles/container.clusterAdmin.
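The gcloud steps look roughly like this (the service account name github-actions is an assumption, not from the original notes):

```shell
# Create the service account used by GitHub Actions
gcloud iam service-accounts create github-actions \
    --display-name "GitHub Actions deploy"
# Grant it Kubernetes Engine Admin on the ea-jupyter project
gcloud projects add-iam-policy-binding ea-jupyter \
    --member "serviceAccount:github-actions@ea-jupyter.iam.gserviceaccount.com" \
    --role roles/container.clusterAdmin
# Export a JSON key to store as an encrypted GitHub secret
gcloud iam service-accounts keys create key.json \
    --iam-account github-actions@ea-jupyter.iam.gserviceaccount.com
```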

The Kubernetes Engine Developer role is not sufficient because the JupyterHub helm chart needs to be able to add and delete RBAC roles as part of installing the pre-upgrade hooks.