Manage, Modify or Remove a Hub

These instructions are for modifying a JupyterHub. If you want to understand more about how deployment works, or want to modify how we do deployment, see deployment.

Making Changes to an Existing Hub

To make changes to an existing hub:

fork https://github.com/earthlab/hub-ops
in your fork create a new branch
edit the hub’s configuration in hub-configs/<hubname>.yaml
commit the change and make a PR
fix any GitHub Actions errors (see https://github.com/earthlab/hub-ops/actions)
after the merge, Actions will start deploying your changes. Check the status of your deployment.
once the Actions workflows have completed, check that the hub is working as expected at https://hub.earthdatascience.org/<hubname>/.
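
The steps above can be sketched as shell commands. This is a minimal sketch, not the project's official procedure: the GitHub user name, hub name, branch name, and commit message are placeholder assumptions, and the small `run` helper is hypothetical. It only prints each command unless you set `RUN=1`, so you can review the sequence before executing anything.

```shell
# Placeholder values (assumptions); replace them with your own.
GITHUB_USER="${GITHUB_USER:-your-github-user}"
HUBNAME="${HUBNAME:-ea-hub}"
BRANCH="update-${HUBNAME}-config"

# Hypothetical helper: print each command by default; set RUN=1 to execute it.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else printf '+ %s\n' "$*"; fi
}

run git clone "https://github.com/${GITHUB_USER}/hub-ops.git"  # clone your fork
run cd hub-ops
run git checkout -b "$BRANCH"                                  # new branch
run "${EDITOR:-vi}" "hub-configs/${HUBNAME}.yaml"              # edit the config
run git add "hub-configs/${HUBNAME}.yaml"
run git commit -m "Update ${HUBNAME} configuration"
run git push -u origin "$BRANCH"
# Finally, open a pull request against earthlab/hub-ops in the GitHub web UI.
```

Forking itself and opening the pull request still happen in the GitHub web interface.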

Maintaining Your Hub

The JupyterHub interface has a built-in administration panel that allows you to:

View users with access to the hub
View and manage active servers

Note that this admin interface works well for a hub running on a local server or virtual machine. However, when running on Kubernetes in Google Cloud (which is our current setup), most admin tasks need to be performed directly through Kubernetes and Google Cloud rather than in the admin interface.

Some features of the built-in hub admin panel that will not work include the ability to:

remove users and
shut down the hub.

Do not use these two features in a Google Cloud deployment: Kubernetes runs behind the scenes and controls users and hub deployment. To remove users, you will instead need to:

Edit the hub’s yaml file which contains a list of users with permission to access the hub
(If GitHub authentication) Remove access tokens for the users
Manually delete storage <TODO: add more details about the best way to handle storage removal>

Shut Down a Hub (And Remove Associated Storage)

At the end of a workshop or semester you should consider removing the hub. While a hub scales down to use minimal resources when no one is logged in, it still uses some resources (like disk space) that will only be reclaimed once the hub has been turned off.

Currently this is a manual process and requires you to have kubectl and helm installed on your computer (see Google Cloud & Kubernetes Tools). The reasoning is that removing a hub involves deleting user data, which might be catastrophic! So think about what you are doing and wait for a quiet moment. A few extra days of paying for storage is going to be a lot cheaper than trying to recreate data or code you deleted by accident.

Step one: Turn off your hub autobuild / update

The first step in removing a hub is to stop it from being built and deployed automatically. To do this:

Open the .github/workflows/build-deploy.yml file in the root of the hub-ops repo.
Remove the hub's name from the hubname list shown below.

For example, to remove a hub called bootcamp-hub, find the hubname entry in the GitHub Actions workflow and remove bootcamp-hub from this list:

strategy:
  matrix:
    hubname: [ea-hub, bootcamp-hub]

In the build-only.yml file, remove bootcamp-hub from the same list:

strategy:
  matrix:
    hubname: [ea-hub, bootcamp-hub]

These two actions test deployment and then deploy your hub. Once you have removed the hub from each action, create a pull request in GitHub. Merge that PR. Wait for GitHub Actions to deploy your changes before moving on.
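
Before opening the pull request, it can help to confirm that the hub name no longer appears in either workflow file. The sketch below is one way to do that; `hub_still_listed` is a hypothetical helper, and the path for build-only.yml is an assumption (that it sits next to build-deploy.yml in .github/workflows).

```shell
# Hypothetical helper: succeeds (exit 0) if the hub name still appears
# as a whole word in any of the given files.
hub_still_listed() {
  hub="$1"
  shift
  grep -q -w -F -- "$hub" "$@"
}

# Usage, from the root of the hub-ops repo (second path is an assumption):
#   if hub_still_listed bootcamp-hub \
#        .github/workflows/build-deploy.yml \
#        .github/workflows/build-only.yml; then
#     echo "bootcamp-hub is still listed in a workflow file"
#   fi
```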

If you check, your hub should still be running at this point. This is because all you have done so far is stop Actions from building the Docker image and deploying the hub when there are changes.

Step two: Uninstall the helm release

The second step is to uninstall the helm release to shut down your hub. You will need kubectl and helm installed and configured on your local machine to perform this step.

To check that kubectl is configured and that the hub is installed, run kubectl get pods --namespace=<hubname>. You should see a few pods running:

NAME                              READY   STATUS                  RESTARTS   AGE
continuous-image-puller-hgrjp     1/1     Running                 0          4d9h
hook-image-awaiter-zc8tv          1/1     Running                 0          4d11h
hook-image-puller-tlmmz           0/1     Init:ImagePullBackOff   0          4d9h
hub-c5c44d76b-k9lsb               1/1     Running                 0          4d10h
proxy-5797f8d787-dm9fh            1/1     Running                 0          4d10h
user-placeholder-0                1/1     Running                 0          4d9h
user-placeholder-1                1/1     Running                 0          4d9h
user-scheduler-779876497d-mcwgn   1/1     Running                 0          4d11h
user-scheduler-779876497d-zvqbv   1/1     Running                 0          4d10h

You should not see any pods named jupyter-<username>. Such pods indicate that users are still connected to your hub, and they might be surprised to be kicked off.
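
The check for connected users can be done by filtering the pod list. The following is a small sketch; `list_user_pods` is a hypothetical helper that assumes user server pods start with the jupyter- prefix, as described above.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the names of pods that look like user servers (jupyter-<username>).
# NR > 1 skips the header row.
list_user_pods() {
  awk 'NR > 1 && $1 ~ /^jupyter-/ { print $1 }'
}

# Usage (requires a configured kubectl):
#   kubectl get pods --namespace=<hubname> | list_user_pods
```

If the command prints nothing, no user servers are running in that namespace.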

To check the helm releases currently installed, run helm list --all-namespaces. It should look similar to this:

NAME          NAMESPACE       REVISION        UPDATED                                 STATUS          CHART                   APP VERSION
cert-manager  cert-manager    1               2021-01-11 10:19:55.227696 -0500 EST    deployed        cert-manager-v1.1.0     v1.1.0
ea-hub        ea-hub          20              2021-06-04 22:39:47.769249637 +0000 UTC deployed        jupyterhub-0.10.6       1.2.2
ingress-nginx ingress-nginx   1               2021-01-11 10:53:04.954353 -0500 EST    deployed        ingress-nginx-3.19.0    0.43.0
nbgrader-hub  nbgrader-hub    22              2021-06-04 22:39:55.101091107 +0000 UTC failed          jupyterhub-0.10.6       1.2.2
staginghub    staginghub      5               2021-01-25 20:54:55.67648376 +0000 UTC  deployed        jupyterhub-0.10.6       1.2.2

Regardless of how many hubs are running, there will be at least two releases deployed: ingress-nginx and cert-manager. These support all hubs and should never be removed. In the case shown above there are three hub releases installed: ea-hub, nbgrader-hub (currently in a failed state) and staginghub.
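
To avoid picking one of the shared releases by accident, the release list can be filtered down to hub releases only. This is a sketch under the assumption that cert-manager and ingress-nginx are the only non-hub releases, as in the listing above; `list_hub_releases` is a hypothetical helper.

```shell
# Hypothetical helper: read `helm list --all-namespaces` output on stdin
# and print release names, skipping the header row and the two shared
# releases that must never be removed.
list_hub_releases() {
  awk 'NR > 1 && NF > 0 && $1 != "cert-manager" && $1 != "ingress-nginx" { print $1 }'
}

# Usage (requires a configured helm):
#   helm list --all-namespaces | list_hub_releases
```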

To uninstall the hub <hubname> from the namespace <hubname> run:

helm uninstall <hubname> -n <hubname>

If you now visit https://hub.earthdatascience.org/<hubname>/ you should get a 404 error.
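
The same check can be made from the command line. The sketch below uses curl to print only the HTTP status code; `hub_url` and `hub_status` are hypothetical helper names.

```shell
# Hypothetical helper: build the public URL for a hub.
hub_url() {
  printf 'https://hub.earthdatascience.org/%s/\n' "$1"
}

# Hypothetical helper: print the HTTP status code returned for a hub URL.
# -s silences progress output, -o /dev/null discards the body, and
# -w '%{http_code}' prints just the status code.
hub_status() {
  curl -s -o /dev/null -w '%{http_code}' "$(hub_url "$1")"
}

# Usage (requires network access):
#   hub_status <hubname>
```

After a successful uninstall the printed code should be 404. Redirects are not followed here, so a running hub may report a 3xx code rather than 200.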

Step Three: Clean Up & Remove Storage

The final step is to delete all storage and IP addresses associated with your hub.

IMPORTANT: If you execute the next step, there is no way to recover the data in students' home directories or any other data associated with the hub. Take a moment to make sure you have all the data you will need from the hub.

To permanently remove all storage (THERE IS NO RECOVERING THE DATA AFTER DOING THIS!) run the following command:

kubectl delete namespace <hubname>

You have now deleted the hub and all of its storage.
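
Because the delete command above is irreversible, one option is to guard it behind an explicit confirmation. This is only a sketch of that idea, not part of the existing tooling; `confirm_hub_name` is a hypothetical helper that succeeds only when the hub name is typed back exactly.

```shell
# Hypothetical helper: ask the user to retype the hub name and succeed
# (exit 0) only if the answer matches exactly.
confirm_hub_name() {
  expected="$1"
  printf 'Type the hub name (%s) to confirm deletion: ' "$expected" >&2
  read -r answer
  [ "$answer" = "$expected" ]
}

# Usage (requires a configured kubectl):
#   confirm_hub_name <hubname> && kubectl delete namespace <hubname>
```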

Removing users from a hub

Removing users from a hub involves removing them from the whitelist and/or admin lists and also revoking their authentication token (if using GitHub authentication). Both are needed because the whitelist is only checked as the last step of authentication, so if the user already has a token, removing them from the whitelist has no effect.

To remove users from the whitelist, edit hub-configs/<hubname>.yaml and remove their usernames from the auth whitelist.

To revoke all user tokens, go to the Settings for the Earthlab GitHub organization and click Revoke all user tokens. This means that all users will need to re-authenticate (and will be checked against the whitelist).

It is probably possible to revoke a single user's token directly via the GitHub API, but we have not tried this yet.
