Running JupyterHub on and off campus: Architectural Scenarios

Dedicated Hardware Environments for hosting JupyterHub

On-premises: you own, maintain, secure, and operate the services

Installation

JupyterHub can be installed with pip (and the proxy with npm) or conda:

pip, npm:

 python3 -m pip install jupyterhub
npm install -g configurable-http-proxy
python3 -m pip install notebook  # needed if running the notebook servers locally

conda (one command installs jupyterhub and proxy):

 conda install -c conda-forge jupyterhub  # installs jupyterhub and proxy
conda install notebook  # needed if running the notebook servers locally

Test your installation. If installed, these commands should return the packages' help contents:

 jupyterhub -h
configurable-http-proxy -h
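
The same check can be scripted, for instance before starting the Hub from a service wrapper. The following is only a minimal sketch: it assumes nothing beyond both commands being expected on PATH, and the helper name missing_commands is hypothetical.

```python
import shutil

# The two executables the installation steps above are expected to provide.
REQUIRED_COMMANDS = ("jupyterhub", "configurable-http-proxy")


def missing_commands(commands=REQUIRED_COMMANDS, which=shutil.which):
    """Return the commands that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    installing anything; by default it is shutil.which.
    """
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("jupyterhub and configurable-http-proxy are both on PATH")
```

An empty list means both executables were found. It does not prove they run correctly, so the -h commands above remain the real test.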

Start the Hub server

To start the Hub server, run the command:

 jupyterhub

Visit http://localhost:8000 in your browser, and sign in with your unix credentials. (Use https:// once SSL has been configured as described below.)

To allow multiple users to sign in to the Hub server, you must start jupyterhub as a privileged user, such as root:

 sudo jupyterhub

Authentication: PAM (Local Users, Passwords)

Adding SSL Cert to JupyterHub

 openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout jupyterhub.key -out jupyterhub.crt

To get a free SSL certificate, you can use Let's Encrypt: https://letsencrypt.org/getting-started

 wget https://dl.eff.org/certbot-auto
chmod a+x certbot-auto
./certbot-auto certonly --standalone -d mydomain.tld

Key and certificate locations

key: /etc/letsencrypt/live/mydomain.tld/privkey.pem
cert: /etc/letsencrypt/live/mydomain.tld/fullchain.pem

Adding SSL to config file

c.JupyterHub.ssl_key = 'jupyterhub.key'
c.JupyterHub.ssl_cert = 'jupyterhub.crt'
c.JupyterHub.port = 443
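
Before pointing the Hub at a key and certificate, it can help to confirm that the two files load as a matching pair. The sketch below uses only Python's standard ssl module. The function name cert_and_key_load is hypothetical, and the file names assume the self-signed files generated earlier; for Let's Encrypt you would substitute the paths listed above.

```python
import ssl


def cert_and_key_load(cert_path, key_path):
    """Return True if the certificate and private key load as a matching pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError:
        # Covers missing or unreadable files, and ssl.SSLError
        # (a subclass of OSError) for malformed or mismatched PEM data.
        return False
    return True


if __name__ == "__main__":
    # Assumed file names: the self-signed key and certificate created with openssl.
    if cert_and_key_load("jupyterhub.crt", "jupyterhub.key"):
        print("certificate and key load correctly")
    else:
        print("certificate or key could not be loaded")
```

This only checks that the files are readable and belong together. It does not check expiry or whether browsers will trust the certificate.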

Starting Jupyter

Create a JupyterHub config file, for example /etc/jupyter/jupyterhub_config.py. The following command writes a default jupyterhub_config.py to the current directory:

 jupyterhub --generate-config

Using Containers

Starting JupyterHub with docker

The JupyterHub docker image can be started with the following command:

 docker run -d -p 8000:8000 --name jupyterhub jupyterhub/jupyterhub jupyterhub

This command will create a container named jupyterhub that you can stop and resume with docker stop/start.

The Hub service will be listening on all interfaces at port 8000, which makes this a good choice for testing JupyterHub on your desktop or laptop.

If you run Docker on a computer that has a public IP address, you should (as in MUST) secure it with SSL, either by adding SSL options to your Docker configuration or by using an SSL-enabled proxy.

Mounting volumes lets you store data outside the Docker image, on the host system, so it persists even when you start a new container.

The command docker exec -it jupyterhub bash will spawn a root shell in your docker container. You can use the root shell to create system users in the container. These accounts will be used for authentication in JupyterHub’s default configuration.
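
To make the volume and port points concrete, here is a minimal sketch that assembles a docker run command line with a published port (-p) and a bind-mounted host directory (-v), both standard docker run options. The helper name docker_run_command and the host directory /opt/jupyterhub-data are hypothetical, and /srv/jupyterhub as the container-side path is an assumption about where the image keeps its state, so check it against the image you actually use.

```python
import shlex


def docker_run_command(host_dir, container_dir="/srv/jupyterhub", port=8000,
                       name="jupyterhub", image="jupyterhub/jupyterhub"):
    """Build a `docker run` argument list with a published port and a bind mount."""
    return [
        "docker", "run", "-d",
        "--name", name,
        "-p", f"{port}:{port}",               # publish the Hub port on the host
        "-v", f"{host_dir}:{container_dir}",  # keep data on the host, outside the image
        image, "jupyterhub",
    ]


if __name__ == "__main__":
    # Print the command for review; pass the list to subprocess.run to execute it.
    print(shlex.join(docker_run_command("/opt/jupyterhub-data")))
```

Building the command as a list rather than a single string avoids quoting problems if the host path contains spaces.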

Setting up Kubernetes on Microsoft Azure Container Service (ACS)

Note: see https://zero-to-jupyterhub.readthedocs.io/en/latest/create-k8s-cluster.html#setting-up-kubernetes-on-microsoft-azure-container-service-acs

  1. Install and initialize the Azure command-line tools, which send commands to Azure and let you do things like create and delete clusters.

  2. Authenticate the az tool so it may access your Azure account:

     az login

  3. Specify an Azure resource group, and create one if it doesn’t already exist:

     export RESOURCE_GROUP=<YOUR_RESOURCE_GROUP>
     export LOCATION=<YOUR_LOCATION>
     az group create --name=${RESOURCE_GROUP} --location=${LOCATION}

where:

  • --name specifies your Azure resource group. If the group doesn’t exist, az will create it for you.
  • --location specifies which data center region to use. To reduce latency, choose the region closest to whoever is sending the commands. View available regions via az account list-locations.

  4. Install kubectl, a tool for controlling Kubernetes:

     az acs kubernetes install-cli

  5. Create a Kubernetes cluster on Azure by typing in the following commands:

     export CLUSTER_NAME=<YOUR_CLUSTER_NAME>
     export DNS_PREFIX=<YOUR_PREFIX>
     az acs create --orchestrator-type=kubernetes \
         --resource-group=${RESOURCE_GROUP} \
         --name=${CLUSTER_NAME} \
         --dns-prefix=${DNS_PREFIX}

  6. Authenticate kubectl:

     az acs kubernetes get-credentials \
         --resource-group=${RESOURCE_GROUP} \
         --name=${CLUSTER_NAME}

where:

  • --resource-group specifies your Azure resource group.
  • --name is your ACS cluster name.
  • --dns-prefix is the domain name prefix for the cluster.

  7. To test if your cluster is initialized, run:

     kubectl get node

     The response should list three running nodes.
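
If you would rather check the node status from a script than read the table by eye, kubectl can emit JSON. The sketch below is one possible approach. The function names ready_node_names and fetch_node_list are hypothetical, and it assumes kubectl is installed and authenticated as in the steps above.

```python
import json
import subprocess


def ready_node_names(node_list):
    """Return the names of nodes whose Ready condition is "True" in a parsed NodeList."""
    ready = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True"
               for c in conditions):
            ready.append(node["metadata"]["name"])
    return ready


def fetch_node_list():
    """Run `kubectl get nodes -o json` and return the parsed output."""
    result = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling ready_node_names(fetch_node_list()) on a freshly created cluster should then return one name per running node.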

Documentation

https://jupyterhub.readthedocs.io/en/latest/

Using JupyterHub on the Microsoft Data Science Virtual Machine

JupyterHub comes preinstalled on the Microsoft Data Science VM for Windows Server 2012, Windows Server 2016, CentOS, and Ubuntu.

Webinar Link: https://info.microsoft.com/data-science-virtual-machine.html

More Product Information: Data Science Virtual Machine Landing Page
Community Forum: DSVM Forum Page

Cloud Hybrid approach to implementing JupyterHub and the Data Science Virtual Machine

A new understanding of the world through grassroots Data Science education at UC Berkeley. In an effort to empower more data-driven thinking, Microsoft is working with UC Berkeley to help realize its vision of giving every undergraduate easy access to the university’s Data Science Education Program.

To succeed, the program had to be accessible to 1000+ students beyond the realm of computer science. One way the program does this is through a flexible and scalable technology infrastructure that enables students to quickly set up labs for hands-on practice—they don’t have to spend time installing programs or learning nuances of complicated applications. https://github.com/data-8/

‘By hosting it in Azure, we can control the environment. Students just log in and they’re ready to go.’

- Ryan Lovett, Systems Manager for the Department of Statistics at UC Berkeley.

Remote desktop in Azure Infrastructure as a Service (IaaS): Data Science Virtual Machine on Windows or Linux

• Azure Remote Desktop domain-joined VMs can be deployed against AAD Domain Services domains

• Users simply SSH or RDP into servers

• Data Science VM comes preinstalled with Jupyter and JupyterHub

• Known issue: Remote Desktop licensing service does not work – no license reporting

• Workaround: Track per-user licensing separately (out-of-band)

Setup Documentation

https://blogs.msdn.microsoft.com/uk_faculty_connection/2017/06/12/using-dsvm-jupyterhub-with-aad-authentication/

• Joining an Ubuntu Data Science VM to AD: https://github.com/Azure/DataScienceVM/blob/master/Scripts/ActiveDirectory/UbuntuDSVMJoinAD.md

• Joining a CentOS Data Science VM to AD: https://github.com/Azure/DataScienceVM/blob/master/Scripts/ActiveDirectory/CentOSDSVMJoinAD.md

• Joining a Windows Data Science VM to AD: https://github.com/Azure/DataScienceVM/blob/master/Scripts/ActiveDirectory/WindowsDSVMJoinAD.md

Application level security:

The JupyterHub application uses a web form to collect user credentials and authenticates users via an LDAP bind to the directory.

• This application can be migrated and deployed in Azure VMs.

• End users sign in using their existing corporate credentials.

• The app is deployed in Azure, transparent to end users.
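
An LDAP bind needs a full distinguished name (DN), so the username typed into the web form has to be mapped onto one first. The sketch below shows that mapping in isolation, under loud assumptions: the function name bind_dn_for, the template, and the directory layout (ou=people,dc=example,dc=org) are all hypothetical, and a real deployment would leave this step to its LDAP authenticator plugin.

```python
# Characters with special meaning inside a DN; rejecting them keeps a login
# name from altering the structure of the DN it is substituted into.
DN_SPECIAL_CHARS = set(',+"\\<>;=#')


def bind_dn_for(username, template="uid={username},ou=people,dc=example,dc=org"):
    """Expand a DN template for the login name collected by the web form."""
    if not username or username != username.strip():
        raise ValueError("username is empty or has surrounding whitespace")
    if DN_SPECIAL_CHARS & set(username):
        raise ValueError("username contains characters that are special in a DN")
    return template.format(username=username)


if __name__ == "__main__":
    print(bind_dn_for("alice"))
```

The directory then accepts or rejects a bind attempt made with that DN and the submitted password, which is what authenticates the user.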

Setup Documentation

https://blogs.msdn.microsoft.com/uk_faculty_connection/2017/06/19/using-shibboleth-and-domain-connecting-your-data-science-virtual-machine/

Using OAuth

To use GitHub as the OAuth provider, register a new application at http://github.com/settings/applications/new

For Microsoft, see https://docs.microsoft.com/en-us/azure/active-directory/develop/active-directory-v2-protocols

See https://www.slideshare.net/willingc/jupyterhub-tutorial-at-jupytercon 
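
For orientation, the first leg of the GitHub OAuth web flow is a redirect to GitHub's authorize endpoint carrying your application's client ID, the callback URL you registered, and a random state value. The sketch below builds that URL with the standard library only. The function name github_authorize_url and the example host are hypothetical, and /hub/oauth_callback is assumed to be the callback path your Hub's OAuth authenticator expects, so confirm it against the authenticator you install.

```python
import secrets
from urllib.parse import urlencode

GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def github_authorize_url(client_id, hub_base_url, state=None):
    """Build the URL that starts GitHub's OAuth web flow for a Hub at hub_base_url."""
    params = {
        "client_id": client_id,
        # Assumed Hub callback path; must match the callback URL registered with GitHub.
        "redirect_uri": hub_base_url.rstrip("/") + "/hub/oauth_callback",
        # Random value echoed back by GitHub, used to reject forged callbacks.
        "state": state or secrets.token_urlsafe(16),
    }
    return GITHUB_AUTHORIZE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # "my-client-id" and the host name are placeholders.
    print(github_authorize_url("my-client-id", "https://hub.example.org"))
```

In practice the authenticator plugin performs this redirect and the later token exchange for you. The sketch only shows which values from the GitHub registration form end up where.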

Applications that use Windows Integrated Authentication

An application uses an AD service account for its web front-end to authenticate access to a backend server.

• Deployed in Azure VMs.

• You can create custom OUs and provision service accounts within those OUs.

• You can assign custom password policies (e.g. password-never-expires) to service accounts.

• Group Managed Service Accounts (gMSAs) work as well.

Fully Cloud Hosted Solution

No maintenance, installation, patching or support requirements

As the pace of global innovation continues to accelerate, the University of Cambridge is evolving its engineering curriculum to teach core concepts faster using higher-level, open source tools in the public cloud. For example, a professor increased learning in an introductory computing class by having students use Microsoft Azure Notebooks, which allows them to spend more time mastering concepts and enhancing problem-solving skills and less time on language syntax. This technology switch also gives students anytime, anywhere access to the tools needed to complete assignments, and it facilitates greater collaboration between professors, students, and the larger community. In addition, since Cambridge adopted a public cloud solution, IT infrastructure no longer limits the ingenuity of bright minds.

‘By using Azure Notebooks, students aren’t hindered by installation issues. They can just start working straight away. All they need is a decent browser and an Internet connection.’

- Dr. Garth Wells, Hibbitt Reader in Solid Mechanics, Department of Engineering, University of Cambridge

https://aka.ms/CambridgeNotebooks

Azure Notebooks uses Windows Integrated Authentication with O365 or MSA user accounts

Use Jupyter notebooks to write Python 2, Python 3, R and F# code interactively

Network: Your code can access Azure, GitHub, PyPI, CRAN, OneDrive, Dropbox and Google Drive

Memory: Limited to 4 GB

Storage: We reserve the right to remove your data from our storage after 60 days of inactivity to avoid storing unused/abandoned user data

Usage should be limited to learning, research, general computing, etc. and must abide by the Microsoft Azure Terms of Use; see https://notebooks.azure.com

Additional Resources

For a step-by-step guide to setting up JupyterHub on VMs or Docker, see https://www.slideshare.net/willingc/jupyterhub-tutorial-at-jupytercon

To run Jupyter notebooks as Software as a Service (maintenance and management free), see https://Notebooks.azure.com