Using nginx and Docker Containers on Azure to migrate a Java/HTML/MongoDB Solution

07/01/2015

Overview

By Shawn Cicoria, Partner Catalyst Team

This case study focuses on migrating existing multi-tiered applications to Azure, using Docker containers and nginx to expedite rehydration and reduce the number of open ports exposed by the main application. After reviewing this case study you’ll have a basic understanding of a use case for nginx and of a common scenario when migrating on-premises applications to the cloud: handling non-standard HTTP ports that are artifacts of developer choices and are masked while the application runs on-premises.

The Problem

The two problems addressed here often arise when implementation choices are made without communication between an organization’s Development and Infrastructure teams. As teams progress towards a DevOps structure and culture, these issues are generally recognized and addressed early.

The example presented here is abstracted from a more complex solution and simplified to highlight the basic processes and core technologies that are applied to address some common issues.

Rapid Environment Setup and Deployment of Tiered Solution

Existing multi-tiered solutions are often hard to rehydrate in new environments and may take days, even weeks to get the right people focused on addressing the connections and linkage of all the moving parts.

In a well-established DevOps culture, the team can quickly establish a full environment for development, test, staging, disaster recovery, and even production with minimal intervention. Often, however, when teams are disconnected, lack motivation, or lose focus until it’s too late, these processes are not followed, and migration is slower and more expensive than it needs to be.

Use of Non-Standard Ports

One primary issue is that developers often implement solutions that make use of non-standard TCP ports (e.g. HTTP(s) over port 8000 or 44300 – not 80 or 443). Standard or “well known ports” are listed here: https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers and here https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml.

When solutions are deployed on-premises, these issues generally go unnoticed, because traffic within the corporate network is usually routed without restriction: ports are not generally locked down on routers, switches, and internal bridges. Other than additional configuration that may be needed on the server to open these ports (which may be open by default if there is no firewall), the use of non-standard ports goes undetected or is pushed to a low-priority fix. However, moving to the cloud requires more explicit control and knowledge of port usage.

Existing On-Premises Solution Overview

The architecture that is being used is shown in the diagram in this section (Figure 1 - On-Premises and Non-Standard Ports). The core aspects of the solution are:

Browser based rich client
1. HTML & JavaScript
Tomcat used for static content – port 80/443
1. HTML / CSS / JavaScript / images
2. No direct calls from this tier to any other tier
3. “serverConfig.js” containing variable that identified the Spring Framework DNS name and Port – as example:

 var baseAddress = 'https://apiserver:8080';

Spring Framework / Java – port 8080/44300
1. Used for REST services
Mongo DB – port 27017
1. Document Storage / Database

NOTE: While static content was deployed to an IaaS instance running Docker, further analysis is needed to determine whether a CDN is applicable and whether other application remediation approaches apply; this requires further review of nginx modules, CDN requirements, and the cacheability of content.

Non-Standard Ports

Non-standard ports become an issue as we move the solution to the cloud. Edge gateways and proxy servers generally permit traffic only to well-known ports. While a firewall-rule exception for a specific application is an option, such exceptions create governance, management, and support overhead as the number of solutions in the enterprise that need them increases.
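
One way to surface these ports before a migration is to scan the source tree for URLs that carry an explicit port. The following is a minimal sketch, assuming a grep that supports the -r, -h, -o, and -E options; find_nonstandard_ports is a hypothetical helper name and is not part of the original solution.

```shell
# Sketch: list URLs in source files that carry an explicit port other than 80 or 443.
# find_nonstandard_ports is a hypothetical helper, not part of the original solution.
find_nonstandard_ports() {
  # -r: recurse, -h: omit file names, -o: print only the match, -E: extended regex
  grep -rhoE 'https?://[A-Za-z0-9.-]+:[0-9]+' "$@" \
    | grep -vE ':(80|443)$' \
    | sort -u
}

# Usage (the directory name is an assumption):
#   find_nonstandard_ports ./Web
```

Run against a file containing the serverConfig.js line shown earlier, this should report https://apiserver:8080.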

Rapid Environment Setup

As new developers and testers ramp onto a team and a solution, it’s important to be able to rapidly stand up a working environment so they can be productive right away. In addition, as new releases of the solution are implemented, testing the solution in a clean environment helps to diagnose situations where old code or remnants of prior solutions may remain. While production deployments may be different, applying a release on a clean, well-known configuration reduces complexity and helps mitigate issues that occur as machines run over time and configuration or environment changes are made and potentially not tracked.

Overview of the Solution

One approach is to use Docker and nginx (https://nginx.org) together with the existing Spring-based Java “middle-tier”, keeping Mongo DB as the persistence tier.

Docker is a Linux container technology that provides a lightweight runtime and packaging tool, along with Docker Hub, allowing developers and IT Pros to automate their build pipeline and share artifacts with collaborators through public or private repositories. Please see https://docker.com for extensive information on Docker.

Rapid Environment Deployment and Repeatability

Docker is used as the deployment tool: several Dockerfiles define images that are built and deployed rapidly and run across one or more Docker hosts as required.

Docker offers many attributes that assist with repeatability. While this case study did not cover clustering approaches (there are several in Docker), Docker offers Swarm as its native (in beta) approach for Docker clustering.

This solution also did not publish to the Docker Hub; it created each build on a local Docker host and ran it directly from there. While Docker Hub is an option, there are approaches to running your own Docker repository:

https://azure.microsoft.com/blog/2014/11/11/deploying-your-own-private-docker-registry-on-azure/

Mongo DB and Docker

Docker offers an official Mongo DB image; the following solution is based upon that official image, and includes some modifications to insert seed data and check for ready-state. This seed data is just for demonstration purposes and not intended for production. Note that the primary purpose of these Dockerfiles and the scripts was to rapidly establish a running Solution environment with all tiers working, so that developers and testers can validate a new build.

Dockerfile for Mongo

 # Dockerizing MongoDB: Dockerfile for building MongoDB images
 # Based on ubuntu:latest, installs MongoDB following the instructions from:
 # https://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/
 
 FROM ubuntu:latest
 MAINTAINER Shawn Cicoria shawn.cicoria@microsoft.com
 
 ADD mongorun.sh .
 # Installation:
 # Import MongoDB public GPG key AND create a MongoDB list file
 RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
 RUN echo 'deb https://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | tee /etc/apt/sources.list.d/10gen.list
 
 # Update apt-get sources AND install MongoDB
 RUN apt-get update && apt-get install -y mongodb-org
 
 # Create the MongoDB data directory
 RUN mkdir -p -m 777 /data/db
 
 RUN mkdir -p -m 777 /tmp
 
 COPY mongodb.catalog.json /tmp/mongodb.catalog.json
 COPY mongodb.dealers.json /tmp/mongodb.dealers.json
 COPY mongodb.quotes.json /tmp/mongodb.quotes.json
 COPY mongorun.sh /tmp/mongorun.sh
 
 WORKDIR /tmp
 
 RUN chmod +x ./mongorun.sh
 
 # Expose port #27017 from the container to the host
 EXPOSE 27017
 
 VOLUME /data/db
 
 ENTRYPOINT ["./mongorun.sh"]

Mongo Startup Script

The script used as the entry point in the Dockerfile initializes a log file and then starts MongoDB as a background process. It then waits for MongoDB to be ready by reading the log file for the message “waiting for connections on port”. This represents one way to check for readiness. In the OrderService approach that follows in the next section, you will see another way.

 #!/bin/bash
 
 # Initialize a mongo data folder and logfile
 sudo rm -r /data/db 1>/dev/null 2>/dev/null
 mkdir -p -m 777 /data/db 
 touch /data/db/mongodb.log
 echo step 1
 # Start mongodb with logging
 # --logpath Without this mongod will output all log information to the standard output.
 # --logappend Ensure mongod appends new entries to the end of the logfile. We create it first so that the below tail always finds something
 /usr/bin/mongod --smallfiles --quiet --logpath /data/db/mongodb.log --logappend &
 MONGO_PID=$!
 echo step 2
 # Wait until mongo logs that it's ready (or time out after 90s)
 COUNTER=0
 grep -q 'waiting for connections on port' /data/db/mongodb.log
 while [[ $? -ne 0 && $COUNTER -lt 90 ]] ; do
     sleep 2
     let COUNTER+=2
     echo "Waiting for mongo to initialize... ($COUNTER seconds so far)"
     grep -q 'waiting for connections on port' /data/db/mongodb.log
 done
 
 # Now we know mongo is ready and can continue with other commands
 echo now populate
 # At some point, check whether the seed has already run; for this demo, just do it.
 /usr/bin/mongoimport -d ordering -c catalog < /tmp/mongodb.catalog.json
 /usr/bin/mongoimport -d ordering -c dealers < /tmp/mongodb.dealers.json
 /usr/bin/mongoimport -d ordering -c quotes < /tmp/mongodb.quotes.json
 
 wait $MONGO_PID
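
The comment in the script leaves the "already run" check for later. One possible way to add it is a marker file written after the first successful import. This is only a sketch: seed_once and the marker path are hypothetical, and it assumes the data directory is preserved between runs rather than removed at startup.

```shell
# Sketch: run a seeding command only once per data volume.
# seed_once and the marker file name are hypothetical, not part of the original script.
seed_once() {
  marker=$1
  shift
  if [ -e "$marker" ]; then
    echo "seed data already loaded; skipping"
    return 0
  fi
  # Run the seeding command; write the marker only if it succeeds.
  "$@" && touch "$marker"
}

# Usage (marker path is an assumption):
#   seed_once /data/db/.seeded-catalog \
#     sh -c '/usr/bin/mongoimport -d ordering -c catalog < /tmp/mongodb.catalog.json'
```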

Java Spring and Docker

For the Java Spring framework based application tier, which provides various REST services that the HTML front-end calls directly, the following Dockerfile was used:

OrderService Dockerfile

 FROM java:8-jre
 MAINTAINER Shawn Cicoria shawn.cicoria@microsoft.com
 
 ENV APP_HOME /usr/local/app
 ENV PATH $APP_HOME:$PATH
 RUN mkdir -p "$APP_HOME"
 
 WORKDIR $APP_HOME
 
 ADD ordering-service-0.1.0.jar $APP_HOME/
 ADD startService.sh $APP_HOME/
 
 RUN chmod +x startService.sh 
 
 EXPOSE 8080
 #CMD ["java", "-jar", "ordering-service-0.1.0.jar"]
 CMD ["./startService.sh"]

StartService Script

The OrderService Dockerfile references the following script, which polls MongoDB using a hostname of ‘mongodb’ (you’ll see that alias used in the ‘docker run --link’ commands later) to check for ready state.

 #!/bin/bash
 
 while ! curl http://mongodb:27017/
 do
     echo "$(date) - still trying"
     sleep 1
 done
 echo "$(date) - connected successfully"
 
 java -jar ordering-service-0.1.0.jar
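
The loop above retries forever if MongoDB never becomes reachable. A bounded variant might look like the following sketch; wait_until is a hypothetical helper, and the 60-second limit in the usage lines is an assumption rather than a value from the original solution.

```shell
# Sketch: retry a probe command once per second until it succeeds or a timeout expires.
# wait_until is a hypothetical helper; it is not part of the original script.
wait_until() {
  timeout_s=$1
  shift
  waited=0
  until "$@"; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "$(date) - gave up after ${waited}s" >&2
      return 1
    fi
    echo "$(date) - still trying"
    sleep 1
    waited=$((waited + 1))
  done
  echo "$(date) - connected successfully"
}

# Usage (in place of the unbounded loop):
#   wait_until 60 curl -s -o /dev/null http://mongodb:27017/ || exit 1
#   java -jar ordering-service-0.1.0.jar
```

Exiting with a non-zero status on timeout lets Docker report the container as failed instead of leaving it polling indefinitely.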

nginx and Docker

nginx [engine x] is a popular HTTP and reverse proxy server. This solution uses it to serve the static file content and to proxy requests to the OrderService using simple rules. The primary benefit is that all traffic is consolidated onto a single IP port; nginx also offers distributed and rule-based proxy routing and abstracts the OrderService endpoints, which provides location transparency.

Nginx Dockerfile

Note that the Dockerfile uses the base nginx image from Docker and extends it with custom configuration and content. The “daemon off” parameter keeps nginx in the foreground; without it the container would exit immediately, because Docker considers a container to be running and manageable only while its main process keeps running.

 FROM nginx:1.7.10
 MAINTAINER Shawn Cicoria shawn.cicoria@microsoft.com
 
 ENV WEB_HOME /usr/local/web
 
 ADD Web.tar $WEB_HOME/
 
 COPY nginx.conf /etc/nginx/nginx.conf
 
 EXPOSE 8000
 CMD ["nginx", "-g", "daemon off;"]

Nginx Configuration

 worker_processes 1;
 
 events {
     worker_connections 1024;
 }
 
 http {
     include mime.types;
     default_type application/octet-stream;
 
     sendfile on;
     keepalive_timeout 65;
 
     gzip on;
 
     server {
         listen 8000;
         server_name localhost;
 
         # Static content
         location / {
             root /usr/local/web/Web;
             index index.html index.htm;
         }
 
         error_page 500 502 503 504 /50x.html;
 
         location = /50x.html {
             root html;
         }
 
         # REST calls are proxied to the OrderService container over plain HTTP
         location /catalog {
             proxy_pass http://orderservice:8080;
         }
 
         location /quotes {
             proxy_pass http://orderservice:8080;
         }
 
         location /shipments {
             proxy_pass http://orderservice:8080;
         }
     }
 }
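
To confirm that static content and the proxied API paths all answer on the single nginx port, a small smoke test can help. The following is a sketch only: check_routes is a hypothetical helper, the base URL in the usage line is an assumption, and it treats any 2xx or 3xx status as success.

```shell
# Sketch: confirm that static content and each proxied API path answer on the single nginx port.
# check_routes is a hypothetical helper; it is not part of the original solution.
check_routes() {
  base=$1
  shift
  failed=0
  for path in "$@"; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the HTTP status code
    code=$(curl -s -o /dev/null -w '%{http_code}' "$base$path")
    echo "$path -> $code"
    case $code in
      2??|3??) ;;
      *) failed=1 ;;
    esac
  done
  return "$failed"
}

# Usage (base URL is an assumption):
#   check_routes http://localhost:8000 / /catalog /quotes /shipments
```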

Solution Architecture and Container Approach

In the diagram at the end of this section, the solution takes advantage of Docker and can scale up to numerous Docker hosts (containers). Note that we’ve also achieved use of a single IP port along with location transparency to the running containers and the calling clients.

In addition, for full clustering support you must ensure that all calls between tiers use Docker’s container linking, because Docker assigns dynamic IP addresses and container names to each running container.

Docker Build Script

The Docker build script is below and is fairly standard.

 #!/bin/sh
 cd mongoseed
 docker build -t scicoria/mongoseed:0.1 .
 cd ../orderService
 docker build -t scicoria/orderservice:0.1 .
 cd ../staticsite
 docker build -t scicoria/staticsite:0.1 .
 cd ..
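
The script above continues even if one build fails. A variant that stops at the first failure and avoids the repeated cd calls could look like this sketch. The directory and image names come from the script above; build_all itself is hypothetical.

```shell
# Sketch: build every image in a loop and stop at the first failure.
# The directory and image names come from the build script; build_all is hypothetical.
build_all() {
  for dir in mongoseed orderService staticsite; do
    # Image repository names must be lowercase, so orderService becomes orderservice.
    name=$(echo "$dir" | tr '[:upper:]' '[:lower:]')
    # Pass the directory as the build context instead of changing into it.
    docker build -t "scicoria/$name:0.1" "$dir" || return 1
  done
}

# Usage:
#   build_all || echo "build failed" >&2
```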

Docker Run Script

When running the containers, it’s important that each “client” container is “linked” to the containers it needs to initiate requests to. The following script puts these links in place:

1. StaticWeb -> OrderService (DNS name orderservice) on Port 8080

2. OrderService -> MongoDB (DNS name mongodb) on Port 27017

 #!/bin/sh
 docker run -d -p 27017:27017 --name mongodb -v /data/db:/data/db scicoria/mongoseed:0.1
 docker run -d -p 8080:8080 --name orderservice --link mongodb:mongodb scicoria/orderservice:0.1
 docker run -d -p 8000:8000 --link orderservice:orderservice scicoria/staticsite:0.1
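
Because linked containers reach each other over Docker's internal network, only the nginx container strictly needs a published port. The following sketch shows that variation. Mapping host port 80 to container port 8000 and the container name staticsite are assumptions and not part of the author's script; a platform-level endpoint mapping would be an alternative.

```shell
# Sketch: the same three containers, but only nginx is published to the host.
# The 80:8000 mapping and the name "staticsite" are assumptions.
run_stack() {
  docker run -d --name mongodb -v /data/db:/data/db scicoria/mongoseed:0.1 &&
  docker run -d --name orderservice --link mongodb:mongodb scicoria/orderservice:0.1 &&
  docker run -d -p 80:8000 --name staticsite --link orderservice:orderservice scicoria/staticsite:0.1
}
```

With this arrangement MongoDB and the OrderService are reachable only from the containers linked to them, which reduces the number of open ports on the host to one.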

Docker and Linking

Docker provides virtual networking among containers, and any communication among containers must be explicitly declared when running a container. The ports identified on the command line with the ‘-p’ parameter are for external communication into the container; Docker prevents communication between containers unless they are explicitly linked. In addition, you should be aware of the hostnames, dynamic IP addresses, and ports that are used, as Docker effectively provides NAT (network address translation) services among containers.
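
Besides adding a hosts entry for the alias, the --link option injects environment variables into the client container, named after the alias and the exposed port (for example MONGODB_PORT_27017_TCP_ADDR). The sketch below reads that variable and falls back to the alias name when it is not set; link_addr is a hypothetical helper.

```shell
# Sketch: resolve a linked container's address from the environment variables
# that Docker's --link option injects, falling back to the link alias.
# link_addr is a hypothetical helper; it is not part of the original solution.
link_addr() {
  alias_name=$1
  port=$2
  prefix=$(echo "$alias_name" | tr '[:lower:]' '[:upper:]')
  # e.g. MONGODB_PORT_27017_TCP_ADDR for alias "mongodb" and port 27017
  eval "addr=\${${prefix}_PORT_${port}_TCP_ADDR:-$alias_name}"
  echo "$addr:$port"
}

# Usage:
#   link_addr mongodb 27017
```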

Eliminate use of Non-Standard Ports

Finally, with this solution we can eliminate the use of non-standard ports and reduce configuration, support, and troubleshooting issues with corporate firewalls, proxy servers, and similar devices that may prevent non-standard port traffic from traversing the network.

The final code used in the static web site for the JavaScript calls simply builds the endpoint name using the browser’s location provider. Since all traffic is now being routed through the same nginx front-ends, and nginx is determining which traffic to proxy and send to OrderService, we’ve reduced that complexity by a significant amount.

So, updating the static web site’s “serverConfig.js”, which executes in the browser, to the following has minimal impact on the static web site and its supporting JavaScript:

 var baseAddress = window.location.protocol + '//' + window.location.hostname;

Code Artifacts

The source files are published to GitHub in the ‘nginix’ branch (note that this is a different branch from ‘master’):

https://github.com/cicorias/IgniteARMDocker/tree/nginix

This solution and its source are part of a Pre-MS Ignite Session on DevOps, with additional walk-through items here: https://github.com/Microsoft/PartsUnlimitedMRP

Opportunities for Reuse

The use of nginx as a web front end provides an approach for layer 7 proxying and SSL termination. This example only scratches the surface of its full capabilities, which are on par with Application Request Routing (ARR) under IIS; nginx also runs on both Linux and Windows.
The various scripts and Dockerfiles represent simple builds and deployments and can be reused as is if published directly to Docker Hub.

Further Research

The pattern applied here is useful for many solutions. The concept of location transparency for service endpoints, combined with nginx, provides a simple but effective and highly performant proxy-based approach. In addition, nginx is extensible, allowing more static injection of routing of proxy calls to different endpoints, based upon various attributes such as load, time of day, versioning, etc.

Shawn Cicoria has been working on distributed systems for nearly 2 decades after dealing his last deck of cards on the Blackjack table. Follow Shawn at https://bit.ly/cicorias or @cicorias