Written by Mathew Salvaris, Fidan Boylu Uz, and Daniel Greece. Edited by Nanette Ray. Reviewed by Mike Wasson.
Reference architectures provide a consistent approach and best practices for a given solution. Each architecture includes recommended practices, along with considerations for scalability, availability, manageability, and security. This architecture includes a deployable solution as well. The full array of reference architectures is available on the Azure Architecture Center.
This reference architecture shows how to deploy Python models as web services to make real-time predictions. Two scenarios are covered: deploying regular Python models, and the specific requirements of deploying deep learning models. Both scenarios use the architecture shown.
This architecture consists of the following components:
- Virtual machine (VM) is shown as an example of a device—local or in the cloud—that can send an HTTP request.
- Azure Kubernetes Service (AKS) is used to deploy the application on a Kubernetes cluster. AKS simplifies the deployment and operations of Kubernetes. The cluster can be configured using CPU-only VMs for regular Python models or GPU-enabled VMs for deep learning models.
- Load balancer, provisioned by AKS, is used to expose the service externally. Traffic from the load balancer is directed to the back-end pods.
- Docker Hub is used to store the Docker image that is deployed on the Kubernetes cluster. Docker Hub was chosen for this architecture because it's easy to use and is the default image repository for Docker users. Azure Container Registry can also be used for this architecture.
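To make the request flow between these components concrete, here is a minimal sketch in Python, using only the standard library. It is not the code from the reference architecture: the `/score` route, the JSON payload shape, and the `predict` function (a stand-in that averages its inputs) are all assumptions made for illustration. The server plays the role of the web service running in a pod, and `request_prediction` plays the role of the client device sending an HTTP request.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the inputs."""
    return sum(features) / len(features)


class ScoreHandler(BaseHTTPRequestHandler):
    """Handles POST /score with a JSON body such as {"features": [1, 2, 3]}."""

    def do_POST(self):
        if self.path != "/score":  # assumed route name
            self.send_error(404, "unknown route")
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            result = {"prediction": predict(payload["features"])}
        except (ValueError, KeyError, TypeError, ZeroDivisionError):
            self.send_error(400, "invalid request body")
            return
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(host="127.0.0.1", port=0):
    """Start the service in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), ScoreHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_prediction(host, port, features):
    """Client side: POST the features and return the decoded JSON response."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request(
            "POST",
            "/score",
            body=json.dumps({"features": features}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        resp = conn.getresponse()
        data = resp.read()
        if resp.status != 200:
            raise RuntimeError("scoring request failed with status %d" % resp.status)
        return json.loads(data)
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_server()
    print(request_prediction("127.0.0.1", server.server_port, [1.0, 2.0, 3.0]))
    server.shutdown()
```

In a deployment like the one described, the service would run inside a container and listen on all interfaces rather than on `127.0.0.1`, and the client would target the load balancer's external address instead of a local port.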
This article covers the following topics:
- Performance considerations
- Scalability considerations
- Monitoring and logging considerations
  - AKS monitoring
  - AKS logs
- Security considerations
  - Container registry
  - DDoS protection
Head to the article page to learn more and to deploy the solution.
"Hands-on solutions, with our heads in the Cloud!"