Just replied yet again to someone whose customer thinks they’re adding security by blocking outbound network traffic to cloud services using IP-based allow-lists. They aren’t.
Service Bus and many other cloud services are multitenant systems that are shared across a range of customers. The IP addresses we assign come from a pool, and that pool shifts as we optimize traffic to and from datacenters. We may also move clusters between datacenters within one region for disaster recovery, should that be necessary. Another reason we cannot give every feature slice its own IP address is that the world has none left: we’re out of IPv4 address space, which means we must pool workloads.
Those last points are important and also show how antiquated the IP-address lockdown model is relative to current datacenter operations practice. Because of the IPv4 shortage, address pools get acquired, traded, and changed. Because of automated and semi-automated disaster recovery mechanisms, we can provide service continuity even if clusters, datacenter segments, or whole datacenters fail, but a client system that’s locked to a single IP address cannot benefit from that. As the cloud system packs up and moves to a different place, the client stands in the dark because of its firewall rules. The same applies to rolling updates, which we perform using DNS switches.
The state of the art of no-downtime datacenter operations is that workloads are agile and will move as required. The place where you have stability is DNS.
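To make that concrete: a client should resolve the stable DNS name each time it connects, rather than caching or hard-coding the addresses behind it. A minimal Python sketch of that habit (the Service Bus namespace hostname used in the comment is a hypothetical placeholder):

```python
import socket

def current_addresses(hostname: str, port: int = 443) -> set[str]:
    """Resolve the stable DNS name to whatever IP addresses it maps to right now."""
    return {info[4][0]
            for info in socket.getaddrinfo(hostname, port,
                                           proto=socket.IPPROTO_TCP)}

# Resolve anew before each connection instead of pinning the result in a
# firewall rule; e.g. current_addresses("mynamespace.servicebus.windows.net")
# may legitimately return different addresses from one day to the next.
```

The point of the sketch is that the set returned is a snapshot, valid only for the connection you are about to make, which is exactly why baking it into a firewall rule goes stale.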
Outbound Internet IP lockdowns add nothing in terms of security, because workloads increasingly move into multitenant systems or systems that are dynamically managed, as I’ve illustrated above. Since such moves happen without warning, a rule may be correct right now and point to a foreign system the next moment; the firewall will not be able to tell. The only proper way to ensure security is to make the remote system prove that it is the system you want to talk to, and that happens at the transport security layer. If the system can present the expected certificate during the handshake, the traffic is legitimate. The IP address per se proves nothing. Moreover, IP addresses can be spoofed and malicious routers can redirect traffic, and the firewall won’t be able to tell that, either.
With most cloud-based services, traffic runs via TLS. You can verify the thumbprint of the server’s certificate against a certificate you either set yourself, obtain from the vendor out-of-band, or acquire by hitting a documented endpoint (in Windows Azure Service Bus, it’s the root of each namespace). With the Service Bus messaging system, you are furthermore encouraged to use any cryptographic mechanism you like to protect payloads (message bodies). We do not evaluate those for any purpose. We evaluate headers and message properties for routing. Neither is logged beyond temporary storage in the broker.
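That thumbprint check is straightforward to express in code. The following is an illustrative Python sketch, not an official client: it computes a SHA-256 thumbprint of the server’s DER-encoded leaf certificate and compares it against a value you obtained out-of-band, as described above. The expected thumbprint is a placeholder you would supply yourself.

```python
import hashlib
import socket
import ssl

def sha256_thumbprint(der_cert: bytes) -> str:
    """Hex SHA-256 thumbprint of a DER-encoded certificate."""
    return hashlib.sha256(der_cert).hexdigest()

def connect_with_pinned_cert(host: str, port: int,
                             expected_thumbprint: str) -> ssl.SSLSocket:
    """Open a TLS connection and refuse it unless the server's leaf
    certificate matches the thumbprint obtained out-of-band."""
    context = ssl.create_default_context()  # still validates chain and hostname
    sock = context.wrap_socket(socket.create_connection((host, port)),
                               server_hostname=host)
    der_cert = sock.getpeercert(binary_form=True)
    expected = expected_thumbprint.replace(":", "").lower()
    if sha256_thumbprint(der_cert) != expected:
        sock.close()
        raise ssl.SSLError("server certificate thumbprint mismatch")
    return sock
```

Note that the pin is applied on top of the normal chain and hostname validation, not instead of it: the firewall cannot make this decision, but the client at the TLS layer can.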
A server that needs access to Service Bus should be granted outbound Internet access based on the server’s identity or the identity of the running process. This can be achieved using IPsec between the edge and the internal system. Constraining traffic to the Microsoft datacenter ranges is possible, but those ranges shift and expand without warning.
The bottom line here is that there is no way to make outbound IP address constraints work with cloud systems or high availability systems in general.