By Bill Eldredge, Associate Architect
As the former head of the Big Data Management and Governance team at Nokia, I was responsible for managing our internal business customers' needs and expectations for their use of the private Hadoop cloud and related Big Data Asset that we spent five years building and maintaining. Unfortunately, several of those years amounted to a daily, losing battle against the constraints of that cluster and a related on-prem RDBMS. Over several years, we made the move to public cloud, and based on my experience with both, I would counsel anyone considering building a private cloud to first exhaust every possible avenue for using public cloud offerings. While I can certainly be seen as a biased source on this question, I hope that by baring my battle scars in this blog, I can offer points of practical consideration to both IT and business decision-makers.
We meet with customers who are considering or planning the creation or expansion of a private "data lake" rather than taking advantage of public cloud offerings. Their reasons vary, but typically come down to control (in a number of senses) and protection of data. While we built our Hadoop cluster before public cloud options were sufficiently robust, we also labored under these notions for too long in justifying our continued maintenance of our private option. We certainly had successes, and liked the fact that our internal customers relied on our infrastructure, data preparation and practical advice for the enablement of their use cases. But as our cluster grew in importance, it also grew in cost. And while we could see and partially justify the increased monetary expenditures, what we underestimated were the opportunity costs to the business of staying private, with all the ultimately unproductive energy that effort entails.
When we moved to public cloud, we first transitioned from physical hardware to virtual machines, and then fully to Platform as a Service (PaaS) offerings. With each step, we intentionally gave up some control, but provided more flexibility and outputs to the business at reduced effort and cost. Here are the key considerations we believe each organization should weigh, phrased first from an IT perspective, but each ending with questions that business decision-makers (BDMs) either will or should ask to ensure their needs are met.
- Multi-tenant environments s--k for the tenants. Our Cloudera cluster had roughly a petabyte of data, and daily processes from literally hundreds of use cases fought nearly continuously for resources. The jobs for the big use cases (including our own ETL jobs to maintain the common data asset as jobs were running against it) ran for hours, and all use cases suffered as a result, even if they had "guaranteed minimum" resources. We scraped by through increasingly active monitoring and (sometimes drastic) management, but only really found relief in moving workloads to a public cloud solution, where processing could be separated from storage, and resources could be dedicated to the appropriate level for each need. If BDMs are presented with tenancy on a shared system, they should ask whether and how their needs and jobs will be practically prioritized (or not).
- Scaling private clouds or on-prem RDBMS is slow and expensive. For a while, our solution to the contention issues I noted above was to "add more nodes." But at a $10k purchase price and significant carrying cost for each machine – and with a 2-month minimum lead time to get them procured and installed – the economics became less and less tenable. Couple that with the business impact of taking the cluster effectively offline to allow rebalancing of the data once new nodes were installed, and it became effectively impossible. RDBMS systems can be scaled more quickly and easily, but you pay for that – literally. Our Hadoop cluster started to get a lot more data and use cases when the IT department realized it couldn't keep increasing its spend on an RDBMS by multiple millions every year. BDMs should ask how quickly an on-prem solution can be scaled if more capacity or power is required for existing or new use cases.
- It's literally impossible to keep privately-managed systems even close to up-to-date with the innovation in the industry. Given the difficulty of scheduling upgrades on our actively-used Hadoop cluster, we fell further and further behind even Cloudera's releases, much less the innovation happening in the Apache projects. That was five years ago, and advances in the ability to deploy rolling upgrades have helped. But the pace of change now is weekly, if not daily. So while most use cases don't require the latest and greatest functionality, choosing a private cloud option almost certainly locks the organization into a cycle of falling behind – and rapidly. The ability to rapidly experiment on the latest open source offerings allows customers to maintain competitiveness and even get ahead of their competitors. BDMs should ask how the private platform will be kept up to date, and what the expected lag time will be between the release of new functions and their availability on the cluster.
- In a shared environment, new use cases seldom get adequate resources or investment. While it's understandable to try to ensure a shared solution meets the needs of business-critical jobs, the innovation and new value in an organization often come from new use cases – which sometimes have specific needs for new technologies. When our companion data science team at Nokia began working on our cluster, they wanted and needed to install and run a range of statistical software packages, and test out hypotheses on new combinations of data. Our need to protect the performance of the cluster and its existing use cases essentially caused them to move to a public cloud before we could support them in doing so, resulting in duplication of data, costs and effort. IT departments need to understand the risks of increased "shadow IT" that may come from the business' frustration with perceived or real delays in responding to their needs. BDMs should ask whether and how well IT will support software or features that they need, particularly if few or no other groups have the same needs.
- Owning and maintaining infrastructure actually degrades IT's ability to support the business. Our first motivation for moving to a public cloud was largely self-interest – we wanted to get rid of the headaches associated with managing infrastructure so we could focus on providing value to our customers. Our first step to virtual machines alleviated many of those issues, but we still needed to develop and support some of the connective tissue that wasn't offered by cloud providers at the time. As innovation continued and we were able to move completely to PaaS offerings, our response time and ability to act as strategic advisory partners to the business improved dramatically. BDMs should ask if IT believes owning and managing infrastructure is a source of competitive advantage.
- Operating on a public cloud is almost certainly less costly than supporting the same needs in a private cloud. Cloud providers often tout the mantra that "you only pay for what you use." And as we moved to the public cloud, we did see an immediate drop in our monthly operating costs. But it wasn't as great as we'd expected. After the migration was largely complete, though, we were able to focus on managing costs – identifying unused resources, optimizing those that were being used – and the additional savings materialized to an amazing degree. We had to develop and exercise new muscles, but in a few months, we were able to align business needs with costs, and to reduce monthly costs by almost 75% from our private cloud run rate. Even better, we were able to provide clear accounting to our customers about the costs, benefits and resulting (greatly improved) ROI associated with our support of their use cases. BDMs should ask whether there will be internal transfer charges for private cloud usage, and how those costs will be determined in a shared environment.
- Data can be made just as secure in a public cloud as a private one. Data managers – particularly in global organizations – have to deal with a complex array of regulations and considerations regarding data privacy, sovereignty and security. And there are certainly situations where government regulation or business covenants preclude the option of placing at least certain data in a public cloud (particularly one not hosted in the "right" geographical location). But for data to be useful, it will have to be exposed on some network(s), and will have to be protected. There are numerous strategies and solutions for securing data in a public cloud. In the end, IT leaders should ask themselves whether they want to be the only ones working to keep their data safe, or whether they instead want the assistance of a global partner whose stake in the game is as great – if not greater. As for our experience at Nokia, my team and I never faced a data privacy or security issue that arose, or was handled differently, because the data was hosted on a public cloud. BDMs should ask how their data will be protected, but make sure that their business needs are not hampered by over-protective policies.
These seven ideas form a framework for a full discussion of private vs. public cloud options. Again, I recognize that I am not an unbiased source, and that there are considerations and situations in which a private cloud may be necessary or preferred. But I can honestly say a public cloud – and Microsoft Azure in particular, given the strength of its PaaS offering – has fundamental advantages over privately-managed solutions, often resulting in reduced cost and accelerated business outcomes. It is very hard – I would say impossible – to create a private platform that is as scalable, flexible, updatable and focused on the needs of varied business users as can be created on Azure. So while a public cloud is not necessarily the best weapon for every battle, decision-makers in any organization should make sure the points above are strongly considered before promoting a private cloud solution to the front lines.