Data scientists have often worked in a bit of a “silo” – meaning they were off to the side in an organization, maybe not even part of the Information Technology (IT) function. But that is changing. As data science projects are adopted into the mainstream, there is a need for structure. I’ve explained a modern data science structure for integration, called the Team Data Science Process (TDSP). It’s similar to ITIL or the MOF but is designed to handle the processes involved in machine learning, artificial intelligence, and other advanced analytics.
Development and Operations – or DevOps – is not a framework for “doing Information Technology”. It’s really three things: People, Process, and Products. I’ll explain more about DevOps in a later article, but the point is that DevOps overlays the TDSP nicely, and is certainly something you need to think about from the outset. To distill the thought a bit, DevOps can be thought of as a “shift-left” mentality. That means at the very start of the project, you think about the outcomes of each step: coding, building, testing, deployment, security, patching, and so on.
Seems difficult, doesn’t it? It’s actually not. Yes, there is work involved, but once you start, it simply becomes part of the process. And like all good habits, it requires a little effort and maintenance to keep it going. I’ll show you how to implement DevOps in Data Science as we go, but for now, know that it is essential to your data science projects. Essential? Why?
Because security. Because maintenance. Because testing. Because constant technical debt. For these reasons and many more that will become apparent, you need to start thinking about not only the TDSP as you structure your projects, but also DevOps. In this series I’ll show you how.