RDBMS Vs Hadoop

 

Hi,

Thanks for stopping by. Let me shamelessly admit that I was not sure what HDInsight is, what Hadoop is, or what Big Data is, or why someone should use it instead of a traditional RDBMS. I have spent most of my time in the web development space, working with MVC, WPF and other .NET client-side technologies on top of a SQL-based RDBMS. I had heard about Hadoop in bits and pieces from many of my industry friends. Big Data or Hadoop is not the answer to every problem you have. You need to know when to use it and when not to use it before you settle on it. So I thought I would write a short post for developers like me who want to pick it up quickly, ideally in about a page.

Now that I’m trying things out on Azure, here is how I understand it: Big Data really does mean big data. It is a collection of datasets so large that they cannot be processed using traditional computing techniques. Big Data is not merely data; it has become a complete subject in its own right, involving various tools, techniques and frameworks.

Why Big Data?

  • New Data Sources
  • Non-traditional Data Sources
  • Large Data Volumes

So, the other question is when to use an RDBMS as opposed to Hadoop. The table below summarizes when to go with Hadoop vs an RDBMS.

[Image: table comparing RDBMS and Hadoop]

Apache Hadoop is an open source software project that enables distributed processing of large data sets across clusters of commodity servers. It is designed to scale from a single server to thousands of machines, with a very high degree of fault tolerance.
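To make "distributed processing" a little more concrete, here is a minimal sketch of the map, shuffle and reduce steps that Hadoop's MapReduce model is built around, using the classic word count. This is plain Python running on one machine, not the Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are made up for illustration. On a real cluster, each chunk would be handled by a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_chunks):
    """Shuffle step: group all emitted values by key across every chunk."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(chunks):
    # Illustrative only: here the chunks are mapped one after another in a
    # single process; a cluster would map them in parallel on separate nodes.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical input, split into two chunks as a cluster might split a file.
    sample_chunks = ["big data is big", "hadoop processes big data"]
    print(word_count(sample_chunks))
```

The point of the split is that the map and reduce steps only ever look at one chunk or one key at a time, which is what lets the same job spread over many machines as the data grows.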

Microsoft Azure HDInsight is a 100% Apache Hadoop-based service in the Azure cloud. It offers all the advantages of Hadoop, plus the ability to integrate with Excel, your on-premises Hadoop clusters, and the Microsoft ecosystem of business software and services. It can scale to petabytes on demand to process unstructured and semi-structured data, and you can develop against it on the platform of your choice, such as Java or .NET. You can spin up a Hadoop cluster in minutes.

Find out more:

https://azure.microsoft.com/en-in/services/hdinsight/

https://azure.microsoft.com/en-in/documentation/articles/hdinsight-hadoop-tutorial-get-started-windows/

Hope that helps to get started.

Cheers,

Goutham