Today, I want to give a shout-out to one of our partners that has a great offering for Azure Data Lake Store customers.
When you ingest large-scale data into a data lake, the data often needs transformations such as cleaning and filtering. StreamSets Data Collector™ is open source software for building and deploying modern data-in-motion flows. You can connect a variety of sources to Azure Data Lake Store with minimal custom coding, even in the face of inevitable changes in data schemas.
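To make "cleaning and filtering" concrete, here is a minimal sketch in plain Python of the kind of per-record transformation an ingest flow applies. It does not use StreamSets Data Collector or any of its APIs; the field names (`id`, `amount`), the `REQUIRED_FIELDS` tuple, and the `min_amount` threshold are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical fields that every usable record must carry.
REQUIRED_FIELDS = ("id", "amount")


def clean(record: dict) -> Optional[dict]:
    """Trim string values and coerce 'amount' to float.

    Returns None when a required field is missing or empty, or when
    'amount' is not numeric. Unknown extra fields are passed through,
    so a record with a newly added column is still accepted.
    """
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if any(cleaned.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return None
    try:
        cleaned["amount"] = float(cleaned["amount"])
    except (TypeError, ValueError):
        return None
    return cleaned


def transform(records: Iterable[dict], min_amount: float = 0.0) -> Iterator[dict]:
    """Yield cleaned records whose amount is at least min_amount."""
    for record in records:
        cleaned = clean(record)
        if cleaned is not None and cleaned["amount"] >= min_amount:
            yield cleaned


if __name__ == "__main__":
    rows = [
        {"id": " a1 ", "amount": "12.5"},
        {"id": "a2", "amount": "oops"},                  # dropped: not numeric
        {"id": "a3", "amount": "-1", "region": "EU"},    # dropped: below threshold
    ]
    for row in transform(rows):
        print(row)
```

Passing unknown fields through rather than rejecting them is one simple way to tolerate schema changes; a real pipeline would typically also route rejected records somewhere for inspection instead of dropping them silently.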
To find out more about how StreamSets Data Collector can help you ingest data into Azure Data Lake Store, watch this webinar on Next Gen Analytics at a Major Bank Using Azure Data Lake and StreamSets. And check out this new StreamSets tutorial, which offers a great step-by-step guide and accompanying video for quickly building pipelines that filter and transform data as it is ingested into Azure Data Lake Store.
Azure Data Lake Store is fully integrated with Azure HDInsight. You can also deploy StreamSets Data Collector on top of Azure HDInsight to enable real-time monitoring and data flow operations for your HDInsight cluster-based analytics.
Azure Data Lake Store (ADLS) powers storage for cloud big data analytics in Azure and offers a secure, cloud-scale hierarchical file system compatible with the Apache Hadoop Distributed File System (HDFS). We’ll keep you in the loop here as more innovative solutions like StreamSets enable customers to easily build big data platforms with Azure Data Lake.