Backup Cloudera data to Azure Storage

Azure Blob Storage supports an HDFS interface that HDFS clients can access using the wasb:// scheme. The hadoop-azure module, which implements this interface, is distributed with Apache Hadoop but is not configured out of the box in Cloudera. In this blog, we will provide instructions on how to back up Cloudera data to Azure…

Run Jupyter Notebook on Cloudera

In a previous blog, we demonstrated how to enable the Hue Spark notebook with Livy on CDH. Here we will provide instructions on how to run a Jupyter notebook on a CDH cluster. These steps have been verified on a default deployment of a Cloudera CDH cluster on Azure. At the time of this writing, the…

Enable Kerberos on Cloudera with Azure AD Domain Service

In a previous blog series, we documented how to integrate Active Directory deployed in virtual machines on Azure with Cloudera. In that scenario, we need to deploy and maintain the domain controller VMs ourselves. In this article, we will use Azure Active Directory Domain Service (AADDS) to integrate Kerberos and single sign-on with Cloudera. AADDS is a managed service…

Run Hue Spark Notebook on Cloudera

When you deploy a CDH cluster using Cloudera Manager, you can use the Hue web UI to run, for example, Hive and Impala queries. However, the Spark notebook is not configured out of the box. It turns out that installing and configuring Spark notebooks on CDH isn’t as straightforward as the existing documentation describes. In this blog,…
