Posts Tagged “cloudera”

Kafka Kerberos Enable and Testing.

Apache Kafka is a distributed streaming platform. Kafka 2.0 supports Kerberos authentication; this post covers enabling Kerberos authentication using the wizard in Cloudera Manager. Courtesy - Apache Kafka

Written on May 16, 2017
linux centos redhat cloudera kafka kerberos cluster


Cloudera Manager - Duplicate entry 'zookeeper' for key 'NAME'.

We had recently built a cluster using the Cloudera API and had all the services running on it with Kerberos enabled. Next, we had a requirement to add another Kafka cluster to our existing cluster in Cloudera Manager. Since getting ZooKeeper and Kafka up and running is a quick task, we decided to do it through Cloudera Manager instead of the API. But we hit the Duplicate entry 'zookeeper' for key 'NAME' issue described in the bug below.

https://issues.cloudera.org/browse/DISTRO-790

Written on May 14, 2017
linux centos redhat cloudera kafka zookeeper cluster


Parcel Not Distributing Cloudera CDH.

We were deploying one of our clusters in our lab environment, which is used by everyone, so the lab has its own share of stale information on it.

Written on March 8, 2017
linux centos redhat cloudera hadoop cluster


Enable Kerberos Using Cloudera API.

The Python API for Cloudera is really nice: apart from getting the cluster set up, we can also do configuration and automation. We do a lot of automation with Chef/Ansible, but the Cloudera API gives more control over the cluster.

Written on February 26, 2017
linux cloudera hadoop cloudera-api kerberos


Setting Up HDFS Services Using Cloudera API [Part 3]

This is the second follow-up post. In the earlier post

Written on February 8, 2017
linux cloudera hadoop cloudera-api zookeeper hdfs


Setting Up Zookeeper Services Using Cloudera API [Part 2]

This is the second post in the series. In the earlier post, Setting Up Cloudera Manager Services Using Cloudera API [Part 1], we installed the Cloudera management services. Now we will install the ZooKeeper service on the cluster.

Written on February 2, 2017
linux cloudera hadoop cloudera-api zookeeper


Setting Up Cloudera Manager Services Using Cloudera API [Part 1]

The Cloudera API is a very convenient way to set up a cluster and do more.

Written on January 25, 2017
linux cloudera hadoop cloudera-api


Getting Started with Cloudera API

These are the basic steps to connect to Cloudera Manager.

Written on January 14, 2017
linux cloudera hadoop cloudera-api


Basic Testing On Hadoop Environment [Cloudera]

This is a set of tests we can run on a Hadoop environment. They are basic tests to make sure the environment is set up correctly.

Written on January 13, 2017
linux cloudera hadoop testing


Setting Hue to Listen on `0.0.0.0` [Cloudera]

We were working on setting up a cluster, but the Hue URL was set to the server's private IP, because we had set up all the nodes to reach each other over private IPs. We wanted Hue to bind to the public interface so that it could be accessed within the network.

Written on January 12, 2017
linux cloudera hadoop


Update Cloudera Manager to specific version [5.4.5]

Upgrading Cloudera Manager 5 to the latest Cloudera Manager: in most cases it is possible to complete the upgrade without shutting down most CDH services, although you may need to stop some dependent services. CDH daemons can continue running, unaffected, while Cloudera Manager is upgraded. The upgrade process does not affect your CDH installation. After upgrading Cloudera Manager, you may also want to upgrade CDH 5 clusters to CDH 5.4.5 or the latest version.

Written on October 22, 2015
linux hadoop upgrade cloudera manager


Getting started with Hive with Kerberos.

Apache Hive is a powerful data warehousing application built on top of Hadoop; it enables you to access your data using HiveQL, a language that is similar to SQL. Install Hive on the client machine(s) from which you submit jobs; you do not need to install it on the nodes in your Hadoop cluster. If Kerberos authentication is used, authentication is supported between the Thrift client and HiveServer2, and between HiveServer2 and secure HDFS.

Written on October 20, 2015
linux hadoop hive kerberos ad ldap cloudera security


Setting up Pentaho Data Integration 5.4.1 with Hadoop Cluster (Cloudera Manager)

Pentaho Data Integration (PDI) is part of Pentaho, a suite of open source Business Intelligence (BI) products that provide data integration, OLAP services, reporting, dashboarding, data mining and ETL capabilities. Here we set up the Pentaho server using the steps below. We will refer to Pentaho Data Integration as PDI from now on.

Written on October 6, 2015
linux hadoop pentaho data-integration cloudera manager