
Tutorial: Installing an Apache Hadoop Single Node Cluster with the Hortonworks Data Platform

24.12.2012 | 5 minutes of reading time

In this tutorial I will show you a complete way to install your own small Hadoop single node cluster with the Hortonworks Data Platform inside VirtualBox. After the easy setup you can play around with the cluster and gain some experience with it without having to set up a new machine. It can also serve as a local development environment in which you can debug your Map/Reduce jobs. The Hortonworks Data Platform is a 100% open source Apache Hadoop distribution and comes with the following components:

  • Hadoop Distributed File System (HDFS)
  • MapReduce
  • Apache Pig
  • Apache Hive
  • Apache HCatalog
  • Templeton
  • Apache HBase
  • Apache ZooKeeper
  • Apache Oozie
  • Apache Sqoop
  • Ganglia
  • Nagios

This tutorial is based on this quick start guide. It is recommended to have a fast internet connection during the HMC setup; otherwise you may run into problems with Puppet timeouts. In that case you can try to pre-install some of the RPMs. Have a look at this thread in the Hortonworks forum.

Install VirtualBox

  1. The first step is the installation of the VirtualBox software, which can be downloaded here. Choose the installation binaries for your operating system.
  2. Install VirtualBox with the default options.
  3. Download the ISO for CentOS 6.3 from your favourite mirror (for example, this one).
  4. Add the ISO file to your VirtualBox. You will find detailed setup instructions here.
  5. Before you start the virtual machine, make sure that you configure the following settings:
    • Main memory: 4096 MB
    • Disk space: 16 GB
    • Enable the bridged network adapter
    • Enable IOAPIC
  6. Start the virtual machine.
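If you prefer the command line over the VirtualBox GUI, the settings above can also be applied with VBoxManage. This is only a sketch: the VM name “hdp-sandbox”, the host interface name “eth0”, and the ISO filename are assumptions and will differ on your system.

```shell
# Create and register a new 64-bit Red Hat style VM (the name is an example)
VBoxManage createvm --name hdp-sandbox --ostype RedHat_64 --register

# 4096 MB RAM, I/O APIC enabled, first NIC bridged to a host interface
VBoxManage modifyvm hdp-sandbox --memory 4096 --ioapic on \
  --nic1 bridged --bridgeadapter1 eth0

# Create a 16 GB disk and attach it together with the CentOS 6.3 ISO
VBoxManage createhd --filename hdp-sandbox.vdi --size 16384
VBoxManage storagectl hdp-sandbox --name SATA --add sata
VBoxManage storageattach hdp-sandbox --storagectl SATA --port 0 \
  --device 0 --type hdd --medium hdp-sandbox.vdi
VBoxManage storageattach hdp-sandbox --storagectl SATA --port 1 \
  --device 0 --type dvddrive --medium CentOS-6.3-x86_64-minimal.iso

# Boot the VM; the CentOS installer should come up from the ISO
VBoxManage startvm hdp-sandbox
```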

See also the screenshots below:

Install CentOS

  1. When everything is working correctly, CentOS will start the installation process.
  2. Choose “Install or upgrade an existing system” from the list.
  3. Skip the media test.
  4. For the hostname, leave the default “localhost.localdomain”.
  5. Choose the installation type “Minimal Desktop”.
  6. Create a user for the cluster (e.g. hadoop).
  7. After the successful setup, reboot your virtual system and log in as root.

Prepare the HMC Single Node Cluster Setup

  1. Change the keyboard layout to the correct language through “System->Administration->Keyboard”.
  2. Disable the firewall:

    chkconfig iptables off
    chkconfig ip6tables off

  3. Disable SELinux by editing its configuration file:

    vi /etc/selinux/config

    Change SELINUX=enforcing to SELINUX=disabled.
  4. Configure ntpd to start at boot:

    chkconfig ntpd on

  5. Edit the file “/etc/hosts” so that it looks like in the following screenshot. It is important that the first entry is “localhost.localdomain”; otherwise the HMC setup will not work, because you will run into a problem with the hostname resolution.
  6. Type “hostname -f” in the terminal. It should print “localhost.localdomain”.
  7. Type “hostname -s” in the terminal. It should print “localhost”.
  8. Start the SSH service:

    /sbin/service sshd start

  9. Make sure that sshd is started automatically on startup:

    chkconfig sshd on

  10. Prepare password-less SSH login for the root user to localhost:

    ssh-keygen
    ssh-copy-id localhost
    chmod 700 ~/.ssh
    chmod 640 ~/.ssh/authorized_keys

  11. Check that the password-less login works:

    ssh localhost

  12. Create a text file “hostdetail.txt” with the host names that will be part of your cluster. In our example with only one node, it should contain only this entry:

    localhost.localdomain

  13. If you want to use a GUI editor to edit the file, you will get this error. Just install your favourite editor, e.g. gedit, and follow the instructions.
  14. After this preparation it is recommended to take a snapshot of your current system so that you can come back to this point if something goes wrong with the installation.
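For repeated setups, the preparation commands above can be collected into a single root shell script. This is a sketch of exactly the commands listed, with two additions for non-interactive use: the sed edit replaces the manual vi step, and the ssh-keygen flags skip its prompts (ssh-copy-id will still ask for the root password once). Reboot afterwards so the SELinux change takes effect.

```shell
#!/bin/sh
# Disable the firewall on boot
chkconfig iptables off
chkconfig ip6tables off

# Disable SELinux (takes effect after the next reboot)
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

# Start ntpd and sshd on every boot; start sshd now
chkconfig ntpd on
chkconfig sshd on
/sbin/service sshd start

# Password-less SSH for root to localhost
ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
ssh-copy-id localhost
chmod 700 /root/.ssh
chmod 640 /root/.ssh/authorized_keys

# Host list for the HMC cluster wizard
echo "localhost.localdomain" > hostdetail.txt
```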

Install Hortonworks Data Platform with HMC

  1. Download the HDP repository RPM (please check on this page whether there is a newer version):

    rpm -Uvh http://public-repo-1.hortonworks.com/HDP-1.1.1.16/repos/centos6/hdp-release-1.1.1.16-1.el6.noarch.rpm

  2. Install the “Extra Packages for Enterprise Linux (EPEL)” repository:

    yum install epel-release

  3. Install HMC:

    yum install hmc

  4. Check the installation status with:

    rpm -qa | grep hmc

  5. Start the HMC service. You will be prompted to agree to the Oracle Java license and download the binaries.

    service hmc start

  6. Stop the firewall:

    /etc/init.d/iptables stop

  7. Proceed to the final installation step.
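Before moving on to the wizard, a quick sanity check can save a failed provisioning run. This is a sketch; the package names match the HDP 1.1.1.16 RPMs used above, and the curl probe simply checks that the HMC web UI answers at all.

```shell
# Repository and HMC packages present?
rpm -qa | grep hdp-release
rpm -qa | grep hmc

# Is the HMC service running?
service hmc status

# Does the HMC web UI respond? (200 means it is reachable)
curl -s -o /dev/null -w "%{http_code}\n" http://localhost/hmc/html
```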

Provisioning Your Cluster

  1. Go to the main page of the Hortonworks Management Center (HMC). If you access it from outside the VM, replace “localhost” with the IP of your virtual machine:

    http://localhost/hmc/html

  2. Follow the wizard instructions.
  3. When you are prompted to specify the disk mount point, choose a different one than the wizard proposes, for example “/data”.
  4. When the installation was successful, you should see this screen 🙂
  5. If there is an error, the following log files may be helpful for troubleshooting:

    /var/log/hmc/hmc.log
    /var/log/puppet_apply.log

  6. You can now go to the dashboard and check the status of your cluster.
  7. To shut down your cluster safely, stop all services in the HMC first; then you can stop your virtual machine.
  8. After a restart of the system, you can start HMC again by issuing the following commands:

    service hmc start
    service hmc-agent start

  9. To run the HMC service on startup, follow the steps described here (optional).
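On CentOS 6 the optional last step (starting HMC automatically at boot) can typically be approximated with chkconfig. This is an assumption based on the init scripts the hmc and hmc-agent packages install, not a summary of the linked instructions, so verify it against your installation.

```shell
# Register the init scripts and enable them for the default runlevels
chkconfig --add hmc
chkconfig --add hmc-agent
chkconfig hmc on
chkconfig hmc-agent on
```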

You can now start playing around with your own Hadoop cluster. If you run into problems with the setup, refer to the documentation or just leave a comment here. Merry X-Mas 🙂

