Intro: Install and Configure Hadoop on OS X
Installing Hadoop on OS X
I decided to set up a Hadoop cluster on the Macs I run, mainly because Xgrid is no longer available in the new version of OS X. I have set up SGE clusters, Xgrid, and Microsoft Cluster Server before, so I wanted to get Hadoop under my belt as well. This isn’t the definitive guide, but it worked fairly well for me. I am still not sure of some of the concepts, but that will come with practice.
The first step is to make sure you have the basics.
You need the Xcode command line tools and the Java Developer package for your version of OS X.
Let’s first create a group and a user on every machine.
Create a group named ‘hadoop’ and then add an admin user ‘hadoopadmin’ to the group.
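The group step can also be done from the command line. This is only a sketch: it assumes the hadoopadmin account already exists (created as an admin user in System Preferences, for example) and uses the macOS `dseditgroup` tool. The `run` wrapper and the `DRY_RUN` variable are helpers made up for this example, so the commands are only printed until you switch dry-run off.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command.
# Set DRY_RUN=0 on a Mac to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

create_hadoop_group() {
  # Create the local group named hadoop.
  run sudo dseditgroup -o create hadoop
  # Add the existing hadoopadmin account to that group.
  run sudo dseditgroup -o edit -a hadoopadmin -t user hadoop
}

create_hadoop_group
```

Run it once in dry-run mode on each machine to check the commands, then again with `DRY_RUN=0`.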
Let’s do everything as hadoopadmin to make it easy.
You can download Hadoop and install it yourself, but I took a shortcut and used Homebrew to install it.
-> brew install hadoop
This sets your environment paths in the proper Hadoop config files, which is a help.
Once it is installed, let’s set up the config files in Hadoop.
I named my first two machines hadoop01 and hadoop02.
Configure the masters and slaves files on all machines.
Also configure /etc/hosts on all machines.
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
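As a rough sketch of what these files might contain, the script below writes them into a scratch directory for review instead of touching the real locations. Everything specific in it is an assumption: that hadoop01 acts as the master, that both machines run worker daemons, the IP addresses, and the `CONF_OUT` staging directory. Substitute your own values.

```shell
#!/bin/sh
# CONF_OUT is a hypothetical staging directory, not Hadoop's real config path.
CONF_OUT="${CONF_OUT:-./hadoop-conf-staging}"
mkdir -p "$CONF_OUT"

# Assumption: hadoop01 is the master and both machines act as workers.
printf '%s\n' hadoop01 > "$CONF_OUT/masters"
printf '%s\n' hadoop01 hadoop02 > "$CONF_OUT/slaves"

# Hypothetical addresses; replace them with the real IPs of your machines.
hosts_entries() {
  printf '%s\t%s\n' "192.168.1.101" "hadoop01"
  printf '%s\t%s\n' "192.168.1.102" "hadoop02"
}

# Stage the lines that would be appended to /etc/hosts.
hosts_entries > "$CONF_OUT/hosts.append"
```

After checking the staged files, copy masters and slaves into the Hadoop config directory and append the host lines with something like `sudo tee -a /etc/hosts < "$CONF_OUT/hosts.append"`.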
I am using 2.4.0 so they are located in
I changed these two lines.
export JAVA_HOME=`/usr/libexec/java_home -v 1.6`
#export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
This last one stopped an error I was getting upon startup.
Insert this configuration
Now let’s create a few Hadoop directories.
-> hadoop -mkdir tmp
-> hadoop -mkdir hdfs
-> hadoop -mkdir hdfs/name
-> hadoop -mkdir hdfs/data
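A note on the commands above: the usual HDFS form is `hadoop fs -mkdir`, which needs a running HDFS, and the name node has not been formatted yet at this point. If the directories are meant to be plain local folders for Hadoop to use (an assumption; the post does not say where they live), a minimal sketch with ordinary `mkdir` would be the following. `HADOOP_BASE` is a made-up variable for the parent directory.

```shell
#!/bin/sh
# HADOOP_BASE is a hypothetical parent directory; it defaults to the current one.
HADOOP_BASE="${HADOOP_BASE:-.}"

# -p creates parent directories as needed and is harmless if they already exist.
mkdir -p "$HADOOP_BASE/tmp" \
         "$HADOOP_BASE/hdfs/name" \
         "$HADOOP_BASE/hdfs/data"
```

Whatever location you pick has to match the paths used in your Hadoop configuration files.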
I enabled passwordless SSH on all machines.
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
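The two commands above only authorize a machine to log in to itself. For the machines to reach each other, each public key also has to land in the other machine's authorized_keys. Here is a small sketch of a helper for that; `add_key` is a made-up function, not part of OpenSSH. It appends a key only if it is not already present and tightens the permissions that sshd normally expects.

```shell
#!/bin/sh
# add_key PUBLIC_KEY_FILE AUTHORIZED_KEYS_FILE
# Hypothetical helper: append a public key once and fix permissions.
add_key() {
  pub_file="$1"
  auth_file="$2"
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -x matches whole lines only, -F treats the key as a fixed string.
  if ! grep -qxF -e "$(cat "$pub_file")" "$auth_file"; then
    cat "$pub_file" >> "$auth_file"
  fi
  chmod 700 "$(dirname "$auth_file")"
  chmod 600 "$auth_file"
}

# Example, run on hadoop02 after copying hadoop01's public key over as hadoop01.pub:
# add_key hadoop01.pub ~/.ssh/authorized_keys
```

Running it twice with the same key leaves a single entry, so it is safe to repeat on every machine.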
I found info on this at
I then formatted the name node
-> hadoop namenode -format
Then started hadoop by running
I did all of this on every machine, although I think some of the steps do not need to be repeated everywhere.
I have to thank
For tutorials and help getting through this.