Introduction: Native Hadoop 2.6.0 Build on Pi

For this guide I will assume that you are familiar with some of the basic Linux Terminal commands. I will also assume you know how to edit text files although you only need to do that once. Which text editor you want to use is up to you. The guide also assumes you are using the default username (pi) and that everything is going to be downloaded and extracted in the home folder (/home/pi).

The actual build process for Hadoop takes around 90 minutes on a Pi 2. I have not tried it on a Pi B+, so I cannot guarantee that it will work there, but don't be afraid to give it a try.

In addition, it takes around 1 hour extra to fully update the Pi, install some required packages for the build and compile/install protobuf which is also required by Hadoop. If you do everything right the first time, you can expect the whole guide to take you approximately 3 hours to complete, out of which most of the time you will be waiting for packages to be installed and software to compile. As you will see, there is only a handful of commands you actually have to type, so let's get started!

-------------------------------------------------------

This guide applies to a fresh install of Raspbian on a Pi 2 with an Internet connection.

Step 1: Preparing the Build Environment

Update the Pi and install the packages we need to compile Hadoop:

sudo apt-get update && sudo apt-get upgrade -y && sudo apt-get install -y maven build-essential autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev libfuse-dev libsnappy-dev libsnappy-java libbz2-dev oracle-java8-jdk subversion
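If you want to confirm that the packages really got installed before moving on, something like the following rough sketch may help. The function name missing_packages is my own invention for illustration, and it assumes that dpkg -s returns a non-zero status for a package that is not installed.

```shell
# Hypothetical helper (not part of any package): prints every package name
# passed as an argument that dpkg does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        dpkg -s "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Usage sketch with a few of the packages from the command above.
missing_packages maven build-essential cmake libssl-dev libsnappy-dev
```

If nothing is printed, dpkg considers every listed package installed. Any name that is printed should be installed again with apt-get before you continue.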

Switch to Oracle Java 8:

sudo update-alternatives --config java

#Select the number that corresponds to Oracle Java 8 when prompted.

Note: We are using Oracle Java 8 instead of OpenJDK for better performance. You can run this command again and set it to 0 after you complete this guide so your system will go back to using the default Java version.
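To double-check that the switch worked, you can look at the version banner that Java prints. This is only a minimal sketch: is_java8 is a hypothetical helper name, and it assumes the banner contains the version in quotes starting with 1.8, as Java 8 releases do.

```shell
# Hypothetical helper: succeeds if the given version banner reports Java 8,
# i.e. it contains a quoted version string that starts with 1.8.
is_java8() {
    case "$1" in
        *'"1.8.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# java -version writes to stderr, so redirect it before capturing the first line.
if command -v java >/dev/null 2>&1; then
    banner=$(java -version 2>&1 | head -n 1)
    if is_java8 "$banner"; then
        echo "Java 8 is active: $banner"
    else
        echo "Java 8 is NOT active: $banner"
    fi
fi
```

If it reports that Java 8 is not active, run the update-alternatives command again and pick the Oracle Java 8 entry.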

Compile and install protocol buffers (must be v2.5 exactly!):

wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz

tar xzvf protobuf-2.5.0.tar.gz

sudo chown -R pi protobuf-2.5.0/

Note: Compiling and installing protobuf will take around 40 minutes.

cd protobuf-2.5.0

./configure --prefix=/usr

make

make check

sudo make install

cd #Return to home directory
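Since the version must be exactly 2.5, it may be worth checking what got installed before starting the long Hadoop build. The sketch below assumes that protoc --version prints the banner "libprotoc 2.5.0" for this release. The helper name is_protoc_250 is hypothetical.

```shell
# Hypothetical helper: succeeds only if the banner is exactly "libprotoc 2.5.0".
is_protoc_250() {
    [ "$1" = "libprotoc 2.5.0" ]
}

# Only run the check when protoc is actually on the PATH.
if command -v protoc >/dev/null 2>&1; then
    if is_protoc_250 "$(protoc --version)"; then
        echo "protobuf 2.5.0 is installed"
    else
        echo "Unexpected protobuf version: $(protoc --version)"
    fi
fi
```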

--------------------------------------------

We have now set up our build environment so we can move on to Hadoop.

Step 2: Downloading and Patching Hadoop 2.6.0

Download Hadoop 2.6.0 sources and patch for ARMHF:

wget http://apache.mirrors.spacedump.net/hadoop/core/hadoop-2.6.0/hadoop-2.6.0-src.tar.gz

tar xzvf hadoop-2.6.0-src.tar.gz

sudo chown -R pi hadoop-2.6.0-src/

cd hadoop-2.6.0-src/

-------------------------------------------------

Note: The Javadoc tool in Java 8 checks documentation comments much more strictly than previous versions did (a feature called doclint), and some of the Hadoop source files do not pass these checks. We are telling Java 8 to skip the checks so we won't have to go and fix the comments in the source files.

Open the pom.xml file in hadoop-2.6.0-src and find the part that starts with <properties> at line 81. This section defines global properties for the build process. Between the tags <properties></properties>, add the following line along with the other lines:

<additionalparam>-Xdoclint:none</additionalparam>

#Save the file and close it.
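If you would rather not edit the file by hand, the same change can be scripted. The following is a sketch under assumptions, not a tested part of the original procedure: add_doclint_param is a hypothetical helper name, and it simply adds the line right after the first line containing <properties>, which matches the layout described above.

```shell
# Hypothetical helper: inserts the doclint property after the first
# <properties> line of the given pom.xml, unless it is already there.
add_doclint_param() {
    pom="$1"
    if [ ! -f "$pom" ]; then
        echo "File not found: $pom" >&2
        return 1
    fi
    # Skip the edit if the property was added earlier.
    if grep -q -- '-Xdoclint:none' "$pom"; then
        return 0
    fi
    # Print every line; after the first <properties> line, print the new one.
    awk '
        { print }
        !done && /<properties>/ {
            print "    <additionalparam>-Xdoclint:none</additionalparam>"
            done = 1
        }
    ' "$pom" > "$pom.tmp" && mv "$pom.tmp" "$pom"
}
```

From inside the hadoop-2.6.0-src directory you would call it as add_doclint_param pom.xml. Running it twice is harmless because the second call finds the line and does nothing.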

We need to patch the build instructions for ARMHF systems like the Pi:

cd hadoop-common-project/hadoop-common/src

wget https://issues.apache.org/jira/secure/attachment/12570212/HADOOP-9320.patch

patch < HADOOP-9320.patch

#We have now patched the JNIFlags.cmake file. If for some reason it does not get patched automatically and it asks you for the file location, you can provide the path to the JNIFlags.cmake file manually.

Step 3: Building Hadoop 2.6.0

Let's build an ARMHF native Hadoop:

#Return to the hadoop-2.6.0-src directory (cd ~/hadoop-2.6.0-src) and enter this command:

sudo mvn package -Pdist,native -DskipTests -Dtar

Note: At this point Maven (our build tool) will start downloading additional resources from the Internet and will compile Hadoop. It takes approximately 90 minutes.

-------------------------------------------

The compiled Hadoop can be found at: hadoop-2.6.0-src/hadoop-dist/target/hadoop-2.6.0

There will also be a tar file of your native Hadoop build at: hadoop-2.6.0-src/hadoop-dist/target/hadoop-2.6.0.tar.gz

-------------------------------------------

To test your build you can try the following from: hadoop-2.6.0-src/hadoop-dist/target/hadoop-2.6.0

bin/hadoop checknative -a

bin/hadoop version
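The checknative output lists each native library with true or false next to it. If you want just the names of the libraries that failed to load, a small filter like the one below may help. This is a sketch: missing_native_libs is a hypothetical helper name, and it assumes each result line has the form "name: true|false details", for example "zlib: true /lib/...".

```shell
# Hypothetical helper: reads checknative output on stdin and prints the
# name of every library whose status column is "false".
missing_native_libs() {
    awk '$1 ~ /:$/ && $2 == "false" { sub(/:$/, "", $1); print $1 }'
}

# Only run when called from the build output directory, where bin/hadoop exists.
if [ -x bin/hadoop ]; then
    bin/hadoop checknative -a 2>&1 | missing_native_libs
fi
```

If nothing is printed, every library on the list was reported as true. A name that does show up usually points to a missing -dev package from Step 1.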