[Hadoop] Install Hadoop on Ubuntu 12.04 x86_64

A) Install the Java JDK
# chmod u+x jdk-6u43-linux-x64.bin
# ./jdk-6u43-linux-x64.bin
# sudo mkdir -p /usr/lib/jvm
# sudo mv jdk1.6.0_43 /usr/lib/jvm/
# sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.6.0_43/bin/java" 1
# sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.6.0_43/bin/javac" 1
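A quick sanity check that the alternatives now point at the new JDK:
# java -version
# javac -version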
 
B) Add Hadoop User
# sudo addgroup hadoop
# sudo adduser --ingroup hadoop hduser  <-- set a login password when prompted (1qaz2wsx in this walkthrough)
# sudo adduser hduser sudo  <-- add sudo permission
C) Configuring SSH
# su - hduser
# ssh-keygen -t rsa -P ""
# cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
# ssh localhost
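If ssh localhost still asks for a password, overly loose permissions on the key files are the usual culprit; tightening them is safe either way:
# chmod 700 $HOME/.ssh
# chmod 600 $HOME/.ssh/authorized_keys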

D) Disable IPv6
# sudo vi /etc/sysctl.conf
--------------------------
# disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
--------------------------
# Check that IPv6 is disabled (returns 1 when disabled):
# cat /proc/sys/net/ipv6/conf/all/disable_ipv6
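Changes in /etc/sysctl.conf normally take effect at the next reboot; to apply them immediately:
# sudo sysctl -p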

E) Install Hadoop
# Download from: http://apache.stu.edu.tw/hadoop/core/
# cd
# wget http://apache.stu.edu.tw/hadoop/core/hadoop-1.1.2/hadoop-1.1.2.tar.gz
# cd /usr/local
# sudo tar -zxvf ~/hadoop-1.1.2.tar.gz
# sudo mv hadoop-1.1.2 hadoop
# sudo chown -R hduser:hadoop hadoop
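A quick check that the unpacked tree works (version is a standard hadoop subcommand):
# /usr/local/hadoop/bin/hadoop version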

F) Update .bashrc
# cd
# vi .bashrc
--------------------------
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_43

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin
--------------------------
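Reload the file so the new variables and aliases take effect in the current shell:
# source ~/.bashrc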

G) Configuration
1.hadoop-env.sh: add JAVA_HOME
# vi /usr/local/hadoop/conf/hadoop-env.sh
--------------------------
export JAVA_HOME=/usr/lib/jvm/jdk1.6.0_43
--------------------------

2.Create the HDFS temp directory (used as hadoop.tmp.dir below)
# sudo mkdir -p /app/hadoop/tmp
# sudo chown hduser:hadoop /app/hadoop/tmp
# sudo chmod 750 /app/hadoop/tmp

3.core-site.xml
# vi /usr/local/hadoop/conf/core-site.xml
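The post doesn't show the file contents; a minimal single-node sketch (hadoop.tmp.dir and fs.default.name are the standard Hadoop 1.x properties; port 54310 is a common choice here, not mandated) looks like:
--------------------------
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
    <description>URI of the default file system (HDFS on this host).</description>
  </property>
</configuration>
--------------------------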

4.mapred-site.xml
# vi /usr/local/hadoop/conf/mapred-site.xml
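Again not shown in the post; a minimal single-node sketch (mapred.job.tracker is the standard Hadoop 1.x property; port 54311 is a common convention):
--------------------------
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
    <description>Host and port of the MapReduce JobTracker.</description>
  </property>
</configuration>
--------------------------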


5.hdfs-site.xml
# vi /usr/local/hadoop/conf/hdfs-site.xml
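A minimal single-node sketch (dfs.replication is the standard property; 1 avoids under-replication warnings when there is only one DataNode):
--------------------------
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication; one copy on a single-node cluster.</description>
  </property>
</configuration>
--------------------------
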
6.Format Name Node (run once as hduser; re-formatting erases everything in HDFS)
# /usr/local/hadoop/bin/hadoop namenode -format

H) Run & Shutdown
# /usr/local/hadoop/bin/start-all.sh
# /usr/local/hadoop/bin/stop-all.sh
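After start-all.sh, jps (bundled with the JDK) should list the five daemons of a single-node setup: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker.
# jps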

I) Testing...
1.Download input data
# mkdir /tmp/gutenberg
# cd /tmp/gutenberg
# wget http://www.gutenberg.org/cache/epub/4300/pg4300.txt
# wget http://www.gutenberg.org/cache/epub/5000/pg5000.txt
# wget http://www.gutenberg.org/cache/epub/20417/pg20417.txt

2.Start Hadoop
# /usr/local/hadoop/bin/start-all.sh

3.Copy local example data to HDFS
# hadoop dfs -copyFromLocal /tmp/gutenberg /user/hduser/gutenberg
# hadoop dfs -ls /user/hduser/gutenberg

4.Run MapReduce wordcount (from inside /usr/local/hadoop, where the examples jar lives)
# cd /usr/local/hadoop
# bin/hadoop jar hadoop-examples-1.1.2.jar wordcount /user/hduser/gutenberg /user/hduser/gutenberg-output
# bin/hadoop dfs -ls /user/hduser/gutenberg-output
# bin/hadoop dfs -cat /user/hduser/gutenberg-output/part-r-00000
# mkdir /tmp/gutenberg-output
# bin/hadoop dfs -getmerge /user/hduser/gutenberg-output /tmp/gutenberg-output
# head /tmp/gutenberg-output/gutenberg-output
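Note that the wordcount job fails if the output directory already exists in HDFS; when rerunning, remove it first:
# bin/hadoop dfs -rmr /user/hduser/gutenberg-output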

J) WEB UI
1.TaskTracker Web Interface (MapReduce layer)
- http://localhost:50060
2.JobTracker Web Interface (MapReduce layer)
- http://localhost:50030
3.NameNode Web Interface (HDFS layer)
- http://localhost:50070
