
Installing Hadoop 2.6.2

坚持学习hadoop 210


1. Extract the archive: tar -xzvf hadoop-2.6.2.tar.gz

2. Create the directories. First create the following folders on the local file system: ~/hadoop/tmp, ~/dfs/data, ~/dfs/name.
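The three folders can be created in one command with `mkdir -p`, which also creates missing parent directories and does not fail if a folder already exists. This is a minimal sketch: `DEMO_BASE` is a hypothetical variable introduced only for illustration. When it is unset, the folders go into a throwaway temporary directory so the sketch is safe to try; set `DEMO_BASE="$HOME"` to create them where the step above describes.

```shell
# DEMO_BASE is a hypothetical variable, not part of Hadoop.
# Unset: use a temporary directory. Set to "$HOME": create the real folders.
BASE="${DEMO_BASE:-$(mktemp -d)}"

# -p creates parent directories as needed and tolerates existing ones.
mkdir -p "$BASE/hadoop/tmp" "$BASE/dfs/data" "$BASE/dfs/name"

# List the three directories to confirm they exist.
ls -d "$BASE/hadoop/tmp" "$BASE/dfs/data" "$BASE/dfs/name"
```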

3. Edit the configuration files. They are all under the ~/hadoop/etc/hadoop folder and can be edited with gedit:

~/hadoop/etc/hadoop/hadoop-env.sh

~/hadoop/etc/hadoop/yarn-env.sh

~/hadoop/etc/hadoop/slaves

~/hadoop/etc/hadoop/core-site.xml

~/hadoop/etc/hadoop/hdfs-site.xml

~/hadoop/etc/hadoop/mapred-site.xml

~/hadoop/etc/hadoop/yarn-site.xml

4. Change into the Hadoop configuration directory

[spark@S1PA11 hadoop-2.6.0]$ cd etc/hadoop/

[spark@S1PA11 hadoop]$ ls

capacity-scheduler.xml hadoop-env.sh httpfs-env.sh kms-env.sh mapred-env.sh ssl-client.xml.example

configuration.xsl hadoop-metrics2.properties httpfs-log4j.properties kms-log4j.properties mapred-queues.xml.template ssl-server.xml.example

container-executor.cfg hadoop-metrics.properties httpfs-signature.secret kms-site.xml mapred-site.xml yarn-env.cmd

core-site.xml hadoop-policy.xml httpfs-site.xml log4j.properties mapred-site.xml.template yarn-env.sh

hadoop-env.cmd hdfs-site.xml kms-acls.xml mapred-env.cmd slaves yarn-site.xml

4.1. Configure hadoop-env.sh: set JAVA_HOME

# The java implementation to use.

export JAVA_HOME=/usr/local/java/jdk1.7.0_79

4.2. Configure yarn-env.sh: set JAVA_HOME

# some Java parameters

export JAVA_HOME=/usr/local/java/jdk1.7.0_79
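If the same JAVA_HOME edit has to be made on several machines, it can be scripted instead of done by hand in an editor. The sketch below is one possible approach, not the method used in this guide: it rewrites an existing `export JAVA_HOME=` line, or appends one when the file has no active line (for example because it is commented out). It runs against a temporary stand-in file; `JDK_DIR` and `ENV_FILE` are names chosen for illustration.

```shell
JDK_DIR=/usr/local/java/jdk1.7.0_79   # JDK path used in this guide

# Stand-in for etc/hadoop/hadoop-env.sh so the sketch can run anywhere.
ENV_FILE="$(mktemp)"
printf '%s\n' '# The java implementation to use.' 'export JAVA_HOME=${JAVA_HOME}' > "$ENV_FILE"

if grep -q '^export JAVA_HOME=' "$ENV_FILE"; then
    # Rewrite the existing line in place, keeping a .bak copy of the original.
    sed -i.bak "s|^export JAVA_HOME=.*|export JAVA_HOME=$JDK_DIR|" "$ENV_FILE"
else
    # No active line yet: append one at the end of the file.
    printf 'export JAVA_HOME=%s\n' "$JDK_DIR" >> "$ENV_FILE"
fi

# Show the resulting line.
grep '^export JAVA_HOME=' "$ENV_FILE"
```

To apply it for real, point `ENV_FILE` at hadoop-env.sh or yarn-env.sh instead of the temporary file.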

4.3. Configure the slaves file: add the slave nodes

node2

node3

node4

node5

4.4. Configure core-site.xml: add the Hadoop core settings (the temporary directory is file:/root/hadoop-2.6.2/tmp)

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://node1:8020</value>

</property>

<property>

<name>io.file.buffer.size</name>

<value>131072</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>file:/root/hadoop-2.6.2/tmp</value>

<description>A base for other temporary directories.</description>

</property>

<property>

<name>hadoop.proxyuser.u0.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.u0.groups</name>

<value>*</value>

</property>

</configuration>
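After editing a file like this, it can help to list the settings as plain name=value pairs and compare them against what was intended. The following is a rough sketch using only `sed` and `paste`; it assumes every `<name>` and `<value>` element sits on its own line, and it runs against a temporary stand-in file that holds two of the properties above.

```shell
# Stand-in file for illustration; on a real node point CONF at
# etc/hadoop/core-site.xml instead.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://node1:8020</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
EOF

# Print the text inside each <name> and <value> element, one per line,
# then join every two lines with "=".
PAIRS="$(sed -n -e 's|.*<name>\(.*\)</name>.*|\1|p' \
                -e 's|.*<value>\(.*\)</value>.*|\1|p' "$CONF" | paste -d= - -)"
echo "$PAIRS"
```

This only reads the file as text. On a node where Hadoop is installed, `hdfs getconf -confKey fs.defaultFS` reports the value that Hadoop itself resolves.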

4.5. Configure hdfs-site.xml: add the HDFS settings (NameNode and DataNode ports and directory locations)

<configuration>

<property>

<name>dfs.namenode.http-address</name>

<value>node1:50070</value>

</property>

<property>

<name>dfs.namenode.secondary.http-address</name>

<value>node1:9001</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/root/hadoop-2.6.2/dfs/name</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/root/hadoop-2.6.2/dfs/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<property>

<name>dfs.webhdfs.enabled</name>

<value>true</value>

</property>

</configuration>
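When properties are pasted in from several sources, the same name can end up in a file twice, and such a repeat may go unnoticed. A quick text-level check is sketched below. It assumes each `<name>` element is on its own line, and it uses a temporary stand-in file in which one name is deliberately repeated.

```shell
# Stand-in file for illustration; on a real node point CONF at
# etc/hadoop/hdfs-site.xml instead.
CONF="$(mktemp)"
cat > "$CONF" <<'EOF'
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node1:50090</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node1:9001</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
EOF

# Extract every property name, then keep only names that occur more than once.
DUPLICATES="$(sed -n 's|.*<name>\(.*\)</name>.*|\1|p' "$CONF" | sort | uniq -d)"
echo "duplicate properties: ${DUPLICATES:-none}"
```

For the stand-in file this reports dfs.namenode.secondary.http-address. A file without repeated names reports "none".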

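4.6. Configure mapred-site.xml: add the MapReduce settings (use the YARN framework; JobHistory server address and web address)

Note: version 2.7.2 ships only mapred-site.xml.template, not mapred-site.xml. Copy the template to mapred-site.xml and edit the copy.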

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>mapreduce.jobhistory.address</name>

<value>node1:10020</value>

</property>

<property>

<name>mapreduce.jobhistory.webapp.address</name>

<value>node1:19888</value>

</property>

</configuration>
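For distributions that ship only the template mentioned in the note at the start of this step, the copy can be made conditional so that an existing mapred-site.xml is never overwritten. A minimal sketch, run here in a temporary stand-in directory (`CONF_DIR` is a name chosen for illustration):

```shell
# Stand-in for etc/hadoop, containing only the template file.
CONF_DIR="$(mktemp -d)"
printf '<configuration>\n</configuration>\n' > "$CONF_DIR/mapred-site.xml.template"

# Copy the template only when mapred-site.xml does not exist yet.
if [ ! -f "$CONF_DIR/mapred-site.xml" ]; then
    cp "$CONF_DIR/mapred-site.xml.template" "$CONF_DIR/mapred-site.xml"
fi

ls "$CONF_DIR"
```

Running it a second time leaves the existing mapred-site.xml untouched, so edits made after the first copy are preserved.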

4.7. Configure yarn-site.xml: enable the YARN services

<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.resourcemanager.address</name>

<value>node1:8032</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>node1:8030</value>

</property>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>node1:8031</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>node1:8033</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>node1:8088</value>

</property>

</configuration>

5. Copy the configured Hadoop directory to the other slave machine

[spark@S1PA11 opt]$ scp -r hadoop-2.6.2/ node2@192.168.44.130:/root
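With several slave nodes, the copy has to be repeated once per host. One way to avoid typing it each time is to loop over the slaves file. The sketch below is a dry run that only prints the commands it would execute. It uses a temporary stand-in slaves file with the hosts from step 4.3, and `REMOTE_USER=root` is an assumption made for illustration, so adjust it to the account that owns the target directory.

```shell
# Stand-in for etc/hadoop/slaves with the hosts from step 4.3.
SLAVES_FILE="$(mktemp)"
printf '%s\n' node2 node3 node4 node5 > "$SLAVES_FILE"

REMOTE_USER=root   # assumption: the remote account to copy as

# Build one scp command per non-empty line of the slaves file.
COMMANDS="$(while read -r host; do
    [ -n "$host" ] || continue
    echo "scp -r hadoop-2.6.2/ $REMOTE_USER@$host:/root"
done < "$SLAVES_FILE")"

# Dry run: print the commands instead of executing them.
echo "$COMMANDS"
```

Once the printed commands look right, they can be run by piping the output to `sh`.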

IV. Verification

1. Format the NameNode:

cd hadoop-2.6.2/

ls

bin dfs etc include input lib libexec LICENSE.txt logs NOTICE.txt README.txt sbin share tmp

./bin/hdfs namenode -format
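Formatting is meant to be done once. Formatting a NameNode that already holds metadata discards that metadata, so a guard before the command can be useful. The sketch below relies on the fact that a formatted name directory contains a `current/VERSION` file. `format_action` is a hypothetical helper that only reports what it would do, and it is exercised here on a temporary stand-in directory rather than on /root/hadoop-2.6.2/dfs/name.

```shell
# Hypothetical helper: decide whether the name directory still needs formatting.
format_action() {
    if [ -f "$1/current/VERSION" ]; then
        echo "skip: NameNode already formatted"
    else
        echo "run: ./bin/hdfs namenode -format"
    fi
}

# Stand-in for the directory configured as dfs.namenode.name.dir.
NAME_DIR="$(mktemp -d)"

BEFORE="$(format_action "$NAME_DIR")"
echo "$BEFORE"

# Simulate a formatted directory and check again.
mkdir -p "$NAME_DIR/current"
: > "$NAME_DIR/current/VERSION"
AFTER="$(format_action "$NAME_DIR")"
echo "$AFTER"
```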

2. Start HDFS:

./sbin/start-dfs.sh

15/01/05 16:41:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Starting namenodes on [S1PA11]

S1PA11: starting namenode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-namenode-S1PA11.out

S1PA222: starting datanode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-datanode-S1PA222.out

Starting secondary namenodes [S1PA11]

S1PA11: starting secondarynamenode, logging to /home/spark/opt/hadoop-2.6.0/logs/hadoop-spark-secondarynamenode-S1PA11.out

15/01/05 16:41:21 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[spark@S1PA11 hadoop-2.6.0]$ jps

22230 Master

30889 Jps

22478 Worker

30498 NameNode

30733 SecondaryNameNode

19781 ResourceManager
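The jps listing can also be checked by script rather than by eye. The sketch below takes the listing shown above as sample input and reports which of the expected master-side HDFS daemons are missing. DataNode is not in the expected list because, according to the start-up log, it runs on the slave host. On a live node, replace the sample text with `JPS_OUTPUT="$(jps)"`.

```shell
# Sample input: the jps listing from the master node above.
JPS_OUTPUT="22230 Master
30889 Jps
22478 Worker
30498 NameNode
30733 SecondaryNameNode
19781 ResourceManager"

# Collect every expected daemon that does not appear in the second column.
MISSING=""
for daemon in NameNode SecondaryNameNode; do
    echo "$JPS_OUTPUT" | awk '{ print $2 }' | grep -qx "$daemon" || MISSING="$MISSING $daemon"
done

echo "missing:${MISSING:- none}"
```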

3. Stop HDFS:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/stop-dfs.sh

15/01/05 16:40:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Stopping namenodes on [S1PA11]

S1PA11: stopping namenode

S1PA222: stopping datanode

Stopping secondary namenodes [S1PA11]

S1PA11: stopping secondarynamenode

15/01/05 16:40:48 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

[spark@S1PA11 hadoop-2.6.0]$ jps

30336 Jps

22230 Master

22478 Worker

19781 ResourceManager

4. Start YARN:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/start-yarn.sh

starting yarn daemons

starting resourcemanager, logging to /home/spark/opt/hadoop-2.6.0/logs/yarn-spark-resourcemanager-S1PA11.out

S1PA222: starting nodemanager, logging to /home/spark/opt/hadoop-2.6.0/logs/yarn-spark-nodemanager-S1PA222.out

[spark@S1PA11 hadoop-2.6.0]$ jps

31233 ResourceManager

22230 Master

22478 Worker

30498 NameNode

30733 SecondaryNameNode

31503 Jps

5. Stop YARN:

[spark@S1PA11 hadoop-2.6.0]$ ./sbin/stop-yarn.sh

stopping yarn daemons

stopping resourcemanager

S1PA222: stopping nodemanager

no proxyserver to stop

[spark@S1PA11 hadoop-2.6.0]$ jps

31167 Jps

22230 Master

22478 Worker

30498 NameNode

30733 SecondaryNameNode

6. Check the cluster status:

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hdfs dfsadmin -report

15/01/05 16:44:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

Configured Capacity: 52101857280 (48.52 GB)

Present Capacity: 45749510144 (42.61 GB)

DFS Remaining: 45748686848 (42.61 GB)

DFS Used: 823296 (804 KB)

DFS Used%: 0.00%

Under replicated blocks: 10

Blocks with corrupt replicas: 0

Missing blocks: 0

-------------------------------------------------

Live datanodes (1):

Name: 10.126.45.56:50010 (S1PA222)

Hostname: S1PA209

Decommission Status : Normal

Configured Capacity: 52101857280 (48.52 GB)

DFS Used: 823296 (804 KB)

Non DFS Used: 6352347136 (5.92 GB)

DFS Remaining: 45748686848 (42.61 GB)

DFS Used%: 0.00%

DFS Remaining%: 87.81%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Mon Jan 05 16:44:50 CST 2015
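A few fields of this report are worth pulling out. With dfs.replication set to 3 but only one live DataNode, blocks cannot reach their target number of replicas, which is consistent with the non-zero "Under replicated blocks" count. The sketch below extracts those fields from sample lines copied from the report above and recomputes the remaining percentage. On a live cluster, replace the sample text with `REPORT="$(./bin/hdfs dfsadmin -report)"`.

```shell
# Sample input: lines copied from the report above.
REPORT="Configured Capacity: 52101857280 (48.52 GB)
DFS Remaining: 45748686848 (42.61 GB)
Under replicated blocks: 10
Live datanodes (1):"

# Extract the numeric fields; head -n 1 keeps the cluster-wide line
# when a per-node line with the same label follows.
LIVE="$(echo "$REPORT" | sed -n 's/^Live datanodes (\([0-9]*\)).*/\1/p')"
UNDER="$(echo "$REPORT" | sed -n 's/^Under replicated blocks: \([0-9]*\).*/\1/p')"
CAPACITY="$(echo "$REPORT" | sed -n 's/^Configured Capacity: \([0-9]*\).*/\1/p' | head -n 1)"
REMAINING="$(echo "$REPORT" | sed -n 's/^DFS Remaining: \([0-9]*\).*/\1/p' | head -n 1)"

# Remaining space as a percentage of configured capacity.
PCT="$(awk -v r="$REMAINING" -v c="$CAPACITY" 'BEGIN { printf "%.2f", r / c * 100 }')"

echo "live=$LIVE under_replicated=$UNDER remaining=$PCT%"
```

The computed percentage, 87.81, matches the "DFS Remaining%" line in the report.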

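7. View HDFS in the NameNode web UI (node1:50070 in the configuration above).

8. View the ResourceManager (RM) web UI (node1:8088 in the configuration above).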

9. Run the wordcount program

9.1. Create the input directory:

[spark@S1PA11 hadoop-2.6.0]$ mkdir input

9.2. Create f1 and f2 in input and write some content to them

[spark@S1PA11 hadoop-2.6.0]$ cat input/f1

Hello world bye jj

[spark@S1PA11 hadoop-2.6.0]$ cat input/f2

Hello Hadoop bye Hadoop
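The two files can be written with `printf`, and a local tally of the same words gives a reference to compare the job's result against later. This is only a local approximation built from standard text tools, not output from Hadoop, and it works in a temporary directory (`WORK_DIR` is a name chosen for illustration).

```shell
# Work in a temporary directory so nothing in the Hadoop tree is touched.
WORK_DIR="$(mktemp -d)"
mkdir -p "$WORK_DIR/input"

# Write the two sample files with the contents shown above.
printf 'Hello world bye jj\n' > "$WORK_DIR/input/f1"
printf 'Hello Hadoop bye Hadoop\n' > "$WORK_DIR/input/f2"

# Local word count: one word per line, sort in byte order, count repeats,
# then print "word<TAB>count".
COUNTS="$(cat "$WORK_DIR/input/f1" "$WORK_DIR/input/f2" \
    | tr -s ' ' '\n' | LC_ALL=C sort | uniq -c | awk '{ print $2 "\t" $1 }')"
echo "$COUNTS"
```

For these two files the tally is Hadoop 2, Hello 2, bye 2, jj 1, world 1.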

9.3. Create the /tmp/input directory in HDFS

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -mkdir /tmp

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -mkdir /tmp/input

9.4. Copy the f1 and f2 files to the HDFS /tmp/input directory

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -put input/ /tmp

9.5. Check that f1 and f2 are present in HDFS

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -ls /tmp/input/

-rw-r--r-- 3 spark supergroup 20 2015-01-04 19:09 /tmp/input/f1

-rw-r--r-- 3 spark supergroup 25 2015-01-04 19:09 /tmp/input/f2

9.6. Run the wordcount program

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /tmp/input /output

15/01/05 17:00:09 INFO client.RMProxy: Connecting to ResourceManager at S1PA11/10.58.44.47:8032

15/01/05 17:00:11 INFO input.FileInputFormat: Total input paths to process : 2

15/01/05 17:00:11 INFO mapreduce.JobSubmitter: number of splits:2

15/01/05 17:00:11 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1420447392452_0001

15/01/05 17:00:12 INFO impl.YarnClientImpl: Submitted application application_1420447392452_0001

15/01/05 17:00:12 INFO mapreduce.Job: The url to track the job:

15/01/05 17:00:12 INFO mapreduce.Job: Running job: job_1420447392452_0001

9.7. View the result

[spark@S1PA11 hadoop-2.6.0]$ ./bin/hadoop fs -cat /output/part-r-00000

15/01/05 17:06:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

scp -r /root/hadoop-2.7.2 root@192.168.44.130:/root

scp -r root/zookeeper-3.4.6 root@192.168.44.131:/root
