Installing and Configuring Hadoop

Download

Download link

Click to download.

Download steps

On the homepage, find the Hadoop version you need.

Then click through to the release page.

Download the corresponding tarball.

Installation and configuration

Versions

  • Hadoop version: 3.1.3

  • Java version: 1.8

Basic setup

Create the directory

$ sudo mkdir -p /opt/software/hadoop/

Copy and extract

$ sudo tar -xvf /tmp/hadoop-3.1.3.tar.gz -C /opt/software/hadoop/

Configure environment variables

$ vi ~/.bashrc

Add the following environment variables:

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_212
export JRE_HOME=${JAVA_HOME}/jre
export HADOOP_HOME=/opt/software/hadoop/hadoop-3.1.3
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH

Reload the environment variables

$ source ~/.bashrc
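To confirm the variables took effect, a quick check (a sketch; the expected output assumes the versions used above):

$ java -version      # should report 1.8.0_212
$ hadoop version     # should report Hadoop 3.1.3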

Configure the Java environment for Hadoop

$ sudo vi /opt/software/hadoop/hadoop-3.1.3/etc/hadoop/hadoop-env.sh

Set JAVA_HOME in that file:

export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_212

Pseudo-distributed Hadoop installation

etc/hadoop/hadoop-env.sh configuration

Set the Java environment in this file (use your actual JDK path, e.g. /usr/lib/jvm/jdk1.8.0_212 from above):

# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest

etc/hadoop/core-site.xml configuration

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

This configures the HDFS address and port.
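To verify the value is picked up, hdfs getconf (a standard HDFS utility) can read it back:

$ hdfs getconf -confKey fs.defaultFS   # should print hdfs://localhost:9000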

etc/hadoop/hdfs-site.xml configuration

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

This sets the HDFS replication factor to 1.

Note: if no port is configured here, the web UI is served on the default port, 9870.
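If a different web UI port is wanted, the property to set in hdfs-site.xml is dfs.namenode.http-address; the port 50070 below is only an example value:

<property>
    <name>dfs.namenode.http-address</name>
    <value>0.0.0.0:50070</value>
</property>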

etc/hadoop/mapred-site.xml configuration

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>

etc/hadoop/yarn-site.xml configuration

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

Configure passwordless SSH login

See the VMware chapter for details; a minimal sketch is also given below.
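If that chapter is not at hand, the standard OpenSSH setup used by the Hadoop single-node guide is roughly:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
$ ssh localhost      # should log in without prompting for a password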

Start the Hadoop services

Step 1: format the NameNode
$ bin/hdfs namenode -format
Step 2: start DFS
$ sbin/start-dfs.sh

Once it is up, open http://ip:9870/ in a browser; if the page renders completely, the startup succeeded.
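Before opening the browser, jps gives a quick local check that the HDFS daemons are running:

$ jps
# expect, among others, NameNode, DataNode and SecondaryNameNode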

Step 3: start YARN
$ sbin/start-yarn.sh

After it starts, run the jps command to confirm the daemons are running, then visit http://ip:8088/ in a browser.

Step 4: start the MapReduce JobHistory Server
$ bin/mapred --daemon start historyserver

After it starts, visit http://ip:19888/ in a browser.
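With HDFS, YARN and the history server all running, the bundled examples jar makes a convenient end-to-end smoke test (the jar path assumes the stock 3.1.3 layout):

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 2 5

Once it finishes, the job should also show up in the JobHistory UI at http://ip:19888/.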

Fully distributed (cluster) Hadoop installation

Troubleshooting

LVM expansion error

Error details

/etc/lvm/archive/.lvm_ubuntu_2042_1912908381: write error failed: No space left on device

Cause

The disk is full.

Solution

Check disk usage:

$ df -h

Expand the disk

The df -h output showed that /dev/mapper/ubuntu--vg-ubuntu--lv was full, so we need to expand it.

$ sudo lvresize -A n -L +10G /dev/mapper/ubuntu--vg-ubuntu--lv
$ sudo resize2fs -p /dev/mapper/ubuntu--vg-ubuntu--lv
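As an aside, many lvresize builds can grow the filesystem in the same step with -r (--resizefs), replacing the separate resize2fs call; check your version's man page before relying on it:

$ sudo lvresize -r -A n -L +10G /dev/mapper/ubuntu--vg-ubuntu--lv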

After expanding, check the disk usage again: the volume now has free space, and retrying the failed operation succeeds.

LOCKED problem

Error details

2020-03-25 08:17:53,746 ERROR org.apache.hadoop.hdfs.server.common.Storage: It appears that another node  50239@master has already locked the storage directory: /opt/software/hadoop/hadoop-3.1.3/hdfs/data
java.nio.channels.OverlappingFileLockException
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:902)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:867)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:676)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:272)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:559)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1743)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1679)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822)
at java.lang.Thread.run(Thread.java:748)
2020-03-25 08:17:53,750 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /opt/software/hadoop/hadoop-3.1.3/hdfs/data. The directory is already locked
2020-03-25 08:17:53,750 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/opt/software/hadoop/hadoop-3.1.3/hdfs/data
java.io.IOException: Cannot lock storage /opt/software/hadoop/hadoop-3.1.3/hdfs/data. The directory is already locked
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:872)
at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:676)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:272)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:407)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:387)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:559)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1743)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1679)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822)
at java.lang.Thread.run(Thread.java:748)
2020-03-25 08:17:53,751 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to master/192.168.118.71:9000. Exiting.
java.io.IOException: All specified directories have failed to load.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:560)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1743)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1679)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:390)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:282)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:822)
at java.lang.Thread.run(Thread.java:748)
2020-03-25 08:17:53,751 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to master/192.168.118.71:9000
2020-03-25 08:17:53,752 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
2020-03-25 08:17:55,753 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode
2020-03-25 08:17:55,769 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

Cause

The permissions on the storage directory are insufficient; give the directory the current user's ownership.

Solution

Command to change ownership:

$ sudo chown -R winstar:winstar /opt/software/hadoop/hadoop-3.1.3/hdfs/data

Here winstar:winstar is the user and group from this setup; substitute your own.
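Because the lock may be held by a still-running or stale DataNode process, a safer order of operations is to stop HDFS, fix the ownership, then restart; a minimal sketch:

$ sbin/stop-dfs.sh
$ sudo chown -R winstar:winstar /opt/software/hadoop/hadoop-3.1.3/hdfs/data
$ sbin/start-dfs.sh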

References

Official documentation

