How to Balance Hadoop DataNodes?
- By : Will
- Category : Cloudera-Hadoop

1 Preface
One question, one article, one story.
A Hadoop platform in the author's production environment needs its DataNode data balanced automatically, so this article was put together.
2 Best Practice
2.1 Running the balancer manually
2.1.1 Get the command help
sudo -u hdfs hdfs balancer -h
The following is displayed:
Usage: hdfs balancer
    [-policy <policy>]                    the balancing policy: datanode or blockpool
    [-threshold <threshold>]              Percentage of disk capacity
    [-exclude [-f <hosts-file> | <comma-separated list of hosts>]]
                                          Excludes the specified datanodes.
    [-include [-f <hosts-file> | <comma-separated list of hosts>]]
                                          Includes only the specified datanodes.
    [-idleiterations <idleiterations>]    Number of consecutive idle iterations (-1 for Infinite) before exit.
    [-runDuringUpgrade]                   Whether to run the balancer during an ongoing HDFS upgrade. This is usually not desired since it will not affect used space on over-utilized machines.

Generic options supported are
-conf <configuration file>                   specify an application configuration file
-D <property=value>                          use value for given property
-fs <local|namenode:port>                    specify a namenode
-jt <local|resourcemanager:port>             specify a ResourceManager
-files <comma separated list of files>       specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>      specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
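Before balancing, it helps to see how uneven the disk usage actually is. A minimal check, assuming the same hdfs service user as above, is to pull the per-node usage percentages out of the standard dfsadmin report:
sudo -u hdfs hdfs dfsadmin -report | grep "DFS Used%"
If the per-node values are already within a few percentage points of each other, running the balancer will accomplish little.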
2.1.2 Query the current cluster DataNodes
sudo -u hdfs hdfs dfsadmin -printTopology
The following is displayed:
Rack: /default
   10.168.0.102:50010 (hd07.cmdschool.org)
   10.168.0.105:50010 (hd22.cmdschool.org)
   10.168.0.153:50010 (hd10.cmdschool.org)
   10.168.0.19:50010 (hd23.cmdschool.org)
   10.168.0.215:50010 (hd06.cmdschool.org)
   10.168.0.22:50010 (hd08.cmdschool.org)
   10.168.0.23:50010 (hd09.cmdschool.org)
   10.168.0.24:50010 (hd01.cmdschool.org)
   10.168.0.25:50010 (hd02.cmdschool.org)
   10.168.0.26:50010 (hd03.cmdschool.org)
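The comma-separated host list that the balancer's -include option expects can be derived from this output rather than typed by hand. A sketch of one way to do it, assuming the DataNodes listen on port 50010 as shown above (the automation script in section 2.2.1 uses essentially the same pipeline):
sudo -u hdfs hdfs dfsadmin -printTopology | grep ":50010" | awk '{print $2}' | sed -e 's/(//g' -e 's/)//g' | paste -sd, -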
2.1.3 Balance the cluster DataNodes with the command
sudo -u hdfs hdfs balancer -threshold 5.0 -policy DataNode -include hd01.cmdschool.org,hd02.cmdschool.org,hd03.cmdschool.org,hd06.cmdschool.org,hd07.cmdschool.org,hd08.cmdschool.org,hd09.cmdschool.org,hd10.cmdschool.org,hd22.cmdschool.org,hd23.cmdschool.org
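Here -threshold 5.0 asks the balancer to bring every DataNode's utilization to within 5 percentage points of the cluster average, and -include restricts balancing to the listed nodes. Because balancing copies blocks over the network, it can compete with jobs for bandwidth; HDFS lets you cap the per-DataNode balancing bandwidth at runtime. The value below (104857600 bytes/s, i.e. 100 MB/s) is only an illustrative figure, not taken from this article's environment:
sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth 104857600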
2.2 Running the balancer automatically
2.2.1 Create the execution script
vim ~/scripts/hdfs-balancer.sh
Add the following code:
#!/bin/bash

hdfsBalancerLog="/var/log/hdfsBalancerLog.log"

# Exit if another balancer process is already running.
if [ `ps -ef | grep balancer | egrep -v "$0|grep" | wc -l` -gt "1" ]; then
    echo `date +'%Y-%m-%d %H:%M:%S'`" Already running!" >> "$hdfsBalancerLog"
    exit 1
fi

# Collect the DataNode hostnames from the cluster topology.
hdfsNodes=`sudo -u hdfs hdfs dfsadmin -printTopology | grep ":50010" | awk -F' ' '{print $2}' | sed -e 's/(//g' -e 's/)//g'`

# Sanity check: refuse to balance if fewer than 3 nodes are reported.
if [ `echo "$hdfsNodes" | wc -l` -lt "3" ]; then
    echo `date +'%Y-%m-%d %H:%M:%S'`" Abnormal number of nodes!" >> "$hdfsBalancerLog"
    exit 1
fi

# Join the hostnames into the comma-separated list expected by -include.
balancerNodes=`echo $hdfsNodes | sed -e 's/ /,/g'`

echo `date +'%Y-%m-%d %H:%M:%S'`" Execution Started!" >> "$hdfsBalancerLog"
sudo -u hdfs hdfs balancer -threshold 5.0 -policy DataNode -include "$balancerNodes"
echo `date +'%Y-%m-%d %H:%M:%S'`" Execution Ended!" >> "$hdfsBalancerLog"
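Before scheduling the script, a quick manual run verifies both the logic and the log path. This assumes it is run as root, since the log lives under /var/log and the script itself calls sudo:
sh ~/scripts/hdfs-balancer.sh
tail /var/log/hdfsBalancerLog.log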
2.2.2 Set up the script trigger
crontab -e
Add the following configuration:
0 0 * * * sh ~/scripts/hdfs-balancer.sh
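This triggers the script at midnight every day. The script writes progress to its own log file; an optional variant of the entry also captures the balancer's console output in that same log:
0 0 * * * sh ~/scripts/hdfs-balancer.sh >> /var/log/hdfsBalancerLog.log 2>&1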
References
========================
https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html