
Hadoop Troubleshooting Notes

Q&A

1 HDFS Issues

1.1 does not have any open files

ERROR

[MainThread] WARN  org.apache.hadoop.security.UserGroupInformation  - PriviledgedActionException as:ads_db (auth:SIMPLE) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/ads_db/oozie-oozi/0009835-151204024716154-oozie-oozi-W/orion_report_new--pig/output/_temporary/1/_temporary/attempt_1448953762828_116807_m_000000_0/part-00000 (inode 18580914): File does not exist. Holder DFSClient_attempt_1448953762828_116807_m_000000_0_-190684571_1 does not have any open files.
	        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3605)
	        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3693)
	        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3663)
	        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:730)
	        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.complete(AuthorizationProviderProxyClientProtocol.java:243)
	        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:527)
	        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
	        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
	        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
	        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
	        at java.security.AccessController.doPrivileged(Native Method)
	        at javax.security.auth.Subject.doAs(Subject.java:415)
	        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
  • 【Solution】 The HDFS DataNode has an upper limit on the number of files it can serve concurrently; the default is 4096. Following the usual doubling rule, raise it to 8192 (see the config sketch after this item):
change dfs_datanode_max_xcievers: 4096 to dfs_datanode_max_xcievers: 8192
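
A minimal sketch of the same change made directly in hdfs-site.xml. Note that dfs_datanode_max_xcievers is the Cloudera Manager name for this setting; in current Hadoop releases the underlying property is dfs.datanode.max.transfer.threads (the legacy, intentionally misspelled dfs.datanode.max.xcievers is deprecated). DataNodes must be restarted for the change to take effect.

<property>
  <!-- Max concurrent DataNode transfer threads (xceivers); default is 4096 -->
  <name>dfs.datanode.max.transfer.threads</name>
  <value>8192</value>
</property>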

1.2 Missing block

First, force lease recovery on the affected file:

$ hdfs debug recoverLease -path /xxx.gz
recoverLease SUCCEEDED on /xxx.gz
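
If many files are left with stale leases (for example after a batch of failed tasks), a hedged shell sketch like the one below can batch the recovery. The directory /user/ads_db/logs is a hypothetical example, and the grep/awk parsing assumes the standard "path first" layout of fsck -openforwrite output:

$ for f in $(hdfs fsck /user/ads_db/logs -openforwrite 2>/dev/null | grep OPENFORWRITE | awk '{print $1}'); do
    # retry recovery up to 3 times per file
    hdfs debug recoverLease -path "$f" -retries 3
  done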

Then check the file's blocks and their locations:

$ hdfs fsck /xxx.gz -files -blocks -locations

/xxx.gz: CORRUPT blockpool BP-2092257201-xx-1537955202024 block blk_1206152126
 MISSING 1 blocks of total size 9491 B
0. BP-2092257201-2024:blk_1206152126_132426075 len=9491 MISSING!

Status: CORRUPT
 Total size:	9491 B
 Total dirs:	0
 Total files:	1
 Total symlinks:		0
 Total blocks (validated):	1 (avg. block size 9491 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:	1 (100.0 %)
  dfs.namenode.replication.min:	1
  CORRUPT FILES:	1
  MISSING BLOCKS:	1
  MISSING SIZE:		9491 B
  CORRUPT BLOCKS: 	1
  ********************************
 Minimally replicated blocks:	0 (0.0 %)
 Over-replicated blocks:	0 (0.0 %)
 Under-replicated blocks:	0 (0.0 %)
 Mis-replicated blocks:		0 (0.0 %)
 Default replication factor:	3
 Average block replication:	0.0
 Corrupt blocks:		1
 Missing replicas:		0
 Number of data-nodes:		50
 Number of racks:		1
FSCK ended at Wed Sep 18 20:10:37 CST 2019 in 2 milliseconds
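
fsck confirms the block is still CORRUPT/MISSING even after lease recovery. If every replica is really gone (for example, the DataNodes that held it were wiped), HDFS itself cannot restore the data; the usual remediation is to re-ingest the file from its source and then clear the corrupt entry so fsck reports HEALTHY again. A hedged sketch of the two built-in fsck options:

# Move the corrupt file's remains into /lost+found for later inspection ...
$ hdfs fsck /xxx.gz -move

# ... or, once the file is confirmed unrecoverable, delete it outright
$ hdfs fsck /xxx.gz -delete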
