How to handle HDFS blocks with corrupted replicas or under replicated

Preamble

In Ambari, and in HDFS more precisely, there are two widgets that will jump to your eyes if they are not equal to zero. They are Blocks With Corrupted Replicas and Under Replicated Blocks. In the graphical interface this is something like:

hdfs_blocks01
hdfs_blocks01

Managing under replicated blocks

You get a complete list of impacted file with this command. The grep command is simply to remove all the lines with multiple point symbol:

hdfs@client_node:~$ hdfs fsck / | egrep -v '^\.+'
Connecting to namenode via http://namenode01.domain.com:50070/fsck?ugi=hdfs&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /10.75.144.5 for path / at Tue Jul 23 14:58:19 CEST 2019
/tmp/hive/training/dda2d815-27d3-43e0-9d3b-4aea3983c1a9/hive_2018-09-18_12-13-21_550_5155956994526391933-12/_tez_scratch_dir/split_Map_1/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1087250107_13523212. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/ambari-qa/.staging/job_1525749609269_2679/libjars/hive-shims-0.23-1.2.1000.2.6.4.0-91.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1074437588_697325. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/ambari-qa/.staging/job_1525749609269_2679/libjars/hive-shims-common-1.2.1000.2.6.4.0-91.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1074437448_697185. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/hdfs/.staging/job_1541585350344_0008/job.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1092991377_19266614. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/hdfs/.staging/job_1541585350344_0008/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1092991378_19266615. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/hdfs/.staging/job_1541585350344_0041/job.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1093022580_19297823. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/hdfs/.staging/job_1541585350344_0041/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1093022581_19297824. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1519657336782_0105/job.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1073754565_13755. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1519657336782_0105/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1073754566_13756. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1519657336782_0105/libjars/hive-hcatalog-core.jar:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1073754564_13754. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0001/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085621525_11894367. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0002/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085621527_11894369. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0004/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085621593_11894435. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0023/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622064_11894906. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0025/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622086_11894928. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0027/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622115_11894957. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536057043538_0028/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1085622133_11894975. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_0002/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397707_12670663. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_0003/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397706_12670662. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_0004/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397708_12670664. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_0005/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397718_12670674. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_0006/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397720_12670676. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_0007/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086397721_12670677. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
/user/training/.staging/job_1536642465198_2307/job.split:  Under replicated BP-1711156358-10.75.144.1-1519036486930:blk_1086509846_12782817. Target Replicas is 10 but found 8 live replica(s), 0 decommissioned replica(s) and 0 decommissioning replica(s).
 Total size:    25053842686185 B (Total open files size: 33152660866 B)
 Total dirs:    1500114
 Total files:   12517972
 Total symlinks:                0 (Files currently being written: 268)
 Total blocks (validated):      12534979 (avg. block size 1998714 B) (Total open file blocks (not validated): 325)
 Minimally replicated blocks:   12534979 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       24 (1.9146422E-4 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9715698
 Corrupt blocks:                0
 Missing replicas:              48 (1.2886386E-4 %)
 Number of data-nodes:          8
 Number of racks:               2
FSCK ended at Tue Jul 23 15:01:08 CEST 2019 in 169140 milliseconds
 
 
The filesystem under path '/' is HEALTHY

All those files have the strange “Target Replicas is 10 but found 8 live replica(s)” message while our default replication factor is 3, so I have decided to set it back to 3 with:

hdfs@client_node:~$ hdfs dfs -setrep 3 /user/training/.staging/job_1536642465198_2307/job.split
Replication 3 set: /user/training/.staging/job_1536642465198_2307/job.split

And it solved the issue (24 to 23 under replicated blocks):

 Total size:    25009165459559 B (Total open files size: 1006863 B)
 Total dirs:    1500150
 Total files:   12528930
 Total symlinks:                0 (Files currently being written: 47)
 Total blocks (validated):      12545325 (avg. block size 1993504 B) (Total open file blocks (not validated): 43)
 Minimally replicated blocks:   12545325 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       23 (1.8333523E-4 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.9715445
 Corrupt blocks:                0
 Missing replicas:              46 (1.2339374E-4 %)
 Number of data-nodes:          8
 Number of racks:               2
FSCK ended at Tue Jul 23 15:49:27 CEST 2019 in 155922 milliseconds

That we also see graphically under Ambari:

hdfs_blocks02
hdfs_blocks02

There is also an option to delete the files if it has no interested or if, like for me, this is a temporary old file like this one of my list:

hdfs@client_node:~$ hdfs dfs -ls -d /tmp/hive/training/dda2d815*
drwx------   - training hdfs          0 2018-09-18 12:53 /tmp/hive/training/dda2d815-27d3-43e0-9d3b-4aea3983c1a9

Then deleting it and skipping the trash (no recover possible, if you are unsure keep the trash activated, but you will have to wait that HDFS purge the trash after fs.trash.interval minutes):

hdfs@client_node:~$ hdfs dfs -rm -r -f -skipTrash /tmp/hive/training/dda2d815-27d3-43e0-9d3b-4aea3983c1a9
Deleted /tmp/hive/training/dda2d815-27d3-43e0-9d3b-4aea3983c1a9

Obviously same exact result:

 Total size:    25019302903260 B (Total open files size: 395112 B)
 Total dirs:    1439889
 Total files:   12474754
 Total symlinks:                0 (Files currently being written: 69)
 Total blocks (validated):      12491149 (avg. block size 2002962 B) (Total open file blocks (not validated): 57)
 Minimally replicated blocks:   12491149 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       22 (1.7612471E-4 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.971359
 Corrupt blocks:                0
 Missing replicas:              44 (1.185481E-4 %)
 Number of data-nodes:          8
 Number of racks:               2
FSCK ended at Tue Jul 23 16:18:51 CEST 2019 in 141604 milliseconds

Managing blocks with corrupted replicas

Simulating corrupted blocks is not piece of cake and even if my Ambari display is showing corrupted blocks, in reality I have none:

hdfs@client_node:~$ hdfs fsck -list-corruptfileblocks
Connecting to namenode via http://namenode01.domain.com:50070/fsck?ugi=hdfs&listcorruptfileblocks=1&path=%2F
The filesystem under path '/' has 0 CORRUPT files

On a TI environment where plenty of people have played with I finally got the (hopefully) rare message of corrupted blocks. In real life it should not happen as by default your HDFS replication factor is 3 (It even forbid me to start HBase):

[hdfs@client_node ~]$ hdfs dfsadmin -report
Configured Capacity: 6139207680 (5.72 GB)
Present Capacity: 5701219000 (5.31 GB)
DFS Remaining: 2659930112 (2.48 GB)
DFS Used: 3041288888 (2.83 GB)
DFS Used%: 53.34%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 10
        Missing blocks (with replication factor 1): 0
        Pending deletion blocks: 0
Erasure Coded Block Groups:
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Pending deletion blocks: 0
.
.

More precisely:

[hdfs@client_node ~]$ hdfs fsck -list-corruptfileblocks
Connecting to namenode via http://namenode01.domain.com:50070/fsck?ugi=hdfs&listcorruptfileblocks=1&path=%2F
The list of corrupt files under path '/' are:
blk_1073744038  /hdp/apps/3.0.0.0-1634/mapreduce/mapreduce.tar.gz
blk_1073744039  /hdp/apps/3.0.0.0-1634/mapreduce/mapreduce.tar.gz
blk_1073744040  /hdp/apps/3.0.0.0-1634/mapreduce/mapreduce.tar.gz
blk_1073744041  /hdp/apps/3.0.0.0-1634/yarn/service-dep.tar.gz
blk_1073744042  /apps/hbase/data/hbase.version
blk_1073744043  /apps/hbase/data/hbase.id
blk_1073744044  /apps/hbase/data/data/hbase/meta/1588230740/.regioninfo
blk_1073744045  /apps/hbase/data/data/hbase/meta/.tabledesc/.tableinfo.0000000001
blk_1073744046  /apps/hbase/data/MasterProcWALs/pv2-00000000000000000001.log
blk_1073744047  /apps/hbase/data/WALs/datanode01.domain.com,16020,1576839178870/datanode01.domain.com%2C16020%2C1576839178870.1576839191114
The filesystem under path '/' has 10 CORRUPT files

Here is why I was not able to start component related to HBAse and even HBase daemon was failing:

[hdfs@client_node ~]$ hdfs dfs -cat /apps/hbase/data/hbase.version
19/12/20 15:41:12 WARN hdfs.DFSClient: No live nodes contain block BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 after checking nodes = [], ignoredNodes = null
19/12/20 15:41:12 INFO hdfs.DFSClient: No node available for BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version
19/12/20 15:41:12 INFO hdfs.DFSClient: Could not obtain BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 from any node:  No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
19/12/20 15:41:12 WARN hdfs.DFSClient: DFS chooseDataNode: got # 1 IOException, will wait for 2312.7779354694017 msec.
19/12/20 15:41:14 WARN hdfs.DFSClient: No live nodes contain block BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 after checking nodes = [], ignoredNodes = null
19/12/20 15:41:14 INFO hdfs.DFSClient: No node available for BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version
19/12/20 15:41:14 INFO hdfs.DFSClient: Could not obtain BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 from any node:  No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
19/12/20 15:41:14 WARN hdfs.DFSClient: DFS chooseDataNode: got # 2 IOException, will wait for 8143.439259842901 msec.
19/12/20 15:41:22 WARN hdfs.DFSClient: No live nodes contain block BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 after checking nodes = [], ignoredNodes = null
19/12/20 15:41:22 INFO hdfs.DFSClient: No node available for BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version
19/12/20 15:41:22 INFO hdfs.DFSClient: Could not obtain BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 from any node:  No live nodes contain current block Block locations: Dead nodes: . Will get new block locations from namenode and retry...
19/12/20 15:41:22 WARN hdfs.DFSClient: DFS chooseDataNode: got # 3 IOException, will wait for 11760.036759939097 msec.
19/12/20 15:41:34 WARN hdfs.DFSClient: No live nodes contain block BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 after checking nodes = [], ignoredNodes = null
19/12/20 15:41:34 WARN hdfs.DFSClient: Could not obtain block: BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException
19/12/20 15:41:34 WARN hdfs.DFSClient: No live nodes contain block BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 after checking nodes = [], ignoredNodes = null
19/12/20 15:41:34 WARN hdfs.DFSClient: Could not obtain block: BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException
19/12/20 15:41:34 WARN hdfs.DFSClient: DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version
        at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:870)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:853)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:832)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:564)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:754)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:820)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:94)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:68)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:129)
        at org.apache.hadoop.fs.shell.Display$Cat.printToStdout(Display.java:101)
        at org.apache.hadoop.fs.shell.Display$Cat.processPath(Display.java:96)
        at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
        at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:303)
        at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:285)
        at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:269)
        at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:120)
        at org.apache.hadoop.fs.shell.Command.run(Command.java:176)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:328)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:391)
cat: Could not obtain block: BP-369465004-10.75.46.68-1539340329592:blk_1073744042_3218 file=/apps/hbase/data/hbase.version

Here no particular miracle on how to solve it:

[hdfs@client_node ~]$ hdfs fsck / -delete
Connecting to namenode via http://namenode01.domain.com:50070/fsck?ugi=hdfs&delete=1&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /10.75.46.69 for path / at Fri Dec 20 15:41:53 CET 2019
 
/apps/hbase/data/MasterProcWALs/pv2-00000000000000000001.log: MISSING 1 blocks of total size 34358 B.
/apps/hbase/data/WALs/datanode01.domain.com,16020,1576839178870/datanode01.domain.com%2C16020%2C1576839178870.1576839191114: MISSING 1 blocks of total size 98 B.
/apps/hbase/data/data/hbase/meta/.tabledesc/.tableinfo.0000000001: MISSING 1 blocks of total size 996 B.
/apps/hbase/data/data/hbase/meta/1588230740/.regioninfo: MISSING 1 blocks of total size 32 B.
/apps/hbase/data/hbase.id: MISSING 1 blocks of total size 42 B.
/apps/hbase/data/hbase.version: MISSING 1 blocks of total size 7 B.
/hdp/apps/3.0.0.0-1634/mapreduce/mapreduce.tar.gz: MISSING 3 blocks of total size 306766434 B.
/hdp/apps/3.0.0.0-1634/yarn/service-dep.tar.gz: MISSING 1 blocks of total size 92348160 B.
Status: CORRUPT
 Number of data-nodes:  3
 Number of racks:               1
 Total dirs:                    780
 Total symlinks:                0
 
Replicated Blocks:
 Total size:    1404802556 B
 Total files:   190 (Files currently being written: 1)
 Total blocks (validated):      47 (avg. block size 29889416 B)
  ********************************
  UNDER MIN REPL'D BLOCKS:      10 (21.276596 %)
  MINIMAL BLOCK REPLICATION:    1
  CORRUPT FILES:        8
  MISSING BLOCKS:       10
  MISSING SIZE:         399150127 B
  ********************************
 Minimally replicated blocks:   37 (78.723404 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.3617022
 Missing blocks:                10
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 
Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0
FSCK ended at Fri Dec 20 15:41:54 CET 2019 in 306 milliseconds
 
 
The filesystem under path '/' is CORRUPT

Finally solved (with data loss of course):

[hdfs@client_node ~]$ hdfs fsck -list-corruptfileblocks
Connecting to namenode via http://namenode01.domain.com:50070/fsck?ugi=hdfs&listcorruptfileblocks=1&path=%2F
The filesystem under path '/' has 0 CORRUPT files

And:

[hdfs@client_node ~]$ hdfs dfsadmin -report
Configured Capacity: 6139207680 (5.72 GB)
Present Capacity: 5701216450 (5.31 GB)
DFS Remaining: 2659930112 (2.48 GB)
DFS Used: 3041286338 (2.83 GB)
DFS Used%: 53.34%
Replicated Blocks:
        Under replicated blocks: 0
        Blocks with corrupt replicas: 0
        Missing blocks: 0
        Missing blocks (with replication factor 1): 0
        Pending deletion blocks: 0
Erasure Coded Block Groups:
        Low redundancy block groups: 0
        Block groups with corrupt internal blocks: 0
        Missing block groups: 0
        Pending deletion blocks: 0

References

Yannick Jaquier on LinkedinYannick Jaquier on RssYannick Jaquier on Twitter
Yannick Jaquier
Find more about me on social media.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>