Hive concatenate command issues and workaround

Preamble

To maintain good performance we have developed a script (to be shared in another blog post) that concatenates the partitions of our Hive tables every week. This reduces the number of ORC files per partition and, as a result, the number of map and reduce tasks needed to read those partitions in Hive queries.
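
The weekly script itself is not shown here, but the statement it has to issue for each partition can be sketched in a few lines of Python. This is only a minimal illustration, not our actual script: the function name `build_concatenate_statement` is hypothetical, and it assumes partition values are plain strings that need no escaping. The generated statement has the same form as the command we run later in this post.

```python
def build_concatenate_statement(table, partition_spec):
    """Return an ALTER TABLE ... CONCATENATE statement for one partition.

    table          -- fully qualified table name, e.g. "db.table"
    partition_spec -- ordered list of (partition_column, value) pairs

    Assumption: values contain no double quotes, so no escaping is done.
    """
    if not partition_spec:
        raise ValueError("partition_spec must not be empty")
    # Render each pair as key="value" and join them with commas.
    spec = ",".join('{}="{}"'.format(key, value) for key, value in partition_spec)
    return "alter table {} partition ({}) concatenate;".format(table, spec)


if __name__ == "__main__":
    print(build_concatenate_statement(
        "prod_ews_refined.tbl_wafer_param_stats",
        [("fab", "CTM8"), ("lot_partition", "58053")],
    ))
```

The resulting string can then be submitted through whatever Hive client you already use (Beeline in our case).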

All went well until, one week, we started getting an error message for some partitions of one of our tables. Oddly, the command still worked for some partitions while failing for others.

This table was created in Hive, but we fill its partitions from a PySpark script. The partitions are therefore created by Spark, because the partition directories in HDFS are only created when rows are inserted.
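
To know which HDFS directory to inspect for a given partition, the path can be derived from the partition specification. The sketch below is an assumption-laden helper, not something Hive or Spark provides: `partition_directory` is a hypothetical name, and it assumes a managed table stored under the warehouse root with no custom location and partition values that need no escaping in the path. The layout it produces matches the paths visible in the error message later in this post.

```python
import posixpath


def partition_directory(warehouse_root, database, table, partition_spec):
    """Return the HDFS directory expected to hold one partition.

    partition_spec -- ordered list of (partition_column, value) pairs

    Assumptions: default layout <root>/<database>.db/<table>/<col>=<value>/...
    and values made of plain characters only.
    """
    # One "column=value" path component per partition column, in order.
    parts = ["{}={}".format(key, value) for key, value in partition_spec]
    return posixpath.join(warehouse_root, database + ".db", table, *parts)


if __name__ == "__main__":
    print(partition_directory(
        "/apps/hive/warehouse",
        "prod_ews_refined",
        "tbl_wafer_param_stats",
        [("fab", "CTM8"), ("lot_partition", "58053")],
    ))
```

The returned path can be passed to your usual HDFS listing command to check the owner and permissions of what Spark created.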

Our cluster is running HDP-2.6.4.0.

The relevant component versions are:

  • Hive 1.2.1000
  • Spark 2.2.0.2.6.4.0-91

Concatenate command failing on access rights

The complete error message is the following:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> alter table prod_ews_refined.tbl_wafer_param_stats partition (fab="CTM8",lot_partition="58053") concatenate;
INFO  : Session is already open
INFO  : Dag name: hive_20190830162358_1591b2d1-7489-4f60-8b9f-725fb50a4648
INFO  : Tez session was closed. Reopening...
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
File Merge ....      RUNNING      5          4        0        1       4       0
--------------------------------------------------------------------------------
VERTICES: 00/01  [====================>>------] 80%   ELAPSED TIME: 5.52 s
--------------------------------------------------------------------------------
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=File Merge, vertexId=vertex_1565718945091_75870_2_00, diagnostics=[Task failed, taskId=task_1565718945091_75870_2_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:184)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
        ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close AbstractFileMergeOperator
        at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:272)
        at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:250)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:620)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:176)
        ... 15 more
Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=mfgdl_ingestion, access=EXECUTE, inode="/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/.hive-staging_hive_2019-08-30_16-23-58_034_3892822656895858873-3012/_tmp.-ext-10000/000000_0_copy_1/000000_0_copy_7":mfgdl_ingestion:hadoop:-rw-r--r--
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950)
        at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4142)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1137)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:866)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2167)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
        at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1454)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
        at org.apache.hadoop.hive.ql.exec.Utilities.moveFile(Utilities.java:1807)
        at org.apache.hadoop.hive.ql.exec.Utilities.renameOrMoveFiles(Utilities.java:1843)
        at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:258)
        ... 18 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=mfgdl_ingestion, access=EXECUTE, inode="/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/.hive-staging_hive_2019-08-30_16-23-58_034_3892822656895858873-3012/_tmp.-ext-10000/000000_0_copy_1/000000_0_copy_7":mfgdl_ingestion:hadoop:-rw-r--r--
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:353)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:292)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:238)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1950)
        at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getFileInfo(FSDirStatAndListingOp.java:108)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4142)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:1137)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:866)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:823)
        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
        ... 26 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:150)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.GeneratedConstructorAccessor18.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
        ... 18 more
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0_copy_5
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1240)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractFileTail(ReaderImpl.java:355)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:319)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:241)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.<init>(OrcFileStripeMergeRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat.getRecordReader(OrcFileStripeMergeInputFormat.java:37)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
        ... 22 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0_copy_5
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
        ... 39 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:150)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
        ... 18 more
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0_copy_2
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1240)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractFileTail(ReaderImpl.java:355)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:319)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:241)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.<init>(OrcFileStripeMergeRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat.getRecordReader(OrcFileStripeMergeInputFormat.java:37)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
        ... 23 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0_copy_2
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
        ... 40 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:150)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
        ... 18 more
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0_copy_2
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1240)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractFileTail(ReaderImpl.java:355)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:319)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:241)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.<init>(OrcFileStripeMergeRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat.getRecordReader(OrcFileStripeMergeInputFormat.java:37)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
        ... 23 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0_copy_2
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
        ... 40 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:0, Vertex vertex_1565718945091_75870_2_00 [File Merge] killed/failed due to:OWN_TASK_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=2)

Meanwhile, as mentioned, the same command still works fine for other partitions:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> alter table prod_ews_refined.tbl_wafer_param_stats partition (fab="CTM8",lot_partition="59591") concatenate;
INFO  : Session is already open
INFO  : Dag name: hive_20190830145138_334957cb-f329-43af-953b-d03e213c9b03
INFO  : Tez session was closed. Reopening...
INFO  : Session re-established.
INFO  : Status: Running (Executing on YARN cluster with App id application_1565718945091_74778)
 
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
File Merge .....   SUCCEEDED      4          4        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 31.15 s
--------------------------------------------------------------------------------
INFO  : Loading data to table prod_ews_refined.tbl_wafer_param_stats partition (fab=CTM8, lot_partition=59591) from hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=59591/.hive-staging_hive_2019-08-30_14-51-38_835_3100511055765258809-3012/-ext-10000
INFO  : Partition prod_ews_refined.tbl_wafer_param_stats{fab=CTM8, lot_partition=59591} stats: [numFiles=4, totalSize=107827]
No rows affected (56.472 seconds)

If we extract the root cause from the long error message above, it is:

Caused by: org.apache.hadoop.security.AccessControlException: Permission denied: user=mfgdl_ingestion, access=EXECUTE, inode="/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/.hive-staging_hive_2019-08-30_16-23-58_034_3892822656895858873-3012/_tmp.-ext-10000/000000_0_copy_1/000000_0_copy_7":mfgdl_ingestion:hadoop:-rw-r--r--

Clearly the concatenate command has an access rights problem. What is puzzling is that the command is launched with the very same user that created and fills the partitions and, more importantly, the error occurs seemingly at random on only a few partitions… No doubt we are hitting a bug, though I cannot say whether it is Spark or Hive related…
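When the same check has to run over many partitions, it helps to pull the useful fields out of that message automatically. The snippet below is only a minimal sketch: the function name and the regular expression are my own, and the field layout is inferred from the single message above, so other Hadoop versions may format it differently.

```python
import re

# Field layout inferred from the one message quoted above:
# user=..., access=..., inode="<path>":<owner>:<group>:<mode>
_DENIED = re.compile(
    r'Permission denied: user=(?P<user>[^,]+), access=(?P<access>[A-Z_]+), '
    r'inode="(?P<inode>[^"]+)":(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>\S+)'
)


def parse_permission_denied(message):
    """Return the fields of a 'Permission denied' message as a dict, or None."""
    match = _DENIED.search(message)
    return match.groupdict() if match else None
```

Fed with the line above, it returns the user, the requested access (EXECUTE), the inode path and the mode (-rw-r--r--) that caused the refusal.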

HDFS default access rights with Hive and Spark

I first noticed that, for a failing partition, the permissions on the ORC files are not all the same (755, rwxr-xr-x and 644, rw-r--r--):

hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053
Found 35 items
.
.
-rwxr-xr-x   3 mfgdl_ingestion hadoop       1959 2019-08-25 18:12 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000001_0_copy_8
-rwxr-xr-x   3 mfgdl_ingestion hadoop       2563 2019-08-25 18:12 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000002_0_copy_1
-rwxr-xr-x   3 mfgdl_ingestion hadoop       1967 2019-08-25 18:11 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000002_0_copy_2
-rwxr-xr-x   3 mfgdl_ingestion hadoop       2190 2019-08-25 18:08 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000002_0_copy_3
-rwxr-xr-x   3 mfgdl_ingestion hadoop       1985 2019-08-25 18:11 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000002_0_copy_4
-rwxr-xr-x   3 mfgdl_ingestion hadoop       3508 2019-08-26 20:19 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000004_0
-rw-r--r--   3 mfgdl_ingestion hadoop       2009 2019-08-30 06:18 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/part-00004-9be63b72-9ccc-48f9-a422-b9a6420a3f6f.c000.zlib.orc
-rw-r--r--   3 mfgdl_ingestion hadoop       1959 2019-08-27 21:16 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/part-00007-dd3cce93-8301-42b4-add0-570ee27a5d66.c000.zlib.orc
-rw-r--r--   3 mfgdl_ingestion hadoop       2133 2019-08-27 21:16 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/part-00030-dd3cce93-8301-42b4-add0-570ee27a5d66.c000.zlib.orc
-rw-r--r--   3 mfgdl_ingestion hadoop       2137 2019-08-30 06:18 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/part-00047-9be63b72-9ccc-48f9-a422-b9a6420a3f6f.c000.zlib.orc
.
.
.

While for a working partition they are all the same (755, rwxr-xr-x):

hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=59591
Found 4 items
-rwxr-xr-x   3 mfgdl_ingestion hadoop      41397 2019-08-30 14:52 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=59591/000000_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop      39713 2019-08-30 14:52 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=59591/000001_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop      21324 2019-08-30 14:52 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=59591/000002_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop       5393 2019-08-30 14:52 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=59591/000003_0
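Comparing such listings by eye gets tedious once a partition holds dozens of files. As a rough sketch, the helper below groups the paths of an hdfs dfs -ls output by permission string, so a partition with mixed rights shows up as more than one group. The function name is hypothetical, and the parsing assumes the eight-column layout seen above and paths without spaces.

```python
from collections import defaultdict


def group_by_permission(ls_output):
    """Map each permission string to the paths that carry it."""
    groups = defaultdict(list)
    for line in ls_output.splitlines():
        fields = line.split()
        # Data lines have 8 fields: permissions, replication, owner, group,
        # size, date, time, path. This skips "Found N items" and "." lines.
        if len(fields) != 8 or fields[0][0] not in "-d":
            continue
        groups[fields[0]].append(fields[7])
    return dict(groups)
```

On the failing partition this gives two groups (-rwxr-xr-x and -rw-r--r--), while on the working one it gives a single group.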

I also checked whether all the files really belong to the partition (no ghost files), and apparently they do. From Hive we can only check that the number of files (totalNumberFiles) is consistent; we cannot get the actual list of HDFS files that make up the partition:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> use prod_ews_refined;
No rows affected (0.438 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> show table extended like tbl_wafer_param_stats partition (fab="CTM8",lot_partition="58053");
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
|                                                                                                                                                                                                                           tab_name                                                                                                                                                                                                                            |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
| tableName:tbl_wafer_param_stats                                                                                                                                                                                                                                                                                                                                                                                                                               |
| owner:mfgdl_ingestion                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| location:hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053                                                                                                                                                                                                                                                                                                                          |
| inputformat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat                                                                                                                                                                                                                                                                                                                                                                                                   |
| outputformat:org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat                                                                                                                                                                                                                                                                                                                                                                                                 |
| columns:struct columns { string start_t, string finish_t, string lot_id, string wafer_id, string flow_id, i32 param_id, string param_name, float param_low_limit, float param_high_limit, string param_unit, string ingestion_date, float param_p01, float param_q1, float param_median, float param_q3, float param_p99, float param_min_value, double param_avg_value, float param_max_value, double param_stddev, i64 nb_dies_tested, i64 nb_dies_failed}  |
| partitioned:true                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| partitionColumns:struct partition_columns { string fab, string lot_partition}                                                                                                                                                                                                                                                                                                                                                                                 |
| totalNumberFiles:34                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| totalFileSize:88754                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| maxFileSize:16710                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| minFileSize:1954                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| lastAccessTime:1567138817781                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| lastUpdateTime:1567179342705                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--+
15 rows selected (0.444 seconds)

Remark
I tried the IN | FROM database_name clause, as described in the official Hive documentation:

SHOW TABLE EXTENDED [IN|FROM database_name] LIKE 'identifier_with_wildcards' [PARTITION(partition_spec)];

But I was not able to make it work, so I finally decided to use the USE database_name statement…

As mentioned, this table is filled using PySpark with code like:

dataframe.write.mode('append').format('orc').option("compression","zlib").partitionBy('fab','lot_partition').saveAsTable("prod_ews_refined.tbl_wafer_param_stats")

I have checked the default HDFS umask:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set fs.permissions.umask-mode;
+--------------------------------+--+
|              set               |
+--------------------------------+--+
| fs.permissions.umask-mode=022  |
+--------------------------------+--+
1 row selected (0.061 seconds)

It means that, by default, files are created with 644, rw-r--r-- (666 - 022) and directories with 755, rwxr-xr-x (777 - 022).
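Strictly speaking a umask is not subtracted: it clears bits from the base mode. For 022 both views give the same result, which the small worked example below shows (the function name is mine, for illustration only).

```python
def apply_umask(base_mode, umask):
    """Clear the bits set in umask from base_mode (both given as ints)."""
    return base_mode & ~umask


umask = 0o022  # value of fs.permissions.umask-mode on this cluster
print(oct(apply_umask(0o666, umask)))  # files: 0o644
print(oct(apply_umask(0o777, umask)))  # directories: 0o755
```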

But digging a bit inside the tree of my database:

hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/prod_ews_refined.db/
Found 7 items
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-02 10:12 /apps/hive/warehouse/prod_ews_refined.db/tbl_bin_param_stat
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-08-31 01:38 /apps/hive/warehouse/prod_ews_refined.db/tbl_die_bin
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-02 10:36 /apps/hive/warehouse/prod_ews_refined.db/tbl_die_param_flow
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-02 10:34 /apps/hive/warehouse/prod_ews_refined.db/tbl_die_sp
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-08-30 19:26 /apps/hive/warehouse/prod_ews_refined.db/tbl_stdf
drwxr-xr-x   - mfgdl_ingestion hadoop          0 2019-09-02 10:39 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-08-30 12:02 /apps/hive/warehouse/prod_ews_refined.db/tbl_wsr_map
hdfs@clientnode:~$ hdfs dfs -ls -d /apps/hive/warehouse/prod_ews_refined.db
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-07-16 23:17 /apps/hive/warehouse/prod_ews_refined.db
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/
Found 1 items
drwxrwxrwx   - hive hadoop          0 2019-09-02 10:09 /apps/hive/warehouse

The parameter below explains why sub-directories almost always have 777, although from what I have seen it applies more to files than to sub-directories (!!):

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.warehouse.subdir.inherit.perms;
+-------------------------------------------+--+
|                    set                    |
+-------------------------------------------+--+
| hive.warehouse.subdir.inherit.perms=true  |
+-------------------------------------------+--+
1 row selected (0.103 seconds)

So this explains why the HDFS files (ORC for me) of tables created and filled by Hive have 777 for both directories and files. For partitions created by Spark, the default HDFS umask (fs.permissions.umask-mode) applies, so 755 (rwxr-xr-x) for directories and 644 (rw-r--r--) for files is the expected behavior.

So to solve the execute permission issue on the partition I issued:

hdfs@clientnode:~$ hdfs dfs -chmod 755 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/part*
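Since the weekly job touches many partitions, the same fix can be scripted. What follows is only a sketch under assumptions: the function names, the dry_run flag and the default part* pattern are mine, and it simply wraps the hdfs dfs -chmod command shown above rather than using any Hive or Spark API.

```python
import subprocess


def chmod_command(partition_dir, mode="755", pattern="part*"):
    """Build the 'hdfs dfs -chmod' argument list for one partition directory."""
    target = f"{partition_dir.rstrip('/')}/{pattern}"
    return ["hdfs", "dfs", "-chmod", mode, target]


def fix_partition(partition_dir, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    cmd = chmod_command(partition_dir)
    print(" ".join(cmd))
    if not dry_run:
        # No local shell is involved here, so the glob reaches the
        # HDFS client untouched and is resolved on the HDFS side.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling fix_partition() with the partition directory first in dry-run mode prints the exact command before anything is changed on HDFS.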

And the concatenate went well this time:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> alter table prod_ews_refined.tbl_wafer_param_stats partition (fab="CTM8",lot_partition="58053") concatenate;
INFO  : Session is already open
INFO  : Dag name: hive_20190902152242_ac7ab990-fdfd-4094-89b4-4926c49364ee
INFO  : Tez session was closed. Reopening...
INFO  : Session re-established.
INFO  : Status: Running (Executing on YARN cluster with App id application_1565718945091_85673)
 
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
File Merge .....   SUCCEEDED      5          5        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 12.87 s
--------------------------------------------------------------------------------
INFO  : Loading data to table prod_ews_refined.tbl_wafer_param_stats partition (fab=CTM8, lot_partition=58053) from hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/.hive-staging_hive_2019-09-02_15-22-42_356_8162954430835088222-15910/-ext-10000
INFO  : Partition prod_ews_refined.tbl_wafer_param_stats{fab=CTM8, lot_partition=58053} stats: [numFiles=8, numRows=57, totalSize=65242, rawDataSize=42834]
No rows affected (29.993 seconds)

With the number of files drastically reduced:

hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053
Found 9 items
drwxr-xr-x   - mfgdl_ingestion hadoop          0 2019-08-29 19:20 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/.hive-staging_hive_2019-08-29_18-10-26_174_7966264139569655341-114
-rwxr-xr-x   3 mfgdl_ingestion hadoop      20051 2019-09-02 15:23 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000000_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop      19020 2019-09-02 15:23 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000001_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop       2009 2019-08-30 06:18 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000001_0_copy_1
-rwxr-xr-x   3 mfgdl_ingestion hadoop       1959 2019-08-27 21:16 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000001_0_copy_2
-rwxr-xr-x   3 mfgdl_ingestion hadoop       2163 2019-08-27 03:42 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000001_0_copy_3
-rwxr-xr-x   3 mfgdl_ingestion hadoop      14413 2019-09-02 15:23 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000002_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop       3508 2019-09-02 15:23 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000003_0
-rwxr-xr-x   3 mfgdl_ingestion hadoop       2119 2019-09-02 15:23 /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=CTM8/lot_partition=58053/000004_0
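Note that the 9 items reported by HDFS are one leftover .hive-staging directory plus 8 regular files, which agrees with the numFiles=8 reported by Hive. A small sketch to make that comparison from a captured listing (hypothetical helper name, same eight-column assumption as the listing above):

```python
def count_entries(ls_output):
    """Return (files, directories) counted from 'hdfs dfs -ls' output."""
    files = dirs = 0
    for line in ls_output.splitlines():
        fields = line.split()
        if len(fields) != 8:
            continue  # skips the "Found N items" header
        if fields[0].startswith("d"):
            dirs += 1
        elif fields[0].startswith("-"):
            files += 1
    return files, dirs
```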

To go further

To try to clarify this story of default access rights with hive.warehouse.subdir.inherit.perms and fs.permissions.umask-mode, I created a test table in the default database with those parameters set to true and 022 respectively:

DROP TABLE yannick01 purge;
 
CREATE TABLE DEFAULT.yannick01
(
VALUE string
)
partitioned BY(fab string, int_partition int)
stored AS orc;

I insert a dummy record with:

INSERT INTO DEFAULT.yannick01 PARTITION (fab="CTM8", int_partition=1) VALUES ("One");

As expected, all directories and files have 777 because my /apps/hive/warehouse directory is 777:

hdfs@clientnode:~$ hdfs dfs -ls -d /apps/hive/warehouse/yannick01
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-03 11:21 /apps/hive/warehouse/yannick01
 
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/yannick01
Found 1 items
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-02 18:03 /apps/hive/warehouse/yannick01/fab=CTM8
 
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/yannick01/fab=CTM8
Found 1 items
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-02 18:03 /apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1
 
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1
Found 1 items
-rwxrwxrwx   3 mfgdl_ingestion hadoop        214 2019-09-02 18:04 /apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1/000000_0

Now, if I disable permission inheritance for the session and re-execute the creation script:

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.warehouse.subdir.inherit.perms=false;
No rows affected (0.003 seconds)
0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> set hive.warehouse.subdir.inherit.perms;
+--------------------------------------------+--+
|                    set                     |
+--------------------------------------------+--+
| hive.warehouse.subdir.inherit.perms=false  |
+--------------------------------------------+--+
1 row selected (0.005 seconds)

I get:

hdfs@clientnode:~$ hdfs dfs -ls -d /apps/hive/warehouse/yannick01
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-03 11:21 /apps/hive/warehouse/yannick01
 
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/yannick01/fab=CTM8
Found 1 items
drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-03 11:22 /apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1
 
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1
Found 1 items
-rw-r--r--   3 mfgdl_ingestion hadoop        223 2019-09-03 11:22 /apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1/000000_0
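
The rw-r--r-- mode on the ORC file above is what one would expect if, with inheritance disabled, the file simply receives the HDFS default file mode 666 filtered by the 022 umask. Here is a minimal Python sketch of that arithmetic. The helper names are illustrative only, and the sketch covers files only: it does not explain why the partition directory still shows 777 in this listing.

```python
# Assumption: with hive.warehouse.subdir.inherit.perms=false a new file
# starts from the HDFS default file mode 666 and the umask bits are cleared.
HDFS_FILE_DEFAULT = 0o666


def apply_umask(base_mode: int, umask: int) -> int:
    """Clear the umask bits from the base mode."""
    return base_mode & ~umask & 0o777


def to_rwx(mode: int) -> str:
    """Render a 9-bit mode the way `hdfs dfs -ls` prints it."""
    flags = "rwxrwxrwx"
    return "".join(
        flag if mode & (1 << (8 - i)) else "-"
        for i, flag in enumerate(flags)
    )


file_mode = apply_umask(HDFS_FILE_DEFAULT, 0o022)
print(oct(file_mode), to_rwx(file_mode))  # prints: 0o644 rw-r--r--
```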

The concatenate command still works because the directory containing the ORC files is 777. So setting the execute permission on the ORC files is not required: having 777 on the directory where the files are stored (write for group and others, set with hdfs dfs -chmod g+w,o+w or hdfs dfs -chmod 777) also works. But this is clearly much less secure…

0: jdbc:hive2://zookeeper01.domain.com:2181,zoo> alter table default.yannick01 partition (fab="CTM8", int_partition=1) concatenate;
INFO  : Session is already open
INFO  : Dag name: hive_20190904163205_31445337-20a1-42d3-80ac-e08028d6a2a1
INFO  : Status: Running (Executing on YARN cluster with App id application_1565718945091_100063)
 
INFO  : Loading data to table default.yannick01 partition (fab=CTM8, int_partition=1) from hdfs://ManufacturingDataLakeHdfs/apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1/.hive-staging_hive_2019-09-04_16-32-05_102_6350061968665706892-148/-ext-10000
INFO  : Partition default.yannick01{fab=CTM8, int_partition=1} stats: [numFiles=2, numRows=1, totalSize=158170, rawDataSize=87]
No rows affected (3.291 seconds)
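
Since the deciding factor here is the write permission on the partition directory, a concatenation script could check it before issuing the command. The following is only a sketch, assuming the script already has the text of an hdfs dfs -ls line at hand. The function name is hypothetical, and whether the group or the others bit matters depends on which user actually runs the merge, which the listings do not show.

```python
def write_flags(ls_line: str) -> dict:
    """Extract the group/others write bits from one `hdfs dfs -ls` line."""
    perms = ls_line.split()[0]  # first column, e.g. 'drwxrwxrwx'
    if len(perms) < 10:
        raise ValueError("not an ls permission string: %r" % perms)
    return {
        "is_dir": perms[0] == "d",
        "group_write": perms[5] == "w",
        "other_write": perms[8] == "w",
    }


# Sample lines copied from the listings in this post.
partition_dir = ("drwxrwxrwx   - mfgdl_ingestion hadoop          0 2019-09-03 11:22 "
                 "/apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1")
orc_file = ("-rw-r--r--   3 mfgdl_ingestion hadoop        223 2019-09-03 11:22 "
            "/apps/hive/warehouse/yannick01/fab=CTM8/int_partition=1/000000_0")

print(write_flags(partition_dir))
print(write_flags(orc_file))
```

For the partition directory both write flags come out true, while for the ORC file they are both false, which matches the situation where concatenate succeeded.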

Worked well till…

It all worked as expected until we hit a partition with lots of files: even with the correct rights set, the concatenate command failed with:

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=File Merge, vertexId=vertex_1565718945091_143224_2_00, diagnostics=[Task failed, taskId=task_1565718945091_143224_2_00_000001, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:184)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
        ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to close AbstractFileMergeOperator
        at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:272)
        at org.apache.hadoop.hive.ql.exec.OrcFileMergeOperator.closeOp(OrcFileMergeOperator.java:250)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:620)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.close(MergeFileRecordProcessor.java:176)
        ... 15 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): Current inode is not a directory: 000001_0_copy_21(INodeFile@1491d045), parentDir=_tmp.-ext-10000/
        at org.apache.hadoop.hdfs.server.namenode.INode.asDirectory(INode.java:331)
        at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.verifyFsLimitsForRename(FSDirRenameOp.java:117)
        at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.unprotectedRenameTo(FSDirRenameOp.java:189)
        at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.renameTo(FSDirRenameOp.java:492)
        at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.renameToInt(FSDirRenameOp.java:73)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3938)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename(NameNodeRpcServer.java:993)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename(ClientNamenodeProtocolServerSideTranslatorPB.java:587)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.rename(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.rename(ClientNamenodeProtocolTranslatorPB.java:529)
        at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.rename(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:2006)
        at org.apache.hadoop.hdfs.DistributedFileSystem.rename(DistributedFileSystem.java:732)
        at org.apache.hadoop.hive.ql.exec.Utilities.moveFile(Utilities.java:1815)
        at org.apache.hadoop.hive.ql.exec.Utilities.renameOrMoveFiles(Utilities.java:1843)
        at org.apache.hadoop.hive.ql.exec.AbstractFileMergeOperator.closeOp(AbstractFileMergeOperator.java:258)
        ... 18 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:150)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.GeneratedConstructorAccessor18.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
        ... 18 more
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/000001_0_copy_993
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1240)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractFileTail(ReaderImpl.java:355)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:319)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:241)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.<init>(OrcFileStripeMergeRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat.getRecordReader(OrcFileStripeMergeInputFormat.java:37)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
        ... 22 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/000001_0_copy_993
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
        ... 39 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:150)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
        ... 18 more
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/000001_0_copy_969
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1240)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractFileTail(ReaderImpl.java:355)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:319)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:241)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.<init>(OrcFileStripeMergeRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat.getRecordReader(OrcFileStripeMergeInputFormat.java:37)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
        ... 23 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/000001_0_copy_969
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
        ... 40 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileTezProcessor.run(MergeFileTezProcessor.java:42)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:140)
        at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113)
        at org.apache.hadoop.hive.ql.exec.tez.MergeFileRecordProcessor.run(MergeFileRecordProcessor.java:150)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
        ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
        ... 18 more
Caused by: java.io.FileNotFoundException: File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/000001_0_copy_969
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1240)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
        at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
        at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
        at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:332)
        at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:340)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractFileTail(ReaderImpl.java:355)
        at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:319)
        at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:241)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.<init>(OrcFileStripeMergeRecordReader.java:47)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeInputFormat.getRecordReader(OrcFileStripeMergeInputFormat.java:37)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
        ... 23 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/000001_0_copy_969
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
        at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2025)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1996)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1909)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:700)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)
 
        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
        at org.apache.hadoop.ipc.Client.call(Client.java:1498)
        at org.apache.hadoop.ipc.Client.call(Client.java:1398)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
        at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
        at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
        ... 40 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:1, Vertex vertex_1565718945091_143224_2_00 [File Merge] killed/failed due to:OWN_TASK_FAILURE]
ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=2)

I initially thought it could be an HDFS filesystem issue, so I checked the partition directory with fsck:

hdfs@clientnode:~$ hdfs fsck /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053/
Connecting to namenode via http://namenode01.domain.com:50070/fsck?ugi=hdfs&path=%2Fapps%2Fhive%2Fwarehouse%2Fprod_ews_refined.db%2Ftbl_wafer_param_stats_old%2Ffab%3DC2WF%2Flot_partition%3DQ9053
FSCK started by hdfs (auth:SIMPLE) from /10.75.144.5 for path /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053 at Mon Sep 16 17:01:29 CEST 2019
 
Status: HEALTHY
 Total size:    159181622 B
 Total dirs:    1
 Total files:   21778
 Total symlinks:                0
 Total blocks (validated):      21778 (avg. block size 7309 B)
 Minimally replicated blocks:   21778 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          8
 Number of racks:               2
FSCK ended at Mon Sep 16 17:01:29 CEST 2019 in 377 milliseconds
 
 
The filesystem under path '/apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats_old/fab=C2WF/lot_partition=Q9053' is HEALTHY

To read all the logs conveniently I strongly encourage using Tez View. To get there, note the application id displayed when executing the command in beeline:

.
.
INFO  : Status: Running (Executing on YARN cluster with App id application_1565718945091_161035)
.
.

Then access the application page at http://resourcemanager01.domain.com:8088/cluster/app/application_1565718945091_142464 and click on the ApplicationMaster link on the displayed page. Finally, open the Dag tab and its details; you should see something like:

[Screenshot: concatenate01]
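
As a small convenience, the application id can be pulled out of the beeline output and turned into the ResourceManager URL automatically. The snippet below is only a sketch: the helper names are hypothetical, and the base URL is simply the ResourceManager address used in this post, to be adapted to your cluster.

```python
import re

# Assumption: ResourceManager address as used in this post; adapt to your cluster.
RM_BASE_URL = "http://resourcemanager01.domain.com:8088"

# Matches the "App id application_<cluster timestamp>_<sequence>" part of the beeline status line.
APP_ID_PATTERN = re.compile(r"App id (application_\d+_\d+)")


def extract_app_id(beeline_output):
    """Return the first YARN application id found in beeline output, or None."""
    match = APP_ID_PATTERN.search(beeline_output)
    return match.group(1) if match else None


def resource_manager_url(app_id, base_url=RM_BASE_URL):
    """Build the ResourceManager application page URL for a YARN application id."""
    return f"{base_url}/cluster/app/{app_id}"


sample = (
    "INFO  : Status: Running (Executing on YARN cluster "
    "with App id application_1565718945091_161035)"
)
app_id = extract_app_id(sample)
print(resource_manager_url(app_id))
```

The same application id is what `yarn logs -applicationId` expects, so the extracted value can be reused for both the web page and the text dump.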

You can also generate a text file using:

[yarn@clientnode ~]$ yarn logs -applicationId application_1565718945091_141922 > /tmp/yan.txt

In this huge amount of logs I have obviously seen plenty of strange error messages, but none of them led to a clear conclusion:

  • |OrcFileMergeOperator|: Incompatible ORC file merge! Writer version mismatch for
  • |tez.RecordProcessor|: Hit error while closing operators – failing tree
  • |tez.TezProcessor|: java.lang.RuntimeException: Hive Runtime Error while closing operators
  • org.apache.hadoop.ipc.RemoteException(java.lang.IllegalStateException): Current inode is not a directory:
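
To get a rough idea of how often each of these messages appears in the dump produced by `yarn logs`, a few lines of Python are enough. This is a minimal sketch, not a log analysis tool: the function names are hypothetical and the patterns are just the message fragments listed above.

```python
from collections import Counter

# Message fragments taken from the error messages listed above.
PATTERNS = [
    "Incompatible ORC file merge! Writer version mismatch",
    "Hit error while closing operators",
    "Hive Runtime Error while closing operators",
    "Current inode is not a directory",
]


def count_patterns(lines, patterns=PATTERNS):
    """Count, for each pattern, the number of lines that contain it."""
    counts = Counter({pattern: 0 for pattern in patterns})
    for line in lines:
        for pattern in patterns:
            if pattern in line:
                counts[pattern] += 1
    return counts


def count_patterns_in_file(path):
    """Same as count_patterns, reading the lines from a text file."""
    with open(path, errors="replace") as handle:
        return count_patterns(handle)


# In-memory demo with made-up lines that only mimic the shape of the real messages.
demo = [
    "|OrcFileMergeOperator|: Incompatible ORC file merge! Writer version mismatch for some_file",
    "an unrelated line",
    "RemoteException(java.lang.IllegalStateException): Current inode is not a directory: /some/path",
]
for pattern, count in count_patterns(demo).items():
    print(f"{count:6d}  {pattern}")
```

On the real dump you would call `count_patterns_in_file("/tmp/yan.txt")` instead of using the demo list.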

But the number of files in the partition is decreasing:

hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=C2WF/lot_partition=Q9053 | wc -l
21693
 
hdfs@clientnode:~$ hdfs dfs -ls /apps/hive/warehouse/prod_ews_refined.db/tbl_wafer_param_stats/fab=C2WF/lot_partition=Q9053 | wc -l
21642
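
A side note on the counting itself: `hdfs dfs -ls` prints a `Found N items` header before the entries, so piping it to `wc -l` gives one more than the number of entries. The sketch below counts only regular files from a captured listing; the function name is hypothetical and the sample listing is made up (owner, sizes, dates and paths are not from our cluster).

```python
def count_files(ls_output):
    """Count regular files in the text output of `hdfs dfs -ls`.

    File entries start with '-' in the permission column; the
    'Found N items' header and directory entries ('d...') are skipped.
    """
    return sum(1 for line in ls_output.splitlines() if line.startswith("-"))


# Hypothetical listing, only shaped like real `hdfs dfs -ls` output.
sample = """Found 3 items
-rw-r--r--   3 hive hdfs       7309 2019-09-16 17:01 /apps/hive/warehouse/db/tbl/part/000000_0
-rw-r--r--   3 hive hdfs       7211 2019-09-16 17:01 /apps/hive/warehouse/db/tbl/part/000001_0
drwxr-xr-x   - hive hdfs          0 2019-09-16 17:01 /apps/hive/warehouse/db/tbl/part/subdir
"""
print(count_files(sample))  # prints 2
```

The off-by-one does not change the conclusion here, since both measurements carry the same extra line.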

So this is clearly another bug.

Last but not least, one of my teammates noticed a strange behavior while doing a simple count on a partition. He generated a pure Hive table by inserting the rows of the one created by Spark:

0: jdbc:hive2://zookeeer01.domain.com:2181,zoo> select count(*) from prod_ews_refined.tbl_hive_generated where fab="R8WF" and lot_partition="G3473";
+------+--+
| _c0  |
+------+--+
| 137  |
+------+--+
1 row selected (0.045 seconds)
0: jdbc:hive2://zookeeer01.domain.com:2181,zoo> select count(*) from prod_ews_refined.tbl_spark_generated where fab="R8WF" and lot_partition="G3473";
+------+--+
| _c0  |
+------+--+
| 130  |
+------+--+
1 row selected (0.058 seconds)

But when exporting the rows of the Spark generated table to a CSV file (--outformat=csv2), the number of rows is the correct one:

Connected to: Apache Hive (version 1.2.1000.2.6.4.0-91)
Driver: Hive JDBC (version 1.2.1000.2.6.4.0-91)
Transaction isolation: TRANSACTION_REPEATABLE_READ
INFO  : Tez session hasn't been created yet. Opening session
INFO  : Dag name: select * from prod_ew...ot_partition="G3473"(Stage-1)
INFO  : Status: Running (Executing on YARN cluster with App id application_1565718945091_133078)
 
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      2          2        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 01/01  [==========================>>] 100%  ELAPSED TIME: 2.11 s     
--------------------------------------------------------------------------------
137 rows selected (7.998 seconds)

So one more bug, which has led to the project of upgrading our cluster to the latest HDP version to prepare the migration to Cloudera, as Hortonworks is dead…

In the meantime, the unsatisfactory workaround we have implemented is to fill a pure Hive table with INSERT INTO ... SELECT from the Spark generated table…
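
For illustration, here is how such a copy statement could be generated for one partition. This is a sketch under several assumptions and not our actual script: the function name is hypothetical, both tables are assumed to have identical columns and to be partitioned by (fab, lot_partition), and dynamic partitioning is assumed to be allowed in the session (hive.exec.dynamic.partition.mode=nonstrict). The values are quoted naively, so only use it with trusted input.

```python
def build_copy_statement(source_table, target_table, partition_filter):
    """Build a HiveQL statement copying the selected partition between two tables.

    Assumes both tables share the same columns and are partitioned by
    (fab, lot_partition), so that SELECT * returns the partition columns
    last, which is what a dynamic partition insert expects.
    """
    # Naive quoting: values are assumed to be trusted and free of quotes.
    where = " AND ".join(
        f"{column}='{value}'" for column, value in partition_filter.items()
    )
    return (
        f"INSERT OVERWRITE TABLE {target_table} PARTITION (fab, lot_partition) "
        f"SELECT * FROM {source_table} WHERE {where}"
    )


statement = build_copy_statement(
    "prod_ews_refined.tbl_spark_generated",
    "prod_ews_refined.tbl_hive_generated",
    {"fab": "R8WF", "lot_partition": "G3473"},
)
print(statement)
```

The generated statement can then be submitted through beeline. INSERT OVERWRITE is used in this sketch so that rerunning it for the same partition does not duplicate rows.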

Yannick Jaquier