Wednesday, January 7, 2015

[cdh-user] Re: Intermittent Error Starting ApplicationMaster on CDH 5.2.0

In case anyone comes across this: I have not found the root cause of this issue, but it has not happened on CDH 5.3.

I am seeing an intermittent error starting the ApplicationMaster when launching a MapReduce job on CDH 5.2.0. It happens infrequently, roughly once a week, and not on any consistent schedule. I cannot reproduce it intentionally and have not found any pattern behind the occurrences. However, if I re-run the failed job immediately after seeing the error, it runs successfully without any changes.
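Since an immediate re-run has succeeded so far, one possible stopgap is to wrap job submission in a small retry loop. The following is only a minimal sketch of that idea, not a fix for the underlying problem. `run_with_retry` and `submit_job` are hypothetical names introduced for illustration; `submit_job` is assumed to be any zero-argument callable that launches the job and returns True on success.

```python
import time


def run_with_retry(submit_job, max_attempts=3, delay_seconds=0.0):
    """Call submit_job() until it succeeds or the attempts run out.

    submit_job is a hypothetical zero-argument callable that launches the
    job and returns True on success and False on failure.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if submit_job():
            return attempt
        # Wait before the next try, but not after the final failure.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"job failed after {max_attempts} attempts")
```

For a job launched from the command line, `submit_job` could be something like `lambda: subprocess.call(["hadoop", "jar", "my-job.jar"]) == 0`, where `my-job.jar` is a placeholder. Capping the number of attempts keeps a genuinely broken job from being resubmitted forever.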

Has anyone encountered this before?

This is the stack trace I get:

Application application_1416843883012_0019 failed 2 times due to Error launching appattempt_1416843883012_0019_000002. Got exception: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:209)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.setupTokens(AMLauncher.java:226)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.createAMContainerLaunchContext(AMLauncher.java:198)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:108)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
. Failing the application.

