Tuesday, December 30, 2014

[scm-user] Error while using Cloudera API enable_nn_ha

I am trying to use the Cloudera Manager API call enable_nn_ha() to enable HA on an HDFS cluster. The cluster has multiple nameservices/NameNodes, so I call this API once for each NameNode in the cluster. I have also pre-assigned 3 nodes in the cluster as Journal Nodes, and these 3 Journal Nodes are to be shared by all NameNodes in the cluster. As per the HDFS HA design, Journal Nodes can be shared among multiple NameNodes as long as a different journal directory is specified for each NameNode. In every iteration I pass the same set of 3 Journal Nodes and a different journal directory.
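For illustration, the per-iteration arguments described above could be assembled roughly as follows. This is only a sketch: the nameservice names, host IDs, and the /data/jn base path are hypothetical, and the dictionary keys jnHostId and jnEditsDir are what the cm_api Python client is understood to expect in the jns argument of enable_nn_ha(), so check them against your client version. The sketch only builds the arguments and does not call Cloudera Manager.

```python
def build_jn_arguments(nameservices, jn_host_ids, base_dir="/data/jn"):
    """Return {nameservice: [journal node dicts]}.

    Every nameservice gets the same Journal Node hosts but its own edits
    directory. The key names (jnHostId, jnEditsDir) are an assumption about
    what enable_nn_ha() expects in its `jns` list.
    """
    plan = {}
    for ns in nameservices:
        # One journal directory per nameservice, shared hosts.
        edits_dir = "{}/{}".format(base_dir.rstrip("/"), ns)
        plan[ns] = [
            {"jnHostId": host_id, "jnEditsDir": edits_dir}
            for host_id in jn_host_ids
        ]
    return plan


# Hypothetical nameservices and host IDs, for illustration only.
plan = build_jn_arguments(["ns1", "ns2"], ["host-a", "host-b", "host-c"])
for ns in sorted(plan):
    print(ns, [jn["jnEditsDir"] for jn in plan[ns]])
```

Each list in plan would then be passed as the Journal Node argument for the corresponding NameNode.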

The first iteration, on the first NameNode in the cluster, works fine and HA is successfully configured on that NameNode. During this iteration the Journal Nodes are created, because they do not yet exist in the cluster. I verified this in the Cloudera Manager web UI. The second iteration, on the second NameNode in the cluster, fails because the enable_nn_ha() API tries to create the 3 Journal Nodes again and detects that they already exist.

The API fails with the following error:
ApiException: A JOURNALNODE already exists on XXX (error 400)

I tried enabling HA manually on the second NameNode from the Cloudera Manager web UI, and it worked fine using the same set of Journal Nodes that were used for the first NameNode. I tried the same on a third NameNode and it worked fine as well. So the API seems to fail incorrectly when it finds that the Journal Nodes have already been created. The API should create a Journal Node if it does not exist and, if it does exist, reuse it with a different journal directory.
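The expected "create if missing, otherwise reuse" behavior can be sketched as a small helper. This is a statement of the desired semantics only, not of what enable_nn_ha() currently does; the function name and host IDs are hypothetical.

```python
def split_journal_nodes(requested_host_ids, existing_jn_host_ids):
    """Partition the requested hosts into (to_create, to_reuse).

    A host that already carries a Journal Node role is reused;
    any other requested host needs a new Journal Node.
    """
    existing = set(existing_jn_host_ids)
    to_create = [h for h in requested_host_ids if h not in existing]
    to_reuse = [h for h in requested_host_ids if h in existing]
    return to_create, to_reuse


hosts = ["host-a", "host-b", "host-c"]  # hypothetical host IDs

# First NameNode: no Journal Nodes exist yet, so all three are created.
first = split_journal_nodes(hosts, [])

# Second NameNode: all three already exist, so all three are reused.
second = split_journal_nodes(hosts, hosts)

print(first)
print(second)
```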

Does anyone know of a workaround for this issue, or whether I can file a bug for this API failure?

