Saturday, January 29, 2011

Migration attempt to go from HACMP 5.2 to 5.3 fails verification/synchronization

There was a migration attempt to go from HACMP 5.2 to 5.3, and now running the verification/synchronization fails
with the following:
cldare: Migration has been detected.

ACTION TAKEN:
AIX 5.3, HACMP 5.3


* odmget HACMPcluster ==> The cluster_version field should be equal to
the following:
 HACMP 5.2    ==> cluster_version = 7
 HACMP 5.3    ==> cluster_version = 8

 Fix:
 #odmget HACMPcluster > cluster.file
 #vi cluster.file ==> correct the field
 #odmdelete -o HACMPcluster ==> removes the contents of the object class
 #odmadd cluster.file
 #odmget HACMPcluster ==> should now show the correct cluster_version
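
 For reference, a rough sketch of what the corrected stanza might look like after the edit (the id and name below are made-up placeholders; only cluster_version matters here):

 HACMPcluster:
         id = 1234567890
         name = "my_cluster"
         ...
         cluster_version = 8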

* odmget HACMPnode | more ==> The version for all the nodes in the
cluster should also be:
 HACMP 5.2    ==> version = 7
 HACMP 5.3    ==> version = 8

 Fix:
 #odmget HACMPnode > nodes.file
 #vi nodes.file ==> correct the fields
 #odmdelete -o HACMPnode ==> removes the contents of the object class
 #odmadd nodes.file
 #odmget HACMPnode ==> should now show the correct version
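
 A quick check that the edit took (assuming the target level is HACMP 5.3, i.e. version = 8):

 #odmget HACMPnode | grep version ==> every node stanza should report version = 8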

* odmget HACMPrules | more ==> You could run into a problem with the
following 3 rules:
 TE_JOIN_NODE
 TE_FAIL_NODE
 TE_RG_MOVE

 The recovery_prog_path should be set to the following:
 "/usr/es/sbin/cluster/events/<event.rp>"

 We have seen issues where for those 3 rules it gets changed to:
 "/usr/lpp/save.config/usr/es/sbin/cluster/events/<event.rp>"

 Fix:
 #odmget HACMPrules > rules.file
 #vi rules.file ==> correct the fields
 #odmdelete -o HACMPrules ==> removes the contents of the object class
 #odmadd rules.file
 #odmget HACMPrules ==> should now show the correct path
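
 A quick hedged check for the same issue (assuming the save.config path is the wrong one, as above):

 #odmget HACMPrules | grep recovery_prog_path | grep save.config
  ==> no output means none of the three rules still points at the save.config copy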

 We made the changes listed above and tried the synchronization
again. It failed, saying that it could not connect to the secondary node.

Check /usr/es/sbin/cluster/etc/rhosts --> ALL IP addresses for both
nodes should be in this file.
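
As an illustration only (the addresses below are made up), the rhosts file is simply one IP address per line, covering every interface on both nodes:

 #cat /usr/es/sbin/cluster/etc/rhosts
 10.1.1.1
 10.1.1.2
 192.168.10.1
 192.168.10.2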

 We modified the rhosts file and he is no longer getting the
migration error; however, he is getting other configuration errors.




An event six error (ODM internal failure on the node) appeared after a restart of
   the cluster was attempted from his script, which caused a TE_JOIN_NODE.



  After running the following command (smitty clstart), the cluster on the node
  started correctly, whereas his script had failed to synchronize
  things.

      # smitty clstart -> ok

   - verification and synchronization was successful this time.

   - started the cluster on the secondary node.
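
   One hedged way to confirm the cluster manager actually came up on each node
   after smitty clstart (the output wording varies slightly by HACMP level):

      # lssrc -ls clstrmgrES ==> look for "Current state: ST_STABLE"
      # lssrc -g cluster     ==> the cluster subsystems should show "active"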


-----------------------


We got an error message saying that HACMPrules was not found.
--> on the main node: HACMPrules is not found in /etc/es/objrepos,
    but the symlink exists.
--> we did an rcp from the backup node.

Then we tried to sync --> this time it is HACMPsrvc that is not found.
--> we did an rcp of /etc/es/objrepos/* from the backup node to the main node.
--> the sync from the backup node to the main node is OK.

smit clstart -> OK on both nodes
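
For reference, the copy was done roughly along these lines (the node name below is a placeholder, and backing up the local object classes first is a good idea):

 #cp -rp /etc/es/objrepos /etc/es/objrepos.bak
 #rcp bcknode:/etc/es/objrepos/HACMPrules /etc/es/objrepos/HACMPrules
 #rcp "bcknode:/etc/es/objrepos/*" /etc/es/objrepos/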
