OK, here's a quick one about an issue and its resolution (sort of) that I ran into today. Only time will tell whether I actually fixed the error or whether it will come back.
Basically, I was running updates on a few hosts today, taking them in and out of maintenance mode, when the following error occurred:
"Cannot complete the configuration of the HA agent on the host. Unable to contact a primary HA agent."
Sounds kind of scary at first. I had no clue what was causing this, and I was a little worried that, through some sort of magic, a host might be declared isolated and HA would start restarting VMs.
My first move was to disable and re-enable HA. The process seemed to take a long time and always got stuck at 72%. I restarted the vCenter Server service to cancel the tasks, then tried re-enabling HA again…same slowness, same hang at 72%.
After a few attempts I finally decided to wait and see whether it was really hung. After a while (I'm not sure exactly how long, but it was quite a while) the HA configuration failed on that host and moved on to the next one…which in turn took the same amount of time and failed. It wasn't until I let this process continue and fail on every host that things started to work as expected.
Once the tasks had timed out on their own, it was just a matter of disabling HA, letting that process finish, then re-enabling HA and letting that process finish too. After that, all of my HA errors had cleared. I'm peachy now!
Moral of the story: don't mess with the HA tasks. Even if they seem to be taking forever, just let them be and let them time out by themselves 🙂
Also, note to self: buy Duncan and Frank's book. I think I need it!!
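
If you'd rather script the "disable, wait, re-enable, wait" routine than click through it, here's a rough Python sketch of the idea. It is not what I actually ran, and it doesn't talk to vCenter at all: `set_ha_enabled` is a hypothetical function you would have to supply yourself, which kicks off the cluster reconfigure and hands back a callable that reports whether the task has ended (success or failure). The only point is the shape of the loop: poll patiently and never cancel.

```python
import time
from typing import Callable, List, Tuple


def wait_until_done(
    is_finished: Callable[[], bool],
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the task reports it has ended; never cancel it.

    Returns how many times we had to wait before it finished.
    """
    polls = 0
    while not is_finished():
        polls += 1
        sleep(poll_seconds)
    return polls


def cycle_ha(
    set_ha_enabled: Callable[[bool], Callable[[], bool]],
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[bool, int]]:
    """Disable HA and wait for it to finish, then enable HA and wait again.

    set_ha_enabled is a HYPOTHETICAL hook (an assumption, not a real API):
    it starts the reconfigure task and returns a callable that is True
    once that task has completed or timed out on its own.
    """
    history = []
    for enabled in (False, True):
        is_finished = set_ha_enabled(enabled)
        polls = wait_until_done(is_finished, poll_seconds, sleep)
        history.append((enabled, polls))
    return history
```

The `sleep` parameter is only there so the loop can be exercised with a fake task without actually waiting; in real use you'd leave it at the default and pick a poll interval you're comfortable with.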