Folks, over the past few weeks this vSphere 5.1 error has turned into an interesting discussion on a VMware Community Forum. In fact, I ran into the same issue myself a couple of weeks ago on a new build in the vCloud Service Evaluation cloud. Being a good citizen and a VMware employee, I decided to start an internal discussion to see where it could lead. Below is a screenshot of the error you MIGHT see. What I can tell you now, however, is that this error actually has NOTHING to do with the root cause, so let me explain.
The vSphere Setup To Reproduce:
- vSphere 5.1 and/or vCloud Director
- CentOS 6.3 Guest (although other versions also seem to hang)
- Tools installed post OS installation
- Wait about 10-15 minutes
- Reboot the Guest Virtual Machine
This is rather easy to reproduce and is something anyone can do. What I can say for sure is that this is not related to the tools version, and not even to the virtual hardware version; it is something between the tools and vSphere 5.1 specifically, regardless of versions. Here is what I actually tested:
- HW8 and Tools Version 8 running on vSphere 5.1
- HW9 and Tools Version 9 running on vSphere 5.1
- With the tools uninstalled, things work fine
The error itself is incidental; it just happens to be displayed while the Virtual Machine is trying to finish booting. I know this because I was asked to start renaming certain libraries, including libguestinfo.so. Renaming that one removed the error from the screen, but the Virtual Machine still stalled on boot. The error is completely unrelated to the underlying problem, which is that the Virtual Machine hangs on boot for about 10-15 minutes.
The Real Problem
So in the spirit of troubleshooting, engineering used my test Virtual Machines to go through the other libraries until we isolated the libtimeSync.so library. Now, if you recall, way back in the early days of vCloud Director I ran into a time sync issue that was ultimately caused by NTP not being enabled on a cell. What I pointed out in that article was that EVERY virtual machine with the tools installed will ALWAYS sync its time once on boot, regardless of whether the tools are set to sync with the host or not. Only once the tools are running is the sync-with-host setting enforced.
What appears to be happening here, and we don’t yet have the “why”, is that the time sync library is causing the hang on boot. Now that we know where it is hanging and which library is causing it, we can apply a quick workaround, which should be good enough for most people until we get a final root cause for why the library is taking so long.
The vSphere Workaround
This is so simple you are going to wonder why I made you read the rest of the article. Well, frankly, I think it’s important to understand the troubleshooting we went through to isolate the issue, so you know why it’s happening and not just the “fix”. The answer right now seems to be simple, assuming you only have access to the vSphere guest itself, like I do in a fully hosted environment:
- Rename libtimeSync.so to libtimeSync.so.bak (or something else you choose)
- There is also a KB Article that you might want to read; see if you can try that as well if you have access to the full back end
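If you would rather script the rename than do it by hand, here is a minimal sketch in Python. To be clear about the assumptions: this is not an official VMware tool, the helper names find_plugin and disable_plugin are made up for illustration, and the directory to search is something you pass in yourself. It defaults to a dry run so nothing changes on disk until you ask for it.

```python
from pathlib import Path


def find_plugin(search_root, name="libtimeSync.so"):
    """Return every regular file called `name` anywhere below search_root.

    search_root is supplied by the caller; this sketch does not assume
    where the tools are installed.
    """
    return sorted(p for p in Path(search_root).rglob(name) if p.is_file())


def disable_plugin(path, suffix=".bak", dry_run=True):
    """Rename e.g. libtimeSync.so to libtimeSync.so.bak and return the new path.

    With dry_run=True (the default) nothing is touched on disk; the
    function only reports what the new name would be.
    """
    path = Path(path)
    target = path.with_name(path.name + suffix)
    if target.exists():
        # Never clobber an earlier backup.
        raise FileExistsError(f"{target} already exists, refusing to overwrite")
    if not dry_run:
        path.rename(target)
    return target
```

Calling disable_plugin(lib, dry_run=False) on each result of find_plugin(your_tools_directory) performs the actual rename, and you will most likely need root to do it. Renaming the .bak file back to its original name undoes the workaround.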
The File locations are as follows:
Since most of you running Linux are most likely using NTP, this shouldn’t interfere with the overall time sync of your guest; NTP should start up soon after boot and do its own sync, and you should be good to go until a longer-term explanation and fix is put forth. This should get you by and allow you to keep the tools installed without the warm reboot hang.
I hope this helps. As always, troubleshooting is an ongoing effort of trial and error. Thanks to the vCloud Service Evaluation setup for providing a fast, easy way to reproduce this AND to give others access to help continue troubleshooting the problem. If this had only been in my lab, it would have been much harder to provide that access.