Troubleshooting vCloud Director Service Startup

This is a quick post to help folks troubleshoot the startup of their vCloud Director cell services.  I have seen only a few things that prevent a cell from starting up all the way.  In my experience the ones below are the most common, and they can show up after the cells were already running and you then made changes.

  • Cannot bind to IP Addresses or Certificates
  • DB Connection Issues
  • DNS Lookup both Forward and Reverse
  • Cannot verify the transfer space

Of the four above, the two most common are database connection issues and the transfer space.  Database connection issues can occur if the tablespaces fill up and the DBAs have not allowed growth, or if the database server is simply down.  The transfer space issue is usually related to the mount point in /etc/fstab, which you may have updated to support multiple cells.  The first place to look is $VCLOUD_HOME/logs/cell.log.

NOTE:  $VCLOUD_HOME is usually /opt/vmware/cloud-director for 1.0 and /opt/vmware/vcloud-director for 1.5
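
Since the transfer space problem usually traces back to the fstab entry, it can help to see exactly what is configured for that mount point. Below is a minimal Python sketch, not an official VMware tool. The function names `parse_fstab` and `find_mount` are made up for illustration, and the only thing assumed about the file is the standard whitespace-separated fstab layout (device, mount point, filesystem type, options).

```python
def parse_fstab(text):
    """Return one dict per non-comment, non-blank fstab line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line: skip it rather than guess
        entries.append({
            "device": fields[0],
            "mount_point": fields[1],
            "fs_type": fields[2],
            "options": fields[3].split(","),
        })
    return entries


def find_mount(entries, mount_point):
    """Return the entry for mount_point, or None if it is not listed."""
    wanted = mount_point.rstrip("/")
    for entry in entries:
        if entry["mount_point"].rstrip("/") == wanted:
            return entry
    return None
```

To use it, read /etc/fstab into a string, pass it to `parse_fstab`, and call `find_mount` with the transfer directory under your own $VCLOUD_HOME. A result of `None` means the transfer space is not listed in fstab at all, which is worth knowing before you go looking at permissions.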

Either a database connection problem or a transfer space problem can keep a cell from starting up fully.  Below is an example cell.log from a fully started cell so you can compare it against your own.  This log is rewritten every time the application restarts, so unless you kept a copy from a successful startup, you can use this one for reference.

Application startup begins: 8/21/11 7:30 AM
Successfully bound network port: 80 on host address: 192.168.110.xxx
Successfully bound network port: 443 on host address: 192.168.110.xxx
Application Initialization: 9% complete. Subsystem 'com.vmware.vcloud.common.core' started
Successfully connected to database: jdbc:oracle:thin:@Oracle01.test.local:1521/orcl
Successfully bound network port: 443 on host address: 192.168.110.yyy
Successfully bound network port: 61616 on host address: 192.168.110.xxx
Successfully bound network port: 61613 on host address: 192.168.110.xxx
Application Initialization: 18% complete. Subsystem 'com.vmware.vcloud.common-util' started
Application Initialization: 27% complete. Subsystem 'com.vmware.vcloud.consoleproxy' started
Application Initialization: 36% complete. Subsystem 'com.vmware.vcloud.vlsi-core' started
Application Initialization: 45% complete. Subsystem 'com.vmware.vcloud.vim-proxy' started
Successfully verified transfer spooling area: /opt/vmware/cloud-director/data/transfer
Application Initialization: 54% complete. Subsystem 'com.vmware.vcloud.backend-core' started
Application Initialization: 63% complete. Subsystem 'com.vmware.vcloud.imagetransfer-server' started
Application Initialization: 72% complete. Subsystem 'com.vmware.vcloud.rest-api-handlers' started
Application Initialization: 81% complete. Subsystem 'com.vmware.vcloud.ui.configuration' started
Application Initialization: 90% complete. Subsystem 'com.vmware.vcloud.jax-rs-servlet' started
Application Initialization: 100% complete. Subsystem 'com.vmware.vcloud.ui-vcloud-webapp' started
Application Initialization: Complete. Server is ready in 0:46 (minutes:seconds)
Successfully initialized ConfigurationService session factory
Successfully started scheduler
Successfully started remote JMX connector on port 8999

There are some key things to note in the above log.  If the cell does not get past 9%, check with the DBAs, since the next step is the database connection.  If the startup fails to verify the transfer space, you will get an error in cell.log at that step.  The most common reason is that the transfer space is not writable.  If the issue is binding to the IP addresses or certificates, you should also see that in cell.log.  Per Timo’s comment I added DNS to the list above.  I have always had DNS configured in my lab, so I have not run into this particular issue.  Thanks to @Timo for pointing that one out!
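
The comparison described above can be sketched as a small Python script. This is only an illustration: it looks for the success lines that appear in the sample log and reports which ones are missing. It does not try to match error messages, because their exact wording is not shown in this post. The function names `summarize_cell_log` and `next_check` are made up for the example.

```python
import re

PERCENT = re.compile(r"Application Initialization: (\d+)% complete")
BOUND = re.compile(r"Successfully bound network port: (\d+) on host address: (\S+)")


def summarize_cell_log(text):
    """Summarize the startup milestones found in cell.log text."""
    percents = [int(m.group(1)) for m in PERCENT.finditer(text)]
    return {
        "last_percent": max(percents) if percents else 0,
        "bound_ports": [(int(port), host) for port, host in BOUND.findall(text)],
        "database_connected": "Successfully connected to database" in text,
        "transfer_verified": "Successfully verified transfer spooling area" in text,
        "complete": "Application Initialization: Complete" in text,
    }


def next_check(summary):
    """Suggest where to look next, in the order the sample log progresses."""
    if summary["complete"]:
        return "Cell started fully."
    if not summary["bound_ports"]:
        return "No ports bound: check IP addresses and certificates."
    if not summary["database_connected"]:
        return "No database connection: check with the DBAs, then DNS."
    if not summary["transfer_verified"]:
        return "Transfer space not verified: check the mount and its permissions."
    return "Stalled at %d%%: read the rest of cell.log for errors." % summary["last_percent"]
```

Read $VCLOUD_HOME/logs/cell.log into a string, pass it to `summarize_cell_log`, and print the result of `next_check`. It is a triage hint only; the log itself is still the source of truth.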

Transfer Space Permissions

In case I did not document it elsewhere, I will cover the permissions of $VCLOUD_HOME/data/transfer here.  vCloud Director creates a “vcloud” user and a “vcloud” group.  When you remount the transfer directory over NFS, BOTH the UID and GID for “vcloud” must be set on the mount point.  If only one is set, the cell may fail to validate the transfer space.  Note the user and group ownership below.  The ownership must also be applied RECURSIVELY to the subfolders.  Depending on the storage device, you may need to look up the UID and GID and give them to the storage folks so they can set permissions.  If you have multiple cells that were NOT created from a template, they may have DIFFERENT UIDs and GIDs, and you might want to edit them to be the same.

[root@ColottiVCD01 data]# ls -l
total 16
drwx------  3 vcloud vcloud 4096 Aug 21 07:30 activemq
drwx------  2 vcloud vcloud 4096 Sep 13  2010 generated-bundles
drwxr-x---+ 4 vcloud vcloud   33 Aug 23 09:17 transfer
drwx------  2 vcloud vcloud 4096 Aug 21 07:30 txlog
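
To check the ownership recursively rather than by eye, something like the following Python sketch could be run as root on each cell. It is an illustration under a few assumptions: the function names `find_wrong_owner` and `vcloud_ids` are made up, and the “vcloud” user and group names are the ones mentioned above.

```python
import grp
import os
import pwd


def find_wrong_owner(root, uid, gid):
    """Return (path, uid, gid) for root and everything under it not owned by uid:gid."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            paths.append(os.path.join(dirpath, name))
    mismatches = []
    for path in paths:
        info = os.lstat(path)  # lstat: report a symlink itself, not its target
        if info.st_uid != uid or info.st_gid != gid:
            mismatches.append((path, info.st_uid, info.st_gid))
    return mismatches


def vcloud_ids(user="vcloud", group="vcloud"):
    """Look up the numeric UID and GID, e.g. to hand to the storage team."""
    return pwd.getpwnam(user).pw_uid, grp.getgrnam(group).gr_gid
```

Calling `vcloud_ids()` on each cell shows quickly whether the cells agree on the numbers, and passing those numbers to `find_wrong_owner` along with the transfer directory lists anything that still has the wrong owner. An empty list means the ownership is consistent all the way down.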

If all these things are working, the cell should start up and you will see something similar to the above example in cell.log.
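
For the DNS item in the list at the top, a forward and reverse lookup can be sketched in Python with the standard `socket` module. This is a minimal example, not a full DNS diagnostic. The function name `check_dns` and its `resolve` and `reverse` parameters are made up for illustration; the parameters exist only so the function can be tried without a live DNS server.

```python
import socket


def _normalize(name):
    """Lowercase a DNS name and drop any trailing dot for comparison."""
    return name.rstrip(".").lower()


def check_dns(hostname, resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Forward-resolve hostname, then reverse-resolve the address returned."""
    result = {"hostname": hostname, "address": None,
              "reverse_name": None, "matches": False}
    try:
        result["address"] = resolve(hostname)
        # gethostbyaddr returns (primary_name, alias_list, address_list)
        result["reverse_name"] = reverse(result["address"])[0]
    except OSError:
        # socket.gaierror and socket.herror are both OSError subclasses
        return result
    result["matches"] = _normalize(result["reverse_name"]) == _normalize(hostname)
    return result
```

Run it with the fully qualified name of the cell and of the database server. If `address` is empty, the forward lookup failed; if `reverse_name` is empty, the reverse lookup failed. Note that a short hostname will report `matches` as false when the reverse lookup returns the fully qualified name, so pass the full name when you can.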

2 comments

  1. Hi Chris,

    Another very common error is a DNS issue at 9%.

    You need to have proper DNS configuration (forward AND reverse).

    PS: you can troubleshoot that quite easily by checking vcloud-container-debug.log in the logs directory (/opt/vmware/cloud-director/logs for v1.0.x and /opt/vmware/vcloud-director/logs for v1.5.x).

    Regards,
