VMware Cloud Director Upgrade History

Hi everyone, and welcome to a very short post. It covers a question I was asked today: “Do you know the upgrade history of our VMware Cloud Director environment?”

You can find the upgrade steps at the following link: https://docs.vmware.com/en/VMware-Cloud-Director/10.3/VMware-Cloud-Director-Install-Configure-Upgrade-Guide/GUID-BE64E6B6-EEB5-40FD-8797-45EF62854C06.html

Note: The following steps apply only to appliance deployments.

To troubleshoot an upgrade, there is a log file you can check to monitor the upgrade process:

/opt/vmware/var/log/vami/updatecli.log

In this file you can see which version you are upgrading from and which version you are upgrading to.

That said, I put together a one-liner:

grep "to version" /opt/vmware/var/log/vami/updatecli.log

The result of that one-liner is the complete upgrade history. Keep in mind that if you delete this file or redeploy a cell for any reason, that information will no longer be available.
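If you want something slightly more defensive than the one-liner, here is a minimal sketch wrapped in a function. It only assumes what is described above: that the upgrade lines contain the text "to version". The backup path in the usage comment is hypothetical, and the function name is my own.

```shell
#!/bin/sh
# Print the upgrade history recorded in the VAMI update log.
# The default path is the one used in this post; pass another path
# (for example a copy of the file saved elsewhere) as the first argument.
upgrade_history() {
    log_file="${1:-/opt/vmware/var/log/vami/updatecli.log}"

    # The file is gone if it was deleted or the cell was redeployed.
    if [ ! -r "$log_file" ]; then
        echo "Cannot read $log_file; the upgrade history is not available." >&2
        return 1
    fi

    # -F treats the pattern as a fixed string, -n prefixes each match
    # with its line number in the log.
    grep -nF "to version" "$log_file"
}

# Usage:
#   upgrade_history
#   upgrade_history /root/updatecli.log.backup   # hypothetical backup copy
```

The line numbers make it easier to jump to the surrounding context in the log if one of the upgrades needs a closer look.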

I hope you enjoyed this quick post.

If you are interested in a specific topic, let me know in the comments section below and I’ll be happy to write a new post about it.


Troubleshooting VMware Cloud Director Log Bundle generation

I know it’s been a while since my last post, so I’m very happy to be back on the road.

Last week I had some issues in a VMware Cloud Director environment that ended with me opening a support request with VMware. During the case, the support team asked me to create a log bundle, but I wasn’t even able to generate it, so that’s where this story begins. Please note that the modifications I made might not be supported by VMware.

For those who are not familiar with how to generate a VCD log bundle, here are the steps: https://kb.vmware.com/s/article/1026312

The error I was getting was the following:

The general error message
Some cells were showing a FAIL status. This image shows just one, but most of them had the same status

The first step in troubleshooting this situation was to identify which cells had a FAIL status. The CELLID can be obtained with the following command:

grep cell.uuid /opt/vmware/vcloud-director/etc/global.properties
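If you need only the ID itself, for example to compare it against the list of failed cells, a small sketch like the following might help. It assumes the entry uses the usual properties layout of cell.uuid = value on a single line; the function name is my own.

```shell
#!/bin/sh
# Print only the value of cell.uuid from a properties file.
# Assumes a "cell.uuid = <value>" line, with optional spaces around "=".
cell_uuid() {
    props="${1:-/opt/vmware/vcloud-director/etc/global.properties}"

    # Match the key at the start of a line (the dot is escaped so it is
    # taken literally), strip the key and separator, print what is left.
    sed -n 's/^cell\.uuid[[:space:]]*=[[:space:]]*//p' "$props"
}

# Usage:
#   cell_uuid
```

Run it on each cell and you get a plain list of IDs to match against the ones reported with a FAIL status.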

To troubleshoot the bundle generation process, there are two log files located under /opt/vmware/vcloud-director/logs:

vmware-vcd-log-collection-agent.log

This first log file shows whether the cell detects the marker file that tells it to start generating a log bundle. In a normal situation it should look like the following:

This means that the cell has detected a marker file
This is the marker file located in
/opt/vmware/vcloud-director/data/transfer/vmware-vcd-support/log-collector

From that side everything looked fine, so I moved on to the other file to understand what was happening.

vmware-vcd-log-publisher.log
Normal log file situation. Started and completed

Analyzing the logs, I found two different errors. The first one was that a log collection process was still running. I fixed this by shutting down the cell services and starting them again.

Abnormal situation, a log collection process was still running

The second error was just a timeout. I wondered where that timeout was configured so that I could extend it.

Abnormal situation. Timeout reached and marker has been removed

Once I realized that, I started digging deeper into the support bundle generation script:

/opt/vmware/vcloud-director/bin/vmware-vcd-support 

I’m not going to cover the complete script because it’s not the central topic of this post. In that file I found some variables that pointed me to the multi-cell log collection script, so it was time to check that file as well.

vmware-vcd-support script variables

I opened the file with the following command and found that there was a timeout variable that I could modify.

 vi ${VCLOUD_HOME}/bin/vmware-vcd-multi-cell-log-collector
Timeout variable without modifications
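Before editing a vendor script like this, it may be worth locating the relevant lines and keeping a copy of the original. The following is only a sketch: I am assuming the word "timeout" appears in the variable name or a nearby comment, and the function names and backup naming scheme are my own.

```shell
#!/bin/sh
# Show every line that mentions "timeout" (case-insensitive),
# prefixed with its line number.
find_timeouts() {
    grep -n -i 'timeout' "$1"
}

# Keep a timestamped copy next to the original before editing it.
# -p preserves the mode, ownership and timestamps of the original.
backup_script() {
    cp -p "$1" "$1.bak.$(date +%Y%m%d%H%M%S)"
}

# Usage:
#   find_timeouts "${VCLOUD_HOME}/bin/vmware-vcd-multi-cell-log-collector"
#   backup_script "${VCLOUD_HOME}/bin/vmware-vcd-multi-cell-log-collector"
```

With a backup in place it is easy to put the script back the way it was once the support case is closed.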

So I decided to modify that value and try again. The script does not always wait for the full value; it is just a maximum timeout. If the cell bundle generation finishes earlier, the script continues.

Modified timeout value
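To illustrate what "maximum timeout" means here, this is a generic polling loop of my own. It is not taken from the VMware script; the function name, arguments and the idea of waiting for a file are all assumptions made for the example.

```shell
#!/bin/sh
# Wait until a file exists, but give up after max_wait seconds.
# Returns 0 as soon as the file appears, 1 if the deadline passes first.
wait_for_file() {
    target="$1"
    max_wait="$2"
    interval="${3:-1}"   # seconds between checks
    waited=0

    while [ ! -e "$target" ]; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1     # deadline reached, stop waiting
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0             # file showed up before the deadline
}

# Usage (hypothetical path):
#   wait_for_file /tmp/bundle.done 900 5
```

Raising max_wait in a loop like this costs nothing when the work finishes quickly, because the loop exits on the first check that succeeds.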

After modifying that timeout everything worked again, so I returned to vmware-vcd-log-publisher.log to check how long the process was taking, just out of curiosity. It was taking between 8 and 9 minutes, a little more than the default timeout.

I ended up cleaning up old logs, and after that the process ran smoothly again in less than 7 minutes.
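If you want to see what a cleanup could free up before removing anything, a listing like the one below is a safe first step. It is a sketch under my own assumptions: that the old logs live in the logs directory mentioned earlier, and that 30 days is a reasonable age threshold. Nothing is deleted.

```shell
#!/bin/sh
# List files in a log directory last modified more than N days ago,
# with their size in kilobytes. This only reports; it deletes nothing.
list_old_logs() {
    log_dir="${1:-/opt/vmware/vcloud-director/logs}"
    days="${2:-30}"      # arbitrary default, adjust to your retention needs

    # -mtime +N matches files older than N days;
    # du -k prints the size in KB followed by the file name.
    find "$log_dir" -type f -mtime +"$days" -exec du -k {} +
}

# Usage:
#   list_old_logs
#   list_old_logs /opt/vmware/vcloud-director/logs 60
```

Review the list before removing anything, especially while a support case is open and the older logs might still be requested.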

There is another reason why this process might fail, and it’s detailed in this KB article: https://kb.vmware.com/s/article/71349

If you are interested in a specific topic, let me know in the comments section below and I’ll be happy to write a new post about it.
