Tuesday, September 18, 2012

Troubleshooting Disk and Datastore Related Issues  SHOOT:2


SHOOT: 2

VMFS Lock Volume is Corrupted

Details

You may observe the following event in the /var/log/vmkernel log on your VMware ESX host:
Volume 4976b16c-bd394790-6fd8-00215aaf0626 (san-lun-100) may be damaged on disk. Corrupt lock detected at offset 0

Note: In this example 4976b16c-bd394790-6fd8-00215aaf0626 represents the UUID of the VMFS datastore and san-lun-100 represents the name of the VMFS datastore.
You may also observe the following event in the same log:
Resource cluster metadata corruption detected Volume 4976b16c-bd394790-6fd8-00215aaf0626 (san-lun-100) may be damaged on disk.

Note: As in the previous example, 4976b16c-bd394790-6fd8-00215aaf0626 represents the UUID of the VMFS datastore and san-lun-100 represents its name.
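
To check whether a host has logged either of the events above, the log can be searched for the two message strings. The following is a minimal sketch, not a VMware-provided tool: the function name find_vmfs_corruption is hypothetical, and the default path is the /var/log/vmkernel location mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print every vmkernel log line that reports VMFS
# lock or resource cluster metadata corruption.
find_vmfs_corruption() {
    log=${1:-/var/log/vmkernel}   # default taken from the article; pass another file to override
    grep -E 'Corrupt lock detected|Resource cluster metadata corruption detected' "$log"
}
```

Calling find_vmfs_corruption with no argument searches the default log. Each matching line includes the UUID and name of the affected datastore.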

Solution

The events indicate that the reported VMFS volume is corrupt. The scope and the cause of the corruption may vary. The corruption may affect just one file or the entire volume.
Create a new datastore and restore any information that may have been compromised to the new datastore from existing backups. Do not use the corrupt VMFS datastore any longer.

Note: If some information is still accessible on the datastore that is reportedly corrupt, you may attempt to migrate it off the datastore using the vCenter migrate feature, vmkfstools, or the datastore browser. If you are able to migrate any information off the corrupt datastore, validate it to ensure that it has not been affected by the corruption.
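
As an illustration of the vmkfstools option mentioned in the note, the sketch below clones each virtual disk descriptor in a virtual machine directory to a directory on a healthy datastore using vmkfstools -i. The function name clone_vm_disks and both directory arguments are hypothetical. The sketch assumes the virtual machine is powered off and has no snapshots, and it does not copy the .vmx or other configuration files.

```shell
#!/bin/sh
# Hypothetical helper: clone every virtual disk in SRC_DIR to DST_DIR.
# Assumes the VM is powered off and has no snapshots.
clone_vm_disks() {
    src_dir=$1 dst_dir=$2
    mkdir -p "$dst_dir" || return 1
    for vmdk in "$src_dir"/*.vmdk; do
        case $vmdk in
            # Skip data extents; they are cloned through their descriptor file.
            *-flat.vmdk|*-delta.vmdk) continue ;;
        esac
        [ -e "$vmdk" ] || continue   # the glob matched nothing
        vmkfstools -i "$vmdk" "$dst_dir/$(basename "$vmdk")" \
            || echo "clone failed: $vmdk" >&2
    done
}
```

A clone that completes without errors only shows that the disk was readable. The contents still need to be validated, as the note says.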

Determining the cause of the corruption

If you would like assistance in determining the cause of the corruption, VMware technical support can help on a best-effort basis.

Note: For more information, see the VMware support service terms and conditions.

To collect the appropriate information to diagnose the issue:

Log in to the service console as root.
Find the partition that contains the volume. In the case of a spanned volume, this is the head partition. Run the following command to find the value of the partition:

vmkfstools -P /vmfs/volumes/<volumeUUID>

For example, run the following command to find the partition for 4976b16c-bd394790-6fd8-00215aaf0626:

# vmkfstools -P /vmfs/volumes/4976b16c-bd394790-6fd8-00215aaf0626

File system label (if any): san-lun-1000
Mode: public
Capacity 80262201344 (76544 file blocks * 1048576), 36768317440 (35065 blocks) avail
UUID: 49767b15-1f252bd1-1e57-00215aaf0626
Partitions spanned (on "lvm"): naa.60060160b4111600826120bae2e3dd11:1
Make note of the first device listed in the output for the Partitions spanned list. This is the value for the partition. In the above example, the first device is:
naa.60060160b4111600826120bae2e3dd11:1
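
If you prefer to extract this value with a script instead of reading it by eye, the following sketch filters the vmkfstools -P output for the first entry in the Partitions spanned list. The function name first_partition is hypothetical. The sketch assumes the output layout shown above, and also accepts a layout where the devices are listed on the lines after the label.

```shell
#!/bin/sh
# Hypothetical helper: read "vmkfstools -P" output on stdin and print the
# first device:partition entry from the "Partitions spanned" list.
first_partition() {
    awk '
        /^Partitions spanned/ {
            sub(/^[^:]*:[ \t]*/, "")   # drop the label up to its colon
            found = 1
        }
        # First non-empty text after the label holds the head partition.
        found && NF { gsub(/,/, " "); print $1; exit }
    '
}
```

For example, vmkfstools -P /vmfs/volumes/<volumeUUID> | first_partition prints the head partition for that volume.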
Using the partition value noted in the previous step, run the following command to save the VMFS3 metadata region so that you can provide it to VMware customer support:

dd if=/vmfs/devices/disks/<disk:partition> of=/root/dump bs=1M count=1200 conv=notrunc

Note: The variable <disk:partition> is the partition value noted from the vmkfstools -P output.
Caution: The resulting file is approximately 1200 MB in size. Ensure that you have adequate space on the destination. The destination in the above example is the /root/ folder. To compress the file, you can use an open source utility called gzip. The following is an example of the command:

# gzip /root/dump

Note: For more information on the gzip utility, type man gzip at the console.
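
The dump and compression commands above can be combined with a free-space check, as in the sketch below. The function names has_free_mb and dump_vmfs_metadata are hypothetical. The dd and gzip invocations are the ones shown above, and the 1200 MB threshold comes from the caution about the size of the resulting file. The check does not account for the extra space gzip needs while it writes the compressed copy.

```shell
#!/bin/sh
# Hypothetical helper: succeed when DIR has at least NEED_MB megabytes free.
has_free_mb() {
    dir=$1 need_mb=$2
    # df -Pk prints one POSIX-format line per file system; column 4 is free KB.
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge $((need_mb * 1024)) ]
}

# Hypothetical helper: dump the first 1200 MB of PARTITION to OUT, then compress it.
dump_vmfs_metadata() {
    partition=$1 out=${2:-/root/dump}
    if ! has_free_mb "$(dirname "$out")" 1200; then
        echo "not enough free space for $out" >&2
        return 1
    fi
    dd if="/vmfs/devices/disks/$partition" of="$out" bs=1M count=1200 conv=notrunc || return 1
    gzip "$out"   # produces $out.gz and removes the uncompressed file
}
```

For the example volume, the call would be dump_vmfs_metadata naa.60060160b4111600826120bae2e3dd11:1, which leaves /root/dump.gz ready to upload.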
Create a new support request. For more information, see How to Submit a Support Request. Upload the resulting file along with a full support bundle to VMware technical support.
