Sunday, March 18, 2012

XenServer 5.6 FP1 VDI Issue

Recently I ran into a situation where one of my production VMs went unresponsive: the VM console showed various VDI-related (xvda) I/O errors and the machine was halted.
It's worth mentioning that my XenServer (5.6 FP1) nodes operate in pool mode, and the problematic VDI resided on an iSCSI LUN which seemed to be OK.

There weren't many options other than forcing the VM to shut down with:
#xe vm-shutdown vm=vm0001 force=true

However, when I tried to power the VM back on, I got this nasty error:
18-Mar-12 9:42:16 AM Error: Starting VM 'vm0001' - Internal error: Failure("The VDI e17e2406-dbe9-40f6-98c3-af470e8aa91b is already attached in RW mode; it can't be attached in RO mode!")

Here is the workaround which did the job for me:

1. Find the UUIDs of the Storage Repository and of the VM's problematic VDI.
#xe sr-list |grep -i -C2 'your LUN name'
#xe vdi-list |grep -i -C2 'vdi name'
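
If you know the exact name labels, xe can also hand back the bare UUIDs instead of grepping through the full listing. Here is a small sketch of that idea; the function names are my own invention, and it assumes the name-label you pass matches exactly (the grep approach above is more forgiving with partial names).

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of xe).

# Print only the UUID of the SR whose name-label matches exactly.
# $1 = SR name-label
sr_uuid_by_name() {
    xe sr-list name-label="$1" params=uuid --minimal
}

# Print only the UUID(s) of VDIs with a given name-label inside one SR.
# $1 = SR UUID, $2 = VDI name-label
vdi_uuid_by_name() {
    xe vdi-list sr-uuid="$1" name-label="$2" params=uuid --minimal
}

# Example use on a pool member (values are placeholders):
#   SR=$(sr_uuid_by_name 'your LUN name')
#   vdi_uuid_by_name "$SR" 'vdi name'
```

If more than one object shares the same name, --minimal returns a comma-separated list, so check the output before pasting it into the next steps.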

2. Next, we need to remove the VDI from the listing:
#xe vdi-forget uuid=

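Do not worry about the contents of the VDI, they are fine :) vdi-forget only removes the VDI's record from the XenServer database and does not touch the data on the storage.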
Verify the VDI is indeed gone:
#xe vdi-list |grep -i 'vdi name'

3. It's time to re-scan the storage repository that hosts the VDI via:
#xe sr-scan uuid=

4. Verify the VDI is back in the listing:
#xe vdi-list sr-uuid=

Please note that the "name" and "description" fields (name-label and name-description) are now empty.
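
For anyone who has to repeat this, steps 2 to 4 can be strung together, and the empty name fields can be filled in again afterwards. This is only a sketch: the function names are hypothetical, the UUIDs are whatever you found in step 1, and it passes the SR UUID to sr-scan as uuid=, which is how I understand the xe reference documents that command. Read the listing it prints before relabeling anything, and take the VDI UUID for relabeling from that listing rather than assuming it.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of xe).

# Forget a stuck VDI record, rescan its SR, and list what the scan found.
# $1 = SR UUID, $2 = VDI UUID (both from step 1)
reintroduce_vdi() {
    sr_uuid=$1
    vdi_uuid=$2

    # Remove only the database record; the data on the LUN is left alone.
    xe vdi-forget uuid="$vdi_uuid" || return 1

    # Rescan so the SR registers the VDIs it finds on the storage again.
    xe sr-scan uuid="$sr_uuid" || return 1

    # Show the VDIs now known for this SR.
    xe vdi-list sr-uuid="$sr_uuid" params=uuid,name-label,virtual-size
}

# Give the rediscovered VDI a recognizable name again.
# $1 = VDI UUID, $2 = new name-label, $3 = new name-description
relabel_vdi() {
    xe vdi-param-set uuid="$1" name-label="$2" name-description="$3"
}
```

The function stops at the first failing xe call, so a failed vdi-forget never leads to a rescan.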

5. Use XenCenter to reattach the VDI to your VM, and start it on a different XenServer host inside your pool (right click on the VM, select "storage"->"attach"->).
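
If XenCenter is not at hand, the same reattach-and-start step can probably be done from the CLI as well. I used XenCenter myself, so treat the following as an untested sketch: the function names are made up, and device=0 with bootable=true assumes the VDI is the VM's root disk (xvda). A leftover VBD from before the vdi-forget may still occupy that device slot, so list the VM's VBDs first.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of xe).

# Show the disks currently configured on a VM, to spot stale entries.
# $1 = VM UUID
list_vm_disks() {
    xe vbd-list vm-uuid="$1" params=uuid,userdevice,vdi-uuid
}

# Attach a VDI to a halted VM as its bootable first disk.
# $1 = VM UUID, $2 = VDI UUID
attach_root_disk() {
    xe vbd-create vm-uuid="$1" vdi-uuid="$2" device=0 bootable=true mode=RW type=Disk
}

# Start the VM on a specific pool member.
# $1 = VM name or UUID, $2 = host name
start_vm_on() {
    xe vm-start vm="$1" on="$2"
}
```

Picking the host explicitly with on= mirrors the advice above to start the VM on a different pool member than the one it was running on.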

This should do the magic.