Troubleshooting Guide

This document describes the causes of, and solutions to, common errors that ConVirt users run into.

1.1 Installation

1.1.1 CMS Installation

  • ./setup_convirt errors out with python: can't open file 'setup.py': [Errno 2] No such file or directory

This typically happens when setup cannot find the convirt-enterprise directory that contains the code.

Solution :
Check that the convirt-enterprise-<version>.tar.gz tarball was untarred in the home directory.
If the convirt-enterprise directory is in a location other than the home directory, set CONVIRT_DIR and run setup_tg2 and setup_convirt again.
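
As a minimal sketch of the step above, the helper below exports CONVIRT_DIR only when the candidate directory actually exists. The function name use_convirt_dir and the example path /opt/convirt-enterprise are illustrative assumptions, not part of ConVirt.

```shell
#!/bin/sh
# Hypothetical helper: export CONVIRT_DIR only if the directory exists.
use_convirt_dir() {
    dir="$1"
    if [ -d "$dir" ]; then
        CONVIRT_DIR="$dir"
        export CONVIRT_DIR
        echo "CONVIRT_DIR=$CONVIRT_DIR"
    else
        echo "not found: $dir" >&2
        return 1
    fi
}

# Example (the path is an assumption), followed by the setup scripts:
#   use_convirt_dir /opt/convirt-enterprise && ./setup_tg2 && ./setup_convirt
```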




  • Error downloading hashlib during setup
Solution 1 : Make sure the HTTP proxy variables are set.
    export http_proxy="http://my-corporate-proxy.domain.com:80/"
    Note : Setting https_proxy as well has been suggested, but this has not been verified.
Solution 2 : Install it manually.
 wget http://code.krypto.org/python/hashlib/hashlib-20081119.zip
 unzip hashlib-20081119.zip
 source ~/convirt-enterprise/tg2env/bin/activate
 cd hashlib-20081119
 python setup.py install
 deactivate
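
Before retrying the download, it can help to confirm which proxy variables the shell running setup can see. The sketch below is one way to do that; the function name show_proxy_vars is an assumption made for illustration.

```shell
#!/bin/sh
# Report which proxy variables are visible in the current shell.
show_proxy_vars() {
    for name in http_proxy https_proxy; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            echo "$name=$value"
        else
            echo "$name is not set"
        fi
    done
}

# Example: show_proxy_vars
```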

1.1.2 Managed Server Setup / convirt-tool

  • After running convirt-tool, the managed server "crashed"/"restarted"
Solution :
This happens when the managed server is part of a cluster. While convirt-tool is setting up the required bridges etc., the managed server loses network connectivity. This causes the fencing mechanism to kick in and shut down or restart the managed server. If the required bridges are already set up, use the "--skip_bridge" option of convirt-tool.

  • After running convirt-tool, the Xen daemon (xend) does not start again.
Solution :
This is seen mostly in SLES 11/SLES 11 SP1 environments with Xen 4.0, which expects an SSL setup.
convirt-tool creates a backup of the /etc/xen/xend-config.sxp file in the same directory. Revert the .sxp file to its original state and run the convirt-tool command with the --xen_ssl option.

1.2 Post Installation

1.2.1 Adding Managed Server from ConVirt UI

  • (111, Connection refused) : This is displayed while adding a Xen server, because ConVirt cannot talk to the xend server. To remedy this, verify that convirt-tool was run on the managed server and that xend is listening on all interfaces.
  netstat -an | grep 800
  Also, make sure that the firewall is either disabled or allows ports 8006 and 8002.
  iptables -nL | grep 800
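
The netstat check above can be made stricter with a small filter. This is a sketch that assumes the usual Linux netstat -an column layout (local address in the fourth column, state in the last); the function name listens_on_all is hypothetical.

```shell
#!/bin/sh
# Read `netstat -an` output on stdin; succeed if the given TCP port is in
# LISTEN state on a wildcard address (0.0.0.0 for IPv4, :: for IPv6).
listens_on_all() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" &&
        ($4 == "0.0.0.0:" port || $4 == ":::" port) { found = 1 }
        END { exit !found }
    '
}

# Example, on the managed server:
#   netstat -an | listens_on_all 8006 || echo "xend is not listening on all interfaces"
```

A port bound only to 127.0.0.1 fails this check, which matches the situation where the CMS gets "Connection refused".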

  • Channel Closed. : This typically means that ConVirt cannot open an ssh and/or sftp connection to the managed server.
Solution :
Check that the sftp-server location specified in /etc/ssh/sshd_config on the managed server points to a valid binary.
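
One way to perform this check is sketched below. It reads the Subsystem sftp line from sshd_config and tests whether the configured command is executable. The function names are assumptions; the built-in value internal-sftp is accepted because it is not a path on disk.

```shell
#!/bin/sh
# Print the sftp subsystem command configured in an sshd_config file.
sftp_subsystem() {
    awk 'tolower($1) == "subsystem" && $2 == "sftp" { print $3; exit }' "$1"
}

# Succeed if the configured sftp server is usable: either the built-in
# "internal-sftp" or a path to an executable file.
check_sftp_server() {
    cmd=$(sftp_subsystem "${1:-/etc/ssh/sshd_config}")
    [ "$cmd" = "internal-sftp" ] || [ -x "$cmd" ]
}

# Example, on the managed server:
#   check_sftp_server || echo "sftp-server path in sshd_config is not valid"
```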

  • Key Mismatch : This typically has one of the following causes.
1. During CMS startup, the ssh-agent setup did not happen. This typically shows up as: Failed to add identity to the agent. Key based Authentication may not work.
Solution :
Restart CMS and provide the correct password to load the keys,
or start CMS in the following fashion:
eval `ssh-agent -s`
ssh-add ~/.ssh/cms_id_rsa
./convirt-ctl start


2. The server was re-imaged and got a new identity.
Solution :
Make sure that you can log in to the managed server without a password.
Restart CMS.


3. The managed server does not have the CMS key in the ~/.ssh/authorized_keys file.
Solution :
Add ~/.ssh/cms_id_rsa.pub of the CMS user on the CMS host to ~/.ssh/authorized_keys of the root user on the managed server. A CMS restart may be required.
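
To confirm the third case, the sketch below checks whether the key material from a public key file appears in an authorized_keys file. The function name key_is_authorized and the /tmp path in the example are assumptions for illustration.

```shell
#!/bin/sh
# Succeed if the key from a public key file appears in an authorized_keys
# file. Only the base64 key field (second column) is compared, so a
# differing trailing comment does not matter.
key_is_authorized() {
    pub_file="$1"
    auth_file="$2"
    key=$(awk '{ print $2; exit }' "$pub_file")
    [ -n "$key" ] && grep -qF -- "$key" "$auth_file"
}

# Example, after copying cms_id_rsa.pub to the managed server:
#   key_is_authorized /tmp/cms_id_rsa.pub /root/.ssh/authorized_keys
```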

  • 104 : Connection reset by peer : This typically happens when the Xen server has SSL enabled (as in SLES 11/Xen 4.0) and you are trying to connect using the XML-RPC protocol.
Solution : Select XML-RPC over SSL from the drop-down in the Add Server dialog.

  • No module named xen.xend.XendClient : This happens when the Xen client libraries are not installed.
Solution :
Make sure ./install_dependency was run.
On RHEL, make sure that the system is subscribed to the RHEL Virtualization EUS Software Channel, and then run the install_dependency script again. (This will install xen and related dependencies.)


1.2.2 VNC Issues


1.2.3 Starting a Virtual Machine

  • Error: Device 0 (vif) could not be connected. Could not find bridge device xenbr0
This is because the guest networking interface could not be connected to the specified bridge.
Solution :
Make sure that convirt-tool was run on the managed server.
Check that a bridge with the name reported in the error exists on the managed server. Use brctl show on the managed server.
Use the Edit Settings menu item for the virtual machine to change the bridge/network name to a bridge that exists on the managed server.
  • ("/etc/qemu-ifup: could not launch network script\nqemu-kvm: -net tap,vlan=0: Device 'tap' could not be initialized\n", 1)
This is because the guest networking interface could not be connected to the specified bridge.
Solution :
Same as above.
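
For both errors above, the bridge check can be scripted. The sketch below assumes the usual brctl show layout: a header row, bridge rows starting in the first column, and indented continuation rows for extra interfaces. The function name has_bridge is hypothetical.

```shell
#!/bin/sh
# Read `brctl show` output on stdin; succeed if the named bridge is listed.
# The header row and indented continuation rows are ignored.
has_bridge() {
    awk -v name="$1" '
        NR > 1 && $0 !~ /^[ \t]/ && $1 == name { found = 1 }
        END { exit !found }
    '
}

# Example, on the managed server:
#   brctl show | has_bridge xenbr0 || echo "bridge xenbr0 is missing"
```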

1.2.4 High Availability

  • Can not copy the fencing script to peer node ... Exception:[Errno 13] Permission denied: u'/sbin/fence_ipmilan.hash
This is because the fencing scripts are installed under /usr/sbin on some platforms.
Solution :
Copy or link the script to the /sbin directory.
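
A minimal sketch of the linking step follows. The function name link_fence_script is an assumption, and the script name fence_ipmilan in the example is inferred from the error message above. Run it as root when targeting /sbin.

```shell
#!/bin/sh
# Link a fencing script from one directory into another if it is not
# already present there. Defaults match the case described above.
link_fence_script() {
    name="$1"
    src_dir="${2:-/usr/sbin}"
    dst_dir="${3:-/sbin}"
    if [ ! -e "$src_dir/$name" ]; then
        echo "missing: $src_dir/$name" >&2
        return 1
    fi
    # Do nothing if the destination already exists.
    [ -e "$dst_dir/$name" ] || ln -s "$src_dir/$name" "$dst_dir/$name"
}

# Example: link_fence_script fence_ipmilan
```

On systems where /sbin is itself a link to /usr/sbin, the script is already visible in both places and the function does nothing.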


1.2.5 Miscellaneous

  • /var/log/messages contains many 'Did not receive identification string from IP-ADDRESS-OF-CONVIRT SERVER'
This is because the CMS performs a port check to detect whether the server is up or down.
Solution :
Change /etc/ssh/sshd_config to contain
LogLevel ERROR
and restart sshd.
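
To verify the change, the sketch below prints the LogLevel that a given sshd_config would apply. It assumes the first uncommented LogLevel line wins and that INFO is the sshd default when none is set; the function name current_loglevel is hypothetical.

```shell
#!/bin/sh
# Print the LogLevel configured in an sshd_config file, or INFO (the sshd
# default) when no uncommented LogLevel line is present.
current_loglevel() {
    awk '
        tolower($1) == "loglevel" { print $2; found = 1; exit }
        END { if (!found) print "INFO" }
    ' "${1:-/etc/ssh/sshd_config}"
}

# Example, on the managed server:
#   [ "$(current_loglevel)" = "ERROR" ] || echo "LogLevel is not ERROR yet"
```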
 
