Troubleshooting Guide
This document describes the causes of, and solutions to, common errors that ConVirt users run into.
1.1 Installation
1.1.1 CMS Installation
- ./setup_convirt errors out with python: can't open file 'setup.py': [Errno 2] No such file or directory
This typically happens when setup cannot find the convirt-enterprise directory containing the code.
- Solution :
- Check that the convirt-enterprise-<version>.tar.gz tarball was extracted in the home directory.
- If the convirt-enterprise directory is somewhere other than the home directory, set CONVIRT_DIR to its location and run setup_tg2 and setup_convirt again.
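Before re-running setup, it can help to confirm which directory will be used. The following is a minimal sketch, not part of ConVirt: the function name `check_convirt_dir` is hypothetical, and it assumes the default location is `$HOME/convirt-enterprise` with CONVIRT_DIR overriding it.

```shell
# Hypothetical helper (not shipped with ConVirt): report which
# convirt-enterprise directory would be used.
# Assumption: the default is $HOME/convirt-enterprise and CONVIRT_DIR overrides it.
check_convirt_dir() {
    dir="${CONVIRT_DIR:-$HOME/convirt-enterprise}"
    if [ -d "$dir" ]; then
        echo "found: $dir"
    else
        echo "missing: $dir" >&2
        return 1
    fi
}
```

If this reports a missing directory, export CONVIRT_DIR with the real location before running the setup scripts again.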
- Error downloading hashlib during setup
- Solution : Make sure http proxy variables are set.
export http_proxy="http://my-corporate-proxy.domain.com:80/"
Note : Someone has suggested setting https_proxy as well. This has not been verified.
- Solution : Install it manually
wget http://code.krypto.org/python/hashlib/hashlib-20081119.zip
unzip hashlib-20081119.zip
source ~/convirt-enterprise/tg2env/bin/activate
cd hashlib-20081119
python setup.py install
deactivate
1.1.2 Managed Server Setup / convirt-tool
- After running convirt-tool, the managed server "crashed"/"restarted"
- Solution :
- This happens when the managed server is part of a cluster. While convirt-tool is setting up the required bridges etc., the managed server loses network connectivity. This causes the fencing mechanism to kick in and shut down or restart the managed server. If the required bridges are already set up, use the "--skip_bridge" option of convirt-tool.
- After running convirt-tool, the Xen daemon (xend) does not start again.
- Solution :
- This is seen mostly in SLES 11/SLES 11 SP1 Xen 4.0 environments, where Xen 4.0 expects an SSL setup.
- convirt-tool creates a backup of the /etc/xen/xend-config.sxp file in the same directory. Revert the .sxp file to its original state and run the convirt-tool command with the --xen_ssl option.
1.2 Post Installation
1.2.1 Adding Managed Server from ConVirt UI
- (111, Connection refused) : This is displayed while adding a Xen server. It means that ConVirt cannot talk to the xend server. To remedy, verify that convirt-tool was run on the managed server and that xend is listening on all interfaces.
netstat -an | grep 800
Also, make sure that the firewall is either disabled or allows ports 8006 and 8002.
iptables -nL | grep 800
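The listening check can also be scripted. This is a minimal sketch: it assumes `netstat -an` prints the local address in the fourth column and `LISTEN` in the last one, as on typical Linux systems. The function name `listens_on_all` is hypothetical.

```shell
# Hypothetical helper: succeed if the netstat output on stdin shows a socket
# listening on all interfaces (0.0.0.0, ::: or *) for the given port.
# Assumption: Linux-style `netstat -an` columns (local address in column 4).
listens_on_all() {
    port="$1"
    awk -v port="$port" '
        $NF == "LISTEN" && ($4 == "0.0.0.0:" port || $4 == ":::" port || $4 == "*:" port) { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the managed server (port numbers taken from the firewall note above):
#   netstat -an | listens_on_all 8006 && echo "listening on all interfaces"
```

A socket bound only to 127.0.0.1 fails this check, which is the situation that produces "Connection refused" from a remote CMS.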
- Channel Closed. : This typically means that ConVirt is not able to ssh and/or sftp to the managed server.
- Solution :
- Check that the sftp-server location specified in /etc/ssh/sshd_config on the managed server points to a valid binary.
- Key Mismatch : This typically has one of the following causes.
- 1. During CMS startup, the ssh-agent setup did not happen. This typically will show up as Failed to add identity to the agent. Key based Authentication may not work.
- Solution :
- Restart CMS, providing the correct password to load the keys.
- Or start CMS as follows:
- eval `ssh-agent -s`
- ssh-add ~/.ssh/cms_id_rsa
- ./convirt-ctl start
- 2. The server was re-imaged and got a new identity.
- Solution :
- Make sure that you can log in to the managed server without a password.
- Restart CMS.
- 3. The managed server does not have the CMS key in the ~/.ssh/authorized_keys file.
- Solution :
- Add ~/.ssh/cms_id_rsa.pub of the CMS user on the CMS host to the root user's ~/.ssh/authorized_keys on the managed server. A CMS restart may be required.
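To confirm the third cause, the key comparison can be done mechanically. This is a minimal sketch under stated assumptions: the function name `key_is_authorized` is hypothetical, and it expects a copy of the CMS public key file to be available on the machine where it runs.

```shell
# Hypothetical helper: succeed if the public key in file $1 appears in the
# authorized_keys-style file $2. Only the key type and base64 body are
# compared, so a different trailing comment does not matter.
key_is_authorized() {
    pubkey_file="$1"
    auth_file="$2"
    key="$(awk 'NF >= 2 { print $1 " " $2; exit }' "$pubkey_file")"
    [ -n "$key" ] && grep -qF -- "$key" "$auth_file"
}

# Usage on the managed server, with the CMS public key copied to /tmp (assumed path):
#   key_is_authorized /tmp/cms_id_rsa.pub /root/.ssh/authorized_keys || echo "CMS key missing"
```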
- 104 : Connection reset by peer : This typically happens when the Xen server has SSL enabled (as in SLES 11/Xen 4.0) and you are trying to connect using the plain XML-RPC protocol.
- Solution : Use the XML-RPC over SSL option from the drop-down on the Add Server dialog.
- No module named xen.xend.XendClient : This happens when the Xen client libraries are not installed.
- Solution :
- Make sure ./install_dependency was run.
- On RHEL, make sure that the system is subscribed to the RHEL Virtualization EUS Software Channel, and then run the install_dependency script again. (This will install xen and related dependencies.)
1.2.2 VNC Issues
- For all VNC issues, refer to the dedicated VNC Troubleshooting page.
1.2.3 Starting a Virtual Machine
- Error: Device 0 (vif) could not be connected. Could not find bridge device xenbr0
- This is because the guest networking interface could not be connected to the specified bridge.
- Solution :
- Make sure that convirt-tool was run on the managed server.
- Check that a bridge with the name reported in the error exists on the managed server. Use brctl show on the managed server.
- Use the Edit Settings menu item for the virtual machine to change the bridge/network name to a bridge that exists on the managed server.
- ("/etc/qemu-ifup: could not launch network script\nqemu-kvm: -net tap,vlan=0: Device 'tap' could not be initialized\n", 1)
- This is because the guest networking interface could not be connected to the specified bridge.
- Solution :
- Same as above.
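The bridge check for both errors above can be scripted around brctl show. This is a minimal sketch: it assumes the usual brctl output layout (one header line, one line per bridge starting in the first column, extra interfaces on indented continuation lines). The function names are hypothetical.

```shell
# Hypothetical helper: print the bridge names found in `brctl show` output on stdin.
# Assumption: header on line 1; bridge lines start in column one; continuation
# lines listing further interfaces start with whitespace.
list_bridges() {
    awk 'NR > 1 && NF > 0 { c = substr($0, 1, 1); if (c != " " && c != "\t") print $1 }'
}

# Hypothetical helper: succeed if bridge $1 is listed in the output on stdin.
has_bridge() {
    list_bridges | grep -qx -- "$1"
}

# Usage on the managed server:
#   brctl show | has_bridge xenbr0 || echo "xenbr0 does not exist here"
```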
1.2.4 High Availability
- Can not copy the fencing script to peer node ... Exception:[Errno 13] Permission denied: u'/sbin/fence_ipmilan.hash
- This is because the fencing scripts are installed under /usr/sbin on some platforms.
- Solution :
- Copy or link the script to /sbin directory.
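The link step might look like the following. This is a minimal sketch: the function name `link_fence_script` is hypothetical, and the usage line assumes the script in question is named fence_ipmilan, as the error message suggests.

```shell
# Hypothetical helper: if fencing script $1 is missing from directory $3 but
# present in directory $2, create a symlink to it in $3.
link_fence_script() {
    name="$1"; src_dir="$2"; dst_dir="$3"
    if [ -e "$dst_dir/$name" ]; then
        echo "already present: $dst_dir/$name"
    elif [ -e "$src_dir/$name" ]; then
        ln -s "$src_dir/$name" "$dst_dir/$name" && echo "linked: $dst_dir/$name"
    else
        echo "not found: $src_dir/$name" >&2
        return 1
    fi
}

# Usage (as root; script name assumed from the error message):
#   link_fence_script fence_ipmilan /usr/sbin /sbin
```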
1.2.5 Miscellaneous
- /var/log/messages contains many 'Did not receive identification string from IP-ADDRESS-OF-CONVIRT SERVER'
- This is because the CMS performs a port check to determine whether the server is up or down.
- Solution :
- Change /etc/ssh/sshd_config to have
- LogLevel ERROR
- and restart sshd.
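The configuration change can be applied with a small script. This is a minimal sketch with a hypothetical function name. It writes the setting at the top of the file because sshd uses the first value it reads for a keyword, and drops any other uncommented LogLevel line. Back up the file first, since a LogLevel line inside a Match block would also be dropped.

```shell
# Hypothetical helper: force "LogLevel ERROR" in the sshd_config-style file $1.
# The setting goes first so it takes precedence and cannot land inside a
# Match block; other uncommented LogLevel lines are removed.
set_sshd_loglevel_error() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    {
        echo 'LogLevel ERROR'
        awk 'tolower($1) != "loglevel" { print }' "$conf"
    } > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (as root), followed by an sshd restart:
#   cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak
#   set_sshd_loglevel_error /etc/ssh/sshd_config
```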