On Thu, Mar 10, 2011 at 4:04 AM, Todd Deshane <todd.deshane@xxxxxxx> wrote:
> On Tue, Mar 8, 2011 at 7:49 AM, Kristoffer Egefelt <dr.fersken@xxxxxxxxx>
> wrote:
>> Hi list,
>>
>> we're testing XCP 1.0 build 42052 and have run into a strange issue.
>>
>> One master and one slave in a pool, both configured with a plain SLB
>> bond0, with the management interface untagged and VM networks/VLANs
>> on top of bond0.
>> The setup has been running for a week with no problems.
>> This morning we cannot connect to the pool via XenCenter.
>> SSH to the master works, but there's no management interface configured
>> and no NICs available - after a reboot, SSH does not work anymore.
>> On the console, ethtool eth0 reports no link, although the switch
>> shows link.
>> Restarting the network, running ifup/ifdown and manually assigning IPs
>> do not work - there are no interfaces.
>>
>> SSH to slave does not work.
>> On the console, xsconsole will not let me log in when I try to do
>> anything.
>>
>> Running xe pool-emergency-transition-to-master on the master - the
>> server reboots and works :-)
>> After restarting the slave, everything works, though the master cannot
>> connect to the shared iSCSI storage, SR_error 141.
>>
>> Removing and reattaching the SR works.
>>
>> The log files are not helping me much in figuring out what happened -
>> the master complains that it can't reach the slave, and the slave is
>> not logging anything.
>>
>> Mar 7 21:44:49 node0106 xapi: [ warn|node0106|11 heartbeat|Heartbeat
>> D:b450677c6101|http] HTTP connection closed immediately with before
>> any data received
>> Mar 7 21:44:49 node0106 xapi: [ warn|node0106|11 heartbeat|Heartbeat
>> D:b450677c6101|http] stunnel pid: 4365 caught
>> Xmlrpcclient.Empty_response_from_server
>>
>> [20110307T20:44:19.628Z|error|node0106|6776
>> inet-RPC|session.login_with_password D:9f2cd8a60315|stunnel]
>> get_reusable_stunnel: fresh stunnel failed reusable check; delaying
>> 10.00 seconds before reconnecting to 10.10.3.108:443 (attempt 1 / 10)
>>
>> Any hints are greatly appreciated.
>>
>
> Have you ruled out a hardware problem (for example, did the NIC go bad)?
>
> Thanks,
> Todd
Yeah, no faulty NICs, but we were creating about 15 VMs and deleting
them again within 30 minutes when the warnings started.
The system has been running smoothly since, though.
I'll try to stress the servers again this weekend to see what happens.
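
To make the next stress run easier to compare, here is a rough Python
sketch for tallying xapi log entries by level and module. The regex is
only an assumption based on the two line layouts quoted above (a
bracketed prefix of level|host|thread|task|module, optionally led by a
timestamp), and the name count_by_level_and_module is made up for
illustration. It expects one log entry per line, as in the real log file
rather than the wrapped quotes in this mail.

```python
import re
from collections import Counter

# Assumed layout of the bracketed xapi prefix, inferred only from the two
# samples quoted in this thread:
#   [optional-timestamp|level|host|thread|task|module] message
XAPI_RE = re.compile(
    r"\[\s*(?:(?P<ts>\d{8}T[\d:.]+Z)\|)?\s*(?P<level>\w+)\|(?P<host>[^|]*)\|"
    r"(?P<thread>[^|]*)\|(?P<task>[^|]*)\|(?P<module>[^\]|]*)\]\s*(?P<msg>.*)"
)


def count_by_level_and_module(lines):
    """Count log entries per (level, module); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = XAPI_RE.search(line)
        if match:
            counts[(match.group("level"), match.group("module"))] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python tally.py < logfile
    for (level, module), n in count_by_level_and_module(sys.stdin).most_common():
        print(f"{n:6d}  {level:<6} {module}")
```

Feeding it the log from before and after the VM create/delete run should
show whether the heartbeat and stunnel entries pile up under load.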
Thanks
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users