Network link broken but user still connected
Cedric Fontaine
From: Cedric Fontaine <cfonta...@spidmail.net>
Date: Sat, 3 Oct 2009 06:12:12 -0700 (PDT)
Local: Sat 3 Oct 2009 14:12
Subject: Network link broken but user still connected
Hello,

On our ASP server, we're offering access using ssh to our customers.

Each linux login has one or more fixed users on QM and they get
connected directly to qm -AACCOUNT -12 for example using a bash login
procedure.

We have keepalive enabled both on the ssh server and on the client
side in the emulator. It works great in 95% of situations, but
sometimes lines get stuck when the network link breaks.
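
For readers unfamiliar with server-side ssh keepalive, an OpenSSH configuration along the following lines is typical. The values are illustrative assumptions, not the poster's actual settings, and the emulator's client-side keepalive is configured separately in whatever terminal program is used.

```
# /etc/ssh/sshd_config (illustrative values only)

# Kernel-level TCP keepalive probes on the connection.
TCPKeepAlive yes

# sshd sends a keepalive request through the encrypted channel
# after 60 seconds without data from the client.
ClientAliveInterval 60

# Disconnect after 3 unanswered requests, i.e. roughly 180 seconds.
ClientAliveCountMax 3
```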

This morning 3 lines were broken in QM and the only solution was to
restart the whole of QM. Here is some information:

qm -U
 12        4146    0 /dev/pts/0              tca
 13       29595    0 /dev/pts/1              tca
 14       17279    0 /dev/pts/2              tca
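
The user number to process id mapping in this listing can be pulled out with a small filter. This is a minimal sketch that assumes the column order shown above (QM user number first, OS pid second); the function name `qm_pid_for_user` is hypothetical, and the layout may differ in other QM releases.

```shell
#!/bin/sh
# Hypothetical helper: print the OS pid for a given QM user number.
# Reads `qm -U` style output on stdin and assumes column 1 is the QM
# user number and column 2 is the pid, as in the listing above.
qm_pid_for_user() {
    awk -v user="$1" '$1 == user { print $2 }'
}
```

Typical use would be `qm -U | qm_pid_for_user 12`, which for the listing above yields 4146.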

A ps -auxw on pid 4146 gave me:
tca       4146  0.0  0.2   3408  1904 ?        S    Oct02   0:02 /usr/qmsys/bin/qm -12 -ATCA

netstat shows that there is no ESTABLISHED or PENDING ssh connection
on the OS side.

PSTAT gave me no information except that the processes are not
responding:

:pstat
User Detail
  12 (Not responding)

  13 (Not responding)

  14 (Not responding)

And LOGOUT hangs
:LOGOUT 12
Force logout initiated for user 12

At this point we had to stop and start QM.

Thanks,


Martin Phillips
From: "Martin Phillips" <martinphill...@ladybridge.com>
Date: Mon, 5 Oct 2009 10:27:18 +0100
Local: Mon 5 Oct 2009 10:27
Subject: Re: Network link broken but user still connected
Hi Cedric,

We have seen something similar where QM is not notified of loss of the
network connection and the process hangs inside a Linux library call where
we cannot see the logout request.

Although we need a better solution for this, you should be able to kill the
QM processes from Linux rather than a complete restart. Our cleanup
mechanism will then recover the licences within five minutes. You can speed
this up by doing
   qm -cleanup

I have forwarded your email to one of our dealers who has identified a
problem in Linux ssh that might explain this.

Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton, NN4 6DB
+44-(0)1604-709200


eppick77
From: eppick77 <eppic...@yahoo.com>
Date: Mon, 5 Oct 2009 07:08:25 -0700 (PDT)
Local: Mon 5 Oct 2009 15:08
Subject: Re: Network link broken but user still connected
Cedric,

We also get the same problem on occasion.  We are running CentOS
5.3.  What are you running?

Eugene

On Oct 3, 9:12 am, Cedric Fontaine <cfonta...@spidmail.net> wrote:


Cedric Fontaine
From: Cedric Fontaine <cfonta...@spidmail.net>
Date: Mon, 05 Oct 2009 11:31:51 -0400
Local: Mon 5 Oct 2009 16:31
Subject: Re: Network link broken but user still connected

eppick77 wrote:
> Cedric,

> We also get the same problem on occasion.  We are running CentOS
> 5.3.  What are you running?

We are running Gentoo base system version 1.6.14.

--
Cedric Fontaine
http://www.terroirsquebec.com


Cedric Fontaine
From: Cedric Fontaine <cfonta...@spidmail.net>
Date: Mon, 05 Oct 2009 12:38:58 -0400
Local: Mon 5 Oct 2009 17:38
Subject: Re: Network link broken but user still connected

Martin Phillips wrote:
> Hi Cedric,

> We have seen something similar where QM is not notified of loss of the
> network connection and the process hangs inside a Linux library call where
> we cannot see the logout request.

So I should just kill -9 the qm process on Linux and then run qm -cleanup?

--
Cedric Fontaine
http://www.terroirsquebec.com


Martin Phillips
From: Martin Phillips <MartinPhill...@ladybridge.com>
Date: Wed, 7 Oct 2009 00:31:29 -0700 (PDT)
Local: Wed 7 Oct 2009 08:31
Subject: Re: Network link broken but user still connected
On 5 Oct, 17:38, Cedric Fontaine <cfonta...@spidmail.net> wrote:

> So I should just kill -9 the qm process on linux and then qm -cleanup ?

Although use of kill -9 is not a good idea in most situations, it
should be safe to do when QM is not responding to other termination
requests.
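
The sequence discussed here (ask the process to terminate, fall back to kill -9 only if it is still present, then reclaim the licence) might be scripted roughly as follows. This is a sketch, not a Ladybridge-supplied tool: the function names, the 10-second default grace period, and the `QM_BIN` variable are assumptions, and the `/usr/qmsys/bin/qm` path is taken from the ps output earlier in the thread.

```shell
#!/bin/sh
# Sketch only: function names and the grace period are illustrative.
QM_BIN="${QM_BIN:-/usr/qmsys/bin/qm}"   # path as seen in the thread's ps output

# stop_hung_qm PID [GRACE_SECONDS]
# Send SIGTERM, wait up to GRACE_SECONDS (default 10), then SIGKILL if needed.
stop_hung_qm() {
    pid="$1"
    grace="${2:-10}"
    # If the signal cannot be sent (typically: no such process), stop here.
    kill "$pid" 2>/dev/null || return 0
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        # kill -0 sends no signal; it only checks whether the pid still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done
    # Still present after the grace period: force termination.
    kill -9 "$pid" 2>/dev/null
    return 0
}

# Ask QM to recover licences now instead of waiting for its periodic cleanup.
reclaim_licences() {
    "$QM_BIN" -cleanup
}
```

For the first stuck line in the original report this would be `stop_hung_qm 4146` followed by `reclaim_licences`, run as root or as the owner of the process.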

We do need to understand this problem more fully and come up with a
solution, though all the evidence we have so far suggests that the hang
is deep inside the Linux networking system and hence outside our
control.

Martin Phillips, Ladybridge Systems


Tony G
From: "Tony G" <wosclx...@sneakemail.com>
Date: Thu, 8 Oct 2009 21:42:23 -0700
Local: Fri 9 Oct 2009 05:42
Subject: RE: Network link broken but user still connected
I can see it now - Cedric files a support request with his Linux
provider (because we all know he's paying for support on his
FOSS, right?) and he tells them his DBMS provider says there is a
bug in the networking system.  Yes, and the issue will be
resolved quickly as a million highly motivated people devote
their free time to solving the problem.  Somehow I don't think
Cedric is going to get a resolution to this issue anytime soon.

I'm sorry Martin, I really don't expect you to be resolving Linux
issues, but I do see a great deal of irony in all of this.

T


Ashley Chapman
From: Ashley Chapman <ash.chap...@gmail.com>
Date: Fri, 9 Oct 2009 05:12:25 +0000
Local: Fri 9 Oct 2009 06:12
Subject: Re: Network link broken but user still connected
2009/10/9 Tony G <wosclx...@sneakemail.com>:

I sometimes get this sort of finger pointing.  It often happens on
Windows systems, and another MV database that I use.  Nice to see the
same thing happening with Linux.  Don't want the FOSS people missing
out! ;-)

Anyway, I've found an effective way to stop the finger pointing is to
ask the person doing the pointing for EVIDENCE that the bug is where
they say it is.

So, Martin.  Do you have proof that the bug is in the Linux networking code?

Ashley Chapman


Martin Phillips
From: "Martin Phillips" <martinphill...@ladybridge.com>
Date: Fri, 9 Oct 2009 09:58:51 +0100
Local: Fri 9 Oct 2009 09:58
Subject: Re: Network link broken but user still connected
Hi Tony,

> I'm sorry Martin, I really don't expect you to be resolving
> Linux issues, but I do see a great deal of irony in all of this.

I agree that it is not our job but, in this particular instance, one of our
dealers has identified and fixed a problem that sounds like it could be the
same issue. I have asked him to communicate directly with Cedric (or perhaps
via this list) and he has agreed to do so as soon as time permits.

Re Ashley's comment...

> So, Martin.  Do you have proof that the bug is in the Linux
> networking code?

We have seen two network connection problems that appear to be in Linux. The
one that fits closest to Cedric's problem is where we hang inside a kernel
function (as shown by strace) and never return to QM. This makes it
difficult for us to catch the error.

The other one involves poll() or select() saying "yes, there is data waiting
to be read" and read() saying "no there isn't", resulting in a loop trying
to recover the non-existent data. We have worked around this one inside QM.

Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton, NN4 6DB
+44-(0)1604-709200


Cedric
From: Cedric <cfonta...@spidmail.net>
Date: Fri, 9 Oct 2009 11:48:46 -0700 (PDT)
Local: Fri 9 Oct 2009 19:48
Subject: Re: Network link broken but user still connected

> I agree that it is not our job but, in this particular instance, one of our
> dealers has identified and fixed a problem that sounds like it could be the
> same issue. I have asked him to communicate directly with Cedric (or perhaps
> via this list) and he has agreed to do so as soon as time permits.

I haven't received any direct support so far. I must admit that we're
currently pausing our migration to QM on those servers, as this issue
is a show stopper. We haven't had any new hangs since last week, but
I'm not sure that a kill will help because it's pretty much what I've
been doing.

Our experience with D3 is that this can also happen on D3, but there a
logoff will just bring the line back, whereas in QM it breaks the
whole QM server. Is it possible at least to fix the LOGOUT problem in
this case?

Thanks,

Cedric


Ashley Chapman
From: Ashley Chapman <ash.chap...@gmail.com>
Date: Fri, 9 Oct 2009 20:47:17 +0000
Local: Fri 9 Oct 2009 21:47
Subject: Re: Network link broken but user still connected
2009/10/9 Cedric <cfonta...@spidmail.net>:

Just a thought...

If there's a suspected problem in the linux internals, then presumably
this problem does not exist for QM on the Windows or BSD platforms.
If that's the case, perhaps you can consider using QM on top of
FreeBSD.  Unless you are tightly tied to Gentoo.

Ashley


Martin Phillips
From: Martin Phillips <MartinPhill...@ladybridge.com>
Date: Sat, 10 Oct 2009 01:37:56 -0700 (PDT)
Local: Sat 10 Oct 2009 09:37
Subject: Re: Network link broken but user still connected
Hi Cedric,

We need to investigate this more fully. Please let us have full
details of how your connections are set up (direct into QM, via Linux
shell, ssh, etc) and the kernel revision in use.

A core dump of the process when it is stuck would be very helpful.
Failing this, please run strace to record the state of the process and
let us have the output.

Martin Phillips, Ladybridge Systems.


Tony G
From: "Tony G" <wosclx...@sneakemail.com>
Date: Sun, 11 Oct 2009 02:07:19 -0700
Local: Sun 11 Oct 2009 10:07
Subject: RE: Network link broken but user still connected
It occurs to me that one of the best ways to get people to
recognize FOSS OpenQM is to present a reproducible case to the
Linux distro developers with OpenQM as the focal point.  To fix
the Linux problem they might need to install OpenQM, and in doing
so they may want to know more about what it is.  I hope it plays
out like this.

In other words, Martin, it may be better to be less eager to fix
this on your own, even if you can.

T


Martin Phillips
From: "Martin Phillips" <martinphill...@ladybridge.com>
Date: Thu, 15 Oct 2009 11:45:44 +0100
Local: Thurs 15 Oct 2009 11:45
Subject: Re: Network link broken but user still connected
Hi Cedric,

Any sign of the requested diagnostics?

We have tried repeatedly to reproduce this here but have so far failed. It
is tough to diagnose the cause without an example to look at.

Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton, NN4 6DB
+44-(0)1604-709200


Cedric Fontaine
From: Cedric Fontaine <cfonta...@spidmail.net>
Date: Fri, 16 Oct 2009 12:43:15 -0700 (PDT)
Local: Fri 16 Oct 2009 20:43
Subject: Re: Network link broken but user still connected

On 10 oct, 04:37, Martin Phillips <MartinPhill...@ladybridge.com>
wrote:

> Hi Cedric,

> We need to investigate this more fully. Please let us have full
> details of how your connections are set up (direct into QM, via Linux
> shell, ssh, etc) and the kernel revision in use.

Users connect via ssh and are then redirected to QM using
bash_profile, which executes "/usr/qmsys/bin/qm -12 -AACCOUNT" for example.

Linux 2.6.28.4-xxxx-std-ipv4-32 #2 SMP Wed Feb 18 16:34:04 UTC 2009 i686 AMD Athlon(tm) X2 Dual Core Processor BE-2300 AuthenticAMD GNU/Linux

> A core dump of the process when it is stuck would be very helpful.
> Failing this, please run strace to record the state of the process and
> let us have the output.

What are the command lines for core dump or strace ?

Sorry for the late answer. We haven't had any problems since then, but
we haven't changed any settings either.


Martin Phillips
From: "Martin Phillips" <martinphill...@ladybridge.com>
Date: Mon, 19 Oct 2009 18:26:59 +0100
Local: Mon 19 Oct 2009 18:26
Subject: Re: Network link broken but user still connected
Hi Cedric,

This sounds very much like the Linux problem that one of our users tracked
down. I will ask him again to reply.

> What are the command lines for core dump or strace ?

Depending on your system, you should be able to force a core dump with
kill -4 (SIGILL) but it doesn't seem to work on all systems.

Where supported, strace is
   strace -p 1234
where 1234 is the pid of the process that you want to trace.
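
For anyone gathering these diagnostics, the following sketch saves a short snapshot of a stuck process to files that can be sent on. It assumes a Linux /proc filesystem; strace and timeout are used only if they are installed, attaching normally requires root or the owner of the process, and the function name, output file names, and 5-second trace window are hypothetical choices.

```shell
#!/bin/sh
# Sketch: collect_qm_diag PID OUTDIR [SECONDS]
# Saves /proc state for PID and, where possible, a short strace capture.
collect_qm_diag() {
    pid="$1"
    out="$2"
    secs="${3:-5}"
    [ -d "/proc/$pid" ] || { echo "no such process: $pid" >&2; return 1; }
    mkdir -p "$out" || return 1
    # Process state (R, S, D, ...) and, if readable, the kernel wait channel.
    cat "/proc/$pid/status" > "$out/status.txt" 2>/dev/null
    cat "/proc/$pid/wchan"  > "$out/wchan.txt"  2>/dev/null
    # Kernel revision, which was also requested in this thread.
    uname -a > "$out/kernel.txt"
    # Attach strace for a few seconds; -o writes the trace to a file.
    # Failure to attach (permissions, ptrace restrictions) is tolerated.
    if command -v strace >/dev/null 2>&1 && command -v timeout >/dev/null 2>&1; then
        timeout "$secs" strace -p "$pid" -o "$out/strace.txt" 2>/dev/null || true
    fi
    return 0
}
```

For example, `collect_qm_diag 1234 /tmp/qm-diag` would leave status.txt, wchan.txt, kernel.txt and, if strace could attach, strace.txt in /tmp/qm-diag.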

Martin Phillips
Ladybridge Systems Ltd
17b Coldstream Lane, Hardingstone, Northampton, NN4 6DB
+44-(0)1604-709200

