[jacorb-developer] Bidir connection not working for some clients

Gergely Jakab jakab at extech.eu
Tue Jan 7 05:26:06 CET 2014


Hello everyone,

I would appreciate your help with a problem that is hurting our customers badly and that is, I believe, caused by some misconfiguration.

We use Bidirectional GIOP in our client-server application because our server needs to call methods on clients that may be sitting behind a firewall or NAT. This setup worked well for us for years, until we recently took on a new customer.

A normal method call from the server to a client, made through the reference the client provided for its servant, looks like this in the server log:
ClientConnectionManager: found ServerGIOPConnection to [public IP:port of the client] from [IP:port of the server] (7c5eb3fc)

For the clients of the new customer, the server sometimes does not reuse the existing bidirectional connection to call methods. Instead it creates a new client-side connection from the server to the client, which then fails because the client's object reference contains a local IP address that the server cannot reach. Here is how it looks in the server log:
ClientConnectionManager: created new ClientGIOPConnection to 192.168.1.23:50441 (46bb13e2)
Unsuccessful ping becuase of : org.omg.CORBA.TRANSIENT: Retries exceeded, couldn't reconnect to 192.168.1.23:50441  vmcid: 0x0  minor code: 0  completed: No

The second line above is my own debug output (“ping” is the method that was to be called on the client). It appears about a minute after the first line.
My first idea was that the connection between the client and the server had broken and that this was why the server created a new connection to the client. That is not the case: if it were, the server should throw a COMM_FAILURE exception, as it normally does when, for instance, the client shuts down. Furthermore, after this error occurs on the server, the client continues to call server methods successfully through the original bidirectional connection, which is still alive.
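
To illustrate why the new connection in the log above cannot succeed: 192.168.1.23 is a private (site-local) address, so it is only meaningful inside the client's own network. The following is a minimal sketch, not taken from our application, of how a server-side diagnostic could flag such endpoints. The class and method names (EndpointCheck, isPrivateEndpoint) are made up for this example, and it assumes IPv4 literals in "host:port" form as they appear in the log.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class EndpointCheck {

    /**
     * Returns true if the host part of an IPv4 "host:port" string is a
     * site-local (10/8, 172.16/12, 192.168/16), link-local or loopback
     * address, i.e. one that a remote server generally cannot connect to.
     * Hypothetical helper, for illustration only.
     */
    static boolean isPrivateEndpoint(String endpoint) {
        int colon = endpoint.lastIndexOf(':');
        String host = colon >= 0 ? endpoint.substring(0, colon) : endpoint;
        try {
            // For a literal IP address this only parses the text, no DNS lookup.
            InetAddress addr = InetAddress.getByName(host);
            return addr.isSiteLocalAddress()
                    || addr.isLinkLocalAddress()
                    || addr.isLoopbackAddress();
        } catch (UnknownHostException e) {
            // Unparseable or unresolvable host: treat as "not known to be private".
            return false;
        }
    }

    public static void main(String[] args) {
        // Endpoint taken from the server log line above.
        System.out.println(isPrivateEndpoint("192.168.1.23:50441"));
    }
}
```

A check like this would only help with logging and diagnosis. It does not explain why the server stops reusing the bidirectional connection in the first place.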

This behaviour is client-specific: when 4 clients work with the server simultaneously, only 2 of them show this error while the other 2 work fine. So I assume it can be affected by the platform or network of the client machine.
We are using JacORB 2.3.1 on both the client and the server side.
We recently tried switching to JacORB 3.3 to see whether it would solve the issue, but it got even worse. On 3.3 those 2 clients produced the error immediately and every time, not just occasionally as on 2.3.1.

I don’t know whether it is relevant, but when this error appears, it happens on a method that is called periodically, every 2 minutes, and it slows the server down to a nearly unresponsive state, so even the other clients, whose connections keep working, can no longer work. Restarting the server application usually helps on JacORB 2.3.1, but not on 3.3.
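
If the slowdown comes from the periodic call blocking for about a minute while the ORB retries the unreachable address (this is an assumption, I have not confirmed it), one way to limit the damage would be to run each client's call on its own worker thread with a timeout. Below is a minimal sketch using only java.util.concurrent. PingGuard and pingWithTimeout are hypothetical names, and the Callable stands in for the real remote call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PingGuard {

    /**
     * Runs one ping on the given executor and waits at most timeoutMillis.
     * Returns true if the ping completed normally, false if it timed out
     * or threw an exception. Hypothetical helper, for illustration only.
     */
    static boolean pingWithTimeout(ExecutorService pool, Callable<Void> ping, long timeoutMillis) {
        Future<Void> result = pool.submit(ping);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Requests interruption of the worker; a call blocked inside
            // the ORB may not react to it and can still occupy its thread.
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The ping itself failed, e.g. with a CORBA system exception.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            result.cancel(true);
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        Callable<Void> reachable = () -> null;                      // returns at once
        Callable<Void> unreachable = () -> { Thread.sleep(5_000); return null; };
        System.out.println(pingWithTimeout(pool, reachable, 1_000));
        System.out.println(pingWithTimeout(pool, unreachable, 200));
        pool.shutdownNow();
    }
}
```

This would only keep one bad client from stalling the calls to the others. It would not fix the underlying problem of the server opening a new connection instead of reusing the bidirectional one.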

I would appreciate any suggestions on how to fix this.

Thx!
JG


