[jacorb-developer] Race Condition: ConnectionManager releases in use connection
Hugo Roenick
hroenick at gmail.com
Tue Jul 14 16:14:55 CEST 2015
Almost forgot mentioning.
This error also happened on version 3.5 of JacORB. Not only in 3.6 or 3.6.1 as bugs 1011 and 1012.
And the provided logs outputs and call stack are from version 3.5.
Cheers,
-- Hugo
On Jul 14, 2015, at 8:32 AM, Hugo Roenick <hroenick at gmail.com> wrote:
> Hello guys.
>
> On Jul 10, 2015, at 2:20 PM, Hugo Roenick <hroenick at gmail.com> wrote:
>
>> PS: We are now addressing a new possible bug regarding the usage of ForwardRequest with concurrent request to a same CORBA Object (2 threads making request to the same CORBA reference). We will probably present this case as soon we isolate in a simple JacORB test case, and in a new jacorb-dev thread.
>
> We are still working on the isolated test case to provide you, but I'll present the ongoing understanding of the issue we are addressing here. This problem is also associated with the use of ForwardingRequest.
>
> This is a race condition case, that happens when 2 threads are calling a same CORBA object reference and both receives an exception and resolve it by a ForwardRequest. The point is that by handling the ForwardRequest, the ORB tries to rebind de connection and by doing so it will ask for the release of the connection (ClientConnectionManager.releaseConnection).
>
> There is a control logic (ClientConnection's counter named client_count) to prevent the release of the connection if there is other "clients" using this same connection. But apparently 2 threads making a request to the same object does not increment this client_count counter. Because of that, the connection is released (closed) making the second thread receive a COMM_FAILURE when handling the received exception concurrently.
>
> The question that we do not fully understand is what is the semantic on this client_count counter?
> We noticed that if we have a second CORBA object reference in this same connection, the counter is incremented, preventing the release of the connection and the COMM_FAILURE.
>
> Some snipped log messages from the failure case:
>
> ...
> INFO ClientConnectionManager: created new ClientGIOPConnection to 127.0.0.1:2089 (27daf5b)
> ...
> FINE ClientConnectionManager: releasing ClientGIOPConnection to 127.0.0.1:2089 (27daf5b)
> FINE wrote 272 bytes to 127.0.0.1:2089
> FINE ClientGIOPConnection to 127.0.0.1:2089 (27daf5b): close()
> WARNING Abnormal connection termination. Lost 1 outstanding replie(s)!
> FINE wrote GIOP message of size 360 to ClientGIOPConnection to 127.0.0.1:2089 (27daf5b)
> FINE Transport to 127.0.0.1:2089: stream closed Socket closed
> INFO Client-side TCP transport to 127.0.0.1:2089 closed.
> INFO ClientConnectionManager: created new ClientGIOPConnection to 127.0.0.1:2089 (12c03f7e)
> FINE [0/2] creating new thread
> FINE Unknown exception type java.net.SocketException with exception java.net.SocketException: Socket closed
> 2015-07-10 12:02:30 [FINEST] Exception: IDL:omg.org/CORBA/COMM_FAILURE:1.0 Request: signChainByToken
> FINE to_COMM_FAILURE: Caught exception
> java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:152)
> at java.net.SocketInputStream.read(SocketInputStream.java:122)
> at org.jacorb.orb.etf.StreamConnectionBase.read(StreamConnectionBase.java:110)
> at org.jacorb.orb.giop.GIOPConnection.getMessage(GIOPConnection.java:383)
> at org.jacorb.orb.giop.GIOPConnection.receiveMessagesLoop(GIOPConnection.java:543)
> at org.jacorb.orb.giop.GIOPConnection.receiveMessages(GIOPConnection.java:530)
> at org.jacorb.orb.giop.MessageReceptor.doWork(MessageReceptor.java:69)
> at org.jacorb.util.threadpool.ConsumerTie.run(ConsumerTie.java:60)
> at java.lang.Thread.run(Thread.java:744)
> FINE ClientGIOPConnection to 127.0.0.1:2089 (27daf5b): getMessage() -- COMM_FAILURE
> FINE ClientGIOPConnection to 127.0.0.1:2089 (27daf5b): streamClosed()
> FINE ClientGIOPConnection to 127.0.0.1:2089 (27daf5b): close()
> ...
> FINE invoke[<--]: SystemException
> org.omg.CORBA.COMM_FAILURE:
> at org.jacorb.orb.giop.ReplyPlaceholder.getInputStream(ReplyPlaceholder.java:127)
> at org.jacorb.orb.ReplyReceiver.getReply(ReplyReceiver.java:388)
> at org.jacorb.orb.Delegate._invoke_internal(Delegate.java:1475)
> at org.jacorb.orb.Delegate.invoke_internal(Delegate.java:1244)
> at org.jacorb.orb.Delegate.invoke(Delegate.java:1232)
> at org.omg.CORBA.portable.ObjectImpl._invoke(ObjectImpl.java:80)
> at myProject._AccessControlStub."concurrentMethod"(_AccessControlStub.java:511)
> at ...
>
>
> When we have more than one CORBA object references to the same connection, the error does not occur and we can notice this outputs in the log:
>
> ...
> INFO ClientConnectionManager: created new ClientGIOPConnection to 127.0.0.1:2089 (67528259)
> ...
> INFO ClientConnectionManager: found ClientGIOPConnection to 127.0.0.1:2089 (67528259)
> ...
> FINE ClientConnectionManager: cannot release ClientGIOPConnection to 127.0.0.1:2089 (67528259) (still has 1 client(s))
> INFO ClientConnectionManager: found ClientGIOPConnection to 127.0.0.1:2089 (67528259)
> ...
>
>
> I manage to debug my code and place a break point in the ClientConnectionManager.releaseConnection when the connection is about to be released to try to understand the call stack that lead us to this failure. This is the call stack:
>
> ClientConnectionManager.releaseConnection(ClientConnection) line: 161
> Delegate.rebind(ParsedIOR, Protocol[]) line: 754
> Delegate.rebind(Object) line: 703
> DefaultClientInterceptorHandler.invokeInterceptors(ClientRequestInfoImpl, short) line: 354
> DefaultClientInterceptorHandler.handle_receive_exception(SystemException, ReplyInputStream) line: 264
> ReplyReceiver.getReply() line: 461
> Delegate._invoke_internal(Object, OutputStream, ReplyHandler, boolean) line: 1475
> Delegate.invoke_internal(Object, OutputStream, ReplyHandler, boolean) line: 1244
> Delegate.invoke(Object, OutputStream) line: 1232
> _AccessControlStub(ObjectImpl)._invoke(OutputStream) line: 80
> _AccessControlStub.renew() line: 35
>
> Sorry for presenting all this information without the test case. But as we have 2 other issues related to ForwardRequest, I thought that it was appropriate to present this issue sooner. I don't think that these issues have a relating fix, but I'm not expert enough to assert this.
>
> Thank you all for all the time and understanding.
>
> Regards,
> -- Hugo
>
More information about the jacorb-developer
mailing list