We have about 8 or 10 Kepservers communicating with ThingWorx (and have for several months), and recently started having regular crashes (multiple times a day) of Tomcat in our production environment. It doesn't hang or get slow; Tomcat just stops abruptly. I noticed the crash log from Tomcat listed a boatload of blocked threads, and I've attached a sample here. Through lots of trial and error (and much help from PTC support), we identified that it must somehow be related to our remote connections, because with the WSExecutionProcessing subsystem turned off for several days we had no issues.
We stopped the actual crashing by adding the JVM argument "-XX:-UseAESIntrinsics", after deep googling and shots in the dark led us to this issue, which has a lot of similarities with ours. However, I think the underlying problem is still happening, as we still get daily errors in the Tomcat stderr log that look like the one below.
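For anyone trying the same workaround, one way to confirm the flag actually reached the JVM is to inspect the JVM's input arguments. This is only a minimal sketch under stated assumptions, not something taken from the support case: the class name JvmFlagCheck and the helper hasFlag are hypothetical, and the check is only meaningful when it runs inside the same JVM as Tomcat (for example, called from a servlet). Run standalone, it reports the flags of its own process.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
public class JvmFlagCheck {

    /** Returns true if the flag appears verbatim among the given JVM arguments. */
    public static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM this code is running in (not to main()).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String flag = "-XX:-UseAESIntrinsics";
        System.out.println(flag + " present: " + hasFlag(jvmArgs, flag));
    }
}
```

The lookup is an exact string match, so it would miss the flag if it were supplied some other way (for instance through an options file), which is an accepted limitation of this sketch.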
I'm running thin on things to investigate (but do still have a few), but has anyone here seen anything like this? Or potentially have insight? Been through most of the obvious stuff and working with support a few weeks on this and I feel we're running dry. Thanks in advance!
Example tomcat8-stderr log - what would cause this?
Oct 17, 2017 4:04:58 AM org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer doClose
INFO: Failed to close the ServletOutputStream connection cleanly
java.io.IOException: An existing connection was forcibly closed by the remote host
at sun.nio.ch.SocketDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(Unknown Source)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.write(Unknown Source)
at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
at org.apache.tomcat.util.net.SecureNioChannel.flush(SecureNioChannel.java:134)
at org.apache.tomcat.util.net.SecureNioChannel.close(SecureNioChannel.java:370)
at org.apache.tomcat.util.net.SecureNioChannel.close(SecureNioChannel.java:398)
at org.apache.coyote.http11.upgrade.NioServletOutputStream.doClose(NioServletOutputStream.java:138)
at org.apache.coyote.http11.upgrade.AbstractServletOutputStream.close(AbstractServletOutputStream.java:140)
at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doClose(WsRemoteEndpointImplServer.java:143)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.close(WsRemoteEndpointImplBase.java:638)
at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.onWritePossible(WsRemoteEndpointImplServer.java:118)
at org.apache.tomcat.websocket.server.WsRemoteEndpointImplServer.doWrite(WsRemoteEndpointImplServer.java:81)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.writeMessagePart(WsRemoteEndpointImplBase.java:450)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessage(WsRemoteEndpointImplBase.java:338)
at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.startMessageBlock(WsRemoteEndpointImplBase.java:270)
at org.apache.tomcat.websocket.WsSession.sendCloseMessage(WsSession.java:570)
at org.apache.tomcat.websocket.WsSession.onClose(WsSession.java:510)
at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.close(WsHttpUpgradeHandler.java:183)
at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.access$200(WsHttpUpgradeHandler.java:48)
at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler$WsReadListener.onDataAvailable(WsHttpUpgradeHandler.java:214)
at org.apache.coyote.http11.upgrade.AbstractServletInputStream.onDataAvailable(AbstractServletInputStream.java:198)
at org.apache.coyote.http11.upgrade.AbstractProcessor.upgradeDispatch(AbstractProcessor.java:96)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:663)
at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1517)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1474)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Unknown Source)
Hi Alex,
I was working with you on this case a little while ago, and it looks like the log messages you posted are a result of the garbage collector settings we changed to stabilize your environment. We changed the garbage collection Java option from its default, UseG1GC, and that change will result in log messages like these.
I would also like to emphasize the fact that the log messages are at the "INFO" level and not "ERROR" or "SEVERE".
If the current setting is keeping your instance from flooding the websocket execution processor, you can keep it, because this is only an INFO-level message rather than an error. It shouldn’t have adverse effects on performance.
Thanks,
Saeed
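To check the point about log levels against a whole stderr file rather than a single excerpt, the entries can be tallied by level. The following is a rough sketch, assuming the log keeps the two-line layout shown in the excerpt above (a timestamp line followed by a line starting with the level and a colon). The class name LogLevelTally and the method tally are hypothetical, and the log path is supplied by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the name is illustrative only.
public class LogLevelTally {

    // Levels to look for; other java.util.logging levels are ignored in this sketch.
    private static final List<String> LEVELS = Arrays.asList("SEVERE", "WARNING", "INFO");

    /** Counts lines that begin with "LEVEL:" for each level in LEVELS. */
    public static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String level : LEVELS) {
            counts.put(level, 0);
        }
        for (String line : lines) {
            for (String level : LEVELS) {
                if (line.startsWith(level + ":")) {
                    counts.merge(level, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the stderr log; assumes the file is UTF-8 readable.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println(tally(lines));
    }
}
```

If the SEVERE and WARNING counts stay at zero while INFO grows, that would be consistent with the messages being informational. It does not by itself show what is causing the remote hosts to drop their connections.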
Hi Alex,
I have not seen this problem before. Any insight from the Community would be welcome here, especially if any users have encountered an issue like this before and have a suggestion for a resolution.
In the meantime, the best path forward for this will be to continue working with the ThingWorx Tech Support team within your current case.
Best regards,
Steven M
At times, Tomcat logs may not give the full picture. Can you check your Application.logs to see whether any additional information is available there?
Thanks,
Varathan