1-Visitor
January 7, 2014
Question

Windchill Server Status - Site Not Responding?

Hi all,

I've recently installed a monolithic instance of Windchill 10.1 M040 (SQL Server) and have spent several weeks tailoring it to our requirements and testing workflows and lifecycles, document uploads, and check-ins and check-outs. So far everything is working well. However, I've just noticed that when I interrogate the "Server Status" page, the report states the following:

File Servers: 0 available; 1 Unavailable

and the bottom of the report states:

https://<our winchill domain name>/Windchill/servlet/WindchillGW    master    SITE NOT RESPONDING    2014-01-07 14:23:59.570 +0000    0%


?

I can ping the server without any issues.

Can anyone shed any light on what might be wrong?

Many thanks!

4 replies

14-Alexandrite
January 7, 2014

Are you using HTTP or HTTPS for your environment? Try setting the wt.fv logger to Debug and you should see a verbose error message on why it may be having an issue.


Follow this PTC TS document on how to change Log4J Logging quickly:
https://www.ptc.com/appserver/cs/view/solution.jsp?n=141146

1-Visitor
January 7, 2014

Thanks Tim,

We're using HTTPS.

I've set the logger to Debug (I used wt.util.jmxSetLogLevel -all Log4j Debug) and then waited for the next Server Status ping to occur, but trawling through the logs reveals no clues (at least to me).

That said, I'm not sure what I'm actually looking for. All I can say is that the logs look no different to me than they normally do, with no obvious errors jumping out.

14-Alexandrite
January 7, 2014

Try this (it has happened before with us on our Apache installation):

In $APACHE_HOME/conf/extra/, there is a file "modjk.conf"

Add the following line to the bottom of this file and restart Apache:
JkMountCopy All

This is something about multiple Tomcat processes.
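
For anyone who wants to script that change rather than edit the file by hand, here is a minimal Python sketch. It assumes the file lives where the post above says it does; the helper names `ensure_directive` and `patch_file` are made up for illustration. As far as I know from the mod_jk documentation, `JkMountCopy All` in the global configuration copies the global JkMount entries to every virtual host, which may be why it helps here. The helper only appends the line if an uncommented copy is not already present, so it is safe to run more than once.

```python
from pathlib import Path


def ensure_directive(config_text: str, directive: str) -> str:
    """Return config_text with `directive` as its last line, unless an
    uncommented line already matches it (ignoring case and extra spaces)."""
    wanted = [word.lower() for word in directive.split()]
    for line in config_text.splitlines():
        if [word.lower() for word in line.split()] == wanted:
            return config_text  # already present, leave the text untouched
    # Add a newline first only if the existing text does not end with one.
    separator = "" if config_text.endswith("\n") or not config_text else "\n"
    return f"{config_text}{separator}{directive}\n"


def patch_file(conf_path: str, directive: str = "JkMountCopy All") -> None:
    """Apply ensure_directive to a file on disk (path supplied by the caller)."""
    path = Path(conf_path)
    path.write_text(ensure_directive(path.read_text(), directive))
```

Take a backup of the file first, and remember that Apache still needs a restart afterwards, as noted above.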

22-Sapphire I
January 7, 2014

We always see this as well - it says SITE NOT RESPONDING in red as you show for some time. A bit later (maybe 2 minutes), it correctly shows connected. At about that time we get a JConsole email that says all is well with the vault. Meanwhile, users can work normally including all content operations. Somehow it appears to be a delay in how the server status page updates.

12-Amethyst
January 7, 2014

The basic reason for this is really, really simple:

When the method server pings https://<our winchill domain name>/Windchill/servlet/WindchillGW/wt.httpgw.HTTPServer/ping, it cannot successfully ping it, i.e. it does not receive a 200 response code to this URL request prior to the request timing out.

Which method server does this ping is largely indeterminate, I believe -- as this responsibility gets handed between method servers. I am pretty sure, however, that only foreground method servers do such pings.

If the foreground method servers cannot successfully do such a ping (e.g. you don't have a web server directly on each of your cluster nodes, or you improperly configured this URL to require authentication), then a failure will always be indicated.

Looking further into the code, "SITE NOT RESPONDING" indicates that other probing requests against https://<our winchill domain name>/Windchill also fail -- leading to the conclusion that the overall site (including simple anonymous static pages usually served by the web server) cannot respond (vs. the method server not being responsive, for instance).

1-Visitor
February 13, 2014

Hi Jesse,

Thanks for your response. However, I'm left puzzled as to whether there is a solution for this momentary unresponsiveness of the file server's method server. The reason I'm puzzled is that during this period I can ping the server, remotely log on to the file server, and even log on to Windchill from the file server workstation. With my servers it lasts exactly 5 minutes, and then the "Site Status Change Notification" saying the site is available again is received.

Thanks and Kind regards,

Tshepo

12-Amethyst
February 13, 2014

Well, the SITE NOT RESPONDING status indicates that pings to all of the following failed to respond with a 200 response in a (nearly) timely manner:

https://<our winchill domain name>/Windchill/servlet/WindchillGW/wt.httpgw.HTTPServer/ping

https://<our winchill domain name>/Windchill/servlet/WindchillGW/wtcore/test/dynAnon.jsp

https://<our winchill domain name>/Windchill/servlet/WindchillGW/wtcore/test/staticAnon.html

If you've routed requests for static pages through Tomcat, however, then all of these would be routed to a method server -- and thus one sufficiently unresponsive method server will result in SITE NOT RESPONDING.
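
To see which of these three requests actually fails, you can repeat them by hand from the machine that runs the foreground method server. The following is a rough Python sketch, not Windchill's own probing code: the 10-second timeout, the function names, and the base URL in the usage comment are all assumptions, and only the three paths come from the list above.

```python
import urllib.error
import urllib.request

# The three paths listed above, relative to the Windchill web app root.
PROBE_PATHS = (
    "/servlet/WindchillGW/wt.httpgw.HTTPServer/ping",
    "/servlet/WindchillGW/wtcore/test/dynAnon.jsp",
    "/servlet/WindchillGW/wtcore/test/staticAnon.html",
)


def probe_urls(base_url):
    """Build the full probe URLs from a base such as https://host/Windchill."""
    return [base_url.rstrip("/") + path for path in PROBE_PATHS]


def probe(url, timeout=10.0):
    """Return the HTTP status code, or a short error string on failure.

    The timeout is an assumed value, not the one Windchill uses.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 401 if the URL unexpectedly requires a login
    except OSError as exc:  # URLError, timeouts, and TLS errors land here
        return f"{type(exc).__name__}: {exc}"


def all_probes_failed(results):
    """True when no probe returned 200, the case described above."""
    return all(status != 200 for status in results.values())


# Usage (hypothetical host name):
# results = {url: probe(url) for url in probe_urls("https://plm.example.com/Windchill")}
# print(results, all_probes_failed(results))
```

Keep in mind that this runs outside the method server's JVM, so it uses Python's certificate trust store rather than the JVM's. A 200 here shows that the web server and routing work, but it does not prove that the method server itself trusts the site's certificate.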

Beyond this, my only guess would be that a background method server on a cluster node without any foreground method servers somehow decided to execute the ping -- which it shouldn't. That would be either a bug or a configuration error, but in either case it would be something to raise with technical support so they can consult the appropriate development team.

4-Participant
April 13, 2016

I had the same thing show up after rehosting the server with a different URL. The server also answered all of the pings and tests. I ended up importing the certificate into the jssecacerts keystore with the keytool import command, and that fixed it for me.
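
In case it helps someone else, the import described above can be scripted roughly as follows. This is only a sketch under assumptions: the function names, alias, file names, and keystore path in the test values are made up, and the keystore password is whatever your own jssecacerts file uses. The keytool options themselves (`-importcert`, `-noprompt`, `-alias`, `-file`, `-keystore`, `-storepass`) are standard JDK ones. Note that `ssl.get_server_certificate` only fetches the certificate the server presents, so if your certificate was issued by an internal CA you may prefer to import the CA certificate instead.

```python
import ssl
import subprocess


def fetch_server_certificate(host, port=443):
    """Download the certificate the server presents, as PEM text.

    No validation is done here; inspect the result before trusting it.
    """
    return ssl.get_server_certificate((host, port))


def keytool_import_command(cert_file, keystore, alias, storepass):
    """Build the keytool argument list for importing cert_file into keystore."""
    return [
        "keytool", "-importcert", "-noprompt",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", keystore,
        "-storepass", storepass,
    ]


def import_certificate(host, cert_file, keystore, alias, storepass):
    """Fetch the certificate, save it to cert_file, and run keytool."""
    with open(cert_file, "w", encoding="ascii") as handle:
        handle.write(fetch_server_certificate(host))
    subprocess.run(
        keytool_import_command(cert_file, keystore, alias, storepass),
        check=True,  # raise if keytool reports an error
    )
```

The keystore argument should point at the jssecacerts file of the Java installation that Windchill actually runs on. Back up the existing keystore before running anything like this.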