
ROC and Confusion Matrix failed with wrong values

drichter
14-Alexandrite

Hi,

I'm working with Analytics 8.1 and get strange results with my data. To find the problem, I tried the "Analytics Builder Quickstart" tutorial with the vibration data. Everything works fine until the step "Generate and Enhance Model". The model is built, but all ROC values and the Confusion Matrix are wrong.

The other features, Profiles and Signals, are OK.

26 REPLIES
cmorfin
19-Tanzanite
(To:drichter)

Hi David

Could you please clarify at which step of "Generate and Enhance Model" you get this result?

Is it at the first model creation or after some of the revise steps?

Also, could you please attach screenshots of:

- Help > About from Composer

- Settings > Verify Configuration from Builder

- Model Details in Builder, from the view page of the model as seen in your screenshot

- Model Configuration in Builder, from the view page of the model as seen in your screenshot

We can then test here with the same version to check if we see the same thing.

Regards

Christophe

drichter
14-Alexandrite
(To:cmorfin)

Hi Christophe,

This result appears when creating the model, and also when enhancing it.

Help > About from Composer (Version: 8.1.1-b108)

Settings > Verify Configuration from Builder (Version: 8.1.040000)

Model Details

Model Configuration

So I hope this is helpful.

cmorfin
19-Tanzanite
(To:drichter)

Hi David

Thank you for this.

Regarding the Model Details screenshot, the one you posted is not the one I wanted. The one you posted is the Model Job details, but I am after the model details. You can find it by:

- selecting the model in the model list page

- selecting View

- selecting the Model Details button

Also, could you please send screenshots of your extensions: in Composer, Import/Export > Manage

Thank you

Kind regards

Christophe

drichter
14-Alexandrite
(To:cmorfin)

So I hope this is the right one:

Composer Import/Export > Manage

I'm getting the same results after building the model on ThingWorx 8.1.0-b52 and Analytics 8.1.04 (ROC = 0.5 for false positives)

Can anyone clarify / explain such results?

cmorfin
19-Tanzanite
(To:DmitryTsarev)

Hi Dmitry

This is odd because I do not reproduce this result either with ThingWorx 8.1.0 or 8.1.1 and Analytics 8.1.

Is it possible (or have you already tried) to create a new model (just give it a different name) with the same dataset? Do you get the same 0.5 result with the new model?

Also, what happens if you upload the data again into a new dataset and create a new model?

If you always get the 0.5 result, could you possibly upload the dataset (json + csv) you used?

I understand this is the one from the QuickStart guide, but since I am not reproducing the issue with it, maybe something happened to that dataset at some point, so I would rather have your json and csv in a zip file.

Also, are you using your own ThingWorx and Analytics instances, or are you using trial hosted ones?

If you are using your own:

- what is the OS used for both the ThingWorx and Analytics servers?

- what are the regional settings / locale environment of those servers?

Thanks

Kind regards

Christophe

drichter
14-Alexandrite
(To:cmorfin)

I tried this many times with different model names, different datasets, and different goals (boolean types).

Here are some of the datasets used:

- AnalyticsTestStream.zip (goal: OverheatingError)

- analytics_vibration (test data of the current tutorial, goal: low_grease)


Both instances run on the same machine: Ubuntu 16.04.3 LTS

Regional settings: Deutsch (Deutschland) -> German (Germany)

Hi Christophe Morfin

Thank you for the prompt response. I created a new dataset and a new model - got the same result. Please see the dataset (.csv and .json), as well as the video showcasing the process and the result, attached to this post.

I'm doing the tests on my development machine - it's RedHat 7.4 (both TW Foundation and Analytics are installed on the same PC). Regional settings and date / number formats are set to German (showcased in the video).

Given Christophe Morfin's interest in Regional Settings, and seeing that both David Richter and I have German (which uses "," (comma) instead of "." (point) as the decimal separator), I changed the data format to English / UK and was pretty sure that this was the cause. But re-uploading the dataset and recreating the model didn't help.


PS I only did logout / login after changing the data format, which should be enough, but probably rebooting is still needed and would resolve the issue.

drichter
14-Alexandrite
(To:DmitryTsarev)

Maybe the regional settings have something to do with this problem, but I think the "." as decimal separator comes from Java or JavaScript, not from the regional settings.

It would be interesting to know whether other German users have the same problem, or whether non-German users have it too.

cmorfin
19-Tanzanite
(To:drichter)

David Richter​, Dmitry Tsarev

Thank you very much for your input above.

I went on and did additional testing on Linux (CentOS) with a German locale, but I was still unable to reproduce the ROC=0.5 result.

Could you send me the following data:

- javaps.txt file created after executing the command: ps -ef | grep java > javaps.txt
- a zip of ThingWorxStorage/logs
- a zip of /opt/ThingWorxAnalyticsServer/data/logs

- also, do you use a Docker or native installation of ThingWorx and Analytics?


Thank you

Kind regards

Christophe


drichter
14-Alexandrite
(To:cmorfin)

So I have only the Analytics log and the javaps.txt. The ThingWorx log is not really interesting, I think.

ThingWorx and Analytics are both natively installed.

cmorfin
19-Tanzanite
(To:drichter)

Hi David

Thank you for those logs. Could you still upload the ThingWorx log?

I am particularly interested in the ApplicationLog.log with the startup phase.

Thank you

Kind regards

Christophe

Christophe Morfin​ here are my logs and ps output

I'm using native installation.

There are several "Wrapped com.thingworx.exceptions.CouldNotConnectException: redhat-vm.irisoft.ru-AnalyticsServer_DataThing" errors in the logs - they occur, as far as I understand, because of TW Foundation and Analytics start order. I haven't sorted this out gracefully yet, so I just restart all the twas-* services after TW Foundation is up.

cmorfin
19-Tanzanite
(To:DmitryTsarev)

Hi Dmitry

Thank you for this data.

In the javaps.txt output file I do not see the Java process for Tomcat. You did mention that both ThingWorx and Analytics are installed on the same machine, so I would expect a Tomcat process for ThingWorx.

Do you know why that is?

If ThingWorx was not started at the time, could you start it and send me a new javaps.txt file?
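
As a rough way to check the listing before sending it, here is a minimal sketch. The function names `capture_java_ps` and `has_tomcat` are made up for this illustration; the only facts relied on are that Tomcat starts through the `org.apache.catalina.startup.Bootstrap` main class and that a plain `grep java` also matches its own command line.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Capture the running Java processes into a file (default: javaps.txt).
# The [j] bracket keeps grep from matching its own command line.
capture_java_ps() {
    ps -ef | grep '[j]ava' > "${1:-javaps.txt}"
}

# Succeed if the captured listing contains a Tomcat JVM.
# Tomcat is launched through the org.apache.catalina.startup.Bootstrap class.
has_tomcat() {
    grep -q 'org\.apache\.catalina\.startup\.Bootstrap' "${1:-javaps.txt}"
}

# Example usage (not run automatically):
#   capture_java_ps javaps.txt
#   if has_tomcat javaps.txt; then echo "Tomcat found"; else echo "Tomcat missing"; fi
```

If `has_tomcat` fails, ThingWorx was most likely not running when the listing was taken.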

Thank you

Christophe

Sorry, my bad. Tomcat indeed wasn't started at the time. Here is the new javaps.

As a side note to make things clear... I don't encounter any unexpected results in other areas of Analytics I tested - namely, Signals and Profiles work and provide results as expected.

Also worth mentioning... I haven't switched back to German data format after switching to English / UK yesterday.

cmorfin
19-Tanzanite
(To:DmitryTsarev)

Dmitry Tsarev​, David Richter

Thank you for the data you sent.

I have made multiple tests, trying to use the same JDK and Tomcat versions you are using and playing with settings, but I just can't reproduce the issue.

I talked to R&D about it and they do not have much information so far. However, the ROC computation is very different in the forthcoming release 8.2, so they propose that you wait and try again in 8.2 (should be out early February) to see how it behaves then.

Let me know if you have comments.

I would also be interested in getting the name of the installer you used. There is indeed an issue with the version as it appears in the UI: both 8.1.0 and 8.1.1 report the same version. I have nonetheless tried both versions with the same correct result, but I'd like to be sure of the version you are using.

Thanks

Christophe

Christophe Morfin​ thank you for the investigation.

The name of the installer I used is 'ThingWorxAnalyticsServerForLinux\ThingWorxAnalyticsServer-8.1.1-linux-x64-installer.run'

I've just tried to create the same model with the same dataset on another system I have at hand - Ubuntu 16.04 LTS, and the issue reproduced.


I'm currently only playing with Analytics - no serious pre-sale deadlines, so waiting for 8.2 is not a problem. I'll report back here with the results as soon as it's out and I do the tests.


Meanwhile, if you're willing to spend a bit more time on this... Maybe you could share another dataset and the resulting correct ROC, so I can try it on the malfunctioning systems?

Christophe Morfin

I've just installed TW Foundation + TW Analytics and got the same result as before.

cmorfin
19-Tanzanite
(To:DmitryTsarev)

Hi Dmitry

Just to clarify, when you say you installed Foundation and Analytics, do you mean release 8.2 of both products?

If yes, did you install release 8.2 of the Analytics extension too?

Could you maybe post a new screenshot of the result in 8.2?

Also, could you clarify what deployment you used: native or Docker, and is it on Windows or Linux (in case there are some changes compared to your previous tests)?

Note that 8.2 has a native Windows installer, which can be of interest. You do need to be aware of article CS279183, though, if you are using a non-English OS.

Thanks

Christophe

Hi Christophe Morfin

Sorry for the unclear message. Yes, I installed / upgraded both TW Foundation 8.2 and TW Analytics 8.2

I keep using the RedHat 7.4 machine (data format is still set to English / UK since I changed it back then during the initial attempts to pinpoint the cause of the issue).

I used native deployment.

I didn't observe the issue described in the article you mentioned, and the article claims that it's a Windows-only issue.

Please see some screenshots in the attachment

Christophe Morfin

Apparently there is a small but probably valuable difference with 8.1 (I overlooked it when making screenshots) - the ROC value itself is now calculated correctly (or, at least, looks so) - it's 0.8890 (used to be 0.5000 with 8.1).

But the ROC table and graph are still wrong, as well as the Confusion Matrix (there are no False and True Negatives).

I opened the TS case - maybe TS guys and gals have something to suggest.
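
One possible reading of the empty negatives, offered as a sketch rather than a confirmed diagnosis: if the confusion matrix holds no True or False Negatives, every row was scored as positive, and the ROC collapses to a single operating point. Under that assumption, the standard rate definitions give:

```latex
\[
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}
\]
\[
TN = FN = 0 \;\Rightarrow\; \mathrm{TPR} = \mathrm{FPR} = 1
\]
\[
\mathrm{AUC} = \tfrac{1}{2}\,(1 - 0)\,(0 + 1) = 0.5
\quad \text{for a curve through } (0,0) \text{ and } (1,1) \text{ only}
\]
```

That would be consistent with the 0.5000 seen in 8.1, but whether the product computed it this way is not confirmed anywhere in this thread.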

cmorfin
19-Tanzanite
(To:DmitryTsarev)

Hi Dmitry

Thank you for your post.

I did notice the difference with the ROC and it is expected indeed.

I was talking to my colleague who has got your case so we can work together, since I have done a lot of testing on this issue.

I did make some progress though.

I was able to reproduce the issue in 8.2 on my Windows machine.

However, I reproduce it only when the regional settings are not English (I tried German and French).

In the video you attached to the case, you mentioned you changed the data format to English but still have the Regional Settings set to German, so maybe this needs to be changed too.

I would think there is still something non-English on your machine that makes you hit the issue.

I have not been able to isolate which specific setting in Regional Settings causes the issue. I know it is not the decimal marker, as I was expecting, though.

I am still doing more tests on this, but wanted to bring you an update on the status thus far.

Kind regards

Christophe

cmorfin
19-Tanzanite
(To:cmorfin)

Hi Dmitry Tsarev​, David Richter

I have now been able to reproduce this on Linux as well.

I overlooked a point during my tests, and that is also probably why changing to English did not work for Dmitry.

The Regional Settings should not be changed via the UI, as this will change only the currently logged-in user's environment.

ThingWorx Analytics runs as the twxanalytics user, not the logged-in one.

So the language needs to be changed for all users.

This is done by modifying /etc/locale.conf on CentOS / RedHat.

See https://www.cyberciti.biz/faq/how-to-set-locales-i18n-on-a-linux-unix/ for settings on Ubuntu.

So if you update the setting for all users to English and restart, then the result will be fine.
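
The check and the change described above can be sketched roughly as follows. The function names `configured_lang` and `process_locale` are invented for this illustration, and `en_US.UTF-8` is only an example of an English locale, not a value confirmed in this thread.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print the LANG value configured in a locale.conf-style file
# (default: /etc/locale.conf, the file used on CentOS / RedHat).
configured_lang() {
    conf="${1:-/etc/locale.conf}"
    # Take the last LANG= line, strip the key and any surrounding quotes.
    sed -n 's/^LANG=//p' "$conf" | tail -n 1 | tr -d '"'
}

# Print the locale variables a running process was started with.
# /proc/<pid>/environ is NUL-separated; reading another user's process
# requires running this as root.
process_locale() {
    tr '\0' '\n' < "/proc/$1/environ" | grep -E '^(LANG|LANGUAGE|LC_[A-Z_]+)='
}

# Example usage (not run automatically):
#   configured_lang
#   for pid in $(pgrep -u twxanalytics java); do process_locale "$pid"; done
#
# To set an English locale for all users (en_US.UTF-8 is an example value):
#   CentOS / RedHat: edit /etc/locale.conf so it contains LANG=en_US.UTF-8
#   systemd systems: sudo localectl set-locale LANG=en_US.UTF-8
# then restart so the Analytics processes start with the new environment.
```

Checking the process environment rather than the login shell matters here because the desktop setting only affects the logged-in user, not the twxanalytics user.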

I have reported this to R&D, see https://www.ptc.com/en/support/article?n=CS279443

Kind regards

Christophe Morfin

Whee! This nails it down!

The only issue I'm still experiencing is with the ROC matrix. It has the same TP / FP values for all thresholds. I just retrained the existing model - perhaps starting from scratch will resolve it?

I'm just now trying to avoid setting locale system-wide and set it only for twxanalytics user.

cmorfin
19-Tanzanite
(To:DmitryTsarev)

Hi Dmitry

Good to see it worked for you now.

The fact that the values are the same for all thresholds is a different issue, and restarting from scratch will not change anything.

This is currently being investigated by R&D.

The ROC curve, though, does show the best threshold value that is retained for the model, so you do get the relevant information, but the table does indeed look odd.

Thanks

Christophe
