Using Thingworx Analytics to find correlation between fields in a dataset

DmitryTsarev
17-Peridot

How could TW Analytics be used to find dependencies / correlations between fields in a dataset?

For example, for the dataset in the screenshot below I would expect to get insights like:

 

If x2=3 then x1 is guaranteed to be 7

If x5=3 and x4!=2 then x1 is rather likely (80%) to be 8

If x1=6 then x3 is most likely (~85%) to be 7

and so on
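As a rough illustration of the kind of insight meant above, here is a minimal sketch in plain Python, outside of ThingWorx Analytics. It estimates, for every field/value pair, the most likely value of a target field and how often it occurs. The function name `conditional_rules`, the `min_support` parameter, and the toy rows are all hypothetical and are not taken from the dataset in the screenshot.

```python
from collections import Counter, defaultdict

def conditional_rules(rows, target, min_support=2):
    """Estimate P(target = t | field = value) for each (field, value) pair."""
    # (field, value) -> Counter of target values seen together with it
    counts = defaultdict(Counter)
    for row in rows:
        for field, value in row.items():
            if field != target:
                counts[(field, value)][row[target]] += 1

    rules = []
    for (field, value), target_counts in counts.items():
        support = sum(target_counts.values())
        if support < min_support:
            continue  # too few rows to say anything useful
        best, hits = target_counts.most_common(1)[0]
        rules.append((field, value, best, hits / support, support))

    # Most confident rules first, then the ones backed by more rows
    rules.sort(key=lambda r: (-r[3], -r[4]))
    return rules

# Made-up toy data, for illustration only
rows = [
    {"x1": 7, "x2": 3, "x3": 1},
    {"x1": 7, "x2": 3, "x3": 2},
    {"x1": 8, "x2": 1, "x3": 2},
    {"x1": 6, "x2": 1, "x3": 7},
]
rules = conditional_rules(rows, target="x1")
for field, value, best, confidence, support in rules:
    print(f"If {field}={value} then x1={best} ({confidence:.0%}, {support} rows)")
```

This only covers single-field conditions; combined conditions such as "x5=3 and x4!=2" would need the pairs extended to tuples of fields.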

 

The first thought that came to my mind is that this is basically categorical (ordinal?) data, but currently handling it in TWA is only possible using custom code, mashups, etc.

 

So then I thought about adding another boolean column, say 'target', setting it to '1' for a given value of a field (say, x1=8), and then dropping the x1 field from the dataset.

 

This kind of works, but requires some extra overhead.
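The workaround described above could be prepared before upload with something like the following sketch. It is plain Python over a list of dicts, not a ThingWorx Analytics API; the helper names `binarize_goal` and `one_vs_rest_datasets` and the sample rows are hypothetical.

```python
def binarize_goal(rows, field, value, target_name="target"):
    """Replace `field` with a 0/1 column that flags rows where field == value."""
    out = []
    for row in rows:
        # Copy every column except the original goal field
        new_row = {k: v for k, v in row.items() if k != field}
        new_row[target_name] = 1 if row[field] == value else 0
        out.append(new_row)
    return out

def one_vs_rest_datasets(rows, field):
    """Build one binarized dataset per distinct value of `field`."""
    values = sorted({row[field] for row in rows})
    return {v: binarize_goal(rows, field, v) for v in values}

# Made-up toy data, for illustration only
sample = [
    {"x1": 8, "x2": 1, "x5": 3},
    {"x1": 7, "x2": 3, "x5": 2},
    {"x1": 8, "x2": 2, "x5": 3},
]
datasets = one_vs_rest_datasets(sample, "x1")
print(datasets[8])
```

Each resulting dataset would then be uploaded and modelled separately with 'target' as the goal, which is where the overhead comes from: one dataset and one job per goal value.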

 

Do I understand this correctly?

Am I missing some other approaches?

 

ACCEPTED SOLUTION

jgreiner
13-Aquamarine
(To:DmitryTsarev)

Hi Dmitry,

 

I am following up on this post and wanted to let you know that a feature request (JIRA TA-2595) has been opened requesting Signals and Profiles jobs to handle Categorical and Ordinal Goals.

 

Thanks,

 

John


5 REPLIES

Just wanted to clarify a bit...

When using "dataType": "INTEGER", "opType": "CONTINUOUS", the model that Analytics builds seems to be correct, and I can even run predictive scoring against it (which also seems to work correctly).

 

The problem is that Profiles and Signals don't show the expected insights.

On the Signals screenshot below (goal set to x1 when building the model), Analytics rates the x3 field with a value of 22 very low, but x3=22 actually gives a 100% chance that x1=1.

Not to mention that x3 doesn't even appear in Profiles (because of that MaximizeGoal true/false parameter).

 

Should I really handle this with opType=categorical?

 

Or should it work with Integer, and I am just misunderstanding some vital concept?

jgreiner
13-Aquamarine
(To:DmitryTsarev)

Hello,

 

I have reached out to our (PTC) Dev team regarding your questions below and will get back to you when I hear back from them.

 

Please let me know if you have any other questions or concerns in the meantime.

 

Warm Regards,

 

John

Hi John

 

Thanks for the response. I've actually given up on the issue and decided to wait for a release that supports categorical goals so that I can validate my assumptions.

 

So far I have concluded that the currently implemented analytics with an integer goal is suited to cases like "how to maximize the amount of goods produced daily" (where the goal would be an Integer value reflecting the amount of goods produced) or "how to minimize the amount of defective goods" (where the goal would be an Integer value reflecting the amount of defective goods).

 

In my case I really don't want to maximize or minimize the goal; I just need to find correlations between particular values of the goal and other fields. So it now looks to me that for this kind of problem a categorical goal is indeed the way to go.
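One way to express "correlation between a particular goal value and a particular field value" without any notion of maximizing or minimizing is lift: how much more often the goal value occurs when the field has that value than it does overall. The sketch below is plain Python and not part of ThingWorx Analytics; the function name `lift` and the toy rows are hypothetical.

```python
def lift(rows, field, value, goal, goal_value):
    """Return P(goal = goal_value | field = value) / P(goal = goal_value)."""
    n = len(rows)
    n_field = sum(1 for r in rows if r[field] == value)
    n_goal = sum(1 for r in rows if r[goal] == goal_value)
    n_both = sum(
        1 for r in rows if r[field] == value and r[goal] == goal_value
    )
    if n_field == 0 or n_goal == 0:
        return None  # undefined when either value never occurs
    return (n_both / n_field) / (n_goal / n)

# Made-up toy data, for illustration only
toy = [
    {"x1": 1, "x3": 22},
    {"x1": 1, "x3": 22},
    {"x1": 1, "x3": 5},
    {"x1": 4, "x3": 5},
    {"x1": 4, "x3": 5},
    {"x1": 4, "x3": 5},
]
print(lift(toy, "x3", 22, "x1", 1))  # above 1: x3=22 makes x1=1 more likely
print(lift(toy, "x3", 5, "x1", 1))   # below 1: x3=5 makes x1=1 less likely
```

A lift above 1 means the field value makes the goal value more likely than average, and a lift below 1 means less likely, regardless of whether the goal value itself is numerically large or small.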

 

 

Best regards,

Dmitry

jgreiner
13-Aquamarine
(To:DmitryTsarev)

Hi Dmitry,

 

I am following up on this post and wanted to let you know that a feature request (JIRA TA-2595) has been opened requesting Signals and Profiles jobs to handle Categorical and Ordinal Goals.

 

Thanks,

 

John

Thank you for the update, John, much appreciated.
