Jul 07, 2021
01:52 PM

Goodness of fit (evaluate the significance of the models with correlation coefficients)

Hello,

I have data X and Y. I have set up 4 different models for these data. I would like to evaluate the significance of the models. You can see in the graph that model 2 (green) is the best. I want to achieve this result not by looking, but by using a correlation coefficient.

My problem is that none of Mathcad's correlation coefficients gives correct results. Mathcad help often uses the Pearson correlation coefficient for such purposes. From a Mathcad workshop and from Wikipedia I know the general coefficient of determination. It gives plausible results, except when the deviation is very large, as in model 1, where it returns values greater than 1.

Does anyone have an idea why this is the case?

Thank you very much.

Best regards

Paul
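
One possible explanation for values above 1, sketched under an assumption: the thread does not show the worksheet formula, so suppose the coefficient of determination was computed as the explained sum of squares divided by the total sum of squares. That ratio is only guaranteed to stay at or below 1 for a least-squares fit with an intercept. For an arbitrary model it can exceed 1, while the residual-based form 1 - SSE/SST can never exceed 1 (it can become negative instead). The data, the model, and the function names below are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def r2_explained(y, f):
    """Explained form: sum((f - mean(y))^2) / sum((y - mean(y))^2)."""
    y_bar = mean(y)
    ss_explained = sum((fi - y_bar) ** 2 for fi in f)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return ss_explained / ss_total

def r2_residual(y, f):
    """Residual form: 1 - sum((y - f)^2) / sum((y - mean(y))^2)."""
    y_bar = mean(y)
    ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, f))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_residual / ss_total

# Hypothetical measurements and a deliberately poor model that overshoots.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
poor_model = [3.0 * yi - 6.0 for yi in y]

print(r2_explained(y, poor_model))  # 9.0, far above 1
print(r2_residual(y, poor_model))   # -3.0, a bad fit but never above 1
```

If the worksheet used the explained form, switching to the residual form would keep the value at or below 1 for every model.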

Accepted Solutions

Jul 07, 2021
07:16 PM

Here are my 2 cents:

MeanSquaredError seems to give you a nice value to determine the best fit (lowest value).

Keep in mind that Pearson is a measure of __linear correlation__, which might be the reason that model 2 does not have the highest value. It measures how close the data are to a straight line if you plot model... over Y (X is not used at all!). Given that, visual inspection suggests that model 3 and model 2 have the best correlation in that respect.
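
The difference between the two measures can be sketched in a few lines of Python. This is only an illustration with made-up numbers, not the worksheet from the thread; `mse` and `pearson` are hypothetical helper names. A model that is a scaled and shifted copy of Y lies on a perfect straight line when plotted over Y, so Pearson rates it at essentially 1, even though its mean squared error is large.

```python
import math

def mse(y, f):
    """Mean squared error between measurements y and model values f."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, f)) / len(y)

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical data: X never appears, only Y and the model values at the same points.
y = [1.0, 4.0, 9.0, 16.0, 25.0]
close_model = [1.2, 3.7, 9.4, 15.8, 25.3]        # small scatter around y
shifted_model = [2.0 * yi + 10.0 for yi in y]    # far from y, but linear in y

for name, model in (("close", close_model), ("shifted", shifted_model)):
    print(name, "MSE:", mse(y, model), "Pearson:", pearson(y, model))
```

Here the lowest MSE picks the close model, while Pearson cannot tell the two apart in any useful way, which is why it is a poor criterion for ranking fits.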

Jul 08, 2021
10:27 AM

Hello Werner,

Thank you very much for your help at this point. I will use the MSE. It is robust and simply defined.

Just out of personal interest: I had also read in several places that Pearson only works for linear equations. However, time and again I see Pearson being used for non-linear equations as well, for example in the Mathcad 15 QuickSheet on logarithmic regression.

Then I had the following idea. You linearised the models in your representation. I used this linearisation and inserted it into Pearson. Now it should be valid and deliver correct results, but the values rise above 1. I can only explain this by the error that occurs when fitting the straight line.

Many thanks and best regards

Paul
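
A note on the values above 1: a correctly computed Pearson coefficient always lies between -1 and 1 (this follows from the Cauchy-Schwarz inequality), so a larger value points to an inconsistency in the formula rather than to a fitting error. The sketch below shows one hypothetical way this can happen, namely dividing a sample covariance (normalised by n - 1) by population standard deviations (normalised by n). Whether the worksheet in the thread did this is an assumption; the function names and data are made up.

```python
import math

def pearson(a, b):
    """Consistent definition: the normalisation cancels, result is in [-1, 1]."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)

def pearson_mixed_normalisation(a, b):
    """Deliberately inconsistent: covariance uses n - 1, standard deviations use n."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    cov_sample = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (n - 1)
    sd_a_pop = math.sqrt(sum((ai - a_bar) ** 2 for ai in a) / n)
    sd_b_pop = math.sqrt(sum((bi - b_bar) ** 2 for bi in b) / n)
    return cov_sample / (sd_a_pop * sd_b_pop)

# Perfectly linear hypothetical data, so the true coefficient is 1.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]

print(pearson(a, b))                      # approximately 1
print(pearson_mixed_normalisation(a, b))  # approximately n / (n - 1) = 4/3
```

Checking that the covariance and both standard deviations use the same normalisation is a quick way to rule this cause out.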

Jul 08, 2021
07:08 PM

I am not sure and out of my comfort zone here, but, as I understand it, Pearson is a measure for the linear correlation of two sets of (1-dimensional) data, in your case Y and modelx. What you have are two 2-dimensional point clouds Y over X and modelx over X. Not sure if Pearson as you had defined it could help here.