extracting data point from the plot and then some

jasius-disabled · ‎Sep 09, 2009

Hi,

I would like some help and advice with the following problem:

I have two data series plotted (see attached). Those are the points I obtained experimentally. My first question is whether I can connect those points with a line (a scatter plot connected by lines; somehow I couldn't) and extract point data along that line at a selected interval between the points (so my data becomes a line plot instead of a scatter plot).

and second

I need to show the difference between those two data series. They are comprised of two parabolas, one centered at ~1.1 and the other at ~1.5. I need somehow to show where and when this inflection at 1.3 appears (I have more data series in between, and at some point the parabola at 1.3 moves down and this inflection appears). I need to somehow show that onset. Would the first or second derivative show that? How do I proceed with this data? (That's why I wanted to extract points along the line, to then try derivatives on them.)

hectic but would appreciate any help

JOnas

Al2000 · ‎Sep 09, 2009

On 9/9/2009 2:39:04 PM, jasius wrote:
>Hi,
>
>I would like some help and
>advice with the following
>problem:
>
>I have two data series plotted
>(see attached). Those are the
>point I obtained
>experimentally. My first
>question is if I can connect
>those point with a line
>(scatter plot connected by
>lines, somehow I couldn't)...
...

The points in the plot don't connect 'cause you're plotting 1-row arrays instead of vectors. Change your data to vectors or plot the transposes.

Showing the inflection point depends on how the data is generated. If you know the location of the inflection point just use markers or plot an (x,y) pair showing the inflection point.

If you don't know where the inflection point is, it must be located using some numerical method, ranging from something as crude as looking for the maximum value in some interval to fitting a curve and getting its max.
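As a rough illustration of the crude end of that range, here is a minimal Python sketch (Python rather than Mathcad, so it can be run anywhere). The function name `max_in_interval` and the sample data are made up for this example; they are not the poster's data or worksheet.

```python
def max_in_interval(xs, ys, lo, hi):
    """Return (x, y) of the largest sampled y among points with lo <= x <= hi."""
    inside = [(y, x) for x, y in zip(xs, ys) if lo <= x <= hi]
    if not inside:
        raise ValueError("no samples inside the interval")
    y_best, x_best = max(inside)
    return x_best, y_best


# Made-up double-well samples, NOT the poster's data:
# y = (x - 1.1)^2 * (x - 1.5)^2 sampled every 0.05 from 1.0 to 1.6.
xs = [1.0 + 0.05 * i for i in range(13)]
ys = [(x - 1.1) ** 2 * (x - 1.5) ** 2 for x in xs]

# Search only between the two wells, where the bump is expected.
x_bump, y_bump = max_in_interval(xs, ys, 1.15, 1.45)
print(x_bump, y_bump)
```

The answer can only be as fine as the sample spacing, which is why fitting a curve first is usually the better option when the bump is shallow.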

Saludos,

Al

TomGutman · ‎Sep 09, 2009

Mathcad represents vectors by one-column matrices.
__________________
Tom Gutman

jasius-disabled · ‎Sep 09, 2009

Gents,

I have the 2001 version, so I couldn't open the second attachment.

I do want to show the inflection but not simply on the image, but numerically, possibly doing the differentiation. So how do I do that differentiation with the data I plotted?

And, Tom, I am not following you at all, please be simple

Al2000 · ‎Sep 09, 2009

On 9/9/2009 4:39:40 PM, jasius wrote:
>Gents,
>
>I have 2001 version so
>couldn't open the second
>attachment.
>
>I do want to show
>the inflection but not simply
>on the image, but numerically,
>possibly doing the
>differentiation. So how do I
>do that differentiation with
>the data I plotted?

At the lower left is a simple numerical analysis to find the point of minimal curvature. To the right is the fit of one polynomial, as you suggested in your first post about the origin of your data. Use the linfit function of MC if you prefer.

Saludos,

Al

TomGutman · ‎Sep 09, 2009

It helps to specify the version of Mathcad that you are using up front. The various versions are not compatible. My post was a bit cryptic, but the information was supposed to be in the attached file.

It may be just as well that you were unable to open it, as it turns out that I posted the wrong file anyway. Here is the file again, modified to work in MC11 (will probably work in MC2001) and in MC2000 format.
__________________
Tom Gutman

ptc-1368288 · ‎Sep 09, 2009

Looking at the data plot, you won't get a function for that. At best, it might be possible to find the best interpolating method. From there, if the interpolation looks good enough to the eye, it would be easy to export a well-populated data table. Then the question is: what do you want? The best interpolation? If you just want the point of the minimum value for each data set only, it is sufficient to fit the segment and find the minimum analytically.

jmG

jasius-disabled · ‎Sep 09, 2009

All I want is to somehow tell when this point of inflection starts. I have several data sets in between those two, so I need to apply some method to tell distinctly when its onset begins.

To do that I thought I would need to interpolate data between my points and re-plot it. Hence the first part of the question.

jasius-disabled · ‎Sep 09, 2009

Come to think of it, you are right: I need a well-populated table and then to somehow numerically determine when this 1.3 inflection becomes "significant". Again, I was thinking about derivatives.

Al I think somewhat answered my first part of the question

Al2000 · ‎Sep 09, 2009

On 9/9/2009 5:57:33 PM, jasius wrote:
>Al I think somewhat answered
>my first part of the question

Notice that if you fit a 4th-degree polynomial, then the inflection point you want is simply -a_3/(4*a_4), about 1.28 with your data.

You should decide if it is proper to use a higher-order fit. A 5th-degree polynomial gives a slightly lower standard deviation, and you can easily solve the resulting second-degree equation to find the point of inflection. That's up to you and how you need/want to model your data.

I can't help with the rest of your question 'cause I simply don't understand what you are looking for.
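For readers who want to see where -a_3/(4*a_4) comes from: for p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4, the second derivative is p''(x) = 2 a_2 + 6 a_3 x + 12 a_4 x^2 and the third is p'''(x) = 6 a_3 + 24 a_4 x. So -a_3/(4*a_4) is the zero of p''', i.e. the point where the curvature term p'' is extremal; it lies midway between the two zeros of p'' when those exist. The Python sketch below computes both. The function name and the coefficients are made up for illustration and are not the fit from the attached worksheet.

```python
import math


def quartic_curvature_points(a0, a1, a2, a3, a4):
    """For p(x) = a0 + a1*x + a2*x**2 + a3*x**3 + a4*x**4 (a4 != 0) return
    (x_mid, zeros): the zero of p''' and the real zeros of p''.
    a0 and a1 do not affect the result; they are kept for readability."""
    x_mid = -a3 / (4.0 * a4)            # p'''(x) = 6*a3 + 24*a4*x = 0
    # p''(x) = 12*a4*x**2 + 6*a3*x + 2*a2
    disc = (6.0 * a3) ** 2 - 4.0 * (12.0 * a4) * (2.0 * a2)
    if disc < 0.0:
        return x_mid, []                # p'' never reaches zero
    root = math.sqrt(disc)
    lo = (-6.0 * a3 - root) / (24.0 * a4)
    hi = (-6.0 * a3 + root) / (24.0 * a4)
    return x_mid, sorted([lo, hi])      # equal values if disc == 0


# Made-up example: p(x) = (x - 1.1)^2 * (x - 1.5)^2 expanded by hand.
x_mid, zeros = quartic_curvature_points(2.7225, -8.58, 10.06, -5.2, 1.0)
print(x_mid, zeros)
```

For a 5th-degree fit the same idea applies one order up: p''' becomes a quadratic in x, which is the second-degree equation mentioned above.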

Saludos,

Al

Al2000 · ‎Sep 10, 2009

Your references to two parabolas were intriguing me...

TomGutman · ‎Sep 10, 2009

Basic algebra -- every polynomial over the reals factors, over the reals, into linear and quadratic factors. Thus every quartic polynomial can be expressed as the product of two quadratic polynomials. This factorization is not unique in that:

a) if the quartic factors into four linear factors, you can take these in any pairwise combination to get quadratic factors, and

b) you can scale the two quadratics by any real number and its reciprocal.

b) is usually dealt with by working with monic polynomials, adding a single scale factor to the overall result, if needed. a) is an inherent property of a four root quartic.
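A small worked example may make a) and b) concrete. The quartic below is chosen only because its roots are easy to see; it has nothing to do with the posted data.

```latex
\begin{align*}
x^4 - 10x^3 + 35x^2 - 50x + 24
  &= (x-1)(x-2)(x-3)(x-4) \\
  &= (x^2 - 3x + 2)(x^2 - 7x + 12) && \text{roots paired as } \{1,2\},\ \{3,4\} \\
  &= (x^2 - 4x + 3)(x^2 - 6x + 8)  && \text{roots paired as } \{1,3\},\ \{2,4\} \\
  &= (x^2 - 5x + 4)(x^2 - 5x + 6)  && \text{roots paired as } \{1,4\},\ \{2,3\} \\
  &= \bigl(2(x^2 - 3x + 2)\bigr)\bigl(\tfrac{1}{2}(x^2 - 7x + 12)\bigr) && \text{rescaling as in b)}
\end{align*}
```

The three pairings in the middle lines are case a); the last line is case b) applied to the first pairing.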
__________________
Tom Gutman

ptc-1368288 · ‎Sep 09, 2009

On 9/9/2009 5:57:33 PM, jasius wrote:
>Come to think of it you are
>right: I need a well populated
>table and then somehow
>numerically determining when
>this 1.3 inflection becomes
>"significant". Again, I was
>thinking about derivatives
>
>Al I think somewhat answered
>my first part of the question
______________________________

1. "data" is your data set
2. "Data" is a ranged part of data (left entire)
3. 4th order polynomial fit is just great
4. "u" is the independent variable, discretized (discretize at will)
5. Deltafit is the rate of change over the range.
It is the "cumulative rate of change".
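The steps above can be sketched outside Mathcad roughly as follows. This is a Python sketch under stated assumptions: the sample data are made up (not the attached "data" table), the helper names are hypothetical, and reading "Deltafit" as the first derivative of the fitted polynomial is an interpretation, since the worksheet itself is not reproduced here.

```python
def fit_poly_centered(xs, ys, degree):
    """Least-squares polynomial fit in the shifted variable u = x - x0.

    Returns (x0, coeffs) with coeffs[k] multiplying u**k. Centering on the
    mean of xs keeps the normal equations reasonably well conditioned.
    """
    x0 = sum(xs) / len(xs)
    us = [x - x0 for x in xs]
    n = degree + 1
    # Normal equations A c = b: A[i][j] = sum(u^(i+j)), b[i] = sum(y * u^i).
    A = [[sum(u ** (i + j) for u in us) for j in range(n)] for i in range(n)]
    b = [sum(y * u ** i for u, y in zip(us, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting, then back substitution.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / A[i][i]
    return x0, coeffs


def poly_value(x0, coeffs, x):
    """Value of the fitted polynomial at x."""
    return sum(c * (x - x0) ** k for k, c in enumerate(coeffs))


def poly_slope(x0, coeffs, x):
    """First derivative of the fitted polynomial at x."""
    return sum(k * c * (x - x0) ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


# Made-up samples standing in for one column of "data".
xs = [1.0 + 0.05 * i for i in range(13)]
ys = [(x - 1.1) ** 2 * (x - 1.5) ** 2 for x in xs]

x0, coeffs = fit_poly_centered(xs, ys, 4)            # step 3: 4th order fit
u_grid = [1.0 + 0.6 * k / 600 for k in range(601)]   # step 4: discretize at will
fit = [poly_value(x0, coeffs, u) for u in u_grid]
slope = [poly_slope(x0, coeffs, u) for u in u_grid]  # step 5: rate of change
```

Once the fit is a polynomial, the derivative is exact for that polynomial, so no finite differencing of noisy points is needed.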

Does that help ?

jmG

ptc-1368288 · ‎Sep 09, 2009

Your project is now complete as per my understanding;

1. The data range is not needed, given the excellent polynomial fit.
2. The cumulative "rate of change" is given
3. The LocalMax/LocalMin is added c/w the functions.

Locate visually on the graph, collect.

jmG

ptc-1368288 · ‎Sep 10, 2009

... last and major detail:

You mentioned several columns of data covering a range of experiments. It is easy to calculate all the fits; then, on the same multifit principle, collect all the rates of change and, with a bit of a twist or manually, collect all the LocalMin/LocalMax values.

jmG

jasius-disabled · ‎Sep 10, 2009

When I open your attachment it has two red plots with no data. How is that complete?

I thank you all for your help. Honestly, that didn't do anything I wanted as you probably couldn't understand my questions and I got lost in your explanations. But the fact that you all jumped in is really flattering and shows great strength of this collaboratory

ptc-1368288 · ‎Sep 10, 2009

>When I open your attachment it has two red plots with no data<
_____________________

Which attachment? You had several from jmG and more from other collabs!
How can collabs know which one you are talking about?

jmG

ptc-1368288 · ‎Sep 09, 2009

In the attached file, saved in 2001 format, the "data" table is your two graphs. The polyline produces the same fillet as AutoCad. I have used the 16 points and, after discretizing, 1800 points (in fact 1600 = 100*16). You can track the traces with the tracking tool and get your project. Please read and come back with more specifics of what you are looking for. The blue bump has its maximum at (1.2685, 1.112); at that same 1.2685 the black graph is at 4.3675.

jmG

RichardJ · ‎Sep 10, 2009

Is this what you are trying to do?

This could also be done using the quartic fit Tom posted. I used splines just for generality.

Richard

jasius-disabled · ‎Sep 10, 2009

So actually there might be some ray of light for me (that's what I decided from the last post). I added to it all of the experimental values I have and all the z concentrations for the inflection point search

I didn't know how to finish it up, though

Just to be sure, we are talking about the inflection at 1.3, not about those at 1.1 or 1.5 (bottoms of parabola)

thanks for all of the help

JOnas

RichardJ · ‎Sep 10, 2009

On 9/10/2009 5:30:12 PM, jasius wrote:
>So actually there might be
>some ray of light for me
>(that's what I decided from
>the last post). I added to it
>all of the experimental values
>I have and all the z
>concentrations for the
>inflection point search
>
>I didn't know how to finish it
>up, though

To do that I need to know what the numbers 21, 55, 300, 111, etc mean. We need to be able to create a vector of data values in the third dimension.

>Just to be sure, we are
>talking about the inflection
>at 1.3, not about those at 1.1
>or 1.5 (bottoms of parabola)

It's at about 1.23, not 1.3. Look at z_7, which is close to the point you are looking for (unless I misunderstand the question).

Richard

RichardJ · ‎Sep 10, 2009

On 9/10/2009 6:42:56 PM, rijackson wrote:

>To do that I need to know what the
>numbers 21, 55, 300, 111, etc mean. We
>need to be able to create a vector of
>data values in the third dimension.

Don't bother. I just scrolled down the worksheet and found it. I am curious about what a concentration of 300 could mean, but I don't need to know to get this to work.

Richard

jasius-disabled · ‎Sep 10, 2009

The z values are solvent conductivity values (epsilon in the literature). A solvent of different conductivity gives me a different position of the proton in the quantum chemical calculations. As I pointed out before, every curve has the shape of two parabolas in it, i.e. energy minima on the potential energy surface (PES). A species can only be stable if it has a barrier (inflection) on that PES curve. So I want to know when this inflection becomes apparent, so I can tell at which epsilon values the species on the left becomes stable instead of rolling down the PES to the lowest energy minimum (the right one).

RichardJ · ‎Sep 10, 2009

Thanks for the explanation. That satisfies my curiosity, and also tells me that I was correct about what you wanted to find from the data.

Although, unless your system has zero internal energy then the potential barrier needs to be more than just "greater than zero" to stop your system from going mostly over to the right.

Richard

ptc-1368288 · ‎Sep 10, 2009

>I want to know when this inflection becomes apparent so I can tell at which epsilon values the species on the left becomes stable instead of rolling down on the PES to the lowest energy minima (right one)<.
____________________________

As explained above: you will never know from a spline! Make the effort to read my worksheet, range the data 3...13, and see a near-perfect fit. From there, you can narrow the max bump 1.1925 from erroneous data collection.

jmG

TomGutman · ‎Sep 10, 2009

There are a few problems trying to use your data.

You need to do a 2D interpolation. But that (without getting into a lot of trouble) requires representing the data as a single matrix, with the rows representing values of a and the columns representing values of z (or vice versa -- one can easily transpose a matrix). But your vectors of values have different numbers of entries for different z values. You need to have your data on a common basis to be able to collect it and analyze it in toto.
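One way to get the curves onto a common basis is to resample each one onto the same grid of a values, so that all of them fit in one matrix. The Python sketch below uses plain piecewise-linear interpolation as a stand-in; the function names and the two tiny curves are invented for illustration, and a spline or polynomial fit could be substituted for `interp_linear`.

```python
def interp_linear(xs, ys, x):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x outside the sampled range")
    for i in range(len(xs) - 1):
        if x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    return ys[-1]


def to_common_grid(curves, n_points):
    """curves maps z -> (a_values, energies), possibly of different lengths.

    Returns (grid, z_values, matrix) where matrix has one row per z and one
    column per grid point, restricted to the a range shared by all curves.
    """
    lo = max(a[0] for a, _ in curves.values())
    hi = min(a[-1] for a, _ in curves.values())
    # min(hi, ...) guards against rounding pushing the last point past hi.
    grid = [min(hi, lo + (hi - lo) * k / (n_points - 1)) for k in range(n_points)]
    z_values = sorted(curves)
    matrix = [[interp_linear(*curves[z], x) for x in grid] for z in z_values]
    return grid, z_values, matrix


# Made-up curves with different numbers of entries per z value.
curves = {
    21: ([1.0, 1.2, 1.4, 1.6], [4.0, 2.0, 1.0, 3.0]),
    55: ([1.1, 1.3, 1.5], [1.0, 2.0, 0.0]),
}
grid, z_values, matrix = to_common_grid(curves, 3)
```

With the data in this form, each column can be interpolated across z and each row across a, which is what a 2D interpolation needs.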

You seem to have some confusion in terminology. You keep talking about points of inflection at points where you clearly have extrema, something quite different. A point of inflection is a point at which the curvature changes sign, meaning that the second derivative changes sign (and is therefore zero, often extended to encompass points where the second derivative goes to zero even if it does not change sign). An extremum is a local maximum or minimum, diagnosed as where the first derivative changes sign.

In general maxima and minima alternate, with points of inflection between them. Your original b data (red curve) shows one minimum at 1.499 and two points of inflection at 1.195 and 1.371. The c data (blue curve) shows three extrema, two minima at 1.123 and 1.474 and a maximum at 1.268 with two points of inflection at 1.187 and 1.39. I gather you are interested in the value of z at which the first two extrema merge into a single stationary point of inflection.
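The distinction can be checked numerically on any densely sampled curve. The Python sketch below is only an illustration on a made-up double-well function, not on the b or c data: it estimates first and second derivatives by central differences and reports where each one changes sign. Samples where a derivative is exactly zero are not handled, so the grid is deliberately offset from the known stationary points.

```python
def derivatives(xs, ys):
    """Central-difference first and second derivatives at the interior
    points of a uniformly spaced grid."""
    h = xs[1] - xs[0]
    inner = xs[1:-1]
    d1 = [(ys[i + 1] - ys[i - 1]) / (2 * h) for i in range(1, len(ys) - 1)]
    d2 = [(ys[i + 1] - 2 * ys[i] + ys[i - 1]) / h ** 2 for i in range(1, len(ys) - 1)]
    return inner, d1, d2


def crossings(xs, vals):
    """(x, rising) for each strict sign change of vals between neighbours;
    x is located by linear interpolation, rising means negative to positive."""
    found = []
    for i in range(len(vals) - 1):
        v0, v1 = vals[i], vals[i + 1]
        if v0 * v1 < 0:
            x = xs[i] + (xs[i + 1] - xs[i]) * v0 / (v0 - v1)
            found.append((x, v1 > v0))
    return found


# Made-up double well: minima at 1.1 and 1.5, maximum at 1.3.
xs = [1.001 + 0.002 * i for i in range(300)]
ys = [(x - 1.1) ** 2 * (x - 1.5) ** 2 for x in xs]

inner, d1, d2 = derivatives(xs, ys)
# First derivative going from negative to positive marks a minimum.
extrema = [(x, "min" if rising else "max") for x, rising in crossings(inner, d1)]
# Second derivative changing sign marks a point of inflection.
inflections = [x for x, _ in crossings(inner, d2)]
print(extrema)
print(inflections)
```

On a curve where the left minimum and the maximum have merged, the first list would shrink to a single minimum, which is one way to detect the onset across a series of z values.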
__________________
Tom Gutman

TomGutman · ‎Sep 10, 2009

Another issue with your data. I notice that each of your data sets has a minimum value of exactly zero. That seems to be unlikely in real world terms, and suggests that each data set has undergone some independent normalization relative to its own minimum. Fine for each individual data set, but analysis of the ensemble is likely to work better if all the data are on a common basis.

BTW, are these data really experimental data (actual measurements) or are they calculated values based on some model? If modeled, is it a model that it would be reasonable to do in Mathcad? Then one could do calculations on the actual model and not just interpolation through selected points.
__________________
Tom Gutman

ptc-1368288 · ‎Sep 10, 2009

>So actually there might be some ray of light for me<
___________________________

Just a big bit darker. Your data set should be like usual, i.e. in table format. How can you get 10 digits from data collection, then round? Splining on erroneous data is no better than a good fit. You can see from the contested max bump that there will be no answer. The 4th order polyfit makes sense. The best fit? None of the proposers claimed so.
Don't worry, any fitting session starts with a long preach.

jmG

jasius-disabled · ‎Sep 10, 2009

Oh my... Now I feel guilty that I posted this question here...

Anyways, the data is calculated quantum chemically, no relation to Mathcad. I get energy values, which of course do not have zeroes. What I showed here was normalized data, where the lowest value is set to zero.

TomGutman · ‎Sep 10, 2009

No need to feel guilty about posting here. It's a learning experience. And do remember that not everything that is posted here is correct -- collaboratory members each speak for themselves, no one else. The general rule is caveat emptor. If you don't understand something, ask. You might get an explanation, you might discover that there is nothing to understand.

There is nothing wrong in using Mathcad to analyze results calculated elsewhere. If you could do the calculations in Mathcad, you'd have some additional avenues of approach and you could directly play with the z parameter (BTW, Mathcad supports Greek letters, if the literature uses ε it may be easier for you (and possibly those familiar with the literature) if you use that also). If, for whatever reason, it is impractical to do the calculations in Mathcad you have to rely on some sort of fitting and interpolation.

Whether the unnormalized data would be better is unclear. Sometimes the original data has simple functional forms that make fitting work better. Sometimes not. But it's still better to start with the original data (if available), as it's easy enough to normalize relative to the smallest value. It's not so easy (if at all possible) to recover the normalization factor, once it's been applied.
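A tiny Python sketch of why the order matters: normalizing each curve to its own minimum throws away the offset between curves unless that offset is stored separately. The energies below are invented numbers, and `normalize_to_minimum` is a hypothetical helper, not something from the worksheets.

```python
def normalize_to_minimum(energies):
    """Shift a curve so its lowest value is zero; returns (shifted, offset)."""
    offset = min(energies)
    return [e - offset for e in energies], offset


# Made-up absolute energies for two z values.
curve_a = [-76.4312, -76.4330, -76.4321]
curve_b = [-76.4402, -76.4415, -76.4398]

norm_a, off_a = normalize_to_minimum(curve_a)
norm_b, off_b = normalize_to_minimum(curve_b)

# Both normalized curves now bottom out at zero; the relative position of the
# two curves survives only in the stored offsets.
gap = off_b - off_a
```

If only `norm_a` and `norm_b` had been kept, `gap` could not be reconstructed from them.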

Another consideration for calculated data. Retyping data tends to be time consuming and error prone. Not to mention boring. Mathcad has reasonable flexibility in reading files. If you can post the data as data files, as produced by the calculating system, you can usually get Mathcad to read the data, and then manipulate it as needed. Worst case is writing scripted components to read and parse data, as VBScript has good facilities for dealing with arrays and text strings.
__________________
Tom Gutman