Nested Arrays - shape function

StuartBruff · ‎Oct 18, 2016

It's time for my semi-regular update, rationalization and general cogitation on the whichness and the whereforeness of multi-dimensional arrays (MDAs) in Mathcad. This time, I'm going to start off with a revamp of my Nested Array support functions. To kick off the process, I'd like to present the shape function. shape is intended to specify the structure of a nested array with the intent of being able to both generate and populate an array given this specification.

shape(A): returns the number of elements in each dimension of A.

If A is not an array (eg, a scalar, string or function), then shape returns 0.
If A is a vector or (non-nested) array, then shape returns a 2-vector giving the number of rows and number of columns in A.
If A is nested and all the elements of A are arrays of the same shape, then shape returns the shape of (one of) the elements.
If A is nested and the elements of A have different shapes, then shape returns an array showing the shape of each nested element.

shape acts recursively on each nested element of A.

Following the local function definitions, shape checks if A is an array, and if it isn't then it returns 0 (the shape of a scalar, string or range). If A is an array, then shape further checks if A is nested. If A isn't nested, then shape returns a 2-vector giving the number of rows and the number of columns in A.

If A is nested, however, then shape vectorizes itself over A to return an array (the subshape array) containing the shape of each of A's elements. It then checks whether those shapes are identical. If they are, then shape returns a single one of the shapes; otherwise it returns the whole of the subshape array.
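For readers without Mathcad to hand, the logic described above can be sketched in Python. This is only an illustration under stated assumptions, not the worksheet's actual definition: `Arr` is a hypothetical stand-in for a Mathcad array (a non-empty list of equal-length rows), the 2-vector is shown as a `(rows, cols)` tuple, and an array is treated as nested when any of its elements is itself an `Arr`.

```python
class Arr:
    """Hypothetical stand-in for a Mathcad array: a list of equal-length rows."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __eq__(self, other):
        return isinstance(other, Arr) and self.rows == other.rows

    def __repr__(self):
        return f"Arr({self.rows})"


def shape(a):
    """Sketch of the shape rules described in the post (assumed semantics)."""
    # Not an array (scalar, string, function, ...): shape is 0.
    if not isinstance(a, Arr):
        return 0

    elems = [e for row in a.rows for e in row]

    # Plain (non-nested) array: report (rows, cols).
    if not any(isinstance(e, Arr) for e in elems):
        return (len(a.rows), len(a.rows[0]))

    # Nested: recurse into every element to build the subshape array.
    sub = Arr([[shape(e) for e in row] for row in a.rows])

    # If every element has the same shape, return just one copy of it.
    first = sub.rows[0][0]
    if all(s == first for row in sub.rows for s in row):
        return first
    return sub


plain = Arr([[1, 2, 3], [4, 5, 6]])
uniform = Arr([[Arr([[1, 2]]), Arr([[3, 4]])]])
ragged = Arr([[Arr([[1, 2]]), 7]])
```

Here `shape(plain)` gives `(2, 3)`, `shape(uniform)` collapses to the single element shape `(1, 2)`, and `shape(ragged)` returns the whole subshape array because its elements differ.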

All comments gratefully received.

Stuart

LucMeekes · ‎Oct 18, 2016

OK, it's a bit of nagging, but:

The function does not take care of the fact that ORIGIN might be > 0...

Luc

StuartBruff · ‎Oct 19, 2016

LucMeekes wrote:

OK, it's a bit of nagging, but:

The function does not take care of the fact that ORIGIN might be > 0...

Luc

I included a (literally) last minute variant that should have been ORIGIN-independent, which I only gave a cursory check. Does it not work?

Stuart

LucMeekes · ‎Oct 19, 2016

Hmm, weird.

My comment was based upon the picture (I hadn't opened the attachment yet).

From the looks of it I concluded (because of A[0,0 and nextlevel[0,0) that it would not work if, for example, ORIGIN=1.

Now I opened the attachment, and tried it out. The shape() function works, surprisingly.

Is that due to the fact that it's defined globally? Yes! (still weird) If I change the ≡ to an := then the shape function doesn't work anymore.

Notice that many of your utility functions don't work either with ORIGIN=1, because they are defined with :=.

Luc

StuartBruff · ‎Oct 21, 2016

LucMeekes wrote:

Hmm, weird.

My comment was based upon the picture (I hadn't opened the attachment yet).

From the looks of it I concluded (because of A[0,0 and nextlevel[0,0) that it would not work if, for example, ORIGIN=1.

Now I opened the attachment, and tried it out. The shape() function works, surprisingly.

Is that due to the fact that it's defined globally? Yes! (still weird) If I change the ≡ to an := then the shape function doesn't work anymore.

Notice that many of your utility functions don't work either with ORIGIN=1, because they are defined with :=.

Hi Luc,

Do you mean the ORIGIN-independent version doesn't work either? (I'm Mathcadless for the next few days, so can't check)

I'm almost exclusively an "ORIGIN = 0" person, so tend to write as such. I do have an ORIGIN-independent form for most of my functions (which I check using -9 and +9 ORIGINs). When I do use ORIGIN, I usually explicitly define it at various points throughout the worksheet, rather than using the dialog, and put my (collapsed) Utilities Area at the top, so it's not normally a problem for me.
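The difference between an "ORIGIN = 0" function and an ORIGIN-independent one can be mimicked outside Mathcad. The Python sketch below is an analogy only: `make_array`, `first_hardcoded`, `first_independent` and `check` are hypothetical names, and a dict keyed from `origin` stands in for a Mathcad vector whose first index is ORIGIN. It uses the same -9 and +9 test origins mentioned above.

```python
def make_array(values, origin):
    """Model a Mathcad vector: indices run from origin upwards."""
    return {origin + k: v for k, v in enumerate(values)}


def first_hardcoded(a):
    """Written for ORIGIN = 0 only: the index 0 is baked in."""
    return a[0]


def first_independent(a, origin):
    """Indexes relative to whatever origin is in force."""
    return a[origin]


def check(origin):
    """Return (hard-coded result or None on failure, independent result)."""
    a = make_array([10, 20, 30], origin)
    try:
        hard = first_hardcoded(a)
    except KeyError:
        hard = None  # index 0 does not exist for this origin
    return hard, first_independent(a, origin)


results = {o: check(o) for o in (-9, 0, 9)}
```

Only the origin-relative version finds the first element for all three origins; the hard-coded one succeeds only when the origin is 0.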

I used the global define operator because shape is better defined after its first use in my current nested-array development worksheet, and the "shape" worksheet is simply an extract from the development worksheet.

Stuart

LucMeekes · ‎Oct 21, 2016

I never use the dialog, I put an

ORIGIN:=1

at the very top of the sheet, and things happened as described above.

Luc

VladimirN · ‎Oct 21, 2016

Same here.

StuartBruff · ‎Oct 21, 2016

LucMeekes wrote:

I never use the dialog, I put an

ORIGIN:=1

at the very top of the sheet, and things happened as described above.

As expected, then, given that any := function uses the value of ORIGIN in force at its point of definition.

I usually put all of my utility definitions in a collapsed Area at the top of a worksheet, so they pick up the default worksheet ORIGIN = 0. On occasion, and for specific purposes, I've added ORIGIN:=n in the Utilities Area, either as the last statement or before any functions written in terms of ORIGIN = n. (My most frequent reason for doing this is to translate a function from, or validate it in, an ORIGIN = 1 environment (eg, Matlab).)

Stuart

AlvaroDíaz · ‎Oct 19, 2016

Hi Stuart.

Very interesting function.

My observations are not for this shape function, but maybe for another one.

First, the structure for the answer. It's assumed that the use of shape is for some iterative process over some nested array. But given that the shape structure is also a nested array, my first impression is that I would have fewer issues working directly with the array than with the structure information that the shape function returns. Is there a way to organize the structure information as a single vector, indicating where the row and column dimensions appear? Just like the match function.

Second, although I don't understand why, I assume there are good reasons for returning only one shape if all the shapes are identical. In this other function, I can't tell whether that could introduce issues when working with the structure information over the given nested array.

Best regards.

Alvaro.

StuartBruff · ‎Oct 21, 2016

AlvaroDíaz wrote:

First, the structure for the answer. It's assumed that the use of shape is for some iterative process over some nested array. But given that the shape structure is also a nested array, my first impression is that I would have fewer issues working directly with the array than with the structure information that the shape function returns.

Hi Alvaro, thanks for the comments.

Yes, the shape is indeed used for operating on nested arrays, the primary ones being a generic fill function (which can also reshape arrays) and a partition function (useful for creating block matrices).

You might have fewer issues working directly with the target array itself, if the array is small. But you will almost certainly find that it is easier to work with a shape for anything of even moderate size. For example, if I wish to reshape a 3x4x4x3 array into a 4x3x12 array, giving the shape rather than an example of the intended array is much simpler, both cognitively and physically. It also makes it easier to check that the actual output array has the shape you wanted. Now scale that up to a 300x400 array ...
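The reshape example can be sketched in Python to show why a shape is a convenient specification. This is a rough illustration under assumptions, not the fill function itself: `flatten`, `fill` and `dims_of` are hypothetical helpers, arrays are modelled as nested lists, and the shape is simplified to a flat tuple of dimensions rather than Mathcad's nested 2D form.

```python
from itertools import islice


def flatten(a):
    """Yield the leaves of an arbitrarily nested list in row-major order."""
    for x in a:
        if isinstance(x, list):
            yield from flatten(x)
        else:
            yield x


def fill(dims, values):
    """Build a nested list with the given dimensions from an iterable of values."""
    it = iter(values)

    def build(d):
        if len(d) == 1:
            return list(islice(it, d[0]))
        return [build(d[1:]) for _ in range(d[0])]

    return build(list(dims))


def dims_of(a):
    """Read the dimensions back from a regular (non-ragged) nested list."""
    d = []
    while isinstance(a, list):
        d.append(len(a))
        a = a[0]
    return tuple(d)


# Reshape a 3x4x4x3 array into a 4x3x12 array by naming only the target shape.
src = fill((3, 4, 4, 3), range(144))
dst = fill((4, 3, 12), flatten(src))
```

Checking the result is equally short: `dims_of(dst)` should equal the requested `(4, 3, 12)`, which is much easier than comparing against a hand-built example array.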

Is there a way to organize the structure information as a single vector, indicating where the row and column dimensions appear? Just like the match function.

Yes. Indeed, I have a variant of shape that returns a vector of vectors ... but a problem with this, and other linear representations, is that they lose some of the "essence" of Mathcad's 2D array structure. I find it easier to understand the structure of, say, array A2 in my original post by looking at the given shape than by looking at a list of lists. This is particularly the case when the array is irregular (ragged) rather than hyper-rectangular.

Second, although I don't understand why, I assume there are good reasons for returning only one shape if all the shapes are identical. In this other function, I can't tell whether that could introduce issues when working with the structure information over the given nested array.

One of the ideals of any notation is to provide information in a compact form that unambiguously removes replication of data.

Consider the element [1,1] (or [2,2], according to ORIGIN taste!). I could have chosen to put 4 copies of the nested element structure in a 2x2 array rather than the more compact representation shown. This would have been fine for such a small array, but would have made it significantly harder to read for much larger or more complicated structures.

Fortunately, if somebody does prefer a "fully expanded" structure, then one of my fill function variants will convert the compact form accordingly.

I'm probably going to change the representation of a vector to give just its length ... It's just that it's easier to deal with a shape in a uniform manner by giving the column count as well.

Another concept I've been toying with is to allow a user to give a name to a shape, and then specify arbitrary individual occurrences of that shape by name. In addition, the user would be able to specify exceptions to a repeating shape, eg A2 above could be given as a 2x2 shape with an exception at [1,1]. The downside is that this would require a more complex parser, and I don't feel like putting the effort into that at the moment ... too big a distraction from the main task!

Stuart