Skip to main content
13-Aquamarine
March 6, 2024
Question

On loop execution speed

  • March 6, 2024
  • 1 reply
  • 2041 views

In another thread the execution speed of loops was mentioned as a side issue. That warranted some testing.

 

It turns out that using if-construction inside the loop to deal with the first index instead of assigning that before the loop means that the loop takes almost twice as long ... if the loop just assigns a single value to a parameter. If the loop contents are even moderately complex, the effect of the if-construction will soon disappear into noise.

 

Another raised question was the speed of the two forks in the if-construction. From this simple test that is minimal. The random variation is much higher than the difference, but on the average the result below seems to be about right.

 

Another interesting thing, which doesn't show below, is the effect of multithreading. These examples are likely about as bad for multithreading as possible and that was also reflected in the results. The overhead from it was such that each execution time roughly quadrupled from the shown ones, which didn't use it.

 

JKT_0-1709720200919.png

 

1 reply

JKT13-AquamarineAuthor
13-Aquamarine
March 6, 2024

I couldn't resist: Similar samples of matrix generation. Defining the matrix in dual loop starting from top left does take longer than doing it from bottom right, but the difference is relatively small. I expected more. On the other hand, using matrix instead of dual loop was faster as I expected, but it still took about half as long as the dual loop. The loops fared rather well in my opinion. If there is any real calculation for the matrix values, the difference would likely be insignificant.

 

In this case multithreading worked a bit differently. The first two clocked relatively consistently a bit over 2 seconds each. The second may have still been slightly faster. The last two were interesting, though. The time in the first of them varied from about 0.6 to somewhat over 1s. The second was much faster. In some cases it was roughly the same as without multithreading and the max was maybe 0.7s. That kind of difference I didn't expect. Another thing I didn't expect is that on average every third run the latter two produced an error when using multithreading. There are obviously some bugs in the code.

 

JKT_0-1709722036078.png

 

25-Diamond I
March 6, 2024

To be fair you should also let the loop-programs call a function f rather than assigning a constant 1.
I also guess that the function should in some way use its arguments to avoid the effects of some sort of code optimization.

I made tests by defining user-written Matrix functions which mimic what the built-in matrix function does.

I guess if we create the whole matrix in front of the loops by assigning the last element any value, there should be no significant difference if we loop up or down.

My tests showed that

1) your machine is faster than mine 😉
2) timing is quite unreliable or there is a large random variation

3) the built-in matrix function is not that much faster than I thought!

 

Werner_E_3-1709731043377.png

Its surprising to see that even though the matrix is pre-created the Matrix2 function seems to be (very slightly) faster.

 

All timing was done with multithreading turned off as I won't trust any timing done with the time() function with multithreading on.

Nonetheless I tried with multithreading turned on and when I tried to recalculate the whole sheet, all four regions failed because of "overflow or infinite recursion". Letting them evaluate one after the other worked and the results were similar to the ones without multithreading:

 

BTW, I also tried the very same in real Mathcad (V.15). Here are the results:

Werner_E_2-1709730191173.png

 

Prime 9 and Mathcad 15  files attached

 

JKT13-AquamarineAuthor
13-Aquamarine
March 6, 2024

@Werner_E wrote:

To be fair you should also let the loop-programs call a function f rather than assigning a constant 1.
I also guess that the function should in some way use its arguments to avoid the effects of some sort of code optimization.


I didn't think of that. I just tried to eliminate everything else besides what I wanted to "measure".

 


I guess if we create the whole matrix in front of the loops by assigning the last element any value, there should be no significant difference if we loop up or down.

My tests showed that

1) your machine is faster than mine 😉
2) timing is quite unreliable or there is a large random variation

3) the built-in matrix function is not that much faster than I thought!


That's why I did it both ways - one of the ways had to re-size the matrix multiple times. A long vector would have done it even more times.

 

I found the same about timing. It obviously depends on other processes going on in the machine and - I suspect - what the window manager in Prime happens to be doing. That's why an average of 10 or so runs would be much better. The relative slow speed of the matrix was indeed a surprise.

 


Its surprising to see that even though the matrix is pre-created the Matrix2 function seems to be (very slightly) faster.

That is surprising indeed ... even though the difference is minimal as you wrote.

 


All timing was done with multithreading turned off as I won't trust any timing done with the time() function with multithreading on.

Nonetheless I tried with multithreading turned on and when I tried to recalculate the whole sheet, all four regions failed because of "overflow or infinite recursion". Letting them evaluate one after the other worked and the results were similar to the ones without multithreading:

 

BTW, I also tried the very same in real Mathcad (V.15). Here are the results:

Werner_E_2-1709730191173.png

 

Prime 9 and Mathcad 15  files attached

 


Makes you really trust the multithreading, doesn't it!

 

There's not that much difference between Mathcad 15 and Prime 9 results. That makes me wonder what makes the huge difference between them. It could be partially in the window manager, but it seems to me that the worst culprit is how Prime tries to calculate the entire document every time you make changes. When I'm editing in the middle of the document that can generate a long backlog of updates to the engine and when same thing or something else is changed that accumulates. The situation is even worse if the edit creates an error as those seem to take much longer to handle. Mathcad 15 didn't try to calculate beyond the bottom of visible page, so there never was that much to re-calculate and even if errors occurred those didn't take that long either as there were just a few of them.