Optimising matrix row calculations for speed..?

Philip.Oakley · ‎Apr 30, 2014

I have a couple of image processing problems that need images to be transformed in a row-oriented fashion, so that the same function (e.g. cfft) is applied to each row. Thus the output matrix and input matrix are the same size (excepting any Real->Complex change).

Is there an agreed fastest method for extracting, processing and re-assembling the matrices? I need to do the processing many hundreds of times on large images (~300x200 or bigger).

Given that there is no row extract, is it better to transpose and column extract, or just do a for loop along the row?
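For readers comparing the two options outside Mathcad, here is a minimal sketch in Python rather than a Mathcad worksheet. It assumes a naive `dft` as a stand-in for cfft (its scaling and sign convention may differ from Mathcad's), and the names `rows_via_loop` and `rows_via_transpose` are hypothetical. It only shows that both routes give the same matrix; which one is faster in Mathcad would still need measuring.

```python
import cmath

def dft(vec):
    """Naive O(n^2) discrete Fourier transform; a stand-in for cfft (assumption)."""
    n = len(vec)
    return [
        sum(vec[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
        for k in range(n)
    ]

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def rows_via_loop(m, f):
    """Option 1: walk along the rows and apply f to each one directly."""
    return [f(row) for row in m]

def rows_via_transpose(m, f):
    """Option 2: transpose, process column by column, transpose back."""
    t = transpose(m)                      # each row of m is now a column of t
    n_rows_t, n_cols_t = len(t), len(t[0])
    out_t = [[0j] * n_cols_t for _ in range(n_rows_t)]
    for c in range(n_cols_t):
        column = [t[r][c] for r in range(n_rows_t)]   # column extract
        result = f(column)
        for r in range(n_rows_t):                     # write the column back
            out_t[r][c] = result[r]
    return transpose(out_t)

# Small illustrative "image"; real images would be ~300x200 or bigger.
image = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 2, 2, 2]]
by_loop = rows_via_loop(image, dft)
by_transpose = rows_via_transpose(image, dft)
```

The transpose route does two extra passes over the data, so in this sketch it can only be as fast as the row loop if the column extract itself is much cheaper than indexing along a row.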

Given that re-combining the row vectors into a matrix is tricky, is it better to keep the output as a vector of vectors? My downstream processing combines a pair of transformed outputs, so the indexing change can probably be handled row by row.
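The vector-of-vectors idea might look something like the following Python sketch. The row function `running_sum` and the element-wise product are purely illustrative assumptions, since the thread does not say how the pair of outputs is combined; `transform_rows` and `combine_rowwise` are hypothetical names.

```python
import operator

def running_sum(row):
    """Illustrative row function (assumption); any row -> row function would do."""
    total, out = 0, []
    for x in row:
        total += x
        out.append(total)
    return out

def transform_rows(matrix, f):
    """Keep the result as a list of row vectors; no re-assembly into a matrix."""
    return [f(row) for row in matrix]

def combine_rowwise(rows_a, rows_b, op):
    """Combine two transformed outputs one row at a time, element by element."""
    return [
        [op(x, y) for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(rows_a, rows_b)
    ]

first = transform_rows([[1, 2], [3, 4]], running_sum)
second = transform_rows([[1, 1], [2, 2]], running_sum)
combined = combine_rowwise(first, second, operator.mul)
```

Because the downstream step only ever touches matching rows, the outputs never need to be stacked back into a single matrix in this arrangement.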

Any thoughts on the best speed up?

Philip

AlanStevens · ‎Apr 30, 2014

Here's one simple speed-up, though as it takes about 2/3 of the time of the original, the gain is not massive and is probably not what you are really looking for.

Alan

AlanStevens · ‎Apr 30, 2014

And here's an obvious (in retrospect!) much quicker option.

Alan

Philip.Oakley · ‎Apr 30, 2014

I'd forgotten / hadn't realised that I could stuff the column vector back into the matrix inside a programme. I had it in my head that that was one of the things that couldn't be done.

Werner_E · ‎Apr 30, 2014

No gain in speed compared to what Alan posted, just a small tidy-up. It is a bit shorter, needs no global size variables (M, N), and saves a local temporary matrix (??). The marginal gain in speed seen in the screenshot isn't real, I guess, but due to Mathcad's inaccurate timing capabilities. For better accuracy in time measurement we would have to use the DLL which Richard posted a week ago.
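On the timing point, a common way to tell a real gain from measurement noise is to repeat the measurement and keep the best run. The sketch below is Python using the standard `timeit` module, not the Mathcad DLL mentioned above; `best_time` is a hypothetical helper name and the repeat counts are arbitrary.

```python
import timeit

def best_time(func, repeats=5, number=100):
    """Return the best per-call time in seconds over several repeated runs.

    Taking the minimum reduces the influence of background activity,
    which is usually what makes small differences look real.
    """
    runs = timeit.repeat(func, repeat=repeats, number=number)
    return min(runs) / number

# Example: time a trivial stand-in workload.
elapsed = best_time(lambda: sum(range(1000)))
```

If two variants differ by less than the spread between repeated runs of the same variant, the difference is probably not meaningful.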

PS: The loop counts from the top down. This is a relic from a prior version where I was using an additional local matrix which was preallocated that way (the same effect Alan achieves with his assignment just above the loop). As the routine is written now, we could just as well use a loop from 0 up; it won't make any difference.
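To illustrate why the loop direction does not matter once the output is preallocated, here is a small Python sketch (not the Mathcad routine from the screenshot, which is not reproduced in this thread). The function name `process_rows` and its `top_down` flag are hypothetical; the sizes are read from the matrix itself rather than from global variables.

```python
def process_rows(matrix, f, top_down=True):
    """Apply f to every row, filling a preallocated output in either order."""
    n_rows = len(matrix)              # size taken from the matrix, no globals
    out = [None] * n_rows             # preallocate one slot per row
    if top_down:
        order = range(n_rows - 1, -1, -1)   # last row index down to 0
    else:
        order = range(n_rows)               # 0 up to the last row index
    for r in order:
        out[r] = f(matrix[r])         # each row is independent of the others
    return out

def reverse_row(row):
    """Illustrative row function (assumption)."""
    return row[::-1]

data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
down = process_rows(data, reverse_row, top_down=True)
up = process_rows(data, reverse_row, top_down=False)
```

Since each row is written to its own slot and no row depends on another, both orders produce the same matrix.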