Solved: Filtering data set with random integers

Raiko · ‎Oct 21, 2016

Hello fellow MCers,

I have a problem with a data set filter. My aim is a function that takes a given set of data points and builds another set containing 2^k randomly selected elements, where k is an integer.

So I created these functions in MC15. The annoying thing is that they work fine when the number of elements in the data set is only slightly above a power of two, 2^k (e.g. 525). However, the routine fails to converge to a solution when the number of elements in the initial set is close to the next power of two, 2^(k+1).

E.g. with a data set containing 555 elements, it puts out a 512-element vector, which is fine.

Change the number of data elements to, say, 999, and it has trouble finding a solution, whereas a set of 1033 elements, just above the next power of two, yields 1024.

My suspicion is that it has to do with the way I'm generating the set of random integers (for the trim function): I call runif and truncate the real numbers to integers.

Thanks in advance

Raiko
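
The worksheet itself is not shown in the thread, so the following is only a sketch under two assumptions: that the trim step removes n - 2^k elements at randomly drawn indices, and that the routine redraws until all drawn indices are distinct. Under those assumptions, the chance that a single draw has no duplicates falls off very quickly as the number of indices to remove grows, which would be consistent with the behaviour described above. The function name prob_all_distinct is hypothetical.

```python
def prob_all_distinct(n, m):
    """Probability that m independent uniform draws from n indices are all distinct."""
    p = 1.0
    for i in range(m):
        # The (i+1)-th draw must avoid the i indices already taken.
        p *= (n - i) / n
    return p

# 555 elements: 43 indices to remove to reach 512.
p_small = prob_all_distinct(555, 555 - 512)
# 999 elements: 487 indices to remove to reach 512.
p_large = prob_all_distinct(999, 999 - 512)

print(f"555 -> 512: {p_small:.3g}")
print(f"999 -> 512: {p_large:.3g}")
```

With only a few dozen indices to remove, a duplicate-free draw turns up after a handful of retries; with several hundred, it practically never does.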

StuartBruff · ‎Oct 21, 2016

Unfortunately, I'm Mathcadless at the moment, so I can't see what you've done; apologies if you already know this method. One of the easiest ways to pick random elements from a vector is to use runif to create a vector of the same size as the original vector, augment the two vectors, sort on the "runif" column, and then extract the (now randomly sorted) "original" column. Then use submatrix on that to get as many elements as you need. Something like:

v:=[1,2 ....]

tmp:=augment (v,runif(rows(v),0,1))

tmp:=csort(tmp,1)

r:=tmp<0>

r:=submatrix(r,0,2^k-1,0,0) ... or submatrix(r,ORIGIN,ORIGIN+2^k-1,ORIGIN,ORIGIN)

Stuart
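
For readers without Mathcad, the same "augment, sort, take the first rows" idea can be sketched in Python. This is an illustration of the technique rather than a translation of anyone's worksheet; the function name random_subset and the sample data are assumptions made for the example.

```python
import random

def random_subset(values, count, rng=random):
    """Pick `count` distinct elements by sorting on a random key column."""
    if not 0 <= count <= len(values):
        raise ValueError("count must be between 0 and len(values)")
    # Pair every element with a uniform random key (the "augment" step).
    keyed = [(rng.random(), v) for v in values]
    # Sort on the key only (the "csort" step); this shuffles the rows.
    keyed.sort(key=lambda pair: pair[0])
    # Keep the first `count` values (the "submatrix" step).
    return [v for _, v in keyed[:count]]

data = list(range(555))          # stand-in for the original data vector
picked = random_subset(data, 2 ** 9)
print(len(picked))
```

Because each original element appears exactly once in the sorted table, the selection cannot contain the same position twice, so no duplicate check or retry loop is needed.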

Raiko · ‎Oct 21, 2016

Hello Stuart,

Here is a PDF of my worksheet. By and large, I did it the way you proposed.

Raiko

StuartBruff · ‎Oct 21, 2016

Raiko Milanovic wrote:

Hello Stuart,

Here is a PDF of my worksheet. By and large, I did it the way you proposed.

Raiko

Thanks, Raiko.

A few observations.

Isy will be quicker if you use "return 1" rather than "q <-q+1" and then continuing to check after you've already found a pair of equal indices.

It will (on average) be quicker to create a vector of valid indices and then to use the "augment random" method I outlined previously. This will guarantee there will be no duplicate indices, hence doing away with the need for Isy.

I think you could calculate k directly as floor(log(rows(X),2)).

Stuart
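
As a quick check of the direct calculation of k, here is a small Python sketch applied to the three data set sizes mentioned earlier in the thread. It uses integer bit length instead of a floating-point logarithm, which is a substitution made for this example to avoid rounding at exact powers of two; the function name is hypothetical.

```python
def largest_power_of_two_exponent(n):
    """Return the largest integer k with 2**k <= n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # For a positive integer, bit_length() - 1 equals floor(log2(n)).
    return n.bit_length() - 1

for rows in (555, 999, 1033):
    k = largest_power_of_two_exponent(rows)
    print(rows, k, 2 ** k)
```

This gives k = 9 (512 elements) for both 555 and 999 rows, and k = 10 (1024 elements) for 1033 rows, matching the sizes quoted in the original question.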

Raiko · ‎Oct 21, 2016

Thank you Stuart, it worked

Raiko
