Filtering data set with random integers

Raiko
17-Peridot

Hello fellow MCers,

I have a problem with a data set filter. My aim is a function that takes a given set of data points and creates another set containing 2^k randomly selected elements, where k is an integer.

So I created these functions in MC15. The annoying thing is that it works fine when the number of elements in the data set is close to 2^k, e.g. 525. However, it fails to converge to a solution if the number of elements in the initial set is close to the next 2^k.

E.g., for a data set containing 555 elements, it outputs a 512-element vector, which is fine.

Change the number of data elements to, say, 999, and it has trouble finding a solution, whereas a size of 1033, just past the next 2^k, yields 1024.

My suspicion is that it has to do with the way I'm generating a set of random integers (for the trim function): I invoke the runif function and truncate the real numbers to integers.

Thanks in advance

Raiko
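A rough way to see why the sizes behave so differently is sketched below in Python. The worksheet itself is not shown in the thread, so this rests on an assumption: that the function draws n - 2^k random indices to trim and redraws the whole set until none of them repeat. The name prob_all_distinct is made up for this illustration.

```python
def prob_all_distinct(n, m):
    """Probability that m independent uniform draws from n indices are all distinct."""
    p = 1.0
    for i in range(m):
        p *= (n - i) / n
    return p


# Assumed scheme: trim m = n - 2^k elements, retrying until the m indices are unique.
for n in (555, 999, 1033):
    k = n.bit_length() - 1   # largest k with 2^k <= n
    m = n - 2 ** k           # number of elements that would have to be trimmed
    print(n, 2 ** k, m, prob_all_distinct(n, m))
```

Under that assumption, a single draw succeeds roughly one time in five for 555 elements and more than nine times in ten for 1033, but essentially never for 999, where 487 distinct indices are needed. That would match the reported behaviour, though it is only a guess at what the worksheet does.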

ACCEPTED SOLUTION

Accepted Solutions
StuartBruff
23-Emerald II
(To:Raiko)

Raiko Milanovic wrote:

Hello Stuart,

Here is a PDF of my worksheet. By and large, I did it the way you proposed.

Raiko

Thanks, Raiko.

A few observations. 

Isy will be quicker if you use "return 1" rather than "q <- q+1" and continuing to check after you've found a pair of equal indices.

It will (on average) be quicker to create a vector of valid indices and then to use the "augment random" method I outlined previously.  This will guarantee there will be no duplicate indices, hence doing away with the need for Isy.

I think you could directly calculate k by floor(log (rows (X),2)).

Stuart
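Two of the observations above can be sketched in Python. The worksheet's Isy function is not shown in the thread, so has_duplicate is only a stand-in for the early-return idea, and it swaps the pairwise comparison for a set lookup. largest_power_exponent computes the same value as floor(log(rows(X),2)) but uses integer arithmetic, which is this sketch's choice rather than anything from the thread. Both names are hypothetical.

```python
def largest_power_exponent(n):
    """Largest integer k with 2**k <= n, i.e. floor(log2(n)) without floating point."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n.bit_length() - 1


def has_duplicate(indices):
    """Return True as soon as a repeated index is seen (stand-in for Isy)."""
    seen = set()
    for i in indices:
        if i in seen:
            return True   # stop at the first repeat instead of counting them all
        seen.add(i)
    return False
```

For example, largest_power_exponent(555) gives 9 and largest_power_exponent(1033) gives 10, matching the 512 and 1024 element outputs mentioned in the question.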

StuartBruff
23-Emerald II
(To:Raiko)

Unfortunately, I'm Mathcadless at the moment, so I can't see what you've done. Apologies if you already know this method. One of the easiest ways to pick random elements from a vector is to use runif to create a vector of the same size as the original vector, augment the two vectors, sort on the "runif" column and then extract the (now randomly sorted) "original" column. Then use submatrix on that to get as many elements as you need. Something like:

v:=[1,2 ....]

tmp:=augment (v,runif(rows(v),0,1))

tmp:=csort(tmp,1)

r:=tmp<0>

r:=submatrix(r,0,2^k-1,0,0) ... or submatrix(r,ORIGIN,ORIGIN+2^k-1,ORIGIN,ORIGIN)

Stuart
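For readers without Mathcad, the same "augment random" idea can be sketched in Python: pair each element with a random key, sort on the key, and keep the first 2^k elements. This is an illustration, not a translation of any worksheet. The function name random_subset and its rng parameter are assumptions made for the example.

```python
import random


def random_subset(values, count, rng=random):
    """Pick `count` distinct elements of `values` by sorting on random keys."""
    if not 0 <= count <= len(values):
        raise ValueError("count must be between 0 and len(values)")
    keyed = [(rng.random(), v) for v in values]   # like augment(v, runif(...))
    keyed.sort(key=lambda pair: pair[0])          # like csort on the random column
    return [v for _, v in keyed[:count]]          # like submatrix: first `count` rows


data = list(range(999))
k = len(data).bit_length() - 1                    # largest k with 2^k <= 999
picked = random_subset(data, 2 ** k)
print(len(picked), len(set(picked)))
```

Because each original element appears exactly once in the shuffled list, the selection cannot contain the same position twice, so no duplicate check or retry loop is needed.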

Raiko
17-Peridot
(To:StuartBruff)

Hello Stuart,

Here is a PDF of my worksheet. By and large, I did it the way you proposed.

Raiko

Raiko
17-Peridot
(To:StuartBruff)

Thank you Stuart, it worked

Raiko
