Decimation (downsampling) of ASCII (txt) input files

ClaudioPedrazzi
11-Garnet

Hi everybody,

although I searched through past questions, I could not find anything suitable. I need to reduce the sampling rate of a set of files, if possible with a variable reduction factor. For example, setting reduction := 60 would give me one value every minute instead of one value every second, and so on. All my attempts fail with the error "insufficient memory for the required operation".

I built an example with a limited amount of data to show what I am trying to do. But the real files can reach 1.5 million lines, which also makes them unreadable with Excel.

Moreover, the procedure seems to me very slow.  There has to be a better approach, but at the moment I cannot see it.
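For readers who want to try the same decimation outside Mathcad, here is a minimal Python sketch of the idea described above: keep every `reduction`-th line while streaming the file, so the whole file never has to fit in memory. The function names, the `offset` and `header_lines` parameters, and the file paths are assumptions made for illustration; this is not the worksheet discussed in the thread.

```python
from itertools import islice


def decimate_lines(lines, reduction, offset=0):
    """Return an iterator over every `reduction`-th item, starting at `offset`."""
    if reduction < 1:
        raise ValueError("reduction must be >= 1")
    # islice consumes the input lazily, one item at a time.
    return islice(lines, offset, None, reduction)


def decimate_file(src_path, dst_path, reduction, header_lines=0):
    """Copy header lines unchanged, then keep every `reduction`-th data line.

    Returns the number of data lines written.
    """
    kept = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for _ in range(header_lines):
            dst.write(next(src, ""))  # tolerate files shorter than the header
        for line in decimate_lines(src, reduction):
            dst.write(line)
            kept += 1
    return kept
```

With one sample per second in the input, a call such as `decimate_file("input.txt", "output.txt", 60)` (hypothetical file names) would keep one line per minute. Because only one line is held in memory at a time, the file size should not matter much.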

Thanks a lot in advance for any hint!

Regards

Claudio

PS: a small side question: is there an easy Mathcad way to convert the time stamp "hh:mm:ss" into a normal number (for example, seconds from midnight)?
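The conversion asked about in the PS is simple arithmetic: seconds from midnight = 3600·hh + 60·mm + ss. A minimal Python sketch follows, assuming the stamp is always a 24-hour "hh:mm:ss" string; the function name is an assumption for illustration and is not the Mathcad function supplied later in the thread.

```python
def hms_to_seconds(stamp):
    """Convert an 'hh:mm:ss' time stamp to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in stamp.strip().split(":"))
    if not (0 <= hours < 24 and 0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid time of day: {stamp!r}")
    return hours * 3600 + minutes * 60 + seconds
```

For example, `hms_to_seconds("00:01:31")` returns 91. Note that the result drops back toward zero whenever the time stamps pass midnight.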

Accepted solution:

Does the attached help at all?

Alan

Hi Alan,

thanks a lot!!!

I can understand what you did. The idea with an index (j) having an index itself (ir) is really interesting, I never would have thought of it!

I have tried your version with one of my production files, one with 1.2 million lines. It works! If I then change the file name and try to execute the worksheet again for a new file, I get "insufficient memory"... but this is a minor inconvenience: I just have to close Mathcad and restart it, and then it works.

Thank you also for the function for computing the number of seconds. I have a question about it. It probably has nothing to do with the function itself, but I would still like to understand.

When I use WRITEPRN to write out the file, the data seem correct in Mathcad but appear wrong in the file. More precisely, they appear wrong as soon as the time goes back to 00:00:00, which can happen in my actual production data. I just cannot understand that! This is certainly an error in the WRITEPRN function. I include a picture of the problem. The "yellow line" is the first wrong one: the value 91 seconds should go in the first column!

[Attached image: Unbenannt.JPG]

If you (or anyone) have any suggestion as to what I am doing wrong here, it would also be appreciated!

Best regards

Claudio

I understood the problem with WRITEPRN. In my opinion, it is not intended for writing out files with a given number of columns. It is just intended for writing out a file that is readable with READPRN. I tested this by rereading the file and subtracting the two matrices. They match. I was confused because I wanted a file similar to my input one, with a constant number of columns.

I guess if I want that, I will have to use other functions. Still no idea which ones.

Regards

Claudio
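For comparison, outside Mathcad, a file with a constant number of columns can be produced by writing each matrix row as exactly one text line, with no limit on the line length. The following Python sketch assumes tab-separated output and six significant digits; the function names, separator, and number format are illustrative choices and say nothing about how WRITEPRN itself works.

```python
def format_row(values, sep="\t", fmt="{:.6g}"):
    """Format one matrix row as a single line of text (without a line ending)."""
    return sep.join(fmt.format(v) for v in values)


def write_matrix(path, rows, sep="\t", fmt="{:.6g}"):
    """Write each row of `rows` on its own line, however long that line gets."""
    with open(path, "w") as out:
        for row in rows:
            out.write(format_row(row, sep, fmt) + "\n")
```

Since a line break is only ever written at the end of a row, every line of the output has the same number of fields as the corresponding matrix row.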

Claudio Pedrazzi wrote:

When I use WRITEPRN to write out the file, the data seem correct in Mathcad but appear wrong in the file. More precisely, they appear wrong as soon as the time goes back to 00:00:00, which can happen in my actual production data. I just cannot understand that! This is certainly an error in the WRITEPRN function. I include a picture of the problem. The "yellow line" is the first wrong one: the value 91 seconds should go in the first column!

One way around this is as follows:

[Attached image: output.PNG]

However, the downside to this is that you have to type the filename directly into the file component; you can't use a generic variable name.  Swings and Roundabouts!

Alan

Thanks Alan,

that is a possibility... although I was planning (once all problems are solved) some kind of loop that reads and writes files, so I would like the flexibility of WRITEPRN.

But I will keep this way in mind also.

Claudio

Claudio Pedrazzi wrote:

I understood the problem with WRITEPRN. In my opinion, it is not intended for writing out files with a given number of columns. It is just intended for writing out a file that is readable with READPRN. I tested this by rereading the file and subtracting the two matrices. They match. I was confused because I wanted a file similar to my input one, with a constant number of columns.

Interesting. I modified the worksheet to include a time rollover. I used Notepad++ to look at the rollover and the WRITEPRN file looked OK to me ...

Stuart

Hi Stuart,

actually I would like to use WRITEPRN (because of the flexibility of using the name as a string variable). Your worksheet, downloaded and executed on my PC (Windows 7), gives the following 😞

[Attached image: Unbenannt.JPG]

I also use Notepad++, and I activated the option to show all special characters. But it has nothing to do with Notepad++: the problem is also visible if one opens the file with Excel.

Thanks for modifying the "sample" dataset to show the problem! This way we can test and exchange results.

Could I have a setting somewhere that differs from yours? I also experimented a little with PRNCOLWIDTH and PRNPRECISION, but that does not change the problem. What seems to happen is that WRITEPRN fills the line up to the column nearest to 80 and then inserts a CR+LF. Possibly this "limit" of 80 columns is somewhere in the settings, but I do not know where!

Best regards

Claudio
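One way to check the suspicion about wrapping near column 80, independently of any editor, is to count the fields on each line and measure the longest line of the written file. The Python sketch below assumes whitespace-separated values; the function names are made up for illustration.

```python
from collections import Counter


def field_count_histogram(lines):
    """Map 'number of fields on a line' to 'how many lines have that count'."""
    return Counter(len(line.split()) for line in lines if line.strip())


def longest_line(lines):
    """Length in characters of the longest line, excluding CR/LF."""
    return max((len(line.rstrip("\r\n")) for line in lines), default=0)
```

If rows were being wrapped, the histogram would show more than one field count, and the longest line would stay close to the suspected limit. A file with a constant number of columns yields a histogram with a single entry.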

"But it has nothing to do with Notepad++: the problem is visible also if one opens the file with Excel. "

It works ok if you use the WRITEEXCEL function - see attached (however, I don't know if EXCEL can cope with your large files).

Alan

Alan, thanks a lot!

It is even better than my original idea! I simply did not think of the EXCEL read/write functions.

Concerning the limits of EXCEL (around 1.05 million lines), well, this is one of the reasons why I had to "invent" this whole downsampling application. The original files are not readable with EXCEL, but the downsampled ones are. And if I later need to reread the files to glue the downsampled ones together, I can do that with READEXCEL.

The problem with WRITEPRN remains a little mystery, but at this point I don't care anymore. I wrote a separate test application only for the WRITEPRN function, with internally generated data but a matrix similar to the one in my "Example.out", and, guess what, it works perfectly and goes quietly past the 80th column!

[Attached image: Unbenannt.JPG]

I really appreciate the help received from all of you! This Forum is really really helpful!

PS: a small side question: is there an easy Mathcad way to convert the time stamp "hh:mm:ss" into a normal number (for example, seconds from midnight)?

See here: Date Calendar and Time functions.mcd - PTC Community

Thanks a lot, Richard,

I was not aware of that one.
