cancel
Showing results for 
Search instead for 
Did you mean: 
cancel
Showing results for 
Search instead for 
Did you mean: 

Community Tip - You can change your system assigned username to something more personal in your community settings. X

Import data from TXT - case study

PawelDabrowski
5-Regular Member

Import data from TXT - case study

Good afternoon,

 

I have a multi-column txt file. I want to perform an operation on two fields. Each of them has the following layout: number/number (eg. 4/5 or 4/5). I need to extract the first character from both of them and then add (eg 4/5 and 4/6 ==> 4 + 5 = 9) The problem is that MathCAD 15 reads the field as mathematical operation and gives me 1.0 or some fraction. How could I import it as a text, pick two characters, convert to INT and then get the result?

 

The second issue with the data is that some parameters are given only every 4th row, not each of them (eg. needed PDOP). I need to perform a simple statistic analysis (mean, stdev, min, max, med). As the columns with data (eg. PDOP) have empty rows I cannot do it.

 

I would be very grateful for your help. Perhaps @WernerE or @Werner_E  would have some advice for me. 🙂

 

Please find attached MathCAD worksheet and a sample of my datafile.

1 ACCEPTED SOLUTION

Accepted Solutions

You may have to press the F9 function key.

Automatic calculation is switched off.

Edit:

I've modified the file per your correction, and switched automatic calculation on.

This should do it.

 

Success!
Luc

View solution in original post

15 REPLIES 15

The second part of your question is easy enough:

(though since all the values are the same it's not clear why you want to perform any stats on them!).

 

g0.jpg

Alan

 

PawelDabrowski
5-Regular Member
(To:AlanStevens)

Thank you Alan,

 

This is just a sample, my data is ~200,000 rows long 🙂

The first part is also not too bad. Essentially it means to read the file as a ";" delimited file and replace entries containing "/" with the result of calculation where "/" is replaced with "+".

See attached.

 

Success!
Luc

In the message the OP wrote "I need to extract the first character from both of them and then add" but the example he gave did differently! In the file once again he wrote

Werner_E_0-1590430466855.png

So maybe he is looking for something along the lines of

Werner_E_0-1590430670284.png

 

 

 

Here is an alternative solution to your second problem. The "on error" statement is necessary only in case that PDOP contains not a single entry with "-" as match would throw an error in that case.

Werner_E_0-1590427846780.png

Your first problem is more difficult. I could only think of two ways which both are not single liners
1) preprocess the input file by a self written filter program or maybe even a batch file to replace every "/" character by something else, probably a semicolon would be best if you are sure that every line of data really contains an expression in the form "a/b"

2) Read the csv file into Mathcad byte by byte (using READBIN e.g.) and write a parser in Mathcad to extract the data in the way you need. Seems to be the more complex but maybe also the more flexible way.

 

PawelDabrowski
5-Regular Member
(To:Werner_E)

Thank you @Werner_E 

PawelDabrowski
5-Regular Member
(To:Werner_E)

Pfff.... you did it just in the second? I am impressed. I did a mistake in the description. @Werner_E was right.

 

I pasted the modifed file in my directory but unfortunately the worksheet doesn't provide results. What is wrong?

You may have to press the F9 function key.

Automatic calculation is switched off.

Edit:

I've modified the file per your correction, and switched automatic calculation on.

This should do it.

 

Success!
Luc

PawelDabrowski
5-Regular Member
(To:LucMeekes)

Thank you, it works now. One final minor correction needed is to skip the first two lines in file. I tried to change the range to ORIGIN+2 but I was unsuccessful. With this, all would work perfectly.

Yes, that doesn't work. Try using the submatrix function on the result.

 

Success!

Luc


I tried to change the range to ORIGIN+2 but I was unsuccessful.

It would work if you additionally replace every R[i by R[i-2.

But I think using "submatrix" either on the result or on the input vectors col20 and col21 is a cleaner solution.
You could simply replace the last R by

Werner_E_3-1590461601118.png

 

PawelDabrowski
5-Regular Member
(To:Werner_E)

Thank you @Werner_E , as always your help is priceless

Hi,

 

Pre-process the file.

remove first line sep=; with a text editor.  I use Edit Pad Classic.  it is free and handles large files easily.

headers are now on top row.

Pre-process the file again

Open in excel in the resulting wizard screens use delimited and use "semicolon" and "/" as special delimiter.

Capture.JPG

Capture2.JPG

Capture3.JPG

Now save this excel file as comma separated text file.

This pre-processing takes little time.

Now open in Mathcad using READCSV

On opening the 4/6 etc are seperated into 4 and 6 as seperate columns.

 

Easy

 

Cheers

Terry 

 

Hi,

Doing the pre-processing I also note the first line of actual data has one extra blank column?

This extra had to be taken out manually.

Cheers Terry

Thank you for your help

Top Tags