HELP; CORRUPTED INPUT DATA FROM CSV FILE

regcurry
16-Pearl

Three files are attached:

DATE.csv

DATA.csv

DATE.xmcd

 

When reading the two csv files into the Mathcad file, the first element of the DATA file is corrupted and looks like

regcurry_0-1652478161678.png

However, the DATE.csv file reads in correctly as:

regcurry_1-1652478229071.png

 

What is causing the corrupted first element when reading in the DATA.csv file?

 

Thanks for any help provided.

Reg Curry

 

 

 

Reg
ACCEPTED SOLUTION

Hi,

I retyped the date that is the first entry in the data file, and this eliminated the three rogue characters.

Capture.JPG

👍

Thanks much.

Reg

Terry,

I cannot just retype the 0,0 entry in the csv file.  I had to do it this way.  Am I missing something?

 

 

regcurry_1-1652546286544.png

 

 

Reg
LucMeekes
23-Emerald III
(To:regcurry)

Your problem occurs because there is actually some data in front of the first date, in the DATA file.

I read the first 16 bytes of each file to show:

LucMeekes_1-1652510286983.png

Note that 49 is the ASCII character "1", that's where the date starts.

Whatever you used to generate those .CSV files has put three extra bytes, with values 239, 187 and 191, in front of the first date.

 

Success!
Luc
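
The byte check described above can also be reproduced outside Mathcad. What follows is a minimal Python sketch, not the method used in the thread (that was READBIN in Mathcad). The helper names `leading_bytes` and `has_utf8_bom` are hypothetical, and the demo writes its own small made-up sample rather than reading the real CDC file. The values 239, 187 and 191 are the UTF-8 byte order mark, which Python exposes as `codecs.BOM_UTF8`.

```python
import codecs
import os
import tempfile


def leading_bytes(path, count=16):
    """Return the first `count` bytes of a file as a list of integers."""
    with open(path, "rb") as handle:
        return list(handle.read(count))


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM (239, 187, 191)."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8


# Demo on a made-up sample file (assumption: not the real DATA.csv contents).
sample_dir = tempfile.mkdtemp()
sample_path = os.path.join(sample_dir, "DATA.csv")
with open(sample_path, "wb") as handle:
    handle.write(codecs.BOM_UTF8 + b"1/22/2020,0\r\n")

print(leading_bytes(sample_path))  # begins with 239, 187, 191, then 49 for "1"
print(has_utf8_bom(sample_path))
```

Pointing `leading_bytes` at each of the two attached files should show the same difference Luc found: one file starts directly with the date, the other with three extra bytes.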

 

Thanks. That’s the problem. The data starts at day 50 instead of day 0. I’ll fix that.

Thanks again.🙏🏼
Reg

Luc,

I really appreciate your help; however, I do not understand.  Recall my post about my COVID-19 research on 10-21-2021.  These are csv files that I downloaded from the CDC.  Right after 10-21-2021, I had to have two emergency surgeries, and I am only now recovered enough to resurrect that research.  Before the surgeries everything worked fine; now it doesn’t.  I have not changed anything other than downloading the updated CDC files.

Can you give a little more information on how you viewed the ASCII codes in the files?  I don’t understand how the spurious entries are getting into the file.  Perhaps the CDC changed something.  If they are there, why don’t I see them in the raw downloaded csv file?  For now, until I understand what’s going on, I will have to use Terry’s solution.

Reg

 

PS:  Forget my first reply.

Reg
LucMeekes
23-Emerald III
(To:regcurry)

Hi Reg,

 

I downloaded the two .CSV files that you attached at the start of this thread and opened them with Notepad, where nothing looked wrong.

Then I started Mathcad and used READBIN() to take the first 16 bytes out of each of the two files and display the data values.

The extra bytes at the start are in that one .CSV file, as shown in my previous reaction. I opened the same file using Excel, and it also starts with the date, showing no signs of extra characters. Apparently these characters don't show normally.

I guess, if you want to 'repair' the DATA file using Notepad, you'll have to copy the entire data set (the 'text' in Notepad) and paste it into a new (Notepad) file, which you can save under the same name (overwriting the original) or under a new file name.

 

Success!
Luc

 

 

Luc,
Oh! Thanks 🙏🏼. I will try that.
Reg

Hi Reg,

 

I used an old program called "Programmer's File Editor".

The program shows spurious characters in files, so you can delete them.

It is freeware and has proved useful to me a number of times.

https://www.lancaster.ac.uk/~steveb/cpaap/pfe/pfefiles.htm

Cheers

Terry

 

Capture.JPG

Thanks again for your help.
Reg

Hi,

I did not read all the replies, so I am not sure whether the following information is new to you.

I opened both csv files in Notepad++.

DATE FILE.csv ... file format is UTF-8

MartinHanak_0-1655015449842.png

CDC COVID-19 DATA.csv ... file format is UTF-8-BOM ... this explains 3 "invisible" characters at the beginning of the file

MartinHanak_1-1655015615579.png

CDC COVID-19 DATA.csv ... file size = 31 239 bytes
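
The difference between the two formats can be seen at the decoding step. The sketch below is illustrative only and uses a made-up two-row sample, not the real CDC data. It relies on Python's standard `utf-8` and `utf-8-sig` codecs: the first keeps the byte order mark as an invisible U+FEFF character at the start of the text, the second drops it when present.

```python
import codecs

# Made-up sample (assumption): two data rows preceded by a UTF-8 BOM.
raw = codecs.BOM_UTF8 + b"1/22/2020,0\r\n1/23/2020,1\r\n"

plain = raw.decode("utf-8")      # BOM survives as the character U+FEFF
clean = raw.decode("utf-8-sig")  # BOM is dropped if present

print(repr(plain[:10]))  # first field still carries the invisible prefix
print(repr(clean[:10]))  # first field starts directly with the date

# Re-encoding the cleaned text gives a payload exactly 3 bytes shorter.
print(len(raw) - len(clean.encode("utf-8")))
```

A program that treats the file as plain bytes or plain UTF-8 sees the extra prefix glued to the first field, which is consistent with the corrupted first element reported at the top of the thread.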

 

SOLUTION

In Notepad++

  • create a new file
  • select the entire contents of the file CDC COVID-19 DATA.csv using CTRL+A
  • copy the selection to the clipboard using CTRL+C
  • paste the clipboard contents into the new file using CTRL+V
  • save the new file as new CDC COVID-19 DATA.csv
  • new CDC COVID-19 DATA.csv ... file format = UTF-8 ... file size = 31 236 bytes (3 "invisible" characters were removed)
  • import new CDC COVID-19 DATA.csv into Mathcad
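
For files that are downloaded repeatedly, the manual steps above can be scripted. This is a minimal Python sketch under stated assumptions: the function name `strip_utf8_bom` and the file names are hypothetical, and the demo runs on a small made-up file rather than the CDC download. It copies the file byte for byte and drops only a leading UTF-8 BOM.

```python
import codecs
import os
import tempfile


def strip_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM if present.

    Returns the number of bytes removed (0 or 3).
    """
    with open(src_path, "rb") as src:
        raw = src.read()
    removed = 0
    if raw.startswith(codecs.BOM_UTF8):
        removed = len(codecs.BOM_UTF8)
        raw = raw[removed:]
    with open(dst_path, "wb") as dst:
        dst.write(raw)
    return removed


# Demo on a made-up sample file (assumption: not the real CDC data).
work_dir = tempfile.mkdtemp()
bom_path = os.path.join(work_dir, "with_bom.csv")
fixed_path = os.path.join(work_dir, "without_bom.csv")
with open(bom_path, "wb") as handle:
    handle.write(codecs.BOM_UTF8 + b"1/22/2020,0\r\n")

removed = strip_utf8_bom(bom_path, fixed_path)
print(removed, os.path.getsize(bom_path) - os.path.getsize(fixed_path))
```

As in the Notepad++ procedure, the cleaned file is 3 bytes smaller than the original, and a file without a BOM is copied unchanged.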

 


Martin Hanák

Thanks; however, since I do this frequently I found another solution.  I left the headers in the original file and used submatrix to omit the header rather than deleting that first row.  It solved the problem.  For some reason, deleting the header row introduced the problem.  Not sure why it never did that months ago.

Reg
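
One possible explanation for why keeping the header row helps, offered as an assumption rather than a confirmed diagnosis: a byte order mark sits at the very start of the file, so it attaches to the first field of the first row only. If that row is a header that is discarded after reading, the data rows are unaffected. The Python sketch below illustrates this with a made-up sample; the column names `Date` and `Cases` are hypothetical.

```python
import codecs
import csv
import io

# Made-up sample (assumption): a header row and two data rows after a BOM.
raw = codecs.BOM_UTF8 + b"Date,Cases\r\n1/22/2020,0\r\n1/23/2020,1\r\n"

# Decode as plain UTF-8 so the BOM is kept, as a BOM-unaware reader would.
text = raw.decode("utf-8")
rows = list(csv.reader(io.StringIO(text, newline="")))

header, data = rows[0], rows[1:]

print(repr(header[0]))   # header cell carries the invisible U+FEFF prefix
print(repr(data[0][0]))  # first data cell is clean
```

If the header row is deleted in an editor that keeps the BOM, the mark ends up in front of the first date instead, which would match the behaviour described in this thread.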