Three files are attached:
DATE.csv
DATA.csv
DATE.xmcd
When reading the two csv files into the Mathcad file, the first element of the DATA file is corrupted and looks like
However, the DATE.csv file reads in correctly as:
What is causing the corrupted first element when reading in the DATA.csv file?
Thanks for any help provided.
Reg Curry
Hi,
I retyped the date that is the first entry in the DATA file, and this eliminated the three rogue characters.
👍
Thanks much.
Terry,
I cannot just retype the 0,0 entry in the csv file. I had to do it this way. Am I missing something?
Your problem occurs because there is actually some data in front of the first date, in the DATA file.
I read the first 16 bytes of each file to show:
Note that 49 is the ASCII character "1", that's where the date starts.
Whatever you used to generate those .CSV files has put three extra bytes, with values 239, 187 and 191, in front of the first date.
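For readers without Mathcad at hand, the same check can be sketched in Python. This is only an illustration of the idea of looking at the raw bytes, not the worksheet used above; the function names `first_bytes` and `starts_with_utf8_bom` are made up for this example, and the file names are the ones attached to this thread.

```python
from pathlib import Path

# The three bytes reported above: 239, 187, 191 (hex EF BB BF).
# This sequence is the UTF-8 byte order mark (BOM).
UTF8_BOM = bytes([239, 187, 191])


def first_bytes(path, count=16):
    """Return the first `count` bytes of a file as a list of integers."""
    with open(path, "rb") as handle:
        return list(handle.read(count))


def starts_with_utf8_bom(path):
    """Return True if the file begins with the bytes 239, 187, 191."""
    with open(path, "rb") as handle:
        return handle.read(3) == UTF8_BOM


if __name__ == "__main__":
    # File names taken from the thread; skipped if not present.
    for name in ("DATE.csv", "DATA.csv"):
        if Path(name).exists():
            print(name, first_bytes(name), starts_with_utf8_bom(name))
```

A file that starts directly with the date should show 49 (the character "1") as its first value, while a file with the extra bytes should show 239, 187, 191 first.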
Success!
Luc
Luc,
I really appreciate your help; however, I do not understand. Recall my post of my COVID-19 research on 10-21-2021. These are csv files that I downloaded from the CDC. Right after 10-21-2021, I had to have two emergency surgeries. I am just now recovering to the point that I am resurrecting that research. Before the surgeries, everything worked fine; now it doesn’t. I have not changed anything other than downloading the updated CDC files. Can you give a little more information on how you viewed the ASCII codes in the files? I don’t understand how the spurious entries are getting into the file. Perhaps the CDC changed something. If they are there, why don’t I see them in the raw downloaded csv file? For now, until I understand what’s going on, I will have to use Terry’s solution.
Reg
PS: Forget my first reply.
Hi Reg,
I downloaded the two .CSV files that you attached at the start of this thread. I opened them with Notepad and saw nothing apparently wrong.
Then I started Mathcad and used READBIN() to take the first 16 bytes out of each of the two files and display the data values.
The extra bytes at the start are in that one .CSV file, as shown in my previous reaction. I opened the same file using Excel, and it also starts with the date, showing no signs of extra characters. Apparently these characters don't show normally.
I guess, if you want to 'repair' the DATA file using Notepad, you'll have to copy the entire data set (the 'text' in Notepad), and paste it to a new (Notepad) file, which you can save under the same (overwrite), or a new file name.
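If the files are refreshed often, the repair could also be scripted instead of done by hand. The following is a minimal sketch in Python, assuming the only problem is the three leading bytes 239, 187, 191; the function name `strip_utf8_bom` is hypothetical, and `codecs.BOM_UTF8` is the standard-library constant for exactly those three bytes.

```python
import codecs


def strip_utf8_bom(source_path, target_path):
    """Copy a file, dropping a leading UTF-8 BOM if one is present.

    Returns True if a BOM was found and removed, False otherwise.
    The source and target may be the same path (overwrite).
    """
    with open(source_path, "rb") as source:
        data = source.read()

    had_bom = data.startswith(codecs.BOM_UTF8)
    if had_bom:
        # Drop only the three marker bytes; the rest is left untouched.
        data = data[len(codecs.BOM_UTF8):]

    with open(target_path, "wb") as target:
        target.write(data)
    return had_bom
```

Working on raw bytes keeps line endings and all other content exactly as downloaded, so only the three marker bytes change.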
Success!
Luc
Hi Reg,
I used an old program called "Programmer's File Editor".
The program shows spurious characters in files, so you can delete them.
Program is freeware and has proved useful to me a number of times.
https://www.lancaster.ac.uk/~steveb/cpaap/pfe/pfefiles.htm
Cheers
Terry
Hi,
I did not read all replies ... so I am not sure whether the following information is new to you.
I opened both csv files in Notepad++.
DATE FILE.csv ... file format is UTF-8
CDC COVID-19 DATA.csv ... file format is UTF-8-BOM ... this explains 3 "invisible" characters at the beginning of the file
CDC COVID-19 DATA.csv ... file size = 31 239 bytes
SOLUTION
In Notepad++
Thanks; however, since I do this frequently I found another solution. I left the headers in the original file and used submatrix to omit the header rather than deleting that first row. It solved the problem. For some reason, deleting the header row introduced the problem. Not sure why it never did that months ago.
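One possible explanation, not confirmed in this thread: a UTF-8 BOM always sits at the very start of the file, so with the header row left in place it attaches to the first header cell, and discarding the header discards it too. The Python sketch below illustrates that idea only; the sample bytes and the name `read_rows` are made up for the example and are not the CDC data or the Mathcad worksheet.

```python
import csv
import io


def read_rows(raw_bytes, encoding="utf-8"):
    """Decode CSV bytes and return the rows as lists of strings."""
    text = raw_bytes.decode(encoding)
    return list(csv.reader(io.StringIO(text, newline="")))


# Made-up sample: a BOM, a header row, and one data row.
sample = b"\xef\xbb\xbfDate,Cases\r\n1/1/2021,10\r\n"

rows = read_rows(sample)
header, data = rows[0], rows[1:]

# The BOM ends up glued to the first header cell ...
print(repr(header[0]))
# ... so the data rows are clean once the header is skipped.
print(data)

# Decoding with "utf-8-sig" drops the BOM while reading.
print(repr(read_rows(sample, encoding="utf-8-sig")[0][0]))
```

Skipping the first row here plays the same role as omitting the header with submatrix.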