I have a CSV file from the US Census that looks like this:
"ZIP5","ZIP4","ZIP9","STATE CODE","STATE","COUNTY CODE","COUNTY NAME","CBSA CODE","CBSA TITLE","CBSA LSAD","METRO DIVISION CODE","METRO DIVISION TITLE","METRO DIVISION LSAD","CSA CODE","CSA TITLE","CSA LSAD"
"04841",,"04841","23","ME","013","Knox County","40500","Rockland, ME","Micropolitan Statistical Area",,,,,,
"04843",,"04843","23","ME","013","Knox County","40500","Rockland, ME","Micropolitan Statistical Area",,,,,,
"04846",,"04846","23","ME","013","Knox County","40500","Rockland, ME","Micropolitan Statistical Area",,,,,,
"04847",,"04847","23","ME","013","Knox County","40500","Rockland, ME","Micropolitan Statistical Area",,,,,,
"04848",,"04848","23","ME","027","Waldo County",,,,,,,,,
"04849",,"04849","23","ME","027","Waldo County",,,,,,,,,
"04850",,"04850","23","ME","027","Waldo County",,,,,,,,,
"04851",,"04851","23","ME","013","Knox County","40500","Rockland, ME","Micropolitan Statistical Area",,,,,,
"04852",,"04852","23","ME","015","Lincoln County",,,,,,,,,
The file has over 2 million records. Most of the records don't have data in all the fields.
Here is the MySQL record layout I defined for the above CSV file:
+----------------------+------------------+------+-----+---------+----------------+
| Field                | Type             | Null | Key | Default | Extra          |
+----------------------+------------------+------+-----+---------+----------------+
| id                   | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| ZIP5                 | varchar(5)       | NO   |     | NULL    |                |
| ZIP4                 | varchar(5)       | NO   |     | NULL    |                |
| ZIP9                 | varchar(10)      | NO   |     | NULL    |                |
| STATE_CODE           | varchar(2)       | NO   |     | NULL    |                |
| STATE                | varchar(2)       | NO   |     | NULL    |                |
| COUNTY_CODE          | varchar(3)       | NO   |     | NULL    |                |
| COUNTY_NAME          | varchar(50)      | NO   |     | NULL    |                |
| CBSA_CODE            | varchar(5)       | NO   |     | NULL    |                |
| CBSA_TITLE           | varchar(50)      | NO   |     | NULL    |                |
| CBSA_LSAD            | varchar(50)      | NO   |     | NULL    |                |
| METRO_DIVISION_CODE  | varchar(5)       | NO   |     | NULL    |                |
| METRO_DIVISION_TITLE | varchar(50)      | NO   |     | NULL    |                |
| METRO_DIVISION_LSAD  | varchar(50)      | NO   |     | NULL    |                |
| CSA_CODE             | varchar(3)       | NO   |     | NULL    |                |
| CSA_TITLE            | varchar(50)      | NO   |     | NULL    |                |
| CSA_LSAD             | varchar(50)      | NO   |     | NULL    |                |
+----------------------+------------------+------+-----+---------+----------------+
(I just realized: should I define ZIP5 as the primary key?)
I have read that if you have an empty field in a CSV file, you should change it to \N, but is there a way to do this easily? I could write a PHP program to do this, but with over 2 million records it would take a very long time and my server doesn't have a lot of RAM.
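For what it's worth, here is a minimal sketch of the kind of conversion I mean, written in Python instead of PHP purely for illustration. The function name `blank_to_null` is my own (hypothetical), and the sample row is one of the Waldo County lines from above. It handles one row at a time, so memory use should stay small, though I have not timed it on the full file.

```python
import csv
import io


def blank_to_null(src, dst, null_token=r"\N"):
    """Copy CSV rows from src to dst, replacing empty fields with null_token.

    Rows are read and written one at a time, so the whole file is never
    held in memory.
    """
    reader = csv.reader(src)
    # Minimal quoting: only fields that contain a comma or a quote are
    # quoted on output, so the null token itself is written unquoted.
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow([field if field != "" else null_token for field in row])


# Small in-memory demonstration using one row from the sample data.
sample = '"04848",,"04848","23","ME","027","Waldo County",,,,,,,,,\n'
out = io.StringIO()
blank_to_null(io.StringIO(sample), out)
print(out.getvalue(), end="")
```

On the real file I would presumably pass two file objects opened with `newline=""` (one for reading, one for writing) in place of the `StringIO` objects.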
What is the easiest way to import this CSV file into MySQL successfully? Are there parameters on MySQL's LOAD DATA command that would handle this? As it stands, the import complains that ZIP5 has data truncation, and when I look in MySQL the zip code contains the quote characters and only the first 4 digits. Thanks!