I work with huge CSV data files and plan to run a few checks before inserting the data line by line into MySQL using Python. Because the files are very large, opening them takes a long time, so my aim is to load them without inspecting them manually and let Python do the analysis for me. I have started writing the code but got stuck while inserting the data. I'm sure this is a basic issue, but I can't figure it out as I'm fairly new to Python. The demo data:
id,first_name,last_name,email,boole,coin
1,Emilio,Pettie,[email protected],true,1Lj8Z4Em68hwqRAUXZKW7C7h2KgH5cGpTe
2,Raynard,Fairholme,[email protected],true,1AEwLuECKYD1Bb6EGaBQC1TJS1mtvHBmy3
3,Zonda,Bampkin,[email protected],false,14AHvnRjXExdgfqZBnWUyVi7aWZR8SFBoL
4,Thurstan,Sherville,[email protected],true,19iiiJ53zxmJnbmW7gKH2hoMwpiaqkit8E
5,Jonathan,Jewkes,[email protected],false,18E22TTK68ukQVLWK6oZNfFbzP2uHqaW7o
6,Dolores,Carmichael,[email protected],false,15BBePy5J3WY1QQLTjA79iYQMjDRubv2BD
7,Kleon,Wesker,[email protected],false,1NfYtAuq6M3cXGhDJuDBnCjdEBRSKsfRVJ
8,Laureen,Writtle,[email protected],true,14UgbrWz9wi2UptALs2dFeQRdUiMaLee57
9,Gypsy,Coombes,[email protected],true,1Hn3JBtjytwbBMVJgM7ixAi1sXf56KFM3R
10,Kevina,Boulger,[email protected],false,1GABbcoRTVsX1qzD8uiGtsPtuD1kvzokK1
The code:
import string
import csv
import mysql.connector

mydb = mysql.connector.connect(host="localhost", user="root", password="password", autocommit=True)
mycursor = mydb.cursor()
sql_str = ''
sql_str1 = ''
mycursor.execute("drop table if exists rd.data")
with open(r"C:\Users\rcsid\Documents\Office Programs\Working prog\MOCK_DATA.csv") as csvfile:
    csv_reader = csv.DictReader(csvfile)
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            sql_str = f'create table rd.data ( {" varchar(50), ".join(row)} varchar(50))'
            mycursor.execute(sql_str)
        sql_str1 = f'insert into rd.data values ( {", ".join(row)})'
        print(sql_str1)
        mycursor.execute(sql_str1)
        line_count += 1
I was able to create the table with the header columns, but I am unable to load the data. The print(sql_str1) output is:
insert into rd.data values ( id, first_name, last_name, email, boole, coin)
insert into rd.data values ( id, first_name, last_name, email, boole, coin)
insert into rd.data values ( id, first_name, last_name, email, boole, coin)
insert into rd.data values ( id, first_name, last_name, email, boole, coin)
All the inserted values are NULL. Can you please let me know how to capture the data from the CSV? I know this may be basic syntax. I also know the syntax cur.execute('INSERT INTO table (columns) VALUES(%s, ....)', row), but I don't want to use it because I would need to open the file to check the header.
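For reference, here is a minimal sketch of one possible direction, not a definitive fix. It relies on two things: iterating over a DictReader row yields its keys (the column names), which is why the printed statements contain only the header, and DictReader exposes the header as csv_reader.fieldnames, so the %s placeholders can be built without opening the file by hand. The helper name build_statements and the shortened in-memory demo data are hypothetical and exist only for illustration. The sketch assumes the file has a header row and that every column is varchar(50), as in the code above.

```python
import csv
import io


def build_statements(csvfile, table="rd.data"):
    """Return (create_sql, insert_sql, rows) for a CSV file object with a header row.

    Hypothetical helper for illustration; it only builds strings and tuples
    and does not talk to MySQL.
    """
    reader = csv.DictReader(csvfile)
    columns = reader.fieldnames  # the header, read by DictReader itself

    # Identifiers cannot be bound as parameters, so they are placed in the
    # statement text, quoted with backticks.
    col_defs = ", ".join(f"`{c}` varchar(50)" for c in columns)
    create_sql = f"create table {table} ({col_defs})"

    # One %s placeholder per column; the driver fills in the values.
    placeholders = ", ".join(["%s"] * len(columns))
    insert_sql = f"insert into {table} values ({placeholders})"

    # Iterating a dict yields its keys, so the data must be taken from
    # row[column] (or row.values()), not from the row itself.
    rows = [tuple(row[c] for c in columns) for row in reader]
    return create_sql, insert_sql, rows


# Shortened in-memory stand-in for the real CSV file (assumption).
demo = io.StringIO(
    "id,first_name,boole\n"
    "1,Emilio,true\n"
    "2,Raynard,true\n"
)
create_sql, insert_sql, rows = build_statements(demo)
print(create_sql)
print(insert_sql)
print(rows)

# With the connection from the question this could then be run as, for example:
#     mycursor.execute(create_sql)
#     mycursor.executemany(insert_sql, rows)
```

The sketch collects all rows in a list to keep it short. For very large files it would probably be better to send the rows to executemany in batches instead of holding the whole file in memory.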