Clean data in a csv file


#1

I am working with a dataset which has crime data for Chicago

There are a few lines where multiple values are separated by comma and put in double quotes. Is there a way to clean the data so that the text under double quotes can be read as one column

The text is below and the columns in bold is what I want to read as a single column

10366565,HZ102660,01/03/2016 01:50:00 PM,020XX S WABASH AVE,1310,CRIMINAL DAMAGE,TO PROPERTY,“SCHOOL, PRIVATE, BUILDING”,false,false,0131,001,3,33,14,1177070,1890608,2016,01/10/2016 08:46:55 AM,41.855167994,-87.625552607,"(41.855167994, -87.625552607)"


#2

This might not be the most pythonic way to clean your data but it works.

import csv
with open("dataset.csv", 'r+') as file:
   csvfile = csv.reader(file)
   list_csv = list(csvfile)
   for i in range(len(list_csv)):
       list_csv[i][7] = list_csv[i][7] + list_csv[i][8] + list_csv[i][9]
       del list_csv[i][8]
       del list_csv[i][8]
print(list_csv)