2017-04-27 CSV

posted Apr 27, 2017, 7:16 AM by Samuel Konstantinovich   [ updated Apr 27, 2017, 8:31 AM ]
More CSV file parsing!

Attached to this post are two files, one for Linux and one for Windows.


From Yesterday: 
Convert a csv string into a list of lists:
text = '''a,b,c
d,e,f
1,2,3'''

1. Break it up line by line (split on newlines).
2. Replace each line with a list (split on commas).

These 2 steps are absolutely critical. They are common, and should not take very long. If you don't remember how to do them, study the following:

Lists = text.split("\n")          # step 1: one string per line
i = 0
while i < len(Lists):
  Lists[i] = Lists[i].split(",")  # step 2: replace the line with a list of its fields
  i+=1

This is not a very long block of code, but if you don't understand a part of it, then it might take you a long time to replicate.
Alternate loop:

Lists = text.split("\n")
for i in range(len(Lists)):
  Lists[i] = Lists[i].split(",")
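
The same two steps can also be wrapped in a function. This is only a sketch: the name parse_csv is made up for illustration, and it uses splitlines() instead of split("\n") on the assumption that you may need to handle Windows-style line endings ("\r\n") as well as Linux ones.

```python
def parse_csv(text):
    """Turn a CSV string into a list of lists of strings.

    parse_csv is a hypothetical helper name, not something from class.
    """
    rows = []
    # splitlines() breaks on "\n" and on "\r\n", so no stray "\r" is left behind
    for line in text.splitlines():
        rows.append(line.split(","))
    return rows


text = '''a,b,c
d,e,f
1,2,3'''

print(parse_csv(text))
# [['a', 'b', 'c'], ['d', 'e', 'f'], ['1', '2', '3']]
```

Every entry is still a string at this point, including '1', '2', and '3'.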


Classwork (finish at home!)

Look at the data files attached at the bottom of this post. Which columns are numerical?

We have:
first,last,age,height,score

Your goals:
1. Calculate the total height of all people in the list. (sum a column)
2. Calculate the average age. (average a column)
3. Find which person has the highest score (find which row), then print out that person's information.
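
One possible shape for goals 2 and 3, as a sketch only. The helper names average_column and row_with_max are made up for illustration, and the three rows are the sample rows shown in this post. The code assumes the header line has already been removed, every cell is still a string, and there is at least one row.

```python
def average_column(rows, n):
    """Average of column n; cells are converted with float()."""
    total = 0.0
    for row in rows:
        total += float(row[n])
    return total / len(rows)


def row_with_max(rows, n):
    """Return the row whose value in column n is largest."""
    best = rows[0]
    for row in rows:
        if float(row[n]) > float(best[n]):
            best = row
    return best


# Sample rows from this post, header already removed
# (columns: first, last, age, height, score).
people = [
    ["Carin", "Marritt", "82", "46.6", "6108"],
    ["Orelie", "Josifovitz", "8", "48.8", "1936"],
    ["Erena", "Cottham", "80", "75.5", "5163"],
]

print(average_column(people, 2))   # average age
print(row_with_max(people, 4))     # ['Carin', 'Marritt', '82', '46.6', '6108']
```

Keeping track of the whole row, not just the biggest score, is what lets you print the person's information afterwards.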


How do we convert the data file into some structure in Python that would be easy to work with?

"""first_name,last_name,age(years),height(inches),score
Carin,Marritt,82,46.6,6108
Orelie,Josifovitz,8,48.8,1936
Erena,Cottham,80,75.5,5163
..."""
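
A sketch of one way to do that conversion, using only the three sample rows above (the elided rows are left out). The names data, header, and people are assumptions for illustration, and so is the choice to treat age and score as whole numbers and height as a decimal. The idea is to peel off the header line, then convert the numeric columns so later arithmetic works on numbers rather than strings.

```python
data = """first_name,last_name,age(years),height(inches),score
Carin,Marritt,82,46.6,6108
Orelie,Josifovitz,8,48.8,1936
Erena,Cottham,80,75.5,5163"""

lines = data.splitlines()
header = lines[0].split(",")     # column names
people = []
for line in lines[1:]:           # skip the header line
    row = line.split(",")
    row[2] = int(row[2])         # age (assumed to be a whole number)
    row[3] = float(row[3])       # height
    row[4] = int(row[4])         # score (assumed to be a whole number)
    people.append(row)

print(header)
print(people[0])   # ['Carin', 'Marritt', 82, 46.6, 6108]
```

Converting once, up front, means the later steps never have to remember which cells are still strings.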

Hints: 
1. After splitting the data, take a small (3-4 line) slice of the list to work with. 
2. Make a function to calculate the sum of a column in a 2d list. Test it on some lists that you make before you test it on the real data.

Try to make a function to find the sum of the nth column of this list:
[[1,2,3,4,5],
[8,8,2,2,2],
[4,3,2,1,1],
[23,44,11,11,1]]

Or leave them as strings:
[['1','2','3','4','5'],
['8','8','2','2','2'],
['4','3','2','1','1']]

sumCol(ListOfLists, n)

This will make the whole lab much easier. 
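
A minimal sketch of what sumCol could look like, not the required solution. It assumes n counts from 0 and that every row has at least n+1 entries. Converting with float() is a choice made here so the same function works on both the number version and the string version of the lists; the variable names numbers and strings are just labels for the two test lists above.

```python
def sumCol(ListOfLists, n):
    """Sum column n (counting from 0) of a list of lists."""
    total = 0
    for row in ListOfLists:
        total += float(row[n])   # float() accepts 3, 3.5, '3', and '3.5'
    return total


numbers = [[1, 2, 3, 4, 5],
           [8, 8, 2, 2, 2],
           [4, 3, 2, 1, 1],
           [23, 44, 11, 11, 1]]

strings = [['1', '2', '3', '4', '5'],
           ['8', '8', '2', '2', '2'],
           ['4', '3', '2', '1', '1']]

print(sumCol(numbers, 0))   # 36.0
print(sumCol(strings, 1))   # 13.0
```

Checking the answers by hand on small lists like these first makes it much easier to trust the result on the real data.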
Attachments:
datalinux.csv (5k)
datawindows.csv (6k)