Exercises - Python Collections and Pandas

Originally published at: http://www.itversity.com/topic/exercises-python-collections-and-pandas/

Let us do exercise to understand collections API in detail Data can be retrieved from this link in my git account Data is tab separated Structure of data state constituency candidate name sex age category party name party symbol general postal total percentage of total votes percentage of polled votes total number of voters Each…

Here is the code which I used using Pandas:

#Exercise 1 – Get all the distinct constituency
election = pd.read_csv(“C:/Users/bmehta/Downloads/ls2014.tsv.txt”, delimiter = ‘\t’)
elec = election.groupby(election[‘constituency’])[‘constituency’].nunique()
print (elec)

#Exercise 2 – Get number of constituencies by state sorted in descending order by number of constituencies
constbystate = election.groupby(election[‘state’])[‘constituency’].nunique().sort_values(ascending = False)
constbystate

#Exercise 3 – Get the number of seats for each party in each state – output should be state,bjp,inc,
election.groupby([‘state’,‘partyname’])[‘total’].sum()

#Exercise 4 – Get the percentage of polled votes of each party formulla (number of votes per party across all the constituencies / total number of votes all the constituencies)

election.groupby([‘partyname’])[‘total’].sum()/int(election[‘total’].sum())

#Exercise 5 – Get top 10 candidates by margin (number of votes for winner – number of votes for 1st runner)
topwinners = election.groupby(‘candidate_name’)[‘total’].sum().sort_values(ascending = False).nlargest(10)
firstrunners = filtered.iloc[1:]

I was really not sure of Q5. Has anyone solved it?.

Thanks,
Bhavik

Hi ,
I have solved the mentioned exercise using sql function .

– Get top 10 candidates by margin (number of votes for winner – number of votes for 1st runner)
SELECT * FROM
(
SELECT state,constituency,partyname,candidate_name, total AS votes_earned,
total - lead(total) OVER (ORDER BY total DESC) AS margin_of_votes
FROM election_results_2014
ORDER BY total DESC
)
WHERE rownum<=10