July 25, 2018 is almost here, and Pakistan will see its 13th election since independence (the previous twelve were held in 1954, 1962, 1970, 1977, 1985, 1988, 1990, 1993, 1997, 2002, 2008 and 2013). It falls in the middle of the week (Wednesday), with an expected temperature of 27-33 degrees Celsius and almost no chance of rain anywhere in the country.
We predict a historic voter turnout of 57-61% in this election. The average turnout since 1977 has been 45% (lowest 35% in 1997, highest 55% in 1977, and 53% in the last election). Pakistan ranks 164th out of 169 nations in voter turnout; Australia ranks first with 94.5%.
Voter participation varies widely across the country: historically, Musakhel and Kohlu have recorded turnouts below 25%, whereas Layyah and Khanewal have recorded more than 60%, with everything else in between. Punjab has the highest voter turnout and Balochistan the lowest.
The contest brings 3,675 candidates for 272 National Assembly seats, an average of about 13.5 candidates per seat. PTI has fielded 244 candidates (the highest number for any political party). Islamabad will see 76 candidates fighting over just 3 seats to rule the capital, which guarantees a psychological edge.
There are quite a few interesting facts about these elections. For example, we will see the highest number of Lotas (candidates who often change their party affiliation) ever. PTI believes it will win the election no matter what, while the survey pundits predict a PML(N) lead of at least 13% over PTI.
The history of elections goes hand in hand with charges of corruption, voter fraud, ghost votes, interference by the deep state, and violence. There is (almost) no country in the world whose elections are free of the fear or accusation of such incidents.
We are releasing the complete National Assembly election results dataset for the 2002, 2008 and 2013 elections as CSV files for the public, and we are calling on all data scientists, international observers and journalists out there to help us achieve our aspirations.
You can download the complete dataset from this link.
The dataset should be referenced as “Zeeshan-ul-hassan Usmani, Sana Rasheed, Muhammad Usman, Muhammad Ilyas and Qazi Humayun, Pakistan Elections Complete Dataset (2002, 2008, 2013), Kaggle, July 7, 2018.”
Here is the list of ideas we are working on and would like you to help with. Please post your kernels and analysis.
- Map each NA constituency to a district. Get the list of districts in Pakistan, so we know how many constituencies each district has and which ones. Please update the dataset version on this page.
- Find the current 2018 candidates list, convert it to an Excel sheet and upload it here.
- Find the total number of candidates in the 2018 elections, from each party and from each province, the total number of parties, and the average number of candidates per seat.
- Calculate the voter turnout in each NA constituency (highest, lowest, etc.). Build a historical timeline so we know how many people voted in each NA constituency in 2002, 2008 and 2013.
- Analyze invalid votes in each NA constituency across all elections. Do we see any patterns here?
- Can we predict the effect of rain on voter turnout in a given constituency?
- Find out how many new candidates we have this time who have never contested an election before. How many are there in each party?
- Can we make district profiles with good visuals and heat maps showing which party is leading in which district?
- Can we color the map of Pakistan by district (as is done in the US with red and blue)? We can have a color for PML(N), PTI, PPP and MMA (only the four major parties to start with).
- Can we find the swing districts and the confirmed districts for the major parties?
- Are there any external datasets that we can join with this dataset for further analysis? Please post the links or update the datasets here.
- Build candidate profiles so we know each candidate’s party position in each election and whether they won or lost the last election(s). You can add whatever values and information you like.
- Compute the “Lota” score for each candidate: the number of times a candidate has changed parties (from independent to PPP, from PTI to PML, etc.). Anyone with a score of more than 2 would be a “Certified Lota”.
- Get the “Confirmed Constituencies” where results have historically been one-sided. For example, PPP always wins NA-XYZ, or Zardari has never lost an election no matter where he ran. Which party would definitely win which seats?
- Get the list of “Swing Constituencies” whose results are historically anybody’s guess. For example, NA-XYZ voted for PTI in 2002, then went to PPP in 2008, then to PML(N) in 2013 and so on. Once we have this list we can dig deeper into the margins of win/loss in previous elections, the candidates (their profiles, district profiles, voter turnout, etc.) and even the results of by-elections. But it is very important to get this list in the first place. This is where we can apply models to predict which way each seat will sway.
- Make the “Party Potential” list. For example, PML(N) with all its candidates, profiles, etc. has the potential to win 86 seats, PTI 65, PPP 43, etc. From this, can we predict which party would form the government in which province?
- Find out how many people voted in Pakistan in the last 3 elections: max, min, and average, per seat and per province. Can we hypothesize that the average number of voters per seat (who go out and vote) in Punjab is double that in KPK? Or that voter turnout in Bunner is less than 25% while in Chakwal it is more than 65%?
- Popular vote winner. Even if PML(N) loses, can we say that it will win the most votes nationwide by vote count alone? Or is that true for PPP or PTI?
- Find the “Fake Candidates”: people who are running but have no chance of winning, such as those with no past elections or political history. These are the ones who will withdraw 24 hours before the elections.
- Find the “Independents” who will go to the highest bidder after winning.
- Find anything interesting you can about the candidates. For example, is it true that a candidate whose name starts with M or A has twice the chance of winning compared with candidates whose names start with other letters?
- Surprise Me!
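
To help you get started, here is a minimal Python sketch of three of the ideas above: turnout per constituency, confirmed versus swing constituencies, and the “Lota” score. It is only an illustration under loud assumptions: the rows are made up, the tuple layout (year, seat, candidate, party, votes, registered voters) is hypothetical and will not match the column names in the real CSV files, and candidates are matched by name alone, which will need more care on the real data. The “mixed” label for seats that are neither confirmed nor swing is also our own placeholder.

```python
from collections import defaultdict

# Made-up sample rows: (year, seat, candidate, party, votes, registered_voters).
# ASSUMPTION: the real CSV files use different column names and layouts.
ROWS = [
    (2002, "NA-1", "Ali", "PPP", 50, 200),
    (2002, "NA-1", "Bilal", "PTI", 30, 200),
    (2008, "NA-1", "Ali", "PPP", 60, 200),
    (2008, "NA-1", "Bilal", "PML(N)", 40, 200),
    (2013, "NA-1", "Ali", "PPP", 70, 240),
    (2013, "NA-1", "Bilal", "IND", 50, 240),
    (2002, "NA-2", "Chand", "PTI", 45, 160),
    (2002, "NA-2", "Daud", "PPP", 35, 160),
    (2008, "NA-2", "Chand", "PTI", 30, 160),
    (2008, "NA-2", "Daud", "PPP", 50, 160),
    (2013, "NA-2", "Chand", "PML(N)", 55, 200),
    (2013, "NA-2", "Daud", "PPP", 25, 200),
]


def count_changes(parties):
    """Count how often consecutive entries in a list of parties differ."""
    return sum(1 for a, b in zip(parties, parties[1:]) if a != b)


def turnout_by_seat(rows):
    """Turnout (%) per (year, seat): votes cast divided by registered voters."""
    cast = defaultdict(int)
    registered = {}
    for year, seat, _cand, _party, votes, reg in rows:
        cast[(year, seat)] += votes
        registered[(year, seat)] = reg
    return {key: 100.0 * cast[key] / registered[key] for key in cast}


def winners_by_seat(rows):
    """Winning party per seat, listed in election-year order."""
    best = {}  # (year, seat) -> (votes, party) of the leading candidate
    for year, seat, _cand, party, votes, _reg in rows:
        if (year, seat) not in best or votes > best[(year, seat)][0]:
            best[(year, seat)] = (votes, party)
    history = defaultdict(list)
    for year, seat in sorted(best):  # sorted by year first, then seat
        history[seat].append(best[(year, seat)][1])
    return dict(history)


def classify_seat(parties):
    """'confirmed' if one party always won, 'swing' if the winner changed every time."""
    changes = count_changes(parties)
    if changes == 0:
        return "confirmed"
    if changes == len(parties) - 1:
        return "swing"
    return "mixed"


def lota_scores(rows):
    """Number of party switches per candidate between consecutive contests."""
    history = defaultdict(list)
    for _year, _seat, cand, party, _votes, _reg in sorted(rows):  # year order
        history[cand].append(party)
    return {cand: count_changes(parties) for cand, parties in history.items()}


print(turnout_by_seat(ROWS))
print({seat: classify_seat(p) for seat, p in winners_by_seat(ROWS).items()})
print(lota_scores(ROWS))
```

In this toy sample NA-1 comes out as confirmed and NA-2 as swing, and nobody crosses the “more than 2” bar for a Certified Lota. Note that the turnout here only counts votes recorded against candidates; if the real files report invalid votes separately, you would want to add them to the votes cast.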