Hands-on Training

Data – what and how?
A case of CA Election 70

YoungInnovations
OpenNepal
Data → Story
● Find data
● Wrangle/clean up the data
● Merge data with others (if any)
● Filter and sort the data
● Analyze data
● Visualize data (story)
CA Election 2070
● What is data?
  – The candidates (age, gender, party)
  – The constituencies (VDC, ward, party)
  – The results (with votes, winner)
  – …
Where to find it?
● http://election.gov.np
● The following FPTP results data in XML
Not always lucky in finding data
● Scraping (requires programming knowledge)
  – Using Google scraper
● PDF conversion
● Manual transcription from PDF
Chrome Scraper Extension
● Search for “Chrome extension Scraper” in the Chrome browser to install it
Scraper in Action
PDF to Text
● Online tools are available
● Linux has a different set of utilities
● PDF is still a big nuisance (though something is better than nothing)
PDF to Text
http://www.election.gov.np/election/uploads/files/ecn_report/constwisecandidatecount.pdf
PDF to Text
● Linux utility: pdftotext
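As a rough sketch of how the pdftotext step could be scripted, the snippet below builds the command line and runs it only if the utility is installed. The output file name and the use of the `-layout` option (which tries to preserve column alignment) are assumptions for illustration, not something the slides specify.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path):
    """Build the pdftotext command; -layout tries to keep table columns aligned."""
    return ["pdftotext", "-layout", pdf_path, txt_path]


def convert_pdf(pdf_path, txt_path):
    """Run pdftotext if it is installed; return True when the conversion ran."""
    if shutil.which("pdftotext") is None:
        return False  # utility not found on this machine
    subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)
    return True


# Hypothetical local file names, used only to show the resulting command.
cmd = pdftotext_command("constwisecandidatecount.pdf", "constwisecandidatecount.txt")
print(" ".join(cmd))
```

The text that comes out usually still needs manual cleanup before it can be treated as a table.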
CSV
● CSV: Comma-Separated Values
● Opens in MS Excel, OpenOffice, Google Spreadsheet
● Easy to work with
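To show why CSV is easy to work with outside a spreadsheet too, here is a minimal sketch that reads CSV text with Python's standard `csv` module. The sample rows are made up for illustration and are not taken from the election data.

```python
import csv
import io

# Made-up sample; a real file would be opened with open(path, newline="").
sample = "name,age,party\nCandidate A,45,Party X\nCandidate B,38,Party Y\n"

rows = list(csv.reader(io.StringIO(sample)))
header, records = rows[0], rows[1:]

print(header)
for record in records:
    print(record)
```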
CA XML Data to CSV
XML to CSV?
● Online services are available
● Might need help from a technologist
● In Linux (there may be several ways), e.g.:
  xml2 < FPTP-CA70.xml | 2csv FPTP \
    DISTNAME CONST CANDIDATE AGE SEX \
    PARTYNAME SYMBOLNAME TOTALVOTE \
    STATUS > FPTP-CA70.csv
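The same conversion could also be sketched in Python. The field names come from the command above; the XML layout (one `<FPTP>` element per candidate, with one child element per field, under a `<RESULTS>` root) is an assumption, and the sample records are invented.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDS = ["DISTNAME", "CONST", "CANDIDATE", "AGE", "SEX",
          "PARTYNAME", "SYMBOLNAME", "TOTALVOTE", "STATUS"]


def xml_to_csv(xml_text, out, record_tag="FPTP"):
    """Write one CSV row per <record_tag> element; return the number of rows."""
    root = ET.fromstring(xml_text)
    writer = csv.writer(out)
    count = 0
    for record in root.iter(record_tag):
        # Missing fields become empty cells instead of raising an error.
        writer.writerow([(record.findtext(f) or "").strip() for f in FIELDS])
        count += 1
    return count


# Invented sample in the assumed layout.
sample_xml = """<RESULTS>
  <FPTP><DISTNAME>District A</DISTNAME><CONST>1</CONST>
    <CANDIDATE>Candidate One</CANDIDATE><AGE>45</AGE><SEX>F</SEX>
    <PARTYNAME>Party X</PARTYNAME><SYMBOLNAME>Tree</SYMBOLNAME>
    <TOTALVOTE>1200</TOTALVOTE><STATUS>E</STATUS></FPTP>
  <FPTP><DISTNAME>District A</DISTNAME><CONST>1</CONST>
    <CANDIDATE>Candidate Two</CANDIDATE><AGE>38</AGE><SEX>M</SEX>
    <PARTYNAME>Party Y</PARTYNAME><SYMBOLNAME>Sun</SYMBOLNAME>
    <TOTALVOTE>900</TOTALVOTE></FPTP>
</RESULTS>"""

buffer = io.StringIO()
n = xml_to_csv(sample_xml, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```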
OpenNepal
● Repository of datasets
  – data in CSV, XML or JSON format
● Request for dataset
● Request for help in conversion from one format to another, scraping data, ...
● OpenNepal Community (Google Group) is very vibrant
CA Results CSV data
● Converted from XML
http://dev.yipl.com.np/data-training/data/FPTP-CA70.csv
Processing/Cleaning CSV – Basics
● Add header
● Sorting (by different fields)
● Filter
● Simple formulas
Add headers
● Insert a row at the top
● Add a header for each column
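For those who prefer a script to a spreadsheet, adding the header row might look like the sketch below. The column order is assumed to match the field list used in the XML-to-CSV conversion, and the data row is invented.

```python
import csv
import io

# Assumed column order, taken from the field list of the conversion step.
HEADER = ["DISTNAME", "CONST", "CANDIDATE", "AGE", "SEX",
          "PARTYNAME", "SYMBOLNAME", "TOTALVOTE", "STATUS"]


def add_header(src, dst, header=HEADER):
    """Copy rows from src to dst, writing the header first; return rows copied."""
    reader = csv.reader(src)
    writer = csv.writer(dst)
    writer.writerow(header)
    rows = 0
    for row in reader:
        writer.writerow(row)
        rows += 1
    return rows


# Invented headerless input standing in for the converted file.
headerless = io.StringIO("District A,1,Candidate One,45,F,Party X,Tree,1200,E\n")
with_header = io.StringIO()
copied = add_header(headerless, with_header)
print(with_header.getvalue())
```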
Sorting
● Sorting by age: ascending, descending
● Find out the age of the youngest winning candidate
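The sorting exercise can also be sketched in Python. The rows are invented, and the value that marks a winner in the STATUS column (`"E"` here) is a hypothetical placeholder; check the real file for the actual value.

```python
import csv
import io

WINNER_STATUS = "E"  # hypothetical marker for a winning candidate

# Invented sample rows with the assumed headers.
sample_csv = """DISTNAME,CONST,CANDIDATE,AGE,SEX,PARTYNAME,SYMBOLNAME,TOTALVOTE,STATUS
District A,1,Candidate One,45,F,Party X,Tree,1200,E
District A,1,Candidate Two,38,M,Party Y,Sun,900,
District B,1,Candidate Three,29,M,Party X,Tree,1500,E
District B,1,Candidate Four,61,F,Independent,Cup,300,
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# CSV values are strings, so convert AGE to int before comparing.
by_age_asc = sorted(rows, key=lambda r: int(r["AGE"]))
by_age_desc = sorted(rows, key=lambda r: int(r["AGE"]), reverse=True)

winners = [r for r in rows if r["STATUS"] == WINNER_STATUS]
youngest_winner = min(winners, key=lambda r: int(r["AGE"]))

print(youngest_winner["CANDIDATE"], youngest_winner["AGE"])
```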
Filtering
● Filter the list of winning female candidates
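A filter is just a condition applied to every row. In this sketch the codes `"F"` for female and `"E"` for a winner are assumptions about how the file encodes SEX and STATUS, and the rows are invented (only the columns needed here are included).

```python
import csv
import io

FEMALE = "F"         # assumed code in the SEX column
WINNER_STATUS = "E"  # hypothetical marker in the STATUS column

sample_csv = """DISTNAME,CONST,CANDIDATE,SEX,STATUS
District A,1,Candidate One,F,E
District A,1,Candidate Two,M,
District B,1,Candidate Three,M,E
District B,1,Candidate Four,F,
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

winning_women = [
    r for r in rows
    if r["SEX"] == FEMALE and r["STATUS"] == WINNER_STATUS
]

for r in winning_women:
    print(r["DISTNAME"], r["CONST"], r["CANDIDATE"])
```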
Some exercises
● Are there people who didn't receive a single vote?
● What are the highest and lowest numbers of votes among candidates who didn't win?
● Find the percentage of female and male candidates, and the percentage of winning female candidates.
● Try the above exercises for one district of your interest.
● Think of other things you can do with these basic skills.
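One possible way to check answers to these exercises with a script is sketched below. The data is invented, the SEX and STATUS codes are assumptions, and "percentage of winning female candidates" is read here as the share of winners who are female.

```python
import csv
import io

WINNER_STATUS = "E"  # hypothetical marker for a winning candidate

sample_csv = """DISTNAME,CONST,CANDIDATE,SEX,TOTALVOTE,STATUS
District A,1,Candidate One,F,1200,E
District A,1,Candidate Two,M,900,
District B,1,Candidate Three,M,1500,E
District B,1,Candidate Four,F,300,
District B,1,Candidate Five,M,0,
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))


def percent(part, whole):
    """Percentage of part in whole; 0.0 when whole is zero."""
    return 100 * part / whole if whole else 0.0


# Candidates without a single vote.
zero_votes = [r["CANDIDATE"] for r in rows if int(r["TOTALVOTE"]) == 0]

# Highest and lowest vote counts among candidates who did not win.
loser_votes = [int(r["TOTALVOTE"]) for r in rows if r["STATUS"] != WINNER_STATUS]
highest_losing, lowest_losing = max(loser_votes), min(loser_votes)

# Gender percentages among all candidates and among winners.
winners = [r for r in rows if r["STATUS"] == WINNER_STATUS]
female_pct = percent(sum(r["SEX"] == "F" for r in rows), len(rows))
male_pct = percent(sum(r["SEX"] == "M" for r in rows), len(rows))
winning_female_pct = percent(sum(r["SEX"] == "F" for r in winners), len(winners))

print(zero_votes, highest_losing, lowest_losing)
print(female_pct, male_pct, winning_female_pct)
```

To repeat the exercise for one district, filter `rows` on `DISTNAME` first and run the same calculations on the filtered list.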
More questions
● How many parties have candidates in all 240 constituencies?
● How many male and female candidates are there in Nepali Congress? What is the male-female ratio in the far-west districts?
● Which party has the highest number of female candidates?
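These questions come down to grouping and counting. The sketch below assumes a constituency is identified by the pair (DISTNAME, CONST) and uses an invented sample with three constituencies; for the real data the total would be 240.

```python
import csv
import io
from collections import Counter, defaultdict

sample_csv = """DISTNAME,CONST,CANDIDATE,SEX,PARTYNAME
District A,1,Candidate One,F,Party X
District A,1,Candidate Two,M,Party Y
District A,2,Candidate Three,F,Party X
District A,2,Candidate Four,M,Party Z
District B,1,Candidate Five,M,Party X
District B,1,Candidate Six,F,Party Z
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))


def parties_in_all(rows, total_constituencies):
    """Return parties that field a candidate in every constituency."""
    seats = defaultdict(set)
    for r in rows:
        seats[r["PARTYNAME"]].add((r["DISTNAME"], r["CONST"]))
    return sorted(p for p, s in seats.items() if len(s) == total_constituencies)


# The sample has 3 constituencies; use 240 with the full dataset.
total = len({(r["DISTNAME"], r["CONST"]) for r in rows})
everywhere = parties_in_all(rows, total)

# Female candidates per party, and the party with the most.
female_by_party = Counter(r["PARTYNAME"] for r in rows if r["SEX"] == "F")
top_party, top_count = female_by_party.most_common(1)[0]

print(everywhere, top_party, top_count)
```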
Data Processing - PivotTable
PivotTable - more
● Breakdown of independent candidates
Let's look at the numbers again
● Sorted by total number of candidates
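What a PivotTable does here (party as rows, sex as columns, count as values, sorted by total) can be approximated with a few lines of Python. This is only a sketch on invented rows, not the spreadsheet's own method.

```python
import csv
import io
from collections import Counter, defaultdict

sample_csv = """CANDIDATE,SEX,PARTYNAME
Candidate One,F,Party X
Candidate Two,M,Party X
Candidate Three,M,Party X
Candidate Four,F,Independent
Candidate Five,M,Independent
Candidate Six,M,Party Y
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# pivot[party][sex] -> number of candidates
pivot = defaultdict(Counter)
for r in rows:
    pivot[r["PARTYNAME"]][r["SEX"]] += 1

# One tuple per party: (party, female, male, total), largest total first.
table = sorted(
    ((party, c["F"], c["M"], c["F"] + c["M"]) for party, c in pivot.items()),
    key=lambda t: t[3],
    reverse=True,
)

print(f"{'Party':<12}{'F':>4}{'M':>4}{'Total':>7}")
for party, female, male, total in table:
    print(f"{party:<12}{female:>4}{male:>4}{total:>7}")
```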
Visualization
● Bar graph of male and female candidates in the top few districts
What other visualizations are possible?
● https://github.com/mbostock/d3/wiki/Gallery
Geocoding
● Geocoding
  – the conversion of a human-readable location name into a numeric (or other machine-processable) location such as a longitude and latitude
  – Kathmandu => [geocoding] => {latitude: 27.70169, longitude: 85.3206}
● Online tools available for geocoding
  – Google Fusion Tables
  – CartoDB
Lat-long in maps.google.com
● Put the lat-long (27.70169 85.3206) in the Google Maps search box
Services available for geocoding
http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=Kathmandu,Nepal
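A small sketch of how such a Nominatim-style search could be used from a script: build the query URL, then read the latitude and longitude from the XML reply. No network request is made here. The sample reply is hand-written to mirror Nominatim's `<searchresults>`/`<place>` layout with the Kathmandu coordinates from the earlier slide, and the service may have changed its terms since the slides were written.

```python
import urllib.parse
import xml.etree.ElementTree as ET

BASE_URL = "http://open.mapquestapi.com/nominatim/v1/search"


def build_search_url(place):
    """Return the search URL for a place name, asking for an XML reply."""
    query = urllib.parse.urlencode({"format": "xml", "q": place})
    return BASE_URL + "?" + query


def parse_first_place(xml_text):
    """Return (lat, lon) of the first <place> in the reply, or None."""
    root = ET.fromstring(xml_text)
    place = root.find("place")
    if place is None:
        return None
    return float(place.get("lat")), float(place.get("lon"))


# Hand-written reply for illustration; not an actual server response.
sample_response = """<searchresults>
  <place display_name="Kathmandu, Nepal" lat="27.70169" lon="85.3206"/>
</searchresults>"""

url = build_search_url("Kathmandu,Nepal")
coords = parse_first_place(sample_response)
print(url)
print(coords)
```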
Problems with this CSV
● Unicode in district names
● Can't geocode (currently only English)
Adding English district names
http://dev.yipl.com.np/data-training/data/FPTP-CA70-eng.csv
Google Fusion Table
● tables.googlelabs.com (needs a Gmail account)
Imported data
Geocoding
Using filter in the map
Use of heatmap based on votes
Thank you

Data Literacy Training - case of CA Election 70