Basics Exercise Next meetings
Big Data and Automated Content Analysis
Week 2 – Wednesday
»Getting started with Python«
Damian Trilling
d.c.trilling@uva.nl
@damian0604
www.damiantrilling.net
Afdeling Communicatiewetenschap
Universiteit van Amsterdam
8 April 2014
Big Data and Automated Content Analysis Damian Trilling
Today
1 The very, very basics of programming with Python
Datatypes
Indentation: The Python way of structuring your program
2 Exercise
3 Next meetings
The very, very basics of programming
You’ve read all this in chapter 3.
Datatypes
Python lingo
Basic datatypes (variables)
int 32
float 1.75
bool True, False
string "Damian"
"5" and 5 are not the same.
But you can convert one into the other: int("5") will return 5.
3 * "5" does not calculate 15; it repeats the string and returns "555".
But you can calculate 3 * int("5"), which returns 15.
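A minimal sketch of how strings and integers behave differently. The variable names are illustrative assumptions, not taken from the slides. Note that multiplying a string by an integer is legal Python but repeats the string, whereas adding an integer and a string raises an error.

```python
# All variable names here are illustrative assumptions.
text_five = "5"      # a string
number_five = 5      # an int

same = (text_five == number_five)   # a string never equals an int

converted = int(text_five)          # string -> int
product = 3 * converted             # ordinary arithmetic
repeated = 3 * text_five            # string repetition, not arithmetic

# Adding an int and a string is not allowed and raises a TypeError.
try:
    3 + text_five
    addition_error = None
except TypeError as err:
    addition_error = type(err).__name__

print(same, converted, product, repeated, addition_error)
```

Checking the type with type(text_five) before calculating is a quick way to find out which of the two you are dealing with.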
More advanced datatypes
list firstnames = ['Damian', 'Lori', 'Bjoern']
lastnames = ['Trilling', 'Meester', 'Burscher']
list ages = [18, 22, 45, 23]
dict familynames = {'Bjoern': 'Burscher', 'Damian': 'Trilling', 'Lori': 'Meester'}
dict {'Bjoern': 26, 'Damian': 31, 'Lori': 25}
Note that the elements of a list and the values of a dict can have any datatype! The keys of a dict can have any hashable datatype, such as strings or numbers, but not lists. (It should be consistent, though!)
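As a rough sketch of how these datatypes are used: list elements are accessed by position, dict values by key. The data are taken from the slide; the key 'Nobody' and the result variable names are made up for illustration.

```python
firstnames = ['Damian', 'Lori', 'Bjoern']
ages = [18, 22, 45, 23]
familynames = {'Bjoern': 'Burscher', 'Damian': 'Trilling', 'Lori': 'Meester'}

first = firstnames[0]         # lists are indexed by position, starting at 0
last = firstnames[-1]         # negative positions count from the end
n_ages = len(ages)            # number of elements in the list
lori = familynames['Lori']    # dicts are indexed by key

# Combine the list and the dict to build full names.
fullnames = []
for name in firstnames:
    fullnames.append(name + ' ' + familynames[name])

# .get() returns a default value instead of raising a KeyError.
# 'Nobody' is a hypothetical key that is not in the dict.
missing = familynames.get('Nobody', 'unknown')

print(first, last, n_ages, lori, fullnames, missing)
```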
Functions
functions take an input and return something else. int(32.43) returns the integer 32. len("Hello") returns the integer 5.
methods are similar to functions, but directly associated with an object. "SCREAM".lower() returns the string "scream".
Calls to both functions and methods end with (). Between the (), arguments can (sometimes have to) be supplied.
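A small sketch contrasting function calls and method calls. The first three calls come from the slide; the split() and count() examples and all variable names are added here for illustration.

```python
# Functions: the object goes between the parentheses.
whole = int(32.43)         # int() cuts off the decimals
length = len("Hello")      # number of characters

# Methods: called on an object with a dot.
quiet = "SCREAM".lower()   # needs no argument

# Some methods need an argument between the ().
parts = "Damian Trilling".split(" ")   # split at every space
l_count = "Hello".count("l")           # how often "l" occurs

print(whole, length, quiet, parts, l_count)
```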
Indentation: The Python way of structuring your program
Structure
The program is structured by TABs or SPACEs
firstnames = ['Damian', 'Lori', 'Bjoern']
age = {'Bjoern': 27, 'Damian': 32, 'Lori': 26}
print("The names and ages of all BigData people:")
for naam in firstnames:
    print(naam, age[naam])
Don’t mix TABs and spaces! Both are valid, but you have to be consistent!
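To see why consistency matters, the following sketch asks Python 3 to compile two tiny programs given as strings: one indented with spaces only, one that mixes a TAB with spaces. The helper try_compile and both example strings are hypothetical and only serve as an illustration; compile() is Python's built-in function.

```python
def try_compile(source):
    """Return 'ok' if the source compiles, else the name of the error raised."""
    try:
        compile(source, "<example>", "exec")
        return "ok"
    except SyntaxError as err:
        # IndentationError and TabError are subclasses of SyntaxError.
        return type(err).__name__

# Both lines of the block are indented with four spaces.
spaces_only = "for i in range(2):\n    print(i)\n    print(i * 2)\n"

# First line of the block uses a TAB, the second uses eight spaces.
mixed = "for i in range(2):\n\tprint(i)\n        print(i * 2)\n"

result_spaces = try_compile(spaces_only)
result_mixed = try_compile(mixed)
print(result_spaces, result_mixed)
```

In an editor, the two indented lines of the mixed version can look identical, which is what makes this mistake hard to spot.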
# firstnames and age are the list and dict defined in the previous example
print("The names and ages of all BigData people:")
for naam in firstnames:
    print(naam, age[naam])
    if naam == "Damian":
        print("He teaches this course")
    elif naam == "Lori":
        print("She was an assistant last year")
    elif naam == "Bjoern":
        print("He helps on Wednesdays")
    else:
        print("No idea who this is")
The line before an indented block starts with a statement indicating what should be done with the block and ends with a colon (:).
Indentation of the block indicates that
• it is to be executed repeatedly (for statement), e.g., for each element of a list
• it is only to be executed under specific conditions (if, elif, and else statements)
• an alternative block should be executed if an error occurs (try and except statements)
• a file is opened, but should be closed again after the block has been executed (with statement)
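The four kinds of blocks listed above can be sketched in one short program. The name 'Unknown', the age threshold of 30, the file name report.txt, and the variable names are assumptions made up for this example; the ages are those used earlier in the slides.

```python
import os
import tempfile

age = {'Bjoern': 27, 'Damian': 32, 'Lori': 26}
lines = []

# for: the indented block runs once per element of the list.
for naam in ['Damian', 'Lori', 'Unknown']:
    # try/except: the except block only runs if the try block raises an error.
    try:
        # if/else: only one of the two blocks is executed.
        if age[naam] > 30:
            lines.append(naam + ": older than 30")
        else:
            lines.append(naam + ": 30 or younger")
    except KeyError:
        lines.append(naam + ": no age on record")

# with: the file is closed automatically when the block ends.
# A temporary folder is used so that no file is left behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "report.txt")
    with open(path, "w") as f:
        f.write("\n".join(lines))
    file_is_closed = f.closed

print(lines, file_is_closed)
```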
We’ll now together do the exercise “Describing an existing
structured dataset”.
Next meetings
Week 3: Data harvesting and storage
Monday, 13 April
A conceptual overview of APIs, scrapers, crawlers, RSS feeds, databases, and different file formats
Wednesday, 15 April
Writing some first data collection scripts
Preparation
• Conceptual level: Read the article by Morstatter, Pfeffer, Liu, and Carley (2013) about the limitations of the Twitter API.
• Technical level: Make sure you are comfortable with the techniques we’ve covered so far. Play around. Give yourself some tasks and solve them. Google.