Exercises in Style
Crista Lopes
Available on Amazon
Exercises in Programming Style (Ejercicios de estilo en la programación)
PROGRAMMING STYLES
Rules and constraints in software construction
Programming Styles
⊳ Ways of expressing tasks
⊳ Exist and recur at all scales
⊳ Frozen in Programming Languages
Why Are Styles Important?
⊳ Basic frames of reference for solutions
• Like scaffolding
⊳ Common vocabularies
• It’s a cultural thing
⊳ Some better than others
• Depending on many things!
Programming Styles
How do you teach this?
Raymond Queneau
Queneau’s Exercises in Style
⊳ Metaphor
⊳ Surprises
⊳ Dream
⊳ Prognostication
⊳ Hesitation
⊳ Precision
⊳ Negativities
⊳ Asides
⊳ Anagrams
⊳ Logical analysis
⊳ Past
⊳ Present
⊳ …
⊳ (99)
Queneau’s “Styles”
⊳ Constraints
⊳ “A Void” (La Disparition) by Georges Perec
• No letter ‘e’
Exercises in Programming Style
The story:
Term Frequency
Given a text file, output a list of the 25 most frequently occurring words, ordered by decreasing frequency.
mr - 786
elizabeth - 635
very - 488
darcy - 418
such - 395
mrs - 343
much - 329
more - 327
bennet - 323
bingley - 306
jane - 295
miss - 283
one - 275
know - 239
before - 229
herself - 227
though - 226
well - 224
never - 220
…
[Figure: Pride and Prejudice fed into TF, producing the list above]
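As a baseline for the styles that follow, the term-frequency task can be sketched in a few lines. This is a minimal sketch, not the author's code: it is written in Python 3 (the slides use Python 2), the function name `top_terms` is hypothetical, the text and stop words are passed in as arguments instead of being read from files, and treating only runs of two or more ASCII letters as words is an assumption.

```python
import re
from collections import Counter

def top_terms(text, stop_words=frozenset(), limit=25):
    """Return up to `limit` (word, count) pairs, most frequent first."""
    # Lowercase the text and keep runs of two or more ASCII letters;
    # dropping single letters is an assumption about what counts as a word.
    words = re.findall(r"[a-z]{2,}", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    # most_common() orders the pairs by decreasing count
    return counts.most_common(limit)

for word, count in top_terms("a rose is a rose is a rose", stop_words={"is"}):
    print(word, "-", count)  # prints: rose - 3
```

Every style below solves this same task; only the constraints on how the code is organized change.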
STYLE #1
@cristalopes #style1 name
import sys, string
# the global list of [word, frequency] pairs
word_freqs = []
# the list of stop words
with open('../stop_words.txt') as f:
    stop_words = f.read().split(',')
stop_words.extend(list(string.ascii_lowercase))
for line in open(sys.argv[1]):
    for c in line:
        ...
Style #1 Main Characteristics
⊳ No abstractions
⊳ No use of library functions
Brain-dump Style
STYLE #2
@cristalopes #style2 name
import re, string, sys
stops = set(open("../stop_words.txt").read().split(",") +
            list(string.ascii_lowercase))
words = [x.lower() for x in re.split("[^a-zA-Z]+",
                                     open(sys.argv[1]).read())
         if len(x) > 0 and x.lower() not in stops]
unique_words = list(set(words))
unique_words.sort(lambda x, y: cmp(words.count(y),
                                   words.count(x)))
print "\n".join(["%s - %s" % (x, words.count(x))
                 for x in unique_words[:25]])
Credit: Laurie Tratt, King's College London
Style #2 Main Characteristics
⊳ As few lines of code as possible
Code Golf Style
Try Hard Style
STYLE #3
@cristalopes #style3 name
# The shared state
data = []
words = []
word_freqs = []
# The procedures (bodies not shown)
def read_file(path): ...
def filter_normalize(): ...
def scan(): ...
def rem_stop_words(): ...
def frequencies(): ...
def sort(): ...
#
# Main
#
read_file(sys.argv[1])
filter_normalize()
scan()
rem_stop_words()
frequencies()
sort()
for tf in word_freqs[0:25]:
    print tf[0], ' - ', tf[1]
Style #3 Main Characteristics
⊳ Procedural abstractions
• maybe input, no output
⊳ Shared state
⊳ Larger problem solved by applying procedures, one after the other, changing the shared state
⊳ Commands
Cook Book Style
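The characteristics above can be sketched as a runnable program. This is a minimal Python 3 sketch under stated assumptions, not the author's implementation: `load_text` is a hypothetical stand-in for `read_file` that takes the text directly, `rem_stop_words` receives the stop words as an argument instead of reading a file, and the procedure bodies are my own guesses. What it keeps from the style is that no procedure returns anything; each one only changes the shared lists.

```python
# Shared state that every procedure reads and mutates
data = []
words = []
word_freqs = []

def load_text(text):
    """Hypothetical stand-in for read_file: store the characters in `data`."""
    data[:] = list(text)

def filter_normalize():
    """Replace non-letters in `data` with spaces and lowercase the rest."""
    for i, c in enumerate(data):
        data[i] = c.lower() if c.isalpha() else " "

def scan():
    """Split `data` into words and store them in `words`."""
    words[:] = "".join(data).split()

def rem_stop_words(stop_words):
    """Drop stop words and single letters from `words`, in place."""
    words[:] = [w for w in words if len(w) > 1 and w not in stop_words]

def frequencies():
    """Fill `word_freqs` with [word, count] pairs."""
    word_freqs.clear()
    position = {}
    for w in words:
        if w in position:
            word_freqs[position[w]][1] += 1
        else:
            position[w] = len(word_freqs)
            word_freqs.append([w, 1])

def sort():
    """Order `word_freqs` by decreasing count, in place."""
    word_freqs.sort(key=lambda pair: pair[1], reverse=True)

# Main: the recipe is a fixed sequence of commands
load_text("The cat saw the dog; the dog ran")
filter_normalize()
scan()
rem_stop_words({"the"})
frequencies()
sort()
for word, count in word_freqs[:25]:
    print(word, "-", count)
```

Because the procedures communicate only through the shared lists, calling them out of order silently produces wrong results, which is the main cost of this style.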
STYLE #4
@cristalopes #style4 name
def read_file(path):
    return ...
def filter(str_data):
    return ...
def normalize(str_data):
    return ...
def scan(str_data):
    return ...
def rem_stop_words(wordl):
    return ...
def frequencies(wordl):
    return ...
def sort(word_freqs):
    return ...
#
# Main
#
wfreqs = sort(frequencies(rem_stop_words(scan(normalize(filter(read_file(sys.argv[1])))))))
for tf in wfreqs[0:25]:
    print tf[0], ' - ', tf[1]
Style #4 Main Characteristics
⊳ Function abstractions
• f: Input → Output
⊳ No shared state
⊳ Function composition f ∘ g
Chocolate Factory Style
[Figure: function machines g and f composed] Image credit: Nykamp DQ, from Math Insight. http://mathinsight.org/image/function_machines_composed
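Function composition with no shared state can be sketched as follows. This is a minimal Python 3 sketch, not the author's code: the `compose` helper, the closure returned by `remove_stop_words`, and the choice to pass the text as a string instead of a file path are all assumptions made so the example runs on its own.

```python
import re
from collections import Counter

def filter_chars(text):
    """Replace every run of non-letters with a single space."""
    return re.sub(r"[^A-Za-z]+", " ", text)

def normalize(text):
    return text.lower()

def scan(text):
    return text.split()

def remove_stop_words(stop_words):
    """Build a one-argument filter so it fits the pipeline (an assumption)."""
    def remover(word_list):
        return [w for w in word_list if len(w) > 1 and w not in stop_words]
    return remover

def frequencies(word_list):
    return Counter(word_list)

def sort(word_freqs):
    return sorted(word_freqs.items(), key=lambda pair: pair[1], reverse=True)

def compose(*funcs):
    """compose(f, g)(x) == f(g(x)): the rightmost function runs first."""
    def composed(value):
        for func in reversed(funcs):
            value = func(value)
        return value
    return composed

# Hypothetical pipeline with a one-word stop list
term_frequency = compose(sort, frequencies, remove_stop_words({"the"}),
                         scan, normalize, filter_chars)

for word, count in term_frequency("The cat saw the dog; the dog ran")[:25]:
    print(word, "-", count)
```

Each function depends only on its argument, so any stage can be tested or replaced in isolation.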
STYLE #5
@cristalopes #style5 name
def read_file(path, func):
    ...
    return func(…, normalize)
def filter_chars(data, func):
    ...
    return func(…, scan)
def normalize(data, func):
    ...
    return func(…, remove_stops)
def scan(data, func):
    ...
    return func(…, frequencies)
def remove_stops(data, func):
    ...
    return func(…, sort)
# Etc.
# Main
w_freqs = read_file(sys.argv[1], filter_chars)
for tf in w_freqs[0:25]:
    print tf[0], ' - ', tf[1]
Style #5 Main Characteristics
⊳ Functions take one additional parameter, f
• called at the end
• given what would normally be the return value plus the next function
Crochet Style
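The chain of calls described above can be made concrete. This is a minimal Python 3 sketch under stated assumptions: `start` is a hypothetical stand-in for `read_file` that takes the text directly, `STOP_WORDS` replaces the stop-word file, and `identity` is my own choice for ending the chain, since the slides stop at "Etc." before showing how the last function finishes.

```python
import re

STOP_WORDS = {"the", "a", "of"}  # hypothetical stand-in for a stop-word file

def start(text, func):
    """Stand-in for read_file: hand the text to `func`, naming the next step."""
    return func(text, normalize)

def filter_chars(text, func):
    return func(re.sub(r"[^A-Za-z]+", " ", text), scan)

def normalize(text, func):
    return func(text.lower(), remove_stops)

def scan(text, func):
    return func(text.split(), frequencies)

def remove_stops(word_list, func):
    kept = [w for w in word_list if len(w) > 1 and w not in STOP_WORDS]
    return func(kept, sort)

def frequencies(word_list, func):
    counts = {}
    for w in word_list:
        counts[w] = counts.get(w, 0) + 1
    return func(counts, identity)

def sort(counts, func):
    # Last real step: its continuation takes only the result
    return func(sorted(counts.items(), key=lambda pair: pair[1], reverse=True))

def identity(value):
    """Ends the chain by returning the value unchanged (an assumption)."""
    return value

w_freqs = start("The cat saw the dog; the dog ran", filter_chars)
for word, count in w_freqs[:25]:
    print(word, "-", count)
```

Note that each function is called with the function that follows it, so the order of the pipeline is spread across the function bodies instead of being visible in one place.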
STYLE #6
@cristalopes #style6 name
class TFExercise():
    def info(self): ...
class DataStorageManager(TFExercise):
    def words(self): ...
    def info(self): ...
class StopWordManager(TFExercise):
    def is_stop_word(self, word): ...
    def info(self): ...
class WordFreqManager(TFExercise):
    def inc_count(self, word): ...
    def sorted(self): ...
    def info(self): ...
class WordFreqController(TFExercise):
    def run(self): ...
# Main
WordFreqController(sys.argv[1]).run()
Style #6 Main Characteristics
⊳ Things, things and more things!
• Capsules of data and procedures
⊳ Data is never accessed directly
⊳ Capsules can reappropriate procedures from other capsules
Kingdom of Nouns Style
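The capsules can be sketched as ordinary classes. This is a minimal Python 3 sketch, not the author's code: the class and method names follow the slide, but the constructor arguments, the method bodies, and having `run` return the pairs instead of printing them are assumptions made to keep the example self-contained.

```python
import re

class TFExercise:
    def info(self):
        return self.__class__.__name__

class DataStorageManager(TFExercise):
    """Capsule for the text; the word list is reached only through words()."""
    def __init__(self, text):
        self._words = re.sub(r"[^A-Za-z]+", " ", text).lower().split()

    def words(self):
        return self._words

    def info(self):
        # Reuses the inherited procedure and extends it
        return super().info() + ": holds the words of the text"

class StopWordManager(TFExercise):
    def __init__(self, stop_words):
        self._stop_words = set(stop_words)

    def is_stop_word(self, word):
        return len(word) < 2 or word in self._stop_words

class WordFreqManager(TFExercise):
    def __init__(self):
        self._counts = {}

    def inc_count(self, word):
        self._counts[word] = self._counts.get(word, 0) + 1

    def sorted(self):
        # Inside the method body, `sorted` still refers to the builtin
        return sorted(self._counts.items(), key=lambda pair: pair[1], reverse=True)

class WordFreqController(TFExercise):
    def __init__(self, text, stop_words):
        self._storage = DataStorageManager(text)
        self._stop_words = StopWordManager(stop_words)
        self._freqs = WordFreqManager()

    def run(self, limit=25):
        for w in self._storage.words():
            if not self._stop_words.is_stop_word(w):
                self._freqs.inc_count(w)
        return self._freqs.sorted()[:limit]

for word, count in WordFreqController("The cat saw the dog; the dog ran", {"the"}).run():
    print(word, "-", count)
```

`StopWordManager` and `WordFreqManager` do not override `info`, so they fall back to the version defined in `TFExercise`.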
STYLE #7
@cristalopes #style7 name
class DataStorageManager():
    def dispatch(self, message): ...
class StopWordManager():
    def dispatch(self, message): ...
class WordFrequencyManager():
    def dispatch(self, message): ...
class WordFrequencyController():
    def dispatch(self, message): ...
# Main
wfcntrl = WordFrequencyController()
wfcntrl.dispatch(['init', sys.argv[1]])
wfcntrl.dispatch(['run'])
Style #7 Main Characteristics
⊳ (Similar to #6)
⊳ Capsules receive messages via single receiving procedure
Letterbox Style
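A single receiving procedure per capsule can be sketched as follows. This is a minimal Python 3 sketch under stated assumptions: only the class names, `dispatch`, and the `'init'` and `'run'` messages come from the slide. The other message names (`'words'`, `'is_stop_word'`, `'increment_count'`, `'sorted'`), the extra stop-word argument to `'init'`, and the `ValueError` for unknown messages are hypothetical.

```python
import re

class DataStorageManager:
    def __init__(self):
        self._words = []

    def dispatch(self, message):
        if message[0] == "init":
            self._words = re.findall(r"[a-z]{2,}", message[1].lower())
            return None
        if message[0] == "words":
            return self._words
        raise ValueError("Message not understood: %r" % (message[0],))

class StopWordManager:
    def __init__(self):
        self._stop_words = set()

    def dispatch(self, message):
        if message[0] == "init":
            self._stop_words = set(message[1])
            return None
        if message[0] == "is_stop_word":
            return message[1] in self._stop_words
        raise ValueError("Message not understood: %r" % (message[0],))

class WordFrequencyManager:
    def __init__(self):
        self._counts = {}

    def dispatch(self, message):
        if message[0] == "increment_count":
            self._counts[message[1]] = self._counts.get(message[1], 0) + 1
            return None
        if message[0] == "sorted":
            return sorted(self._counts.items(), key=lambda pair: pair[1], reverse=True)
        raise ValueError("Message not understood: %r" % (message[0],))

class WordFrequencyController:
    def __init__(self):
        self._storage = DataStorageManager()
        self._stop_words = StopWordManager()
        self._freqs = WordFrequencyManager()

    def dispatch(self, message):
        if message[0] == "init":
            # message: ["init", text, stop_words]
            self._storage.dispatch(["init", message[1]])
            self._stop_words.dispatch(["init", message[2]])
            return None
        if message[0] == "run":
            for w in self._storage.dispatch(["words"]):
                if not self._stop_words.dispatch(["is_stop_word", w]):
                    self._freqs.dispatch(["increment_count", w])
            return self._freqs.dispatch(["sorted"])[:25]
        raise ValueError("Message not understood: %r" % (message[0],))

wfcntrl = WordFrequencyController()
wfcntrl.dispatch(["init", "The cat saw the dog; the dog ran", {"the"}])
for word, count in wfcntrl.dispatch(["run"]):
    print(word, "-", count)
```

The capsules expose no methods besides `dispatch`, so a caller can only send a message and see whether the receiver understands it.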
STYLE #8
@cristalopes #style8 name
# Main
splits = map(split_words,
             partition(read_file(sys.argv[1]), 200))
splits.insert(0, [])
word_freqs = sort(reduce(count_words, splits))
for tf in word_freqs[0:25]:
    print tf[0], ' - ', tf[1]

def split_words(data_str):
    """
    Takes a string (many lines), filters, normalizes to
    lower case, scans for words, and filters the stop words.
    Returns a list of pairs (word, 1), so
    [(w1, 1), (w2, 1), ..., (wn, 1)]
    """
    ...
    result = []
    words = _rem_stop_words(_scan(_normalize(_filter(data_str))))
    for w in words:
        result.append((w, 1))
    return result

def count_words(pairs_list_1, pairs_list_2):
    """
    Takes two lists of pairs of the form
    [(w1, 1), ...]
    and returns a list of pairs [(w1, frequency), ...],
    where frequency is the sum of all occurrences
    """
    mapping = dict((k, v) for k, v in pairs_list_1)
    for p in pairs_list_2:
        if p[0] in mapping:
            mapping[p[0]] += p[1]
        else:
            mapping[p[0]] = 1
    return mapping.items()
Style #8 Main Characteristics
⊳ Two key abstractions: map(f, chunks) and reduce(g, results)
Map-Reduce Style
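The two abstractions can be exercised end to end in a short program. This is a minimal Python 3 sketch, not the author's code: `reduce` comes from `functools` (it is no longer a builtin in Python 3), the empty list passed as the initial value plays the role of `splits.insert(0, [])`, and the `term_frequency` wrapper, the stop-word parameter, and the line-based `partition` are assumptions.

```python
import re
from functools import reduce

def partition(text, nlines):
    """Yield chunks of at most `nlines` lines each."""
    lines = text.splitlines()
    for i in range(0, len(lines), nlines):
        yield "\n".join(lines[i:i + nlines])

def split_words(chunk, stop_words=frozenset()):
    """Map step: turn one chunk into a list of (word, 1) pairs."""
    words = re.findall(r"[a-z]{2,}", chunk.lower())
    return [(w, 1) for w in words if w not in stop_words]

def count_words(pairs_1, pairs_2):
    """Reduce step: merge two lists of (word, count) pairs, summing counts."""
    mapping = dict(pairs_1)
    for word, n in pairs_2:
        mapping[word] = mapping.get(word, 0) + n
    return list(mapping.items())

def term_frequency(text, stop_words=frozenset(), nlines=200):
    chunks = partition(text, nlines)
    splits = map(lambda chunk: split_words(chunk, stop_words), chunks)
    counted = reduce(count_words, splits, [])
    return sorted(counted, key=lambda pair: pair[1], reverse=True)

sample = "the cat\nthe dog\nthe dog ran"
for word, count in term_frequency(sample, stop_words={"the"}, nlines=1)[:25]:
    print(word, "-", count)
```

Because `split_words` looks at one chunk at a time and shares nothing with other calls, the map step could be distributed across workers without changing the reduce step.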
STYLE #9
@cristalopes #style9 name
# Main
connection = sqlite3.connect(':memory:')
create_db_schema(connection)
load_file_into_database(sys.argv[1], connection)
# Now, let's query
c = connection.cursor()
c.execute("SELECT value, COUNT(*) as C FROM words GROUP BY value ORDER BY C DESC")
for i in range(25):
    row = c.fetchone()
    if row is not None:
        print row[0] + ' - ' + str(row[1])
connection.close()

def create_db_schema(connection):
    c = connection.cursor()
    c.execute('''CREATE TABLE documents(id INTEGER PRIMARY KEY AUTOINCREMENT, name)''')
    c.execute('''CREATE TABLE words(id, doc_id, value)''')
    c.execute('''CREATE TABLE characters(id, word_id, value)''')
    connection.commit()
    c.close()

# Body of load_file_into_database (fragment; lines ending in … are cut off in the source)
    # Now let's add data to the database
    # Add the document itself to the database
    c = connection.cursor()
    c.execute("INSERT INTO documents (name) VALUES (?)", (path_to_f…
    c.execute("SELECT id from documents WHERE name=?", (path_to_fil…
    doc_id = c.fetchone()[0]
    # Add the words to the database
    c.execute("SELECT MAX(id) FROM words")
    row = c.fetchone()
    word_id = row[0]
    if word_id is None:
        word_id = 0
    for w in words:
        c.execute("INSERT INTO words VALUES (?, ?, ?)", (word_id, d…
        # Add the characters to the database
        char_id = 0
        for char in w:
            c.execute("INSERT INTO characters VALUES (?, ?, ?)", (c…
            char_id += 1
        word_id += 1
    connection.commit()
    c.close()
Style #9 Main Characteristics
⊳ Entities and relations between them
⊳ Query engine
• Declarative queries
Tabular Style
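The entities, relations, and declarative query can be tried with an in-memory SQLite database. This is a minimal Python 3 sketch under stated assumptions: `load_text_into_database` and `top_terms` are hypothetical names, the text and stop words are passed in directly, the `characters` table is left out for brevity, and the alphabetical tie-break in `ORDER BY` is my addition to make the output deterministic.

```python
import re
import sqlite3

def create_db_schema(connection):
    c = connection.cursor()
    c.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY AUTOINCREMENT, name)")
    c.execute("CREATE TABLE words (id, doc_id, value)")
    connection.commit()
    c.close()

def load_text_into_database(name, text, stop_words, connection):
    """Insert one document row and one row per non-stop word."""
    words = [w for w in re.findall(r"[a-z]{2,}", text.lower())
             if w not in stop_words]
    c = connection.cursor()
    c.execute("INSERT INTO documents (name) VALUES (?)", (name,))
    doc_id = c.lastrowid  # id generated for the row just inserted
    c.executemany("INSERT INTO words VALUES (?, ?, ?)",
                  [(word_id, doc_id, w) for word_id, w in enumerate(words)])
    connection.commit()
    c.close()

def top_terms(connection, limit=25):
    """The counting and sorting are stated declaratively, in SQL."""
    c = connection.cursor()
    c.execute("SELECT value, COUNT(*) AS c FROM words "
              "GROUP BY value ORDER BY c DESC, value LIMIT ?", (limit,))
    rows = c.fetchall()
    c.close()
    return rows

connection = sqlite3.connect(":memory:")
create_db_schema(connection)
load_text_into_database("sample", "The cat saw the dog; the dog ran", {"the"}, connection)
for word, count in top_terms(connection):
    print(word, "-", count)
connection.close()
```

Here the program only describes the data and the question; grouping, counting, and ordering are left to the query engine.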
Take Home
⊳ Many ways of solving problems
• Know them, assess them
⊳ Constraints are important for communication
• Make them explicit
⊳ Don’t be hostage to one way of doing things
