Functions and modules
       Karin Lagesen

  karin.lagesen@bio.uio.no
Homework:
            TranslateProtein.py
●   Input files are in
    /projects/temporary/cees-python-course/Karin
      ●   translationtable.txt - tab separated
      ●   dna31.fsa
●   Script should:
      ●   Open the translationtable.txt file and read it into a
          dictionary
      ●   Open the dna31.fsa file and read the contents.
      ●   Translate the DNA into protein using the dictionary
      ●   Print the translation in fasta format to the file
          TranslateProtein.fsa. Each protein line should be 60
          characters long.
Modularization
●   Programs can get big
●   Risk of doing the same thing many times
●   Functions and modules encourage
     ●   re-usability
     ●   readability
     ●   maintainability
Functions
●   Most common way to modularize a
    program
●   Takes values as parameters, executes
    code on them, returns results
●   Python also has built-in functions:
     ●   open(filename, mode)
     ●   sum([list of numbers])
●   These do something with their parameters
    and return the results
Functions – how to define
    def FunctionName(param1, param2, ...):

          """ Optional Function desc (Docstring) """

          FUNCTION CODE ...

          return DATA



●   keyword def – says this is a function
●   functions need names
●   parameters are optional, but common
●   docstring is useful, but not mandatory
●   FUNCTION CODE does something
●   keyword return – hands the results back to the caller
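
As a concrete instance of the template above, here is a minimal sketch of a complete function with a docstring. The name gc_content and what it computes are illustrative assumptions, not part of the course material, and the sketch is written for Python 3 (print is a function there).

```python
def gc_content(dna):
    """Return the fraction of G and C bases in a DNA string."""
    if not dna:
        # Avoid dividing by zero for an empty sequence
        return 0.0
    dna = dna.upper()
    gc_count = dna.count("G") + dna.count("C")
    return gc_count / len(dna)


# The docstring is stored on the function and is what help(gc_content) shows
print(gc_content.__doc__)
print(gc_content("ATGC"))
```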
Function example
     >>> def hello(name):
     ...     results = "Hello World to " + name + "!"
     ...     return results
     ...
     >>> hello()
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     TypeError: hello() takes exactly 1 argument (0 given)
     >>> hello("Lex")
     'Hello World to Lex!'
     >>>



●   Task: make script from this – take name
    from command line
●   Print results to screen
Function example script
import sys

def hello(name):
  results = "Hello World to " + name + "!"
  return results

name = sys.argv[1]
functionresult = hello(name)
print functionresult




[karinlag@freebee]% python hello.py
Traceback (most recent call last):
  File "hello.py", line 8, in ?
   name = sys.argv[1]
IndexError: list index out of range
[karinlag@freebee]% python hello.py Lex
Hello World to Lex!
[karinlag@freebee]%
Returning values
●   Returning is not mandatory; if there is no return,
    None is returned by default
●   Can return more than one value – the results
    are returned as a tuple

               >>> def test(x, y):
               ...     a = x*y
               ...     return x, a
               ...
               >>> test(1,2)
               (1, 2)
               >>>
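
The two points about return values can be sketched as follows. The function names min_max and log_message are made up for illustration, and the code assumes Python 3.

```python
def min_max(numbers):
    """Return the smallest and largest value as a tuple."""
    return min(numbers), max(numbers)


def log_message(text):
    """Print a message; there is no return statement."""
    print("LOG: " + text)


# A returned tuple can be unpacked straight into separate variables
lowest, highest = min_max([4, 1, 9])
print(lowest, highest)

# A function without a return statement gives back None
result = log_message("done")
print(result is None)
```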
Function scope
●   Variables defined inside a function can
    only be seen there!
●   To use the value of a variable defined
    inside a function elsewhere: return it
Scope example
>>> def test(x):
...     z = 10
...     print "the value of z is " + str(z)
...     return x*2
...
>>> z = 50
>>> test(3)
the value of z is 10
6
>>> z
50
>>> x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
>>>
Parameters
●   Functions can take parameters – not
    mandatory
●   Arguments are matched to parameters in the
    order in which they are given
    >>> def test(x, y):
    ...     print x*2
    ...     print y + str(x)
    ...
    >>> test(2, "y")
    4
    y2
    >>> test("y", 2)
    yy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>>
Named parameters
●   Arguments can also be given by name; the
    order then no longer matters
    >>> def test(x, y):
    ...     print x*2
    ...     print y + str(x)
    ...
    >>> test(2, "y")
    4
    y2
    >>> test("y", 2)
    yy
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in test
    TypeError: unsupported operand type(s) for +: 'int' and 'str'
    >>> test(y="y", x=2)
    4
    y2
    >>>
Default parameters
●   Parameters can be given a default value
●   With a default, the parameter does not have to
    be specified; the default will be used
●   Can still give the parameter by name in the call
>>> def hello(name = "Everybody"):
...     results = "Hello World to " + name + "!"
...     return results
...
>>> hello("Anna")
'Hello World to Anna!'
>>> hello()
'Hello World to Everybody!'
>>> hello(name = "Annette")
'Hello World to Annette!'
>>>
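
Defaults are handy for settings that rarely change, such as the line width of fasta output. A minimal sketch, assuming Python 3; wrap_sequence is a hypothetical helper, not a function from the course.

```python
def wrap_sequence(sequence, width=60):
    """Split a sequence string into lines of at most `width` characters."""
    lines = []
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return lines


protein = "M" * 130
print(len(wrap_sequence(protein)))             # default width is used
print(len(wrap_sequence(protein, 50)))         # width given by position
print(len(wrap_sequence(protein, width=100)))  # width given by name
```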
Exercise
TranslateProteinFunctions.py
●   Use script from homework
●   Create the following functions:
     ●   get_translation_table(filename)
           –   return dict with codons and protein codes
     ●   read_dna_string(filename)
           –   return tuple with (descr, DNA_string)
     ●   translate_protein(dictionary, DNA_string)
           –   return the protein version of the DNA string
     ●   pretty_print(descr, protein_string, outname)
           –   write result to outname in fasta format
TranslateProteinFunctions.py
import sys



YOUR CODE GOES HERE!!!!

translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile      = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)
get_translation_table
def get_translation_table(translationtable):
  # open the file named by the parameter
  fh = open(translationtable, 'r')
  trans_dict = {}
  for line in fh:
     # each line holds a codon and its amino acid, tab separated
     codon = line.split()[0]
     aa = line.split()[1]
     trans_dict[codon] = aa
  fh.close()
  return trans_dict
read_dna_string

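def read_dna_string(fastafile):
  fh = open(fastafile, "r")
  # first line is the header: drop the leading ">" and the newline
  line = fh.readline()
  header_line = line[1:].rstrip("\n")

  # the remaining lines hold the sequence; join them into one string
  seq = ""
  for line in fh:
     seq += line.rstrip("\n")
  fh.close()
  return (header_line, seq)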
translate_protein

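def translate_protein(translation_dict, DNA_string):
  aa_seq = ""

  # step through the DNA one codon (three bases) at a time;
  # len - 2 includes the last complete codon and skips a partial one
  for i in range(0, len(DNA_string)-2, 3):
     codon = DNA_string[i:i+3]
     one_letter = translation_dict[codon]
     aa_seq += one_letter
  return aa_seq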
pretty_print

def pretty_print(description, protein_string, outfile):
  fh = open(outfile, "w")
  fh.write(">" + description + "\n")

  for i in range(0, len(protein_string), 60):
     fh.write(protein_string[i:i+60] + "\n")
  fh.close()
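
One benefit of splitting the script into functions is that each step can be checked on its own with small hand-made input, without any files. The sketch below assumes Python 3. It uses a four-codon fragment of the standard genetic code as the table, and write_fasta is a hypothetical variant of pretty_print that takes an open file-like object so the output can be captured in memory.

```python
import io


def translate_protein(translation_dict, dna_string):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    aa_seq = ""
    for i in range(0, len(dna_string) - 2, 3):
        aa_seq += translation_dict[dna_string[i:i + 3]]
    return aa_seq


def write_fasta(handle, description, sequence, width=60):
    """Write one fasta record to an open file-like object."""
    handle.write(">" + description + "\n")
    for i in range(0, len(sequence), width):
        handle.write(sequence[i:i + width] + "\n")


# Tiny hand-made table: a fragment of the standard genetic code
small_table = {"ATG": "M", "GCC": "A", "AAA": "K", "TAA": "*"}

protein = translate_protein(small_table, "ATGGCCAAATAA")
buffer = io.StringIO()  # in-memory stand-in for an output file
write_fasta(buffer, "test_seq", protein, width=2)
print(buffer.getvalue())
```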
Modules
●   A module is a file with functions, constants
    and other code in it
●   Module name = filename without .py
●   Can be used inside another program
●   Needs to be imported into the program
●   Lots of modules come with Python: sys, os, os.path....
●   Can also create your own
Using module
●   One of two import statements:
         1: import modulename
         2: from modulename import function/constant
●   If method 1:
     ●   modulename.function(arguments)
●   If method 2:
     ●   function(arguments) – module name not
         needed
     ●   beware of function name collision
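
The two import styles, and the name collision to beware of, can be sketched as follows. The standard library module math is used here only as a convenient example, and the code assumes Python 3.

```python
import math
from math import sqrt

# Method 1: the module name is part of every call
print(math.floor(2.7))

# Method 2: the imported name is used directly
print(sqrt(16))


def sqrt(value):
    """A local function that reuses the name sqrt (name collision)."""
    return "not a number: " + str(value)


# The later definition has replaced the imported sqrt ...
print(sqrt(16))
# ... but the module-qualified name still reaches the original
print(math.sqrt(16))
```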
Operating system modules – os and os.path
●   Modules dealing with files and operating
    system interaction
●   Commonly used functions:
     ●   os.getcwd() – get working directory
     ●   os.chdir(path) – change working directory
     ●   os.listdir(path) – get a list of all files and
         directories in the directory path
     ●   os.mkdir(path) – create directory
     ●   os.path.join(dirname, dirname/filename...) – join
         the parts into one path
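
A short sketch of these functions working together, assuming Python 3. It works inside a temporary scratch directory (from the standard tempfile module) so that nothing is left behind; the directory and file names are made up.

```python
import os
import tempfile

start_dir = os.getcwd()  # remember where we started

with tempfile.TemporaryDirectory() as base:  # scratch directory, removed afterwards
    new_dir = os.path.join(base, "sequences")
    os.mkdir(new_dir)

    # Create an empty file so that listdir has something to report
    fasta_path = os.path.join(new_dir, "example.fsa")
    open(fasta_path, "w").close()

    os.chdir(new_dir)
    entries = os.listdir(".")
    print(entries)

    os.chdir(start_dir)  # go back before the scratch directory is removed
```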
Your own modules
●   Three steps:
       1. Create file with functions in it. Module
       name is same as filename without .py
       2. In other script, do
         import modulename
       3. In other script, use function like this:
         modulename.functionname(args)
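
The three steps can be tried out in one go with the sketch below, assuming Python 3. Normally the module file simply sits next to the script; here it is written to a temporary directory that is added to sys.path so the example cleans up after itself. The module name seqtools_demo and its reverse function are made up.

```python
import os
import sys
import tempfile

module_source = '''
def reverse(sequence):
    """Return the sequence backwards."""
    return sequence[::-1]
'''

with tempfile.TemporaryDirectory() as module_dir:
    # Step 1: create a file with a function in it
    with open(os.path.join(module_dir, "seqtools_demo.py"), "w") as fh:
        fh.write(module_source)

    # Python only finds modules in directories listed in sys.path
    sys.path.insert(0, module_dir)
    try:
        # Step 2: import the module (file name without .py)
        import seqtools_demo
        # Step 3: use the function through the module name
        reversed_seq = seqtools_demo.reverse("ATGC")
        print(reversed_seq)
    finally:
        sys.path.remove(module_dir)
```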
Separating module use and main use
●   Files containing python code can be:
     ●   script file
     ●   module file
●   Module functions can be used in scripts
●   But: modules can also be scripts
●   Question is – how do you know if the code
    is being run as a script itself or has been
    imported as a module by another script?
Module use / main use
●   When a file is run as a script, within that
    file a variable called __name__ will be
    set to the string "__main__"
●   When the file is imported instead, __name__
    is set to the module name
●   Can test on this string to see if this file is
    being run as a script
●   Benefit: can define functions in a script that
    can be used in module mode later
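
The value of __name__ in both situations can be observed with the sketch below, assuming Python 3. A tiny file that only records its own __name__ is first run the way "python name_demo.py" would run it (via the standard runpy module) and then imported. The file name name_demo.py is made up.

```python
import os
import runpy
import sys
import tempfile

# A tiny file that simply records the value of __name__ it sees
source = "seen_name = __name__\n"

with tempfile.TemporaryDirectory() as demo_dir:
    path = os.path.join(demo_dir, "name_demo.py")
    with open(path, "w") as fh:
        fh.write(source)

    # Run the file as a script; run_path returns the file's global variables
    as_script = runpy.run_path(path, run_name="__main__")["seen_name"]

    # Import the same file as a module
    sys.path.insert(0, demo_dir)
    try:
        import name_demo
        as_module = name_demo.seen_name
    finally:
        sys.path.remove(demo_dir)

print(as_script)
print(as_module)
```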
Module mode / main mode
import sys



<code as before>
# When this file is used (run or imported),
# the code below will always run, no matter what!
translationtable = sys.argv[1]
fastafile = sys.argv[2]
outfile      = sys.argv[3]

translation_dict = get_translation_table(translationtable)
description, DNA_string = read_dna_string(fastafile)
protein_string = translate_protein(translation_dict, DNA_string)
pretty_print(description, protein_string, outfile)
Module use / main use
# this is a script
import sys
import TranslateProteinFunctions

description, DNA_string = TranslateProteinFunctions.read_dna_string(sys.argv[1])
print description


[karinlag@freebee]% python modtest.py dna31.fsa
Traceback (most recent call last):
  File "modtest.py", line 2, in ?
   import TranslateProteinFunctions
  File "TranslateProteinFunctions.py", line 44, in ?
   fastafile = sys.argv[2]
IndexError: list index out of range
[karinlag@freebee]Karin%
TranslateProteinFunctions.py with main
import sys



<code as before>
if __name__ == "__main__":
      translationtable = sys.argv[1]
      fastafile = sys.argv[2]
      outfile      = sys.argv[3]

      translation_dict = get_translation_table(translationtable)
      description, DNA_string = read_dna_string(fastafile)
      protein_string = translate_protein(translation_dict, DNA_string)
      pretty_print(description, protein_string, outfile)
ConcatFasta.py
●   Create a script that has the following:
      ●   function get_fastafiles(dirname)
            –   gets all the files in the directory, checks if they are
                fasta files (end in .fsa), returns list of fasta files
            –   hint: you need os.path to join the directory name
                and each file name into a usable path
      ●   function concat_fastafiles(filelist, outfile)
            –   takes a list of fasta files, opens and reads each of
                them, writes them to outfile
      ●   if __name__ == "__main__":
            –   do what needs to be done to run script
●   Remember imports!
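
One possible shape for this script is sketched below, assuming Python 3. It is not the course's reference solution: the main function, the command line layout (directory first, output file second) and the usage message are assumptions made for the sketch.

```python
import os
import sys


def get_fastafiles(dirname):
    """Return the paths of all files in dirname that end in .fsa."""
    fastafiles = []
    for name in sorted(os.listdir(dirname)):
        path = os.path.join(dirname, name)
        if name.endswith(".fsa") and os.path.isfile(path):
            fastafiles.append(path)
    return fastafiles


def concat_fastafiles(filelist, outfile):
    """Write the contents of every file in filelist to outfile."""
    with open(outfile, "w") as out:
        for filename in filelist:
            with open(filename, "r") as fh:
                contents = fh.read()
            out.write(contents)
            # Make sure the next record starts on a new line
            if contents and not contents.endswith("\n"):
                out.write("\n")


def main(argv):
    """Hypothetical command line: ConcatFasta.py <directory> <outfile>."""
    dirname, outfile = argv[1], argv[2]
    concat_fastafiles(get_fastafiles(dirname), outfile)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: ConcatFasta.py <directory> <outfile>")
    else:
        main(sys.argv)
```

The file list is built before the output file is opened, but on a second run an output file ending in .fsa inside the same directory would be picked up as input, so it is safer to write the result elsewhere or give it a different extension.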

