Bioinformatica p1-perl-introduction
FBW
             01-10-2012




Wim Van Criekinge
Bioinformatics.be

                    • Communicating practical matters: where and
                      when the classes take place
                    • Making course material available
                    • Supplementary educational material (FAQ, web links)
                    • Practical assignments and program code

                    Advantages

                    • Using web technology while working through
                      the course
                    • Many questions and answers are of interest to
                      several people, which avoids recurring
                      questions
                    • Ongoing discussion (throughout the year) among
                      students, the professor, and also thesis and
                      PhD students
Practical sessions

            • How are the practical sessions organised?
              – A 45-minute introduction to the editor,
                programming language and websites used
              – 15 minutes of explanation of the assignments
              – Normally in PC room D (check bioinformatics.be!)


                                  Perl for Bioinformatics

                                      Part 1: Beginning

                                      Part 2: Mastering
Bioinformatics practical

                 • Practical
                     – Introduction to Perl
                     – Write your first Perl program!
                     – Execute your first.pl
What is Perl ?

                 • Perl is a high-level scripting language
                 • Larry Wall created Perl in 1987
                     – Practical Extraction and Report
                       Language
                     – (or Pathologically Eclectic Rubbish Lister)
                 •   Born from a system administration tool
                 •   Faster than sh or csh
                 •   Slower than C
                 •   No need for sed, awk, tr, wc, cut, …
                 •   Perl is open and free
                 • http://conferences.oreillynet.com/eurooscon/
What is Perl ?


                 • Perl is available for most computing
                   platforms: all flavors of UNIX
                   (Linux), MS-DOS/Win32, Macintosh, VMS,
                   OS/2, Amiga, AS/400, Atari
                 • Perl is a computer language that is:
                    – Interpreted: compiled at run time (so you
                      need perl.exe!)
                    – Loosely “typed”
                    – String/text oriented
                    – Capable of using multiple syntax formats
                 • In Perl, “there's more than one way to do it”
Why use Perl for bioinformatics ?

                     • Ease of use by novice programmers
                     • Flexible language: Fast software prototyping (quick
                       and dirty creation of small analysis programs)
                     • Expressiveness. Compact code, Perl Poetry:
                       @{$_[$#_]||[]}
                     • Glutility: Read disparate files and parse the relevant
                       data into a new format
                     • Powerful pattern matching via “regular expressions”
                       (Best Regular Expressions on Earth)
                     • With the advent of the WWW, Perl has become the
                       language of choice to create Common Gateway
                       Interface (CGI) scripts to handle form submissions
                       and create compute servers on the WWW.
                     • Open Source – Free. Availability of Perl modules
                       for Bioinformatics and Internet.
Why NOT use Perl for bioinformatics ?

                      • Some tasks are still better done with other
                        languages (heavy computations / graphics)
                           – C(++), C#, Fortran, Java (Pascal, Visual Basic)

                      • With Perl you can write simple programs
                        fast, and it also copes with large and
                        complex programs. Still, it is less suited
                        to very large projects.
                           – Python

                      • Larry Wall: “For programmers, laziness is
                        a virtue”
What bioinformatics tasks are suited to Perl ?


                     • Sequence manipulation and analysis
                     • Parsing results of sequence analysis
                       programs (BLAST, Genscan, HMMER, etc.)
                     • Parsing database (e.g. GenBank) files
                     • Obtaining multiple database entries
                       over the internet
                     • …
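As a small illustration of sequence manipulation, the sketch below computes the reverse complement of a DNA string. The sub name reverse_complement and the example sequence are made up for this illustration, and the code assumes the input contains only the letters A, C, G and T.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the reverse complement of a DNA sequence.
# Assumes the input contains only A, C, G, T (upper or lower case).
sub reverse_complement {
    my ($dna) = @_;
    my $rc = reverse $dna;          # reverse the string (scalar context)
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # swap each base for its complement
    return $rc;
}

print reverse_complement("ATGC"), "\n";   # prints GCAT
```

The tr operator replaces characters one for one, which is why no regular expression is needed here.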
Examples of problems we will be solving


                   • Primary Sequence analysis
                   • Perform alignments
                   • Simulation experiments to explain
                     Blast statistics
                   • Predicting protein topology
                   • Predicting secondary structures
                   • “Real-life” problems
                       – Proteomics: given amino acid masses, find
                         the protein in a database
                       –…
Perl installation


                    • Perl (on USB):
                      – Perl is available for various operating systems. To
                        download Perl and install it on your computer, have a
                        look at the following resources:
                      – www.perl.com (O'Reilly).
                          • Downloading Perl Software
                      – ActiveState. ActivePerl for Windows, as well as for
                        Linux and Solaris.
                          • ActivePerl binary packages.
                      – CPAN



                    • http://www.bioinformatics.be/new/faq/setup/
Check installation


                     • Command-line flags for perl
                       – perl -v
                          • Gives the current version of Perl
                       – perl -e
                          • Executes Perl statements from the command
                            line.
                             – perl -e "print 42;"
                             – perl -e "print \"Two\nlines\n\";"
                       – perl -we
                          • Executes and prints warnings
                             – perl -we "print 'hello';$x++;"
How to enter your first program ?


                    • Use an editor
                        – DOS: EDIT
                        – Windows:
                             • NOTEPAD (careful!)
                             • Word(Pad) -> save as TEXT FILE
                        – SciTE:
                          http://www.scintilla.org/SciTE.html
                        – Textpad
                        – Others
                             • VIM
                             • Eclipse
Brief Introduction to Subdirectories—The Path




                     Path:
                            Route followed by OS to
                                 locate, save, and/or
                                 retrieve a file
The absolute path problem …

                 • Problem
                     – Either you cannot start perl
                     – Or the script cannot be found
                     – Or a file needed by the script cannot be
                       found
                 • Solution
                     – Don't panic!
                     – Use absolute path names
                         • D:\Perl\bin\perl.exe D:\temp\Test.pl
                     – Note that inside your script you must “escape” the backslash
                         • $filename = "d:\\Temp\\pdb.fasta"
• Solutions (II)
   – Copy all the files into the same directory!
   – So if you start perl from D:\Perl\bin with perl,
     you can still refer to D:\Temp\test.pl, but
     then $filename must also use the absolute
     path, or else you must copy pdb.fasta
     to D:\Perl\bin
   – Adjust the search path so that you can start perl
     from anywhere
      • path (shows the search path)
      • set path (changes the path, be careful!). Use the
        DOS environment variable %path% to add a
        directory
      • set path=%path%;d:\Perl\bin
      • (afterwards you can check the change by running
        “path”)
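A minimal sketch of how a Windows path can be written inside a Perl script, and how to check that the file is really there before using it. The path d:\Temp\pdb.fasta is taken from the slide; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

# Three ways to write the same Windows path in Perl source code.
my $escaped = "d:\\Temp\\pdb.fasta";   # double quotes: each backslash is doubled
my $single  = 'd:\Temp\pdb.fasta';     # single quotes: \T and \p are kept literally
my $forward = "d:/Temp/pdb.fasta";     # forward slashes are generally accepted on Windows too

print "$escaped\n";                    # prints d:\Temp\pdb.fasta
print "same string\n" if $escaped eq $single;

# Check that the file exists before trying to open it.
if (-e $escaped) {
    print "found $escaped\n";
} else {
    print "cannot find $escaped, check the path\n";
}
```

The -e file test makes a wrong path visible right away, instead of failing later when the file is opened.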
Redirection


              Keyboard:
                 Standard input device

              Screen:
                 Standard output device

              Redirection . . .
                 changes output from monitor to
                  somewhere else (usually file or
                  printer).
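Seen from a Perl script, redirection could look like the sketch below. The file name results.txt is made up for this example.

```perl
use strict;
use warnings;

# Ordinary output goes to STDOUT; on the command line it can be redirected:
#     perl program.pl > results.txt
print "result line\n";

# Messages sent to STDERR still appear on the screen when STDOUT is redirected.
print STDERR "progress message\n";

# A script can also write to a file itself instead of relying on redirection.
# "results.txt" is a made-up file name for this example.
my $outfile = "results.txt";
open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
print $out "result line\n";
close($out);
```

Keeping progress messages on STDERR means that a redirected output file contains only the results.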
Textpad


          Minimal install: via Minerva save file
            textpad.be to your folder. Create
            system folder in the same location. In
            system folder save plumb.exe
            (Minerva) and perl syntax files
            (textpad.com)
          • Syntax Highlighting
            – Document Class
          • Launch Perl
            – Tools
Perl
General Remarks


                  • Perl is mostly a free format language: add
                    spaces, tabs or new lines wherever you
                    want.
                  • For clarity, it is recommended to write
                    each statement on a separate line, and to use
                    indentation in nested structures.
                  • Comments: Anything from the # sign to
                    the end of the line is a comment. (There
                    is no dedicated multi-line comment syntax.)
                  • A perl program consists of all of the Perl
                    statements of the file taken collectively as
                    one big routine to execute.
What does a real Perl program look like:
       #!/usr/local/bin/perl
                                               First line (on UNIX, needed to run the script directly)

       print "Hello everyone\n";
How to run it:

      1. Save the text of your code as a file -- program.pl
      2. Execute it:
                 perl program.pl

                                         Hello everyone
Three Basic Data Types


                  • Scalars - $
                  • Arrays of scalars - @
                  • Associative arrays of
                    scalars, or hashes - %
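A short sketch of the three data types side by side. The gene name and the complement table are only example values.

```perl
use strict;
use warnings;

# Scalar: a single value (number or string)
my $gene = "BRCA1";

# Array: an ordered list of scalars, indexed from 0
my @bases = ("A", "C", "G", "T");

# Hash: key/value pairs, looked up by key
my %complement = (A => "T", C => "G", G => "C", T => "A");

print "$gene\n";                 # prints BRCA1
print "$bases[0]\n";             # first element: A
print scalar(@bases), "\n";      # number of elements: 4
print "$complement{G}\n";        # value stored under key G: C
```

Note that a single element of an array or hash is itself a scalar, so it is written with $ rather than @ or %.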
2+2 = ?
       $a = 2;
       $b = 2;
       $c = $a + $b;

   $   - indicates a scalar variable
   ;   - ends every statement
   =   - assigns a value to a variable


                         or     $c = 2 + 2;
                         or     $c = 2 * 2;
                         or     $c = 2 / 2;
                         or     $c = 2 ** 4;         2 ** 4 is 2 to the power 4 = 16 (in Perl, ^ is bitwise XOR)
                         or     $c = 1.35 * 2 - 3 / (0.12 + 1);
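One detail worth checking by running it: in Perl the power operator is **, while ^ is the bitwise XOR operator. A small sketch with variable names chosen for this example:

```perl
use strict;
use warnings;

my $power = 2 ** 4;    # exponentiation: 16
my $xor   = 2 ^ 4;     # bitwise XOR of binary 010 and 100: 6
my $mixed = 1.35 * 2 - 3 / (0.12 + 1);   # * and / bind tighter than + and -

print "$power\n";          # prints 16
print "$xor\n";            # prints 6
printf "%.4f\n", $mixed;   # prints 0.0214
```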
Ok, $c is 4. How do we know it?

              $c = 4;
              print "$c";

  print command:

                            " "        - enclose the string to print

       print "Hello \n";


                             \n     - prints an end-of-line character
                                    (equivalent to pressing 'Enter')
String concatenation:
         print "Hello everyone\n";
         print "Hello" . " everyone" . "\n";
Expressions and strings together:
         print "2 + 2 = " . (2+2) . "\n";                              2 + 2 = 4

                                        (2+2) is evaluated as an expression
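The kind of quote matters when printing. A brief sketch, with variable names invented for this example, of double quotes, single quotes and the dot operator:

```perl
use strict;
use warnings;

my $c = 2 + 2;

my $interpolated = "c is $c\n";            # double quotes: $c and \n are interpreted
my $literal      = 'c is $c\n';            # single quotes: everything is taken literally
my $joined       = "c is " . $c . "\n";    # the dot operator concatenates

print $interpolated;                       # prints: c is 4
print $literal, "\n";                      # prints: c is $c\n
print "same\n" if $interpolated eq $joined;
```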
Loops and cycles (for statement):

      # Output all the numbers from 1 to 100
      for ($n=1; $n<=100; $n+=1) {
                print "$n \n";
      }
1. Initialization:
          for ( $n=1 ; ; ) { … }

2. Increment:
         for ( ; ; $n+=1 ) { … }

3. Condition (the loop keeps running while it is true):
        for ( ; $n<=100 ; ) { … }
4. Body of the loop - commands inside curly brackets:
         for ( ; ; ) { … }
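The three parts of the for statement can be seen at work in a loop that adds up the numbers 1 to 100. This is a sketch; the second form, with a range, is a common Perl alternative that the slides do not cover.

```perl
use strict;
use warnings;

# C-style loop: initialization; condition; increment
my $sum = 0;
for (my $n = 1; $n <= 100; $n += 1) {
    $sum += $n;
}
print "$sum\n";          # prints 5050

# The same loop written with a range
my $sum2 = 0;
for my $n (1 .. 100) {
    $sum2 += $n;
}
print "$sum2\n";         # prints 5050
```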
FOR & IF -- all the even numbers from 1 to 100:

       for ($n=1; $n<=100; $n+=1) {
                   if (($n % 2) == 0) {
                            print "$n\n";
                   }
       }


           Note: $a % $b -- Modulus
                         -- Remainder when $a is divided by $b
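A variation on the loop above, as a sketch: the even numbers are collected in an array (named @even for this example) instead of being printed one by one, so they can be counted or reused afterwards.

```perl
use strict;
use warnings;

my @even;
for (my $n = 1; $n <= 100; $n += 1) {
    if ($n % 2 == 0) {       # remainder 0 means $n is divisible by 2
        push @even, $n;      # collect instead of printing straight away
    }
}

print scalar(@even), " even numbers\n";   # prints: 50 even numbers
print "@even[0 .. 4]\n";                  # prints: 2 4 6 8 10
```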
Two brief diversions (warnings & strict)

                • use warnings – makes perl report suspicious
                  constructs (usage: use warnings;)

                • strict – forces you to 'declare' a variable the
                  first time you use it.
                    – usage: use strict; (somewhere near the top of
                      your script)
                • declare variables with 'my'
                    – usage: my $variable;
                    –    or: my $variable = 'value';
                • my sets the 'scope' of the variable. The variable
                  exists only within the current block of code
                • use strict and my both help you to debug
                  errors, and help prevent mistakes.
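A minimal sketch of what block scope means in practice. The variable name $count is chosen only for this example.

```perl
use strict;
use warnings;

my $count = 10;                  # declared at file level

{
    my $count = 99;              # a different variable, visible only in this block
    print "inside:  $count\n";   # prints inside:  99
}

print "outside: $count\n";       # prints outside: 10

# With "use strict", a misspelled name such as $cuont is a compile-time
# error ("Global symbol ... requires explicit package name") rather than
# a silent new variable.
```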
Unary Arithmetic Operators eg. Autoincrement ++

                   • If you place one of the auto operators before the variable, it is
                     known as a pre-incremented (pre-decremented) variable. Its
                     value will be changed before it is referenced. If it is placed
                     after the variable, it is known as a post-incremented (post-
                     decremented) variable and its value is changed after it is used

                   For example:
                   • $a = 5; # $a is assigned 5
                   • $b = ++$a; # $b is assigned the incremented value of $a, 6
                   • $c = $a--; # $c is assigned 6, then $a is decremented to 5

                   #!e:\perl\bin\perl.exe
                   $getal1 = 5;
                   print $getal1."\n";      # prints 5
                   print $getal1++."\n";    # prints 5, then $getal1 becomes 6
                   print ++$getal1."\n";    # $getal1 becomes 7, prints 7
Logical and Comparison operators


                  • Equal (True if $a is equal to $b)
                      – Numeric: ==
                      – String: eq




                  • And: &&
                  • Or: ||
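The difference between == and eq shows up as soon as two strings look different but represent the same number. A sketch with made-up values:

```perl
use strict;
use warnings;

my $x = "10";
my $y = "10.0";

my $numeric_equal = ($x == $y) ? 1 : 0;   # compared as numbers: 10 equals 10.0
my $string_equal  = ($x eq $y) ? 1 : 0;   # compared as text: "10" differs from "10.0"

print "numeric: $numeric_equal\n";   # prints numeric: 1
print "string:  $string_equal\n";    # prints string:  0

# Combining tests with && (and)
my $seq = "ATGCCC";
if (length($seq) % 3 == 0 && substr($seq, 0, 3) eq "ATG") {
    print "starts with ATG and is a whole number of codons\n";
}
```

For sequence data, which is text, eq is nearly always the operator you want.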
Shift operators

               • Shift operators are handy for
                 manipulations at the bit level, e.g. 40
                          256 128  64  32  16   8   4   2   1
                 40         0   0   0   1   0   1   0   0   0
                 40 >> 2    0   0   0   0   0   1   0   1   0   (= 10)
                 40 << 3    1   0   1   0   0   0   0   0   0   (= 320)

               Program
               $getal1 = 40;
               print "/4 ".($getal1 >> 2)."\n";
               print "*8 ".($getal1 << 3)."\n";
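To see the bits move, the numbers can be printed in binary with printf. A small sketch; the width of 9 digits is chosen to match the bit positions 256 down to 1.

```perl
use strict;
use warnings;

my $getal1 = 40;

# %09b prints a number in binary, padded with zeros to 9 digits
printf "%09b = %d\n", $getal1,      $getal1;        # 000101000 = 40
printf "%09b = %d\n", $getal1 >> 2, $getal1 >> 2;   # 000001010 = 10
printf "%09b = %d\n", $getal1 << 3, $getal1 << 3;   # 101000000 = 320
```

Shifting right by 2 divides by 4 and shifting left by 3 multiplies by 8, as long as no bits fall off the end.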
Text Processing Functions

                 The substr function
                 • Definition
                 • The substr function extracts a substring out of a
                   string and returns it. The function receives 3
                   arguments: a string value, a position on the string
                   (starting to count from 0) and a length.
                 Example:
                 • $a = "university";
                 • $k = substr ($a, 3, 5);
                 • $k is now "versi"; $a remains unchanged.
                 • If length is omitted, everything to the end of the
                   string is returned.
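The same call can be tried out directly, and it carries over to sequences. A sketch: the DNA string is made up, and splitting into codons is just one possible use of substr.

```perl
use strict;
use warnings;

my $word = "university";
my $part = substr($word, 3, 5);    # 5 characters starting at position 3
my $rest = substr($word, 3);       # no length: everything up to the end

print "$part\n";                   # prints versi
print "$rest\n";                   # prints versity

# The same function splits a DNA sequence into codons (made-up sequence).
my $dna = "ATGGCCTAA";
my @codons;
for (my $i = 0; $i + 3 <= length($dna); $i += 3) {
    push @codons, substr($dna, $i, 3);
}
print "@codons\n";                 # prints ATG GCC TAA
```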
Random

         #!c:\perl\bin\perl.exe -w
         #srand(time|$$);
         $x = rand(1);


         • srand
            – The default seed for srand, which used to be time, has
              been changed. Now it's a heady mix of difficult-to-predict
              system-dependent values, which should be sufficient for
              most everyday purposes. Previous to version
              5.004, calling rand without first calling srand would yield
              the same sequence of random numbers on most or all
              machines. Now, when perl sees that you're calling rand
              and haven't yet called srand, it calls srand with the default
              seed. You should still call srand manually if your code
              might ever be run on a pre-5.004 system, of course, or if
              you want a seed other than the default
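A seed other than the default is useful when a simulation has to be repeatable. The sketch below builds a random DNA sequence twice with the same seed; the sub name random_dna and the seed value 42 are arbitrary choices for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a random DNA sequence of the given length.
sub random_dna {
    my ($length) = @_;
    my @bases = ("A", "C", "G", "T");
    my $seq = "";
    for (1 .. $length) {
        $seq .= $bases[ int(rand(4)) ];   # rand(4) is in [0,4), int() gives 0..3
    }
    return $seq;
}

srand(42);                     # fixed seed (42 is arbitrary): repeatable runs
my $first = random_dna(20);

srand(42);                     # same seed again ...
my $second = random_dna(20);   # ... so the same sequence comes out

print "$first\n";
print "reproducible\n" if $first eq $second;
```

Without the srand calls, each run of the script would normally produce a different sequence.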
• Exercise: how good are the random
  numbers?

• If they are good, you can use them to
  compute Pi …

• A good random generator is
  important for producing good
  random sequences that we can later
  use in simulations
Compute Pi using two random numbers




[Figure: axes x and y, radius 1]
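One way to approach this exercise, assuming it refers to the usual Monte Carlo method: draw random points (x, y) in the unit square and count how many fall within distance 1 of the origin. That fraction approaches Pi/4. The sub name estimate_pi, the number of points and the seed are choices made for this sketch, not part of the original exercise.

```perl
use strict;
use warnings;

# Hypothetical helper: estimates Pi from $points random (x, y) pairs.
sub estimate_pi {
    my ($points) = @_;
    my $inside = 0;
    for (1 .. $points) {
        my $x = rand(1);                        # random x in [0,1)
        my $y = rand(1);                        # random y in [0,1)
        $inside++ if $x * $x + $y * $y <= 1;    # point lies within the quarter circle
    }
    return 4 * $inside / $points;               # the area ratio is Pi/4
}

srand(1);                                       # arbitrary fixed seed, for repeatable runs
my $pi = estimate_pi(200_000);
printf "Pi is approximately %.3f\n", $pi;
```

The estimate improves only slowly with more points, and a poor random generator tends to show up as an estimate that stays off.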
