2012 03 08_dbi
FBW
21-02-2008
RELOADED 2
Wim Van Criekinge
Three Basic Data Types


                  • Scalars - $
                  • Arrays of scalars - @
                  • Associative arrays of
                    scalars, or hashes - %
• [m]/PATTERN/[g][i][o]
• s/PATTERN/PATTERN/[g][i][e][o]
• tr/PATTERNLIST/PATTERNLIST/[c][d][s]
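The three operators listed above (match, substitute, transliterate) can be tried on a short string. This is a minimal sketch; the DNA string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical DNA string used only for illustration.
my $dna = "ATGCGTATGA";

# m// : count matches of a pattern (g = all matches, i = ignore case).
my $atg_count = () = $dna =~ /atg/gi;

# s/// : substitute; here every T becomes U (DNA -> RNA).
(my $rna = $dna) =~ s/T/U/g;

# tr/// : translate character by character; here the complement.
(my $complement = $dna) =~ tr/ACGT/TGCA/;

print "ATG occurs $atg_count times\n";
print "RNA:        $rna\n";
print "Complement: $complement\n";
```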
The ‘structure’ of a Hash

 • An array looks something like this:
                      0       1     2      Index
      @array =
                    'val1' 'val2' 'val3'   Value


 • A hash looks something like this:
                Rob          Matt    Joe_A         Key (name)
%phone =
             353-7236 353-7122 555-1212            Value
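In Perl source, the array and the hash drawn above could be written as follows. This is a small sketch that reuses the values from the slide.

```perl
use strict;
use warnings;

# An array is indexed by position, starting at 0.
my @array = ('val1', 'val2', 'val3');

# A hash is indexed by key; the keys and values mirror the slide.
my %phone = (
    Rob   => '353-7236',
    Matt  => '353-7122',
    Joe_A => '555-1212',
);

print "Second element: $array[1]\n";
print "Matt's number:  $phone{Matt}\n";

# Hash keys come back in no particular order, so sort them for stable output.
for my $name (sort keys %phone) {
    print "$name => $phone{$name}\n";
}
```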
Subroutine

       $a=5;
       $b=9;
       $sum=Optellen($a,$b);   # "optellen" is Dutch for "to add"
       print "The SUM is $sum\n";
       sub Optellen
       {
         # the arguments arrive in @_
         $d=$_[0];
         $e=$_[1];
         #alternatively we could do this: my($a,$b)=@_;
         my($answer)=$d+$e;
         return $answer;
       }
Overview

           • Advanced data structures in Perl
           • Object-oriented Programming in Perl
           • BioPerl: a large collection of Perl
             software for bioinformatics
           • Motivation:
             – Even a simple extension such as multi-line
               parsing is more difficult than expected
           • Goal: to make software modular,
             easier to maintain, more reliable, and
             easier to reuse
Multi-line parsing
                     use strict;
                     use Bio::SeqIO;

                     my $filename="sw.txt";
                     my $sequence_object;

                     my $seqio = Bio::SeqIO -> new (
                                        '-format' => 'swiss',
                                        '-file' => $filename
                                        );

                     while ($sequence_object = $seqio -> next_seq) {
                         my $sequentie = $sequence_object-> seq();
                         print $sequentie."\n";
                     }
Perl OO

          • A class is a package

          • An object is a reference to a data
            structure (usually a hash) in a class

          • A method is a subroutine in the class
Perl Classes

               • Modules/Packages
                 – A Perl module is a file that uses a package
                   declaration
                 – Packages provide a separate namespace for
                   different parts of a program
                 – A namespace protects the variables of one part of
                   a program from unwanted modification by
                   another part of the program
                 – The module must always have a last line that
                   evaluates to true, e.g. 1;
                 – The module must be in a “known” directory, i.e. one
                   listed in @INC (which can be extended through the
                   PERL5LIB environment variable)
                    • E.g. … site/lib/bio/Sequentie.pm
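The three definitions (class = package, object = blessed reference, method = subroutine) fit in a few lines. The sketch below borrows the name Sequentie from the slide; the fields and the seq_length method are invented here for illustration and are not taken from any real module.

```perl
use strict;
use warnings;

# A class is a package.
package Sequentie;

# Constructor: an object is a reference (here to a hash) blessed into the class.
sub new {
    my ($class, %args) = @_;
    my $self = {
        id  => $args{id},
        seq => $args{seq},
    };
    return bless $self, $class;
}

# A method is a subroutine in the class; it receives the object as first argument.
sub seq_length {
    my ($self) = @_;
    return length($self->{seq});
}

package main;

my $obj = Sequentie->new(id => 'demo1', seq => 'ATGCGT');
print ref($obj), " ", $obj->{id}, " has length ", $obj->seq_length, "\n";

# If the package lived in its own file Sequentie.pm, that file would
# have to end with a true value, e.g. the line: 1;
```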
Installation on Windows (ActiveState)


                       • Using PPM shell to install BioPerl
                             – Get the number of the BioPerl repository:
                                        – PPM>repository
                             – Set the BioPerl repository, find BioPerl, install
                               BioPerl:
                                        – PPM>repository set <BioPerl repository number>
                                        – PPM>search *
                                        – PPM>install <BioPerl package number>
                       • Download BioPerl in archive form from
                             – http://www.BioPerl.org/Core/Latest/index.shtml
                             – Use WinZip to uncompress and install
Directory Structure

                  • BioPerl directory structure organization:
                      – Bio/       BioPerl modules
                      – models/ UML for BioPerl classes
                      – t/        Perl built-in tests
                      – t/data/   Data files used for the tests
                      – scripts/ Reusable scripts that use BioPerl
                      – scripts/contributed/ Contributed scripts not
                        necessarily integrated into BioPerl.
                      – doc/      "How To" files and the FAQ as XML
Live.pl

          #!e:\Perl\bin\perl.exe -w
          # script for looping over genbank entries, printing out name
          use Bio::DB::GenBank;
          use Data::Dumper;

          $gb = Bio::DB::GenBank->new();

          $sequence_object = $gb->get_Seq_by_id('MUSIGHBA1');
          print Dumper ($sequence_object);

          $seq1_id = $sequence_object->display_id();
          $seq1_s = $sequence_object->seq();
          print "seq1 display id is $seq1_id \n";
          print "seq1 sequence is $seq1_s \n";
File converter

#!/opt/perl/bin/perl -w
#genbank_to_fasta.pl
use Bio::SeqIO;
# read GenBank records from the file named on the command line
my $input = Bio::SeqIO->new('-file'   => $ARGV[0],
                            '-format' => 'GenBank');
# write them out again in FASTA format
my $output = Bio::SeqIO->new('-file'   => '>output.fasta',
                             '-format' => 'Fasta');

while (my $seq = $input->next_seq()){
  $output->write_seq($seq);
}
• Bptutorial.pl

• It includes the written tutorial as well
  as runnable scripts

• 2 ESSENTIAL TOOLS
  – Data::Dumper to find out which class an
    object belongs to
  – perl bptutorial.pl 100 Bio::Seq to find the
    available methods for that class
Exercise 1

Run Needleman-Wunsch-monte-carlo.pl

–     my $MATCH    = 1; # +1 for letters that match
–     my $MISMATCH = -1; # -1 for letters that mismatch
–     my $GAP      = -1; # -1 for any gap

 Score (-64)

 Score = f($MATCH,$MISMATCH,$GAP)

f?
Implement convergence criteria
Store in DATABASE, make graphs in Excel
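One possible reading of Score = f($MATCH,$MISMATCH,$GAP) is a weighted count of the columns in a finished alignment. The sketch below scores two already-aligned strings; score_alignment is a hypothetical helper and is not part of Needleman-Wunsch-monte-carlo.pl, which also has to find the alignment in the first place.

```perl
use strict;
use warnings;

# Scoring parameters as on the slide.
my $MATCH    = 1;
my $MISMATCH = -1;
my $GAP      = -1;

# Score two aligned strings of equal length, with '-' marking a gap.
sub score_alignment {
    my ($top, $bottom) = @_;
    die "aligned strings must have equal length\n"
        unless length($top) == length($bottom);
    my $score = 0;
    for my $i (0 .. length($top) - 1) {
        my $x = substr($top, $i, 1);
        my $y = substr($bottom, $i, 1);
        if    ($x eq '-' or $y eq '-') { $score += $GAP }
        elsif ($x eq $y)               { $score += $MATCH }
        else                           { $score += $MISMATCH }
    }
    return $score;
}

# 5 matches, 1 mismatch, 1 gap
print score_alignment("GATTACA", "GA-TTCA"), "\n";
```

Changing the three parameters and rescoring the same alignment shows how strongly the total depends on them.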
A Guide to MySQL & DBI
Objectives
• Start MySQL and learn how to use the MySQL
  Reference Manual

• Create a database

• Change (activate) a database

• Create tables using MySQL

• Create and run SQL commands in MySQL
Objectives (continued)
• Identify and use data types to define columns in
  tables

• Understand and use nulls

• Add rows to tables

• View table data

• Correct errors in a database
Successor to MySQL-Front
          • MySQL-Front was once one of the
            most popular MySQL management
            applications. What PHPMyAdmin is
            for web applications, MySQL-Front
            was for the desktop. Unfortunately,
            the author could not or would not
            continue, and the project was
            shut down.
          • In early April 2006 the original
            author decided to make the latest
            source code for MySQL-Front
            available under the name
            HeidiSQL, and the first beta is …
Starting MySQL
• Windows XP
   –   Click Start button
   –   Point to All Programs
   –   Point to MySQL on menu
   –   Point to MySQL Server 4.1
   – Click MySQL Command Line Client
• Must enter password in Command Line Client
  window
Obtaining Help in MySQL
• Type \h at the mysql> prompt

• Type “help” followed by name of command

  – help contents

  – help union
Creating a Database

• Must create a database before creating tables
• Use CREATE DATABASE command
• Include database name
Creating a Database (continued)
Changing the Default Database
• Default database: database to which all
  subsequent commands pertain
• USE command, followed by database name:
   – Changes the default database
   – Execute at the start of every session
Creating a Table

  • Describe the layout of each table in the
    database

  • Use CREATE TABLE command

  • TABLE is followed by the table name

  • Follow this with the names and data types of the
    columns in the table

  • Data types define type and size of data
Table and Column Name Restrictions



• Names cannot exceed 18 characters

• Must start with a letter

• Can contain letters, numbers, and underscores
  (_)

• Cannot contain spaces
Creating the REP Table
Entering Commands in MySQL
• Commands are free-format; no rules stating specific
  words in specific positions

• Press ENTER to move to the next line in a
  command

• Indicate the end of a command by typing a
  semicolon

• Commands are not case sensitive
Running SQL Commands
Editing SQL Commands
• Statement history: stores most recently used
  command

• Editing commands:
   –   Use arrow keys to move up, down, left, and right
   –   Use Ctrl+A to move to beginning of line
   –   Use Ctrl+E to move to end of line
   –   Use Backspace and Delete keys
Errors in SQL Commands
Editing MySQL Commands
• Press Up arrow key to go to top line

• Press Enter key to move to next line if line is correct

• Use Right and Left arrow keys to move to location of
  error

• Press ENTER key when line is correct

• If Enter is not pressed on a line, line not part of the
  revised command
Dropping a Table
• Can correct errors by dropping (deleting) a table and
  starting over

• Useful when table is created before errors are
  discovered

• Command is followed by the table to be dropped
  and a semicolon

• Any data in table also deleted
Data Types

  • For each table column, type of data must be
    defined
  • Common data types:
     – CHAR(n)
     – VARCHAR(n)
     – DATE
     – DECIMAL(p,q)
     – INT
     – SMALLINT
Nulls
• A special value to represent situation when actual
  value is not known for a column

• Can specify whether to allow nulls in the individual
  columns

• Should not allow nulls for primary key columns
Implementation of Nulls
• Use NOT NULL clause in CREATE TABLE
  command to exclude the use of nulls in a column


• Default is to allow null values


• If a column is defined as NOT NULL, system will
  reject any attempt to store a null value there
Adding Rows to a Table

• INSERT command:
   – INSERT INTO followed by table name
   – VALUES command followed by specific values in
     parentheses
   – Values for character columns in single quotation
     marks
The Insert Command
Modifying the INSERT Command

• To add new rows modify previous INSERT command

• Use same editing techniques as those used to
  correct errors
Adding Additional Rows
The INSERT Command with Nulls
• Use a special format of INSERT command to enter a
  null value in a table

• Identify the names of the columns that accept non-
  null values, then list only the non-null values after
  the VALUES command
The INSERT Command with Nulls

• Enter only non-null values
• Precisely indicate values you are entering by listing
  the columns
The INSERT Command with Nulls (continued)
Viewing Table Data
• Use SELECT command to display all the rows and
  columns in a table

• SELECT * FROM followed by the name of the table

• Ends with a semicolon
Viewing Table Data (continued)
Viewing Table Data (continued)
Correcting Errors In the Database

• UPDATE command is used to update a value in a
  table

• DELETE command allows you to delete a record

• INSERT command allows you to add a record
Correcting Errors in the Database
• UPDATE: change the value in a table
• DELETE: delete a row from a table
Correcting Errors in the Database (continued)
Correcting Errors in the Database (continued)
Saving SQL Commands
• Allows you to use commands again without retyping
• Different methods for each SQL implementation you
  are using
  – Oracle SQL*Plus and SQL*Plus Worksheet use a
    script file
  – Access saves queries as objects
  – MySQL uses an editor to save text files
Saving SQL Commands
• Script file:
   –   File containing SQL commands
   –   Use a text editor or word processor to create
   –   Save with a .txt file name extension
   –   Run in MySQL:
        • SOURCE file name
        • \. file name
   – Include full path if file is in folder other than default
Creating the Remaining Database Tables

• Execute appropriate CREATE TABLE and INSERT
  commands


• Save these commands to a secondary storage
  device
Describing a Table
Summary
• Use MySQL Command Line Client window to enter
  commands
• Type \h or help to obtain help at the mysql> prompt
• Use MySQL Reference Manual for more detailed
  help
Summary (continued)
• Use the CREATE DATABASE command to create a
  database

• Use the USE command to change the default
  database

• Use the CREATE TABLE command to create tables

• Use the DROP TABLE command to delete a table
Summary (continued)
• CHAR, VARCHAR, DATE, DECIMAL, INT and
  SMALLINT data types
• Use INSERT command to add rows
• Use NOT NULL clause to identify columns that cannot
  have a null value
• Use SELECT command to view data in a table
Summary (continued)
• Use UPDATE command to change the value in a
  column
• Use DELETE command to delete a row
• Use SHOW COLUMNS command to display a
  table’s structure
• DBI
use DBI;

# connect to the MySQL database "guestdb" as root with an empty password
my $dbh = DBI->connect( 'dbi:mysql:guestdb',
              'root',
              '',
            ) || die "Database connection not made: $DBI::errstr";

my $sth = $dbh->prepare('SELECT * FROM demo');
$sth->execute();
# print every row with its fields joined by ":"
while (my @row = $sth->fetchrow_array) {
    print join(":",@row),"\n";
}
$sth->finish();

$dbh->disconnect();
The Players

• Perl – a programming language
• DBMS – software to manage data storage
• SQL – a language to talk to a DBMS
• DBI – Perl extensions to send SQL to a
  DBMS
• DBD – software DBI uses for specific DBMSs
• $dbh – a DBI object for coarse-grained
  access
• $sth – a DBI object for fine-grained access
• What is DBI ?

• DBI is a DataBase Interface
  – It is the way Perl talks to Databases
• DBI is a module by Tim Bunce
• DBI is a community of modules & developers
• What is an interface ?

• The overlap where two phenomena affect
  each other
• A point at which independent systems interact
• A boundary across which two systems
  communicate
• A Sample Interface (the bedrock of DBI)




                     Bone

           Fred               Wilma
                     Dino
• Characteristics of the DINO interface

• Separation of knowledge
  – Fred doesn’t need to know how to find Wilma
  – Dino doesn’t need to know how to read
• Generalizability
  – Fred can send any message
  – Fred can communicate with anyone
• The DBI interface




                      SQL

           Perl             DBMS
                      DBI
• Characteristics of the DBI interface

• Separation of knowledge
  – You don’t need to know how to connect
  – DBI doesn’t need to know SQL


• Generalizability
  – You can send any SQL
  – You can communicate with any DBMS
• The ingredients of a DBI App
  – 1: A perl script that uses DBI
  – 2: A DBMS
  – 3: SQL statements
Outline of a basic DBI script

Set the Perl Environment
Connect to a DBMS
Perform data-affecting SQL instructions
Perform data-returning SQL requests
Disconnect from the DBMS
• $dbh = DataBase Handle
• Done by DBI
  – Connect
• Done by $dbh, The Database Handle
  – Perform SQL instructions
  – Perform SQL request
  – Disconnect
• Set the Perl Environment
  – use warnings;
  – use strict;
  – use DBI;
• Connect to a DBMS

my $dbh = DBI->connect('dbi:DBM:');

$dbh is a Database Handle
  An object created by DBI to handle access to
  this specific connection
• Perform data-affecting Instructions

• $dbh->do($sql_string);

• $dbh->do("INSERT INTO geography
  VALUES ('Nepal','Asia')");
• Perform data-returning requests

• my @row = $dbh->selectrow_array($sql_string);

• Disconnect from DBMS

• $dbh->disconnect()
A complete script

use strict;
use warnings;
use DBI;

my $dbh=DBI->connect("dbi:mysql:test","root","");
$dbh->do("CREATE TABLE geography (country Text, region Text)");
$dbh->do("INSERT INTO geography VALUES ('Nepal','Asia')");
$dbh->do("INSERT INTO geography VALUES ('Portugal','Europe')");
# selectrow_array returns only the first row of the result
print $dbh->selectrow_array("SELECT * FROM geography");
$dbh->disconnect;
• The script output

• Only one row
• No separation of the fields
• No metadata
• Improvements

• DBI
  – Connect to DBMS
  – Creates a database handle ($dbh)
• $dbh
  – Provides coarse-grained access to the DBMS
  – Creates a statement handle ($sth)
• $sth
  – Provides fine-grained access to the DBMS
• Life-cycle of a statement handle ($sth)

• Prepare
  – Creates the handle, sends SQL to the DBMS to
    be analyzed and optimized
• Execute
  – Instructs the DBMS to perform operations
• Fetch
  – Brings data from the DBMS into a script
• Life-cycle of a statement handle ($sth)

• my $sth = $dbh->prepare($sql_string);

• $sth->execute();

• print $sth->fetchrow_array();
• Fetching rows in a loop – the snippet

my $sth=$dbh->prepare("SELECT * FROM geography");
$sth->execute();
while (my @row = $sth->fetchrow_array){
    print join(":",@row),"\n";
}
• Output
    – Nepal:Asia
    – Portugal:Europe


•   All data retrieved
•   Columns separated
•   Rows separated
•   Still no metadata
• Finding Metadata – Handle Attributes

• $handle->{$key}=$value;
• print $handle->{$key};

• $dbh->{RaiseError}=1;
• print $dbh->{RaiseError};
• my $column_names = $sth->{NAME};
• Finding Metadata with $sth->{NAME}

my $sth=$dbh->prepare("SELECT * FROM geography");
$sth->execute();
# $sth->{NAME} is a reference to an array of column names
my @column_names=@{$sth->{NAME}};
my $num_cols = scalar @column_names;
print join ":",@column_names;
print "(there are $num_cols columns)";
• Errors

• $dbh->do("Junk");
• print "I Got here!";
• Checking Errors with RaiseError

• my $dbh=DBI->connect( ... );

• $dbh->{RaiseError}=1;
• $dbh->do("Junk");
• print "Here ?";
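The effect of RaiseError can be observed by wrapping the failing call in eval. This is a sketch under one loud assumption: it uses DBD::SQLite with an in-memory database so that it runs without a server, whereas the slides use MySQL. The variable names are illustrative.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; the slides themselves use MySQL.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, PrintError => 0 });

# With RaiseError set, a failing do() dies; eval catches the exception.
my $reached = 0;
my $ok = eval {
    $dbh->do("Junk");      # not valid SQL
    $reached = 1;          # never executed
    1;
};
print "do() failed: ", $dbh->errstr, "\n" unless $ok;
print "statement after do() was ", ($reached ? "" : "not "), "reached\n";

$dbh->disconnect;
```

Without RaiseError the same do() would return undef and the script would carry on, which is the situation in the "I Got here!" example.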
Number of rows affected

$rows=$dbh->do("DELETE FROM user
  WHERE age <42");

# undef = error
# 3 = 3 rows affected
# 0E0 = no error; no rows affected
# -1 = unknown
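The four cases listed above can be told apart in plain Perl. The sketch below does not touch a database; describe_do_result is a hypothetical helper that only interprets a value of the kind do() returns.

```perl
use strict;
use warnings;

# Interpret the return value of $dbh->do() (hypothetical helper).
sub describe_do_result {
    my ($rows) = @_;
    return "error"                      unless defined $rows;
    return "unknown number of rows"     if $rows == -1;
    return "no error, no rows affected" if $rows == 0;   # "0E0" is numerically 0
    return "$rows rows affected";
}

# "0E0" is true in boolean context, so "do(...) or die" does not fire on it.
for my $value (undef, 3, "0E0", -1) {
    my $shown = defined $value ? $value : 'undef';
    print "$shown: ", describe_do_result($value), "\n";
}
```

The string "0E0" exists precisely so that a successful statement that touched no rows still counts as true.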
• Summary so far

• DBI
    – connect($data_source)

• $dbh
    – do($sql_instruction)
    – prepare($sql_request)
    – disconnect()
    – {RaiseError}

• $sth
    – execute()
    – fetchrow_array()
    – {NAME}
• A Deeper look at connection

                           DBD#1   MySQL




          Perl       DBI



                                     Oracle
                           DBD#2
• DBDs- Database Drivers

• DRIVER          DBMS
• DBD::DBM        DBM
• DBD::Pg         PostgreSQL
• DBD::mysql      MySQL
• DBD::Oracle     Oracle
• DBD::ODBC       MS Access, MS SQL Server
• …
• Variation in DBDs & DBMSs

•   Driver-specific connection parameters
•   Driver-specific attributes and methods
•   SQL implementation
•   Optimization Plans
• Driver-Specific Connection Params – driver name – user
  name and password

my $dbh = DBI->connect(
    "DBI:$driver:",
    "root",
    "password",
    {
        RaiseError => 1,
        PrintError => 0,
        AutoCommit => 1,
    }
);
finish() – fetchus interruptus

while (my @row=$sth->fetchrow_array){
    last if $row[0] eq $some_conditions;
}

$sth->finish();
• Alternate fetches

• my @row=$sth->fetchrow_array();
  – print $row[1];
• my $row=$sth->fetchrow_arrayref();
  – print $row->[1];
• my $row=$sth->fetchrow_hashref();
  – print $row->{region};
• Placeholders !

• my $sth = $dbh->prepare("SELECT name
  FROM user WHERE country = ? AND city = ?
  AND age > ?");
• $sth->execute('Venezuela','Caracas',21);
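To see the placeholder query return something, it needs a table to run against. The sketch below assumes DBD::SQLite is installed and uses an in-memory database; the slides use MySQL, and the table layout and rows here are invented for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed; an in-memory database keeps the
# sketch self-contained.
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "",
                       { RaiseError => 1, PrintError => 0 });

# Hypothetical table and rows, loosely modelled on the slide's query.
$dbh->do("CREATE TABLE user (name TEXT, country TEXT, city TEXT, age INTEGER)");
my $insert = $dbh->prepare("INSERT INTO user VALUES (?, ?, ?, ?)");
$insert->execute(@$_) for (
    ['Ana',   'Venezuela', 'Caracas', 34],
    ['Luis',  'Venezuela', 'Caracas', 19],
    ['Marta', 'Portugal',  'Lisbon',  40],
);

# The ? placeholders are filled in by execute(), in order; DBI handles quoting.
my $sth = $dbh->prepare(
    "SELECT name FROM user WHERE country = ? AND city = ? AND age > ?");
$sth->execute('Venezuela', 'Caracas', 21);

# fetchrow_hashref gives access to each field by column name.
my @names;
while (my $row = $sth->fetchrow_hashref) {
    push @names, $row->{name};
}
print join(",", @names), "\n";

$dbh->disconnect;
```

The same prepared statement can be executed again with different values, which is the main practical benefit besides safe quoting.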
• DBDs that don’t need a separate DBMS

• DBD::CSV, DBD::Excel

• DBD::Amazon DBD::Google
use DBI;
my $dbh = DBI->connect("dbi:Google:", $KEY);
my $sth = $dbh->prepare(qq[ SELECT title, URL FROM google WHERE q = "perl" ]);
while (my $r = $sth->fetchrow_hashref) {
    ...
Step1: Getting Drivers
Essential for SQL Querying

• A driver is a piece of software that lets your
  operating system talk to a database
      – Installed drivers visible in ODBC manager
            • “data connectivity” tool
• Each database engine (Oracle, MySQL, etc)
  requires its own driver
      – Generally must be installed by user
• Drivers are needed by Data Source Name
  tool and querying programs
• Require (simple) installation
MySQL Driver: Needed to Query MySQL Databases



  • Windows: Download MySQL
    Connector/ODBC 3.51 here
  • Must be installed for direct querying
    using e.g. Excel
     – Not necessary if you are using the MySQL
       Query Browser
Exercise 2

Fetch a sequence by adapting live.pl and do remote blast using 3
different scoring matrices (summarize results) and perform
“controls” using adaptation of shuffle …




[Figure panels: rat versus mouse RBP; rat versus bacterial lipocalin]
Parsing BLAST Using BPlite, BPpsilite, and BPbl2seq

                   • Similar to Search and SearchIO in
                     basic functionality
                   • However:
                       – Older and will likely be phased out in the
                         near future
                       – Substantially limited advanced
                         functionality compared to Search and
                         SearchIO
                       – Important to know about because many
                         legacy scripts utilize these objects and
                         either need to be converted
Parse BLAST output
#!/opt/perl/bin/perl -w
#bioperl_blast_parse.pl
# program prints out query, and all hits with scores for each blast result
use Bio::SearchIO;

my $record = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);

while (my $result = $record->next_result){
   print ">", $result->query_name, " ", $result->query_description, "\n";
   my $seen = 0;
   while (my $hit = $result->next_hit){
          print "\t", $hit->name, "\t", $hit->bits, "\t", $hit->significance, "\n";
          $seen++;
   }
   if ($seen == 0 ) { print "No Hits Found\n" }
}
Parse BLAST in a little more detail
#!/opt/perl/bin/perl -w
#bioperl_blast_parse_hsp.pl
# program prints out query, and all hsps with scores for each blast result
use Bio::SearchIO;
my $record = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);
while (my $result = $record->next_result){
    print ">", $result->query_name, " ", $result->query_description, "\n";
    my $seen = 0;
    while (my $hit = $result->next_hit){
           $seen++;
           while (my $hsp = $hit->next_hsp){
                    print "\t", $hit->name, " has an HSP with an evalue of: ",
                          $hsp->evalue, "\n";
           }
    }
    if ($seen == 0 ) { print "No Hits Found\n" }
}
Shuffle
   #!/usr/bin/perl -w
   use strict;

   my ($def, @seq) = <>;
   print $def;
   chomp @seq;
   @seq = split(//, join("", @seq));
   my $count = 0;
   while (@seq) {
      my $index = rand(@seq);
      my $base = splice(@seq, $index, 1);
      print $base;
      print "\n" if ++$count % 60 == 0;
   }
   print "\n" unless $count % 60 == 0;
Searching for Sequence Similarity

• BLAST with BioPerl
• Parsing Blast and FASTA Reports
   – Search and SearchIO
   – BPLite, BPpsilite, BPbl2seq
• Parsing HMM Reports
• Standalone BioPerl BLAST
Remote Execution of BLAST
• BioPerl has built in capability of running BLAST jobs remotely
  using RemoteBlast.pm
• Runs these jobs at NCBI automatically
   – NCBI has dynamic configurations (server side) to “always” be up and
     ready
   – Automatically updated for new BioPerl Releases
• Convenient for independent researchers who do not have
  access to huge computing resources
• Quick submission of Blast jobs without tying up local
  resources (especially if working from standalone workstation)
• Legal Restrictions!!!
Example of Remote Blast
A script to run a remote blast would be something like the following skeleton:

$remote_blast = Bio::Tools::Run::RemoteBlast->new(
    '-prog'   => 'blastp',
    '-data'   => 'ecoli',
    '-expect' => '1e-10' );
$r = $remote_blast->submit_blast("t/data/ecolitst.fa");
while (@rids = $remote_blast->each_rid ) {
    foreach $rid ( @rids ) {
        $rc = $remote_blast->retrieve_blast($rid);
    }
}

In this example we are running a blastp (protein query against a protein
    database) using the ecoli database and an e-value threshold of 1e-10. The
    sequences that are being compared are located in the file "t/data/ecolitst.fa".
Example
It is important to note that all command line options that fall under the blastall
     umbrella are available under RemoteBlast.pm.

For example you can change some parameters of the remote job.

Consider the following example:

$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'} =
  'BLOSUM25';

This basically allows you to change the matrix used to BLOSUM 25, rather
   than the default of BLOSUM 62.
Parsing BLAST and FASTA Reports
• Main BioPerl objects in 1.2 are
  Search.pm/SearchIO.pm
   – SearchIO is more robust and the preferred choice (it will
     continue to be supported in future releases)
• Support parsing of BLAST XML reports and other
• Also allow the ability to parse HMMER reports
• Will continue to grow and provide functionality for
  parsing all types of reports. This way multiple report
  types can be handled by simply creating multiple
  instantiations of the SearchIO object.
Parsing Blast Reports


• One of the strengths of BioPerl is its ability to
  parse complex data structures, such as a BLAST
  report.
• Unfortunately, there is a bit of arcane
  terminology.
• Also, you have to ‘think like bioperl’, in order
  to figure out the syntax.
• This next script might get you started
Sample Script to Read and Parse BLAST Report




# Get the report
$searchio = new Bio::SearchIO (-format => 'blast', -file => $blast_report);
$result = $searchio->next_result;
# Get info about the entire report
$result->database_name;
$algorithm_type = $result->algorithm;
# get info about the first hit
$hit = $result->next_hit;
$hit_name = $hit->name ;
# get info about the first hsp of the first hit
$hsp = $hit->next_hsp;
$hsp_start = $hsp->query->start;

More Related Content

PDF
PuppetConf 2017: Hiera 5: The Full Data Enchilada- Hendrik Lindberg, Puppet
PPT
PPT
rtwerewr
PPT
PPTX
Drupal Camp Porto - Developing with Drupal: First Steps
PPTX
Bioinformatics p5-bioperlv2014
PPT
PPT
Enterprise Search Solution: Apache SOLR. What's available and why it's so cool
PuppetConf 2017: Hiera 5: The Full Data Enchilada- Hendrik Lindberg, Puppet
rtwerewr
Drupal Camp Porto - Developing with Drupal: First Steps
Bioinformatics p5-bioperlv2014
Enterprise Search Solution: Apache SOLR. What's available and why it's so cool

What's hot (19)

PPT
MIND sweeping introduction to PHP
PDF
Sql cheat sheet
PDF
Introduction to Perl and BioPerl
PDF
Apache Solr Workshop
PPT
Php classes in mumbai
PPTX
FFW Gabrovo PMG - PHP OOP Part 3
PPTX
Puppet Camp DC: Puppet for Everybody
PDF
Apache solr liferay
PPT
9780538745840 ppt ch08
PDF
Refactor Dance - Puppet Labs 'Best Practices'
PPTX
JSON in Solr: from top to bottom
PDF
Pig
PPTX
Tutorial on developing a Solr search component plugin
PPTX
Spl to the Rescue - Zendcon 09
PDF
Hive
PDF
Solr Query Parsing
PPTX
Solr 6 Feature Preview
PPTX
Resource Routing in ExpressionEngine
PDF
Solr Troubleshooting - TreeMap approach
MIND sweeping introduction to PHP
Sql cheat sheet
Introduction to Perl and BioPerl
Apache Solr Workshop
Php classes in mumbai
FFW Gabrovo PMG - PHP OOP Part 3
Puppet Camp DC: Puppet for Everybody
Apache solr liferay
9780538745840 ppt ch08
Refactor Dance - Puppet Labs 'Best Practices'
JSON in Solr: from top to bottom
Pig
Tutorial on developing a Solr search component plugin
Spl to the Rescue - Zendcon 09
Hive
Solr Query Parsing
Solr 6 Feature Preview
Resource Routing in ExpressionEngine
Solr Troubleshooting - TreeMap approach
Ad

Viewers also liked (17)

PPT
Web::Scraper for SF.pm LT
PDF
do_this and die();
KEY
CPAN Realtime feed
PPTX
Refactoring tools for Perl code
PPT
Web Scraper Shibuya.pm tech talk #8
PDF
Text Layout With Core Text
ODP
Moose - YAPC::NA 2012
PDF
Vim Loves Perl - Perl Casual#2
PDF
A very nice presentation on Moose.
PDF
Everything About Bluetooth (淺談藍牙 4.0) - Central 篇
KEY
Remedie: Building a desktop app with HTTP::Engine, SQLite and jQuery
PDF
Pechakucha (Mons) : Street Art à Mons
ODP
Evolving Software with Moose
PPTX
2016 bioinformatics i_io_wim_vancriekinge
PDF
Finding Our Happy Place in the Internet of Things
PDF
The Future Of Work & The Work Of The Future
Web::Scraper for SF.pm LT
do_this and die();
CPAN Realtime feed
Refactoring tools for Perl code
Web Scraper Shibuya.pm tech talk #8
Text Layout With Core Text
Moose - YAPC::NA 2012
Vim Loves Perl - Perl Casual#2
A very nice presentation on Moose.
Everything About Bluetooth (淺談藍牙 4.0) - Central 篇
Remedie: Building a desktop app with HTTP::Engine, SQLite and jQuery
Pechakucha (Mons) : Street Art à Mons
Evolving Software with Moose
2016 bioinformatics i_io_wim_vancriekinge
Finding Our Happy Place in the Internet of Things
The Future Of Work & The Work Of The Future
Ad

Similar to 2012 03 08_dbi (20)

PPTX
Bioinformatica p6-bioperl
PPTX
Bioinformatics p5-bioperl v2013-wim_vancriekinge
PPTX
Bioinformatics p1-perl-introduction v2013
PPT
Php introduction with history of php
PPT
php fundamental
PPT
KEY
Modules Building Presentation
PPT
Bioinformatica 10-11-2011-p6-bioperl
PDF
System Programming and Administration
PPTX
File handle in PROGRAMMable extensible interpreted .pptx
PPTX
Learning Puppet basic thing
PPT
Php course-in-navimumbai
PDF
Drupal module development
PDF
Jooctrine - Doctrine ORM in Joomla!
PDF
Marc’s (bio)perl course
PPTX
Python oop third class
PPT
Java SpringMVC SpringBOOT (Divergent).ppt
PDF
Hive Anatomy
PDF
Php Crash Course - Macq Electronique 2010
Bioinformatica p6-bioperl
Bioinformatics p5-bioperl v2013-wim_vancriekinge
Bioinformatics p1-perl-introduction v2013
Php introduction with history of php
php fundamental
Modules Building Presentation
Bioinformatica 10-11-2011-p6-bioperl
System Programming and Administration
File handle in PROGRAMMable extensible interpreted .pptx
Learning Puppet basic thing
Php course-in-navimumbai
Drupal module development
Jooctrine - Doctrine ORM in Joomla!
Marc’s (bio)perl course
Python oop third class
Java SpringMVC SpringBOOT (Divergent).ppt
Hive Anatomy
Php Crash Course - Macq Electronique 2010

More from Prof. Wim Van Criekinge (20)

PPTX
2020 02 11_biological_databases_part1
PPTX
2019 03 05_biological_databases_part5_v_upload
PPTX
2019 03 05_biological_databases_part4_v_upload
PPTX
2019 03 05_biological_databases_part3_v_upload
PPTX
2019 02 21_biological_databases_part2_v_upload
PPTX
2019 02 12_biological_databases_part1_v_upload
PPTX
P7 2018 biopython3
PPTX
P6 2018 biopython2b
PPTX
P4 2018 io_functions
PPTX
P3 2018 python_regexes
PPTX
T1 2018 bioinformatics
PPTX
P1 2018 python
PDF
Bio ontologies and semantic technologies[2]
PPTX
2018 05 08_biological_databases_no_sql
PPTX
2018 03 27_biological_databases_part4_v_upload
PPTX
2018 03 20_biological_databases_part3
PPTX
2018 02 20_biological_databases_part2_v_upload
PPTX
2018 02 20_biological_databases_part1_v_upload
PPTX
P7 2017 biopython3
PPTX
P6 2017 biopython2
2020 02 11_biological_databases_part1
2019 03 05_biological_databases_part5_v_upload
2019 03 05_biological_databases_part4_v_upload
2019 03 05_biological_databases_part3_v_upload
2019 02 21_biological_databases_part2_v_upload
2019 02 12_biological_databases_part1_v_upload
P7 2018 biopython3
P6 2018 biopython2b
P4 2018 io_functions
P3 2018 python_regexes
T1 2018 bioinformatics
P1 2018 python
Bio ontologies and semantic technologies[2]
2018 05 08_biological_databases_no_sql
2018 03 27_biological_databases_part4_v_upload
2018 03 20_biological_databases_part3
2018 02 20_biological_databases_part2_v_upload
2018 02 20_biological_databases_part1_v_upload
P7 2017 biopython3
P6 2017 biopython2


2012 03 08_dbi

  • 2. FBW 21-02-2008 RELOADED 2 Wim Van Criekinge
  • 3. Three Basic Data Types • Scalars - $ • Arrays of scalars - @ • Associative arrays of scalars, or hashes - %
  • 5. The ‘structure’ of a Hash
      • An array looks something like this:
            Index:     0        1        2
            @array =   'val1'   'val2'   'val3'     (values)
      • A hash looks something like this:
            Key (name):  Rob        Matt       Joe_A
            %phone =     353-7236   353-7122   555-1212   (values)
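The two layouts on this slide can be written out directly in Perl. This is a minimal sketch that reuses the names and values from the slide; the helper `lookup_phone` is hypothetical and only there to show key-based access.

```perl
use strict;
use warnings;

# Values are taken from the slide above.
my @array = ('val1', 'val2', 'val3');
my %phone = (
    Rob   => '353-7236',
    Matt  => '353-7122',
    Joe_A => '555-1212',
);

# Hypothetical helper: return the number stored under a key, or undef if absent.
sub lookup_phone {
    my ($name) = @_;
    return exists $phone{$name} ? $phone{$name} : undef;
}

print "Second array element: $array[1]\n";            # access by index
print "Number for Matt: ", lookup_phone('Matt'), "\n"; # access by key
print "Keys: ", join(', ', sort keys %phone), "\n";    # hash keys are unordered, so sort them
```

An array is addressed by position, a hash by name; `exists` distinguishes a missing key from a key whose value is undefined.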
  • 6. Subroutine
        $a=5;
        $b=9;
        $sum=Optellen($a,$b);    # "Optellen" is Dutch for "add"
        print "The SUM is $sum\n";
        sub Optellen
        {
          $d=$_[0];
          $e=$_[1];
          # alternatively we could do this: my($d, $e)=@_;
          my($answer)=$d+$e;
          return $answer;
        }
  • 7. Overview • Advanced data structures in Perl • Object-oriented programming in Perl • BioPerl: a large collection of Perl software for bioinformatics • Motivation: – Simple extension: “multi-line parsing” is more difficult than expected • Goal: to make software modular, easier to maintain, more reliable, and easier to reuse
  • 8. Multi-line parsing
        use strict;
        use Bio::SeqIO;

        my $filename="sw.txt";
        my $sequence_object;

        my $seqio = Bio::SeqIO->new(
                           '-format' => 'swiss',
                           '-file'   => $filename
                           );

        while ($sequence_object = $seqio->next_seq) {
            my $sequentie = $sequence_object->seq();   # "sequentie" is Dutch for "sequence"
            print $sequentie."\n";
        }
  • 9. Perl OO • A class is a package • An object is a reference to a data structure (usually a hash) in a class • A method is a subroutine in the class
  • 10. Perl Classes • Modules/Packages – A Perl module is a file that uses a package declaration – Packages provide a separate namespace for different parts of a program – A namespace protects the variables of one part of a program from unwanted modification by another part of the program – The module must always have a last line that evaluates to true, e.g. 1; – The module must be in a “known” directory (one on Perl's module search path, which can be extended through an environment variable) • E.g. … site/lib/bio/Sequentie.pm
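The three statements about classes, objects and methods can be made concrete with a very small class. This is only a sketch: the package name `Sequentie` echoes the file name on the slide, while the fields (`id`, `seq`) and the methods are hypothetical. In a real module the package would live in its own `.pm` file ending in `1;`.

```perl
use strict;
use warnings;

# A class is a package.
package Sequentie;

# Constructor: the object is a reference to a hash, blessed into the class.
sub new {
    my ($class, %args) = @_;
    my $self = {
        id  => $args{id},
        seq => $args{seq},
    };
    return bless $self, $class;
}

# Methods are ordinary subroutines that receive the object as first argument.
sub id         { my ($self) = @_; return $self->{id}; }
sub seq        { my ($self) = @_; return $self->{seq}; }
sub seq_length { my ($self) = @_; return length $self->{seq}; }

package main;

my $obj = Sequentie->new(id => 'demo1', seq => 'ACGTACGT');
print ref($obj), " ", $obj->id, " has length ", $obj->seq_length, "\n";
```

`ref($obj)` returns the class name, which is the same information Data::Dumper shows when it prints a blessed reference.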
  • 11. Installation on Windows (ActiveState) • Using PPM shell to install BioPerl – Get the number of the BioPerl repository: – PPM>repository – Set the BioPerl repository, find BioPerl, install BioPerl: – PPM>repository set <BioPerl repository number> – PPM>search * – PPM>install <BioPerl package number> • Download BioPerl in archive form from – http://www.BioPerl.org/Core/Latest/index.shtml – Use WinZip to uncompress and install
  • 12. Directory Structure • BioPerl directory structure organization:
        Bio/                   BioPerl modules
        models/                UML for BioPerl classes
        t/                     Perl built-in tests
        t/data/                Data files used for the tests
        scripts/               Reusable scripts that use BioPerl
        scripts/contributed/   Contributed scripts not necessarily integrated into BioPerl
        doc/                   "How To" files and the FAQ as XML
  • 14. Live.pl
        #!e:\Perl\bin\perl.exe -w
        # script for looping over genbank entries, printing out name
        use Bio::DB::GenBank;
        use Data::Dumper;
        $gb = new Bio::DB::GenBank();
        $sequence_object = $gb->get_Seq_by_id('MUSIGHBA1');
        print Dumper ($sequence_object);
        $seq1_id = $sequence_object->display_id();
        $seq1_s = $sequence_object->seq();
        print "seq1 display id is $seq1_id \n";
        print "seq1 sequence is $seq1_s \n";
  • 15. File converter
        #!/opt/perl/bin/perl -w
        # genbank_to_fasta.pl
        use Bio::SeqIO;
        my $input  = Bio::SeqIO->new('-file' => $ARGV[0], '-format' => 'GenBank');
        my $output = Bio::SeqIO->new('-file' => '>output.fasta', '-format' => 'Fasta');
        while (my $seq = $input->next_seq()){
            $output->write_seq($seq);
        }
  • 16. • Bptutorial.pl • It includes the written tutorial as well as runnable scripts • 2 ESSENTIAL TOOLS – Data::Dumper to find out what class you're in – perl bptutorial.pl 100 Bio::Seq to find the available methods for that class
  • 17. Exercise 1 • Run Needleman-Wunsch-monte-carlo.pl – my $MATCH = 1; # +1 for letters that match – my $MISMATCH = -1; # -1 for letters that mismatch – my $GAP = -1; # -1 for any gap • Result: Score (-64) • Score = f($MATCH, $MISMATCH, $GAP): what is f? • Implement convergence criteria • Store the results in a database and make graphs in Excel
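As a starting point for thinking about f, the score of one finished alignment can be computed by walking the two aligned strings column by column. The sketch below assumes a linear gap penalty and uses `-` as the gap character; the helper `alignment_score` is hypothetical and is not taken from Needleman-Wunsch-monte-carlo.pl.

```perl
use strict;
use warnings;

# Scoring parameters as given in the exercise.
my $MATCH    =  1;
my $MISMATCH = -1;
my $GAP      = -1;

# Hypothetical helper: score two already-aligned strings of equal length,
# where '-' marks a gap.
sub alignment_score {
    my ($top, $bottom) = @_;
    die "aligned strings must have equal length\n"
        unless length($top) == length($bottom);
    my $score = 0;
    for my $i (0 .. length($top) - 1) {
        my $x = substr($top, $i, 1);
        my $y = substr($bottom, $i, 1);
        if    ($x eq '-' or $y eq '-') { $score += $GAP }
        elsif ($x eq $y)               { $score += $MATCH }
        else                           { $score += $MISMATCH }
    }
    return $score;
}

# Four matching columns and two gap columns.
print alignment_score('ACGT-A', 'AC-TTA'), "\n";
```

Under this reading the score is simply (matches × $MATCH) + (mismatches × $MISMATCH) + (gap columns × $GAP), so changing one parameter shifts the score in proportion to how many columns of that kind the alignment contains.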
  • 18. A Guide to MySQL & DBI
  • 19. Objectives • Start MySQL and learn how to use the MySQL Reference Manual • Create a database • Change (activate) a database • Create tables using MySQL • Create and run SQL commands in MySQL
  • 20. Objectives (continued) • Identify and use data types to define columns in tables • Understand and use nulls • Add rows to tables • View table data • Correct errors in a database
  • 23. Successor to MySQL-Front • MySQL-Front was at the time one of the most popular MySQL management applications. What phpMyAdmin is for web applications, MySQL-Front was for the desktop. Unfortunately, the author could not or would not continue with the project, and it was shut down. • In early April 2006 the original author decided to release the latest MySQL-Front source code under the name HeidiSQL, and the first beta is …
  • 25. Starting MySQL • Windows XP – Click Start button – Point to All Programs – Point to MySQL on menu – Point to MySQL Server 4.1 – Click MySQL Command Line Client • Must enter password in Command Line Client window
  • 26. Obtaining Help in MySQL • Type \h at the mysql> prompt • Type “help” followed by name of command – help contents – help union
  • 28. Creating a Database • Must create a database before creating tables • Use CREATE DATABASE command • Include database name
  • 29. Creating a Database (continued)
  • 30. Changing the Default Database • Default database: database to which all subsequent commands pertain • USE command, followed by database name: – Changes the default database – Execute at the start of every session
  • 31. Creating a Table • Describe the layout of each table in the database • Use CREATE TABLE command • TABLE is followed by the table name • Follow this with the names and data types of the columns in the table • Data types define type and size of data
  • 32. Table and Column Name Restrictions • Names should not exceed 18 characters (a conservative limit; MySQL itself allows up to 64) • Must start with a letter • Can contain letters, numbers, and underscores (_) • Cannot contain spaces
  • 34. Entering Commands in MySQL • Commands are free-format; no rules stating specific words in specific positions • Press ENTER to move to the next line in a command • Indicate the end of a command by typing a semicolon • Commands are not case sensitive
  • 36. Editing SQL Commands • Statement history: stores most recently used command • Editing commands: – Use arrow keys to move up, down, left, and right – Use Ctrl+A to move to beginning of line – Use Ctrl+E to move to end of line – Use Backspace and Delete keys
  • 37. Errors in SQL Commands
  • 38. Editing MySQL Commands • Press Up arrow key to go to top line • Press Enter key to move to next line if line is correct • Use Right and Left arrow keys to move to location of error • Press ENTER key when line is correct • If Enter is not pressed on a line, line not part of the revised command
  • 39. Dropping a Table • Can correct errors by dropping (deleting) a table and starting over • Useful when table is created before errors are discovered • The DROP TABLE command is followed by the table to be dropped and a semicolon • Any data in the table is also deleted
  • 40. Data Types • For each table column, type of data must be defined • Common data types: – CHAR(n) – VARCHAR(n) – DATE – DECIMAL(p,q) – INT – SMALLINT
  • 41. Nulls • A special value to represent situation when actual value is not known for a column • Can specify whether to allow nulls in the individual columns • Should not allow nulls for primary key columns
  • 42. Implementation of Nulls • Use NOT NULL clause in CREATE TABLE command to exclude the use of nulls in a column • Default is to allow null values • If a column is defined as NOT NULL, system will reject any attempt to store a null value there
  • 43. Adding Rows to a Table • INSERT command: – INSERT INTO followed by table name – VALUES command followed by specific values in parentheses – Values for character columns in single quotation marks
  • 45. Modifying the INSERT Command • To add new rows modify previous INSERT command • Use same editing techniques as those used to correct errors
  • 47. The INSERT Command with Nulls • Use a special format of INSERT command to enter a null value in a table • Identify the names of the columns that accept non-null values, then list only the non-null values after the VALUES command
  • 48. The INSERT Command with Nulls • Enter only non-null values • Precisely indicate values you are entering by listing the columns
  • 49. The INSERT Command with Nulls (continued)
  • 50. Viewing Table Data • Use SELECT command to display all the rows and columns in a table • SELECT * FROM followed by the name of the table • Ends with a semicolon
  • 51. Viewing Table Data (continued)
  • 52. Viewing Table Data (continued)
  • 53. Correcting Errors In the Database • UPDATE command is used to update a value in a table • DELETE command allows you to delete a record • INSERT command allows you to add a record
  • 54. Correcting Errors in the Database • UPDATE: change the value in a table • DELETE: delete a row from a table
  • 55. Correcting Errors in the Database (continued)
  • 56. Correcting Errors in the Database (continued)
  • 57. Saving SQL Commands • Allows you to use commands again without retyping • Different methods for each SQL implementation you are using – Oracle SQL*Plus and SQL*Plus Worksheet use a script file – Access saves queries as objects – MySQL uses an editor to save text files
  • 58. Saving SQL Commands • Script file: – File containing SQL commands – Use a text editor or word processor to create – Save with a .txt file name extension – Run in MySQL: • SOURCE file name • \. file name – Include full path if file is in folder other than default
  • 59. Creating the Remaining Database Tables • Execute appropriate CREATE TABLE and INSERT commands • Save these commands to a secondary storage device
  • 61. Summary • Use MySQL Command Line Client window to enter commands • Type \h or help to obtain help at the mysql> prompt • Use MySQL Reference Manual for more detailed help
  • 62. Summary (continued) • Use the CREATE DATABASE command to create a database • Use the USE command to change the default database • Use the CREATE TABLE command to create tables • Use the DROP TABLE command to delete a table
  • 63. Summary (continued) • CHAR, VARCHAR, DATE, DECIMAL, INT and SMALLINT data types • Use INSERT command to add rows • Use NOT NULL clause to identify columns that cannot have a null value • Use SELECT command to view data in a table
  • 64. Summary (continued) • Use UPDATE command to change the value in a column • Use DELETE command to delete a row • Use SHOW COLUMNS command to display a table’s structure
  • 66. use DBI; • my $dbh = DBI->connect( 'dbi:mysql:guestdb', • 'root', • '', • ) || die "Database connection not made: $DBI::errstr"; • my $sth = $dbh->prepare('SELECT * FROM demo'); • $sth->execute(); • while (my @row = $sth->fetchrow_array) { • print join(":",@row),"\n"; • } • $sth->finish(); • $dbh->disconnect();
  • 67. The Players • Perl – a programming language • DBMS – software to manage data storage • SQL – a language to talk to a DBMS • DBI – Perl extensions to send SQL to a DBMS • DBD – software DBI uses for specific DBMSs • $dbh – a DBI object for coarse-grained access • $sth – a DBI object for fine-grained access
  • 68. • What is DBI ? • DBI is a DataBase Interface – It is the way Perl talks to Databases • DBI is a module by Tim Bunce • DBI is a community of modules & developers
  • 69. • What is an interface ? • The overlap where two phenomena affect each other • A point at which independent systems interact • A boundary across which two systems communicate
  • 70. • A Sample Interface (the bedrock of DBI) [diagram with the labels Bone, Fred, Wilma, Dino]
  • 71. • Characteristics of the DINO interface • Separation of knowledge – Fred doesn’t need to know how to find Wilma – Dino doesn’t need to know how to read • Generalizability – Fred can send any message – Fred can communicate with anyone
  • 72. • The DBI interface [diagram with the labels SQL, Perl, DBMS, DBI]
  • 73. • Characteristics of the DBI interface • Separation of knowledge – You don’t need to know how to connect – DBI doesn’t need to know SQL • Generalizability – You can send any SQL – You can communicate with any DBMS
  • 74. • The ingredients of a DBI App – 1: A perl script that uses DBI – 2: A DBMS – 3: SQL statements
  • 75. Outline of a basic DBI script • Set the Perl environment • Connect to a DBMS • Perform data-affecting SQL instructions • Perform data-returning SQL requests • Disconnect from the DBMS
  • 76. • $dbh = DataBase Handle • Done by DBI – Connect • Done by $dbh, The Database Handle – Perform SQL instructions – Perform SQL request – Disconnect
  • 77. • Set the Perl Environment – use warnings; – use strict; – use DBI;
  • 78. • Connect to a DBMS • my $dbh = DBI->connect('dbi:DBM:'); • $dbh is a Database Handle: an object created by DBI to handle access to this specific connection
  • 79. • Perform data-affecting Instructions • $dbh->do($sql_string); • $dbh->do("INSERT INTO geography VALUES ('Nepal','Asia')");
  • 80. • Perform data-returning requests • my @row = $dbh->selectrow_array($sql_string); • Disconnect from DBMS • $dbh->disconnect();
  • 81. A complete script • use strict; • use warnings; • use DBI; • my $dbh=DBI->connect("dbi:mysql:test","root",""); • $dbh->do("CREATE TABLE geography (country Text, region Text)"); • $dbh->do("INSERT INTO geography VALUES ('Nepal','Asia')"); • $dbh->do("INSERT INTO geography VALUES ('Portugal','Europe')"); • print $dbh->selectrow_array("SELECT * FROM geography"); • $dbh->disconnect
  • 82. • The script output • Only one row • No separation of the fields • No metadata
  • 83. • Improvements • DBI – Connect to DBMS – Creates a database handle ($dbh) • $dbh – Provides coarse-grained access to the DBMS – Creates a statement handle ($sth) • $sth – Provides fine-grained access to the DBMS
  • 84. • Life-cycle of a statement handle ($sth) • Prepare – Creates the handle, sends SQL to the DBMS to be analyzed and optimized • Execute – Instructs the DBMS to perform operations • Fetch – Brings data from the DBMS into a script
  • 85. • Life-cycle of a statement handle ($sth) • my $sth = $dbh->prepare($sql_string); • $sth->execute(); • print $sth->fetchrow_array();
  • 86. • Fetching rows in a loop – the snippet • my $sth=$dbh->prepare("SELECT * FROM geography"); • $sth->execute(); • while (my @row = $sth->fetchrow_array){ • print join(":",@row),"\n"; • }
  • 87. • Output – Nepal:Asia – Portugal:Europe • All data retrieved • Columns separated • Rows separated • Still no metadata
  • 88. • Finding Metadata – Handle Attributes • $handle->{$key}=$value; • print $handle->{$key}; • $dbh->{RaiseError}=1; • print $dbh->{RaiseError}; • my $column_names = $sth->{NAME};
  • 89. • Finding Metadata with $sth->{NAME} • my $sth=$dbh->prepare("SELECT * FROM geography"); • $sth->execute(); • my @column_names=@{$sth->{NAME}}; • my $num_cols = scalar @column_names; • print join(":",@column_names); • print "(there are $num_cols columns)";
  • 90. • Errors • $dbh->do("Junk"); • print "I Got here!";
  • 91. • Checking Errors with RaiseError • my $dbh=DBI->connect(...); • $dbh->{RaiseError}=1; • $dbh->do("Junk"); • print "Here ?";
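With RaiseError set, a failed statement dies instead of silently continuing, so a script that wants to carry on has to trap the exception. The sketch below shows one common pattern, wrapping the call in `eval`. It assumes DBD::SQLite is installed and uses an in-memory SQLite database in place of the MySQL connection on the slides; the helper `try_sql` is hypothetical.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; ":memory:" avoids touching any files.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, PrintError => 0 });

# Hypothetical helper: run a statement and return the error text,
# or undef when the statement succeeded.
sub try_sql {
    my ($sql) = @_;
    my $ok = eval { $dbh->do($sql); 1 };   # eval traps the die raised by DBI
    return $ok ? undef : ($@ || 'unknown error');
}

my $err = try_sql('Junk');
print defined $err ? "Caught: $err" : "No error\n";
print "Still running after the failed statement\n";
```

Without the `eval`, the script would stop at the failing `do` and the final `print` would never run, which is the behaviour the slide is pointing at.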
  • 92. Number of rows affected
        $rows = $dbh->do("DELETE FROM user WHERE age < 42");
        # undef = error
        # 3     = 3 rows affected
        # 0E0   = no error; no rows affected
        # -1    = unknown
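The value "0E0" is worth a closer look: it is numerically zero but counts as true in Perl, which lets `do(...) or die` distinguish "no rows matched" from a real error. The sketch below needs no database; `describe_rows` is a hypothetical helper that interprets the four return values listed on the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the return value of $dbh->do into a message.
sub describe_rows {
    my ($rows) = @_;
    return 'error'                  unless defined $rows;  # undef = error
    return 'unknown number of rows' if $rows == -1;        # -1 = unknown
    return 'no rows affected'       if $rows == 0;         # "0E0" compares equal to 0
    return "$rows rows affected";
}

# "0E0" is a true value, so "do(...) or die" does not die when zero rows match.
print "0E0 is true\n" if "0E0";
print describe_rows($_), "\n" for (undef, '0E0', 3, -1);
```

Testing with `defined` first matters: comparing undef numerically would both warn and wrongly look like zero rows.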
  • 93. • Summary so far • DBI: connect($data_source) • $dbh: do($sql_instruction), prepare($sql_request), disconnect(), {RaiseError} • $sth: execute(), fetchrow_array(), {NAME}
  • 94. • A deeper look at connection [diagram with the labels Perl, DBI, DBD#1, MySQL, DBD#2, Oracle]
  • 95. • DBDs - Database Drivers
        DRIVER        DBMS
        DBD::DBM      DBM
        DBD::Pg       PostgreSQL
        DBD::mysql    MySQL
        DBD::Oracle   Oracle
        DBD::ODBC     MS Access, MS SQL Server
        …
  • 96. • Variation in DBDs & DBMSs • Driver-specific connection parameters • Driver-specific attributes and methods • SQL implementation • Optimization plans
  • 97. • Driver-Specific Connection Params – driver name – user name and password
        my $dbh = DBI->connect(
            "DBI:$driver:",
            "root",
            "password",
            {
                RaiseError => 1,
                PrintError => 0,
                AutoCommit => 1,
            }
        );
  • 98. finish() – fetchus interruptus
        while (my @row = $sth->fetchrow_array) {
            last if $row[0] eq $some_conditions;
        }
        $sth->finish();
  • 99. • Alternate fetches • my @row = $sth->fetchrow_array(); – print $row[1]; • my $row = $sth->fetchrow_arrayref(); – print $row->[1]; • my $row = $sth->fetchrow_hashref(); – print $row->{region};
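The three fetch methods differ only in the shape of what they return: a list, an array reference, or a hash reference keyed by column name. The sketch below runs them side by side on the geography table from the earlier slides. It assumes DBD::SQLite is installed and uses an in-memory database instead of MySQL; the helper `region_three_ways` is hypothetical.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the table mirrors the slides' example.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE geography (country TEXT, region TEXT)');
$dbh->do(q{INSERT INTO geography VALUES ('Nepal', 'Asia')});

# Hypothetical helper: read the region of the first row with each fetch method.
sub region_three_ways {
    my $sth = $dbh->prepare('SELECT country, region FROM geography');

    $sth->execute;
    my @row = $sth->fetchrow_array;        # plain list of column values
    $sth->finish;

    $sth->execute;
    my $aref = $sth->fetchrow_arrayref;    # reference to an array of values
    my $by_index = $aref->[1];             # copy out: DBI may reuse this array
    $sth->finish;

    $sth->execute;
    my $href = $sth->fetchrow_hashref;     # reference to a hash keyed by column name
    $sth->finish;

    return ($row[1], $by_index, $href->{region});
}

print join(' / ', region_three_ways()), "\n";
```

`fetchrow_hashref` is the most readable but also the slowest of the three; `fetchrow_arrayref` is the cheapest because DBI can reuse the same array for every row.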
  • 100. • Placeholders ! • my $sth = $dbh->prepare("SELECT name FROM user WHERE country = ? AND city = ? AND age > ?"); • $sth->execute('Venezuela','Caracas',21);
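Placeholders keep values out of the SQL text: the statement is prepared once, and the values are supplied at execute time, where the driver takes care of quoting. The sketch below reuses the query from the slide. It assumes DBD::SQLite is installed and uses an in-memory database; the table contents and the helper `names_older_than` are made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; the rows below are invented sample data.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE user (name TEXT, country TEXT, city TEXT, age INTEGER)');

# Prepare once, execute once per row; the quote in O'Hara needs no escaping.
my $ins = $dbh->prepare('INSERT INTO user VALUES (?, ?, ?, ?)');
$ins->execute(@$_) for (
    ['Ana',    'Venezuela', 'Caracas', 34],
    ['Luis',   'Venezuela', 'Caracas', 19],
    ["O'Hara", 'Ireland',   'Dublin',  40],
);

# Hypothetical helper wrapping the query from the slide.
sub names_older_than {
    my ($country, $city, $age) = @_;
    my $sth = $dbh->prepare(
        'SELECT name FROM user WHERE country = ? AND city = ? AND age > ? ORDER BY name');
    $sth->execute($country, $city, $age);
    my @names;
    while (my ($name) = $sth->fetchrow_array) { push @names, $name }
    return @names;
}

print join(', ', names_older_than('Venezuela', 'Caracas', 21)), "\n";
```

Besides avoiding quoting bugs, this is the standard defence against SQL injection, because user-supplied values never become part of the statement text.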
  • 101. • DBDs that don’t need a separate DBMS • DBD::CSV, DBD::Excel • DBD::Amazon, DBD::Google • use DBI; my $dbh = DBI->connect("dbi:Google:", $KEY); my $sth = $dbh->prepare(qq[ SELECT title, URL FROM google WHERE q = "perl" ]); while (my $r = $sth->fetchrow_hashref) { ...
  • 102. Step1: Getting Drivers Essential for SQL Querying • A driver is a piece of software that lets your operating system talk to a database – Installed drivers visible in ODBC manager • “data connectivity” tool • Each database engine (Oracle, MySQL, etc) requires its own driver – Generally must be installed by user • Drivers are needed by Data Source Name tool and querying programs • Require (simple) installation
  • 103. MySQL Driver: Needed to Query MySQL Databases • Windows: Download MySQL Connector/ODBC 3.51 here • Must be installed for direct querying using e.g. Excel – Not necessary if you are using the MySQL Query Browser
  • 104. Exercise 2 • Fetch a sequence by adapting live.pl and do a remote BLAST using 3 different scoring matrices (summarize the results), and perform “controls” using an adaptation of shuffle … • Rat versus mouse RBP • Rat versus bacterial lipocalin
  • 105. Parsing BLAST Using BPlite, BPpsilite, and BPbl2seq • Similar to Search and SearchIO in basic functionality • However: – Older and will likely be phased out in the near future – Substantially limited advanced functionality compared to Search and SearchIO – Important to know about because many legacy scripts use these objects and may need to be converted
  • 106. Parse BLAST output
        #!/opt/perl/bin/perl -w
        # bioperl_blast_parse.pl
        # program prints out query, and all hits with scores for each blast result
        use Bio::SearchIO;
        my $record = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);
        while (my $result = $record->next_result){
            print ">", $result->query_name, " ", $result->query_description, "\n";
            my $seen = 0;
            while (my $hit = $result->next_hit){
                print "\t", $hit->name, "\t", $hit->bits, "\t", $hit->significance, "\n";
                $seen++
            }
            if ($seen == 0 ) { print "No Hits Found\n" }
        }
  • 107. Parse BLAST in a little more detail
        #!/opt/perl/bin/perl -w
        # bioperl_blast_parse_hsp.pl
        # program prints out query, and all hsps with scores for each blast result
        use Bio::SearchIO;
        my $record = Bio::SearchIO->new(-format => 'blast', -file => $ARGV[0]);
        while (my $result = $record->next_result){
            print ">", $result->query_name, " ", $result->query_description, "\n";
            my $seen = 0;
            while (my $hit = $result->next_hit){
                $seen++;
                while (my $hsp = $hit->next_hsp){
                    print "\t", $hit->name, " has an HSP with an evalue of: ", $hsp->evalue, "\n";
                }
            }
            if ($seen == 0 ) { print "No Hits Found\n" }
        }
  • 108. Shuffle
        #!/usr/bin/perl -w
        use strict;
        my ($def, @seq) = <>;            # first line is the FASTA header
        print $def;
        chomp @seq;
        @seq = split(//, join("", @seq));
        my $count = 0;
        while (@seq) {
            my $index = rand(@seq);      # pick a random remaining position
            my $base = splice(@seq, $index, 1);
            print $base;
            print "\n" if ++$count % 60 == 0;
        }
        print "\n" unless $count % 60 == 0;
  • 109. Searching for Sequence Similarity • BLAST with BioPerl • Parsing Blast and FASTA Reports – Search and SearchIO – BPLite, BPpsilite, BPbl2seq • Parsing HMM Reports • Standalone BioPerl BLAST
  • 110. Remote Execution of BLAST • BioPerl has built in capability of running BLAST jobs remotely using RemoteBlast.pm • Runs these jobs at NCBI automatically – NCBI has dynamic configurations (server side) to “always” be up and ready – Automatically updated for new BioPerl Releases • Convenient for independent researchers who do not have access to huge computing resources • Quick submission of Blast jobs without tying up local resources (especially if working from standalone workstation) • Legal Restrictions!!!
  • 111. Example of Remote Blast A script to run a remote blast would be something like the following skeleton:
        $remote_blast = Bio::Tools::Run::RemoteBlast->new(
            '-prog' => 'blastp', '-data' => 'ecoli', '-expect' => '1e-10' );
        $r = $remote_blast->submit_blast("t/data/ecolitst.fa");
        while (@rids = $remote_blast->each_rid ) {
            foreach $rid ( @rids ) { $rc = $remote_blast->retrieve_blast($rid); }
        }
    In this example we are running a blastp (pairwise comparison) using the ecoli database and an e-value threshold of 1e-10. The sequences that are being compared are located in the file “t/data/ecolitst.fa”.
  • 112. Example It is important to note that all command line options that fall under the blastall umbrella are available under RemoteBlast.pm. For example you can change some parameters of the remote job. Consider the following example: $Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'} = 'BLOSUM25'; This basically allows you to change the matrix used to BLOSUM 25, rather than the default of BLOSUM 62.
  • 113. Parsing BLAST and FASTA Reports • Main BioPerl objects in 1.2 are Search.pm/SearchIO.pm – SearchIO is more robust and the preferred choice (will be continued to be supported in future releases) • Support parsing of BLAST XML reports and other • Also allow the ability to parse HMMER reports • Will continue to grow and provide functionality for parsing all types of reports. This way multiple report types can be handled by simply creating multiple instantiations of the SearchIO object.
  • 114. Parsing Blast Reports • One of the strengths of BioPerl is its ability to parse complex data structures, such as a BLAST report. • Unfortunately, there is a bit of arcane terminology. • Also, you have to ‘think like BioPerl’ in order to figure out the syntax. • This next script might get you started
  • 115. Sample Script to Read and Parse BLAST Report
        # Get the report
        $searchio = new Bio::SearchIO (-format => 'blast', -file => $blast_report);
        $result = $searchio->next_result;
        # Get info about the entire report
        $result->database_name;
        $algorithm_type = $result->algorithm;
        # get info about the first hit
        $hit = $result->next_hit;
        $hit_name = $hit->name;
        # get info about the first hsp of the first hit
        $hsp = $hit->next_hsp;
        $hsp_start = $hsp->query->start;