This presentation is available under the Creative Commons
Attribution-ShareAlike 3.0 Unported License. Please refer to
http://www.bits.vib.be/ if you use this presentation or parts
hereof.
Introduction to Linux
for Bioinformatics
Productivity
Joachim Jacob
5 and 12 May 2014
2 of 17
Multiple commands
In bash, several commands can be put on one line when they are
separated by ";"
$ wget http://homepage.tudelft.nl/19j49/t-SNE_files/tSNE_linux.tar.gz ; tar xvfz tSNE_linux.tar.gz
3 of 17
Multiple commands
Commands in a one-liner can also be separated by
&& or ||.
&& : only execute the command if the preceding one
finished successfully.
$ curl corz.org/ip && echo 'n'
|| (not a pipe!) : the inverse of the above. Only execute
the command if the preceding one did not end
successfully.
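A minimal sketch of the three separators side by side. It uses the standard commands true (always succeeds) and false (always fails) as stand-ins for real commands; the variable name log is only for illustration.

```shell
#!/usr/bin/env bash
# 'true' always succeeds (exit status 0); 'false' always fails (exit status 1).
log=""

# ';' runs the second command regardless of how the first one ended.
false ; log="$log semicolon"

# '&&' runs the second command only if the first one succeeded.
true  && log="$log and-after-true"
false && log="$log and-after-false"    # skipped

# '||' runs the second command only if the first one failed.
true  || log="$log or-after-true"      # skipped
false || log="$log or-after-false"

echo "$log"
```

Only the assignments whose condition is met are executed, so the two lines marked "skipped" leave no trace in the output.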
4 of 17
Piping a list of files with xargs
A pipe passes the output of one command to the input of the next.
Some commands require the file name to be
passed, instead of the content of the file. E.g. the first
command below works, but the second does not:
$ ls | less
$ ls | file
Usage: file [-bchikLlNnprsvz0] [--apple] [--mime-encoding] [--mime-type]
            [-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...
       file -C [-m magicfiles]
       file [--help]
5 of 17
Piping a list of files with xargs
Some commands require the file name to be
passed, instead of the content of the file.
xargs passes the output of a command as a list of
arguments to another program.
$ ls | xargs file
bin: directory
buddy.sh: Bourne-Again shell script, ASCII text executable
Compression_exercise: directory
Desktop: directory
Documents: directory
Downloads: directory
FastQValidator.0.1.1.tgz: gzip compressed data, from Unix, last modified: Fri Oct 19 16:44:23 2012
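One caveat: ls | xargs splits its input on whitespace, so a file name containing a space is passed on as several arguments. A sketch of a more robust variant, assuming a find and xargs that support -print0 and -0 (the GNU and BSD versions do). The scratch directory and file names are made up, and echo stands in for file so that the arguments can be counted.

```shell
#!/usr/bin/env bash
# Build a scratch directory with one awkward file name (hypothetical names).
workdir=$(mktemp -d)
touch "$workdir/reads.fastq" "$workdir/my notes.txt"

# Naive: whitespace splitting turns 2 file names into 3 arguments.
naive_count=$(ls "$workdir" | xargs -n 1 echo | wc -l | tr -d ' ')

# Robust: NUL-separated names keep their spaces intact.
safe_count=$(find "$workdir" -type f -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

echo "naive: $naive_count arguments, safe: $safe_count arguments"
rm -r "$workdir"

# The same pattern with the command from the slide would be:
#   find . -type f -print0 | xargs -0 file
```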
6 of 17
.bashrc
~/.bashrc is a hidden configuration file for bash in
your home directory.
Among other things, it can configure the prompt in your terminal
and define aliases for commands.
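As an illustration, a ~/.bashrc could contain lines like the following. This is only a sketch: the prompt layout is an arbitrary choice, not a recommendation from the course.

```shell
# Hypothetical fragment of ~/.bashrc

# Prompt of the form user@host:current-directory$
# (\u, \h, \w and \$ are expanded by bash each time the prompt is shown)
PS1='\u@\h:\w\$ '

# Alias: 'll' becomes a long listing with human-readable sizes
alias ll='ls -lh'
```

Changes take effect in every new terminal, or in the current one after running: source ~/.bashrc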
7 of 17
alias example
When you enter a command, bash first checks whether its
first word is an alias. If it is, bash replaces that word
with the text of the alias before running the command.
You can specify aliases in .bashrc. An example:
8 of 17
Alias example
Some interesting aliases
alias ll='ls -lh'
alias dirsize="du -sh */"
alias uncom='grep -v -E "^#|^$"'
alias hosts="cat /etc/hosts"
alias dedup="awk '! x[\$0]++' "
Aliases are perfectly suited for storing one-liners: find
some at
https://wikis.utexas.edu/display/bioiteam/Scott%27s+list+of+linux+one-liners
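To see what two of these one-liners do, the commands behind uncom and dedup can be run directly on a small sample. The sample text is made up for the demonstration.

```shell
#!/usr/bin/env bash
# Made-up sample: a comment line, an empty line and a duplicated data line.
sample=$(printf '# header\nchr1\n\nchr2\nchr1\n')

# uncom: drop comment lines (starting with #) and empty lines.
uncommented=$(printf '%s\n' "$sample" | grep -v -E '^#|^$')

# dedup: print each line only the first time it is seen.
# x[$0]++ is 0 (false) on first sight, so '!' makes the line print once.
deduplicated=$(printf '%s\n' "$uncommented" | awk '! x[$0]++')

printf '%s\n' "$deduplicated"
```

Note that $0 here must reach awk unchanged. Inside a double-quoted alias definition it has to be written as \$0, otherwise the shell replaces it with its own name before awk ever runs.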
9 of 17
Alias exercise
→ exercise link
10 of 17
Finding stuff: locate
Extremely quick and convenient:
locate
However, it won't find files created since its database was last updated.
To find those, first update the database by running (usually as root):
updatedb
It accepts wildcards. Quote them so the shell does not expand them first. Example:
$ locate '*.sam'
Bonus: How to filter on a certain location?
11 of 17
Finding stuff: find
A more elaborate tool to find stuff:
$ find . -name alignment.sam
Without options, find lists everything below the starting directory. Narrow the search with options:
-name : to search on the name of the file
-type : to search on the type: (f)ile, (d)irectory, (l)ink
-perm : to search on the permissions (octal, e.g. 755, or symbolic, e.g. u=rwx)
…
This is the power tool to find stuff.
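A small sketch combining these options, run in a throwaway directory so that it is safe to try. The directory layout and file names are invented for the example.

```shell
#!/usr/bin/env bash
# Scratch directory with an invented layout.
workdir=$(mktemp -d)
mkdir "$workdir/run1" "$workdir/run2"
touch "$workdir/run1/alignment.sam" "$workdir/run2/alignment.sam"
touch "$workdir/run1/notes.txt"
chmod 700 "$workdir/run1/notes.txt"

# -name: match on the file name (quote wildcards so find sees them).
sam_count=$(find "$workdir" -name '*.sam' | wc -l | tr -d ' ')

# -type d: directories only (the starting directory itself is included).
dir_count=$(find "$workdir" -type d | wc -l | tr -d ' ')

# -perm 700: files whose permissions are exactly rwx------.
private=$(find "$workdir" -type f -perm 700)

echo "$sam_count SAM files, $dir_count directories, private: $private"
rm -r "$workdir"
```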
12 of 17
Finding stuff: find
The most powerful option of find:
-exec Execute a command on the found entities.
$ find . -name '*.gz'
./DRR000542_2.fastq.subset.gz
./DRR000542_1.fastq.subset.gz
./DRR000545_2.fastq.subset.gz
./DRR000545_1.fastq.subset.gz
$ find . -name '*.gz' -exec gunzip {} \;
$ ls
DRR000542_1.fastq.subset DRR000545_1.fastq.subset
DRR000542_2.fastq.subset DRR000545_2.fastq.subset
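A sketch of -exec in a throwaway directory. The pattern is quoted and the semicolon is escaped because the shell would otherwise interpret *.gz and ; itself before find sees them. The file names are invented; the + terminator shown at the end is a standard alternative that hands many files to a single command invocation.

```shell
#!/usr/bin/env bash
# Scratch directory with two small gzip files (invented names).
workdir=$(mktemp -d)
echo "read 1" > "$workdir/sample_1.fastq"
echo "read 2" > "$workdir/sample_2.fastq"
gzip "$workdir/sample_1.fastq" "$workdir/sample_2.fastq"

# Run gunzip once per match; {} is replaced by the file found,
# and \; marks the end of the command given to -exec.
find "$workdir" -name '*.gz' -exec gunzip {} \;

# With '+' instead of '\;', find passes all matches to one command.
unpacked=$(find "$workdir" -name '*.fastq' -exec ls {} + | wc -l | tr -d ' ')
left_gz=$(find "$workdir" -name '*.gz' | wc -l | tr -d ' ')

echo "$unpacked unpacked files, $left_gz .gz files left"
rm -r "$workdir"
```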
14 of 17
Command substitution in bash
In bash, the output of commands can be directly
stored in a variable. Put the command between backticks.
$ test=`ls -l`
$ echo $test
total 7929624 -rw-rw-r-- 1 joachim joachim 15326 May 10
2013 0538c2b.jpg -rw-rw-r-- 1 joachim joachim 4914797 Nov
8 16:15 18d7alY
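The same idea can be written with the $( ) form, which is equivalent to backticks but easier to read and to nest. The sketch below also shows why the listing above appears on a single line: without double quotes around the variable, the shell turns the newlines into spaces. The directory and file names are invented.

```shell
#!/usr/bin/env bash
# Scratch directory with two files (invented names).
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt"

# $( ) captures the output of a command, just like backticks.
listing=$(ls "$workdir")

# Unquoted: word splitting turns the newlines into single spaces.
unquoted=$(echo $listing)

# Quoted: the variable is printed as captured, one name per line.
quoted=$(echo "$listing")

echo "unquoted: $unquoted"
echo "quoted:"
echo "$quoted"
rm -r "$workdir"
```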
15 of 17
Command substitution in bash
A variable can also contain a list. A list contains
several entities (e.g. files).
Extracting the first 1,000,000 lines from each compressed text file:
for file in `ls DRR00054*tar.gz`
do
  zcat "$file" | head -n 1000000 > "${file%.gz}.subset"
done
The output of ls is turned into a list. 'for' assigns each file name in
turn to the variable file, which is then used in the zcat | head
one-liner.
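A variant of this loop that iterates over a wildcard pattern directly instead of the output of ls, which keeps file names with spaces intact. This is a sketch, not the course's own solution: the file names, the tiny line count and the use of gzip -dc (equivalent to zcat for .gz files) are choices made for the demonstration.

```shell
#!/usr/bin/env bash
# Scratch directory with two small compressed text files (invented names).
workdir=$(mktemp -d)
for n in 1 2; do
    printf 'line1\nline2\nline3\n' | gzip > "$workdir/sample_$n.txt.gz"
done

# Loop over the pattern itself; the shell expands it into a list of files.
for file in "$workdir"/sample_*.txt.gz; do
    # ${file%.gz} strips the trailing .gz from the name.
    gzip -dc "$file" | head -n 2 > "${file%.gz}.subset"
done

subset_lines=$(wc -l < "$workdir/sample_1.txt.subset" | tr -d ' ')
subset_count=$(ls "$workdir"/*.subset | wc -l | tr -d ' ')
echo "$subset_count subset files with $subset_lines lines each"
rm -r "$workdir"
```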
16 of 17
Keywords
.bashrc
;
alias
prompt
locate
find
Command substitution
Write in your own words what the terms mean
17 of 17
Exercises
→ Concatenate the contents of fastq files


Part 6 of "Introduction to linux for bioinformatics": Productivity tips
