1. Regular expressions
Used by several different UNIX commands, including
ed, sed, awk, grep
A period ‘.’ matches any single character
.X. matches any X that is surrounded by any two
characters
Caret character ^ matches the beginning of the line
^Bridgeport matches the characters Bridgeport only if
they occur at the beginning of the line
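The period and caret rules above can be checked with grep. This is a minimal sketch: the sample lines and the variable names (sample, dot_hits, caret_hits) are made up for illustration.

```shell
# Hypothetical sample data; any text file would do
sample=$(mktemp)
printf '%s\n' 'Bridgeport is in Connecticut' 'I live in Bridgeport' 'aXb' 'X' > "$sample"

# .X. needs one character on each side of the X, so only "aXb" matches
dot_hits=$(grep -c '.X.' "$sample")

# ^Bridgeport matches only when the line starts with Bridgeport
caret_hits=$(grep -c '^Bridgeport' "$sample")

echo "lines matching .X.: $dot_hits"
echo "lines matching ^Bridgeport: $caret_hits"
rm -f "$sample"
```

The pattern is kept in single quotes so the shell passes it to grep unchanged.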
2. Regular expressions (continued)
A dollar sign ‘$’ is used to match the end of the line
Bridgeport$ will match the characters Bridgeport only if
they are the very last characters on the line
.$ matches any single character at the end of the line
To match a period character itself, the period should be
preceded by a backslash ‘\’ to remove its special
meaning
\.$ matches any line that ends with a period
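A short sketch of the end-of-line anchor, and of escaping the period with a backslash so that it is taken literally. The sample lines and variable names are illustrative assumptions.

```shell
sample=$(mktemp)
printf '%s\n' 'I live in Bridgeport' 'Bridgeport is a city.' 'No period here' > "$sample"

# Bridgeport$ : Bridgeport must be the last thing on the line
end_hits=$(grep 'Bridgeport$' "$sample")

# \.$ : the backslash makes the period literal, so only lines ending in "." match
period_hits=$(grep '\.$' "$sample")

# .$ : unescaped, the period matches any final character,
# so every non-empty line matches
any_hits=$(grep -c '.$' "$sample")

echo "$end_hits"
echo "$period_hits"
echo "$any_hits"
rm -f "$sample"
```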
3. Regular expressions (continued)
^$ matches any line that contains no characters
[…] is used to match any one of the characters enclosed in […]
[tT]he matches a lower or upper case t followed
immediately by the characters he
[A-Z] matches any upper case letter
[A-Za-z] matches any upper or lower case letter
[^A-Z] matches any character except an upper case letter
[^A-Za-z] matches any non-alphabetic character
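The bracket rules above can be tried with grep -c, which counts matching lines. This is a sketch on made-up data; LC_ALL=C is set as an assumption so that ranges such as A-Z cover only the ASCII letters.

```shell
export LC_ALL=C
sample=$(mktemp)
# Five lines; the fourth one is empty
printf '%s\n' 'the cat' 'The dog' 'ABC' '' '1234' > "$sample"

t_hits=$(grep -c '[tT]he' "$sample")            # "the cat" and "The dog"
empty_hits=$(grep -c '^$' "$sample")            # the one empty line
upper_hits=$(grep -c '[A-Z]' "$sample")         # "The dog" and "ABC"
nonalpha_hits=$(grep -c '[^A-Za-z]' "$sample")  # lines holding a space or a digit

echo "$t_hits $empty_hits $upper_hits $nonalpha_hits"
rm -f "$sample"
```

Note that [^A-Za-z] still has to match one character, so the empty line is not counted.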
4. Regular expressions (continued)
An asterisk (*) matches zero or more occurrences of the
preceding character
X* matches zero, one, two, three, … capital X’s
XX* matches one or more capital X’s
.* matches zero or more occurrences of any character
e.*e matches all the characters from the first e in the
line to the last one
[A-Za-z][A-Za-z]* matches any alphabetic character
followed by zero or more alphabetic characters
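To see what the asterisk actually matches, one option is to let sed replace the matched text with a marker. The input strings and variable names below are assumptions chosen for illustration.

```shell
export LC_ALL=C

# XX* needs at least one X; X* alone also matches the empty string,
# so it "matches" every line
one_or_more=$(printf '%s\n' 'aXXXb' 'ab' | grep -c 'XX*')
zero_or_more=$(printf '%s\n' 'aXXXb' 'ab' | grep -c 'X*')

# e.*e spans from the first e on the line to the last one
greedy=$(echo 'the quick brown fox jumped' | sed 's/e.*e/+/')

# [A-Za-z][A-Za-z]* matches a whole run of letters
word=$(echo '42 apples' | sed 's/[A-Za-z][A-Za-z]*/WORD/')

echo "$one_or_more $zero_or_more"
echo "$greedy"
echo "$word"
```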
5. Regular expressions (continued)
[-0-9] matches a single dash or digit character
(ORDER IS IMPORTANT: the dash must come first or last)
[0-9-] same as [-0-9]
[^-0-9] matches any character except digits and dash
[]a-z] matches a right bracket or lower case letter
(ORDER IS IMPORTANT: the right bracket must come first)
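A brief sketch of where the dash and the right bracket have to sit inside a bracket list to be taken literally. The test lines are invented for this example.

```shell
export LC_ALL=C

# A dash placed first or last in the list is an ordinary character
dash_first=$(printf '%s\n' 'a-b' 'a1b' 'abc' | grep -c '[-0-9]')
dash_last=$(printf '%s\n' 'a-b' 'a1b' 'abc' | grep -c '[0-9-]')

# With ^ the list is negated: only lines containing something
# other than a dash or a digit match
negated=$(printf '%s\n' '-12' '3-4' 'x9' | grep -c '[^-0-9]')

# A right bracket placed first in the list is an ordinary character
bracket=$(printf '%s\n' ']' 'abc' 'XYZ' | grep -c '[]a-z]')

echo "$dash_first $dash_last $negated $bracket"
```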
6. Regular expressions (continued)
\{min,max\} matches a precise number of occurrences
min specifies the minimum number of occurrences of the
preceding regular expression to be matched, and max
specifies the maximum
w\{1,10\} matches from 1 to 10 consecutive w’s
[a-zA-Z]\{7\} matches exactly seven alphabetic characters
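In the basic regular expressions used by grep and sed, the braces of an interval are written with backslashes, \{ and \}. The sketch below anchors each pattern with ^ and $ so the count applies to the whole line; the input lines are assumptions.

```shell
export LC_ALL=C

# Exactly seven letters on the line: only "abcdefg" qualifies
seven=$(printf '%s\n' 'abcdefg' 'abcdef' 'abcdefgh' | grep -c '^[a-zA-Z]\{7\}$')

# Between one and ten w's: "w" and "wwww" qualify, "x" does not
w_range=$(printf '%s\n' 'w' 'wwww' 'x' | grep -c '^w\{1,10\}$')

# At least five X's: a max left empty after the comma means "no upper limit"
at_least=$(printf '%s\n' 'XXXX' 'XXXXX' 'XXXXXXXX' | grep -c '^X\{5,\}$')

echo "$seven $w_range $at_least"
```

Without the anchors, a line with eight letters would still contain a run of seven and would match.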
7. Regular expressions (continued)
X\{5,\} matches at least five consecutive X’s
\(…\) is used to save matched characters
^\(.\) matches the first character on the line and stores it
in register one
There are nine registers, numbered 1 through 9
To retrieve what is stored in register n, \n is used
Example: ^\(.\)\1 matches the first two characters on a
line if they are both the same character
8. Regular expressions (continued)
^\(.\).*\1$ matches all lines in which the first
character on the line is the same as the last.
Note: .* matches all the characters in between
^\(...\)\(...\) the first three characters on the line
will be stored into register 1 and the next three
characters into register 2
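Saved registers are easiest to see in action with grep and sed. In this sketch, \( and \) save the matched text and \1 and \2 retrieve it; the sample words are assumptions.

```shell
# First and last character identical: "abca" and "xx" match, "abcd" does not
same_ends=$(printf '%s\n' 'abca' 'abcd' 'xx' | grep -c '^\(.\).*\1$')

# First two characters identical
doubled=$(printf '%s\n' 'aardvark' 'apple' | grep '^\(.\)\1')

# Registers can also be used in a sed replacement:
# swap the first three characters with the next three
swapped=$(echo 'abcdefgh' | sed 's/^\(...\)\(...\)/\2\1/')

echo "$same_ends"
echo "$doubled"
echo "$swapped"
```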
9. cut
$ who
bgeorge pts/16 Oct 5 15:01 (216.87.102.204)
abakshi pts/13 Oct 6 19:48 (216.87.102.220)
tphilip pts/11 Oct 2 14:10 (AC8C6085.ipt.aol.com)
$ who | cut -c1-8,18-
bgeorge Oct 5 15:01 (216.87.102.204)
abakshi Oct 6 19:48 (216.87.102.220)
tphilip Oct 2 14:10 (AC8C6085.ipt.aol.com)
$
Used in extracting various fields of data from a data file or the
output of a command
Format: cut -cchars file
chars specifies what characters to extract from each line of file.
10. cut (continued)
Examples: -c5 (character 5), -c1,3,4 (characters 1, 3 and 4),
-c10-15 (characters 10 through 15), -c5- (character 5 through
the end of the line)
The -d and -f options are used with cut when you
have data that is delimited by a particular
character
Format: cut -ddchar -ffields file
dchar: delimiter of the fields (default: tab
character)
fields: fields to be extracted from file
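The character and field forms of cut can be compared on two small inputs. This is a sketch: the alphabet string and the colon-delimited record are made-up inputs, not data from the slides.

```shell
line='abcdefghijklmnopqrst'

c5=$(echo "$line" | cut -c5)            # the fifth character
c134=$(echo "$line" | cut -c1,3,4)      # characters 1, 3 and 4
c10_15=$(echo "$line" | cut -c10-15)    # characters 10 through 15
c18_end=$(echo "$line" | cut -c18-)     # character 18 to the end of the line

# -d sets the delimiter, -f selects fields
record='root:x:0:0:root:/root:/bin/sh'
user=$(echo "$record" | cut -d: -f1)
home_shell=$(echo "$record" | cut -d: -f6,7)

echo "$c5 $c134 $c10_15 $c18_end"
echo "$user $home_shell"
```

When several fields are selected, cut joins them with the same delimiter it split on.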
11. cut (continued)
$ cat phonebook
Edward 336-145
Alice 334-121
Sony 332-336
Robert 326-056
$ cut -f1 phonebook
Edward
Alice
Sony
Robert
$
15. paste (continued)
Example:
$ cat students
Sue
Vara
Elvis
Luis
Eliza
$ cat sid
578426
452869
354896
455468
335123
$ paste students sid
Sue 578426
Vara 452869
Elvis 354896
Luis 455468
Eliza 335123
$
16. paste (continued)
The -s option tells paste to paste together
lines from the same file rather than from alternate
files
To change the delimiter, the -d option is used
17. paste (continued)
Examples:
$ paste -d '+' students sid
Sue+578426
Vara+452869
Elvis+354896
Luis+455468
Eliza+335123
$ paste -s students
Sue Vara Elvis Luis Eliza
$ ls | paste -d ' ' -s -
addr args list mail memo name nsmail phonebook programs roster sid
students test tp twice user
$
18. sed
sed (stream editor) is a program used for editing
data
Unlike ed, sed cannot be used interactively
Format: sed command file
command: applied to each line of the specified file
file: if no file is specified, then standard input is
assumed
sed writes the output to the standard output
The s/Unix/UNIX/ command is applied to every line in
the file; it replaces the first Unix on each line with UNIX
19. sed (continued)
sed makes no changes to the original input file
The 's/Unix/UNIX/g' command is applied to every line in the
file. It replaces every Unix with UNIX. “g” means global
With the -n option, only selected lines are printed
Example: sed -n '1,2p' file prints the first two
lines
Example: sed -n '/UNIX/p' file prints any line
containing UNIX
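The substitute and print commands can be tried on a three-line file. This is a minimal sketch; the file contents and variable names are illustrative assumptions.

```shell
text=$(mktemp)
printf '%s\n' 'Unix and Unix' 'plain line' 'UNIX rules' > "$text"

# Without g, only the first Unix on each line is replaced
first_only=$(sed 's/Unix/UNIX/' "$text" | head -n 1)

# With g, every Unix on the line is replaced
global=$(sed 's/Unix/UNIX/g' "$text" | head -n 1)

# -n suppresses the default output; p prints only the selected lines
first_two=$(sed -n '1,2p' "$text")
with_unix=$(sed -n '/UNIX/p' "$text")

# The input file itself is left untouched
original=$(head -n 1 "$text")

echo "$first_only"
echo "$global"
echo "$with_unix"
rm -f "$text"
```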
20. sed (continued)
Example: sed '1,2d' file deletes lines 1 and 2
Example: sed -n 'l' text prints all lines from
text,
showing non-printing characters as \nn (octal) and tab
characters as “>” (POSIX versions of sed show tabs as \t)
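A small sketch of the delete and list commands on input supplied through a pipe. The exact way l renders control characters differs between sed versions, so no particular output is assumed for that part.

```shell
# d deletes the addressed lines; everything else is printed as usual
rest=$(printf '%s\n' one two three four | sed '1,2d')
echo "$rest"

# l makes invisible characters visible; here the input contains one tab
printf 'a\tb\n' | sed -n 'l'
```

Note that d is used without -n, since the lines that are not deleted should still be printed.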
21. tr
The tr filter is used to translate characters from standard
input
Format: tr from-chars to-chars
Result is written to standard output
Example: tr e x < file translates every “e” in file to “x” and
prints the output to the standard output
The octal representation of a character can be given to “tr”
in the format \nnn
Example: tr : '\11' will translate all : to tabs
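Both forms of translation can be tried on input from echo instead of a file. The strings below are assumptions; the octal escape is quoted so that the shell hands the backslash to tr untouched.

```shell
# Every e becomes x
swapped=$(echo 'hello there' | tr e x)

# '\11' is the octal code for a tab
tabbed=$(echo 'a:b:c' | tr : '\11')
expected=$(printf 'a\tb\tc')

echo "$swapped"
echo "$tabbed"
```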
22. tr (continued)
Character          Octal value
Bell               7
Backspace          10
Tab                11
New line           12
Linefeed           12
Form feed          14
Carriage return    15
Escape             33
23. tr (continued)
Example: tr '[a-z]' '[A-Z]' < file translates all lower
case letters in file to their upper case equivalents.
The character ranges [a-z] and [A-Z] are
enclosed in quotes to keep the shell from replacing
them with any files named a through z and A
through Z
To “squeeze” out multiple occurrences of
characters, the -s option is used
24. tr (continued)
Example: tr -s ' ' ' ' < file will squeeze multiple spaces
to one space
The -d option is used to delete single characters from a
stream of input
Format: tr -d from-chars
Example: tr -d ' ' < file will delete all spaces from the
input stream
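The three uses of tr described above (case conversion, squeezing, deleting) can be sketched side by side. The input strings are invented for the example.

```shell
# Lower case to upper case; the quotes protect the brackets from the shell
upper=$(echo 'unix shell' | tr '[a-z]' '[A-Z]')

# -s squeezes each run of spaces down to a single space
squeezed=$(echo 'a   b    c' | tr -s ' ' ' ')

# -d deletes every space
deleted=$(echo 'a b c' | tr -d ' ')

echo "$upper"
echo "$squeezed"
echo "$deleted"
```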
25. grep
Searches one or more files for a particular
character pattern
Format: grep pattern files
Example: grep path .cshrc will print every line
in the .cshrc file that contains the pattern “path”
Example: grep bin .cshrc .login .profile will print
every line from any of the three files .cshrc, .login
and .profile that contains the pattern “bin”
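The difference between searching one file and several can be seen in a scratch directory. This is a sketch: the contents of the three dot files are made up, and a temporary directory is used so that no real startup file is touched.

```shell
dir=$(mktemp -d)
printf '%s\n' 'set path = (/bin /usr/bin)' 'alias ll ls -l' > "$dir/.cshrc"
printf '%s\n' 'echo welcome' > "$dir/.login"
printf '%s\n' 'PATH=/bin:/usr/bin' > "$dir/.profile"

# One file: matching lines are printed as they are
one_file=$(cd "$dir" && grep path .cshrc)

# Several files: each matching line is prefixed with its file name
many_files=$(cd "$dir" && grep bin .cshrc .login .profile)

echo "$one_file"
echo "$many_files"
rm -rf "$dir"
```

The file name prefix is how grep shows which file each match came from.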
26. grep (continued)
Example: grep * smarts will not work as intended
because the shell substitutes * with the names of
all files in the current directory
Example: grep '*' smarts
The quotes keep the shell from expanding *, so grep
receives two arguments: * and smarts
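What the shell does to an unquoted asterisk can be shown with echo before running grep. This sketch assumes a scratch directory holding a file named smarts (contents invented) and one extra, hypothetical file named other.

```shell
dir=$(mktemp -d)
printf '%s\n' '* starred line' 'plain line' > "$dir/smarts"
: > "$dir/other"

# Unquoted, * is replaced by the file names before the command runs
expanded=$(cd "$dir" && echo *)

# Quoted, grep receives a literal * as its pattern
quoted=$(cd "$dir" && grep '*' smarts)

echo "$expanded"
echo "$quoted"
rm -rf "$dir"
```

In this directory an unquoted grep * smarts would run as grep other smarts smarts, that is, it would search for the word other.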
27. sort
By default, sort takes the lines of the specified input file and
sorts them into ascending order
$ cat students
Sue
Vara
Elvis
Luis
Eliza
$ sort students
Eliza
Elvis
Luis
Sue
Vara
$
28. sort (continued)
The -u option tells sort to eliminate
duplicate lines from the output
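Duplicate lines are removed by sort when it is given the -u option. A minimal sketch, using a made-up list of names and LC_ALL=C so the ordering is predictable:

```shell
export LC_ALL=C

# Ash appears twice in the input but only once in the output
unique=$(printf '%s\n' Sue Vara Ash Elvis Ash | sort -u)
echo "$unique"
```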
29. sort (continued)
$ echo Ash >> students
$ cat students
Sue
Vara
Elvis
Luis
Eliza
Ash
Ash
$ sort students
Ash
Ash
Eliza
Elvis
Luis
Sue
Vara
30. sort (continued)
The -r option reverses the order of the sort
The -o option is used to direct the output to a file
instead of standard output
sort students > sorted_students works the same as sort
students -o sorted_students
The -o option also allows sorting a file and saving the output
to the same file
Example:
sort students -o students      correct
sort students > students       incorrect (the shell empties
students before sort reads it)
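Reversing a sort and sorting a file in place can both be tried on a temporary copy of the student list. This is a sketch; the variable names are assumptions and the -o option is written before the file name, which all versions of sort accept.

```shell
export LC_ALL=C
students=$(mktemp)
printf '%s\n' Sue Vara Elvis Luis Eliza > "$students"

# -r reverses the order, so the alphabetically last name comes first
reversed_first=$(sort -r "$students" | head -n 1)

# -o may name the input file itself: sort reads all input before writing
sort -o "$students" "$students"
sorted_first=$(head -n 1 "$students")
count=$(wc -l < "$students" | tr -d ' ')

# By contrast, sort "$students" > "$students" would lose the data,
# because the shell truncates the file before sort starts

echo "$reversed_first $sorted_first $count"
rm -f "$students"
```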
31. sort (continued)
The -n option tells sort to treat the first field
as a number and to sort the data arithmetically
33. sort (continued)
To sort by the second field, +1n should be used
instead of -n. +1 says to skip the first field
+5n would mean to skip the first five fields on
each line and then sort the data numerically
(current versions of sort write these as -k 2n and -k 6n)
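Numeric sorting and sorting on a later field are sketched below. Note the assumption: the -k form is used in place of the older +1n notation, since current versions of sort may reject the + form. The data lines are invented.

```shell
export LC_ALL=C
data=$(mktemp)
printf '%s\n' '10 3' '9 20' '100 1' > "$data"

# Plain sort compares character by character, so "10 3" sorts before "9 20"
lexical_first=$(sort "$data" | head -n 1)

# -n compares the first field as a number: 9 < 10 < 100
numeric_first=$(sort -n "$data" | head -n 1)

# -k 2n sorts numerically on the second field: 1 < 3 < 20
second_field_first=$(sort -k 2n "$data" | head -n 1)

echo "$lexical_first"
echo "$numeric_first"
echo "$second_field_first"
rm -f "$data"
```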
35. uniq
Used to find duplicate lines in a file
Format: uniq in_file out_file
uniq will copy in_file to out_file removing
any duplicate lines in the process
uniq’s definition of duplicate lines is
consecutive lines that match
exactly
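Because uniq only compares neighbouring lines, it is commonly combined with sort. The sketch below uses a made-up list in which one duplicate is not adjacent to the others.

```shell
export LC_ALL=C

# Only the adjacent pair of Ash lines collapses; the first Ash survives separately
adjacent=$(printf '%s\n' Ash Sue Ash Ash | uniq)

# Sorting first brings all duplicates together
sorted=$(printf '%s\n' Ash Sue Ash Ash | sort | uniq)

# -d prints one copy of each line that occurs more than once
dups=$(printf '%s\n' Ash Sue Ash Ash | sort | uniq -d)

echo "$adjacent"
echo "$sorted"
echo "$dups"
```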
36. uniq (continued)
$ cat students
Sue
Vara
Elvis
Luis
Eliza
Ash
Ash
$ uniq students
Sue
Vara
Elvis
Luis
Eliza
Ash
$
The -d option is used to list only the duplicate lines
Example:
$ uniq -d students
Ash
$