Winter Training, December 2011
 Unix and Shell Programming
     Department of COE and SE,
    Delhi Technological University



Instructor: Divyashikha Sethia
Contents

UNIT 1: INTRODUCTION TO UNIX

UNIT 2: SHELL SCRIPTING

UNIT 3: ADVANCED SHELL SCRIPTING, SED, AND AWK


UNIT 1: INTRODUCTION TO UNIX

1. THE UNIX OPERATING SYSTEM – AN OVERVIEW

2. UNIX COMMANDS

3. UNIX FILE SYSTEM

4. THE VI TEXT EDITOR
Unix shell program training
COE
                                                                                                                    Unit 1, Lesson 1




LESSON 1                      THE UNIX OPERATING SYSTEM – AN OVERVIEW

1. THE UNIX OPERATING SYSTEM – AN OVERVIEW

  1.0       OBJECTIVES
  1.1       INTRODUCTION
  1.2       INTRODUCTION TO THE COMPUTERS
      1.2.1 Typical hardware components of a computer
  1.3       OPERATING SYSTEM
      1.3.1 Virtual Memory
  1.4       UNIX OPERATING SYSTEM
      1.4.1 History of UNIX
      1.4.2 Importance of UNIX
  1.5       UNIX OPERATING SYSTEM – ATTRIBUTES AND COMPONENTS
  1.6       STARTING WITH UNIX
  1.7       CHANGING YOUR PASSWORD
  1.8       ENTERING COMMANDS IN THE UNIX SYSTEM
      1.8.1 Command Options and Arguments
  1.9       SUMMING UP
  1.10      ANSWERS TO THE SELF CHECK QUESTIONS
  1.11      TERMINAL QUESTIONS
  1.12      REFERENCES

  1. The UNIX Operating System – An Overview


The use and influence of computers has increased steadily over the last few
decades. Today, computers play a pivotal role in all walks of life. An operating
system (OS) is a core component of a computer system. It lets a computer function
as a multi-user, multitasking and multithreading environment, thus augmenting the
power of the computer. UNIX is an operating system that offers its users all these
capabilities along with numerous other features. In this lesson we will look at the
features and components of the UNIX system that make it so useful and popular. In
the subsequent lessons we will explore these features and components in more
detail.



1.0       Objectives
          After going through this lesson, you will be able to

         Understand the concepts of an operating system
         Understand what the UNIX operating system is
         Understand the importance and popularity of the UNIX operating system
         Understand how to start working on a UNIX machine


1.1       Introduction
          In the modern age, computers do wonders: from children playing games to
          scientists launching satellites, computers clearly play an important role.
          It is the operating system that makes computing in the modern world
          possible and efficient.


1.2       Introduction to the computers
          Unlike a calculator, a computer carries out user-specified tasks. An inherent
          power of a computer is that it can be programmed to do a variety of
          tasks. Most computers are general-purpose: the same computer can be used
          to play a game and to perform a circuit simulation.

          A computer consists of hardware and software. A computer can be defined as
          a programmable machine that responds to and executes lists of instructions.
          These lists of instructions are called programs. The hardware components are
          the physical components; software consists of data and instructions.



1.2.1 Typical hardware components of a computer

          The hardware components of a computer are the parts you can see and touch.

         Memory: Enables the computer to store temporary data and instructions.
          It is used by the computer while instructions are being executed.

           While evaluating the following expression, the
           intermediate results are stored in memory
           Sum = 2 + 1 + 3 * 4

         Mass storage devices: Used for bulk storage of data; examples are disk
          drives and tape drives.
         Input devices: The interface through which the user gives instructions to
          the computer. Commonly used input devices are the keyboard, mouse, web
          camera, etc.
         Output devices: Display the results of the processing done by the
          computer. Commonly used output devices are display monitors and printers.
         Central Processing Unit (CPU): The brain of the computer, in which all the
          processing is done. It reads data from memory or input and executes
          instructions. The CPU consists of the ALU (Arithmetic Logic Unit) and the
          CU (Control Unit). The ALU is responsible for all calculations and the CU
          is responsible for fetching instructions and data for execution.
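
          The expression from the Memory item above can be worked through step by
          step. The following is a minimal shell sketch, not part of the original
          course material; the variable names product, partial and sum are chosen
          here purely for illustration, each standing in for an intermediate
          result held in memory.

```shell
# Evaluate Sum = 2 + 1 + 3 * 4, keeping each intermediate result in a variable.
product=$((3 * 4))            # multiplication binds tighter than addition: 12
partial=$((2 + 1))            # the remaining addition: 3
sum=$((partial + product))    # 3 + 12 = 15
echo "Sum = $sum"

# The shell applies the same precedence rules when given the whole expression.
echo "Sum = $((2 + 1 + 3 * 4))"
```

          Both echo lines report the same value, since the shell evaluates 3 * 4
          before the additions.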

          Working with the hardware components directly is very difficult because
          their controls are cryptic. Instead, software components are used to drive
          the hardware components. The operating system is one such piece of software.


1.3       Operating System
          An Operating System (OS) is an important program that runs on the
          computer. It performs very basic tasks such as recognizing input from
          the user, sending output to the display, keeping track of files and
          directories on the disk, and controlling peripheral devices such as
          disk drives and printers.





          The OS also works as a traffic cop: it makes sure that different programs and
          users running at the same time do not interfere with each other. The operating
          system is also responsible for security, blocking unauthorized users.

          Operating systems can be classified as follows:

         Multi-user: Allows multiple users to use the computer at the same time.
         Multiprocessing: Supports running parts of a program in parallel on more
          than one CPU.
         Multitasking: Allows multiple programs to run concurrently on a single CPU.
         Multithreading: Allows different parts of a single program to run concurrently.

          Operating systems provide a platform on which other programs, called
          application programs, can run. The application programs must be written to
          run on a particular operating system. Your choice of operating system,
          therefore, determines to a great extent the applications you can run. For PCs,
          the popular operating systems are DOS, OS/2, Windows and Linux.

1.3.1 Virtual Memory

          Programs that run on a computer may need more memory than is physically
          available on that computer. Many operating systems give the user the
          illusion of a much larger memory. This is done by loading only part of the
          program and its data into physical memory: only the parts needed for
          current execution are brought in. So bigger programs can be run even if
          physical memory is small.





Self-Check Questions
1. A ____________ is a prerecorded set of instructions, which is executed by the
   computer to perform some task.
2. A computer is a specific-purpose machine that cannot be tweaked to perform
   other tasks. (True/False)
3. The operating system keeps the temperature inside the computer down, so that
   it functions properly. (True/False)
4. A ___________ system allows running parts of a program in parallel, on more
   than one CPU.
5. In a _______________ system, a large number of users can use the system
   concurrently.
6. The ____________ memory is an imaginary memory which is used by the
   Operating System to get a larger address space.



1.4   UNIX Operating System
1.4.1 History of UNIX

      The UNIX operating system found its beginnings in MULTICS, which stands
      for Multiplexed Information and Computing Service. The MULTICS project
      began in the mid 1960s as a joint effort by General Electric, the
      Massachusetts Institute of Technology and Bell Laboratories. In 1969 Bell
      Laboratories pulled out of the project.

      One of the Bell Laboratories people involved in the project was Ken Thompson.
      He liked the potential MULTICS had, but felt it was too complex and that the
      same thing could be done in a simpler way. In 1969 he wrote the first version
      of UNIX, called UNICS, which stood for Uniplexed Information and Computing
      Service. Although the operating system has changed, the name stuck and
      was eventually shortened to UNIX.

      Ken Thompson teamed up with Dennis Ritchie, who wrote the first C compiler.
      In 1973 they rewrote the UNIX core (called the kernel) in C. The following
      year a version of UNIX known as the Fifth Edition was first licensed to
      universities. The Seventh Edition, released in 1979, served as a dividing
      point for two divergent lines of UNIX development. These two branches are
      known as SVR4 (System V Release 4) and BSD.

      Ken Thompson spent a year's sabbatical at the University of California at
      Berkeley. While there, he and two graduate students, Bill Joy and Chuck
      Haley, wrote the first Berkeley version of UNIX, which was distributed to
      students. This resulted in the source code being worked on and developed by
      many different people. The Berkeley version of UNIX is known as BSD, Berkeley
      Software Distribution. From BSD came the vi editor, the C shell, virtual
      memory, sendmail, and support for TCP/IP.

1.4.2 Importance of UNIX

          Over the past 25 years the UNIX OS has evolved into a powerful, flexible,
          versatile and robust operating system. It serves as the operating system for
          a variety of computers: single-user personal computers, engineering
          workstations, multi-user microcomputers, minicomputers, mainframes,
          supercomputers, and special application devices. There are
          approximately 20 million machines now running UNIX and more than 100
          million users, and this popularity is expected to grow further. The
          success of UNIX is due to many factors, including its
          portability to a wide range of machines, its adaptability and simplicity, the wide
          range of tasks it can perform, its multi-user and multitasking nature, and its
          suitability for networking. What follows is a description of the features that
          have made the UNIX system so popular.

         Multi-user and Multitasking abilities
          The UNIX OS allows a single computer to be used by many users. It is also a
          multitasking system; that is, it allows more than one application to run on
          the same computer at the same time.

         Powerful command set
          The UNIX OS provides a consistent and powerful set of commands, which has
          made it particularly useful for technical people.

         Combining commands
          UNIX provides constructs such as pipes and redirection, which enable users
          to build their own powerful utilities out of existing UNIX commands.

         Excellent environment for networking
          UNIX offers programs and utilities that provide the services needed to build
          networked applications, the basis for distributed, networked computing. With
          networked computing, information and processing are shared amongst different
          computers in a network. This is useful in client-server computing, where the
          machines on the network can be clients and servers at the same time. UNIX
          systems served as the base for the development of Internet services and
          the growth of the Internet.

         Portability
          The UNIX system is far easier to port to new machines than other
          operating systems. Its portability to almost any computer results
          from its being written almost entirely in the C programming language.
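
          The "Combining commands" feature above is easiest to appreciate by trying
          it. The following is a small illustrative sketch rather than an example
          from the course itself: the scratch directory, the file names fruits.txt
          and unique_fruits.txt, and the sample data are all assumptions made for
          the demonstration. It uses only the standard sort, uniq, wc and cat
          commands.

```shell
# Work in a scratch directory so that no existing files are touched.
work_dir=$(mktemp -d)

# Redirection (>): send the output of printf to a file instead of the screen.
printf 'pear\napple\nfig\napple\n' > "$work_dir/fruits.txt"

# Pipe (|): sort the lines, drop adjacent duplicates, then count what is left.
sort "$work_dir/fruits.txt" | uniq | wc -l

# Pipe plus redirection: save the sorted, de-duplicated list in a new file.
sort "$work_dir/fruits.txt" | uniq > "$work_dir/unique_fruits.txt"
cat "$work_dir/unique_fruits.txt"
```

          None of the three commands knows about the others; the shell connects
          them, which is what lets small commands be combined into new utilities.
          The scratch directory can be deleted afterwards with rm -r "$work_dir".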





1.5   UNIX Operating System – Attributes and Components
      The UNIX operating system is made up of several major components. These
      include the commands, the file system, the shell and the kernel.





         The Commands and User Programs

          UNIX provides a number of built-in commands; in addition, user programs
          can also be run.

         The File System

          The basic unit that stores information in the UNIX system is called a file. The
          UNIX file system provides a logical method of organizing files. Files are
          organized in a hierarchical file system in which files are grouped together in
          directories.

            Example: Hierarchical File Structure
               /dtu/COE_Course/COE_101/schedule
                              Here "dtu" is the parent directory, which is in the
                              root directory "/"; the other directories are inside it.


          An important simplifying feature of the UNIX system is the way it treats
          files. For example, physical devices are treated as files. This permits the
          same command to work on an ordinary file or on a device; that is, the same
          command can be used to write to a file and to a printer.

         The Shell and shell scripts

          The shell is the command interpreter in the UNIX operating system. It reads
          the commands the user types and interprets them as requests to execute a
          program or a set of programs, which it then arranges to carry out. The shell
          also provides a programming language. Shell scripts are covered in the
          subsequent units.

         The kernel

          The kernel is the core of the OS. The kernel interacts directly with the
          hardware through a set of programs called device drivers, which are built into
          the kernel. It provides a set of services that other programs can use, and
          it shields these programs from the hardware layers. The major
          functions of the kernel are to maintain the file system, manage memory,
          control access to the computer, handle interrupts (for example, the
          signal that terminates a process when the user presses Ctrl+C), handle
          errors, and handle I/O, which enables the computer to interact with
          peripheral devices such as printers, monitors and storage devices.

          Programs use the kernel through system calls. For example, if the user wants
          a file to be opened, the program issues system calls to open the
          directories and then the file.
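
          The file system points made above (a hierarchy of directories, and devices
          treated as files) can be tried out with a few commands. This is a hedged
          sketch, not the course's own example: the hierarchy is created under a
          scratch directory instead of the real root "/", the schedule text is
          invented, and /dev/null (a device that discards whatever is written to it)
          stands in for the printer mentioned in the text, since printer device
          names differ from system to system.

```shell
# Build the example hierarchy under a scratch root (not the real "/").
root=$(mktemp -d)
mkdir -p "$root/dtu/COE_Course/COE_101"
echo "Lesson 1: Monday" > "$root/dtu/COE_Course/COE_101/schedule"

# The same command (echo with redirection) writes to an ordinary file ...
echo "Lesson 2: Tuesday" >> "$root/dtu/COE_Course/COE_101/schedule"
# ... and to a device. /dev/null is a device file that discards its input.
echo "Lesson 2: Tuesday" > /dev/null

# In ls -l output the first character gives the type:
# '-' ordinary file, 'd' directory, 'c' character device.
ls -ld "$root/dtu" "$root/dtu/COE_Course/COE_101/schedule" /dev/null
```

          Each of these commands reaches the kernel through system calls; the shell
          and the commands never touch the disk or the device directly.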

          The figure below shows the relationship amongst the various components of
          the UNIX operating system.



             The User Commands

                   The Shell

                              The Kernel

                                  Hardware




          Components of UNIX operating system (shown in gray).



Self-check Questions
7. UNIX is a multi-user OS and also possesses multitasking abilities. (True/False)
8. The first version of the UNIX Operating System was known as _____________.
9. The file system in a UNIX Operating System is a hierarchical structure.
    (True/False)
10. The ____________ in a UNIX Operating System is used to interact with the
    hardware and to execute user commands and programs.
11. The command interpreter in the UNIX system is called ___________.
12. Programs in a UNIX system use __________ calls to interact with the
    kernel and perform their tasks.




1.6       Starting with UNIX
          This section explains how to log in to a UNIX system and how to change
          your password. We will touch on the different types of system
          configuration and how to log on to systems that have these
          configurations.

         Selecting a login

          Every UNIX user on a multi-user system is recognized by a login name, which
          is the user's only identity on the system. A login name must be set up
          before you can log on to a UNIX system, whether multi-user or single-user.

          UNIX provides excellent built-in security: no users are permitted
          unless they are identified. For this identification, each user has a login ID.




          The login ID is typically allocated by an authority known as the system
          administrator. The system administrator is also responsible for adding new
          users to the system and providing them with a login name, an initial work
          environment and a password on the computer.

          UNIX initially shows a login prompt, at which you type your login ID. The
          password prompt then appears. After you correctly type in the password, you
          are logged in to the system. The example below shows this process.

           login: akash
           password:
                  "akash" is the user login name.
                  Note: to keep the password secure, it is not displayed when
                  you type it.

         Connecting to the UNIX System

          In a multi-user system you have to ask the system administrator how
          you can connect to the system using your PC or terminal. Your PC can be
          directly wired to a computer or it can be connected via a LAN.

          Direct Connect - This is the method of connecting to a UNIX machine when
          there is a single machine.

          Dial-in Access - You can dial in to the UNIX network using a modem and use
          a terminal emulator to get the UNIX prompt.

          Local Area Network (LAN) - A LAN follows a client-server model. Connect to
          the server from a client workstation and use the UNIX capabilities.

          IP Networks - Using an IP network such as the Internet, you can connect to
          a remote machine using the telnet capability of UNIX.


1.7       Changing Your Password
          Your password is very important information that you must not share with
          anyone. You should change it regularly (say, once every 2 months) and
          remember it rather than writing it on paper. Your password should contain 6
          to 8 characters and should not simply be your name, your date of birth, etc.
          It should also contain at least one non-alphabetic character (such as a
          number).

          To change the password of your login you can use the passwd command.
            bash> passwd
            passwd: Changing password for sushobhit
            Old password:
            New Password:
            Re-enter new password:
            bash>

      There is a simple scheme to create complex passwords and still remember
      them! All you do is take the first letters of a line of your favorite poem or
      song and add a number or symbol to make a complex password. Here is an
      example: say you pick the line "Twinkle twinkle little star". Take the first
      letters to make the string Ttls. Suppose your favorite symbol is = (equal
      sign) and your favorite number is 2; you append these to the string to make
      your complex password Ttls2=. For anyone else it will be too hard
      to guess, while it is very easy for you to remember.
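
      The scheme just described can be written down as a short shell function.
      This is only an illustration of the steps, under the assumption that words
      are separated by spaces; the function name make_password is made up for
      this sketch and is not a UNIX command. A real password should of course
      never be typed into a script or left in the shell history.

```shell
# make_password LINE SUFFIX
# Hypothetical helper: joins the first letter of each word in LINE
# and appends SUFFIX (the favorite number and symbol).
make_password() {
    line=$1
    suffix=$2
    initials=""
    for word in $line; do      # unquoted on purpose: splits the line into words
        first=$(printf '%s' "$word" | cut -c1)
        initials="$initials$first"
    done
    printf '%s%s\n' "$initials" "$suffix"
}

make_password "Twinkle twinkle little star" "2="    # prints Ttls2=
```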

      NOTE: If you forget your password it cannot be retrieved even by the system
      administrator. The only remedy in such cases is that the system administrator
      can reset the password.



Self-Check Questions
13. ________________ is the program which is used to connect to the UNIX system
    from a remote system.
14. ___________________ in a multi-user system is the person who is responsible
    for maintaining the system.
15. Pick the odd one out.
    To connect to a UNIX system, one of the following methods can be used:
    a. Dial-in access
    b. IP Networks
    c. LAN
    d. System Calls
16. If you forget your password, the system administrator can give you permissions.
    (True/False)




1.8   Entering Commands in the UNIX System
      UNIX provides numerous commands. When the user types a command at the
      UNIX prompt, the shell invokes the program for that command. The command
      program can in turn make many system calls, and these calls then interact
      with the hardware.





1.8.1 Command Options and Arguments

         The UNIX system has a standardized command syntax that applies to almost
         all UNIX commands. Every command has some base functionality, and
         additional functionality is provided through command-line options and
         arguments.

         For example, the ls command can be used to list the contents of a directory.
          bash> ls
          README 2134.tar.gz game_scores game_schedule

         Now let's use the ls command with an option:
            bash> ls -l
            -rw-r--r--  1 anmol  friends 10777 Mar 30 16:26 README
            -rw-r--r--  1 achint friends 21483 Feb 28 17:39 2134.tar.gz
            drwxr-xr-x  2 amit   friends  4096 Dec 12 16:41 game_scores
            drwx------  3 arat   friends  4096 May 10  2006 game_schedule
         This example shows the usage of the -l option of the ls command, which
         outputs the listing in long format.
         Another frequently used command is the man command, which
         displays the manual pages of other commands.
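
         To see how options and arguments change what a command does, the ls
         examples above can be repeated on a directory with known contents. The
         sketch below is illustrative only: the scratch directory and the names
         README, .profile and game_scores are assumptions made for the
         demonstration. The -a and -l options of ls and the command -v built-in
         are standard, but the exact layout of the output varies between systems.

```shell
# Scratch directory with known contents; the file names are illustrative.
demo_dir=$(mktemp -d)
touch "$demo_dir/README" "$demo_dir/.profile"
mkdir "$demo_dir/game_scores"

ls "$demo_dir"         # base behaviour: names only, hidden files omitted
ls -a "$demo_dir"      # -a option: also list names that begin with a dot
ls -l "$demo_dir"      # -l option: long format (permissions, owner, size, date)
ls -la "$demo_dir"     # options can usually be combined after a single dash

# command -v reports which program the shell would run for a command name.
command -v ls

# man ls               # opens the manual page for ls; press q to leave it
```

         Here "$demo_dir" is the argument (what the command acts on), while -a
         and -l are options (how it acts). The scratch directory can be removed
         afterwards with rm -r "$demo_dir".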


1.9      Summing Up
         An operating system is the most important software in any computer, as it
         fills the communication gap between the user and the underlying hardware.
         The UNIX operating system, with its unique qualities and adaptability, is a
         popular and powerful operating system nowadays. In the chapters that follow
         we will explore the powers of UNIX in some detail.


1.10 Answers to the self check questions
      1. program
      2. False
      3. False
      4. multiprocessing
      5. multi-user
      6. virtual memory
      7. True
      8. UNICS
      9. True
      10. Kernel
      11. Shell
      12. System calls
      13. telnet
      14. system administrator
      15. d
      16. False


1.11 Terminal questions
      1. List and briefly describe the components of the UNIX operating system.
      2. Which features of the UNIX operating system account for its
         popularity amongst users?
      3. Explain briefly the possible ways to log on to a UNIX system.


1.12 References
      1. http://www.uwsg.iu.edu/usail/concepts/unixhx.html




COE                                                                                                                   Unit 1, Lesson 2




LESSON 2                       UNIX COMMANDS

2. UNIX COMMANDS

   2.0       OBJECTIVES
   2.1       INTRODUCTION
   2.2       THE COMMANDS CLASS
   2.3       CONNECTING TO UNIX
      2.3.1 telnet command
      2.3.2 rlogin command
   2.4       FILE MANAGEMENT
      2.4.1 mv command
      2.4.2 cp command
      2.4.3 rm command
   2.5       A COMMUNICATION RELATED COMMAND - FTP
   2.6       INFORMATION
      2.6.1 man command
      2.6.2 du – Disk usage
      2.6.3 df – Disk free
      2.6.4 quota
      2.6.5 who – Finding out who is logged on
   2.7       PRINTING
      2.7.1 lpr – Printing
      2.7.2 lprm – Removing a printing job
      2.7.3 lpq – Checking the printing queue
   2.8       PROCESS CONTROL
      2.8.1 ps – Finding the process
      2.8.2 & - Running process in background
      2.8.3 Ctrl-z – Suspending a process
      2.8.4 jobs – Finding the process in background
      2.8.5 kill – Killing a process
      2.8.6 nice – reducing the priority of process
   2.9       MISCELLANEOUS COMMANDS
      2.9.1 alias / unalias command
      2.9.2 cal (calendar) command
      2.9.3 clear command
      2.9.4 crontab command
      2.9.5 csh command
      2.9.6 history command
      2.9.7 date command
      2.9.8 echo command
      2.9.9 grep command
      2.9.10 unset command
      2.9.11 tar command
      2.9.12 tee command
      2.9.13 touch command
   2.10      SUMMING UP
   2.11      ANSWERS TO THE SELF-CHECK QUESTIONS
   2.12      TERMINAL QUESTIONS




                             2. Unix Commands


UNIX, like any other operating system, provides a set of commands with which
users can perform the tasks they want. The variety of commands that UNIX offers
is large. In this lesson we look at the usage of many of them.



2.0       Objectives
          After going through this lesson, you will be able to

         Use the UNIX commands to perform tasks
         Understand how to send and receive mails on UNIX
         Understand the basic file management commands
         Understand the information and communication commands of UNIX


2.1       Introduction
          UNIX provides a number of commands. For the ease of understanding we can
          divide these commands into various categories.


2.2       The Commands class
          UNIX commands can be grouped into a few broad classes:

         Starting and Ending
          These are the commands used to log on to the UNIX system and to
          start working on it.

         File Management
          A file is the basic data-holding entity in a UNIX system. There is a set
          of commands for maintaining the file system so that the data stored in
          files is kept secure, up to date and well organized.

         Communication
          UNIX provides communications at many levels, including mails, writing
          messages, exchanging files, etc.








         Information
          UNIX provides a number of commands to get information about the system,
          such as who is logged in, how much disk space is available, etc.

         Printing
          In UNIX a user can issue a print command, monitor the status of the
          print job, and remove the job from the queue if required.

         Job and Process control
          Since many processes run on a UNIX system at the same time, it is
          sometimes necessary to get information about the user jobs running on
          the system. For this purpose UNIX provides a set of commands to monitor,
          kill, prioritize and resume jobs.

          In the present chapter we will look at some of these commands in detail and
          the other commands will be discussed in the chapters to follow.


2.3       Connecting to UNIX
          Before going into details, we first look at the steps a user has to
          follow to start working with a UNIX system.

2.3.1 telnet command

          The telnet command is used for logging into a remote system. The telnet
          command presents the same login and password prompts as done on a local
          system.

2.3.2 rlogin command

          The rlogin command is used to connect to a remote computer. It is
          comparatively easier to use than telnet. Here is the syntax of the
          rlogin command:

          rlogin [-l username] hostname

          If the -l option is omitted, the username defaults to that of the
          current user. hostname is the name of the UNIX machine to log on to.


2.4       File Management
          A file is a basic data storage entity in a UNIX system. There is a set of
          commands that can be used to maintain this system. We will be having an
          introductory flavor of these commands in this chapter with the complete
          discussion being taken up in the chapter on file system. Readers are advised
          to have a look at the man pages of each of these commands and try to
          understand what exactly these commands are used for.







2.4.1 mv command

      The mv command moves a file. The command can also be used to rename a
      file. Here is a simple example of mv command.

          bash> ls
          tempPresentation.txt
          bash> mv tempPresentation.txt finalPresentation.txt
          bash> ls
          finalPresentation.txt

2.4.2 cp command

      The cp command copies a file. Here is a simple example of the cp command.

         bash> ls
         tempPresentation.txt
         bash> cp tempPresentation.txt finalPresentation.txt
         bash> ls
         tempPresentation.txt finalPresentation.txt

2.4.3 rm command

      The rm command removes a file. Here is an example of the rm command.

         bash> ls
         tempPresentation.txt finalPresentation.txt
         bash> rm tempPresentation.txt
         bash> ls
         finalPresentation.txt
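
      The three commands above can be tried out safely in a scratch directory.
      The following is a minimal sketch, not part of the original lesson: it
      assumes the mktemp utility is available, and the name
      oldPresentation.txt is made up for illustration.

```shell
# Work in a throwaway directory so that no real files are touched.
start_dir=$(pwd)
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "draft slides" > tempPresentation.txt

# cp leaves the original in place and creates a second file.
cp tempPresentation.txt finalPresentation.txt
after_cp=$(ls)

# mv renames the file: the old name disappears.
mv tempPresentation.txt oldPresentation.txt
after_mv=$(ls)

# rm deletes the file for good.
rm oldPresentation.txt
after_rm=$(ls)

printf 'after cp:\n%s\n\nafter mv:\n%s\n\nafter rm:\n%s\n' \
    "$after_cp" "$after_mv" "$after_rm"

# Leave the scratch directory and clean it up.
cd "$start_dir" && rm -r "$workdir"
```

      Capturing the output of ls after each step makes it easy to see which
      names exist at that point.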


2.5   A communication related command - ftp
      The ftp (file transfer protocol) command is used for copying files between
      the local computer and a remote computer. While mv and cp work within a
      single system, ftp can be used when files have to be fetched from another
      system.

      The example below shows how ftp can be used to connect to a remote
      machine. In this example user 'achint' gets a file from machine mitserv.








            bash> ftp mitserv
            Connected to mitserv
            Name: achint                     # User types his login id
            331 Please specify the password.
            Password:                        # password will not be visible
            230 Login successful.
            Remote system type is UNIX.
            ftp> get myPresentation.txt      # Now you are in ftp. See the prompt
            250KB data transfer successful
            ftp> quit
            bash>                            # You are out of ftp now.

          The ftp prompt provides a limited set of commands, some of which are
          listed below:
         bin – Changes the file transfer type to support the binary image transfer.
         get – Used to ‗get‘ the files from remote machine
         mget – Used to get multiple files with a single command
         ls – Used to list the contents of a directory on a remote machine
         cd – Used to change directories on the remote machine
         pwd – Used to get the present working directory on remote host
         lpwd – Gives the current working directory in local host.


2.6       Information
          Information about other users, disk quotas and similar matters can be
          retrieved using UNIX commands. This section discusses some of these
          commands.

2.6.1 man command

          UNIX traditionally provides manual pages (called 'man' pages) for all the
          built-in commands and for system calls.

          You can learn a lot by referring to the manual pages for commands.

          The general syntax of the command is

          man [-] [-k keywords] topic/command

          The example below displays the manual page of the 'du' command.

             bash> man du







2.6.2 du – Disk usage

      This command is used to find out how much disk space is currently occupied
      by the files and directories of the user.

2.6.3 df – Disk free

      The df command tells how much disk space is left which can be used.
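
      As a rough sketch of how du and df are typically called (the -s and -k
      options used here are standard, but their exact output layout varies
      between systems, so treat the parsing with cut as an assumption):

```shell
# -k asks both commands to report sizes in units of 1024 bytes.
# -s makes du print one summary line for the directory instead of
# one line per subdirectory.
used_kb=$(du -sk . | cut -f1)
echo "Current directory uses ${used_kb} KB"

# Free and used space on the file system that holds the current directory.
df -k .
```

      du answers "how much do my files occupy?", while df answers "how much
      room is left on this file system?".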

2.6.4 quota

      This command shows how much disk space a user's files occupy on the file
      system, together with the limit (quota) set for that user.

2.6.5 who – Finding out who is logged on

      The who command displays information about the users currently logged on
      to the computer, such as their usernames, terminal IDs and login times.

      General syntax of the command is:
      who [-q] [am i]

      Following example shows the output of who command.

      bash> who
      singhs :0        May 28 14:05
      achint pts/0     May 28 14:06 (lx-ptiwari:0.0)
      anmol pts/1      May 28 14:12 (lx-ptiwari:0.0)




Self-Check Questions
1. The commands below are used to connect to the remote computers:
   i. telnet
   ii. rlogin
   iii. rm
2. It is not possible to logon to another machine with another username by any
   means. (True/False)
3. If some files are needed to be transferred from a remote location to the current
   location, we can use the ________________ command for this purpose.
4. If a user needs to know the usage of the write command, he can use the
   ____________ command to know how the command works.
5. There is a restriction on the usage of the disk space by a user or a group on the
   UNIX system and this disk space restriction can be found by using the command
   _____________.
6. To know as to how much total disk space your files and directories have taken,
   issue __________ command.






7. On a multi-user system, more than one person can be logged on to a machine,
   and this sometimes overloads the machine. To find out who is logged on to the
   machine we can use the ______________ command.



2.7   Printing
      UNIX provides commands for printing documents. Additionally, it is
      possible to inspect the printer queue and to cancel a print job if
      required.

2.7.1 lpr – Printing

      This command prints the contents of a file. A printer can be specified;
      otherwise the print job is sent to the default printer set by the user.

2.7.2 lprm – Removing a printing job

      The lprm command can be used to cancel the print jobs that have been
      queued or printing. It can be used to cancel printing jobs on the specified
      printer or to cancel the job on the default printer.

2.7.3 lpq – Checking the printing queue

      This command shows the printer queue status on the named printer. Jobs
      queued on the default destination will be shown if no printer or class is
      specified on the command-line.


2.8   Process Control
      When you run a program in UNIX, a copy of the program starts to run. This
      running copy of the program is called a process. The concept of a process
      is fundamental to the UNIX OS, so you should understand processes in
      detail. If you run the same command twice, a new process is started each
      time.

      Every process is identified by a unique process ID and this ID can be used to
      refer to this process or to perform any further operations on the process, like
      killing the process. We will have a look at the commands which can be used
      to control the processes.

2.8.1 ps – Finding the process

      This command is used to list all the processes being run on the machine.

       bash> ps -ef
       PID PPID User     Process …
       233 230    achint ls -l
       345 342    anmol ps -ef





2.8.2 & - Running process in background

      When '&' is put at the end of a command, that command runs in the
      background. Time-consuming commands can be put into the background so
      that you can continue working on the same terminal.

2.8.3 Ctrl-z – Suspending a process

      If a command was issued by mistake, or you want to do something else
      first, you can press Ctrl-z to suspend the process and free the CPU for
      other, more important work. A suspended job can be resumed later with the
      fg (foreground) or bg (background) command.

2.8.4 jobs – Finding the process in background

      To find the processes running in the background you can use the jobs
      command. Unlike ps, jobs lists only the jobs started from the current
      shell.

2.8.5 kill – Killing a process

      If some process has been running for a long time or is producing unwanted
      results, you can use the 'kill' command to kill the process.

      The syntax of the command is
      kill [-signal] process-id

      Sometimes a process does not terminate in response to a plain kill. In
      that case you can send it signal 9 (kill -9 process-id), which the
      process cannot ignore.

2.8.6 nice – reducing the priority of process

      This command runs a command at reduced priority, so that other commands
      get the CPU ahead of it.

      The syntax of command is
      nice command [command option]
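
      The process control commands of this section can be combined in a short
      script. This is only a sketch: "sleep 30" stands in for any
      time-consuming job, and the shell variable $! and the wait command are
      not covered in this lesson but are used here to get hold of the process
      ID and of the way the process ended.

```shell
# Start a long-running command in the background at reduced priority.
nice -n 10 sleep 30 &
pid=$!          # $! holds the process ID of the last background command
echo "started background process $pid"

# List the background jobs started from this shell.
jobs

# Ask the process to terminate (kill sends signal 15, SIGTERM, by default).
kill "$pid"

# wait reports how the process ended; 128 + 15 = 143 means
# "terminated by signal 15".
status=0
wait "$pid" || status=$?
echo "process $pid ended with status $status"
```

      If the process ignored the default signal, the same script could use
      kill -9 "$pid" instead, in which case the status would be 128 + 9 = 137.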



Self-Check Questions
8. If a print job is fired it is not possible to abort the printing. (True/False)
9. To see which print jobs are waiting in the printer queue, we can use the
    ____________ command.
10. To print some text in a file, use the ______________ command.
11. To change the priority of a job we can use the _________ command.
12. If a running process is not required at the moment and we need to start
    another one, we can suspend the process using _______________ and
    continue with it later on.





13. To list the processes running on the system, we issue the
    ______________ command.



2.9   Miscellaneous commands
      Besides the commands discussed so far in this lesson, UNIX has numerous
      other commands, many with a large number of options, which can be used to
      perform some amazing tasks. We will discuss some of these commands with
      their most useful and common options. For other options, readers can
      refer to the man pages of these commands.

2.9.1 alias / unalias command

      These commands are used to create or remove an alias for a command. The
      example shows their use.

        bash> alias rm="rm -i"         Creates an alias rm which calls rm -i
        bash> unalias rm               Now rm will call the plain rm command


2.9.2 cal (calendar) command

      This command displays the calendar.

2.9.3 clear command

      This command clears the screen

2.9.4 crontab command

      It is sometimes required to run commands at a specific date and time.
      The 'crontab' command can be used for this purpose; see man crontab for
      details. The cron daemon (see man cron) reads a file which is managed
      using the crontab command. This file contains the commands to run together
      with the time and date of their execution. Here is an example:

        bash> crontab -l
        0 0 * * 5 echo "This is a cron" | mail john
                      Contents of crontab file.
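
      Each crontab line consists of five time fields (minute, hour, day of
      month, month, day of week) followed by the command to run. The sketch
      below does not install anything in cron; it only splits the example line
      into its fields with the shell's read command, and the variable names are
      chosen for illustration.

```shell
# The crontab line from the example above.
entry='0 0 * * 5 echo "This is a cron" | mail john'

# read splits its input on whitespace; the last variable receives
# the rest of the line, which is the command cron would run.
read -r minute hour day_of_month month day_of_week command <<EOF
$entry
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "command:      $command"
```

      A "*" means "every", and day of week 5 is Friday (0 is Sunday), so this
      entry runs the command at 00:00 every Friday.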

2.9.5 csh command

      This command is used to run the C shell or to execute a C shell script.
      The syntax for this command is
      csh [filename]






2.9.6 history command

      This command is used to list the commands that you have typed so far.

2.9.7 date command

      This command prints the system date and time. The date command has many
      formatting arguments. See man date for details.

       bash> date
       Friday 25 Jan 2008
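
      A short sketch of the formatting arguments mentioned above: an argument
      starting with "+" is an output format built from % codes. The two
      formats below are only examples; man date lists all available codes.

```shell
# Year-month-day, e.g. 2008-01-25
today=$(date '+%Y-%m-%d')

# Weekday name, day of month, abbreviated month name, year
long_form=$(date '+%A %d %b %Y')

echo "$today"
echo "$long_form"
```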


2.9.8 echo command

      This command echoes back the string given to it.

       bash> echo "My name is achint"
       My name is achint

2.9.9 grep command

      This command is used to search for a pattern in a file. We will see more
      details of the grep command in subsequent chapters. Here is a simple example.

       bash> grep goto file.c
       /*You should not use goto in c programming */
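
      The same search can be reproduced with a small sample file. This sketch
      creates its own temporary file (its contents are invented for
      illustration) and also shows two common options, -n and -c.

```shell
# Create a small C source file to search in.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#include <stdio.h>
/* You should not use goto in C programming */
int main(void) { return 0; }
EOF

# Print every line that contains the pattern; -n prefixes the line number.
matches=$(grep -n goto "$sample")
echo "$matches"

# -c prints only the number of matching lines.
count=$(grep -c goto "$sample")
echo "matching lines: $count"

rm "$sample"
```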

2.9.10 unset command

      The unset command removes a shell variable.

2.9.11 tar command

      This command is used to create an archive of files or to extract files from an
      existing archive. See man tar for details.
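
      A minimal sketch of creating, listing and extracting an archive. The
      directory and file names are made up, and the -C option (change
      directory before acting) is assumed to be supported by the installed tar,
      as it is in the common implementations.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/project" "$workdir/restore"
echo "chapter one" > "$workdir/project/notes.txt"

# c = create an archive, f = name of the archive file.
tar -cf "$workdir/project.tar" -C "$workdir" project

# t = list the contents of the archive without extracting anything.
contents=$(tar -tf "$workdir/project.tar")
echo "$contents"

# x = extract, here into a separate directory.
tar -xf "$workdir/project.tar" -C "$workdir/restore"
restored=$(cat "$workdir/restore/project/notes.txt")
echo "restored file says: $restored"

rm -r "$workdir"
```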

2.9.12 tee command

      This command copies the text it receives from a pipe into a file and also
      passes it on to its standard output. See man tee for details.
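
      The sketch below shows tee in the middle of a pipeline: the text is
      saved in a file and, at the same time, handed on to the next command.
      The three sample lines are invented for illustration.

```shell
logfile=$(mktemp)

# tee writes its input to the file and also passes it on to its
# standard output, so wc still sees all three lines.
line_count=$(printf 'alpha\nbeta\ngamma\n' | tee "$logfile" | wc -l | tr -d ' ')
saved=$(cat "$logfile")

echo "lines passed through: $line_count"
echo "saved in file:"
echo "$saved"

rm "$logfile"
```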

2.9.13 touch command

      This command changes the date and time of a file without changing the
      file's content. The touch command creates the file if it does not exist.
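
      Both behaviours of touch can be checked with a short sketch; the file
      name report.txt is hypothetical.

```shell
workdir=$(mktemp -d)
target="$workdir/report.txt"

# The file does not exist yet, so touch creates it, empty.
touch "$target"
[ -f "$target" ] && created=yes || created=no
size_bytes=$(wc -c < "$target" | tr -d ' ')

# On an existing file touch only updates the timestamp;
# the contents stay as they are.
echo "some text" > "$target"
touch "$target"
contents=$(cat "$target")

echo "created: $created, initial size: $size_bytes bytes"
echo "contents after second touch: $contents"

rm -r "$workdir"
```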








Self-Check Questions
14. An ____________ is a short command or word that points at some path, or
    absolute command name.
15. To change the date and time stamp on a file without reading the file __________
    command can be used.
16. To get the text from a pipe into a file ______ command can be used.



2.10 Summing Up
           UNIX provides a rich set of commands for file management, printing, process
           control, etc.


2.11 Answers to the self-check questions
      1. telnet, rlogin
      2. False
      3. ftp
      4. man
      5. quota
      6. du
      7. who
      8. False
      9. lpq
      10. lpr
      11. nice
      12. Ctrl-z
      13. ps
      14. alias
      15. touch
      16. tee


2.12 Terminal Questions
      1. Define and explain the various command classes
      2. How is communication handled in UNIX? What is FTP?
      3. Describe how File Management is implemented in UNIX
      4. List the commands used in process control and describe their usage
      5. Explain the various print commands in UNIX




COE                                                                                                                      Unit 1, Lesson 3




LESSON 3                        UNIX FILE SYSTEMS
3. UNIX FILE SYSTEM ....................................................................................................... 33

   3.0       OBJECTIVES ............................................................................................................ 33
   3.1       INTRODUCTION ........................................................................................................ 33
   3.2       FILES ....................................................................................................................... 33
      3.2.1 Filenames .......................................................................................................... 33
      3.2.2 Filename Extensions ....................................................................................... 34
   3.3       DIRECTORIES .......................................................................................................... 34
   3.4       FILE TYPE................................................................................................................ 34
      3.4.1 Links ................................................................................................................... 35
      3.4.2 Special Files...................................................................................................... 35
   3.5       PATH TO A FILE ........................................................................................................ 36
      3.5.1 The root directory ............................................................................................. 36
      3.5.2 Absolute Path.................................................................................................... 36
      3.5.3 Relative Path..................................................................................................... 36
   3.6       MANIPULATING FILES .............................................................................................. 36
      3.6.1 Moving and Renaming Files and Directories ............................................... 36
      3.6.2 Copying files and directories .......................................................................... 36
      3.6.3 Removing Files and Directories ..................................................................... 37
      3.6.4 Creating a directory.......................................................................................... 37
      3.6.5 Listing the files .................................................................................................. 37
   3.7       FILE PERMISSIONS .................................................................................................. 38
      3.7.1 File Permissions ............................................................................................... 38
      3.7.2 Permissions for directories ............................................................................. 39
      3.7.3 Changing the permissions on the file ............................................................ 39
   3.8       CHANGING FILE OWNER AND GROUP .................................................................... 40
   3.9       FILE SEARCH........................................................................................................... 40
   3.10      VIEWING BEGINNING AND END OF A FILE................................................................ 40
   3.11      ANSWERS TO THE SELF CHECK QUESTIONS ........................................................... 41
   3.12      TERMINAL QUESTIONS............................................................................................. 42
   3.13      SUGGESTED READING MATERIAL........................................................................... 42




                             3. UNIX File System


In the UNIX operating system the basic unit of storage is known as a file. This
lesson focuses on the concepts of file manipulation and handling.



3.0       Objectives
          After going through this lesson, you will be able to

         Understand the basic concepts of files and directories
         Understand the paths and pathnames in UNIX systems
         Understand the UNIX file types
         Understand the basic UNIX commands related to the file system
         Understand the file manipulation and file security


3.1       Introduction
          In a UNIX operating system the basic structure that stores data is known as a
          file. You can store data of any format in a file. Multiple files can be put
          together in a directory. Apart from containing files, a directory can contain
          other directories as well. A directory that is inside another directory is called a
          subdirectory.
          A file is analogous to a notebook. A directory is analogous to a bag that
          contains files.


3.2       Files
          A file contains a sequence of bytes stored on a storage device, such as a
          disk. On the disk the file is not necessarily stored in a single sector
          but can be scattered across the disk. The OS keeps track of which
          pieces of data belong to which file.

3.2.1 Filenames

          Each file has a name. Almost any name can be given to a file, and the
          name can be changed at any time. Unlike in Windows, spaces are usually
          avoided in UNIX file names, because a name containing spaces has to be
          quoted or escaped on the command line.

          An important thing to remember is that UNIX is case sensitive, which
          means 'A' is different from 'a'. One should therefore be careful with
          upper and lower case in file names: myfile.txt and myFile.txt are
          different files.






3.2.2 Filename Extensions

      UNIX does not enforce any specific extensions on file names. This is unlike
      Windows where extensions are used to invoke applications directly.

      In UNIX you can choose any extension for your files. Even multiple extensions
      are permitted (e.g., data.tar.gz). Also, files need not always have extensions
      (e.g., myFileOf24Dec2007).

      Since extensions are not enforced, one can create files whose extensions
      are misleading. For example, myProg.db may be a C program while
      myData.cpp may contain simple text data. Obviously this is not desirable
      and one must be careful to use proper extensions.

      Though UNIX itself does not enforce any extensions, there are many
      important utilities/programs that expect a specific file extension. For example,
      the C compiler expects files with .c or .h extensions.


3.3   Directories
      Files are kept in directories. A directory groups files in a logical
      structure that depends entirely on the application and on the user's
      requirements. A directory can contain files and other subdirectories.
      The figure below shows how the directory myData contains subdirectories
      which in turn contain the files.

                                   myData/
                                      |
                      +---------------+---------------+
                      |                               |
                 Investments/                     Official/
                      |                               |
                +-----+-----+            +------------+------------+
                |           |            |            |            |
            RBI Bonds     ICICI      Sales plan   customers     Reports

      Each directory in UNIX contains two special subdirectories:
      ./ (The dot directory) This indicates the current directory itself.
      ../ (The dot dot directory) indicates the parent directory of current directory.

         bash> pwd
         Investments             Shows current directory as Investments/
         bash> cd ..
         bash> pwd
         myData                  Current directory after cd .. is myData/ (the parent)
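
      The dot and dot dot directories can be explored in a scratch copy of the
      myData tree. This is a sketch only: the tree is created under a temporary
      directory, and basename is used to print just the last component of the
      path reported by pwd.

```shell
start_dir=$(pwd)
base=$(mktemp -d)
mkdir -p "$base/myData/Investments"

cd "$base/myData/Investments" || exit 1
here=$(basename "$(pwd)")       # the directory we start in

cd ..                           # .. is the parent directory
parent=$(basename "$(pwd)")

cd ./Investments                # . is the current directory (myData)
back=$(basename "$(pwd)")

echo "$here -> $parent -> $back"

cd "$start_dir" && rm -r "$base"
```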

3.4   File Type
      Regardless of the data contained in a file, UNIX associates a file type with
      each file. There are 4 file types: ordinary files, directories, links and
      special files.

      An ordinary file is any file that you commonly use. These include text files,
      executable programs, shell scripts, etc. We have already seen what
      directories are. Let us now look at links and special files.

3.4.1 Links

      A link is not a file but a second name for a file. Sometimes linking files is a
      good alternative to copying because, once copied, the copies can be changed
      independently of each other. If you create a link instead, there is actually
      only one copy of the file. A link is created using the ln command of UNIX. There
      are two types of links, soft links and hard links. See man ln for more details.
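
      The difference between the two kinds of link shows up when the original
      name is removed. The following sketch uses invented file names in a
      temporary directory and assumes a file system that supports both hard and
      soft links.

```shell
workdir=$(mktemp -d)
echo "version 1" > "$workdir/original.txt"

# Hard link: a second name for the same data.
ln "$workdir/original.txt" "$workdir/hard.txt"
# Soft link: a small file that stores the path of original.txt.
ln -s "$workdir/original.txt" "$workdir/soft.txt"

# A change made through one name is visible through the others.
echo "version 2" > "$workdir/original.txt"
via_hard=$(cat "$workdir/hard.txt")
via_soft=$(cat "$workdir/soft.txt")

# After the original name is removed, the hard link still reaches the
# data, while the soft link points at a path that no longer exists.
rm "$workdir/original.txt"
hard_after=$(cat "$workdir/hard.txt")
if [ -e "$workdir/soft.txt" ]; then soft_after=valid; else soft_after=dangling; fi

echo "before rm: hard=$via_hard, soft=$via_soft"
echo "after rm:  hard=$hard_after, soft link is $soft_after"

rm -r "$workdir"
```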

3.4.2 Special Files

      UNIX represents even devices as files. These files are called special files.
      For example, the audio output is typically the /dev/audio file. What can you
      do with such a special file? You can write to it or read from it, and
      UNIX hides the details of how it actually works with the device. For
      example, you can simply cat a music file to /dev/audio and it will be played!
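
      Since /dev/audio is not present on every system, the sketch below uses
      /dev/null instead (a substitution made here, not in the lesson), a
      special file that exists on practically every UNIX system and simply
      discards whatever is written to it.

```shell
# -c tests whether a name is a character special file, i.e. a device.
if [ -c /dev/null ]; then kind="character special file"; else kind="something else"; fi
echo "/dev/null is a $kind"

# Writing works like writing to any file, but the device discards the data.
echo "this text is thrown away" > /dev/null

# Reading returns end-of-file at once, so zero bytes are read.
bytes_read=$(wc -c < /dev/null | tr -d ' ')
echo "bytes read from /dev/null: $bytes_read"
```

      The same redirection and reading syntax used for ordinary files works
      here unchanged, which is the point of representing devices as files.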



Self-Check Questions
1. It is possible to have multiple filename extensions in a file in UNIX. (True/False)
2. It is required to have a filename extension in a file in UNIX, which signifies the
   properties of that file. (True/False)
3. The filenames work and Work point to the same file in a UNIX file system.
   (True/False)
4. Directories act as a categorization structure for the data in a UNIX file system.
   (True/False)
5. __________________ is a directory under the parent directory, which can be
   used for the categorization of data further down the hierarchical file structure.
6. Which is not a UNIX file type?
   a) Links
   b) Symbolic Links
   c) Program files
   d) Directories
7. A ______________ (soft/hard) link is only a text file that points to some other
   file somewhere in the file system and does not contain the data.








3.5   Path to a file
3.5.1 The root directory
      UNIX OS treats the directory / as the root directory. The root directory is the
      ultimate parent of all other directories on a UNIX system.

3.5.2 Absolute Path

      Every file on a system has a path that starts from the root.
      For example,

        bash> pwd
        /dtu/IT_Courses/IT_101

                                This is the absolute path of the current directory

      The file schedules.txt in this directory therefore has the absolute path
      /dtu/IT_Courses/IT_101/schedules.txt.
      The pwd command always lists the absolute path.

3.5.3 Relative Path

      When in a directory, if you know the relative position of a file, you need not
      access that file using its absolute path. You can simply use the relative
      path to the desired file instead. For example,

        bash> pwd
        /dtu/It_Courses/IT_999
        bash> ls ../IT-102/schedule.txt

      Here ../IT-102/schedule.txt is the relative path of "schedule.txt" with
      respect to "/dtu/It_Courses/IT_999".
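
      That an absolute and a relative path can name the same file is easy to
      check in a scratch directory. In this sketch the temporary directory
      stands in for a course directory; IT_101, IT_999 and the file contents
      are placeholders.

```shell
start_dir=$(pwd)
base=$(mktemp -d)               # stands in for the courses directory
mkdir "$base/IT_101" "$base/IT_999"
echo "Mon 10:00 UNIX basics" > "$base/IT_101/schedules.txt"

cd "$base/IT_999" || exit 1

# Absolute path: starts at the root directory and works from anywhere.
by_absolute=$(cat "$base/IT_101/schedules.txt")

# Relative path: up to the parent (..), then down into IT_101.
by_relative=$(cat ../IT_101/schedules.txt)

[ "$by_absolute" = "$by_relative" ] && echo "both paths reach the same file"

cd "$start_dir" && rm -r "$base"
```

      The relative path is shorter, but it is only valid while the current
      directory is IT_999 (or another directory at the same level).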



3.6   Manipulating Files
      The basic file manipulation operations are deleting, renaming, copying and
      moving files from one location to another.

3.6.1 Moving and Renaming Files and Directories

      The mv command of UNIX moves files and directories to specified locations.

        bash> mv -i data data.old        Moves (renames) data to data.old
        bash> mv -i data new             Moves data into the new/ directory
        bash> mv -i oldDir newDir        Moves oldDir to newDir



3.6.2 Copying files and directories






      The cp command of UNIX copies files and directories.

      bash> cp old new             Copies file old to new. Overwrites new if it exists.
      bash> cp -R /home/joe/bread /home/jam/food
                                   Copies all files and subdirectories to the target
                                   directory


3.6.3 Removing Files and Directories

      Often you want to delete files or a directory (including its contents), for
      example when cleaning up your system. The rm command deletes files and
      directories.

       bash> rm file.txt my.txt         Removes the specified files.
       bash> rm -f file.txt             The -f option indicates that rm will not
                                        give an error even if the file to be
                                        deleted does not exist.
       bash> rm -r directory1           The -r option deletes the directory with
                                        all its files and subdirectories.



      Be careful with the rm command. A file or directory once deleted cannot be
      undeleted in UNIX. There is no such thing as a trash can in UNIX. It is advisable
      to use the -i option of the rm command all the time. See man rm for details.

      If a directory is empty, then it can be deleted using rmdir command. See man
      rmdir for details.
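
The difference between rm, rm -f, rm -r and rmdir can be seen in a small experiment. This is an illustrative sketch only; the names are placeholders, and mkdir -p (which creates missing parent directories) is used merely to set up the test tree.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
touch file.txt my.txt
mkdir -p directory1/sub emptyDir
touch directory1/sub/deep.txt

rm file.txt my.txt               # removes the two files
rm -f file.txt                   # file is already gone; -f suppresses the error
f_status=$?

rmdir emptyDir                   # succeeds because the directory is empty
rmdir directory1 2>/dev/null     # fails because the directory is not empty
rmdir_status=$?

rm -r directory1                 # removes the directory and everything below it
remaining=$(ls -A | wc -l | tr -d ' ')

echo "rm -f status: $f_status"
echo "rmdir status on non-empty directory: $rmdir_status"
echo "entries left: $remaining"
```

The exit status of rm -f is 0 even though the file no longer exists, whereas rmdir reports a non-zero status for the non-empty directory.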

3.6.4 Creating a directory

      The mkdir command creates a new directory.
       bash> mkdir project               Will create directory project/
       bash> mkdir /home/anmol/data      Absolute path can be given to create a dir
       bash> mkdir ../../myDir           Relative path can be given
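
As a rough illustration of absolute versus relative paths with mkdir, the sketch below builds a small tree under a scratch location. The directory names are placeholders, and the -p option (create missing parent directories) is an addition that the examples above do not use.

```shell
base=$(mktemp -d)                # scratch location, an absolute path

mkdir "$base/project"            # absolute path: starts from /
mkdir -p "$base/project/a/b"     # -p also creates the missing parent a/

cd "$base/project/a/b" || exit 1
mkdir ../../myDir                # relative path: two levels up from a/b

if [ -d "$base/project/myDir" ]; then
    made_relative=yes
else
    made_relative=no
fi
echo "myDir created under project/: $made_relative"
```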

3.6.5 Listing the files

      The ls command of UNIX lists files and directories in the current directory. It
      has a large number of options (see man ls).








           bash> ls -l
           drwxr--r-- 1 achint editors 4096 drafts
           -rw-r--r-- 1 achint editors 30405 edition-32
           -r-xr-xr-x 1 achint editors 8460 final_draft

           achint is the file owner and editors is the group. The size of
           final_draft is 8460 bytes.

           The first field shows the file type and the file permissions;
           these are explained below.
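
The fields of an ls -l line can also be picked apart in a script. The sketch below is only an illustration; the file name final_draft is reused as a placeholder, and it assumes the usual field order in which the size is the fifth field.

```shell
work=$(mktemp -d)
f="$work/final_draft"
printf 'hello' > "$f"                               # 5 bytes, no trailing newline

line=$(ls -l "$f")
type_char=$(printf '%s\n' "$line" | cut -c1)        # - ordinary file, d directory
perms=$(printf '%s\n' "$line" | cut -c2-10)         # the nine permission letters
size=$(printf '%s\n' "$line" | awk '{print $5}')    # fifth field: size in bytes

echo "type=$type_char perms=$perms size=$size"
```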




Self-Check Questions
8. The __________________ is the parent directory of all types of directories in the
    UNIX file system.
9. The name of file starting from the root directory is called the _____________
    pathname of the file.
10. The relative pathname of a file is the name of the file with respect to the parent
    directory. (True/False)
11. Pick the odd one out
    Following operations can be performed on the file system
    a) Building
    b) Listing
    c) Renaming filenames
    d) Copying
12. On using the 'mv' command to move one file onto an existing file, it ___________
    (appends/overwrites) the contents of the moved file onto the existing file.
13. To copy one directory to another it is mandatory to use the option _______ with
    the command 'cp'.
14. The command 'rmdir' can be used to delete a complete hierarchical directory
    structure. (True/False)



3.7       File Permissions
          UNIX enforces permissions for files and directories. If you are the owner of a
          file, you can set permissions that decide, for example, whether the file is
          readable by others. Let's look at file permissions in more detail.


3.7.1 File Permissions

          A user of a file in the UNIX file system belongs to one of three classes:

          • The owner of the file
          • The group which the file belongs to
          • Other users




      bash> ls -l
      drwxr--r-- 1 achint editors 4096 drafts
      -rw-r--r-- 1 achint editors 30405 edition-32
      -rwxr-xr-- 1 achint editors 8460 final_draft

      Reading the permission string -rwxr-xr-- of final_draft from left to right:

        First letter (-)       File type: - means ordinary file, d means
                               directory, l means it is a link.
        Next 3 letters (rwx)   The file is readable, writable and can be
                               executed by the owner.
        Next 3 letters (r-x)   People in the group can read/execute but
                               cannot write into this file.
        Last 3 letters (r--)   Others can only read this file.


3.7.2 Permissions for directories

       For directories, read permission enables the user to list the contents of
       the directory, write permission allows the user to create a file or a directory
       inside that directory, and execute permission allows the user to change the
       present working directory to that directory.

3.7.3 Changing the permissions on the file

       The chmod command changes the permissions of a file or directory. See
       man chmod for details. There are several ways to change the permissions of
       a file. Here are a few examples:

         bash> chmod ug+rw sample       Permits user and group to read and write
         bash> ls -ld sample
         drw-rw---- 2 achint editor 96 Dec 8 12:53 sample


         bash> chmod a-rwx sample       Removes all permissions for all classes
         bash> ls -ld sample
         d--------- 2 achint editor 96 Dec 8 12:53 sample
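
The symbolic form of chmod can be explored step by step. The following sketch is an illustration under assumed names (the file sample and the helper show are placeholders); it starts from a file with no permissions and prints the permission string after each change.

```shell
work=$(mktemp -d)
f="$work/sample"
touch "$f"

# Helper (illustrative): first 10 characters of ls -l, i.e. file type
# followed by the nine permission letters.
show() { ls -l "$1" | cut -c1-10; }

chmod a-rwx "$f";  none=$(show "$f")     # remove everything for all classes
chmod ug+rw "$f";  ugrw=$(show "$f")     # add read/write for user and group
chmod o+r "$f";    plus_o=$(show "$f")   # additionally let others read
chmod g-w "$f";    final=$(show "$f")    # take write away from the group

echo "$none -> $ugrw -> $plus_o -> $final"
# prints: ---------- -> -rw-rw---- -> -rw-rw-r-- -> -rw-r--r--
```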



       There is another form in which the permissions can be directly set for the files
       by using an octal code. With three-digit octal notation, each numeral
       represents a different component of the permission set: user class, group
       class, and "others" class respectively.

       For example, the octal number 764 is 111 110 100 in binary.

         • The first octal digit, converted to binary, represents the permissions for
           the owner (7 in octal is 111 in binary, which implies rwx for the owner).
         • The next octal digit, converted to binary, represents the permissions for
           the group (6 in octal is 110 in binary, which implies rw- for the group).
         • The last octal digit, converted to binary, represents the permissions for
           the others (4 in octal is 100 in binary, which implies r-- for others).
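
The digit-by-digit decoding above can be written as a small shell function and checked against what chmod actually sets. This is a sketch only; the function name digit_to_rwx and the file name report are made up for the illustration.

```shell
# Convert one octal digit (0-7) into its rwx form by testing the three bits.
digit_to_rwx() {
    d=$1
    out=""
    if [ $((d & 4)) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
    if [ $((d & 2)) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
    if [ $((d & 1)) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    printf '%s' "$out"
}

decoded="$(digit_to_rwx 7)$(digit_to_rwx 6)$(digit_to_rwx 4)"
echo "764 decodes to $decoded"            # prints: 764 decodes to rwxrw-r--

# Compare with the permissions chmod sets on a scratch file.
work=$(mktemp -d)
f="$work/report"
touch "$f"
chmod 764 "$f"
perms=$(ls -l "$f" | cut -c1-10)
echo "ls -l shows $perms"                 # prints: ls -l shows -rwxrw-r--
```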


3.8       Changing File Owner and Group
          The chown command changes the owner of a file. See man chown for details.

          The chgrp command changes the group of a file. See man chgrp for details.
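
          A minimal sketch of both commands follows. Changing the owner to a different user normally requires administrator rights, so this illustration only assigns the file to the current user and that user's primary group (which always succeeds). The file name report is a placeholder, and numeric IDs are used so that the result can be compared with ls -n.

```shell
work=$(mktemp -d)
f="$work/report"
touch "$f"

uid=$(id -u)      # numeric ID of the current user
gid=$(id -g)      # numeric ID of the current user's primary group

chown "$uid" "$f"; chown_status=$?
chgrp "$gid" "$f"; chgrp_status=$?

# ls -n prints owner and group as numbers in the third and fourth fields.
owner_now=$(ls -n "$f" | awk '{print $3}')
group_now=$(ls -n "$f" | awk '{print $4}')
echo "owner=$owner_now group=$group_now"
```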

3.9       File Search
          The find command helps in locating files and directories. It is a powerful
          command with many options. See man find for details. Here is the syntax
          of the find command:

           find search_directory -name file_name [-print]


          The find command searches through the contents of one or more directories
          including all of their subdirectories.

           bash> find / -name schedule -print       Finds all the files under '/'
           /dtu/IT_courses/IT_101/schedule          named schedule
           /dtu/IT_courses/IT_102/schedule


          Another example, in which only directories are matched:

            bash> find . -type d -name abc -print

                            Finds directories (not files) named abc under the
                            present directory
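
Both find examples can be reproduced on a small test tree. The sketch below uses made-up names modelled on the examples above; note that abc exists once as a file and once as a directory, and only the directory is reported when -type d is given.

```shell
work=$(mktemp -d)
mkdir -p "$work/IT_101" "$work/IT_102/abc"
touch "$work/IT_101/schedule" "$work/IT_102/schedule"
touch "$work/IT_101/abc"              # a plain file that is also called abc
cd "$work" || exit 1

files=$(find . -name schedule -print | sort)
dirs=$(find . -type d -name abc -print)

echo "files named schedule:"
echo "$files"
echo "directories named abc:"
echo "$dirs"                          # prints: ./IT_102/abc
```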


3.10 Viewing Beginning and End of a file
          UNIX provides the head and tail commands to display the start or the end
          of a file.

          head – start of the file
          tail – end of the file







         Example usage

          bash> head -n 10 file         Shows the first 10 lines of 'file'
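
The tail command takes the same -n option. The sketch below is an illustration on a generated 25-line file (the file contents are made up) and shows both commands side by side.

```shell
f=$(mktemp)
i=1
while [ "$i" -le 25 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$f"

first=$(head -n 3 "$f")     # the first three lines
last=$(tail -n 2 "$f")      # the last two lines

echo "$first"
echo "..."
echo "$last"
```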




Self-Check Questions
15. Pick the odd one out
    The users in a UNIX file system can be categorized as:
    a) Owners
    b) Group
    c) Friends
    d) Other users
16. To change the file permissions from one set to another, the command
    ___________ can be used.
17. __________________ command is used to change the owner and the group of
    the file.
18. The _______ command lets you search for files and directories.
19. The _______ command will be useful to show the last few lines of a file.



3.11 Answers to the self check questions
      1. True
      2. False
      3. False
      4. True
      5. Subdirectory
      6. Program files
      7. Soft link
      8. Root
      9. Absolute path
      10. True
      11. Building
      12. overwrites
      13. -R (or -r)
      14. False
      15. Friends
      16. chmod
      17. chown, chgrp
      18. find
      19. tail








3.12 Terminal questions
      1. Write a detailed note about the hierarchical file structure.
      2. Explain briefly the manipulation operations possible on the file structure.
      3. Write a brief note on the permissions on files and directories in UNIX.
         Also, explain how we can change the permissions of files in UNIX using the
         chmod command. Use some relevant examples to explain the concepts.
      4. Explain the UNIX system file types, and the salient features of each
         file type.


3.13 Suggested Reading Material
      1. The UNIX Programming Environment, by Brian W. Kernighan and Rob Pike.
      2. The Design of the UNIX Operating System, by Maurice J. Bach.




COE                                                                                                                  Unit 1, Lesson 4




LESSON 4    THE VI TEXT EDITOR
4. THE VI TEXT EDITOR

   4.0    OBJECTIVES
   4.1    INTRODUCTION
   4.2    FILES CONTAIN STREAM OF CHARACTERS
   4.3    HOW VI HANDLES THE FILES
   4.4    INVOKING VI
   4.5    MODES OF VI
      4.5.1 Command mode
      4.5.2 Edit mode
      4.5.3 Switching between command mode and edit mode
   4.6    POSITIONING TEXT ON THE SCREEN
      4.6.1 Scrolling and moving the Screen
      4.6.2 The GOTO Command
      4.6.3 Searching
   4.7    POSITIONING THE CURSOR: H, L, J, K COMMANDS
   4.8    EDITING USING SCOPES
      4.8.1 Delete Text (d, D)
      4.8.2 Change Text (c, C)
      4.8.3 Replace Command (r, R)
      4.8.4 Erase Command (x, X)
      4.8.5 Undo Command (u, U)
   4.9    TEXT INSERTION
      4.9.1 Append Command (a, A)
      4.9.2 Insert Command (i, I)
      4.9.3 Open Command (o, O)
      4.9.4 Read Command (:r)
   4.10   GLOBAL SEARCH AND REPLACE FOR TEXT
   4.11   REARRANGING AND DUPLICATING TEXT
      4.11.1 Copying Text and Moving the Copy
      4.11.2 Deleting Text and Moving It
   4.12   NAMED BUFFERS
      4.12.1 Using the named buffers
   4.13   MISCELLANEOUS INFORMATION
      4.13.1 Creating Line Numbers
      4.13.2 Lines and Sentences in VI
      4.13.3 Joining Lines
      4.13.4 Repeating a Command
      4.13.5 Editing Multiple Files Using vi
      4.13.6 Mark Command
   4.14   SAVING OR STORING A FILE
      4.14.1 Writing to the file
      4.14.2 Exiting the vi editor
   4.15   SUMMING UP
   4.16   ANSWERS TO THE SELF-CHECK QUESTIONS
   4.17   TERMINAL QUESTIONS




                            4. The VI Text Editor


When you write programs or scripts, modify data, write mail, and so on, you will need
to use a text editor. This lesson focuses on the vi text editor, one of the most
commonly used text editors on UNIX systems.



4.0       Objectives
          After going through this lesson, you will be able to

          • Understand how to open and edit files using vi
          • Understand various text insertion and deletion methods in vi
          • Understand the basic structure of the vi text editor
          • Understand the commands to edit text using vi and scopes
          • Understand miscellaneous other features of vi


4.1       Introduction
          vi is a visual, non-graphical, interactive text editor which allows a user to
          create, modify, and store files on the computer.

          Note that in this chapter, the cursor is shown by underlining a
          character. For example, the cursor is at the letter 'n' in the following line.
          This is a line.

          There's an editor out there that programmers have been using to edit their
          programs for the last 24 years. It's called vi (say vee-eye) and it is quite
          powerful.

          http://www.websiterepairguy.com/articles/vi/12_learn_vi.html


4.2       Files contain stream of characters
          When you type letters, digits and so on, each key is stored as an ASCII
          character. For example, 'a' gets recorded as ASCII 97. When you write lines
          like these
          This is line 1
          This is line 2

          they are stored as a stream of characters like "This is line 1\nThis is
          line 2". Here \n is a special character which signifies a new line.
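
The stream of characters can be inspected from the shell. The following is an illustrative sketch that writes the two lines to a scratch file and then dumps it with od, which prints the newline characters as \n; the byte count of 30 is 14 characters plus one newline for each line.

```shell
f=$(mktemp)
printf 'This is line 1\nThis is line 2\n' > "$f"

od -c "$f"                                    # character dump; newlines show as \n

bytes=$(wc -c < "$f" | tr -d ' ')             # 30: two lines of 14 characters + newline
lines=$(wc -l < "$f" | tr -d ' ')             # 2: one per newline character
code=$(printf 'a' | od -An -tu1 | tr -d ' ')  # 97: the ASCII code of 'a'

echo "bytes=$bytes lines=$lines code_of_a=$code"
```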






4.3   How Vi Handles The Files
      When you open a file in vi, the file contents are read into a buffer. All text
      editing is done on this buffer in memory. The file on the disk is not
      updated unless vi is explicitly asked to save the changes. This lets you
      keep changing the contents of the buffer until you are satisfied, without
      changing the file on the disk.

4.4   Invoking vi
      The vi editor can be invoked using the following command

       $ vi demo.txt


      The figure below shows how a new file looks when opened in vi.

                                 The cursor
         ~
         ~                       A tilde (~) in vi marks a line past the
         ~                       end of the file; it holds no text.
         ~
         .
         .
         "myfile" [new file]     File information

4.5   Modes of vi
      vi has two modes in which you will work.

4.5.1 Command mode

      The command mode is the default mode. All vi commands work only in the
      command mode. In the command mode you cannot write text. You can only
      move around in the text, delete text, modify existing text, search for text, etc.

4.5.2 Edit mode

      In edit mode you can add new text in vi. In edit mode you cannot use any
      commands to search or navigate in the text.








4.5.3 Switching between command mode and edit mode

      When in command mode, a few commands take you to edit mode. For
      example, if you press i in command mode, you get to edit mode
      and can add text.

      When in edit mode, you can stop editing and return to command
      mode by pressing the <Esc> key.


 This is a line          You are in command mode and cursor is at 'a'.
       Press 'i'
 This is a line          Cursor is at same position but edit mode has started.
       Now press 'd'
 This is da line         Cursor is at letter 'a' and letter 'd' is added.
       Press <Esc>
 This is da line         Now you are in the command mode.

4.6   Positioning Text on the Screen
      vi provides several ways to reach the text you want to edit in a file.

4.6.1 Scrolling and moving the Screen

      By scrolling the screen we can reach the desired text. The table below
      explains how to scroll the screen and move the cursor within it.

              Command             Resulting Action
              Ctrl+u              Scrolls the window up by half a screen
              Ctrl+d              Scrolls the window down by half a screen
              H                   Takes the cursor to the top of the screen
              L                   Takes the cursor to the bottom of the screen
              M                   Takes the cursor to the middle of the screen

      To move by one complete screen, use Ctrl+b (back) and Ctrl+f (forward).
      All these commands work only in the command mode.








4.6.2 The GOTO Command

      Sometimes you already know the line number you want to reach. You
      can use the GOTO command in such cases. The table below explains the
      command and the resulting action.

              Command              Resulting Action
              G                    Moves the cursor to the last line
              <N>G                 Moves the cursor to the Nth line (for example, 33G)
              :<N>                 Moves the cursor to the Nth line (for example, :65)

4.6.3 Searching

      It is also possible to search for a pattern; the screen then moves
      to the occurrences of the desired pattern.

      Here are the commands that work for search in vi.

              Command              Resulting Action
              /pattern             Searches for the pattern forward from the
                                   current cursor position
              ?pattern             Searches for the pattern backward from the
                                   current cursor position
              :set ic              Makes the subsequent searches case
                                   insensitive (ic stands for ignore case)
              :set noic            Makes the subsequent searches case
                                   sensitive

      Once you start a search you can repeat it in a simple way. On
      keying in 'n', vi goes to the next instance of the pattern in the direction
      of the search; 'N' repeats the search in the opposite direction.


4.7   Positioning the Cursor : h, l, j, k commands
      This section explains finer control of the cursor.

      You can move the cursor with the "arrow" keys. You can also use the
      "direction" keys "h" (move left by one character), "j" (move down to the next
      line), "k" (move up to the previous line), and "l" (move right by one character).

      The "RETURN" key is similar to the "j" key in that it moves the cursor down
      one line. However, the "RETURN" key always positions the cursor at the
      beginning of the next line, whereas the "j" key moves the cursor straight down
      from its present position, which may be the middle of a line. Moving several
      positions may be accomplished by repeatedly pressing the "RETURN", direction
      or arrow key, such as "k" "k" "k" to move upward 3 lines. You can also
      precede any of these keys with a number and achieve the same result: "3k".



Self-Check Questions
1. If the cursor is resting on the 34th line of a file and it is to be placed on the 74th
   line, then the command to be issued is _____________G.
2. On searching with "?" and "/", the search will respectively be done
   ______________ and ____________________. (backward/forward)
3. To get the file statistics using the vi editor, the command to be issued is
   ___________.
4. On keying in "N" while searching for a pattern using "?", the cursor will reach the
   next instance of the pattern ________________. (backward/forward)
5. To move to the 25th word in the line while the cursor is on the 18th line, the
   command that can be issued is ___________.
6. To move to the beginning of the line on which the cursor is residing in a text file,
   the command that can be issued is __________.
7. The vi editor creates a temporary buffer area while editing a file, which is
   stored on the disk and is used later on for reference purposes by the editor.
   (True/False)




4.8    Editing using scopes
       vi commands have scope built into them. For example, in 'dd' the
       first 'd' is the delete operator and the second 'd' tells it to apply
       the operation to a whole line. Similarly, 'yy' yanks a line. Operators like 'd'
       and 'y' can also be given an explicit scope from the table below. Many vi
       commands also have upper case versions.

               Scope             Text Unit Encompassed
               0                 Beginning of line
               $                 End of line
               W w               Word right
               B b               Word left
               E e               End of word right


       Combining an operator with a scope limits the edit to exactly the text
       unit named, for example one word or the rest of the line. This section
       discusses these combinations.








4.8.1 Delete Text (d, D)

      The delete command is used in command mode to remove portions of text
      from the file being edited. The scope must be specified after the delete
      operator. Some of the most common scopes used with the delete operator
      are shown in the next table.

              Delete operator    Resulting Action
              and scope
              dw                 Delete word forward
              d(                 Delete backward to the beginning of the sentence
              d)                 Delete forward to the end of the sentence
              dG                 Delete from current line to end of file
              dL                 Delete from current line to end of screen
              d/^xyz             Delete from the cursor to the first occurrence of
                                 the pattern (here, xyz at the start of a line)
              dtx                Delete from the cursor up to, but not including,
                                 the first occurrence of 'x' on the line

      NOTE: The same scopes can be used with all the scoped text editing
      commands, so we will not repeat them for the other commands; only
      different scopes or operators, if any, will be discussed.

      NOTE: It is important to remember that the current cursor position serves as
      the starting point for the scope. This means a scoped deletion starts
      from the current position. For example, typing "2dd" will delete
      two consecutive lines beginning with the current line.

4.8.2 Change Text (c, C)

      You can use the change command to change the text in a line. Scopes are
      applied in the same manner as they are used with the delete command.

      On issuing the change text command, vi gets into the edit mode; after the
      text has been typed, pressing the <ESC> key returns it to the command mode.
      The example shows how the change command can be used.

       This is the line to watch           Cursor is positioned at 't'

         On issuing the command '2cw' (change two words)
         and keying in "new line"

       This is new line to watch           Text inserted in place of two words

4.8.3 Replace Command (r, R)

      The replace command is used to replace portions of text on the screen. The
      table shows the two variants of the replace command and their usage for
      replacing text.






             Replace command        Text replacing action
             r                      Used to replace a single character at a time
             R                      Used to replace as many characters as
                                    there are keystrokes, until the user presses <ESC>


       This is the line to watch out for.       Cursor positioned at 'l'

         On issuing 'r' command and typing 'm'

       This is the mine to watch out for.       'l' is replaced by 'm'

         On issuing 'R' command, keying in "kite" and <ESC>

       This is the kite to watch out for.       Complete word is replaced

4.8.4 Erase Command (x, X)

      The erase command removes a character.

                Erase Command        Erase Action
                x                    Erase the character on which the cursor is
                                     placed
                X                    Erase the character to the left of the cursor

4.8.5 Undo Command (u, U)

      The undo command reverses the effect of editing operations done on a file.

      'u' reverses the effect of the last editing command, whereas 'U' reverses
      all the changes made to the current line since the cursor moved onto it.


4.9   Text Insertion
      The vi editor provides several ways to insert text in a file. We will
      discuss each of these methods in some detail, but a beginner is advised
      to pick one approach and use it to insert text.

4.9.1 Append Command (a, A)

      The append command adds to the existing text. It has two forms, 'a' and 'A'.
      These two forms are explained in the figure below.

       The student laughed.

              On issuing 'a' command and typing 's' and <ESC>

       The students laughed.             Text appended after the cursor

              On issuing 'A' command and typing ' Aloud.' and <ESC>

       The students laughed. Aloud.      Text appended at end of line.

4.9.2 Insert Command (i, I)

      This command is used to insert text into a text file. It has two
      forms, 'i' and 'I'. The figure below explains how to use this command.

       The student laughed.

             On issuing 'i' command and typing 'new ' and <ESC>

       The new student laughed.          Text inserted before the cursor

             On issuing 'I' command and typing 'Again ' and <ESC>

       Again The student laughed.        Text inserted at the beginning of line.

4.9.3 Open Command (o, O)

      The open command opens a new line to add text. It has two forms, 'o' and 'O';
      the figure below explains their usage.


       The student laughed.

             On issuing 'O' command and typing 'A new line is added' and <ESC>

       A new line is added
       The student laughed.                 Text inserted above the current line


           On issuing 'o' command and typing 'Another line' and <ESC>

       A new line is added
       The student laughed.
       Another line                         Text inserted below the current line


4.9.4 Read Command (:r)

      The read command allows the user to copy the contents of another file into
      the current file. While in command mode, with the cursor on the line above
      the place where you want the other file read in, type:

        :r <File>      Reads the specified file into the current file below the
                       cursor line


4.10 Global Search and Replace for text


52
COE                                                                         Unit 1, Lesson 4



      The examples below show commands that search and replace text for different
      purposes.

        :1,$s/oldText/newText/g       This command replaces all the instances of
                                      oldText with newText in the file
        :1,15s/oldText/newText/g      This command replaces oldText with newText
                                      from line number 1 to 15
        :g/oldText/s//newText/gc      This command asks before replacing text
                                      each time

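      The address and substitute syntax of the ‘:s’ command is shared with sed, so the
      effect of a line range can be previewed from the shell before it is tried in vi.
      The following is only a sketch: the scratch file and its three lines are
      assumptions made for illustration, and the confirm flag ‘c’ has no sed
      counterpart, so only the first two commands are mirrored.

```shell
# Scratch file with three lines, each containing oldText (illustrative data).
sample=$(mktemp)
printf 'oldText one\noldText two\noldText three\n' > "$sample"

# Counterpart of :1,2s/oldText/newText/g -- only lines 1 to 2 change.
sed '1,2s/oldText/newText/g' "$sample"

# Counterpart of :1,$s/oldText/newText/g -- every line changes.
sed '1,$s/oldText/newText/g' "$sample"
```

      sed prints the result and leaves the scratch file itself unchanged, which makes
      it a safe way to check a range before editing the real file.
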
Self-Check Questions
8. To delete the word on which the cursor is placed, the “D” command can be issued.
    (True/False)
9. The change operator invokes the text insertion mode. (True/False)
10. The operator _______________ changes the text, yet does so in command
    mode and not in text insertion mode.
11. The command ______________ replaces the characters on screen one at a time
    as the user keys in the new characters.
12. To erase the character on which the cursor is placed, the __________ command is to
    be issued, whereas to delete the character to the left of the one on which
    the cursor is placed, the _________ command needs to be issued.
13. To replace the name “shahs” with “mazes” in a text file, the command to be issued
    is ___________.




4.11 Rearranging and Duplicating Text
      You can yank text for copying it at another place in the text file.

4.11.1 Copying Text and Moving the Copy

      Step 1: Copying Text with the Yank Command (y, Y)

      The yank command ‘y’ accepts the same scopes that we have seen with the
      delete command. Yanking places the yanked content into an unnamed buffer.
      Some examples of yanking are:

        This is the line to be yanked.            Cursor is on the character ‘l’

         On issuing the command ‘3yw’, which means yank 3 words, vi yanks 3 words
         starting from the current cursor position.

        This is the line to be yanked             Cursor is on the first line
        This is another line to yank
        This is yet another line that can be yanked

         Issuing the command ‘3yy’ will yank 3 lines starting from the current line.

      Step 2: Put Command (p, P)

      The put command is used to place the contents of the unnamed buffer back
      into the file being edited. Returning whole lines into the text is handled
      differently than word and sentence fragments.

      The lower-case "p" places the line or lines below the current line and the
      upper-case "P" places them above the current line.

      A handy feature of yank and put is the ability to insert the same copy repeatedly
      within one file. The sequence for this is yank, relocate cursor, put, relocate
      cursor, put, and so on until all needed copies have been placed.

4.11.2 Deleting Text and Moving It

      When you delete text, it is placed in the same unnamed buffer as yanked text, so
      it can be put in another place in the file.


           This is the file.
           It contains text.
           This line will be deleted.                  This line will be deleted
           Below this it will be later on pasted.      using the ‘dd’ command.
           This will be the end of file.

           After ‘dd’, the cursor is placed on the line ‘Below this it will be later on pasted.’

           This is the file.
           It contains text.
           Below this it will be later on pasted.
           This line will be deleted.                  On using the ‘p’ command the line is
           This will be the end of file.               placed below the present cursor position.

4.12 Named Buffers
      Named buffers offer another way to copy (yank) or remove (delete) text.

      The unnamed buffer saves only the last deleted or yanked text. vi also provides 26
      named buffers (a-z) for your use. Named buffers allow users to
      yank several pieces of text and put them at different places.

      These named buffers remain only for the life of the current editing session.
      Once you quit vi, these buffers are no longer available.

      Here are a few examples of how named buffers are used.

      Typing "g7yy in command mode implies the following:
      The quote (") calls for a named buffer
      "g selects the buffer named g
      7yy yanks 7 lines into the named buffer g.



      Now, if you type "gp, it implies the following:
      "g calls for the named buffer g
      "gp pastes the contents of the named buffer g.

      You can append more information to a named buffer. When you use the
      capital letter to yank into a named buffer, the yanked contents are appended
      to the named buffer. For example, "g7yy yanks 7 lines into buffer g; after that,
      "G3yy yanks 3 lines and appends them after the 7 lines already in
      buffer g.

      These named buffers are not write-protected. If a named buffer contains
      information and it is called a second time with its lower-case name, the
      original material is overwritten.

4.12.1 Using the named buffers

      Once you yank contents into a named buffer g, you can paste them anywhere in
      the file by giving the buffer name before the put command:

      "gp    puts the contents of buffer g below the current line
      "gP    puts the contents of buffer g above the current line

      It is important to note that the vi editor will not tell you which buffers are
      currently defined, nor which buffer contains what; you must
      remember the names of the buffers and their contents.



Self-Check Questions
14. To copy 10 lines of text into an unnamed buffer, the 10_____ command can be
    used. (Y/y)
15. The text saved in an unnamed buffer created by yanking or deleting can be
    placed back into the text below the current line where the cursor is placed by
    using the _________ command.
16. To append 5 more lines to the named buffer ‘a’, the command to be issued
    is __________.
17. If a named buffer is called upon again and new information is written into it, then
    the new information is appended to the buffer. (True/False)
18. It is possible to get the buffer name on the basis of the content stored in the
    buffer. (True/False)





4.13 Miscellaneous Information
       This section covers some miscellaneous features that can make you more
       productive when editing files.

4.13.1 Creating Line Numbers

       By default, the vi editor does not show line numbers. To list the lines of the
       file once, each preceded by its line number, use the command:
           :%nu

       To have line numbers displayed beside the text for the rest of the current
       session, type:
           :set number

       The line numbers appear in your file immediately and remain until you exit
       the editor or type:
          :set nonu

       The "control s" command stops screen movement.
       The "control q" command releases a frozen screen.
       The "control l" command refreshes the vi screen without modifying the file.

       The .exrc file
       There are many setup (set) commands that can be set or changed for vi. It is
       advisable to put these commands into the ~/.exrc file so that vi loads these
       settings automatically every time it starts.

       For example:

      bash> cat ~/.exrc
      " Show line numbers
      set nu
      " Do not wrap around the end of the file while searching
      set nows
      bash>

       The following command will show you the available setup commands.
          :set all

4.13.2 Lines and Sentences in VI

       To be successful in your editing, it is necessary to understand what the editor
       considers a line and a sentence. Just for clarity, a line and a sentence are
       different items to the editor. To the editor, a line begins on the left of a screen
       and terminates at a carriage return. The carriage return is the invisible
       character placed in your file every time you press the "RETURN" key. A
       sentence to the editor is a string of characters of unspecified length (a few
       characters to many lines) terminating with one of the punctuation marks “.”, “?”, “!”
       followed by either a carriage return or two blank spaces.




4.13.3 Joining Lines

      As you are editing files, you will find it is desirable to combine or join lines.
      This is easily done using the "J" (join) command. An illustration of joining lines
      is given below. The cursor is located on the top line when the "J" command is
      issued. vi will move the lower line and butt it to the end of the upper line. The
      editor takes care of necessary spacing for you.




4.13.4 Repeating a Command

      To make life a bit easier, vi allows text alteration commands to be repeated by
      using the ‘.’ (repeat) command. A handy way to illustrate the repeat
      command is with the ‘cw’ command, replacing a single word with two new
      words throughout a paragraph.

      In this example, the first occurrence of “PU” is located with the search
      command ‘/PU’. Then, with the cursor on the “P” of “PU”, the ‘cw’ command is
      issued, followed by “Purdue University” and <ESC>. The ‘n’ key is pressed
      to find the next occurrence of “PU”. The cursor relocates to the “P” of the next
      “PU”, and all that is required to change it to “Purdue University” is to type ‘.’.




4.13.5 Editing Multiple Files Using vi

      The vi editor provides a feature which allows a user to edit multiple files by
      use of the ":e" (edit) command. This ability to access multiple files without
      leaving the editor permits a user to see information in another file without
      exiting the editor. Additionally, because the files are opened within the same
      editor invocation, they can share the same named buffers, thereby making the
      transfer of text possible between the files. When vi is invoked, a work area
      called a buffer is created for editing purposes. It is into this work space that a
      copy of a specified disk file is placed. The editor permits only one file copy in
      this buffer space at a time. Thus after making changes to a file (delete, add, or
      change), you must inform the editor what you wish done to the current buffer
      contents before you will be permitted to bring another file into this space. You
      do this by use of ":w" (write the current buffer contents to the opened file), ":e!
      newfile" (toss the current buffer contents, with no update to the opened file, and
      place a copy of the newly called file in the buffer), or ":quit!" (exit the editor
      and toss the buffer contents).

      When you have two files open, VI permits toggling between files by use of ":e
      #". This works because whenever VI sees the character "#" used in a
      command where a filename is expected, it substitutes the "#" with the name of
      the previous file.

      For example if you had been in fruits then opened vegetables, the command
      ":e #" would return you to where you were in the fruits file. Repeat ":e #" and
      you would be back in vegetables.

4.13.6 Mark Command

      The mark command sets up a mark in vi and while editing you can go back to
      the places where you had placed these marks. vi provides 26 marks which
      are named ‗a‘ to ‗z‘.

      You can put a mark “g” at a position using a command like the following:
           mg
      Note that the marks are not visible at all in vi. You have to remember the
      marks that you have put. To go back to the marked location “g”, use the
      following command:
          'g


4.14 Saving or Storing a File
      As mentioned earlier, the vi text editor creates a temporary working area (the
      buffer), which holds either a copy of an existing file on the disk or a new file.
      This area is at the disposal of the user for editing. On saving, the contents of
      the buffer are written to the file, which is stored on the disk. The file on the
      disk, on the other hand, is removed only with the remove command of UNIX.

      The changes made in the buffer are not saved until you give the command
      to do so, so it is advisable to save your work periodically. The following
      sections discuss how to do this. Below is a schematic showing how
      the work is saved on the disk.





4.14.1 Writing to the file

       It is useful and safe to save your work periodically when typing text. The ‘:w’
       command writes the buffer to the file on the disk, thus saving the changes.
       This works in command mode.


         :w <File>           Saves the changes done in the <file>


4.14.2 Exiting the vi editor

       To exit the vi editor you can use the quit command ‘:q’. Combining it with the
       write command gives ‘:wq’ (write and quit). To discard the changes made and
       quit, use ‘:q!’.



Self-Check Questions
19. The text insertion command takes the VI control from command mode to text
    insertion mode. (True/False)
20. If some text is to be added to the current text, such that the newly inserted
    text is added at the end of the line on which the cursor is positioned, then text
    insertion is invoked with the command ____________.
21. If in some application the same piece of text from one text file is
    to be inserted in another text file, the user can use the command _______________.
22. When using the text insertion command read ‘:r’, to switch back to the command
    mode from text insertion mode the ESC key can be used. (True/False)
23. On issuing the write command once in a session, we ensure that all the text
    inserted in that session, including the text inserted after the write
    command is issued, is saved. (True/False)
24. If we need to store the editing work done in the editor, the command
    ___________ needs to be issued.
25. If one finds out that he does not need the text he has inserted into the editor
    window in the present session, then he is required to issue the ____________
    command.
26. In some application it is required to create a file ‘new’ from a file ‘old’ with some
    new text, and the file ‘old’ needs to be kept unchanged. The VI command that
    should be issued for writing the new changes is __________________ and
    for exiting the VI session is ____________.
27. The VI editor can operate in two modes. The mode which lets the user change
    the text in the file is the _____________________ mode.




4.15 Summing Up
         In this chapter we have looked at the vi text editor in detail. We
         discussed many techniques and viewed examples that can help you
         edit text files efficiently. With these techniques at hand you will be
         able to learn other advanced techniques when you work in actual
         environments and situations.


4.16 Answers to self-check questions
      1. 74G.
      2. backwards and forward.
      3. Ctrl-g
      4. forward
      5. 7w
      6. 0 (zero)
      7. False.
      8. False.
      9. False.
      10. r
      11. R.
      12. x, X.
      13. :g/shahs/s//mazes/g
      14. 10yy
      15. p
      16. "A5yy
      17. False
      18. False
      19. True
      20. A
      21. :r
      22. True
      23. False
      24. :w.
      25. :q!
      26. :w <new>, :q!
      27. Edit.





4.17 Terminal Questions
      1. Explain the processes that are used for changing the text using the VI text
         editor
      2. Explain the processes that can be used to delete the text using the VI text
         editor
      3. Write a note about the named buffers and also explain some usage with
         practical examples
      4. Write briefly about the rearranging and duplicating of text in the VI text
         editor
      5. Explain how the VI editor functions
      6. What are the different modes for operating VI Editor? Explain in brief
      7. Explain the append, insert and quit modes of operation of VI editor.




UNIT 2: SHELL SCRIPTING

1. INTRODUCTION TO SHELL

2. SHELL SCRIPTING AND DEBUGGING

3. CONDITIONAL STATEMENTS

4. REPETITIVE TASKS

5. REGULAR EXPRESSIONS

LESSON 1                       INTRODUCTION TO SHELL
1. INTRODUCTION TO SHELL

   1.1       INTRODUCTION
   1.2       THE SHELL: COMMAND PROCESSOR
   1.3       BASH: BOURNE AGAIN SHELL
      1.3.1 Advantages of BASH
   1.4       REDIRECTION
      1.4.1 Standard Output
      1.4.2 Standard Input
      1.4.3 Standard Error
      1.4.4 Combining Streams
   1.5       VARIABLES
      1.5.1 Setting strings with the variable names having $
      1.5.2 Types of variables
      1.5.3 Exporting variables
      1.5.4 Using Shell Variables
   1.6       COMMAND SUBSTITUTION
   1.7       PATTERN MATCHING – THE WILD CARDS
      1.7.1 The * & ?
   1.8       THE CHARACTER CLASS
   1.9       MATCHING A DOT (.)
   1.10      SUMMING UP
   1.11      ANSWERS TO THE SELF-CHECK QUESTIONS
   1.12      TERMINAL QUESTIONS

                          1. Introduction to Shell



This unit on shell scripting starts with an introduction to the shell itself. Bash is
also introduced in this lesson. The subsequent lessons discuss more advanced
concepts at length.



1.0       Objectives
          After going through this lesson, you will be able to:

         Know about different types of shell
         See how the shell executes commands
         Understand and use Redirection, Variables, Pattern matching etc.


1.1       Introduction
          The shell in UNIX is the program which acts as an interface between the user
          and the UNIX system. It reads what the user types, interprets it, tells the
          kernel what the user wants, gets the results of the command execution from the
          kernel and returns them to the user in a form he understands. Much of
          what we can do with a UNIX system is possible because this program
          can turn a few terse commands into effective action. The shell is
          also known as a command processor, because it processes the instructions
          you issue to the machine.


1.2       The Shell: Command Processor
          On logging on to the UNIX system you encounter a prompt ($ or % or any
          custom prompt). Although it seems that nothing is happening, a
          program is running and waiting for your instructions so that it can execute
          them; this is the shell. When a user logs on, the shell starts functioning and
          keeps doing so until the user logs out.

          When you issue a command, the shell is the first agency to acquire the
          information. It accepts and interprets user requests; these are generally the
          UNIX commands we key in. The shell examines and rebuilds the command
          line and then leaves the execution work to the kernel. The kernel handles the
          hardware on behalf of these commands and all processes in the system.




          Users can thus afford to remain ignorant of the happenings behind the scene.
          This is one of the beauties of UNIX design and philosophy.

          The shell is generally sleeping. It wakes up when input is keyed in at the
          prompt. This input is the input to the program that represents the shell. Below
          is the list of activities that the shell typically performs.

          It issues the prompt ($ or otherwise) and sleeps till you enter a command.

          After a command has been entered, the shell scans the command line for
          special characters (metacharacters, which we will look at later) that
          have a special meaning for it. Because it permits abbreviated command lines
          (like the use of * to indicate all files, as in rm *), the shell has to make sure the
          abbreviations are expanded before the command can act upon them.

          It then creates a simplified command line and passes it on to the kernel for
          execution.

          The shell can‘t do any work while the command is being executed, and has to
          wait for its completion.

          After the job is complete, the prompt reappears and the shell returns to its
          sleeping role to start the next ―cycle‖. You are now free to enter some other
          command.

          Note: Commands at the lower levels do not know or understand the
          metacharacters, so the shell has to handle them and resolve them to normal
          representations before they are passed to the kernel.
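
          The expansion step described above can be observed directly. The following is a
          minimal sketch, assuming a scratch directory and two made-up file names
          (report.txt and notes.txt); it compares an unquoted * with a quoted one.

```shell
# Work in a scratch directory so that * matches only the two sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch report.txt notes.txt

# Unquoted: the shell replaces * with the matching file names before
# echo runs, so echo never sees the asterisk itself.
expanded=$(echo *)
echo "$expanded"

# Quoted: the shell passes the asterisk through unchanged.
literal=$(echo '*')
echo "$literal"
```

          The first echo prints the two file names and the second prints a bare asterisk,
          which shows that the expansion is done by the shell and not by the command.
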


1.3       BASH: Bourne Again Shell
          The Bourne Again shell (bash) is the standard GNU shell, intuitive and flexible.
          It is probably the most advisable shell for beginning users, while at the same
          time being a powerful tool for the advanced and professional user. On Linux,
          bash is the standard shell for common users. This shell is a so-called superset
          of the Bourne shell, a set of add-ons and plug-ins. This means that the Bourne
          Again shell is compatible with the Bourne shell: commands that work in sh also
          work in bash. However, the reverse is not always the case.

          To know the shell you are using, invoke the command echo $SHELL. The
          output could show /bin/sh (Bourne shell), /bin/csh (C shell), /bin/ksh (Korn
          shell) or /bin/bash (bash shell).

          When bash is started, it reads its configuration files. The most important are:

          /etc/profile - system-wide file read at login time
          ~/.bash_profile - read by bash login shells (e.g. for printing system details
          on screen)
          ~/.bashrc - read by non-login shell windows



1.3.1 Advantages of BASH

          Bash is an sh−compatible shell that incorporates useful features from the
          Korn shell (ksh) and C shell (csh). It is intended to conform to the IEEE
          POSIX P1003.2/ISO 9945.2 Shell and Tools standard. It offers functional
          improvements over sh for both programming and interactive use; these
          include:
      o   Command line editing
      o   Unlimited size command history
      o   Job control
      o   Shell functions and aliases
      o   Indexed arrays of unlimited size
      o   Integer arithmetic in any base from two to sixty−four

          Bash can run most Bourne shell scripts without modifications.

          In our course, we will work with bash only. The formats and commands
          mentioned in this course may need slight changes if they are to work in other
          shells.


1.4       Redirection
          Many of the UNIX commands that we have come across send their output
          to the terminal. There are also commands which take their input from the keyboard.
          One might therefore think that these commands are designed to accept only fixed
          sources and destinations. In fact, these commands are designed to use
          character streams without knowing their source or destination. A character
          stream is just a sequence of bytes that many commands see as input and
          output.

          In a UNIX system these streams are treated as files, and a group of UNIX
          commands reads from or writes to these files. A command is usually not
          designed to send output to the terminal, but to this file. Likewise, it is not
          designed to accept input from the keyboard either, but only from a standard
          file which it sees as a stream. There is a third stream for all error messages
          thrown out by a program. This stream is the third file.

          It is here that the shell comes in. The shell sets up these three standard files
          (for input, output and error) and attaches them to a user's terminal at the time
          of logging in. Any program that uses streams will find them open and available.
          The shell also closes these files when the user logs out.

          The standard file for input is known as standard input and that for output is
          known as standard output. The error stream is known as standard error. By
          themselves, these standard files are not associated with any physical device,
          but the shell has set some physical devices as defaults for them:




             Streams                Default sources/destinations
             Standard Input         The default source is the keyboard
             Standard Output        The default destination is the terminal screen
             Standard Error         The default destination is the terminal screen

1.4.1 Standard Output

      Commands like “more” send their output as a character
      stream. This stream is called the standard output stream and appears on the
      terminal screen by default. By using redirection, this stream can be
      sent to a disk file instead.

      Example:

      bash> more myFile > newFile

      The shell looks at the >, understands that standard output has to be
      redirected, opens the file newFile, writes the stream into it and then closes
      the file. All this happens with more knowing nothing about it, because more
      sends the output to the stream and that stream gets redirected to a disk file.

      With the ‘>’ redirection operator, the shell overwrites an existing file, or
      creates a new file if no file with that name exists. Alternatively, it is possible
      to append to an existing file by using another redirection operator, ‘>>’.

              Operator            Action performed
              >                   Creates a new file or if the file is already existing
                                  then overwrites
              >>                  Appends to the file if the file is existing or creates a
                                  new file
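
      The difference between the two operators can be seen in a short experiment. This
      is only a sketch: the scratch file created with mktemp and the three messages
      are assumptions made for illustration.

```shell
# Scratch file, so that no real file is overwritten by the experiment.
log=$(mktemp)

echo "first run"  > "$log"     # > creates the file or overwrites it
echo "second run" > "$log"     # overwrites again: "first run" is lost
echo "third run" >> "$log"     # >> appends after "second run"

# Show what survived: only the second and third messages.
cat "$log"
```

      Because ‘>’ discards the previous contents without warning, ‘>>’ is the safer
      choice whenever earlier output has to be kept.
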

      It is also possible to group commands together and redirect their combined
      output to a file. A pair of parentheses groups the commands, and a single
      redirection sends their output to a file.

      Example:

      bash> (ls -l; who) > myFile

      It is also possible to redirect the results to another program; this is the
      concept of pipelining, which we will discuss later on.
      In summary, the standard output has three possible destinations:
      The terminal or the screen, which is the default destination
      A disk file
      A pipe, to another command


      NOTE: The shell creates the file before it redirects the output into it.
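
      The three destinations of standard output, and the order of events mentioned in
      the note, can be tried out as follows. This is a sketch under stated assumptions:
      the scratch directory and the file names grouped.txt and listing.txt are
      hypothetical, and echo stands in for real commands such as ls -l and who.

```shell
# Work in an empty scratch directory so the listing below is predictable.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 1. Default destination: the terminal screen.
echo "to the screen"

# 2. A disk file: the commands grouped in parentheses share one redirection.
(echo "first command"; echo "second command") > grouped.txt

# 3. A pipe: the same output becomes the input of another command.
line_count=$( (echo "first command"; echo "second command") | wc -l | tr -d ' ')
echo "lines sent through the pipe: $line_count"

# The shell creates listing.txt before ls starts, so ls finds it
# in the directory and includes it in its own output.
ls > listing.txt
cat listing.txt
```

      The last command shows listing.txt inside its own listing, which is only
      possible because the shell opened the file before ls began to run.
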

1.4.2 Standard Input

      Some commands are designed to take their input as a stream as well. This
      stream represents the standard input to the command. A classic example of
      the use of standard input is the “wc” command for counting
      words:


       bash>wc
            2*4
           23 ^ 64
       [ctrl-d]
           2    10 44

      With no filename provided, wc reads from the keyboard until [ctrl-d] and sends
      the number of lines, words and characters to the standard output. No filename
      appears in the output.

       bash>wc < my
         5   9 54


      When a file is redirected to the command with '<', the command takes its
      input stream from that disk file.

      To summarise, the standard input has three possible sources:
      The keyboard, which is the default source
      A pipe, carrying the output of some other command
      A file

      NOTE: When a file is redirected to a command, it is the shell that opens
      the file, and the command does not know where its input comes from. But when
      the command is given the file name as one of its arguments, the command
      itself opens the file.
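
      The effect described in the note is visible in the output of wc. In the
      sketch below the file name my is reused from the example above, while its
      three-line content is made up for the demonstration.

```shell
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\ngamma\n' > "$tmpdir/my"

# The shell opens the file; wc only sees a stream, so it cannot print a name.
from_stdin=$(wc -l < "$tmpdir/my")

# wc opens the file itself, so it knows the name and reports it.
from_arg=$(cd "$tmpdir" && wc -l my)

echo "redirected: $from_stdin"
echo "argument:   $from_arg"
rm -rf "$tmpdir"
```

      The exact spacing of the numbers differs between wc implementations, but only
      the second form includes the file name.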

1.4.3 Standard Error

      When you enter an incorrect command or try to open a nonexistent file,
      certain diagnostic messages show up on the screen. This is the standard
      error stream. Like standard output, it too is destined for the terminal. Note
      that they are in fact two separate streams, and the shell possesses a
      mechanism for capturing them individually.

      Before we proceed any further, you should know that each of these three
      standard files has a number, called a file descriptor, which is used for
      identification:



      0   Standard input     '<' is the same as '0<'
      1   Standard output    '>' is the same as '1>'
      2   Standard error     must be written as '2>'

      These descriptors are implicitly prefixed to the redirection symbols. For
      instance, > and 1> mean the same thing to the shell, while < and 0< are also
      identical. You normally don't need to use the numbers 0 and 1 to prefix the
      redirection symbols because they are the default values. However, we need to
      use the descriptor 2> for the standard error:

       bash>cat bar > errorfile
       cat: cannot open bar: No such file or directory
       bash>cat errorfile

      Without the file descriptor on the redirection symbol, the error message
      still appears on the terminal and errorfile stays empty.

       bash> cat bar 2>errorfile
       bash> cat errorfile
       cat: cannot open bar: No such file or directory


      This works. You can also append diagnostic output in a manner similar to the
      one in which you append standard output:

       bash>cat bar 2>> errorfile

      You can now save error messages in a separate file. This enables you to run
      long programs and save error output to be viewed at the end of the day.
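
      The separation of the two streams can be sketched with a small stand-in
      command. The function report and the file names are assumptions made for this
      illustration; '>&2' inside the function simply sends that line to the standard
      error.

```shell
tmpdir=$(mktemp -d)

# A stand-in for a real command: one line to stdout, one to stderr.
report() {
    echo "normal output"
    echo "error message" >&2
}

report  > "$tmpdir/out" 2> "$tmpdir/err"    # split the two streams
report 2>> "$tmpdir/err" > /dev/null        # append a second error line

out=$(cat "$tmpdir/out")
err_lines=$(wc -l < "$tmpdir/err")
echo "stdout file holds: $out"
echo "lines in error file: $err_lines"
rm -rf "$tmpdir"
```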

1.4.4 Combining Streams

      In UNIX, it is also possible to redirect both the input and the output
      stream at the same time; the shell keeps the command ignorant of the
      source and the destination.

       bash>cat > my


      In this case only the output is redirected: cat still reads its input from the
      keyboard until [ctrl-d] and writes it to the file my.
      The < and > operators can also be combined on one command line, and the order
      in which they appear is immaterial to the shell.

       bash> wc < infile > newfile
       bash> wc > newfile < infile
       bash> > newfile < infile wc




      All three command lines perform the same task. It is also possible to
      redirect the standard output and the standard error on the same command
      line.

       bash> cat newfile nofile 2> errorfile > outfile


      By default, errors are written to the standard error (stderr) and normal
      output is sent to the standard output (stdout). For example, if you type the
      following command to compile a C program, only the normal output is sent to
      the file; the errors still show up on the terminal.

       bash> cc x.c y.c > compile.out
       variable x is not defined.
       variable y is redefined.
       variable z is not defined.

      But if you want both the errors and the usual output (e.g. any warnings, etc.)
      to go into a single file, then you can use the following command:

       bash> cc x.c y.c > compile.out 2>&1
                              # Note that no output is printed on the screen
       bash> cat compile.out
       variable x is not defined.
       Warning: variable type mismatch.
       variable y is redefined.
       variable z is not defined.
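
      The position of 2>&1 on the command line matters, because the shell
      processes redirections from left to right. The sketch below uses a stand-in
      function instead of the compiler; report and the file names are assumptions.

```shell
tmpdir=$(mktemp -d)

report() {
    echo "normal output"
    echo "error message" >&2
}

# stdout goes to the file first, then stderr is pointed at the same place.
report > "$tmpdir/both" 2>&1

# Reversed order: stderr is pointed at the current stdout (here the pipe that
# feeds the command substitution), and only afterwards is stdout sent to the file.
leaked=$(report 2>&1 > "$tmpdir/stdout_only")

both_lines=$(wc -l < "$tmpdir/both")
stdout_only=$(cat "$tmpdir/stdout_only")
echo "escaped the file: $leaked"
rm -rf "$tmpdir"
```

      With the reversed order the error message does not reach the file, which is
      why the form '> file 2>&1' is the one to use for a combined log.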

2.3   Pipeline

      In UNIX, it is often desirable to feed the output of one command to another
      command in order to accomplish a task. For instance, the following commands
      save the list of logged-in users in a file and display it:

       bash> who > user.lst
       bash> cat user.lst
       araz tty01 May 18 09:32
       amol tty02 May 18 11:18
       achint tty03 May 18 13:21

      Now, to count the number of users, we can redirect the file user.lst so that
      wc reads it from its standard input.

       bash> wc -l < user.lst
            3


      This method of using multiple commands to accomplish tasks has some
      obvious disadvantages:


      1. The process is slow. The second command cannot start until the first one
         has finished.
      2. An intermediate file is required, and it has to be removed after the wc
         command has been executed.
      3. When handling large files, temporary files can build up easily and eat up
         the disk space.

      The shell has a powerful ability to connect these commands directly, without
      any intermediate file, so that each command takes its input from the
      previous one. This is accomplished using the pipe (|) operator.

      By using the pipes the command sequence shown above can be compressed
      to the following single command:

       bash> who | wc -l
            3

      Here, 'who' is said to be piped to wc. No intermediate file is created. When
      a sequence of commands is combined in this way, a pipeline is said to be
      formed. The name is appropriate, as the connection it establishes between
      programs resembles a plumbing joint. It's the shell that sets up this
      interconnection, and the commands have no knowledge of it.
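
      The two approaches can be compared side by side. Because the output of who
      depends on the machine, the sketch below substitutes a hypothetical function
      fake_who that prints three fixed lines modelled on the listing above.

```shell
# Stand-in for `who`: prints three fixed lines.
fake_who() {
    printf 'araz tty01\namol tty02\nachint tty03\n'
}

# Two steps with an intermediate file that must be cleaned up...
tmpfile=$(mktemp)
fake_who > "$tmpfile"
with_file=$(wc -l < "$tmpfile")
rm -f "$tmpfile"

# ...and the same result with a pipe, no file needed.
with_pipe=$(fake_who | wc -l)

echo "via file: $with_file   via pipe: $with_pipe"
```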

      A pipe acts as the destination of one command's standard output and as the
      source of the next command's standard input. You can now use one to count
      the number of files in the current directory:

       bash> ls | wc -l
         15


      Note that no separate command was designed to tell you that, though the
      designers could easily have provided another option to ls to perform this
      operation. And because wc writes to standard output, you can redirect this
      output to a file:

       bash> ls | wc -l > fkount

      There‘s no restriction on the number of commands you can use in a pipeline.
      But you must know the behavioral properties of these commands to place
      them there. Consider this generalized command line:

      command1 | command2 | command3 | command4

      It should be pretty obvious that command2 and command3 must support both
      standard input and standard output. command1 only needs to write to standard
      output, while command4 must be able to read from standard input. If you
      can ensure that, then you can have a chain of these tools connected together.


      Commands such as command2 and command3, which support both streams, are
      called filters. These will be discussed later.
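
      As a preview, here is a four-stage pipeline in which every stage after the
      first is a filter. The commands sort, uniq and head are standard UNIX tools
      that this lesson has not introduced yet; the input list of shell names is
      invented for the example.

```shell
# sort, uniq and head each read standard input and write standard output,
# so they can sit in the middle or at the end of a pipeline.
top=$(printf 'bash\ncsh\nbash\nksh\nbash\ncsh\n' |
      sort |          # bring identical lines together
      uniq -c |       # count each distinct line
      sort -rn |      # highest count first
      head -n 1)      # keep only the top entry

echo "most frequent: $top"
```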

1.5   Variables
      The shell lets you define variables that store a value, which can later be
      referenced on the command line or in shell scripts (we will learn about shell
      scripts shortly). Shell variables are of string type, which means the value
      is stored as ASCII text rather than in binary format. No type declaration is
      necessary before you can use a shell variable. A variable is set using the
      general form variable=value, and is referenced by prefixing its name with
      '$'. The unset command removes a variable.

      Example,

       bash> a=4
       bash> echo $a
       4
       bash> unset a
       bash> echo $a

       bash>

      NOTE: There must be no space between the variable name, the = and the
      value; otherwise the shell interprets the variable name as a command, with
      '=' and the value as its arguments.
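
      The note can be checked with a short experiment. The name count_demo is a
      made-up variable name, assumed not to exist as a command on the system; the
      exit status 127 is what the shell reports for a command it cannot find.

```shell
a=4                          # no spaces around '='
echo "a is $a"

unset a
echo "a is now '${a}'"       # prints empty quotes: a no longer has a value

# With spaces the shell tries to run a command called count_demo with the
# arguments '=' and '4'. The error text is discarded; only the status is kept.
count_demo = 4 2>/dev/null
status=$?
echo "exit status of 'count_demo = 4': $status"
```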

      An unset shell variable expands to a null value, but sometimes it is
      desirable to set a variable to null explicitly, using any one of the
      following constructs:
      x=  or  x=''  or  x=""

      It is also possible to assign a multi-word string to a shell variable. There
      are two approaches:
      1. Escape the blank spaces using the escape character '\'
      2. Use quotes.

       bash> a='My name is Amrit'
       bash> echo $a
       My name is Amrit
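
      Both approaches store the same string, as the following sketch suggests. The
      [ ... ] test command used for the comparison is a standard shell feature that
      the course covers later.

```shell
x=                          # explicit null value
a=My\ name\ is\ Amrit       # 1. escape each blank with a backslash
b='My name is Amrit'        # 2. or quote the whole string

[ -z "$x" ] && echo "x is empty"
[ "$a" = "$b" ] && echo "both forms store: $a"
```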


1.5.1 Setting strings with the variable names having $

      A string may contain the $ character for one of two reasons:

      1. The string inherently contains the $ sign. Example:
      My salary per month is $1000

       bash> echo 'My salary per month is $1000'
       My salary per month is $1000

          Here, $1000 is echoed as it is.


           bash> echo "My salary per month is $1000"
           My salary per month is 000

          Here the shell takes $1 to be a variable reference. Its value is
          undefined, so the shell replaces $1 with a null string and only 000 remains.

          Thus, the shell handles strings differently depending on whether they are
          enclosed in single quotes or double quotes.

          2. The string uses a variable name with the $ character, to be replaced by
             the variable's value.

          Example,
          My salary per month is $$x
          Here the variable x is to be replaced with the salary amount, preceded by
          a literal dollar sign. Inside double quotes the first $ must be escaped as
          \$ (that is, \$$x), because the shell expands $$ by itself to its own
          process ID.
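
          The three cases can be put next to each other in a short sketch. The value
          1000 and the variable names single, double and literal are chosen only for
          this illustration.

```shell
x=1000

single='My salary per month is $x'      # single quotes: no substitution
double="My salary per month is $x"      # double quotes: $x is replaced
literal="My salary per month is \$$x"   # \$ gives a literal dollar sign

echo "$single"
echo "$double"
echo "$literal"
```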

1.5.2 Types of variables

          By convention, shell variables are given uppercase names. Bash keeps a list
          of two types of variables:

         Global variables

          Global variables or environment variables are available in all shells. The env
          or printenv commands can be used to display environment variables.

         Local variables

          Local variables are only available in the current shell. Using the set built-in
          command without any options will display a list of all variables (including
          environment variables) and functions. The output will be sorted according to
          the current locale and displayed in a reusable format.

          A local variable is not available to a sub shell unless it is exported.


1.5.3 Exporting variables

          A variable created like the ones in the example above is only available to the
          current shell. It is a local variable. Child processes of the current shell will not
          be aware of this variable. In order to pass variables to a subshell, we need to
          export them using the export built-in command. Variables that are exported
          are referred to as environment variables. Setting and exporting is usually
          done in one step:

          export VARNAME="value"
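
          The effect of export can also be observed without opening an interactive
          sub shell. The sketch below is one possible way to do it, using bash -c to
          start a child shell that runs a single command; the bracket characters
          are only there to make an empty value visible.

```shell
unset full_name
full_name="Amrit Swarup"                     # local to this shell

# bash -c starts a child shell; the single quotes keep the parent
# from expanding $full_name before the child runs.
before=$(bash -c 'echo "[$full_name]"')      # child sees nothing

export full_name
after=$(bash -c 'echo "[$full_name]"')       # child now inherits the value

# A change made in the child does not travel back to the parent.
bash -c 'full_name="Charan Singh"'

echo "before export: $before"
echo "after export:  $after"
echo "parent still has: $full_name"
```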

          A subshell can change variables it inherited from the parent, but the changes
          made by the child don't affect the parent. This is demonstrated in the
          example:

           bash> full_name="Amrit Swarup"
           bash> bash
           bash> echo $full_name

           bash> exit
           bash> export full_name
           bash> bash
           bash> echo $full_name
           Amrit Swarup
           bash> export full_name="Charan Singh"
           bash> echo $full_name
           Charan Singh
           bash> exit
           bash> echo $full_name
           Amrit Swarup

1.5.4 Using Shell Variables
      A shell variable can hold a pathname, a command, or the output of a command
      obtained through command substitution. The following examples show how
      variables set in this way can then stand in for the values they hold.

         Setting the path name

           bash> x='/home/ganesh/father'
           bash> cd $x
           bash> pwd
           /home/ganesh/father
          Thus, a pathname can be stored in a variable, and the cd command can then
          use it to reach that directory again and again.
          NOTE: This is a useful everyday practice: long absolute pathnames can be
          stored in variables and reused without the trouble of memorizing or
          retyping them.




1.6   Command Substitution
      UNIX offers several ways to connect two commands. As we have seen, the
      standard output of one command can be connected to the standard input of
      another using pipelines or redirection.
      The shell also allows the argument of a command to be obtained from the
      output of another command; this feature is called command substitution.
      For example, suppose we need to print a string that reports the number of
      files in the directory:

      There are 24 files in the directory.

      So, how will you achieve this? The shell has this feature.

       bash> echo "There are `ls | wc -l` files in the directory."
       There are 24 files in the directory.

      So, you have substituted the command in the string, which then acts as an
      argument to the other command (echo), by placing the command between two
      backquotes (`), also called backticks. The backquote is a metacharacter that
      the shell looks at (we cover metacharacters ahead). The shell first executes
      the command enclosed in backquotes, and then replaces the enclosed command
      text with the output of the command.

      We saw earlier that $ is treated differently inside single and double
      quotes. Let's try the same with backquotes:
       bash> echo 'There are `ls | wc -l` files in the directory.'
       There are `ls | wc -l` files in the directory.



      So, backquotes are not interpreted by the shell if they are placed between
      single quotes.
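
      The following sketch repeats the experiment in a scratch directory holding
      exactly three files, so that the count is predictable. The last assignment
      uses the $( ) notation, an equivalent form of command substitution that is
      not used in this course material and is shown here only for comparison.

```shell
tmpdir=$(mktemp -d)
touch "$tmpdir/a" "$tmpdir/b" "$tmpdir/c"
cd "$tmpdir" || exit 1

msg_double="There are `ls | wc -l` files in the directory."
msg_single='There are `ls | wc -l` files in the directory.'
msg_modern="There are $(ls | wc -l) files in the directory."   # $( ) form

echo "$msg_double"
echo "$msg_single"
cd / && rm -rf "$tmpdir"
```

      Some versions of wc pad the number with leading blanks, so the spacing in the
      first message may vary from system to system.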


1.7   Pattern Matching – The Wild Cards
      While working with a UNIX system we often need to apply the same operation
      to a whole group of files. A typical case is listing all files whose names
      start with lesson:

      ls -l lesson01 lesson02 lesson03 ...

      This can also be written as:

      ls -l lesson*


      The * used here is a metacharacter: a special character that the shell
      recognises and expands according to its intended use. Let's now discuss
      the metacharacters and their attributes in some detail.

1.7.1 The * & ?

      The *, known as a metacharacter, is one of the characters of the shell's
      special set. This character matches any number of characters (including
      none). When the * is appended to the string lesson, the pattern lesson*
      matches filenames beginning with the string lesson, including the file lesson.
      It thus matches all the files specified in the previous command line. You can
      now use this pattern as an argument to ls:

       bash> ls -x lesson*
       lesson lesson01 lesson02 lesson03 lesson04 lesson05
       lessonA lesson.pl lesson.c lesson.cpp

      When the shell encounters this command line, it immediately identifies the *
      as a metacharacter. It then creates a list of files from the current directory that
      match this pattern. It reconstructs the command line as below:

       bash> ls -x lesson lesson01 lesson02 lesson03 lesson04
       lesson05 lessonA lesson.pl lesson.c lesson.cpp

      NOTE: Windows users may be surprised to know that the * may occur
      anywhere in a filename, and not merely at the end. Thus, *lesson* matches all
      the following filenames: lesson newlesson lesson03 lesson03.txt.

      The next metacharacter is the '?', which matches a single character. When used
      with the same string lesson (as lesson?), the shell matches all seven-character
      filenames beginning with lesson. Place another ? at the end of this string, and
      you have the pattern lesson??. Use both these expressions separately, and
      the meaning of the ? will be obvious:


       bash> ls -x lesson?
       lessonx lessony lessonz
       bash> ls -x lesson??
       lesson01 lesson02 lesson03 lesson04 lesson15 lesson16
       lesson17

      These metacharacters are also called wild cards (to depict something like a
      joker that can match any card). In the upcoming sessions we will take a look
      at other wild cards.
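
      Since it is the shell that expands the patterns, echo is enough to observe
      what each one matches. The sketch below builds a scratch directory with a
      handful of file names borrowed from the examples above.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch lesson lesson01 lesson02 lessonA newlesson

star=$(echo lesson*)        # lesson followed by anything, including nothing
one=$(echo lesson?)         # lesson plus exactly one character
two=$(echo lesson??)        # lesson plus exactly two characters
both=$(echo *lesson*)       # lesson anywhere in the name

echo "lesson*   -> $star"
echo "lesson?   -> $one"
echo "lesson??  -> $two"
echo "*lesson*  -> $both"
cd / && rm -rf "$tmpdir"
```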


1.8   The Character Class


      The patterns framed in the previous examples are not very restrictive or
      specific. If we want to list only lessonA and lessonZ from among all the
      lesson files, we cannot do so with the patterns studied so far. For this we
      need a character class for specific matching.

      The character class uses two more metacharacters, a pair of brackets [ ].
      You can have multiple characters inside this enclosure, but matching takes
      place for a single character in the class. For example, a single character
      that can take one of the values 1, 2 or 4 is represented by the expression:
      [124] Either 1, 2 or 4
      This can be combined with any string or another wild-card expression, so
      selecting the files lesson01, lesson02, lesson03, lesson04 becomes a simple
      matter:

       bash> ls -x lesson0[1234]
       lesson01 lesson02 lesson03 lesson04
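
      A character class also solves the lessonA and lessonZ problem posed at the
      start of this section. The sketch below tries both classes in a scratch
      directory; the set of files is made up for the demonstration.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch lesson01 lesson02 lesson03 lesson04 lesson05 lessonA lessonZ

picked=$(echo lesson0[124])     # one character out of 1, 2 or 4
letters=$(echo lesson[AZ])      # only lessonA and lessonZ

echo "lesson0[124] -> $picked"
echo "lesson[AZ]   -> $letters"
cd / && rm -rf "$tmpdir"
```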



1.9   Matching a dot (.)
      In UNIX file systems, there are many files whose names start with a dot (.).
      It is sometimes desirable to apply wild-card operations to these files as
      well. For example,

       bash> ls -x *
       lesson01 lesson02 ....

      This will not show the files starting with a dot. To match a dot at the
      start of a file name, the dot must be given literally in the pattern.

       bash> ls -x .*
       .exrc .encrc .profile

      A dot in the middle of a filename, however, needs no special treatment; the
      * matches any number of them.

       bash> ls -x my*c
       my_file.c my.c my.stored.c
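
      The behaviour can be confirmed in a scratch directory. Note one deviation
      from the text: the sketch uses the pattern .[!.]* instead of .* so that the
      special entries . and .. are never matched, whatever the shell version. The
      class [!.] means any single character except a dot.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch .profile .exrc my.c my.stored.c notes

plain=$(echo *)          # hidden files are skipped
hidden=$(echo .[!.]*)    # leading dot given literally
middle=$(echo my*c)      # dots inside the name need no special care

echo "*       -> $plain"
echo ".[!.]*  -> $hidden"
echo "my*c    -> $middle"
cd / && rm -rf "$tmpdir"
```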



      NOTE: Using * with rm




      Let's discuss a potential issue that every UNIX user faces at least once in
      his life: careless use of a very powerful command

      bash> rm *

      To remove all the files starting with lesson we can use the command

      bash> rm lesson*

      But with a bit of carelessness you can type

      bash> rm lesson *

      The stray space makes the shell pass two arguments to rm: lesson and *, which
      expands to every file in the directory. You have now removed everything
      beyond repair. Be ready for a scolding from the system administrator, and be
      careful while using this command.
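
      One cautious habit, suggested here as an addition to the text, is to put echo
      in front of the command first. The shell still expands the pattern, but the
      resulting command line is only printed, not executed. The file names in the
      sketch are invented.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch lesson01 lesson02 notes.txt

# echo shows what the shell would hand to rm, without deleting anything.
intended=$(echo rm lesson*)
mistyped=$(echo rm lesson *)

echo "$intended"
echo "$mistyped"
cd / && rm -rf "$tmpdir"
```

      The second line of output lists notes.txt as well, which makes the effect of
      the stray space obvious before any damage is done.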


1.10 Summing up
      The shell is a core component of the UNIX operating system. It interprets
      user commands and provides powerful features such as redirection, pipes
      and metacharacters.

      Bash is a shell that is compatible with the Bourne shell and incorporates
      many useful features from other shells. Its best-known features are powerful
      history support and command-line editing. In this course we use the Bash
      shell to explain the examples; in other shells the implementation is slightly
      different.



Self-check Questions
1. While a command is being executed the shell prompts the user for another
   command and puts that command in its priority queue. (True/False)
2. Shell is in __________________ (execution/sleep) mode while there is no
   command keyed in on the terminal and another command is running.
3. The redirection symbol '>' appends the redirected text to a file. (True/False)
4. Get the odd one out: The possible sources of standard input are:
   a. Pipe
   b. Keyboard
   c. Printer
   d. file




1.11 Answers to the Self-Check Questions


      1.   False
      2.   Sleep
      3.   False
      4.   (c)


1.12 Terminal Questions
      1. What is exporting a variable and why is it used?
      2. What is a metacharacter? Why do you need it?
      3. Explain the difference between pipes and redirection.




COE                                                                                                                Unit 2, Lesson 2




LESSON 2                      SHELL SCRIPTING AND DEBUGGING
2. SHELL SCRIPTING AND DEBUGGING

  2.0       OBJECTIVES
  2.1       INTRODUCTION
  2.2       CREATING AND RUNNING A SCRIPT
      2.2.1 myScript.sh
      2.2.2 Writing and naming
      2.2.3 Executing the Script
  2.3       SCRIPT BASICS
      2.3.1 Which shell will Run the Script?
      2.3.2 Adding comments
  2.4       DEBUGGING BASH SCRIPTS
      2.4.1 Debugging On the Entire Script
      2.4.2 Debugging On Part(s) Of the Script
  2.5       QUOTING
      2.5.1 Escape Character
      2.5.2 Single Quotes
      2.5.3 Double-Quotes
  2.6       SPECIAL VARIABLES
  2.7       SUMMING UP
  2.8       ANSWERS TO THE SELF-CHECK QUESTIONS
  2.9       TERMINAL QUESTIONS




                2. Shell Scripting and Debugging


To be able to write effective scripts, it is important to know the structure of a script
and also be able to debug it if required. Therefore it is important to understand these
concepts as they would form a base for subsequent chapters.



2.0       Objectives
          After going through this lesson, you will be able to:

         Write a simple script
         Define the shell type that should execute the script
         Put comments in a script
         Change permissions on a script
         Execute and debug a script


2.1       Introduction
          This chapter enables the student to write scripts of low complexity. Scripts
          also need debugging at times, and the methodology described in this chapter
          enables the student to debug them effectively.


2.2       Creating and running a script
2.2.1 myScript.sh

          In this example we use the echo Bash built-in to inform the user about what is
          going to happen, before the task that will create the output is executed.
          The script welcomes the user, gives the current date and time, lists the
          directory contents, searches for the text "Blue" in all files whose names
          start with "demo", and stores the result in the file searchResult.txt. For the
          scripts in this chapter we assume they are created in the following directory:
          ~/scripts




                                  myScript.sh
          #!/bin/bash
          echo ""
          echo "This is my first shell script."
          USERNAME=`whoami`
          echo "Welcome $USERNAME"
          echo ""
          CURRENT_TIME=`date +%T`
          CURRENT_DATE=`date +%D`
          echo "Date: $CURRENT_DATE      Time: $CURRENT_TIME"
          echo ""
          echo ""
          echo "Here are the files in your current directory."
          echo ""
          ls
          grep Blue demo* > searchResult.txt



2.2.2 Writing and naming

          To create a shell script:

         Open a new empty file in your editor (vi, vim, gvim, emacs, gedit, dtpad etc.).

         Put UNIX commands in the new empty file, like you would enter them on the
          command line. As discussed in the previous chapter, commands can be shell
          functions, shell built-ins, UNIX commands and other scripts.

         Give your script a sensible name that gives a hint about what the script does.
          Make sure that your script name does not conflict with existing commands. In
          order to ensure that no confusion can arise, script names often end in .sh; even
          so, there might be other scripts on your system with the same name as the
          one you chose.

          Check using which, whereis and other commands for finding information
           about programs and files:
           which -a script_name
           whereis script_name
           locate script_name

2.2.3 Executing the Script

          The script can run like any other command:




      The script should have execute permissions for the correct owners in order
      to be runnable.

       bash> chmod u+x myScript.sh

      Check that you really obtained the permissions that you want:

       bash> ls -l myScript.sh
       -rwxrw-r-- 1 salil     salil       456 Dec 24 17:11 myScript.sh

       bash> myScript.sh
       This is my first shell script.
       Welcome salil
       Date: 12/21/07      Time: 12:26:40
       Here are the files in your current directory.
       demo.txt
       demo2.txt
       demo3.txt
       lab
       myScript.sh
       newfile.txt
       output.txt
       update.ppt

      The above mentioned scheme is the most common way to execute a script. It
      is preferred to execute the script like this in a sub shell. The variables,
      functions and aliases created in this sub shell are only known to the particular
      bash session of that sub shell. When that shell exits and the parent shell
      regains control, everything is cleaned up.

      Remember to add the scripts directory to the contents of the PATH variable.
      PATH is essentially a colon-separated list of directories. When you execute a
      command, the shell searches through each of these directories, one by one,
      until it finds a directory where the executable exists. Use $HOME rather than
      ~ here, because the shell does not expand ~ inside double quotes.

      export PATH="$PATH:$HOME/scripts"

      If you did not put the scripts directory in your PATH, and the current directory
      is not in the PATH either, you need to specify the path of the script to run
      it. If it is in the current directory, run the script like this:
      ./script_name.sh
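
      Both ways of starting a script can be tried in a throwaway directory. In the
      sketch below the directory created by mktemp stands in for ~/scripts, and
      hello_demo.sh is a hypothetical script name. It assumes that /bin/bash exists
      and that files in the temporary directory are allowed to be executed.

```shell
tmpdir=$(mktemp -d)                 # stand-in for ~/scripts
script="$tmpdir/hello_demo.sh"      # hypothetical script name

# Write a two-line script and make it executable for its owner.
printf '%s\n' '#!/bin/bash' 'echo "hello from script"' > "$script"
chmod u+x "$script"

by_path=$("$script")                # explicit path: PATH is not consulted

old_path=$PATH
PATH="$PATH:$tmpdir"                # directory appended to the search list
by_name=$(hello_demo.sh)            # now found through PATH
PATH=$old_path

echo "by path: $by_path"
echo "by name: $by_name"
rm -rf "$tmpdir"
```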

      A script can also explicitly be executed by a given shell, but generally we only
      do this if we want to obtain special behavior, such as checking if the script
      works with another shell or printing traces for debugging:

      rbash script_name.sh


      sh script_name.sh
      bash -x script_name.sh

      The specified shell will start as a sub shell of your current shell and
      execute the script. This is done when you want the script to start up with
      specific options or under specific conditions which are not specified in the
      script.

      If you don't want to start a new shell but execute the script in the current shell,
      you source it:

      source script_name.sh

      The script does not need execute permission in this case. Commands are
      executed in the current shell context, so any changes made to your
      environment will be available when the script finishes execution.
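
      The difference between the two ways of running a script can be seen with
      a one-line script. This is a minimal sketch; DEMO_VAR and the temporary
      file are hypothetical and exist only for the demonstration.

```shell
#!/bin/bash
# Sketch: run a script in a sub shell, then source it, and compare the result.
# DEMO_VAR and demo_script are hypothetical names for illustration.
demo_script=$(mktemp)
echo 'DEMO_VAR="set by script"' > "$demo_script"

# 1. Run in a sub shell: the assignment is lost when the sub shell exits.
unset DEMO_VAR
bash "$demo_script"
after_subshell="${DEMO_VAR:-unset}"
echo "after sub shell: $after_subshell"

# 2. Source it: the assignment is made in the current shell and persists.
source "$demo_script"
after_source="${DEMO_VAR:-unset}"
echo "after source:    $after_source"

rm -f "$demo_script"
```

      The temporary file has no execute permission, which is sufficient for
      both forms used here because the file is read by a shell rather than
      started directly.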


2.3   Script Basics
2.3.1 Which shell will Run the Script?

      When running a script in a sub shell, you should define which shell should run
      the script. Consider, for example, that your login shell may be the C shell
      while your script contains bash commands. The shell type in which you
      wrote the script might not be the default on your system, so commands you
      entered might result in errors when executed by the wrong shell.

      The first line of the script determines the shell in which the script will run. The
      first two characters of the first line should be #!, then follows the path to the
      shell that should interpret the commands that follow. Blank lines are also
      considered to be lines, so don't start your script with an empty line.

      For the purpose of this course, all scripts will start with the line
      #!/bin/bash

2.3.2 Adding comments

      It is a good practice to add comments into your scripts. Comments help in
      future when you will need to enhance or fix the script. Comments also make
      the scripts more readable.




                         commented_script1.sh
       #!/bin/bash
       # This script clears the terminal, displays a greeting and gives
       # information about currently connected users. The current directory
       # contents are displayed too.

       clear                           # clear terminal window

       echo "The script starts now."

       echo "Hi, $USER!"               # dollar sign is used to get content of variable
       echo

       echo "List of connected users:"
       echo
       w                               # show who is logged on
       echo

       echo "Displaying the contents of this directory"
       ls                              # list the contents of this directory

      The first line of the script determines the shell to start, Bash in this
      case. The lines beginning with a hash mark are comments: everything the
      shell encounters after a hash mark on a line is ignored.

      Usually, the first few lines of a script should state its purpose. Further
      comments should then be placed in the body of the code.


2.4   Debugging Bash Scripts
2.4.1 Debugging On the Entire Script

      Bash provides extensive debugging features. The most common is to start up
      the sub shell with the -x option, which will run the entire script in debug mode.
      Traces of each command plus its arguments are printed to standard error
      after the commands have been expanded but before they are executed.

      Following is the commented_script1.sh script run in debug mode. Note again
      that the added comments are not visible in the output of the script.








        bash> bash -x commented_script1.sh
        + clear

        + echo 'The script starts now.'
        The script starts now.
        + echo 'Hi, salil!'
        Hi, salil!
        + echo

        + echo 'List of connected users:'
        List of connected users:
        + echo

        + w
          4:50pm up 18 days, 6:49, 4 users, load average: 0.58, 0.62, 0.40
        USER     TTY      FROM   LOGIN@   IDLE   JCPU   PCPU   WHAT
        root     tty2     -      Sat 2pm  5:36m  0.24s  0.05s  -bash
        salil    :0       -      Sat 2pm  ?      0.00s  ?      -
        salil    pts/2    -      Sat 2pm  43:13  0.13s  0.06s  /usr/bin/screen

        + echo

        + echo 'Displaying the contents of this directory'
        Displaying the contents of this directory
        + ls
        demo1.txt demo2.txt myScript.sh

2.4.2 Debugging On Part(s) Of the Script

      Using the set Bash built-in you can run in normal mode those portions of the
      script of which you are sure they are without fault, and display debugging
      information only for troublesome zones.

      Say we are not sure what the w command will do in the example
      commented_script1.sh, then we could enclose it in the script like this:


        set -x          # activate debugging from here

        w

        set +x          # stop debugging from here







      Output then looks like this:
       bash> script1.sh
       The script starts now.
       Hi, salil!

       List of connected users:

       + w
       5:00pm up 18 days, 7:00, 4 users, load average: 0.79, 0.39, 0.33
       USER     TTY      FROM   LOGIN@   IDLE   JCPU   PCPU   WHAT
       root     tty2     -      Sat 2pm  5:47m  0.24s  0.05s  -bash
       salil    :0       -      Sat 2pm  ?      0.00s  ?      -
       salil    pts/2    -      Sat 2pm  54:02  0.13s  0.06s  /usr/bin/screen
       + set +x

       Displaying the contents of this directory
       demo1.txt demo2.txt myScript.sh

       bash>
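
      Partial tracing can also be checked without the w command. The sketch
      below wraps three echo commands in a function with the hypothetical name
      trace_demo and stores the trace separately from the normal output. It
      relies on the trace being written to standard error, which is the Bash
      default.

```shell
#!/bin/bash
# Sketch: trace only one region of a script and capture the trace separately.
# trace_demo is a hypothetical function name used for this illustration.
trace_demo() {
    echo "not traced"
    set -x              # activate debugging from here
    echo "traced"
    set +x              # stop debugging from here
    echo "not traced again"
}

# Trace lines go to standard error, so they are redirected into a file
# while the normal output is captured by the command substitution.
trace_file=$(mktemp)
normal_output=$(trace_demo 2> "$trace_file")
trace_output=$(cat "$trace_file")
rm -f "$trace_file"

echo "--- standard output ---"
echo "$normal_output"
echo "--- trace ---"
echo "$trace_output"
```

      Only the echo between set -x and set +x, and the set +x command itself,
      show up in the trace; the other two commands run silently in normal mode.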




      The table below gives an overview of other useful Bash options:

      Table – Overview of set debugging options

       Short notation   Long notation    Result
       set -f           set -o noglob    Disable file name generation using
                                         metacharacters (globbing).
       set -v           set -o verbose   Print shell input lines as they are read.
       set -x           set -o xtrace    Print command traces before executing
                                         command.

      The dash is used to activate a shell option and a plus to deactivate it.
      In the example below, we demonstrate these options on the command line:

       bash> set -v

       bash> ls
       ls
       commented-scripts.sh               script1.sh

       bash> set +v
       set +v

       bash> ls *
       commented-scripts.sh               script1.sh

       bash> set -f

       bash> ls *
       ls: *: No such file or directory

       bash> touch *

       bash> ls
          * commented-scripts.sh             script1.sh

       bash> rm *

       bash> ls
       commented-scripts.sh script1.sh

      Alternatively, these modes can be specified in the script itself, by adding the
      desired options to the first line shell declaration. Options can be combined, as
      is usually the case with UNIX commands:

      #!/bin/bash -xv
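
      The effect of the noglob option can be reproduced in a script as well.
      The following sketch works in a temporary directory created with mktemp,
      so no existing files are touched; work_dir and the two file names are
      hypothetical.

```shell
#!/bin/bash
# Sketch: effect of set -f (noglob) on the pattern *.
# Everything happens in a throwaway directory created with mktemp.
work_dir=$(mktemp -d)
cd "$work_dir" || exit 1
touch script1.sh script2.sh

glob_on=$(echo *)      # globbing enabled: * expands to the file names
set -f
glob_off=$(echo *)     # globbing disabled: * is passed on literally
set +f

echo "with globbing:    $glob_on"
echo "without globbing: $glob_off"

cd / && rm -rf "$work_dir"
```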




      Once you have found the buggy part of your script, you can add echo statements
      before each command of which you are unsure, so that you will see exactly
      where and why things don't work. In the example commented_script1.sh
      script, it could be done like this, still assuming that the displaying of users
      gives us problems:

       echo "debug message: now attempting to start w command"; w

      In more advanced scripts, the echo can be inserted to display the content of
      variables at different stages in the script, so that flaws can be detected:

       echo "Variable VARNAME is now set to $VARNAME."








2.5      Quoting
         Quoting is used to remove the special meaning of certain characters or words
         to the shell. Quoting can be used to disable special treatment for special
         characters (to preserve their literal meaning), to prevent reserved words from
         being recognized as such, and to prevent parameter expansion. The
         application should quote the following characters if they are to represent
         themselves:
         | & ; < > ( ) $ ` \ " ' <space> <tab> <newline>

         There are three quoting mechanisms:

      1. The escape character
      2. Single quotes
      3. Double quotes

2.5.1 Escape Character

         A non-quoted backslash '\' is the Bash escape character. It preserves the
         literal value of the next character that follows, with the exception of newline. If
         a \newline pair appears, and the backslash itself is not quoted, the \newline is
         treated as a line continuation (that is, it is removed from the input stream and
         effectively ignored).

          bash> date=26122007

          bash> echo $date
          26122007

          bash> echo \$date
          $date

         Variable date is created and set to hold a value. The first echo displays
         the value of the variable, but for the second, the dollar sign is escaped.




         The following script shows the effect of backslash on newline:
                                   escape.sh
          #!/bin/bash

          echo "Statement 1: This will print
          as two lines."

          echo "Statement 2: This will print \
          as one line."








      On running this script:

       bash> escape.sh
       Statement 1: This will print
       as two lines.
       Statement 2: This will print as one line.

2.5.2 Single Quotes

      Enclosing characters in single quotes (' ') preserves the literal value of each
      character within the quotes. A single quote may not occur between single
      quotes, even when preceded by a backslash.
      Example:
       bash> echo '$date'
       $date

2.5.3 Double-Quotes

      Enclosing characters in double quotes (" ") preserves the literal value of
      all characters within the double quotes, with the exception of the dollar
      sign '$', the backquote '`' and the backslash '\'.
      The characters '$' and '`' retain their special meaning within double quotes.
      The backslash retains its special meaning only when followed by one of the
      following characters: '$', '`', '"', '\', or newline.

       bash> echo "$date"
       26122007

       bash> echo "`date`"
       Sun Apr 20 11:22:06 CEST 2003

       bash> echo "I'd say: \"Go for it.\""
       I'd say: "Go for it."

       bash> echo "In DOS directories are separated by \ character"
       In DOS directories are separated by \ character
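
      The three quoting mechanisms can be compared side by side. The sketch
      below applies each of them to the same variable; the variable names other
      than date are hypothetical and chosen for this illustration.

```shell
#!/bin/bash
# Sketch: the three quoting mechanisms applied to the same variable.
date=26122007

escaped=$(echo \$date)                  # backslash: the dollar sign stays literal
single=$(echo '$date')                  # single quotes: every character is literal
double=$(echo "$date")                  # double quotes: $ is still expanded
inner=$(echo "I'd say: \"Go for it\"")  # \" yields a literal double quote
continued="one \
line"                                   # backslash-newline is removed

echo "escaped:   $escaped"
echo "single:    $single"
echo "double:    $double"
echo "inner:     $inner"
echo "continued: $continued"
```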








2.6   Special Variables
      There are some variables which are set internally by the shell and which are
      available to the user. The following table lists some of them:

             Variable   Definition
             $0         Expands to the name of the shell script or command
                        currently being executed, or the name of the shell.
             $1         Positional parameter #1. Similarly for $2, $3 ... $9.
                        For 10 use ${10}.
             $*         Expands to the positional parameters, starting from
                        one ($1). When the expansion occurs within double
                        quotes, it expands to a single word with the value of
                        each parameter separated by the first character of
                        the IFS special variable (refer to the note below).
             $@         Expands to the positional parameters, starting from
                        one ($1). When the expansion occurs within double
                        quotes, each parameter expands to a separate word.
             $#         Expands to the total number of positional
                        parameters in decimal.
             $?         The exit status of the last command executed,
                        given as a decimal string.
             $-         Flags passed to script (using set).
             $$         Expands to the process ID of the shell.
             $!         Expands to the process ID of the most recently
                        executed background command.

      Note: $IFS or the internal field separator is a variable which determines how
      Bash recognizes fields, or word boundaries, when it interprets character
      strings. $IFS defaults to whitespace.
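
      The difference between "$*" and "$@" is easiest to see by counting the
      words each one produces. This is a minimal sketch; count_args is a
      hypothetical helper that simply reports how many arguments it received.

```shell
#!/bin/bash
# Sketch: how "$*" and "$@" differ inside double quotes.
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

set -- "first arg" second third    # set three positional parameters

star_count=$(count_args "$*")      # one word: all parameters joined together
at_count=$(count_args "$@")        # three words: one per parameter

# "$*" joins the parameters with the first character of IFS.
# The command substitution keeps the IFS change away from the rest of the script.
joined=$(IFS=,; echo "$*")

echo "\"\$*\" gives $star_count word(s)"
echo "\"\$@\" gives $at_count word(s)"
echo "joined with IFS set to a comma: $joined"
```

      Because "$@" keeps each parameter intact, including the one that contains
      a space, it is usually the safer choice when arguments are passed on to
      another command.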

      A positional parameter is a variable within a shell script whose value is set
      from an argument specified on the command line that invokes the script.
      Positional parameters are numbered and are referred to with a preceding "$":
      $1, $2, $3, and so on. A shell program may reference the first nine directly
      as $1 to $9; from the tenth onwards braces are required, as in ${10}. If a
      shell program is invoked with a command line that appears like this:
      my_script.sh pp1 pp2 pp3 pp4 pp5 pp6 pp7 pp8 pp9

      then positional parameter $1 within the script is assigned the value pp1,
      positional parameter $2 is assigned the value pp2, and so on, at the time the
      shell script is invoked.







       #!/bin/bash

       # positional.sh
       # This script reads 3 positional parameters and prints them out.

       PAR1="$1"
       PAR2="$2"
       PAR3="$3"

       echo "$1 is the first positional parameter, \$1."
       echo "$2 is the second positional parameter, \$2."
       echo "$3 is the third positional parameter, \$3."
       echo
       echo "The total number of positional parameters is $#."

      Upon execution one could give any number of arguments:

       bash> positional.sh one two three four five
       one is the first positional parameter, $1.
       two is the second positional parameter, $2.
       three is the third positional parameter, $3.

       The total number of positional parameters is 5.

       bash> positional.sh one two
       one is the first positional parameter, $1.
       two is the second positional parameter, $2.
        is the third positional parameter, $3.

       The total number of positional parameters is 2.

      In the second run $3 is empty, so nothing is printed in its place.
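
      Parameters beyond the ninth need braces. The sketch below sets ten
      single-letter parameters with set -- (the letters are arbitrary sample
      values) and shows what happens with and without the braces.

```shell
#!/bin/bash
# Sketch: why the tenth positional parameter needs braces.
set -- a b c d e f g h i j    # ten sample parameters

without_braces="$10"     # read as $1 followed by a literal 0
with_braces="${10}"      # the tenth positional parameter

echo "\$10   expands to $without_braces"
echo "\${10} expands to $with_braces"

# shift discards $1 and renumbers the rest, another way to reach later ones.
shift
after_shift="$1"
echo "after shift, \$1 is $after_shift and \$# is $#"
```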



      When a UNIX command runs, it can return a numeric exit status value to the
      process that called (started) it. The status can tell the calling process whether
      the command succeeded or failed. Many (but not all) UNIX commands return
      a status of zero if everything was okay or non-zero (1, 2, etc.) if something
      went wrong. A few commands, like grep and diff, return a different non-zero
      status for different kinds of problems. See your online manual pages to find
      out.
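
      As a small illustration of different non-zero statuses, the following
      sketch runs grep three times against a temporary sample file and records
      $? after each run. The file contents and variable names are hypothetical.

```shell
#!/bin/bash
# Sketch: grep reports three different outcomes through its exit status.
sample=$(mktemp)
echo "My favourite colour is Blue." > "$sample"

grep -q "Blue" "$sample"
status_match=$?                 # 0: a matching line was found

grep -q "Green" "$sample"
status_nomatch=$?               # 1: no line matched

grep -q "Blue" "$sample.missing" 2> /dev/null
status_error=$?                 # greater than 1: an error, here a missing file

echo "match: $status_match, no match: $status_nomatch, error: $status_error"
rm -f "$sample"
```

      Note that $? must be read immediately after the command of interest,
      because every further command overwrites it.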







      More examples:

      bash> grep dictionary /usr/share/dict/words
      dictionary

      bash> echo $$
      10662

      bash> mozilla &
      [1] 11064

      bash> echo $!
      11064

      bash> echo $0
      bash

      bash> echo $?
      0

      bash> ls abc
      ls: abc: No such file or directory

      bash> echo $?
      1

      User rahul starts by entering the grep command. The process ID of his shell
      is 10662. After putting a job in the background, $! holds the process ID of
      the backgrounded job. The shell running is bash. When a mistake is made, $?
      holds an exit status different from 0 (zero); otherwise the status is 0.
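
      The same variables can be used inside a script. The sketch below starts
      sleep as a stand-in for a longer background job, remembers its process ID
      from $!, and collects its exit status with the wait built-in.

```shell
#!/bin/bash
# Sketch: $$ is this shell's process ID, $! the ID of the last background job.
sleep 1 &
bg_pid=$!

echo "shell PID:      $$"
echo "background PID: $bg_pid"

# wait blocks until that job has finished and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "background job exit status: $bg_status"
```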

      The following script shows the use of the "$*" special variable:

                          spl_var_eg.sh
      #!/bin/bash
      echo "My Process ID is: $$"
      echo "The number of Arguments is $#"
      echo "The Arguments are $*"
      grep "$1" "$2"
      echo "Job Over"

      Upon execution:
      bash> spl_var_eg.sh Blue demo1.txt
      My Process ID is: 23465
      The number of Arguments is 2
      The Arguments are Blue demo1.txt
      My favourite colour is Blue.

      Job Over








2.7        Summing Up
           A shell script is a reusable series of commands put in an executable text file.
           Any text editor can be used to write scripts.

           Scripts start with #! followed by the path to the shell executing the commands
           from the script. Comments are added to a script for your own future reference,
           and also to make it understandable for other users. It is better to have too
           many explanations than not enough.

           Debugging a script can be done using shell options. Shell options can be
           used for partial debugging or for analyzing the entire script. Inserting echo
           commands at strategic locations is also a common troubleshooting technique.



Self-check Questions
1. What do you need to add to the first line of the script to indicate Bash shell?
2. Why are comments needed and how do you add them?
3. What happens when a script is executed with the "bash -x" option?




2.8        Answers to the Self-Check questions
      1. #!/bin/bash
      2. Comments explain the script to the reader and make it
         comprehensible. A comment is added in the format: # <the comment>
      3. It will run the entire script in debug mode.


2.9        Terminal Questions
      1.   What are the different steps for creating a shell script?
      2.   How would you debug a part of the script?
      3.   What are the different shell debugging options?
      4.   Why is Quoting used? Give examples.




COE                                 Unit 2, Lesson 3




LESSON 3   CONDITIONAL STATEMENTS




                       3. Conditional statements


One of the advanced concepts, conditional statements are very frequently used in
scripts. A clear understanding of this concept is very important.



3.0       Objectives
          After going through this lesson, you will learn about:

         The if statement
         Using the exit status of a command
         Comparing and testing input and files
         If-then-else constructs
         If-then-elif-else constructs
         Using and testing the positional parameters
         Nested if statements
         Using case statements


3.1       Introduction
          This chapter introduces the use of conditionals in Bash scripts. This would
          enable the student to write scripts that are more powerful and cater to
          different conditions.


3.2       Introduction to if
3.2.1 General

          At times you need to specify different courses of action to be taken in a shell
          script, depending on the success or failure of a command. The if construction
          allows you to specify such conditions.

          The most compact syntax of the if command is:

          if TEST-COMMANDS; then CONSEQUENT-COMMANDS; fi

          Example: Checking shell options

            # These lines will print a message if the noclobber option is set

            if [ -o noclobber ]
            then
                    echo "Your files are protected against accidental overwriting using redirection."
            fi

          The TEST-COMMANDS list is executed, and if its return status is zero, the
          CONSEQUENT-COMMANDS list is executed. The return status is the exit
          status of the last command executed, or zero if no condition tested true.

          The TEST-COMMANDS list often involves numerical or string comparison tests,
          but it can also be any command that returns a status of zero when it succeeds
          and some other status when it fails. Unary expressions are often used to
          examine the status of a file. If the FILE argument to one of the primaries is of
          the form /dev/fd/N, then file descriptor "N" is checked. stdin, stdout and stderr
          and their respective file descriptors may also be used for tests.
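
          Since any command can act as the test, grep can be placed directly
          after if. The following is only a sketch; check_word is a hypothetical
          helper that searches its second argument for the word given first.

```shell
#!/bin/bash
# Sketch: any command can serve as the test of an if statement.
# check_word is a hypothetical helper; if acts on the exit status of grep.
check_word() {
    answer="missing"
    if echo "$2" | grep -q "$1"; then
        answer="found"
    fi
    echo "$answer"
}

result_hit=$(check_word Blue "My favourite colour is Blue.")
result_miss=$(check_word Green "My favourite colour is Blue.")

echo "Blue:  $result_hit"
echo "Green: $result_miss"
```

          No square brackets are needed here, because if only looks at the exit
          status of the last command in the test list.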

         Expressions used with if

          The table below contains an overview of the so-called "primaries" that make
          up the TEST-COMMANDS command or list of commands. These primaries are
          put between square brackets to indicate the test of a conditional expression.

          Table – Primary expressions

                  Primary                       Meaning
                  [ -a FILE ]                   True if FILE exists.
                  [ -o OPTIONNAME ]             True if shell option "OPTIONNAME" is enabled.
                  [ -z STRING ]                 True if the length of "STRING" is zero.
                  [ -n STRING ] or [ STRING ]   True if the length of "STRING" is non-zero.
                  [ STRING1 == STRING2 ]        True if the strings are equal. "=" may be
                                                used instead of "==" for strict POSIX
                                                compliance.
                  [ STRING1 != STRING2 ]        True if the strings are not equal.
                  [ STRING1 \< STRING2 ]        True if "STRING1" sorts before "STRING2"
                                                lexicographically. The backslash keeps the
                                                shell from treating < as a redirection.
                  [ STRING1 \> STRING2 ]        True if "STRING1" sorts after "STRING2"
                                                lexicographically. The backslash keeps the
                                                shell from treating > as a redirection.
                  [ ARG1 OP ARG2 ]              "OP" is one of -eq, -ne, -lt, -le, -gt or -ge.
                                                These arithmetic binary operators return
                                                true if "ARG1" is equal to, not equal to,
                                                less than, less than or equal to, greater
                                                than, or greater than or equal to "ARG2".
                                                "ARG1" and "ARG2" are integers.
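
          A few of these primaries can be tried directly. In the sketch below
          each test is followed by an assignment that stores its exit status, so
          0 means true and 1 means false; the variable names and sample values
          are hypothetical.

```shell
#!/bin/bash
# Sketch: a few primaries from the table, each reduced to its exit status.
empty=""
name="salil"

[ -z "$empty" ];          z_status=$?    # 0: length is zero
[ -n "$name" ];           n_status=$?    # 0: length is non-zero
[ "$name" = "salil" ];    eq_status=$?   # 0: strings are equal
[ "$name" != "root" ];    ne_status=$?   # 0: strings differ
[ "apple" \< "banana" ];  lt_status=$?   # 0: sorts before (note the backslash)
[ 10 -gt 9 ];             int_status=$?  # 0: compared as integers
[ 10 \> 9 ];              str_status=$?  # 1: compared as strings, "10" sorts before "9"

echo "-z: $z_status  -n: $n_status  =: $eq_status  !=: $ne_status"
echo "<: $lt_status  -gt: $int_status  > on numbers: $str_status"
```

          The last two lines show why -gt and its relatives should be used for
          numbers: the string operators compare character by character.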





          Expressions may be combined using the following operators, listed in
          decreasing order of precedence:


          Table – Combining expressions

                  Operation              Effect
                  [ ! EXPR ]             True if EXPR is false.
                  [ \( EXPR \) ]         Returns the value of EXPR. This may be used to
                                         override the normal precedence of operators.
                                         The parentheses are escaped so that the shell
                                         passes them on to the test.
                  [ EXPR1 -a EXPR2 ]     True if both EXPR1 and EXPR2 are true.
                  [ EXPR1 -o EXPR2 ]     True if either EXPR1 or EXPR2 is true.


          The [ (or test) built-in evaluates conditional expressions using a set of rules
          based on the number of arguments. More information about this subject can
          be found in the Bash documentation. Just as the if is closed with fi, the
          opening square bracket should be closed after the conditions have been
          listed.
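
          The combining operators can be tried on a single number. This is a
          minimal sketch with a hypothetical variable named value; the last line
          shows an alternative spelling that chains two separate tests with &&
          instead of using -a.

```shell
#!/bin/bash
# Sketch: combining tests with !, -a and -o, each reduced to its exit status.
value=42

[ "$value" -ge 1 -a "$value" -le 100 ];      and_status=$?      # 0: both are true
[ "$value" -lt 1 -o "$value" -gt 100 ];      or_status=$?       # 1: neither is true
[ ! "$value" -eq 0 ];                        not_status=$?      # 0: value is not 0
[ "$value" -ge 1 ] && [ "$value" -le 100 ];  chained_status=$?  # 0: same as -a

echo "-a: $and_status  -o: $or_status  !: $not_status  &&: $chained_status"
```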

         Commands following the then statement

          The CONSEQUENT-COMMANDS list that follows the then statement can be
          any valid UNIX command, any executable program, any executable shell
          script or any shell statement, with the exception of the closing fi. It is important
          to remember that the then and fi are considered to be separate statements in
          the shell.
          Therefore, when issued on the command line, they are separated by a
          semicolon.
          In a script, the different parts of the if statement are usually well separated.
          Below are a couple of simple examples.

         Checking files

          The first example checks for the existence of a file:








                                      filecheck.sh
            #!/bin/bash

            echo "This script checks the existence of the demo file."
            echo "Checking..."
            if [ -f /usr/guest/demo.txt ]
            then
                   echo "/usr/guest/demo.txt file exists."
            fi
            echo
            echo "...done."

            bash> ./filecheck.sh
            This script checks the existence of the demo file.
            Checking...
            /usr/guest/demo.txt file exists.

            ...done.




3.2.2 Simple applications of if

         Testing exit status

           bash> if [ $? -eq 0 ]
           > then echo 'That was a good job!'
           > fi
           That was a good job!

           bash>



         Numeric comparisons

           bash> num=`wc -l < demo1.txt`

           bash> echo $num
           201

           bash> if [ "$num" -gt "150" ]
           > then echo ; echo "This is a big file."
           > echo ; fi

           This is a big file.

           bash>
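
          For a numeric test the variable must hold the number alone. The sketch
          below uses a three-line temporary file (a hypothetical stand-in for
          demo1.txt) to show that wc -l prints the file name as well when the
          file is given as an argument, but not when it reads standard input.

```shell
#!/bin/bash
# Sketch: reading a line count that can be used in a numeric comparison.
demo_file=$(mktemp)
printf 'one\ntwo\nthree\n' > "$demo_file"

with_name=$(wc -l "$demo_file")     # the count followed by the file name
count_only=$(wc -l < "$demo_file")  # the count alone, usable with -gt

size="small"
if [ "$count_only" -gt 2 ]; then
    size="big"
fi

echo "wc -l FILE   prints: $with_name"
echo "wc -l < FILE prints: $count_only"
echo "The file is $size."

rm -f "$demo_file"
```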







         String comparisons


              dir=`pwd`                 # /tmp/proc
              updir=`dirname "$dir"`    # /tmp
              if [ "${updir}X" != "/tmpX" ]; then
                          echo "You need to be in a subdirectory of /tmp."
                          exit 1
              fi
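
          The two commands for taking a path apart are easily confused, so a
          short sketch may help. It uses the fixed sample path /tmp/proc in place
          of the output of pwd; the variable names are hypothetical.

```shell
#!/bin/bash
# Sketch: dirname and basename split a path into its two parts.
dir=/tmp/proc                 # stands in for the output of pwd

parent=$(dirname "$dir")      # everything before the last slash
leaf=$(basename "$dir")       # the last path component

verdict="inside /tmp"
if [ "$parent" != "/tmp" ]; then
    verdict="outside /tmp"
fi

echo "parent:  $parent"
echo "leaf:    $leaf"
echo "verdict: $verdict"
```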




3.3       More advanced if usage

3.3.1 if-then-else constructs

          Like the CONSEQUENT-COMMANDS list following the then statement, the
          ALTERNATE-CONSEQUENT-COMMANDS list following the else statement
          can hold any UNIX-style command that returns an exit status.

            Example 1

                                         fun_weigh.sh
              #!/bin/bash

              # This script prints a message about your weight if you give it your
              # weight in kilos and height in centimeters.

              weight="$1"
              height="$2"
              idealweight=$((height - 110))

              if [ "$weight" -le "$idealweight" ] ; then
                     echo "You should eat a bit more fat."
              else
                    echo "You should eat a bit more fruit."
              fi

          On executing the script we get:

              bash> bash -x fun_weigh.sh 55 169
              + weight=55
              + height=169
              + idealweight=59
              + '[' 55 -le 59 ']'
              + echo 'You should eat a bit more fat.'
              You should eat a bit more fat.

            Example 2



      Testing the number of arguments - The previous script is modified so that it
      prints a message if more or fewer than 2 arguments are given:

                                          fun_weigh.sh
       #!/bin/bash

       # This script prints a message about your weight if you give it your
       # weight in kilos and height in centimeters.

       if [ $# -ne 2 ]; then
             echo "Usage: $0 weight_in_kilos length_in_centimeters"
             exit
       fi

       weight="$1"
       height="$2"
       idealweight=$((height - 110))

       if [ "$weight" -le "$idealweight" ] ; then
             echo "You should eat a bit more fat."
       else
             echo "You should eat a bit more fruit."
       fi

       bash> ./fun_weigh.sh 70 150
       You should eat a bit more fruit.

       bash> ./fun_weigh.sh 70 150 33
       Usage: ./fun_weigh.sh weight_in_kilos length_in_centimeters

      The first argument is referred to as $1, the second as $2 and so on. The total
      number of arguments is stored in $#.

3.3.2 if-then-elif-else constructs

      This is the full form of the if statement:

      if TEST-COMMANDS; then
          CONSEQUENT-COMMANDS;
      elif MORE-TEST-COMMANDS; then
          MORE-CONSEQUENT-COMMANDS;
      else ALTERNATE-CONSEQUENT-COMMANDS;
      fi








                                    testleap.sh

       #!/bin/bash
       # This script will test if we're in a leap year or not.

       year=`date +%Y`

       if [ $((year % 400)) -eq 0 ]; then
             echo "This is a leap year. February has 29 days."
       elif [ $((year % 4)) -eq 0 ]; then
                if [ $((year % 100)) -ne 0 ]; then
                     echo "This is a leap year. February has 29 days."
                else
                     echo "This is not a leap year. February has 28 days."
                fi
       else
             echo "This is not a leap year. February has 28 days."
       fi

      Also note the nested ifs here. You may use as many levels of nested ifs as
      you can logically manage.

       bash> date
       Fri Dec 21 17:14:28 IST 2007

       bash> testleap.sh
       This is not a leap year. February has 28 days.

3.3.3 Returning the exit status using if

      Sometimes you test for a condition and find that it fails. In that case you
      would rather have the program terminate, since there is no point in continuing
      if an essential resource is missing, say the file you want to search. The exit
      statement is used to terminate a program prematurely.

      The exit statement takes an optional argument. This argument is the integer
      exit status code, which is passed back to the parent and stored in the $?
      variable.

       #!/bin/bash
         if [ $# -ne 2 ]; then
            echo "Usage: $0 <file1> <file2>";
            exit 2
         fi
         ...<rest of script>


      In this example, if the number of arguments is not 2, a usage message is
      printed and the script exits with status 2.
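
      The way the exit status reaches the caller through $? can be sketched as
      follows. The function check_args is hypothetical; its body is written in
      parentheses so that it runs in a subshell, which lets exit end only that
      subshell instead of the shell that calls it.

```shell
#!/bin/bash
# check_args is a hypothetical helper. The "( ... )" body is a subshell,
# so "exit" here behaves like "exit" in a separate script.
check_args() (
    if [ $# -ne 2 ]; then
        echo "Usage: check_args <file1> <file2>" >&2
        exit 2
    fi
    exit 0
)

# On the right-hand side of "||", $? still holds the status of the failed call.
check_args only_one 2>/dev/null || echo "failed with exit status $?"
check_args file1 file2 && echo "succeeded with exit status $?"
```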








3.4   Using case statements
      Nested if statements work, but as soon as you are confronted with more than a
      couple of possible actions to take, they tend to become confusing. For more
      complex conditionals, use the case syntax:

      case EXPRESSION in
          CASE1) COMMAND-LIST;;
          CASE2) COMMAND-LIST;;
          ...
          CASEN) COMMAND-LIST;;
      esac

      Each case is a pattern against which EXPRESSION is matched. The commands in
      the COMMAND-LIST for the first match are executed. The "|" symbol may be
      used for separating multiple patterns, and the ")" operator terminates a pattern
      list. Each case plus its corresponding commands is called a clause. Each clause
      must be terminated with ";;". Each case statement is ended with the esac
      statement.
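
      A minimal sketch of the pattern syntax, using a hypothetical function
      answer_type. It shows several patterns joined with "|", a pattern containing
      a bracket expression, and the catch-all pattern "*" as the last clause.

```shell
#!/bin/bash
# answer_type is a hypothetical example function.
answer_type() {
    case "$1" in
        y|Y|yes|YES) echo "affirmative" ;;
        n|N|no|NO)   echo "negative" ;;
        [0-9]*)      echo "starts with a digit" ;;
        *)           echo "unknown" ;;
    esac
}

answer_type yes
answer_type N
answer_type 42
answer_type maybe
```

      Because only the first matching clause is executed, the catch-all "*" must
      come last.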

      The following example, disk_utility.sh, uses case to report disk usage.

                              disk_utility.sh
       #!/bin/bash

       # The -e option makes Bash's echo interpret the \n and \c escapes.
       echo -e "\n 1. The free disk space\n 2. Space consumed by this user\n 3. Exit\n\n SELECTION: \c"

       read selection
       case $selection in
           1) df ;;
           2) du -s $HOME ;;
           3) exit ;;
           *) echo "Not a valid option" ;;
       esac


      echo treats the sequence \c as special because of the backslash. It is an
      escape sequence that leaves the cursor immediately after the argument
      instead of moving it to the next line. Likewise, \n produces a newline.

      The read statement takes input from the user, thereby making the script
      interactive. The input is read into a variable (selection in this case). The output
      is as follows:








           bash> disk_utility.sh

           1. The free disk space
           2. Space consumed by this user
           3. Exit

           SELECTION: 2
           456100 /home/pallavi


3.5      Summary
         In this chapter we learned how to build conditions into our scripts so that
         different actions can be taken upon success or failure of a command.
         The actions are selected with the if statement, which lets you perform
         arithmetic and string comparisons and test exit codes, input and the
         files needed by the script.

         A simple if-then-fi test often precedes commands in a shell script in order to
         prevent output generation, so that the script can easily be run in the
         background or through the cron facility. More complex sets of conditions
         are usually put in a case statement.



Self-check Questions
1. What is the use of the "if" statement?
2. What is the exit status of a command? What is its normal value and where is the
   value stored?



3.6      Answers to the Self-Check questions
      1. The "if" statement makes a two-way decision depending on whether a
         certain condition is fulfilled.
      2. The exit status is an integer that represents the success or failure of a
         command. It has the value 0 when the command executes successfully and is
         stored in the parameter $?


3.8      Terminal Questions
      1. List some applications of the "if-then-elif-else" statement.
      2. Give an example of "case" usage.



Unix shell program training
COE                                                                                                               Unit 2, Lesson 4




LESSON 4 REPETITIVE TASKS
4. REPETITIVE TASKS .................................................................................................... 113

   4.0       OBJECTIVES .......................................................................................................... 113
   4.1       INTRODUCTION ...................................................................................................... 113
   4.2       THE FOR LOOP....................................................................................................... 113
      4.2.1 How does it work? .......................................................................................... 113
      4.2.2 Examples ......................................................................................................... 114
   4.3       THE WHILE LOOP ................................................................................................... 115
      4.3.1 What is it? ........................................................................................................ 115
      4.3.2 Examples ........................................................................................................... 115
   4.4       THE UNTIL LOOP .................................................................................................... 117
      4.4.1 What is it? ........................................................................................................ 117
      4.4.2 Example ........................................................................................................... 118
   4.5       I/O REDIRECTION AND LOOPS ............................................................................... 118
      4.5.1 Input redirection .............................................................................................. 119
      4.5.2 Output redirection ........................................................................................... 119
   4.6 BREAK AND CONTINUE ................................................................................................ 119
      4.6.1 The break built−in........................................................................................... 120
      4.6.2 The continue built−in...................................................................................... 121
      4.6.3 Examples ......................................................................................................... 121
   4.7       MAKING MENUS WITH THE SELECT BUILT−IN ........................................................ 123
      4.7.1 General ............................................................................................................ 123
      4.7.2 Submenus ....................................................................................................... 126
   4.8       THE SHIFT BUILT− IN .............................................................................................. 126
      4.8.1 What does it do?............................................................................................. 126
      4.8.2 Examples ......................................................................................................... 126
   4.9       SUMMARY.............................................................................................................. 127
   4.10      ANSWERS TO THE SELF-CHECK QUESTIONS ........................................................ 128
   4.11      TERMINAL QUESTIONS .......................................................................................... 128




                              4. Repetitive tasks


It is important to appreciate the need for loops in scripts. Loops take scripting to the
next level and come in handy in a wide variety of applications.



4.0       Objectives
          Upon completion of this chapter, you will be able to
         Use for, while and until loops, and decide which loop fits which occasion.
         Use the break and continue Bash built−ins.
         Write scripts using the select statement.
         Write scripts that take a variable number of arguments.


4.1       Introduction
          This chapter teaches the student to write different types of loops as per any
          application that requires repetitive tasks. This is very helpful in writing useful
          scripts that require something to be done repeatedly.


4.2       The for loop
4.2.1 How does it work?

          The for loop is the first of the three shell looping constructs. This loop allows
          for specification of a list of values. A list of commands is executed for each
          value in the list.

          The syntax for this loop is:

          for NAME [in LIST ]; do COMMANDS; done

          If [in LIST] is not present, it is replaced with $@ and for executes the
          COMMANDS once for each positional parameter that is set. The return status
          is the exit status of the last command that executes. If no commands are
          executed because LIST does not expand to any items, the return status is
          zero.

          NAME can be any variable name, although i is used very often. LIST can be
          any list of words, strings or numbers, which can be literal or generated by any
          command. The COMMANDS to execute can also be any operating system
          commands, scripts, programs or shell statements. The first time through the
          loop, NAME is set to the first item in the LIST. The second time, its value is
          set to the second item in the list, and so on. The loop terminates when NAME
          has taken on each of the values from LIST and no items are left in the LIST.
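
          Both forms of the loop can be sketched briefly. The variable name colour
          and the function count_args are hypothetical, chosen only for this
          illustration. The second loop has no "in LIST" part, so it walks the
          positional parameters of the function.

```shell
#!/bin/bash
# 1. Explicit LIST: the body runs once per word.
for colour in red green blue; do
    echo "colour: $colour"
done

# 2. No "in LIST": the loop iterates over "$@", so an argument that
#    contains a space is still handled as one item.
count_args() {
    local n=0
    for arg; do
        n=$((n + 1))
        echo "argument $n: $arg"
    done
}

count_args alpha "two words" gamma
```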

4.2.2 Examples

         Using command substitution for specifying LIST items

          The first is a command line example, demonstrating the use of a for loop that
          makes a backup copy of each .xml file. After issuing the command, it is safe
          to start working on your sources:

           bash> ls *.xml
           file1.xml file2.xml        file3.xml

           bash> ls *.xml > list

           bash> for i in `cat list`; do cp "$i" "$i".bak ; done

           bash> ls *.xml*
           file1.xml file1.xml.bak file2.xml file2.xml.bak file3.xml
           file3.xml.bak



          This one lists the files in /sbin that are just plain text files, and possibly scripts:
           for i in `ls /sbin`; do file /sbin/$i | grep ASCII;done


         Using the content of a variable to specify LIST items

          The following is a specific application script for converting HTML files,
          compliant with a certain scheme, to PHP files. The conversion is done by
          taking out the first 25 and the last 21 lines, replacing these with two PHP tags
          that provide header and footer lines:

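                                            html2php.sh

           #!/bin/bash
           # Specific conversion script for my HTML files to PHP.
           # $LIST is left unquoted in the for statement so that it is split into
           # one word per file name (file names must not contain whitespace).
           LIST="$(ls *.html)"
           for i in $LIST; do
                  NEWNAME=$(echo "$i" | sed -e 's/\.html$/.php/')
                  cat beginfile > "$NEWNAME"
                  cat "$i" | sed -e '1,25d' | tac | sed -e '1,21d' | tac >> "$NEWNAME"
                  cat endfile >> "$NEWNAME"
           done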




          Since we don't do a line count here, there is no way of knowing the line
          number from which to start deleting lines until reaching the end. The problem
          is solved using tac, which reverses the lines in a file.
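
          The tac technique can be tried on a small input. In this sketch the
          function trim_lines is hypothetical, and tac is assumed to be available
          (it is part of GNU coreutils). The function drops a given number of lines
          from the top and from the bottom of its standard input.

```shell
#!/bin/bash
# trim_lines TOP BOTTOM: hypothetical helper.
# Deletes the first TOP lines, reverses the input, deletes what are now the
# first BOTTOM lines (originally the last ones), and reverses back.
trim_lines() {
    sed -e "1,${1}d" | tac | sed -e "1,${2}d" | tac
}

# Drop 2 lines from the top and 3 from the bottom of the numbers 1..10.
seq 1 10 | trim_lines 2 3
```

          The call above leaves the lines 3 through 7.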



4.3       The while loop
4.3.1 What is it?

          The while construct allows for repetitive execution of a list of commands, as
          long as the command controlling the while loop executes successfully (exit
          status of zero). The syntax is:

          while CONTROL-COMMAND; do CONSEQUENT-COMMANDS; done

          CONTROL-COMMAND can be any command(s) that can exit with a success
          or failure status. The CONSEQUENT-COMMANDS can be any program,
          script or shell construct.

          As soon as the CONTROL-COMMAND fails, the loop exits. In a script, the
          command following the done statement is executed.

          The return status is the exit status of the last CONSEQUENT-COMMANDS
          command, or zero if none was executed.
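
          A minimal sketch with a hypothetical function countdown. The test
          command [ "$n" -gt 0 ] is the controlling command: the loop body runs as
          long as it succeeds, and execution continues after done once it fails.

```shell
#!/bin/bash
# countdown is a hypothetical example function.
countdown() {
    local n=$1
    while [ "$n" -gt 0 ]; do
        echo "$n"
        n=$((n - 1))
    done
    echo "liftoff"
}

countdown 3
```

          With an argument of 0 the controlling command fails immediately, so the
          body is never executed and only the final line is printed.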

4.3.2 Examples

         Simple example using while

          Here is an example for the impatient:

           #!/bin/bash

           # This script opens 4 terminal windows.

           i=0

           while [ $i -lt 4 ]
           do
            xterm &
            i=$((i + 1))
           done


         Nested while loops

          The example below was written to copy pictures that are made with a webcam
          to a web directory. Every five minutes a picture is taken. Every hour, a new
          directory is created, holding the images for that hour. Every day, a new
          directory is created containing 24 subdirectories. The script runs in the
          background.
           #!/bin/bash

           # This script copies files from my home directory into the
           # webserver directory.
           # (use scp and SSH keys for a remote directory)
           # A new directory is created every hour.

           PICSDIR=/home/mohan/pics
           WEBDIR=/var/www/mohan/webcam

           while true; do
                DATE=`date +%Y%m%d`
                HOUR=`date +%H`
                mkdir $WEBDIR/"$DATE"

                while [ $HOUR -ne "00" ]; do
                     DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
                     mkdir "$DESTDIR"
                     mv $PICSDIR/*.jpg "$DESTDIR"/
                     sleep 3600
                     HOUR=`date +%H`
                done
           done


          Note the use of the true statement. This means: continue execution until we
          are forcibly interrupted (with kill or Ctrl+C).

          This small script can be used for simulation testing; it generates files:

           #!/bin/bash

           # This generates a file every 5 minutes

           while true; do
            touch "pic-$(date +%s).jpg"
            sleep 300
           done



          Note the use of the date command to generate all kinds of file and directory
          names. See the man page for more information on the date command.

         Calculating an average







      This script calculates the average of user input, which is tested before it is
      processed: if input is not within range, a message is printed. If q is pressed,
      the loop exits:

        #!/bin/bash

        # Calculate the average of a series of numbers.

        SCORE="0"
        AVERAGE="0"
        SUM="0"
        NUM="0"

        while true; do

          echo -n "Enter your score [0-100%] ('q' for quit): "; read SCORE;

          # Test for 'q' first, before SCORE is used in an arithmetic comparison.
          if [ "$SCORE" == "q" ]; then
               echo "Average rating: $AVERAGE%."
               break
          elif (( SCORE < 0 )) || (( SCORE > 100 )); then
               echo "Be serious. Come on, try again: "
          else
               SUM=$((SUM + SCORE))
               NUM=$((NUM + 1))
               AVERAGE=$((SUM / NUM))
          fi
        done

        echo "Exiting."


      Note how the variables in the last lines are left unquoted in order to do
      arithmetic.

4.4   The until loop
4.4.1 What is it?

      The until loop is very similar to the while loop, except that the loop executes
      until the TEST-COMMAND executes successfully. As long as this command
      fails, the loop continues. The syntax is the same as for the while loop:

      until TEST-COMMAND; do CONSEQUENT-COMMANDS; done

      The return status is the exit status of the last command executed in the
      CONSEQUENT-COMMANDS list, or zero if none was executed.
      TEST-COMMAND can, again, be any command that can exit with a success
      or failure status, and CONSEQUENT-COMMANDS can be any UNIX
      command, script or shell construct.






      As was previously explained, the ";" may be replaced with one or more
      newlines wherever it appears.
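
      A minimal sketch of until, using a hypothetical function count_up. The body
      runs while the test fails, that is, until the counter becomes greater than
      the limit passed as the first argument.

```shell
#!/bin/bash
# count_up is a hypothetical example function.
count_up() {
    local i=1
    until [ "$i" -gt "$1" ]; do
        echo "pass $i"
        i=$((i + 1))
    done
}

count_up 3
```

      The same loop could be written with while by inverting the test, for
      example with -le in place of -gt.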


4.4.2 Example

      An improved picturesort.sh script (see Section 4.3.2), which tests for
      available disk space. If there is not enough disk space, pictures from the
      previous months are removed:

       #!/bin/bash

       # This script copies files from my home directory into the webserver
       # directory.
       # A new directory is created every hour.
       # If the pics are taking up too much space, the oldest are removed.

       PICSDIR=/home/mohan/pics
       WEBDIR=/var/www/mohan/webcam

       while true; do
            DISKFULL=$(df -h $WEBDIR | grep -v File | awk '{print $5}' | cut -d "%" -f1 -)

            until [ $DISKFULL -ge "90" ]; do
                 DATE=`date +%Y%m%d`
                 HOUR=`date +%H`
                 mkdir $WEBDIR/"$DATE"

                 while [ $HOUR -ne "00" ]; do
                      DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
                      mkdir "$DESTDIR"
                      mv $PICSDIR/*.jpg "$DESTDIR"/
                      sleep 3600
                      HOUR=`date +%H`
                 done

                 DISKFULL=$(df -h $WEBDIR | grep -v File | awk '{print $5}' | cut -d "%" -f1 -)
            done

            TOREMOVE=$(find $WEBDIR -type d -a -mtime +30)
            for i in $TOREMOVE; do
                 rm -rf "$i";
            done
       done

      Note the initialization of the HOUR and DISKFULL variables and the use of
      options with find and date in order to obtain a correct listing for TOREMOVE.

4.5   I/O redirection and loops





4.5.1 Input redirection

      Instead of controlling a loop by testing the result of a command or by user
      input, you can specify a file from which to read input that controls the loop. In
      such cases, read is often the controlling command. As long as input lines are
      fed into the loop, execution of the loop commands continues. As soon as all
      the input lines are read, the loop exits.

      Since the loop construct is considered to be one command structure (such as
      while TEST-COMMAND; do CONSEQUENT-COMMANDS; done), the
      redirection should occur after the done statement, so that it complies with the
      form

      command < file

      This kind of redirection also works with other kinds of loops.
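
      The form can be sketched with a hypothetical function number_lines that
      prefixes each line of a file with its line number. The redirection follows
      done, so read receives one line of the file per iteration and fails at end
      of file, which ends the loop. The temporary file is created with mktemp
      only to keep the sketch self-contained.

```shell
#!/bin/bash
# number_lines FILE: hypothetical helper that numbers the lines of FILE.
number_lines() {
    local n=0 line
    while read -r line; do
        n=$((n + 1))
        echo "$n: $line"
    done < "$1"
}

tmpfile=$(mktemp)
printf 'first\nsecond\nthird\n' > "$tmpfile"
number_lines "$tmpfile"
rm -f "$tmpfile"
```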

4.5.2 Output redirection

      In the example below, output of the find command is used as input for the
      read command controlling a while loop:




                             archiveoldstuff.sh
         #!/bin/bash

         # This script creates a subdirectory in the current directory, to which old
         # files are moved.
         # Might be something for cron (if slightly adapted) to execute weekly or
         # monthly.

         ARCHIVENR=`date +%Y%m%d`
         DESTDIR="$PWD/archive-$ARCHIVENR"

         mkdir $DESTDIR

         find $PWD -type f -a -mtime +5 | while read file
         do
            gzip "$file"; mv "$file".gz "$DESTDIR"
            echo "$file archived"
         done

      Files are compressed by the gzip command before they are moved into the
      archive directory.

4.6 Break and continue





4.6.1 The break built−in

      The break statement is used to exit the current loop before its normal ending.
      This is done when you don't know in advance how many times the loop will
      have to execute, for instance because it is dependent on user input.

      The example below demonstrates a while loop that can be interrupted. This is
      a slightly improved version of the wisdom.sh script from Section 4.3.2



        #!/bin/bash

        # This script provides wisdom
        # You can now exit in a decent way.

        FORTUNE=/usr/games/fortune

        while true; do
        echo "On which topic do you want advice?"
        echo "1. politics"
        echo "2. startrek"
        echo "3. kernelnewbies"
        echo "4. sports"
        echo "5. bofh-excuses"
        echo "6. magic"
        echo "7. love"
        echo "8. literature"
        echo "9. drugs"
        echo "10. education"
        echo

        echo -n "Enter your choice, or 0 for exit: "
        read choice
        echo

        case $choice in
             1)
             $FORTUNE politics
             ;;
             2)
             $FORTUNE startrek
             ;;
             3)
             $FORTUNE kernelnewbies
             ;;
             4)
             echo "Sports are a waste of time, energy and money."
             echo "Go back to your ..."
             ;;
             5)
             $FORTUNE bofh-excuses
             ;;
             6)
             $FORTUNE magic
             ;;
             7)
             $FORTUNE love
             ;;
             8)
             $FORTUNE literature
             ;;
             9)
             $FORTUNE drugs
             ;;
             10)
             $FORTUNE education
             ;;
             0)
             echo "OK, see you!"
             break
             ;;
             *)
             echo "That is not a valid choice, try a number from 0 to 10."
             ;;
        esac
        done



      Mind that break exits the loop, not the script. This can be demonstrated by
      adding an echo command at the end of the script. This echo will also be
      executed upon input that causes break to be executed (when the user types
      "0"). In nested loops, break allows for specification of which loop to exit. See
      the Bash info pages for more.
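
      Both points can be sketched together. In the hypothetical function find_pair
      below, break 2 leaves the inner and the outer for loop at once, and the echo
      after the loops still runs, which shows that break ends the loops and not
      the function or script.

```shell
#!/bin/bash
# find_pair TARGET: hypothetical helper that searches the numbers 1..4 for
# the first pair whose product equals TARGET.
find_pair() {
    local target=$1 a b
    for a in 1 2 3 4; do
        for b in 1 2 3 4; do
            if [ $((a * b)) -eq "$target" ]; then
                echo "$a x $b"
                break 2      # leave both loops, not just the inner one
            fi
        done
    done
    echo "search finished"
}

find_pair 6
```

      A plain break at the same place would leave only the inner loop, and the
      outer loop would go on to report further pairs.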

4.6.2 The continue built−in

      The continue statement resumes iteration of an enclosing for, while, until or
      select loop.
      When used in a for loop, the controlling variable takes on the value of the next
      element in the list. When used in a while or until construct, on the other hand,
      execution resumes with TEST−COMMAND at the top of the loop.
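
      A minimal sketch of continue in a for loop, with a hypothetical function
      odd_only that prints only the odd numbers among its arguments. For an even
      number the rest of the body is skipped and the loop moves on to the next
      element in the list.

```shell
#!/bin/bash
# odd_only is a hypothetical example function.
odd_only() {
    local n
    for n in "$@"; do
        if [ $((n % 2)) -eq 0 ]; then
            continue         # skip even numbers; go on with the next item
        fi
        echo "$n"
    done
}

odd_only 1 2 3 4 5
```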

4.6.3 Examples

      In the following example, file names are converted to lower case. If no
      conversion needs to be done, a continue statement restarts execution of the
      loop. These commands do not use many system resources, and most likely,
      similar problems can be solved using sed and awk. However, it is useful to
      know about this kind of construction when executing heavy jobs that might
      not even be necessary when tests are inserted at the correct locations in a
      script, sparing system resources.

                                 tolower.sh

       #!/bin/bash

       # This script converts all file names containing upper case characters into
       # file names containing only lower case characters.

       # $LIST is left unquoted in the for statement so that it is split into
       # one word per file name.
       LIST="$(ls)"

       for name in $LIST; do

        if [[ "$name" != *[[:upper:]]* ]]; then
           continue
        fi

        ORIG="$name"
        NEW=`echo $name | tr 'A-Z' 'a-z'`

        mv "$ORIG" "$NEW"
        echo "new name for $ORIG is $NEW"
       done


      This script has at least one disadvantage: it overwrites existing files. The
      noclobber option to Bash is only useful when redirection occurs. The -b
      option to the mv command provides more security, but is only safe in case of
      one accidental overwrite, as is demonstrated in this test:


       bash> rm *

       bash> touch test Test TEST

       bash> bash -x tolower.sh
       ++ ls
       + LIST=test
       Test
       TEST
       + [[ test != *[[:upper:]]* ]]
       + continue
       + [[ Test != *[[:upper:]]* ]]
       + ORIG=Test








           ++ echo TEST
           ++ tr A-Z a-z
           + NEW=test
           + mv -b TEST test
           + echo 'new name for TEST is test'
           new name for TEST is test

           bash> ls -a
           ./  ../   test      test~

          The tr command is part of the textutils package; it can perform all kinds
          of character transformations.
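
          One way to avoid the overwrite problem is to test for the target name
          before renaming. The sketch below is an alternative written for
          illustration, not the original tolower.sh: the function lowercase_names
          is hypothetical, and a case-sensitive file system is assumed. It uses a
          glob instead of ls, so names containing spaces are handled too, and it
          uses continue twice to skip names that need no work or would collide.

```shell
#!/bin/bash
# lowercase_names DIR: hypothetical helper that renames the entries of DIR
# to lower case and refuses to overwrite an existing entry.
lowercase_names() {
    local dir=$1 path name new
    for path in "$dir"/*; do
        if [ ! -e "$path" ]; then
            continue                      # the glob matched nothing
        fi
        name=${path##*/}
        new=$(echo "$name" | tr '[:upper:]' '[:lower:]')
        if [ "$name" = "$new" ]; then
            continue                      # already lower case
        fi
        if [ -e "$dir/$new" ]; then
            echo "skipping $name: $new already exists"
            continue
        fi
        mv "$path" "$dir/$new"
        echo "new name for $name is $new"
    done
}

workdir=$(mktemp -d)
touch "$workdir/README" "$workdir/Test" "$workdir/test"
lowercase_names "$workdir"
rm -rf "$workdir"
```

          Here README is renamed, while Test is left alone because a file named
          test already exists.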


4.7       Making menus with the select built−in
4.7.1 General

         Use of select

          The select construct allows easy menu generation. The syntax is quite similar
          to that of the for loop:

          select WORD [in LIST]; do RESPECTIVE-COMMANDS; done

          LIST is expanded, generating a list of items. The expansion is printed to
          standard error; each item is preceded by a number. If in LIST is not present,
          the positional parameters are printed, as if in $@ would have been specified.
          LIST is only printed once.

          Upon printing all the items, the PS3 prompt is printed and one line from
          standard input is read. If this line consists of a number corresponding to one
          of the items, the value of WORD is set to the name of that item. If the line is
          empty, the items and the PS3 prompt are displayed again. If an EOF (End Of
          File) character is read, the loop exits. Since many users do not know which
          key combination produces EOF (usually Ctrl+D), it is more user-friendly
          to have a break command as one of the items. Any other value of the read
          line will set WORD to be a null string.

          The read line is saved in the REPLY variable.
          The RESPECTIVE-COMMANDS are executed after each selection until the
          number representing the break is read. This exits the loop.

         Examples

          This is a very simple example, but as you can see, it is not very user−friendly:








                                       private.sh

       #!/bin/bash

       echo "This script can make any of the files in this directory private."
       echo "Enter the number of the file you want to protect:"

       select FILENAME in *;
       do
            echo "You picked $FILENAME ($REPLY), it is now only accessible to you."
            chmod go-rwx "$FILENAME"
       done

       bash> ./private.sh
       This script can make any of the files in this directory private.
       Enter the number of the file you want to protect:
       1) archive-20030129
       2) bash
       3) private.sh
       #? 1
       You picked archive-20030129 (1), it is now only accessible to you.
       #?

      Setting the PS3 prompt and adding a possibility to quit makes it better:

       #!/bin/bash

       echo "This script can make any of the files in this directory private."
       echo "Enter the number of the file you want to protect:"

       PS3="Your choice: "
       QUIT="QUIT THIS PROGRAM - I feel safe now."
       touch "$QUIT"

       select FILENAME in *;
       do
           case $FILENAME in
              "$QUIT")
                 echo "Exiting."
                 break
                 ;;
              *)
                 echo "You picked $FILENAME ($REPLY)"
                 chmod go-rwx "$FILENAME"
                 ;;
           esac
       done
       rm "$QUIT"
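
      Because select reads from standard input and prints its menu and prompt to
      standard error, a menu can also be driven from a pipe, which is convenient
      for trying one out. The function pick_fruit and its menu items are
      hypothetical. An input that matches no item leaves the variable empty, which
      the second clause catches.

```shell
#!/bin/bash
# pick_fruit is a hypothetical example menu.
pick_fruit() {
    local PS3="Your choice: "
    local fruit
    select fruit in apple banana quit; do
        case "$fruit" in
            quit) echo "Exiting."; break ;;
            "")   echo "'$REPLY' is not a valid choice" ;;
            *)    echo "You picked $fruit ($REPLY)" ;;
        esac
    done
}

# Feed three answers to the menu; the menu text itself is discarded.
printf '2\n9\n3\n' | pick_fruit 2>/dev/null
```

      Run interactively, without the pipe and the redirection, the same function
      shows the numbered list and the PS3 prompt on the terminal.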



4.7.2 Submenus

      Any statement within a select construct can be another select loop, enabling
      one or more submenus within a menu.

      By default, the PS3 variable is not changed when entering a nested select
      loop. If you want a different prompt in the submenu, be sure to set it at the
      appropriate time(s).


4.8   The shift built−in
4.8.1 What does it do?

      The shift command is one of the Bourne shell built-ins that comes with Bash.
      This command takes one optional argument, a number N. The positional
      parameters are shifted to the left by N: the positional parameters from N+1
      to $# are renamed to $1 to $#-N. Say you have a command that takes 10
      arguments, and N is 4; then $5 becomes $1, $6 becomes $2 and so on. $10
      becomes $6 and the original $1, $2, $3 and $4 are thrown away.
      If N is zero or greater than $# (the total number of arguments), the
      positional parameters are not changed. If N is not present, it is assumed to
      be 1. The return status is zero unless N is greater than $# or less than
      zero; in those cases it is non-zero.
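
      A short sketch of the effect of shift, using two hypothetical functions.
      demo_shift shows how $1 and $# change after shifting by 2; print_all uses
      the common idiom of processing $1 and shifting until no arguments are left.

```shell
#!/bin/bash
# demo_shift and print_all are hypothetical example functions.
demo_shift() {
    echo "before: first=$1 count=$#"
    shift 2
    echo "after: first=$1 count=$#"
}

print_all() {
    while (( $# )); do
        echo "processing $1"
        shift
    done
}

demo_shift a b c d e
print_all x y z
```

      In demo_shift the first argument changes from a to c and the count drops
      from 5 to 3.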

4.8.2 Examples

      A shift statement is typically used when the number of arguments to a
      command is not known in advance, for instance when users can give as many
      arguments as they like. In such cases, the arguments are usually processed
      in a while loop with a test condition of (($# )). This condition is true as long as
      the number of arguments is greater than zero. The $1 variable and the shift
      statement process each argument. The number of arguments is reduced each
      time shift is executed and eventually becomes zero, upon which the while
      loop exits.

      The example below, cleanup.sh, uses shift statements to process each
      directory given on the command line; find then handles the files in it:








       #!/bin/bash

       # This script can clean up files that were last accessed
       # over 365 days ago.

       USAGE="Usage: $0 dir1 dir2 dir3 ... dirN"

       if [ "$#" -eq 0 ]; then
              echo "$USAGE"
              exit 1
       fi

       # Handle one directory per iteration until no arguments remain.
       while (( $# )); do

        if [[ "$(ls "$1")" == "" ]]; then
               echo "Empty directory, nothing to be done."
        else
               # rm -i asks for confirmation before each removal.
               find "$1" -type f -a -atime +365 -exec rm -i {} \;
        fi

        shift

       done

      The above find command can be replaced with the following:

      find options | xargs [commands_to_execute_on_found_files]

      The xargs command builds and executes command lines from standard input.
      Its advantage is that it fills each command line with as many arguments as
      the system limit allows before calling the command to execute, which in the
      above example would be rm. If there are more arguments, a new command line
      is started, until that one is full or there are no more arguments. By
      contrast, find -exec with the ";" terminator runs the command once for every
      file found. When many files are involved, xargs therefore starts far fewer
      processes and can speed up your scripts noticeably.
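      A minimal sketch of the xargs variant follows. It works in a scratch
      directory with made-up file names, and it selects files by name rather
      than by access time so that the result is predictable. It uses rm -f
      instead of rm -i, because interactive confirmation does not combine well
      with a command whose arguments arrive through a pipe. The -print0 and -0
      options are assumed to be available, as they are in the GNU and BSD tools.

```shell
#!/bin/bash
# scratch_dir and the file names are hypothetical, chosen for this example.
scratch_dir=$(mktemp -d)
touch "$scratch_dir/a.log" "$scratch_dir/b b.log" "$scratch_dir/keep.txt"

# -print0 and -0 separate the names with NUL bytes, so a name containing
# spaces (such as "b b.log") is passed to rm as a single argument.
find "$scratch_dir" -type f -name '*.log' -print0 | xargs -0 rm -f

remaining=$(ls "$scratch_dir")
echo "left over: $remaining"
rm -rf "$scratch_dir"
```

      Here rm is started once for all matching files instead of once per file.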


4.9   Summary
      In this chapter, we discussed how repetitive commands can be incorporated
      in loop constructs. Most common loops are built using the for, while or until
      statements, or a combination of these commands. The for loop executes a
      task a defined number of times. If you don't know how many times a
      command should execute, use either until or while to specify when the loop
      should end.

      Loops can be interrupted or reiterated using the break and continue
      statements. A file can be used as input for a loop using the input
      redirection operator. Loops can also read the output of commands that is
      fed into the loop using a pipe.

           The select construct is used for printing menus in interactive scripts. Looping
           through the command line arguments to a script can be done using the shift
           statement.



Self-check Questions
1.    What is the use of loops?
2.    List the different types of loops in the shell.
3.    What is the use of the "break" statement?
4.    What will the following construct do and why?
          while [ 5 ]




4.10 Answers to the Self-Check questions
      1.   Loops let the user perform a set of instructions repeatedly.
      2.   For, While and Until.
      3.   The break statement is used to exit the current loop before its normal ending.
      4.   This sets up an infinite loop. With a single argument, the test [ 5 ]
           only checks that the string "5" is not empty, which is always true.
           (For the same reason [ 0 ] is also true.)


4.11 Terminal Questions
      1. How would you decide which type of loop to use?
      2. Explain why it is so important to put the variables in between double quotes in
         the example from Section 4.4.2?
      3. Describe the "shift" built-in command.
      4. There are at least 6 syntactical mistakes in the following program. Locate
         them.








      1 ppprunning = yes
      2 while $ppprunning = yes ; do
      3     echo " INTERNET MENU\n
      4      1. Dial out
      5      2. Exit
      6
      7      Choice:
      8      read choice
      9      case choice in
      10        1) if [ -z "$ppprunning" ]
      11             echo "Enter your username and password"
      12            else
      13               chat.sh
      14            endif ;
      15         *) ppprunning=no
      16      endcase
      17 done








LESSON 5                      REGULAR EXPRESSIONS
5. REGULAR EXPRESSIONS

   5.0      OBJECTIVES
   5.1      INTRODUCTION
   5.2      REGULAR EXPRESSIONS
      5.2.1 What are regular expressions?
      5.2.2 The Structure of a Regular Expression
      5.2.3 Regular expression metacharacters
      5.2.4 Creating complex regular expressions by concatenating other regEx
      5.2.5 Using metacharacters on regEx to create complex regEx
   5.3      THE GREP COMMAND
      5.3.1 Grep and regular expressions
   5.4      PATTERN MATCHING USING SHELL
      5.4.1 Character ranges
      5.4.2 Character classes
   5.5      SUMMARY
   5.6      ANSWERS TO THE SELF-CHECK QUESTIONS
   5.7      TERMINAL QUESTIONS




                          5. Regular Expressions


Regular expressions are very helpful in creating powerful scripts. Regular
expressions are also used heavily in the advanced Unix utilities that we will be
studying further, such as sed, awk and the Perl language.



5.0       Objectives
          After going through this lesson, you will learn about:

         Using regular expressions
         Regular expression metacharacters
         Finding patterns in files or output
         Character ranges and classes in Bash


5.1       Introduction
          This chapter introduces the concept of regular expressions. A regular
          expression is a pattern that describes a set of strings. This is a very powerful
          concept and can be used effectively in scripting.


5.2       Regular expressions
5.2.1 What are regular expressions?

          Often you will encounter situations where you need to match specific patterns
          in scripts. For example, given a list of cricket players, you may need to find
          all those players whose names begin with A or B. In other words, you need to
          match against a set of patterns. A regular expression helps you define such a
          pattern set in a terse way. For example, if you want to match any number in
          which no digit other than 9 is used (e.g., 9, 99, 999, 9999, ...), it is
          impossible to write out the entire pattern set, but a regular expression can
          express the same set very easily. Let's see what a regular expression is and
          how it is used.

          Here are a few examples of regular expressions. You will begin to understand
          how they represent their pattern sets as you study this chapter.


           9* => Any number that contains only the digit 9 (e.g., 99, 9999,
           etc.)
           India.* => Any string beginning with India (e.g., India, Indian,
           Indiana, etc.)




          A regular expression is a sequence of characters that represents a pattern.
          The pattern can be a simple word, like "India", or can describe a more
          general set of strings like "India", "Indian", "Indiana", etc. Using regular
          expressions you can create general patterns, such as any 3 digit number that
          does not contain the digit 2.

          What is meant by "regular" in the term regular expression? The term comes
          from formal language theory, where the sets of strings that such patterns
          can describe are called regular languages. Informally, a regular expression
          can only denote repetition that follows a fixed, pre-defined rule. If the
          repetitions are irregular, you cannot denote the pattern with a regular
          expression. For example, the set of all prime numbers cannot be denoted
          using a regular expression!

          What is meant by "expression" in the term regular expression? The
          "expression" refers to the fact that, just like mathematical expressions,
          regular expressions can be combined to form new and more complex regular
          expressions.

          By the way, regular expressions are often referred to as regEx by developers.

5.2.2 The Structure of a Regular Expression

          Single characters such as 'a', '=' and '3' are the fundamental regular
          expressions. They match the single character they represent. Most
          characters, including all letters and digits, are regular expressions that match
          themselves. The fundamental regular expressions can be combined to create
          more complex regular expressions. Let's see how we create more complex
          regEx.

          There are three important parts to a regular expression:
         Anchors
         Character sets
         Modifiers

          Anchors are used to specify the position of the pattern in relation to a line of
          text.

          Character Sets match one or more characters in a single position.

          Modifiers specify how many times the previous character set is repeated.

          A simple example that demonstrates all three parts is the regular expression
          "^#*". The caret (^) is an anchor that indicates the beginning of the line. The
          character "#" is a simple character set that matches the single character "#".
          The asterisk is a modifier. In a regular expression, the asterisk specifies that
          the preceding character set can appear any number of times, including zero.
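          The effect of the modifier can be seen in a small sketch that uses the
          grep command with made-up input lines. Because the asterisk also allows
          zero repetitions, "^#*" matches every line, whereas "^#" matches only
          the lines that really begin with "#".

```shell
#!/bin/bash
# sample is made-up input used only for this illustration.
sample='# a comment
echo "not a comment"   # trailing note
## another comment'

# ^#  : anchor plus character set, so only lines that begin with "#"
begins=$(printf '%s\n' "$sample" | grep -c '^#')

# ^#* : zero or more "#" at the start of the line, which every line satisfies
any=$(printf '%s\n' "$sample" | grep -c '^#*')

echo "begin with #: $begins, match ^#*: $any"
```

          With the three sample lines this prints: begin with #: 2, match ^#*: 3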








5.2.3 Regular expression metacharacters

      A few special characters have a meaning beyond their literal value. Some
      specify how often the preceding character or expression may repeat; others
      anchor the match to a position in the line or word. These special characters
      are called metacharacters.

      The table below lists various metacharacters and their meanings.

      Table – Regular expression metacharacters

      Operator             Effect
      . (single dot)       Matches any single character.
      ?                    The preceding item is optional and will be matched,
                           at most, once.
      *                    The preceding item will be matched zero or more
                           times.
      +                    The preceding item will be matched one or more
                           times.
      {N}                  The preceding item will be matched exactly N times.
      {N,}                 The preceding item will be matched N or more times.
      {N,M}                The preceding item will be matched at least N times,
                           but not more than M times.
      -                    Represents the range if it's not first or last in a
                           list or the ending point of a range in a list.
      ^                    Matches the empty string at the beginning of a line;
                           also represents the characters not in the range of
                           a list.
      $                    Matches the empty string at the end of a line.
      \b                   Matches the empty string at the edge of a word.
      \B                   Matches the empty string provided it's not at the
                           edge of a word.
      \<                   Matches the empty string at the beginning of a word.
      \>                   Matches the empty string at the end of a word.


       In the example below, the * indicates zero or more
       repetitions of 9.
       9* => Any number that contains only digit 9 (e.g., 99,
       9999, etc.)

       In the example below, the . indicates any character and
       therefore .* indicates any number of repetitions of any
       characters.
       India.* => Any string beginning with India (e.g., India,
       Indian, Indiana, etc.)

       So, for example, India.* will also match India123,
       IndiaZZZ, etc.
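      These two patterns can be tried out with a short sketch. The helper name
      matches is made up for this example; it relies on the grep command with
      the -E (extended syntax), -x (whole line must match) and -q (quiet)
      options.

```shell
#!/bin/bash
# matches is a hypothetical helper: it succeeds when the whole string given
# as $2 matches the extended regular expression given as $1.
matches() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

for s in 9 9999 919 India Indiana India123 Hindi; do
    if matches '9*' "$s"; then echo "$s matches 9*"; fi
    if matches 'India.*' "$s"; then echo "$s matches India.*"; fi
done
```

      Neither 919 nor Hindi is reported, since 919 contains a digit other than
      9 and Hindi does not begin with India.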






5.2.4 Creating complex regular expressions by concatenating other regEx

      Suppose you want to use a regular expression to match any string in which the
      letter 'A' repeats one or more times (e.g., A, AA, AAA, etc.). The regular
      expression for this is
       A+ => will match A, AA, AAA, etc. but will not match
       the empty string.

      Now suppose you want to use a regular expression to match any string in
      which the digit 4 repeats any number of times.
       4* => will match 4, 44, 444, etc. and will also match an
       empty string.


      Now, suppose you want to create a regular expression to match any string in
      which first the letter 'A' repeats one or more times and then the digit 4
      repeats any number of times (e.g., A4, A444, AA4, etc.). You can simply
      concatenate the regular expressions created earlier:

       A+4* => will match A4, A44, AA4, etc. but will not match
       4AA.


5.2.5 Using metacharacters on regEx to create complex regEx

      Now, suppose you want to create a regular expression that denotes an
      unsigned real number. You can use the following regEx for it:
       [0-9]+(\.[0-9]+)? => will match 4, 0.32, 4.0, etc. but will
       not match -5, .33 or 7e-3.


      Let's dissect this example to understand it better:

      First, [0-9]+ will match one or more occurrences of a digit.

      To make the fractional part, we need to allow a dot (e.g., the dot in 0.32).
      An unescaped dot matches any character, so we write \. to match a literal dot.

      The fractional part, if present, must again have at least one digit, so the
      complete fractional part is written as \.[0-9]+ there.

      However, the fractional part should be optional (the regEx should match
      numbers without a fractional part too). So, the fractional part is enclosed
      in parentheses and made optional by putting a question mark after it, which
      gives the entire regEx as [0-9]+(\.[0-9]+)?
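      The following sketch checks a few values against this regEx. The function
      name is_unsigned_real is made up for this example; the dot is written as
      \. so that it matches only a literal dot, and grep -x requires the whole
      value to match.

```shell
#!/bin/bash
# is_unsigned_real is a hypothetical helper name used for illustration.
is_unsigned_real() {
    printf '%s\n' "$1" | grep -Eqx '[0-9]+(\.[0-9]+)?'
}

for value in 4 0.32 -5 .33 7e-3 4x2; do
    if is_unsigned_real "$value"; then
        echo "$value: match"
    else
        echo "$value: no match"
    fi
done
```

      Only 4 and 0.32 are reported as matches. The value 4x2 is rejected
      because the escaped dot does not stand for "any character".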








5.3   The grep command
      Unix has a command that performs regular expression based searches. This
      command is called grep. grep searches the input for lines containing a match
      to a given pattern list. When it finds a match in a line, it prints the line.

      Note that the grep command does not match patterns across multiple lines.
      Here are a few examples of grep.
      bash> grep root /etc/passwd
      root:x:0:0:root:/root:/bin/bash
      operator:x:11:0:operator:/root:/sbin/nologin

      bash> grep -n root /etc/passwd      # prints line numbers of matches
      1:root:x:0:0:root:/root:/bin/bash
      12:operator:x:11:0:operator:/root:/sbin/nologin

      bash> grep -v bash /etc/passwd | grep -v nologin      # inverted matching
      sync:x:5:0:sync:/sbin:/bin/sync
      shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
      halt:x:7:0:halt:/sbin:/sbin/halt
      news:x:9:13:news:/var/spool/news:
      apache:x:48:48:Apache:/var/www:/bin/false

      bash> grep -c false /etc/passwd      # returns number of matches
      7
      bash> grep -i root /etc/passwd      # match regardless of the case
      Root:0:0:/root
      root:0:0:/sysadm
      With the first command, the user displays the lines from /etc/passwd
      containing the string root. The second command also displays the line
      numbers of the lines containing this search string.

      With the third command the user checks which users are not using bash, but
      accounts with the nologin shell are not displayed.

      Then the user counts the number of accounts that have /bin/false as the shell.

      The last command displays the lines containing root, Root, ROOT, etc.
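      Since the contents of /etc/passwd differ from machine to machine, the same
      options can be tried on a small file whose contents are known. In the
      sketch below the file and its four entries are made up for this purpose.

```shell
#!/bin/bash
# accounts is a made-up stand-in for /etc/passwd.
accounts=$(mktemp)
cat > "$accounts" <<'EOF'
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
operator:x:11:0:operator:/root:/sbin/nologin
Rooty:x:500:500:Test user:/home/test:/bin/false
EOF

grep -n root "$accounts"                      # matching lines with line numbers
grep -v bash "$accounts" | grep -v nologin    # neither bash nor nologin
count_false=$(grep -c false "$accounts")      # number of matching lines
count_root_any_case=$(grep -ci root "$accounts")
echo "false: $count_false, root (any case): $count_root_any_case"
rm -f "$accounts"
```

      The -c and -i options are combined in the last grep call, so it counts
      the Rooty entry as well.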

      Now let's see what else we can do with grep, using regular expressions.








5.3.1 Grep and regular expressions

      a. Line and word anchors

      From the previous example, we now exclusively want to display lines starting
      with the string "root":
       bash> grep ^root /etc/passwd
       root:x:0:0:root:/root:/bin/bash


      If we want to see which accounts have no shell assigned whatsoever, we
      search for lines ending in ":":
      bash> grep :$ /etc/passwd
      news:x:9:13:news:/var/spool/news:

      To check that PATH is exported in ~/.bashrc, first select "export" lines and
      then search for lines starting with the string "PATH", so as not to display
      MANPATH and other possible paths:
       bash> grep export ~/.bashrc | grep '\<PATH'
       export PATH="/bin:/usr/lib/mh:/lib:/usr/bin:/usr/local/bin:/usr/ucb:/usr/dbin:$PATH"
      If you want to find a string that is a separate word (enclosed by spaces), it is
      better to use the -w option, as in this example:
       bash> cat myFile.txt
       Neil Armstrong was the first man to walk on the moon.
       He had said, "this is a small step for me but a huge step for mankind".

       bash> grep -w man myFile.txt
       Neil Armstrong was the first man to walk on the moon.

      Note that the other line is not matched: because the -w option is used,
      mankind, being a single word, does not match the word man. If this option
      is not used, both lines will be displayed.
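      The anchors and the -w option can be compared side by side in a short
      sketch. The sample file and its five lines are made up for this example.

```shell
#!/bin/bash
# notes is a made-up sample file for this illustration.
notes=$(mktemp)
cat > "$notes" <<'EOF'
PATH=/bin:/usr/bin
MANPATH=/usr/share/doc
man is a word on this line
one small step for mankind
trailing colon:
EOF

start=$(grep -c '^PATH' "$notes")    # lines starting with PATH
colon=$(grep -c ':$' "$notes")       # lines ending in ":"
word=$(grep -cw 'man' "$notes")      # "man" as a whole word only
plain=$(grep -c 'man' "$notes")      # "man" anywhere, including "mankind"
echo "start=$start colon=$colon word=$word plain=$plain"
rm -f "$notes"
```

      This prints start=1 colon=1 word=1 plain=2: the MANPATH line is excluded
      by the ^ anchor, and the mankind line is excluded by -w.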


      b. Character classes

      A bracket expression is a list of characters enclosed by "[" and "]". It matches
      any single character in that list; if the first character of the list is the caret, "^",
      then it matches any character NOT in the list. For example, the regular
      expression "[0123456789]" matches any single digit. You can also write it like
      [0-9].






      Within a bracket expression, a range expression consists of two characters
      separated by a hyphen. It matches any single character that sorts between
      the two characters, inclusive, using the locale's collating sequence and
      character set. For example, in the default C locale, "[a-d]" is equivalent to
      "[abcd]". Many locales sort characters in dictionary order, and in these locales
      "[a-d]" is typically not equivalent to "[abcd]"; it might be equivalent to
      "[aBbCcDd]", for example. To obtain the traditional interpretation of bracket
      expressions, you can use the C locale by setting the LC_ALL environment
      variable to the value "C".

      Finally, certain named classes of characters are predefined within bracket
      expressions. See the grep man or info pages for more information about
      these predefined expressions.

        bash> grep '[yf]' /etc/group
        sys:x:3:root,bin,adm
        tty:x:5:
        mail:x:12:mail,postfix
        ftp:x:50:
        nobody:x:99:
        floppy:x:19:
        xfs:x:43:
        nfsnobody:x:65534:
        postfix:x:89:

        bash> ls *[1-9].xml
        app1.xml chap1.xml chap2.xml chap3.xml chap4.xml

      In the example, all the lines containing either a "y" or "f" character are first
      displayed, followed by an example of using a range with the ls command.
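      The sketch below applies a range, a negated list and a named class to a
      few made-up words. LC_ALL=C is set for each grep call so that "[a-d]"
      keeps its traditional meaning regardless of the locale of the machine.

```shell
#!/bin/bash
# words is made-up input for this illustration.
words='apple
Banana
cherry
Date
42'

# Lines beginning with a lower-case letter from a to d.
lower=$(printf '%s\n' "$words" | LC_ALL=C grep -c '^[a-d]')

# Lines beginning with a character that is NOT a letter.
not_letter=$(printf '%s\n' "$words" | LC_ALL=C grep -c '^[^a-zA-Z]')

# Lines beginning with a digit, using a predefined class.
digits=$(printf '%s\n' "$words" | LC_ALL=C grep -c '^[[:digit:]]')

echo "lower=$lower not_letter=$not_letter digits=$digits"
```

      The result is lower=2 not_letter=1 digits=1; Banana and Date are not
      counted by "[a-d]" because they begin with upper-case letters.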

      c. Wildcards

      Use the "." for a single character match. If you want to get a list of all
      five-character English dictionary words starting with "c" and ending in "h"
      (handy for solving crosswords):

       bash> grep '\<c...h\>' /usr/share/dict/words
       catch
       clash
       cloth
       coach
       couch
       cough
       crash
       crush


      If you want to display lines containing the literal dot character, use the -F
      option to grep.






      For matching multiple characters, use the asterisk. This example selects all
      words starting with "c" and ending in "h" from the system's dictionary:
       bash> grep '\<c.*h\>' /usr/share/dict/words
       caliph
       cash
       catch
       cheesecloth
       cheetah
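      Not every system has a dictionary file installed, so the sketch below
      uses a short made-up word list instead. It assumes a grep that supports
      the \< and \> word anchors, as GNU grep does.

```shell
#!/bin/bash
# wordlist is a small made-up stand-in for /usr/share/dict/words.
wordlist='catch
cheetah
cash
c.h
crush'

# c, exactly three arbitrary characters, then h
five=$(printf '%s\n' "$wordlist" | grep -c '\<c...h\>')

# c, any number of arbitrary characters, then h
any=$(printf '%s\n' "$wordlist" | grep -c '\<c.*h\>')

# -F treats the pattern as a fixed string, so "." means a literal dot
literal=$(printf '%s\n' "$wordlist" | grep -cF '.')

echo "five=$five any=$any literal=$literal"
```

      With this list the counts are five=2 (catch and crush), any=5 and
      literal=1 (only the entry c.h contains a real dot).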




5.4   Pattern matching using shell
5.4.1 Character ranges

      Apart from grep and regular expressions, there's a good deal of pattern
      matching that you can do directly in the shell, without having to use an
      external program.
      As you already know, the asterisk (*) and the question mark (?) match any
      string or any single character, respectively. Quote these special characters to
      match them literally:

        bash> ls "*"
        This will not list all the files. It will list the file named *.

      But you can also use square brackets to match any enclosed character or
      range of characters, if pairs of characters are separated by a hyphen. An
      example:
        bash> ls -ld [a-cx-z]*
        drwxr-xr-x 2 radha    radha    4096 Jul 20  2002 app-defaults/
        drwxrwxr-x 4 radha    radha    4096 May 25  2002 arabic/
        drwxrwxr-x 2 radha    radha    4096 Mar  4 18:30 bin/
        drwxr-xr-x 7 radha    radha    4096 Sep  2  2001 crossover/
        drwxrwxr-x 3 radha    radha    4096 Mar 22  2002 xml
      This lists all files in radha's home directory starting with "a", "b", "c",
      "x", "y" or "z".
      If the first character within the brackets is "!" or "^", any character not
      enclosed will be matched. To match the dash ("-"), include it as the first or
      last character in the set. The sorting depends on the current locale and on
      the value of the LC_COLLATE variable, if it is set. Mind that other locales
      might interpret "[a-cx-z]" as "[aBbCcXxYyZz]" if sorting is done in dictionary
      order. If you want to be sure to have the traditional interpretation of
      ranges, force this behavior by setting LC_COLLATE or LC_ALL to "C".
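      A minimal sketch of these shell patterns follows. It works in a scratch
      directory with made-up file names, one of which is literally named "*",
      and sets LC_ALL to "C" so that the ranges and the sort order are
      predictable.

```shell
#!/bin/bash
# Work in a scratch directory so that the file names are known in advance.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch apple bin core xml zebra notes '*'

LC_ALL=C                       # traditional interpretation of ranges
range=$(echo [a-cx-z]*)        # names starting with a, b, c, x, y or z
negated=$(echo [!a-cx-z]*)     # names starting with any other character
literal=$(echo "*")            # quoted: the asterisk is not expanded

echo "range:   $range"
echo "negated: $negated"
echo "literal: $literal"

cd / && rm -rf "$scratch"
```

      The file called notes appears only in the negated list, together with
      the file named "*".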







5.4.2 Character classes

         Character classes can be specified within the square brackets, using the syntax
         [:CLASS:], where CLASS is defined in the POSIX standard and has one of the
         values
         "alnum", "alpha", "ascii", "blank", "cntrl", "digit", "graph", "lower", "print",
         "punct", "space", "upper", "word" or "xdigit".
          bash> ls -ld [[:digit:]]*
          drwxrwxr-x 2 radha radha    4096 Apr 20 13:45 2/

          bash> ls -ld [[:upper:]]*
          drwxrwxr-- 3 radha radha    4096 Sep 30  2001 Nautilus/
          drwxrwxr-x 4 radha radha    4096 Jul 11  2002 OpenOffice.org1.0/
          -rw-rw-r-- 1 radha radha  997376 Apr 18 15:39 Schedule.sdc
         When the extglob shell option is enabled (using the shopt built-in), several
         extended pattern matching operators are recognized.
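         The two class patterns shown above can be reproduced without radha's
         home directory. The sketch below creates a scratch directory with
         made-up file names and expands the patterns with echo.

```shell
#!/bin/bash
# Scratch directory with known, made-up file names.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch 2 2024-report Nautilus Schedule.sdc notes.txt

LC_ALL=C                         # predictable classes and sort order
digits=$(echo [[:digit:]]*)      # names starting with a digit
uppers=$(echo [[:upper:]]*)      # names starting with an upper-case letter

echo "digit: $digits"
echo "upper: $uppers"

cd / && rm -rf "$scratch"
```

         Note the double brackets: the outer pair opens the bracket expression
         and the inner [:digit:] names the class.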


5.5      Summary
         Regular expressions are powerful tools for selecting particular lines from files
         or output. A lot of UNIX programs use regular expressions: vim, perl, the
         PostgreSQL database and so on. They can be made available in any
         language or application using external libraries, and they have even found
         their way to non-UNIX systems. For instance, regular expressions can be used
         in the Excel spreadsheet that comes with the Microsoft Office suite. In
         this chapter we got a feel for the grep command, which is indispensable in
         any UNIX environment.

         Bash has built-in features for matching patterns and can recognize character
         classes and ranges.



Self-check Questions
1.    What are regular expressions?
2.    What will be the result of ls -l | grep '^.....w'
3.    What does the expression gg* signify?
4.    How do you locate lines in a file foo containing ram and raman using grep?







5.6        Answers to the Self-Check questions
      1. A regular expression is a pattern that describes a set of strings.
      2. This lists all entries that have write permission for the group (e.g.
         drwxrw-r-x).
      3. One or more occurrences of g.
      4. Use grep "rama*n*" foo


5.7        Terminal Questions
      1.   Describe the structure of a regular expression.
      2.   Describe some regular expression operators.
      3.   What is the difference between a wild card and a regular expression?
      4.   What is the difference between basic and extended regular expressions?




UNIT 3: Advanced Shell Scripting, sed, and awk

1. FUNCTIONS IN SHELL SCRIPTS

2. SED – STREAM EDITOR

3. AWK BASICS

4. AWK PROGRAMMING




LESSON 1                      FUNCTIONS IN SHELL SCRIPTS
1. FUNCTIONS IN SHELL SCRIPTS

  1.0       OBJECTIVES
  1.1       INTRODUCTION TO SHELL FUNCTIONS
      1.1.1 When to use functions?
      1.1.2 Benefits of using functions
      1.1.3 Where you cannot create functions?
  1.2       WRITING A SHELL FUNCTION
      1.2.1 Function header
      1.2.2 Function body
      1.2.3 Returning from a function
      1.2.4 Function arguments
      1.2.5 IFS (internal field separators)
      1.2.6 Creating a utility library of shell functions
      1.2.7 Things to keep in mind while writing shell functions
  1.3       SUMMARY
  1.4       ANSWERS TO THE SELF CHECK QUESTIONS
  1.5       TERMINAL QUESTIONS




                     1. Functions in Shell Scripts


So far you have learnt various Unix commands, how to plumb commands together using
pipes, and how to create shell scripts that carry out useful, routine and
repetitive tasks. Shell scripting in Unix is not complete without knowing how
to write and use functions.



1.0       Objectives
          After going through this lesson you will know:
         When to use functions in shell scripts
         How to write and use functions in shell scripts


1.1       Introduction to shell functions
          Often there are a few lines of code that need to be used in several places in a
          shell script. For example, suppose you are creating a shell script that reads a 3
          digit STD code and a 7 digit phone number, and you need to ensure that the user
          types in exactly 3 numeric characters for the STD code and exactly 7 numeric
          characters for the phone number. It is then better to create and use a
          function instead of replicating the same code in multiple places.

          A function is like a mini script. It can take parameters, can define its own
          variables, can return a value (an exit status), etc. Unlike a call to a script,
          a function executes in the same shell. Functions in shell scripts look and
          work similar to functions in the C language.
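          To make these properties concrete, here is a minimal sketch of a
          function. The name largest and the variable result are made up for
          this illustration and are unrelated to the phone number example.

```shell
#!/bin/bash
# largest is a hypothetical function used only for this illustration.
# It takes its parameters as $1, $2, ..., stores its answer in the variable
# "result", and returns status 0 on success or 1 when called without arguments.
largest() {
    [ "$#" -gt 0 ] || return 1
    result=$1
    for n in "$@"; do
        if [ "$n" -gt "$result" ]; then
            result=$n
        fi
    done
    return 0
}

largest 3 17 5
echo "largest: $result"    # set inside the function, still visible here

if ! largest; then
    echo "no arguments given"
fi
```

          Because the function runs in the same shell as its caller, the value
          it stored in result can be used after the call. A separate script
          could not hand back a variable in this way.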

1.1.1 When to use functions?

          Consider the example listed in 1.1 above, without using functions:








              #!/bin/bash
              stdOK=0
              # Keep asking until a valid STD code has been entered.
              while [ $stdOK -ne 1 ]
              do
                  echo "Please enter 3 digit STD code: "
                  read std
                  chkSTD=`echo $std | grep '^[0-9][0-9][0-9]$'`
                  if [ "$chkSTD""X" != "X" ]; then
                       stdOK=1
                  else
                       echo "Please enter exactly 3 digit STD code here"
                  fi
              done

              phoneOK=0
              # Keep asking until a valid phone number has been entered.
              while [ $phoneOK -ne 1 ]
              do
                  echo "Please enter 7 digit phone number: "
                  read phone
                  chkPH=`echo $phone | grep '^[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$'`
                  if [ "$chkPH""X" != "X" ]; then
                       phoneOK=1
                  else
                       echo "Please enter exactly 7 digit phone number here"
                  fi
              done
              callup $std $phone

                                         Script 1

      You will find that apart from the marked text below (the lines ending in
      "# <--", which hold the prompts and the digit patterns), the rest of the code
      is repeated.


              #!/bin/bash

              stdOK=0
              while [ $stdOK -ne 1 ]
              do
                  echo "Please enter 3 digit STD code: "                   # <--
                  read std
                  chkSTD=`echo "$std" | grep "^[0-9][0-9][0-9]$"`           # <--
                  if [ "$chkSTD""X" != "X" ]; then
                       stdOK=1
                  else
                       echo "Please enter exactly 3 digit STD code here"   # <--
                  fi
              done

              phoneOK=0
              while [ $phoneOK -ne 1 ]
              do
                  echo "Please enter 7 digit phone number: "               # <--
                  read phone
                  chkPH=`echo "$phone" | grep "^[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$"`   # <--
                  if [ "$chkPH""X" != "X" ]; then
                       phoneOK=1
                  else
                       echo "Please enter exactly 7 digit phone number here"   # <--
                  fi
              done
              callup $std $phone
                                          Script 2

      See how much simpler the script becomes if you have a function, here called
      getNumber, that gets you the desired numbers:


              #!/bin/bash

              std=`getNumber 3 "STD code"`
              phone=`getNumber 7 "phone number"`
              callup $std $phone


                                        Script 3
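
      Script 3 assumes that a function named getNumber already exists; the lesson
      does not list it at this point. The following is only a minimal sketch of how
      such a function might look, assuming that the first argument is the number of
      digits and the second is a label used in the prompts. Because the caller
      captures the output with backquotes, the sketch sends its prompts to standard
      error and echoes only the accepted number.

```shell
#!/bin/bash
# getNumber <digits> <label>
# Hypothetical helper assumed by Script 3; the argument order is an assumption.
getNumber() {
    digits=$1
    label=$2
    while :
    do
        # Prompts go to standard error so that `getNumber ...` captures only the number.
        echo "Please enter $digits digit $label: " >&2
        read num || return 1          # give up if the input ends
        # Accept the input only if it is exactly $digits numeric characters.
        if echo "$num" | grep -Eq "^[0-9]{$digits}\$"; then
            echo "$num"
            return 0
        fi
        echo "Please enter exactly $digits digit $label here" >&2
    done
}
```

      With this definition placed above the three lines of Script 3, both the STD
      code and the phone number are validated by the same code.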

1.1.2 Benefits of using functions

      Functions provide several benefits, as listed below:

         • Functions simplify and modularize your scripts. Your scripts become more
           readable (compare Script 1 and Script 3 above).
         • Modular scripts are easier to maintain and enhance.
         • Functions make debugging easier.
         • Once you enhance a function, the enhancement is automatically available
           at all places where the function is used.
         • You can even create a utility file containing functions and source it in your
           other scripts, so that the utility functions are directly available for use
           instead of being written over and over again.

1.1.3 Where can you not create functions?

          Be aware that not all shells support functions. For example, csh (the C
          shell) does not support functions.

          Most other shells do, including sh (Bourne shell), ksh (Korn shell),
          zsh (Z shell) and bash (Bourne-again shell).



Self Check Questions
1. When a few lines of code need to be repeated at several places, a ______ should
   be created for them (select one):
   a. script
   b. program
   c. function
2. A function helps in improving the script by making it (select one or more, as
   applicable):
   a. more readable
   b. easier to debug
   c. modular
   d. more maintainable




1.2       Writing a shell function
          A shell function in bash can be written in either of the following two
          forms. The word function is a keyword.

          <yourFunctionName>() { <commands>; }
          Or
          function <yourFunctionName> { <commands>; }

1.2.1 Function header







      You can define a function by using the function keyword, or by putting a pair
      of parentheses after the function name. For example:

      The following defines a function named "aaa".

              function aaa {
                 a=1
              }


      The following defines a function named "bbb".

              bbb() {
                a=1
              }



      Note that parameters are not passed to shell functions the way they are in C.
      Therefore, you do not declare any parameters in the function's header. See the
      definition of the function "bbb" above. No parameters are ever listed within
      the parentheses.
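
      A short sketch may help to show that a function is defined once, called by its
      name alone, and runs in the same shell as its caller. The function names and
      the variable colour are made up for this illustration. Only the parentheses
      form is used here; in bash the function keyword form behaves the same way.

```shell
#!/bin/bash
# setColour and showColour are hypothetical names used only for this sketch.
setColour() {
    colour="green"          # an ordinary shell variable, set inside the function
}

showColour() {
    echo "colour is $colour"
}

setColour                   # no parentheses and no declared parameters when calling
showColour                  # prints: colour is green
```

      The variable set by setColour is still visible afterwards because no new shell
      was started to run the function.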

1.2.2 Function body

      The set of commands between the curly brackets makes up the function body.

      A function can contain any shell scripting commands, including flow
      control commands like while and conditional commands like if. The
      commands can also include calls to other functions and even to other shell
      scripts.

      For example:


              getDateString() {
                 echo "Date format is dd/mm/yy ?:"
                 read x
                 if [ "$x""X" = "yX" ]; then
                      str=`date '+%d/%m/%y'`
                 else
                      str=`date '+%y/%m/%d'`
                 fi
                 echo $str
              }


      The above function calls the "date" shell command.








Self-Check Questions
3.    The function keyword is a must for writing a function (true/false).
4.    You must declare arguments to a function in the function header (true/false).
5.    You cannot declare arguments to a function in the function header (true/false).
6.    A function body can contain any of the shell commands (true/false).



1.2.3 Returning from a function

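         A function can hand a value back to its caller in two ways. Whatever the
         function echoes goes to its standard output, which the caller can capture
         using backquotes. Alternatively, you can leave the function at any point,
         without completing the execution of the function body, by using the return
         keyword. The number given to return becomes the function's exit status,
         which the caller reads from $?.

         For example:

                 aaa() {
                   a=1
                   b=2
                   echo $a
                 }

                 ret=`aaa`     # ret will be 1

                 bbb() {
                   a=1
                   b=2
                   return $a
                 }

                 bbb
                 ret=$?        # ret will be 1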
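
         The difference between the two mechanisms matters in practice: an exit
         status can only hold a number from 0 to 255, and by convention 0 means
         success. The sketch below, with made-up function names, uses return for a
         yes/no answer and echo for a result that could be larger than 255.

```shell
#!/bin/bash
# isThreeDigits is a hypothetical helper: status 0 means "yes", 1 means "no".
isThreeDigits() {
    case "$1" in
        [0-9][0-9][0-9]) return 0 ;;
        *)               return 1 ;;
    esac
}

# The exit status can be tested directly by if.
if isThreeDigits "011"; then
    echo "011 is a valid STD code"
fi

# square is also hypothetical; its result may exceed 255, so it is
# written to standard output instead of being returned.
square() {
    expr "$1" \* "$1"
}
big=`square 20`
echo "20 squared is $big"      # prints: 20 squared is 400
```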


1.2.4 Function arguments

         Parameters can be passed when calling a function by listing them after
         the function name, separated by spaces. Inside the function, these
         parameters can be accessed as the shell variables $1, $2, etc. Even $# (the
         number of arguments passed) is available inside the function.

         Example:

              # Adds two numbers and echoes the result
              # (return could only carry a value from 0 to 255).
              addTwoNums() {
                sum=`expr $1 + $2`
                echo $sum
              }

              # Adds all the numbers passed and echoes the result.
              addAllNums() {
                sum=0
                while [ "$1""X" != "X" ]
                do
                     sum=`expr $sum + $1`
                     shift
                done
                echo $sum
              }
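
      Inside a function, $1, $2 and $# refer to the function's own arguments and not
      to those of the script. The following sketch illustrates this; describeArgs
      and the argument values are made up for the example.

```shell
#!/bin/bash
# describeArgs is a hypothetical function that reports what it was given.
describeArgs() {
    echo "count=$# first=$1 second=$2"
}

set -- scriptArg            # pretend the script itself was started with one argument
describeArgs 011 2345678    # prints: count=2 first=011 second=2345678
echo "script still has $# argument(s)"    # prints: script still has 1 argument(s)
```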


1.2.5 IFS (internal field separators)

      You need to be careful while passing arguments to a function or a shell
      command in a shell script. The shell splits the values that you supply into
      fields. As a result, a string held in a variable can get interpreted as
      multiple parameters if it contains spaces and the variable is used without
      quotes.

      For example:

      colour="greenish blue"
      paintObj $colour
      Here you would expect to see $1 inside the function as "greenish blue" but
      you will get $1 as "greenish" and $2 as "blue".

      The simplest remedy is to quote the variable, as in paintObj "$colour".
      Alternatively, you can tell the shell to treat only newline as a field
      separator by declaring in your script
      IFS="
      "             # Yes, the closing quote is on the next line!

      Therefore, if you use the following:
      IFS="
      "
      colour="greenish blue"
      paintObj $colour

      Now, here you will get $1 inside the function as "greenish blue".
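
      The effect of field splitting can be observed with a small sketch. The body
      given to paintObj here is an assumption made only so that the function reports
      what it received; the variable names are also illustrative.

```shell
#!/bin/bash
# Hypothetical body for paintObj: report the argument count and the first argument.
paintObj() {
    echo "$# argument(s), first is <$1>"
}

colour="greenish blue"

paintObj $colour        # prints: 2 argument(s), first is <greenish>
paintObj "$colour"      # prints: 1 argument(s), first is <greenish blue>

oldIFS=$IFS             # remember the default separators
IFS="
"
paintObj $colour        # prints: 1 argument(s), first is <greenish blue>
IFS=$oldIFS             # restore the default so later commands are not affected
```

      Restoring IFS afterwards is a cautious habit, since a changed IFS affects every
      later unquoted expansion in the same shell.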



Self-Check Questions
7. Parameters passed to a function are accessible using the variables $1, $2, etc.
   (true/false).
8. The $# inside a function indicates the number of parameters passed to the script
   (true/false).







1.2.6 Creating a utility library of shell functions

          When you create shell functions, you typically want to make them
          generic enough to be reused in other shell scripts as well. In
          such cases, you can collect your shell functions into a single file. Such
          a file of utility shell functions can be used as a library and can be
          sourced in other shell scripts.

          For example:

                  bash>cat a_simple_utility_library.sh

                  #!/bin/sh
                  #------------------ a_simple_utility_library ------------------
                  IFS="
                  "
                  # myecho function echoes the input and also
                  # writes it into multiple files
                  myecho() {
                     for i in $FILE_LIST
                     do
                        echo $* >> $i
                     done
                     echo $*
                  }

                  mykill() {
                      pid=`ps -ef | grep $1 | grep -v "grep" | awk '{print $2}'`
                      kill $pid
                  }
                  #--------------- a_simple_utility_library ends ----------------



                  bash>cat my_application.sh

                  #!/bin/sh
                  . a_simple_utility_library.sh   # The dot in the beginning sources it
                  mykill junkjob                  # will kill the process running "junkjob"

1.2.7 Things to keep in mind while writing shell functions

          Just like other shell commands, there are restrictions when writing shell
          functions.
         • By convention, the starting curly bracket goes on the same line as the
           function header.
         • There must be a space or a new line after the opening curly bracket.
         • There must be either a semicolon or a new line before the closing curly
           bracket.
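
          The points above can be seen in a small sketch that writes one function on
          a single line and another across several lines. The function names are
          made up for this illustration.

```shell
#!/bin/bash
# One-line form: a space after the opening bracket, a semicolon before the closing one.
greet() { echo "hello $1"; }

# Multi-line form: new lines take the place of the semicolon.
greetTwice() {
    greet "$1"
    greet "$1"
}

greetTwice world        # prints "hello world" on two lines
```

          Leaving out the space after the opening bracket, or the semicolon in the
          one-line form, makes the shell report a syntax error.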








1.3        Summary
           Functions help in modularizing scripts that contain repetitive tasks. If you
           use functions, scripts become more readable and easier to maintain.


1.4        Answers to the self check questions
      1.   (c)
      2.   all.
      3.   false.
      4.   false.
      5.   true.
      6.   true.
      7.   true.
      8.   false.


1.5        Terminal Questions
      1. Discuss among your peers how functions are different from aliases.
      2. Write a function that gets you a non-empty string.
      3. Write a function that uses the function created in question 2 above to read
         a string and convert it into all uppercase.
      4. Write a script that takes the first name, middle name and family name of a
         person and prints them out in all uppercase or all lowercase depending on a
         shell variable's value.
      5. Write a function that takes a number of digits as input and gets a number
         containing exactly that many digits. The function must check that the user
         has provided a number.
      6. Write a function that takes a number of digits as input and gets a number
         containing at most that many digits.








LESSON 2                      SED – STREAM EDITOR

2. SED – STREAM EDITOR

   2.0      OBJECTIVES
   2.1      INTRODUCTION TO SED
   2.2      HOW SED OPERATES
   2.3      SYNTAX OF THE SED COMMAND
      2.3.1 Options for the sed
   2.4      COMMANDS IN SED
      2.4.1 Syntax of the commands in sed
   2.5      SUMMARY
   2.6      ANSWERS TO THE SELF CHECK QUESTIONS
   2.7      TERMINAL QUESTIONS




                         2. SED – Stream Editor


Sed (stream editor) is a utility program available in Unix. It is a powerful tool that
transforms its input line by line, and it is commonly used in scripting.



2.0       Objectives
          After going through this lesson you will know

         • What sed is, and its options and commands
         • What regular expressions are
         • Interactive use of sed
         • Using sed commands in scripts


2.1       Introduction to sed
          A stream editor performs transformations on text read from a file or
          a pipe. Sed sends the result to the standard output, which can be redirected
          and collected into another file if needed.

          Sed does not modify the original input file. Unlike vi and ed, which are
          interactive editors, sed works on an input stream. Sed is therefore
          suitable in scripts when you need text transformations, as in conversion
          programs.

          For example: if you have a file where "error" is misspelt as "erorr", you can
          correct it by using the sed command:
          sed 's/erorr/error/g' myfile > myfile_corrected


2.2       How sed operates
          It is often handy to know how a utility works. Here is how sed works:
         • A line of input is copied into a pattern space.
         • All editing commands in a sed script are applied, in order, to the copied line.
         • The copied (and modified) line is sent to standard output.
         • By default, sed works on all the lines of input. However, its scope can be
           controlled by line addressing.
         • Editing commands are applied to all lines (globally) unless line addressing
           restricts the lines affected.
         • If a command changes the input, subsequent commands and their addresses are
           applied to the current line in the pattern space, not to the original input
           line.
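
          The last point is easy to overlook, so a small sketch may help. The wrapper
          function names and the sample text are made up for this illustration; the
          sed commands themselves are the s and d operations covered in this lesson.

```shell
#!/bin/bash
# chain is a hypothetical wrapper so that the pipeline can be reused.
# The second command sees "dog", which the first command has just produced.
chain() {
    sed -e 's/cat/dog/' -e 's/dog/bird/'
}
echo "one cat" | chain          # prints: one bird

# The address /DONE/ is tested against the already modified pattern space,
# so lines that started as TODO are deleted as well.
markAndDrop() {
    sed -e 's/TODO/DONE/' -e '/DONE/d'
}
printf 'TODO: sed\nNOTE: awk\n' | markAndDrop    # prints: NOTE: awk
```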


2.3       Syntax of the sed command
          Sed can be invoked in one of the following forms:

          sed [options] 'command' file(s)
          Or
          sed [options] -f scriptfile file(s)

          The first form allows you to specify an editing command on the command line,
          surrounded by single quotes.

          The second form allows you to specify a scriptfile , a file containing sed
          commands. If no files are specified, sed reads from standard input.

2.3.1 Options for the sed

         • The -e option
          -e <script> tells sed to add the commands in <script> to the set of
          commands to run. You can give a series of commands using the -e option. For
          example:

          sed -e 's/a/A/' -e 's/b/B/' < oldFile > newFile

         • The -f option
          -f <scriptFile> tells sed to add the commands from <scriptFile> to the set of
          commands to run. For example, instead of just replacing 'a' and 'b', if you
          want to uppercase all vowels in the input, you can write a sed script file:

                   bash>cat sed_script
                   # sed comment - This script changes lower case vowels to upper case
                   s/a/A/g
                   s/e/E/g
                   s/i/I/g
                   s/o/O/g
                   s/u/U/g

          sed -f sed_script < oldFile > newFile
          will uppercase all vowels.

          Note that in sed script files, each command is written on a separate line.
          Avoid trailing white space at the end of lines, and do not wrap the commands
          in quotes.
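
          To try the -f option without creating files by hand, the script file can be
          written from within a shell script. The following is a sketch only: the
          temporary file name pattern and the sample word are assumptions made for
          the illustration.

```shell
#!/bin/bash
# The file name pattern below is an assumption made for this sketch.
script=`mktemp /tmp/sed_script.XXXXXX`

# Write the sed commands into the script file, one command per line, no quotes.
cat > "$script" <<'EOF'
# upper-case all lower case vowels
s/a/A/g
s/e/E/g
s/i/I/g
s/o/O/g
s/u/U/g
EOF

result=`echo "education" | sed -f "$script"`
echo "$result"          # prints: EdUcAtIOn
rm -f "$script"         # remove the temporary script file
```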



         • The -n option
          -n tells sed not to print each line by default. Lines are printed only when
          a sed command that prints, such as p, asks for them. For example,
          sed -n 's/pattern/&/p' file

          will act like grep looking for "pattern".
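
       The effect of -n is clearest when the same command is run with and without
       it. The sample lines and variable names below are made up for this sketch.

```shell
#!/bin/bash
tasks='DONE: functions
TODO: sed
TODO: awk'

# With -n, only the lines that the p command prints appear.
withN=`echo "$tasks" | sed -n '/TODO/p'`

# Without -n, every line is printed once by default and matching lines once more by p.
withoutN=`echo "$tasks" | sed '/TODO/p'`

echo "$withN"
echo "---"
echo "$withoutN"
```

       The first result has two lines, while the second has five, because each TODO
       line appears twice.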




Self-Check Questions
1. sed is an interactive editor like vi (true/false)
2. sed can be used in scripts (true/false)




2.4    Commands in sed
       Sed supports grep-like regular expressions to find the text for pattern
       substitution and deletion. Sed uses vi-like commands:

       a         Appends text below the current line
       i         Inserts text above the current line
       c         Changes the current line to new text
       s         Searches and replaces text
       d         Deletes the current line
       p         Prints the current line

       For example, if there is a file that lists tasks like:

                bash>cat tasks
                DONE: functions
                TODO: sed
                TODO: awk
                DONE: password change

       To delete all lines in a file that are marked DONE, you can use


                bash>sed '/DONE/d' tasks > new_tasks
                bash>cat new_tasks
                TODO: sed
                TODO: awk
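
       The a, i and c commands take their text on the line that follows the command.
       The sketch below applies them to task lines like the ones above; the wrapper
       function annotate and the inserted texts are made up for this illustration.

```shell
#!/bin/bash
# annotate is a hypothetical wrapper around a three-command sed script.
# Each of i, c and a is followed by a backslash and then the text on the next line.
annotate() {
    sed '/TODO: sed/i\
(next task)
/TODO: awk/c\
TODO: awk (postponed)
/DONE: functions/a\
(reviewed)'
}

printf 'DONE: functions\nTODO: sed\nTODO: awk\nDONE: password change\n' | annotate
```

       The output has "(reviewed)" below the first line, "(next task)" above the
       "TODO: sed" line, and the "TODO: awk" line replaced by the new text.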







2.4.1 Syntax of the commands in sed

         The sed commands have the general form as listed below:

         [address][,address][!]operation [arguments]

         A sed command consists of optional addresses and an operation. Each operation
         is a single letter.

         Let's take the following input file for the examples given below:

                 bash>cat input_file
                 This is the first line
                 This is the second line of text
                 This is the third line of input_file
                 This is the fourth and the last line



      1. If no address is specified, the operation is applied to each line. For example:

                 sed 's/This/this/g' < input_file > output_file

                 bash>cat output_file
                 this is the first line
                 this is the second line of text
                 this is the third line of input_file
                 this is the fourth and the last line


      2. Only the first match on each line is replaced by default. For example,

                 sed 's/the/a/' < input_file > output_file

                 bash>cat output_file
                 This is a first line
                 This is a second line of text
                 This is a third line of input_file
                 This is a fourth and the last line


         The second "the" on the last line is not modified.

         To tell sed to work on all the matches on a line, use the "g" flag.
                 sed 's/the/a/g' < input_file > output_file

                 bash>cat output_file
                 This is a first line
                 This is a second line of text
                 This is a third line of input_file
                 This is a fourth and a last line



         The second "the" is also modified now.

      3. A single address can be given. For example:

                 sed '2s/second/SECOND/g' < input_file > output_file

                 bash>cat output_file
                 This is the first line
                 This is the SECOND line of text
                 This is the third line of input_file
                 This is the fourth and the last line

      4. Two addresses can be given to make a block. For example:

                 sed '1,2s/line/input/g' < input_file > output_file

                 bash>cat output_file
                 This is the first input
                 This is the second input of text
                 This is the third line of input_file
                 This is the fourth and the last line


      5. $ can be used to denote the last line of the input when specifying addresses.
         For example:

                 sed '3,$d' < input_file > output_file

                 bash>cat output_file
                 This is the first line
                 This is the second line of text


      6. Address can also be given using patterns. For example:

                 sed '/input_file/d' < input_file > output_file

                 bash>cat output_file
                 This is the first line
                 This is the second line of text
                 This is the fourth and the last line

      7. An address can also be inverted with "!".
         sed '/SAVE/!d'

         This will delete all lines that do not have SAVE on them.

         sed '/BEGIN/,/MID/s/erorr/error/g'

         This will replace erorr by error on the lines from BEGIN to MID.

         sed '/^BEGIN/,/^END/!s/done//g'

         This will delete the word done on all lines except those between BEGIN
         and END.

           Addresses and patterns can include grep-like regular expressions as well.
           For example:

                    sed -n '/This.*first/p' input_file > output_file

                    bash>cat output_file
                    This is the first line
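
           The different kinds of addresses can be combined freely. The following
           sketch keeps the four lines of input_file in a shell variable, so that it
           can be run without creating the file; the variable names are made up for
           the illustration.

```shell
#!/bin/bash
lines='This is the first line
This is the second line of text
This is the third line of input_file
This is the fourth and the last line'

# Numeric range: print lines 2 to 3 only.
middle=`echo "$lines" | sed -n '2,3p'`

# Inverted pattern address: delete every line that does NOT contain "second".
onlySecond=`echo "$lines" | sed '/second/!d'`

# Pattern to end of input: from the first line containing "third" up to the last line.
fromThird=`echo "$lines" | sed -n '/third/,$p'`

echo "$middle"
echo "$onlySecond"
echo "$fromThird"
```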



Self-Check Questions
3. What argument can be used to tell sed to apply operations on all the matched
   patterns on a line:
   a. none. Sed already does that by default.
   b. g
   c. i
4. What character can be used to invert the address in sed?
   a. none
   b. !
   c. x



2.5        Summary
           The sed stream editor is a powerful command line tool, which can handle
           streams of data: it can take input lines from a pipe. This makes it fit for
           non-interactive use. The sed editor uses vi-like commands and accepts
           regular expressions. The sed tool can read commands from the command line
           or from a script. It is often used to perform find-and-replace actions on
           lines containing a pattern.


2.6        Answers to the self check questions
      1.   false.
      2.   true.
      3.   (b)
      4.   (b)








2.7      Terminal Questions
      1. Use sed to implement a utility like the Unix head (print only the first 5 lines).
      2. Use sed to implement a utility like the Unix tail (print only the last 5 lines).
      3. Print a list of files in your scripts directory, ending in ".sh". Mind that you might
          have to unalias ls. Put the result in a temporary file.
      4. Make a list of files in /usr/bin that have the letter "a" as the second character.
          Put the result in a temporary file.
      5. Delete the first 3 lines of each temporary file.
      6. Print to standard output only the lines containing the pattern "an".
      7. Create a file holding sed commands to perform the previous two tasks. Add
          an extra command to this file that adds a string like "*** This might have
          something to do with man and man pages ***" in the line preceding every
          occurrence of the string "man". Check the results.
      8. A long listing of the root directory, /, is used for input. Create a file holding sed
          commands that check for symbolic links and plain files. If a file is a symbolic
          link, precede it with a line like "--This is a symlink--". If the file is a plain file,
          add a string on the same line, adding a comment like "<--- this is a plain file".
      9. Create a script that shows lines containing trailing white spaces from a file.
          This script should use a sed script and show sensible information to the user.
      10. Can sed be used to create a tail -f kind of utility?
      11. Search the internet to find how a newline can be replaced.
      12. The top 4 lines of a file contain the names of students and the remaining 4
          lines contain their marks:
          bash>cat file
          Mohit verma
          Sushobhit sinha
          Mukul Khan
          Naina Suman
          20
          25
          35
          28
          Using sed and paste command, create another file that will have
          Mohit verma 20
          Sushobhit sinha 25
          Mukul Khan 35
          Naina Suman 28








LESSON 3                       AWK BASICS
3. AWK BASICS

   3.0       OBJECTIVES
   3.1       INTRODUCTION AND BRIEF HISTORY
   3.2       THE SYNTAX OF AWK
   3.3       USING AWK
      3.3.1 The print command in AWK
      3.3.2 Accessing fields on a line
   3.4       SUMMARY
   3.5       ANSWERS TO THE SELF CHECK QUESTIONS
   3.6       TERMINAL QUESTIONS




                                  3. AWK Basics


AWK is a utility for performing simple text-processing tasks. Awk also
provides a small but powerful language that lets the user manipulate files
containing columns of data and strings, and print reports from the data.



3.0       Objectives
          After going through this lesson you will know
         • What AWK is, and the syntax of AWK
         • How AWK is useful
         • The print command in AWK
         • How to access fields in AWK


3.1       Introduction and brief history
          AWK is named after its authors: Aho, Weinberger and
          Kernighan.

          The original version of AWK was developed in 1977. In Unix it is
          available as awk. Advanced versions exist (e.g. nawk, gawk) that
          support user-defined functions, multidimensional arrays, the ?: operator,
          deleting elements in an array, etc.

          Awk operates in a cycle: get a line, process it, get the next line,
          process it, and so on. It is an "interpreted" language -- that is, an Awk
          program cannot run on its own, it must be executed by the Awk utility
          itself.

          Like sed, AWK reads an input file or reads from a pipe. It does not
          modify the input file and writes its output onto the standard output. In
          addition, because AWK is a programming language in itself, awk is
          very useful in processing data and printing reports.


3.2       The syntax of AWK
                  awk [options] '
                      [ BEGIN          {<initializations>} ]
                      [ <program> ]
                      [ <program> ]
                         ...
                      [ END           {<final actions>} ]
                  ' <File Name>




      Where each <program> has the format:

      [ <search pattern 1> ] [ {<program actions>} ]

      Awk operates as listed below:
      1. Perform the initializations if BEGIN is given.
      2. Read a line of text and break it into fields.
      3. For each <program> whose search pattern matches the line, perform
         its program actions.
      4. Go to step 2 until there are no more lines.
      5. Perform the END actions if specified by the user.

      The optional BEGIN clause performs any initializations required before
      Awk starts scanning the input file.

      The subsequent body of the Awk program consists of a series of
      search patterns, each with its own program action. Awk scans each
      line of the input file for each search pattern, and performs the
      appropriate actions for each line that matches.

      Once the file has been scanned, an optional END clause can be used
      to perform any final actions required.
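
      The three parts can be seen working together in a small sketch. The wrapper
      function physicsCount, the variable count and the sample lines are made up
      for this illustration.

```shell
#!/bin/bash
# physicsCount is a hypothetical wrapper around a three-part awk program.
physicsCount() {
    awk '
        BEGIN     { count = 0 }                      # runs once, before any input is read
        /Physics/ { count++ }                        # runs for every line matching the pattern
        END       { print "Physics rows:", count }   # runs once, after the last line
    '
}

printf 'Physics 92 2003\nMaths 99 2003\nPhysics 94.5 2004\n' | physicsCount
# prints: Physics rows: 2
```

      No file name is given to awk here, so it reads its input from the pipe.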


3.3   Using AWK
      We will use the following example data to see how to use awk. This
      data is a file containing the top marks for some of the subjects along
      with the topper names and years.

              bash>cat toppers.txt
                     Physics       92     2003     Abhay Malhotra
                     Chemistry     97     2003     Suman Gupta
                     Maths         99     2003     Suresh Yadav
                     Physics       94.5   2004     Shriesh Jadhav
                     Chemistry     98.5   2004     Shriesh Jadhav
                     Maths         96     2004     Lokesh Arora
                     Physics       89     2005     Vandana Agarwal
                     Chemistry     92     2005     Srinivas Vardharajan
                     Maths         99     2005     Anup Mathur
                     Physics       98     2006     Ramakant
                     Chemistry     88     2006     Raju Pandy
                     Chemistry     89     2006     Rajni Kumar
                     Maths         98     2006    Javed M. K. Akthar

      Example 1: Almost all of the awk syntax is optional. One of the
      simplest awk commands that produces output is
      awk '{print}' input_file

         This works like the cat command and prints the entire input_file as
         is. (An empty program, awk '' input_file, is accepted but prints
         nothing.)

         Note that here we are running AWK code using the awk command. The
         code is enclosed in single quotes so that the shell passes it to awk
         unchanged.

         Example 2: You can ask awk to work on specific lines. For example,
         you can give a search pattern.

         awk '/Physics/' toppers.txt > phy_toppers.txt

         Note that AWK does not modify the input.
         Also note that AWK writes output to the standard output.

         Here we have redirected the output into a file phy_toppers.txt. Now
         let's see the contents of this output file:

                  bash>cat phy_toppers.txt
                        Physics       92     2003     Abhay Malhotra
                        Physics       94.5   2004     Shriesh Jadhav
                        Physics       89     2005     Vandana Agarwal
                        Physics       98     2006     Ramakant


         Example 3: Pattern matching is case-sensitive. For example, if you
         gave "physics" in place of "Physics" as the search pattern, it
         would not match the lines containing "Physics".

         awk '/physics/' toppers.txt > no_match.txt

         The file no_match.txt will come out as an empty file.
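
         The case-sensitive behaviour can be checked with the sketch
         below. The second command is an assumption beyond this lesson: it
         uses the standard awk function tolower() and the match operator ~
         to lower-case the whole line ($0) before matching, which is one
         common way to get a case-insensitive match.

```shell
# Two lines from toppers.txt, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
Physics       92     2003     Abhay Malhotra
Chemistry     97     2003     Suman Gupta
EOF

# Case-sensitive: /physics/ does not match "Physics", so nothing is printed.
exact=$(awk '/physics/' "$data")

# Case-insensitive sketch: lower-case the line before matching it.
folded=$(awk 'tolower($0) ~ /physics/' "$data")

printf 'exact:  [%s]\nfolded: [%s]\n' "$exact" "$folded"
rm -f "$data"
```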



Self-Check Questions
1.    Awk is useful for processing text containing columns of data. (true/false).
2.    Awk is a small programming language in itself (true/false).
3.    Awk does not modify the input file (true/false).
4.    An AWK program cannot run on its own. Which command is needed to run it?
        (a) awk (b) sed (c) grep



3.3.1 The print command in AWK

         A simple print command is available in AWK. This command does not need
         any format specifications and values can be printed in a simple way.




      Example 4: If you use print with no arguments, it prints the input text as
      is.

                awk '/Physics/ { print }
                     /Maths/   { print }' toppers.txt

                will print
                        Physics       92     2003     Abhay Malhotra
                        Maths         99     2003     Suresh Yadav
                        Physics       94.5   2004     Shriesh Jadhav
                        Maths         96     2004     Lokesh Arora
                        Physics       89     2005     Vandana Agarwal
                        Maths         99     2005     Anup Mathur
                        Physics       98     2006     Ramakant
                        Maths         98     2006    Javed M. K. Akthar


      Example 5: You can give arguments to print

                awk '/Maths/ {print "This is a math topper"}' toppers.txt
                will print
                This is a math topper
                This is a math topper
                This is a math topper
                This is a math topper
      Note an important point here. The print command prints its arguments
      exactly as given. Arguments separated by commas are printed with a
      single space between them; any other text, such as extra spaces or
      punctuation, must be written in the print command itself. We will see
      more concrete examples of the print command in subsequent examples.
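
      As a small illustration of how print joins its arguments, the
      sketch below prints two made-up strings three ways. It uses only a
      BEGIN clause, so no input file is needed.

```shell
# Comma-separated arguments are joined with a single space.
with_comma=$(awk 'BEGIN { print "Maths", "topper" }')

# Adjacent strings with no comma are concatenated with nothing in between.
no_comma=$(awk 'BEGIN { print "Maths" "topper" }')

# Any space wanted in concatenated text must be inside the quotes.
quoted_space=$(awk 'BEGIN { print "Maths " "topper" }')

printf '%s\n%s\n%s\n' "$with_comma" "$no_comma" "$quoted_space"
```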

3.3.2 Accessing fields on a line

      The power of AWK lies in the fact that it treats each input line as a
      record consisting of fields. As it reads each line, it breaks the line
      up into fields and lets you access and manipulate the fields and the
      output.

      By default AWK uses whitespace (spaces and tabs) as the separator for
      fields, which means that when it reads a line, it breaks it up into
      words for you. The separator can be changed easily, as we will see
      later in this unit.

      To access the fields of the input line, awk provides the following
      built-in variables: $0, $1, $2, and so on. The first one, $0, gives
      you the entire line, as is. $1 is the first field, $2 is the second
      field, and so on.

      Example 6: If the input line just read in by awk is

      Physics     92     2003     Abhay Malhotra



      then,
      $1 will contain "Physics"
      $2 will contain 92
      $3 will contain 2003
      $4 will contain "Abhay"
      $5 will contain "Malhotra"

      Note that because awk is treating space as the separator, it breaks up
      the name too into two separate fields.
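
      The sketch below feeds that same line to awk to show that fields
      can be printed individually, used in arithmetic, and rearranged.
      The particular expressions are chosen only for illustration.

```shell
out=$(printf '%s\n' 'Physics     92     2003     Abhay Malhotra' |
      awk '{
          print $1            # first field
          print $2 + 1        # a numeric field can be used in arithmetic
          print $5 ", " $4    # fields can be printed in any order
      }')

printf '%s\n' "$out"
```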

      Example 7: To print just the names of chemistry toppers, you can use
      the following command:

       awk '/Chemistry/ {print $4, $5, $6, $7, $8}' toppers.txt > chem_toppers.txt
       bash>cat chem_toppers.txt
       Suman Gupta
       Shriesh Jadhav
       Srinivas Vardharajan
       Raju Pandy
       Rajni Kumar


      Note that we have used $4 to $8, though the names in this output would
      have come out the same with $4 and $5 alone, because each of them
      occupies only two fields. However, the file also has a longer name
      (Javed M. K. Akthar), which occupies four fields. Therefore we need
      to be aware of the data when printing multiple fields. AWK has no
      single expression that means "print all fields from $4 onwards", so
      here we simply list additional fields; fields that do not exist on a
      line are printed as empty strings. However, if you simply want to
      print the entire line, then you do not need to use these fields. For
      example,

      Example 8: To print all data for math toppers, use the following

       awk '/Maths/ {print}' toppers.txt > math_toppers.txt
       #Note no $1, $2 used
       bash>cat math_toppers.txt
        Maths     99       2003    Suresh Yadav
        Maths     96       2004    Lokesh Arora
        Maths     99       2005    Anup Mathur
        Maths     98       2006    Javed M. K. Akthar

      The examples so far were solving things that can be solved by a combination
      of grep, sed, cut etc., as well. However AWK is much more capable. We will
      see the other features in subsequent chapters.
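
      As a preview, here is one possible way to print a name of any length.
      It is only a sketch and relies on two features covered in the next
      lesson: the for loop and the special variable NF (the number of
      fields on the current line). The variable name "name" is our own.

```shell
# Two Maths lines from toppers.txt, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
Maths         99     2005     Anup Mathur
Maths         98     2006    Javed M. K. Akthar
EOF

# Build the name from field 4 up to the last field, whatever NF is.
out=$(awk '/Maths/ {
    name = $4
    for (i = 5; i <= NF; i++)
        name = name " " $i
    print name
}' "$data")

printf '%s\n' "$out"
rm -f "$data"
```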






Self-Check Questions
5. awk processes how many line(s) of input at a time?
       (a)1, (b) 2, (c) depends on the available memory, (d) all lines in input.
6. awk breaks inputs into columns or words (true/false)
7. awk uses spaces to break inputs (true/false)
8. The print command can be used to print the fields of input with added text
   (true/false)




3.4        Summary
           The awk utility is a powerful command line tool, which can handle
           streams of data: it can take input lines from a pipe. This makes it fit for
           non-interactive use.


3.5        Answers to self check questions
      1.   true.
      2.   true.
      3.   true.
      4.   (a)
      5.   (a)
      6.   true.
      7.   true.
      8.   true.


3.6        Terminal Questions
      1. Take the toppers.txt of this chapter. For each year and subject, print
         the first name of the topper, marks and then year.
      2. Do the same question as listed above but now print the complete name
         of the topper followed by marks and then year.
      3. See the AWK syntax. We have used only one pattern and its program
         in our examples. Try using multiple patterns and their corresponding
         programs and see the outputs.






LESSON 4. AWK PROGRAMMING

   4.0         OBJECTIVES
   4.1         INTRODUCTION
   4.2         RELATIONAL AND LOGIC OPERATORS IN AWK
   4.3         CONTROL STRUCTURES IN AWK
         4.3.1 The if-else construct
         4.3.2 The for loop
   4.4         SPECIAL VARIABLES - NF AND NR
         4.4.1 Using BEGIN and END clauses in awk
         4.4.2 Using variables in AWK
   4.5         RUNNING AWK PROGRAMS KEPT IN FILES
   4.6         GENERATING REPORTS USING AWK
         4.6.1 The printf command of AWK
         4.6.2 Format specifications in printf
         4.6.3 Printing the fields in different order than input
         4.6.4 Creating simple reports
         4.6.5 Field separator
         4.6.6 Printing heading/heading row and summary/footer
   4.7         MISCELLANEOUS FEATURES OF AWK
         4.7.1 Specifying search patterns in AWK
         4.7.2 Limiting the lines on which AWK would work
         4.7.3 Built-in variables
         4.7.4 Passing arguments to AWK
         4.7.5 Arrays and associative arrays in AWK
         4.7.6 String functions in AWK
         4.7.7 Few interesting, complex examples
   4.8         SUMMARY
   4.9         ANSWERS TO THE SELF CHECK QUESTIONS
   4.10        TERMINAL QUESTIONS

                           4. AWK Programming


In the previous chapter we saw how AWK can be used to process the input
data and print selected lines and fields. In this chapter we will see the
programming features of AWK that make it very powerful.




4.0       Objectives
          After going through this lesson you will know
         How to use AWK programming
         Relational and logic operators for conditions
         Control structures
         Use of variables, BEGIN and END clauses
         How to generate reports using AWK
         Miscellaneous features of AWK


4.1       Introduction
          AWK provides a simple yet powerful programming language. The
          programming language features are similar to C language constructs.

          Note that we will continue to refer to the toppers.txt file from chapter 3 for
          examples.


4.2       Relational and logic operators in AWK
          AWK supports comparing fields to create conditions. Relational
          operators, which compare two values, are available in awk. For
          example, a condition like $3 == 2006 can be used. We will see such
          usage in subsequent examples below.
          The following relational operators are available:

                  ==     Tells whether the two values are equal.
                  !=     Tells whether the two values are not equal.
                  >      Tells whether a value is greater than the other.
                  >=     Tells whether a value is greater than or equal to
                         the other.
                  <      Tells whether a value is less than the other.
                  <=     Tells whether a value is less than or equal to the
                         other.

      Multiple relational conditions can be combined using logic operators. For
      example, $3 == 2006 && $2 != 98. This condition is true only when the
      third field is 2006 and the second field is not equal to 98.

      The following logic operators are available:

                &&     logical and
                ||     logical or
                !      logical negation


      Note an important point here. As in C, relational and logic operators
      evaluate to 1 when the condition is true and to 0 when it is false, and
      this value can be printed or used in an expression. So, for example,
      ($1 == 1) + ($2 == 0) is a valid expression whose value is 0, 1, or 2,
      depending on how many of the two comparisons hold.

      Examples in subsequent sections will show conditions that use relational and
      logic operators.
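
      A quick way to get a feel for these operators is to print the value
      of a few conditions. The sketch below uses three made-up input
      lines in the subject/marks/year layout of toppers.txt.

```shell
out=$(printf '%s\n' 'Maths 98 2006' 'Maths 99 2005' 'Physics 98 2006' |
      awk '{
          # Second column: 1 if year is 2006 and marks are not 99, else 0.
          # Third column: how many of the two comparisons are true.
          print $1, ($3 == 2006 && $2 != 99), ($3 == 2006) + ($1 == "Maths")
      }')

printf '%s\n' "$out"
```

      The conditions are wrapped in parentheses so that they are read as
      expressions and not as part of the print syntax.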


4.3   Control structures in awk
      AWK provides C like control structures as well to facilitate programming.
      Control structures in AWK include the following:

4.3.1 The if-else construct

      The if-else construct of AWK has the following syntax.

      if (condition) statement [ else statement ]

      Example 1: To print the first names of the chemistry toppers for the
      year 2006, we can use
              awk '/Chemistry/ { if ($3 == 2006) print $4 }' toppers.txt

              will print
                     Raju
                     Rajni


      Note that there is no else in the example above. The else part of if-else is
      optional.

      Example 2: Print whether Maths toppers had more than 98 marks. (The
      printf command used here is described in section 4.6.1; unlike print,
      it does not add a newline on its own.)

                awk '/Maths/ { if ($2 > 98)
                           {
                       printf("In the year %s ", $3);
                       printf("%s had more than 98 marks.\n", $4);
                           }
                       else
                          {
                       printf("In the year %s ", $3);
                       printf("%s did not have more than 98 marks.\n", $4);
                       } }' toppers.txt
                This will print
                In the year 2003 Suresh had more than 98 marks.
                In the year 2004 Lokesh did not have more than 98 marks.
                In the year 2005 Anup had more than 98 marks.
                In the year 2006 Javed did not have more than 98 marks.



      Note that there is an else part used in this example.

      Also note that if there is more than one statement, the statements can be
      grouped together with curly braces, as we have done in the example above.
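
      Because each branch of if-else takes a statement, an else part can
      itself be another if. The following sketch, run on three made-up
      lines in the toppers.txt layout, chains two conditions this way.

```shell
out=$(printf '%s\n' 'Maths 99 2003 Suresh' 'Maths 96 2004 Lokesh' 'Maths 98 2006 Javed' |
      awk '{
          if ($2 > 98)
              print $4, "scored above 98"
          else if ($2 == 98)
              print $4, "scored exactly 98"
          else
              print $4, "scored below 98"
      }')

printf '%s\n' "$out"
```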

4.3.2 The for loop

      The for loop in AWK has the following syntax:

      for (initialization; condition; increment) statement

      Example 3: To print a colon after each of the first four fields, we
      can use the following. (The printf command, described in section 4.6.1,
      prints its arguments without adding a newline.)


             awk '/Maths/ { for (i = 1; i <= 4; i++) printf("%s:", $i);
                            printf("\n"); }' toppers.txt

             will print
                     Maths:99:2003:Suresh:
                     Maths:96:2004:Lokesh:
                     Maths:99:2005:Anup:
                     Maths:98:2006:Javed:





      Note that $0 contains the entire text input line and $1 onwards contains the
      fields. Also note that we have used a variable i here. We will see details on
      variables in AWK later.
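
      To see how the loop variable selects a field, the sketch below
      prints each of the first three fields of one made-up input line
      together with its field number.

```shell
out=$(printf '%s\n' 'Maths 99 2003 Suresh Yadav' |
      awk '{ for (i = 1; i <= 3; i++) print i ": " $i }')

printf '%s\n' "$out"
```

      Here $i means "the field whose number is the current value of i".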



Self-Check Questions
1. AWK programs can compare two fields of the input line. (true/false).
2. Relational operators evaluate to 1 (true) or 0 (false), and the result can
   be used in expressions (true/false).
3. The if-else construct of AWK mandates that the else part must be there
   (true/false).
4. for loop can have a block of statements enclosed in curly brackets (true/false).




4.4   Special variables - NF and NR
      Awk provides the following special built-in variables:

      NF - stands for the number of fields in the currently read line.
      NR - stands for the total number of records (lines) read so far.

      Example 4: Printing only the long lines, those with more than 5 fields:

               awk '{if (NF > 5) print}' toppers.txt

               this will print
                       Maths         98     2006    Javed M. K. Akthar
      Example 5: For Maths toppers, if we want to skip printing the year (the
      third field), we can use the following AWK command:

               awk '/Maths/ { for (i = 1; i <= NF; i++) if (i != 3) printf("%s ", $i);
                   printf("\n");
               }' toppers.txt

               will print
                       Maths 99 Suresh Yadav
                       Maths 96 Lokesh Arora
                       Maths 99 Anup Mathur
                       Maths 98 Javed M. K. Akthar
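
      The sketch below prints NR and NF for two lines of toppers.txt. It
      also uses $NF, which is not covered above: since NF holds the number
      of fields, $NF refers to the last field on the line.

```shell
out=$(printf '%s\n' 'Physics 98 2006 Ramakant' 'Maths 98 2006 Javed M. K. Akthar' |
      awk '{ print "line " NR " has " NF " fields; last field is " $NF }')

printf '%s\n' "$out"
```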


4.4.1 Using BEGIN and END clauses in awk




          Usual programming tasks consist of
          1. Initializing some variables
          2. Reading inputs, performing some calculations and producing outputs
          3. Finally, generating some output based on the complete input set.

          The BEGIN clause of AWK lets you specify initializations, and the END
          clause lets you perform calculations based on the entire input.



          Example 6: Suppose you want to print the total number of toppers.

                 awk 'END {print "There are", NR, "toppers."}' toppers.txt

                 will print
                 There are 13 toppers.



4.4.2 Using variables in AWK

          AWK provides $0, $1, $2, .. etc. as fields. In addition, you can use your own
          variables as well for any calculations. You need not declare the variable.
          Simply using a variable is permitted.

          Example 7: Suppose we want to find out the average top marks for physics
          over the years.
                   awk '/Physics/ {marks += $2; count++}
                        END {print "The average top marks in physics are",
                                   marks/count}' toppers.txt

                   This will print
                   The average top marks in physics are 93.375


          In this example, "marks" and "count" are user defined variables. The
          count variable is needed because NR counts every line read, not just
          the physics lines. You can use almost any string of characters as a
          variable name in AWK, as long as the name doesn't conflict with some
          string that has a specific meaning to Awk, such as "print" or "NR" or
          "END".

          There is no need to declare the variable, or to initialize it. A variable handled
          as a string variable is initialized to the "null string", meaning that if you try to
          print it, nothing will be there. A variable handled as a numeric variable will be
          initialized to zero.
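
          The sketch below combines BEGIN, user defined variables, and END on
          the four physics lines of toppers.txt. The variable names (best,
          year, total, count) are our own, and the explicit initialization of
          best in BEGIN is shown only to illustrate the clause.

```shell
# The four physics lines from toppers.txt, written to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
Physics       92     2003     Abhay Malhotra
Physics       94.5   2004     Shriesh Jadhav
Physics       89     2005     Vandana Agarwal
Physics       98     2006     Ramakant
EOF

out=$(awk '
BEGIN     { best = -1 }                  # explicit initialization
/Physics/ { total += $2; count++         # both start out as zero
            if ($2 > best) { best = $2; year = $3 } }
END       { print "average " total / count
            print "highest " best " in " year }
' "$data")

printf '%s\n' "$out"
rm -f "$data"
```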



Self-Check Questions



5. Special AWK variable NF stands for
      (a) Next field, (b) New Format, (c) Number of fields, (d) Next Line
6. END is used in AWK to
      (a) Exit from AWK, (b) To do final calculations
7. You can use any variable in AWK but you need to declare it first
      (a) true, (b) false.




4.5   Running AWK programs kept in files
      As you must have noticed, AWK programs can easily be longer than one line.
      Typing long programs on command line is quite cumbersome. Moreover,
      whenever you create programs, you would want to keep them in files to be
      able to use them over and over again.

      The awk command can run a program that is stored in a file. The commands
      are written into a file, and awk is told to execute the commands from
      that file as follows:

      awk -f <awk program file name> <input file>

      Example 8: Suppose someone has a coin collection with gold and silver coins.
      Details of this collection are listed below. Each line gives the coin
      type, the weight in grams, and the year of making.

               bash>cat coin_collection_details.txt
               Gold          1             1945
               Gold          1             1952
               Silver        10            1948
               Gold          1             1973
               Gold          1             1973
               Gold          0.5           1945
               Gold          0.1           1933
               Silver        1             1943
               Gold          0.25          1921

      Now we can create an AWK program to print a summary of this coin collection
      as shown below:





             bash>cat show_coin_summary
             /Gold/   { num_gold++; wt_gold += $2 }        # Get weight of gold.
             /Silver/ { num_silver++; wt_silver += $2 }    # Get weight of silver.
             END { val_gold = 485 * wt_gold;               # Compute value of gold.
                   val_silver = 16 * wt_silver;            # Compute value of silver.
                   total = val_gold + val_silver;
                   print "Summary data for coin collection:";   # Print results.
                   printf("\n");
                   printf("  Gold pieces:              %2d\n", num_gold);
                   printf("  Weight of gold pieces:    %5.2f\n", wt_gold);
                   printf("  Value of gold pieces:     %7.2f\n", val_gold);
                   printf("\n");
                   printf("  Silver pieces:            %2d\n", num_silver);
                   printf("  Weight of silver pieces:  %5.2f\n", wt_silver);
                   printf("  Value of silver pieces:   %7.2f\n", val_silver);
                   printf("\n");
                   printf("  Total number of pieces:   %2d\n", NR);
                   printf("  Value of collection:      %7.2f\n", total); }



      Note that AWK programs allow you to put comments as well. See the first two
      lines of show_coin_summary file listed above.

      You can run this AWK program as shown below:

             bash>awk -f show_coin_summary coin_collection_details.txt

             The output of this run will be:
                 Summary data for coin collection:

                   Gold pieces:               7
                   Weight of gold pieces:     4.85
                   Value of gold pieces:     2352.25

                   Silver pieces:             2
                   Weight of silver pieces:  11.00
                   Value of silver pieces:    176.00

                   Total number of pieces:    9
                   Value of collection:      2528.25
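
      The same workflow can be tried end to end with the sketch below. The
      file names count_coins.awk and coins.txt are hypothetical, and the
      program is a cut-down version that only counts coins.

```shell
work=$(mktemp -d)

# Hypothetical program file: counts lines per coin type.
cat > "$work/count_coins.awk" <<'EOF'
/Gold/   { gold++ }      # one more gold coin
/Silver/ { silver++ }    # one more silver coin
END      { printf("gold=%d silver=%d total=%d\n", gold, silver, NR) }
EOF

# Three lines taken from the coin collection listing.
cat > "$work/coins.txt" <<'EOF'
Gold          1             1945
Silver        10            1948
Gold          0.5           1945
EOF

# -f tells awk to read the program from a file instead of the command line.
out=$(awk -f "$work/count_coins.awk" "$work/coins.txt")

printf '%s\n' "$out"
rm -rf "$work"
```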

4.6   Generating reports using AWK
      AWK programs can be used to quickly process text inputs and create various
      reports. Because AWK processes each record as fields, AWK is much more
      helpful in creating reports, compared to other Unix utilities, like sed.

4.6.1 The printf command of AWK

      While the print command is available in AWK, it is quite basic. Often
      more sophisticated formatting is needed, especially while generating
      reports. For such output formatting, a C-like printf command is
      available in AWK.

      printf uses format specifications like %s, %d, etc. for formatting output.
      %s prints a string
      %d prints an integer in decimal format
      %f prints a floating point number

      In addition, you can use the following escape sequences to control spacing
      \t to print a tab
      \n to print a new line

      Unlike print, printf does not add a newline at the end of its output, so
      \n must be given wherever a line should end.

      Note that tabs come in very handy, especially to print well aligned
      columns. The input text fields may vary in length. If you separate fields
      with spaces, the fields in the output may not align well. Use tabs to get
      better aligned output.

      Example 1: Printing the topper name and year for Maths, with spaces.

              awk '/Maths/ {printf("%s   %s\n", $4, $3); }' toppers.txt
              will print
              Suresh   2003
              Lokesh   2004
              Anup   2005
              Javed   2006

      You can see that the output columns are not aligned.

      Example 2: Printing the topper name and year for Maths, with tabs.

              awk '/Maths/ {printf("%s\t%s\n", $4, $3); }' toppers.txt
              will print
              Suresh  2003
              Lokesh  2004
              Anup    2005
              Javed   2006


      You can see that the output columns are well aligned now after using tab.
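
      Tabs keep columns aligned only while every value ends before the same
      tab stop. An alternative, sketched below, is a fixed-width format
      such as %-14s, which left-justifies the value and pads it to 14
      characters. The width 14 and the third input line (a made-up long
      name) are assumptions chosen for illustration.

```shell
out=$(printf '%s\n' 'Maths 99 2003 Suresh' 'Maths 99 2005 Anup' 'Maths 97 2007 Venkataraman' |
      awk '{ printf("%-14s%s\n", $4, $3) }')

printf '%s\n' "$out"
```

      Every year now starts in column 15, whatever the length of the name,
      as long as no name is longer than 14 characters.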




4.6.2 Format specifications in printf

       The printf command of AWK accepts many format specifiers. Moreover, for
       each format specifier, you can control how the output will be printed.
       This control helps in making the reports more readable.

       The table below lists how values will be printed when certain format specifiers
       are used:

               Format      Value             Result
               %s          "Hello"           "Hello"
               %10s        "Hello"           "     Hello"
               %-10s       "Hello"           "Hello     "
               --------------------------------------------
               %c          100               "d"
               %10c        100               "         d"
               %010c       100               "000000000d"
               --------------------------------------------
               %d          10                "10"
               %10d        10                "        10"
               %10.4d      10.123456789      "      0010"
               %10.8d      10.123456789      "  00000010"
               %.8d        10.123456789      "00000010"
               %010d       10.123456789      "0000000010"
               --------------------------------------------
               %e          987.1234567890    "9.871235e+02"
               %10.4e      987.1234567890    "9.8712e+02"
               %10.8e      987.1234567890    "9.87123457e+02"
               %f          987.1234567890    "987.123457"
               %10.4f      987.1234567890    "  987.1235"
               %010.4f     987.1234567890    "00987.1235"
               %10.8f      987.1234567890    "987.12345679"
               --------------------------------------------
               %g          987.1234567890    "987.123"
               %10g        987.1234567890    "   987.123"
               %10.4g      987.1234567890    "     987.1"
               %010.4g     987.1234567890    "00000987.1"
               %.8g        987.1234567890    "987.12346"
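
       A few rows of the table can be checked with the sketch below. The
       square brackets are added only to make the padding visible; they are
       not part of the format specifications.

```shell
out=$(awk 'BEGIN {
    printf("[%10s]\n",   "Hello")          # right-justified in 10 columns
    printf("[%-10s]\n",  "Hello")          # left-justified in 10 columns
    printf("[%010d]\n",  10)               # zero-padded integer
    printf("[%10.4f]\n", 987.1234567890)   # 4 digits after the point
    printf("[%.8g]\n",   987.1234567890)   # 8 significant digits
}')

printf '%s\n' "$out"
```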



Self-Check Questions
8. If you use tabs in printf, the output will not be aligned (true/false)
9. Tab is printed by putting (a) \T, (b) \tab, (c) \t, (d) tab
10. For printing a string using print, a format specification is needed (true/false)




11. If you use a printf with %10s and give "worlds" as argument to the printf, the
    output will come as "10worlds" (true/false).



4.6.3 Printing the fields in different order than input

       If you want to print some of the fields in an order that is different from the
       input, you can simply change the order of the $ variables in the print
       command. This powerful feature is often useful when creating reports as well.

       Example 3: To print the year followed by the marks for the 2006 toppers:

               awk '{if ($3 == 2006) print $3, $2; }' toppers.txt

               will print the following
                       2006 98
                       2006 88
                       2006 89
                       2006 98


       AWK features make it very useful to process data and print reports, especially
       when the data is arranged in columns like our toppers.txt example. Let's see a
       few examples before looking at more AWK features.

4.6.4 Creating simple reports

       Creation of simple reports is straightforward using AWK.

       Example 4: If you want to print the physics toppers for years prior to 2005, you
       can use the following command (note that the year is the third field in the
       input text):


               awk '/Physics/ {if ($3 < 2005) printf("%s %s %s\n", $3, $4, $5)}' toppers.txt > phy_toppers_before_2005.txt

               bash>cat phy_toppers_before_2005.txt
                     2003 Abhay Malhotra
                     2004 Shriesh Jadhav


       Example 5: If you want to print a simple yes/no answer whether the topper
       had more than 92 marks or not, you can use the following:

              awk '{if ($2 > 92)
                         printf("%s\t%s\tyes\n", $3, $1);
                     else
                        printf("%s\t%s\tno\n", $3, $1); }' toppers.txt > more_than_92.txt

               bash>cat more_than_92.txt
                    2003 Physics      no
                    2003 Chemistry    yes
                    2003 Maths        yes
                    2004 Physics      yes
                    2004 Chemistry    yes
                    2004 Maths        yes
                    2005 Physics      no
                    2005 Chemistry    no
                    2005 Maths        yes
              and so on.

      You can see how quickly awk can be used to generate reports like this.

      Example 7: For Maths toppers, if we want to put a colon between fields except
      in the names, we can use the following AWK command:

            awk '/Maths/ { for (i = 1; i <= NF; i++)
            {
                    if (i < 4) printf("%s:", $i);
                    else printf("%s ", $i);
            }
            printf("\n");
                             }' toppers.txt

            will print
                    Maths:99:2003:Suresh Yadav
                    Maths:96:2004:Lokesh Arora
                    Maths:99:2005:Anup Mathur
                    Maths:98:2006:Javed M. K. Akhtar


      Note that the special variable NF (the number of fields in the current line)
      defines the terminating condition of the loop. Using NF lets you work with
      data that has a variable number of columns: the same loop prints names that
      fit in 2 fields (e.g., Lokesh Arora) and names that need 4 fields (e.g.,
      Javed M. K. Akhtar).

      Also note that we have used if-else inside a for loop. The if-else ensures
      that colons are printed only after the first three fields and not inside the
      names.
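
      The role of NF can be sketched on a tiny stand-in for toppers.txt. The three
      sample lines and the shell variable names below are assumptions made for
      illustration; they follow the subject, marks, year, name layout used in the
      examples above.

```shell
# Hypothetical sample data in the same layout as toppers.txt:
# subject, marks, year, then a name of one or more words.
sample='Maths 99 2003 Suresh Yadav
Maths 98 2006 Javed M. K. Akhtar
Physics 92 2003 Abhay Malhotra'

# For each line, report how many fields AWK found (NF) and rebuild
# the name from field 4 up to the last field.
nf_report=$(printf '%s\n' "$sample" | awk '{
    name = $4
    for (i = 5; i <= NF; i++)
        name = name " " $i
    printf("%d fields, name=%s\n", NF, name)
}')

printf '%s\n' "$nf_report"
```

      Because the loop runs up to NF, the same program handles two-word and
      four-word names without any change.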

4.6.5 Field separator







      AWK works by reading one input record (one line) at a time and breaking it up
      into fields. By default, AWK uses whitespace (spaces and tabs) as the field
      separator. However, you may encounter tabular data that uses some other
      character as the separator. For example, your input data may look like the
      output of Example 7:

                      Maths:99:2003:Suresh Yadav
                      Maths:96:2004:Lokesh Arora
                      Maths:99:2005:Anup Mathur
                      Maths:98:2006:Javed M. K. Akhtar


      Here the colon (':') is the separator.

      In such cases, you can tell AWK which character to use as the field separator.
      The field separator is given as an optional argument to the awk command:

              awk -F<ch>

              e.g.,   awk -F:     tells AWK to use a colon as the separator
                      awk -F'|'   tells AWK to use a bar as the separator
                      awk -F'"'   tells AWK to use a double quote as the separator

      Example 8: If the input line is Maths:99:2005:Anup Mathur

              and AWK is run with -F: as an argument, then
              $1 will contain Maths
              $2 will contain 99
              $3 will contain 2005
              $4 will contain "Anup Mathur"


      Note that $4 here contains the entire name because the separator has been
      set to a colon, so the space inside the name no longer splits the field.
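
      The effect of -F can be checked directly from the shell. This is a small
      sketch: the colon-separated line is the one from Example 8, while the
      bar-separated line and the variable names are invented for illustration.

```shell
colon_line='Maths:99:2005:Anup Mathur'
bar_line='Agra|92|38'   # hypothetical bar-separated record

# With -F: the whole name lands in field 4.
name_field=$(printf '%s\n' "$colon_line" | awk -F: '{ print $4 }')

# Without -F the same line is split on whitespace, giving only 2 fields.
default_count=$(printf '%s\n' "$colon_line" | awk '{ print NF }')

# The bar must be quoted so that the shell does not treat it as a pipe.
bar_field=$(printf '%s\n' "$bar_line" | awk -F'|' '{ print $3 }')

printf '%s\n%s\n%s\n' "$name_field" "$default_count" "$bar_field"
```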

      Example 9: You can pipe the output of one awk into another awk as well. So
      we can pipe the output of the example 7 above into another AWK.








             awk '/Maths/ { for (i = 1; i <= NF; i++)
             {
                     if (i < 4) printf("%s:", $i);
                     else printf("%s ", $i);
             }
             printf("\n");
                     }' toppers.txt | awk -F: '{ printf("%-18s\t%d\n", $4, $3); }'

             will print
             Suresh Yadav            2003
             Lokesh Arora            2004
             Anup Mathur             2005
             Javed M. K. Akhtar      2006


4.6.6 Printing heading/heading row and summary/footer

      The BEGIN and END clauses can be used even to print headings and
      summary for reports, thus making the report more readable and attractive.

      Example 10: Here we will print the physics toppers with headers and will print
      a summary at the end.

             awk 'BEGIN {
                printf("Physics toppers details:\n");
                printf("-----------------------------------------\n");
                printf("Year\tMarks\tName of the topper\n");
                printf("-----------------------------------------\n");
             }
             /Physics/ {
                printf("%d\t%s\t%s\n", $3, $2, $4);
                sum += $2;
                count++;
             }
             END {
                printf("-----------------------------------------\n");
                printf("Avg top marks in physics were %f\n", sum/count);
                printf("-----------------------------------------\n");
             }' toppers.txt


      Note that the average divides by count, the number of Physics lines, and not
      by NR, which counts every line in the file. This will print

               Physics toppers details:
               -----------------------------------------
               Year    Marks   Name of the topper
               -----------------------------------------
               2003    92      Abhay
               2004    94.5    Shriesh
               2005    89      Vandana
               2006    98      Ramakant
               -----------------------------------------
               Avg top marks in physics were 93.375000
               -----------------------------------------




Self-Check Questions
12. AWK always prints the fields in the same order as they appear in the input
    (true/false).
13. AWK can generate reports containing only the input fields. No other items can
    be added.     (true/false).
14. Field separator in AWK is fixed and cannot be changed (true/false).




4.7   Miscellaneous features of AWK
4.7.1 Specifying search patterns in AWK

      As we have seen in several examples and in the AWK syntax, search patterns,
      along with their respective programs, can be used in AWK. So far we have
      used simple search patterns like the example below:

      awk '/Physics/ {print}' toppers.txt

      However, AWK also supports much more sophisticated patterns, as listed
      below.

      /The/ matches any line containing The.
      So this will match lines containing There, These, Them too.
      But it will not match lines containing the, these, them, etc. because AWK
      uses case-sensitive matching.

      /^The/ matches any line beginning with The.
      So this will match lines that have The, These, Them only at the beginning.

      /The$/ matches any line ending with The.

      /The\$$/ matches any line ending with The$ (the backslash makes the first $
      a literal character; the second $ anchors the match at the end of the line).

      /[Tt][Hh][Ee]/ matches any line containing THE, The, tHe, thE, etc.

      /^[a-zA-Z][a-zA-Z0-9_]*$/ matches lines that consist of a single identifier.

      /(^India)|(^Pakistan)/ matches lines beginning with India or Pakistan.

      You can even use complex regular expressions in AWK. Regular expressions
      can be built using the following characters:





        ?          matches zero or one occurrence of the
                   character before it
        +          matches one or more occurrences of the
                   character before it
        *          matches zero or more occurrences of the
                   character before it
        .          matches any single character

      For example, the following expression matches any line that contains only an
      optionally signed integer. The matched line cannot contain any other characters.
      /^[+-]?[0-9]+$/ matches signed integers.

      Example 1: A data file contains some text and some integer numbers. Here is
      the data file:

              bash>cat data_file.txt
              The number of loans given
              12399
              The number of loans fully repaid by now
              2893
              The number of defaulters
              129
              Defaulted amount (loss)
              -8929972
              Loss after adjusting procedural expenses
              -9288990.72

              awk '/^[+-]?[0-9]+$/ {print }' data_file.txt
              will print
              12399
              2893
              129
              -8929972
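
      The same building blocks (?, + and grouping with parentheses) can be
      combined to accept decimal numbers as well. The following is only a sketch:
      three of the sample lines come from data_file.txt, the line "12." is an
      invented edge case, and the pattern is one possible choice, not the only one.

```shell
# An optional sign, one or more digits, then optionally a dot followed
# by one or more digits. "12." has no digits after the dot, so it is
# rejected, and so is the line of plain text.
decimals=$(printf '%s\n' \
    'The number of loans given' \
    '12399' \
    '-8929972' \
    '-9288990.72' \
    '12.' |
    awk '/^[+-]?[0-9]+(\.[0-9]+)?$/ { print }')

printf '%s\n' "$decimals"
```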

4.7.2 Limiting the lines on which AWK would work

      By default, AWK works on every line of input. We have already seen that
      search patterns can limit the lines on which AWK works. In addition, you can
      limit AWK to work only on a block of input lines.

      /^India/,/^Pakistan/ will operate on a block of lines: it starts at a line
      beginning with India and ends at the next line beginning with Pakistan (both
      lines included).

      NR == 15 will operate only on the 15th line.
      NR==10,NR==25 will operate on lines 10 to 25.
      $1 == "India" will operate on lines where the first field is "India".
      $1 ~ /India/ will operate on lines where the first field contains India.






          You can even create complex conditions using the && and || operators,
          e.g.,
          ((NR >= 30) && ($1 == "India")) || ($1 == "Pakistan")
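
          The pattern forms listed above can be tried on a few lines of input. In
          this sketch the five sample lines and the variable names are invented,
          and the line numbers are scaled down to fit the small sample.

```shell
# Hypothetical input: a country name and a running number.
countries='Nepal 1
India 2
Bhutan 3
Pakistan 4
China 5'

# Range of patterns: from the line starting with India up to and
# including the line starting with Pakistan.
by_pattern=$(printf '%s\n' "$countries" | awk '/^India/,/^Pakistan/ { print $1 }')

# Range of line numbers: lines 2 to 3 only.
by_number=$(printf '%s\n' "$countries" | awk 'NR == 2, NR == 3 { print $1 }')

# Combined condition with && and ||.
combined=$(printf '%s\n' "$countries" |
    awk '((NR >= 4) && ($1 == "China")) || ($1 == "India") { print $1 }')

printf '%s\n--\n%s\n--\n%s\n' "$by_pattern" "$by_number" "$combined"
```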

          Example 2: If you know that your input data has some header text and some
          footer text and the data of your interest lies in between, then you should use
          such patterns to limit AWK to work only on data and not on the header and
          footer.

                 bash>cat data.txt
                 -------------------------------------------------
                 The weather report for 24.05.2007
                 -------------------------------------------------
                 City              Humidity          Max Temp
                 Agra              92                38
                 Delhi             93                39
                 Mumbai            98                34
                 Copyright CNN world
                 Data from 2pm IST

                 awk 'NR > 4 && NR < 8 { printf("%s\tTemp=%d\n", $1, $3); }' data.txt
                 will print
                 Agra           Temp=38
                 Delhi          Temp=39
                 Mumbai         Temp=34



Self-Check Questions
15. AWK search patterns are case-insensitive. (true/false)
16. /NASA/ will match only lines containing NASA. (true/false).
17. AWK will work on each line of input. There is no way to limit the scope.
    (true/false)



4.7.3 Built-in variables

          We have used many of the built-in variables of AWK, such as $0, $1, $2, etc.
          and NF, NR. In addition, AWK has a few other built-in variables, as listed
          below.

          Note that these variables are not read-only. That means that, during an AWK
          program's run, the program itself can change the value of a variable!

         • FS: Field separator. By default AWK uses whitespace (spaces and tabs) as
           the field separator, and we have seen the -F option that can be used on the
           command line to specify the field separator. The built-in variable FS holds
           the field separator that AWK is using and can be set from within the program.
         • RS: Record separator. By default AWK reads each line as one input record,
           which means the default record separator is the newline. However, you can
           use RS to change the record separator.
         • OFS: Stores the "output field separator", which separates the fields when
           AWK prints them. The default is a single space character.
         • ORS: Stores the "output record separator", which separates the output lines
           when AWK prints them. The default is a newline character.
         • FILENAME: Contains the name of the current input file.
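
          The separator variables listed above can be set in a BEGIN block before
          any input is read. The sketch below uses invented input and invented
          shell variable names; it only illustrates the four separators.

```shell
# FS and OFS: read colon-separated input, write it back with " | ".
# The two input lines are invented sample data.
fs_ofs=$(printf 'Agra:92:38\nDelhi:93:39\n' |
    awk 'BEGIN { FS = ":"; OFS = " | " } { print $1, $3 }')

# RS and ORS: treat ";" as the end of an input record and "," as the
# end of each output record.
rs_ors=$(printf 'a;b;c' |
    awk 'BEGIN { RS = ";"; ORS = "," } { print }')

printf '%s\n%s\n' "$fs_ofs" "$rs_ors"
```

          Note that OFS is inserted only where the print statement separates its
          arguments with commas.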

4.7.4 Passing arguments to AWK

          So far we have seen AWK programs and commands where the values were
          fixed. For example, consider the example from chapter 4 below, where a
          fixed value is used:

          Example 3: Print whether Maths toppers had more than 98 marks.

                 awk '/Maths/ { if ($2 > 98)
                            {
                                printf("In the year %d ", $3);
                                printf("%s had more than 98 marks.\n", $4);
                            }
                        else
                            {
                                printf("In the year %d ", $3);
                                printf("%s did not have more than 98 marks.\n", $4);
                            }
                        }' toppers.txt


          This will print
                   In the year 2003 Suresh had more than 98 marks.
                   In the year 2004 Lokesh did not have more than 98 marks.
                   In the year 2005 Anup had more than 98 marks.
                   In the year 2006 Javed did not have more than 98 marks.

          Now, you may be asked to print the same report but for 94 marks. In that
          case, you would need to copy the script and replace 98 by 94. Such copying
          must be avoided because (a) it creates multiple scripts doing nearly the
          same thing, (b) if you fix an error in one file you will need to fix it in
          all the files of the same type, and (c) copying and modifying is very error
          prone (what if the change from 98 to 94 is made in all places but is
          accidentally left out in one place?). Therefore, it is safer to write your
          scripts in a generic way. Consider Example 3 again, but generalized as
          Example 4 below:

      Example 4: Print whether Maths toppers had more than N marks.








             bash>cat report_script
             /Maths/ {
                 if ($2 > N)
                 {
                     printf("In the year %d ", $3);
                     printf("%s had more than %d marks.\n", $4, N);
                 }
                 else
                 {
                     printf("In the year %d ", $3);
                     printf("%s did not have more than %d marks.\n", $4, N);
                 }
             }


      It is invoked as

      awk -f report_script N=94 toppers.txt


      Note that we are passing N=94 on the command line. So if another report is
      needed with N=55, we need not copy or modify the file; we can simply pass
      N=55 on the command line instead.
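
      POSIX awk also accepts the -v option for the same purpose. Unlike an
      assignment placed among the file names, a value given with -v is already
      set when a BEGIN block runs. The sketch below uses two invented input lines
      and invented shell variable names.

```shell
# Hypothetical sample data: subject, marks, year, first name.
marks='Maths 99 2003 Suresh
Maths 96 2004 Lokesh'

# -v N=... sets the awk variable N before the program starts, so the
# same program text serves both thresholds.
above_94=$(printf '%s\n' "$marks" | awk -v N=94 '$2 > N { print $4 }')
above_98=$(printf '%s\n' "$marks" | awk -v N=98 '$2 > N { print $4 }')

# A value set with -v is already visible inside BEGIN.
in_begin=$(awk -v N=55 'BEGIN { print "N is " N }')

printf '%s\n--\n%s\n--\n%s\n' "$above_94" "$above_98" "$in_begin"
```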

4.7.5 Arrays and associative arrays in AWK

      Any user defined variable can work as an array in AWK. You can simply
      assign values with indexing. For example,

      Field[1] = $1
      Field[3] = $3

      AWK also supports associative arrays.

      For example, if $i contains the name of a city and $j contains the city's
      temperature, you can store this information in an associative array.

      Temperature[ $i ] = $j;
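
      An associative array is usually filled while reading the input and walked
      with a for-in loop in the END block. The sketch below keeps the highest
      temperature per city. The readings and all names are invented, and the
      output is passed through sort because AWK does not guarantee the order in
      which a for-in loop visits the keys.

```shell
# Invented readings: city name, then temperature.
readings='Agra 38
Delhi 39
Agra 41'

# Keep the highest temperature seen for each city, keyed by city name.
# "$1 in max" tests whether the city already has an entry.
max_temps=$(printf '%s\n' "$readings" | awk '
    { t = $2 + 0
      if (!($1 in max) || t > max[$1]) max[$1] = t }
    END { for (city in max) print city, max[city] }' | sort)

printf '%s\n' "$max_temps"
```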

4.7.6 String functions in AWK

      If you place multiple strings side by side, they will be joined.

       a = "DTU" "Delhi" # a will become "DTUDelhi".


      length(str) function returns the length of a given string.

      substr(str, startIndex, length) function takes out a substring.
      substr("DTUDelhi", 4, 5) will return "Delhi".

      Note that index starts from 1, not 0.

      index(str, searchStr) gives the index of searchStr in str, or 0 if it is
      not found.
      index("DTUDelhi", "Delhi") will return 4.
      index("DTUDelhi", "DEI") will return 0.

      split(str, array [,separator]) splits a string by the separator and fills the
      pieces into an array. If no separator is given, the field separator is used.
      split("mera bharat mahan", slogan) will put
              slogan[1] = "mera"
              slogan[2] = "bharat", etc.
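
      The string functions can be tried without any input file by placing them in
      a BEGIN block. This is a minimal sketch; note that the function for taking
      out a substring is spelled substr in AWK, and the shell variable name
      string_demo is invented.

```shell
string_demo=$(awk 'BEGIN {
    a = "DTU" "Delhi"                 # concatenation gives "DTUDelhi"
    print length(a)                   # number of characters in a
    print substr(a, 4, 5)             # 5 characters starting at position 4
    print index(a, "Delhi")           # position of "Delhi" inside a
    print index(a, "DEI")             # 0 because "DEI" does not occur
    n = split("mera bharat mahan", slogan, " ")
    print n, slogan[1], slogan[3]     # split returns the number of pieces
}')

printf '%s\n' "$string_demo"
```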



Self-Check Questions
18. AWK provides a built in variable for field separator (true/false).
19. Built in variables are read only (true/false).
20. Variables passed to AWK are accessed as $1, $2, etc. (true/false)
21. AWK does not support complex structures but supports associative arrays
    (true/false).



4.7.7 A few interesting, complex examples

      A few interesting examples are listed below. These exemplify the power of
      AWK.

      Example 5: Counting non blank lines in a file:
      awk 'NF != 0 {++count} END {print count}' input_file.txt

      Example 6: Computing avg size of files in a directory

      ls -l | awk 'NR!=1 {s+=$5} END {print "Average: " s/(NR-1)}'

      Example 7: Print the first ten Fibonacci numbers:

      awk 'BEGIN {a=1;b=0; while(++x<=10){print a; t=a;a=a+b;b=t}; exit}'

      Example 8: Sometimes we may repeat words unintentionally, as in "When I
      was was going there". Detecting these manually is difficult, but we can
      write an AWK program to do this!

                BEGIN { dups = 0; w = "xy-zzy" }
                    { for (n = 1; n <= NF; n++)
                        { if (w == $n) { print w, "::", $0; dups = 1 }
                          w = $n }
                    }
                END { if (dups == 0) print "No duplicates found." }





4.8        Summary
           Awk is a very powerful utility in Unix. It helps in scripting and report
           generation.


4.9        Answers to the self check questions
      1    true
      2    true
      3    false
      4    true
      5    (c)
      6    (b)
      7    (b)
      8     false
      9    (c)
      10   false
      11   false
      12   false
      13   false
      14   false
      15   false
      16   false
      17   false
      18   true
      19   false
      20   false
      21   true


4.10 Terminal Questions

      1. Take the toppers.txt of this chapter. For each year and subject, print the first
         name of the topper, marks and then year.
      2. Do the same as in the question above, but now print the complete name of
         the topper followed by marks and then year.
      3. Print the chemistry toppers' marks, years and names for even years.
      4. Print the years in which the toppers scored >= 97 marks.
      5. The input contains name and phone number records. To simplify, assume there
         is only one name (first name) and only one phone number per name. Use
         associative arrays to store the numbers and names, and print them at the end.
      6. Upgrade example 8 to also print the line number where the repeated word
         occurs.
      7. See the AWK syntax. We have used only one pattern and its program in our
         examples. Try using multiple patterns and their corresponding programs and
         see the outputs.
      8. Generalize the coins example of chapter 4 by passing the per-gram values
         of gold and silver in place of the hard-coded values used in that example.




Unix shell program training

  • 1. Winter Training, December 2011 Unix and Shell Programming Department of COE and SE, Delhi Technological University Instructor: Divyashikha Sethia
  • 2. Contents UNIT 1: INTRODUCTION TO UNIX ..........................................................................3 UNIT 2: SHELL SCRIPTING..................................................................................... 63 UNIT 3: ADVANCED SHELL SCRIPTING, SED, AND AWK .................. 143
  • 3. UNIT 1: INTRODUCTION TO UNIX 1. THE UNIX OPERATING SYSTEM – AN OVERVIEW.................................7 2. UNIX COMMANDS................................................................................................... 21 3. UNIX FILE SYSTEM ................................................................................................ 33 4. THE VI TEXT EDITOR ............................................................................................ 45
  • 5. COE Unit 1, Lesson 1 LESSON 1 T HE UNIX OPERATING S YSTEM – AN OVERVIEW 1. THE UNIX OPERATING SYSTEM – AN OVERVIEW .................................................7 1.0 OBJECTIVES ...............................................................................................................7 1.1 INTRODUCTION ...........................................................................................................7 1.2 INTRODUCTION TO THE COMPUTERS .........................................................................7 1.2.1 Typical hardware components of a computer.................................................8 1.3 OPERATING SYSTEM..................................................................................................8 1.3.1 Virtual Memory.....................................................................................................9 1.4 UNIX OPERATING SYSTEM .................................................................................... 10 1.4.1 History of UNIX ................................................................................................. 10 1.4.2 Importance of UNIX ......................................................................................... 11 1.5 UNIX OPERATING SYSTEM – ATTRIBUTES AND COMPONENTS ............................ 12 1.6 STARTING WITH UNIX............................................................................................. 14 1.7 CHANGING YOUR PASSWORD ................................................................................ 15 1.8 ENTERING COMMANDS IN THE UNIX SYSTEM ....................................................... 16 1.8.1 Command Options and Arguments ............................................................... 17 1.9 SUMMING UP........................................................................................................... 17 1.10 ANSWERS TO THE SELF CHECK QUESTIONS ........................................................... 
17 1.11 TERMINAL QUESTIONS............................................................................................. 18 1.12 REFERENCES .......................................................................................................... 18
  • 7. COE Unit 1, Lesson 1 1. The UNIX Operating System – An Overview Use and influence of computers has been steadily increasing in the last few decades. Today, computers play a pivotal role in all walks o f life. An operating system (OS) is a core component of the computer system. An operating system lets a computer function as multi-user, multitasking and multithreading environment, thus augmenting the power of the computer. UNIX is an operating system that offers its users all these capabilities along with numerous other features. In this lesson we will look upon the features and components of the UNIX system that make it very useful and popular. In the subsequent lessons we will explore the features and components of UNIX in more details. 1.0 Objectives After going through this lesson, you will be able to  Understand the concepts of the Operating System  Understand what is the UNIX Operating Systems  Understand the importance and popularity of UNIX Operating System  Understand how to start working on a UNIX machines 1.1 Introduction In the modern age, we have seen the computer doing wonders, from children playing games to the scientists launching satellites; we can clearly see that the computers are playing a important role. It is the operating system that has made the computing in the modern world possible and efficient. 1.2 Introduction to the computers Unlike calculator, a computer carries out user specified tasks. An inherent power provided by a computer is that it can be programmed to do variety of tasks. Computers are mostly general purpose computers in the sense that a computer can be used to play a game and the same computer can be used to perform a circuit simulation. A computer consists of hardware and software. A computer can be defined as a programmable machine which responds and executes a list of instructions. These lists of instructions are called programs. 
The hardware components are the physical components and software is data o r instruction. 7
  • 8. COE Unit 1, Lesson 1 1.2.1 Typical hardware components of a computer Hardware components in computer are what you can see and touch.  Memory: Enables the computer to store the temporary data and instructions. This is used in the computer during the execution of various instruction sets. While evaluating the following expression, the intermediate results are stored in memory Sum = 2 + 1 + 3 * 4  Mass storage devices: These are used for the bulk storage of data, such as, disk drives and tape drives.  Input devices: Interface to take the instructions from the user to the computer. Commonly used input devices are keyboard, mouse, web camera, etc.  Output Devices: Display the results of the instruction processing done by the computer. Commonly used are display monitors and the printers.  Central Processing Unit (CPU): The brain of the computer in which all the processing is done. It reads the data from memory or input and executes the instructions. CPU consists of ALU (Arithmetic Logic Unit) and CU (Control Unit). ALU is responsible for all calculations and CU is responsible for getting instructions and data for execution. Working with the hardware components alone is very difficult because their controls are very cryptic. Instead, software components are used to drive the hardware components. The operating system is also one such software. 1.3 Operating System An Operating System (OS) is an important program that runs on the computer. An operating system performs the very basic tasks, such as recognizing inputs from the user, sending outputs to the display, keeping track of file and directories on the disk, and controlling the peripheral devices such as the disk drivers and printers. 8
  • 9. COE Unit 1, Lesson 1
    The OS also works as a traffic cop: it makes sure that different programs and users running at the same time do not interfere with each other. The operating system is also responsible for security and for blocking unauthorized users. Operating systems can be classified as follows:
      - Multi-user: Allows multiple users to use the computer at the same time.
      - Multiprocessing: Supports running parts of a program in parallel on more than one CPU.
      - Multitasking: Allows multiple programs to run concurrently on a single CPU.
      - Multithreading: Allows different parts of a single program to run concurrently.
    Operating systems provide a platform on which other programs, called application programs, can run. Application programs must be written to run on a particular operating system. Your choice of operating system therefore determines, to a great extent, the applications you can run. For PCs, the popular operating systems are DOS, OS/2, Windows and Linux.
    1.3.1 Virtual Memory
    Programs that run on a computer may need more memory than is physically available on that computer. Many operating systems give the user the illusion of a much larger memory. This is done by loading only part of the program and its data into physical memory: only the parts needed for current execution are brought in. As a result, bigger programs can be run even if physical memory is small.
  • 10. COE Unit 1, Lesson 1
    Self-Check Questions
      1. A ____________ is a prerecorded set of instructions, which is executed by the computer to perform some task.
      2. A computer is a specific-purpose machine that cannot be tweaked to perform other tasks. (True/False)
      3. The operating system keeps the temperature inside the computer down, so that it functions properly. (True/False)
      4. A ___________ system allows running parts of a program in parallel, on more than one CPU.
      5. In a _______________ system, a large number of users can use the system concurrently.
      6. The ____________ memory is an imaginary memory which is used by the operating system to get a larger address space.
    1.4 UNIX Operating System
    1.4.1 History of UNIX
    The UNIX operating system found its beginnings in MULTICS, which stands for Multiplexed Information and Computing Service. The MULTICS project began in the mid 1960s as a joint effort by General Electric, the Massachusetts Institute of Technology and Bell Laboratories. In 1969 Bell Laboratories pulled out of the project. One of the Bell Laboratories people involved in the project was Ken Thompson. He liked the potential MULTICS had, but felt it was too complex and that the same thing could be done in a simpler way. In 1969 he wrote the first version of UNIX, called UNICS. UNICS stood for Uniplexed Information and Computing Service. Although the operating system has changed, the name stuck and was eventually shortened to UNIX.
    Ken Thompson teamed up with Dennis Ritchie, who wrote the first C compiler. In 1973 they rewrote the UNIX core (called the kernel) in C. The following year a version of UNIX known as the Fifth Edition was first licensed to universities. The Seventh Edition, released in 1979, served as a dividing point for two divergent lines of UNIX development. These two branches are known as SVR4 (System V Release 4) and BSD.
    Ken Thompson spent a year's sabbatical at the University of California at Berkeley. While he was there, two graduate students, Bill Joy and Chuck Haley, wrote the first Berkeley version of UNIX, which was distributed to students. This resulted in the source code being worked on and developed by many different people. The Berkeley version of UNIX is known as BSD, Berkeley
  • 11. COE Unit 1, Lesson 1
    Software Distribution. From BSD came the vi editor, the C shell, virtual memory, sendmail, and support for TCP/IP.
    1.4.2 Importance of UNIX
    During the past 25 years the UNIX OS has evolved into a powerful, flexible, versatile and robust operating system. It serves as the operating system for a variety of computers: single-user personal computers, engineering workstations, multi-user microcomputers, minicomputers, mainframes and supercomputers, as well as special-application devices. There are approximately 20 million machines now running UNIX and more than 100 million users, and this popularity is expected to grow further. The success of UNIX is due to many factors, including its portability to a wide range of machines, its adaptability and simplicity, the wide range of tasks it can perform, its multi-user and multitasking nature, and its suitability for networking. What follows is a description of the features that have made the UNIX system so popular.
      - Multi-user and multitasking abilities: The UNIX OS allows a single computer to be used by many users. It is also a multitasking system, that is, it allows more than one application to run on the same computer at the same time.
      - Powerful command set: The UNIX OS provides a consistent and powerful set of commands, which has made it particularly useful for technical people.
      - Combining commands: UNIX provides constructs such as pipes and redirection, which enable users to build their own powerful utilities out of existing UNIX commands.
      - Excellent environment for networking: UNIX offers programs and utilities that provide the services needed to build networked applications, the basis for distributed, networked computing. With networked computing, information and processing are shared amongst different computers in a network. This is useful in client-server computing, where the machines on the network can be clients and servers at the same time. UNIX systems have served as the base for the development of internet services and the growth of the internet.
      - Portability: The UNIX system is far easier to port to new machines than other operating systems. It is portable to almost any computer because it is written almost entirely in the C programming language.
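The "Combining commands" feature above can be illustrated with a short script. This is only a minimal sketch: the file names (fruits.txt, unique.txt) and the sample data are made up for illustration, and the script works in a temporary directory created with mktemp.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Redirection: '>' sends the output of printf into a file instead of the terminal.
printf 'mango\napple\nbanana\napple\n' > "$workdir/fruits.txt"

# Pipe: sort passes its output to uniq, which drops adjacent duplicate lines.
sort "$workdir/fruits.txt" | uniq > "$workdir/unique.txt"

# '<' redirects the file to the standard input of wc, so wc prints only the number.
# tr removes the padding spaces that some versions of wc add.
count=$(wc -l < "$workdir/unique.txt" | tr -d ' ')

cat "$workdir/unique.txt"
echo "distinct fruits: $count"

# Clean up the temporary directory.
rm -r "$workdir"
```

Here three small commands (sort, uniq and wc) are combined into a utility that counts distinct lines; none of them had to be modified to work together.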
  • 12. COE Unit 1, Lesson 1
    1.5 UNIX Operating System – Attributes and Components
    The UNIX operating system is made up of several major components. These include the commands and user programs, the file system, the shell, and the kernel.
  • 13. COE Unit 1, Lesson 1  The Commands and User Programs UNIX provides a number of built-in commands and in addition user programs can also run.  The File System The basic unit that stores information in the UNIX system is called a file. The UNIX file system provides a logical method of organizing files. Files are organized in a hierarchical file system where the files are grouped together in a directory. Example: Hierarchical File Structure /dtu/COE_Course/COE_101/schedule Here ―dtu‖ is the parent directory which is in ‗/‘ root and other directories are in it An important simplifying feature of the UNIX system is the wa y it treats the files. For example, physical devices are treated as files, this permits the same command to work for an ordinary file or a device i.e. same command can be used to write to a file and printer.  The Shell and shell scripts The shell is the command interpreter in the UNIX operating system. It reads the user specified commands and interprets them as requests to execute a program or a set of programs, which it then arrange to carry them out. Shell also provides a programming language. Shell scripts are covered in subsequent chapters of this unit.  The kernel The kernel is the core of the OS. The kernel interacts directly with the hardware through a set of programs called the device drivers that are built into the kernel. It provides the set of services that can be used by the other programs; also it safeguards these programs from hardware layers. The major functions of the kernel are to maintain the file system, manage memory, access control to the computer, and handle the interrupts (these are the signals to terminate the processes, ctrl + C is a common example)., error handling, I/O handling which enables the computer interaction with the peripheral devices such as printers, monitors, storage devices, etc.). Programs use kernel through the system calls. 
For example, if the user wants some file to be opened then the program generates a system call to open the directories and then the files. The figure below shows the relationship amongst various components of the UNIX file system. 13
  • 14. COE Unit 1, Lesson 1
    [Figure: layered diagram with the labels The User, Commands, The Shell, The Kernel, Hardware. Caption: Components of UNIX operating system (shown in gray).]
    Self-check Questions
      7. UNIX is a multi-user OS and also possesses multitasking abilities. (True/False)
      8. The first version of the UNIX Operating System was known as _____________.
      9. The file system in a UNIX Operating System is a hierarchical structure. (True/False)
      10. The ____________ in a UNIX Operating System interacts with the hardware and executes the user commands and programs.
      11. The command interpreter in the UNIX system is called ___________.
      12. Programs in UNIX systems use __________ calls to ask the kernel to perform tasks.
    1.6 Starting with UNIX
    This section explains how to log into a UNIX system and how to change your password on it. We will touch on the different types of system configuration and how to log on to systems that have these configurations.
      - Selecting a login
    Every UNIX user on a multi-user system is recognized by a login name, which is the only identity the user has on the system. A login name must be set up before you can log on to a multi-user or a single-user UNIX system. UNIX provides excellent built-in security, so no users are permitted unless they are identified. For this identification, each user has a login ID.
  • 15. COE Unit 1, Lesson 1
    The login ID is typically allocated by an authority known as the system administrator. The system administrator is also responsible for adding new users to the system and for providing them with a login name, a password and an initial work environment on the computer. UNIX initially shows a login prompt, at which the user types the login ID. The password prompt then appears. After you type the password correctly, you are logged into the system. The example below shows this process.
        login: akash
        password:
    Here "akash" is the user's login name. Note that, to keep the password secure, it is not displayed when you type it.
      - Connecting to the UNIX System
    In a multi-user system you have to ask the system administrator how you can connect to the system using your PC or terminal. Your PC can be wired directly to the computer or connected via a LAN.
      - Direct Connect: A method of connecting to a UNIX machine when there is a single machine.
      - Dial-in Access: You can dial in to the UNIX network using a modem and use a terminal emulator to get the UNIX prompt.
      - Local Area Network (LAN): A LAN follows the client-server model. Connect to the server from the client workstation and use the UNIX capabilities.
      - IP Networks: Using an IP network such as the internet, one can connect to remote machines using the telnet capability of UNIX.
    1.7 Changing Your Password
    Your password is very important information that you must not share with anyone. You must change it regularly (say once in 2 months) and you should remember it (you must not write it on paper). Your password should contain 6 to 8 characters and should not simply be your name, your date of birth, etc. It should also contain at least one non-alphabetic character (for example a number). To change the password of your login, use the passwd command.
        bash> passwd
        passwd: Changing password for sushobhit
        Old password:
        New password:
        Re-enter new password:
        bash>
  • 16. COE Unit 1, Lesson 1
    There is a simple scheme to create complex passwords and still remember them. Take the first letters of the words in a line of your favorite poem or song and add a number or symbol. Here is an example: say you pick the line "Twinkle twinkle little star". Take the first letters to make the string Ttls. Suppose your favorite symbol is = (equal sign) and your favorite number is 2; append these to the string to make the complex password Ttls2=. For anyone else it will be too hard to find out, while it is very easy for you to remember.
    NOTE: If you forget your password it cannot be retrieved, even by the system administrator. The only remedy in such cases is for the system administrator to reset the password.
    Self-Check Questions
      13. ________________ is the program which is used to connect to the UNIX system from a remote system.
      14. ___________________ in a multi-user system is the person who is responsible for maintaining the system.
      15. Pick the odd one out. To connect to a UNIX system one of the following can be used:
          a. Dial-in access
          b. IP Networks
          c. LAN
          d. System Calls
      16. If you forget your password, the system administrator can give you permissions. (True/False)
    1.8 Entering Commands in the UNIX System
    UNIX provides numerous commands. When the user types a command at the UNIX prompt, the shell invokes the program for that command. The command program can make many system calls, and these calls in turn interact with the hardware.
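To see how the shell finds "the program for the command", the following minimal sketch asks the shell how it would resolve two command names. It goes slightly beyond the text above: the `command -v` utility and the PATH variable are standard shell features that this lesson does not introduce, and the variable names ls_path and cd_kind are chosen here only for illustration.

```shell
#!/bin/sh
# 'command -v' asks the shell how it would resolve a command name.
# For an external program such as ls it prints the path of the executable file.
ls_path=$(command -v ls)
echo "ls is run from: $ls_path"

# For a command built into the shell, such as cd, it prints just the name,
# because there is no separate program file to start.
cd_kind=$(command -v cd)
echo "cd resolves to: $cd_kind"

# External programs are searched for in the directories listed in PATH, in order.
echo "search path: $PATH"
```

The exact path printed for ls differs between systems, so no particular output is shown here.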
  • 17. COE Unit 1, Lesson 1
    1.8.1 Command Options and Arguments
    The UNIX system has a standardized command syntax that applies to almost all UNIX commands. Every command has some base functionality, and additional functionality is selected through command-line options and arguments. For example, the ls command lists the contents of a directory.
        bash> ls
        README 2134.tar.gz game_scores game_schedule
    Now let's use the ls command with an option.
        bash> ls -l
        -rw-r--r-- 1 anmol  friends 10777 Mar 30 16:26 README
        -rw-r--r-- 1 achint friends 21483 Feb 28 17:39 2134.tar.gz
        drwxr-xr-x 2 amit   friends  4096 Dec 12 16:41 game_scores
        drwx------ 3 arat   friends  4096 May 10  2006 game_schedule
    This example shows the -l option of the ls command, which produces the long format of the ls output.
    Another frequently used command is man, which displays the manual pages of commands.
    1.9 Summing Up
    An operating system is the most important software in any computer, as it fills the communication gap between a user and the underlying hardware. The UNIX operating system, with its unique qualities and adaptability, is a popular and powerful operating system nowadays. In the chapters to follow we will explore the powers of UNIX in some detail.
    1.10 Answers to the self check questions
      1. program
      2. False
      3. False
      4. multiprocessing
      5. multi-user
      6. virtual memory
      7. True
      8. UNICS
      9. True
      10. Kernel
      11. Shell
      12. System calls
  • 18. COE Unit 1, Lesson 1
      13. telnet
      14. system administrator
      15. d
      16. False
    1.11 Terminal questions
      1. List and briefly explain the components of the UNIX operating system.
      2. Which features of the UNIX operating system account for its popularity amongst users?
      3. Explain briefly the possible ways to log onto a UNIX system.
    1.12 References
      1. http://www.uwsg.iu.edu/usail/concepts/unixhx.html
  • 19. COE Unit 1, Lesson 2
    LESSON 2 UNIX COMMAND
    2. UNIX COMMANDS
      2.0 OBJECTIVES
      2.1 INTRODUCTION
      2.2 THE COMMANDS CLASS
      2.3 CONNECTING TO UNIX
        2.3.1 telnet command
        2.3.2 rlogin command
      2.4 FILE MANAGEMENT
        2.4.1 mv command
        2.4.2 cp command
        2.4.3 rm command
      2.5 A COMMUNICATION RELATED COMMAND - FTP
      2.6 INFORMATION
        2.6.1 man command
        2.6.2 du – Disk usage
        2.6.3 df – Disk free
        2.6.4 quota
        2.6.5 who – Finding out who is logged on
      2.7 PRINTING
        2.7.1 lpr – Printing
        2.7.2 lprm – Removing a printing job
        2.7.3 lpq – Checking the printing queue
      2.8 PROCESS CONTROL
        2.8.1 ps – Finding the process
        2.8.2 & - Running process in background
        2.8.3 Cntrl-z – Suspending a processes
        2.8.4 Jobs – Finding the process in background
        2.8.5 Kill – Killing a process
        2.8.6 nice – reducing the priority of process
      2.9 MISCELLANEOUS COMMANDS
  • 20. COE Unit 1, Lesson 2
        2.9.1 alias / unalias command
        2.9.2 cal (calendar) command
        2.9.3 clear command
        2.9.4 crontab command
        2.9.5 csh command
        2.9.6 history command
        2.9.7 date command
        2.9.8 echo command
        2.9.9 grep command
        2.9.10 unset command
        2.9.11 tar command
        2.9.12 tee command
        2.9.13 touch command
      2.10 SUMMING UP
      2.11 ANSWERS TO THE SELF-CHECK QUESTIONS
      2.12 TERMINAL QUESTIONS
  • 21. COE Unit 1, Lesson 2
    2. Unix Commands
    UNIX, like any other operating system, provides its users with a set of commands with which they can perform the tasks they want. UNIX provides a huge variety of commands. In this lesson we will read about the usage of many of them.
    2.0 Objectives
    After going through this lesson, you will be able to
      - Use UNIX commands to perform tasks
      - Understand how to send and receive mail on UNIX
      - Understand the basic file management commands
      - Understand the information and communication commands of UNIX
    2.1 Introduction
    UNIX provides a number of commands. For ease of understanding we can divide these commands into various categories.
    2.2 The Commands class
    UNIX commands can be grouped into a few broad classes:
      - Starting and Ending: The commands used to log on to the UNIX system, or to begin working on it.
      - File Management: The file is the basic data-holding entity in UNIX systems. There is a set of commands for maintaining the file system, so that the data stored in files is kept secure, updated and maintained.
      - Communication: UNIX provides communication at many levels, including mail, writing messages, exchanging files, etc.
  • 22. COE Unit 1, Lesson 2  Information UNIX provides a number of commands to get information about the system like who are logged in, how much disk space is available, etc.  Printing In UNIX user can give the print command and also can monitor the status of the job or can remove the job if required from the queue.  Job and Process control As there are lots of processes which are going on in a UNIX system, it is sometimes required to get the information related to the user jobs running on the system. For this purpose UNIX provides a set of commands to monitor, kill, prioritize and resuming the jobs. In the present chapter we will look at some of these commands in detail and the other commands will be discussed in the chapters to follow. 2.3 Connecting to UNIX Before we learn anything in details the very first thing we will look at is the process that a user has to adopt to start with the UNIX system. 2.3.1 telnet command The telnet command is used for logging into a remote system. The telnet command presents the same login and password prompts as done on a local system. 2.3.2 rlogin command The rlogin command is used to connect to a remote computer. It is comparatively easier to use then telnet. Here is the syntax of rlogin command: rlogin [-l username] hostname In this the username is taken by default the username of the current user. Hostname is the name of the UNIX machine that is to be logged on. 2.4 File Management A file is a basic data storage entity in a UNIX system. There is a set of commands that can be used to maintain this system. We will be having an introductory flavor of these commands in this chapter with the complete discussion being taken up in the chapter on file system. Readers are advised to have a look at the man pages of each of these commands and try to understand what exactly these commands are used for. 22
  • 23. COE Unit 1, Lesson 2
    2.4.1 mv command
    The mv command moves a file. It can also be used to rename a file. Here is a simple example of the mv command.
        bash> ls
        tempPresentation.txt
        bash> mv tempPresentation.txt finalPresentation.txt
        bash> ls
        finalPresentation.txt
    2.4.2 cp command
    The cp command copies a file. Here is a simple example of the cp command.
        bash> ls
        tempPresentation.txt
        bash> cp tempPresentation.txt finalPresentation.txt
        bash> ls
        tempPresentation.txt finalPresentation.txt
    2.4.3 rm command
    The rm command removes a file. Here is an example of the rm command.
        bash> ls
        tempPresentation.txt finalPresentation.txt
        bash> rm tempPresentation.txt
        bash> ls
        finalPresentation.txt
    2.5 A communication related command - ftp
    The ftp (file transfer protocol) command is used for copying files from one computer to another. While mv and cp work within a single system, you may sometimes need to get files from another system; ftp can be used for that. The example below shows how ftp can be used to connect to a remote machine. In this example, user 'achint' gets a file from the machine mitserv.
  • 24. COE Unit 1, Lesson 2
        bash> ftp mitserv
        Connected to mitserv
        Name: achint                     # User types his login id
        331 Please specify the password.
        Password:                        # password will not be visible
        230 Login successful.
        Remote system type is UNIX.
        ftp> get myPresentation.txt      # Now you are in ftp. See the prompt
        250KB data transfer successful
        ftp> quit
        bash>                            # You are out of ftp now.
    The ftp prompt provides a limited set of commands, as listed below:
      - bin – Changes the file transfer type to support binary image transfer.
      - get – Used to 'get' a file from the remote machine
      - mget – Gets multiple files
      - ls – Used to list the contents of a directory on the remote machine
      - cd – Used to change directories on the remote machine
      - pwd – Used to get the present working directory on the remote host
      - lpwd – Gives the current working directory on the local host.
    2.6 Information
    Information about other users, disk quota and other aspects of the system can be retrieved using some of the UNIX commands. This section discusses some of these commands.
    2.6.1 man command
    UNIX traditionally provides manual pages (called 'man' pages) for all the built-in commands and for system calls. You can learn a lot by referring to the manual pages for commands. The general syntax of the command is
        man [-] [-k keywords] topic/command
    The example below shows part of the manual page of the 'du' command.
        bash> man du
  • 25. COE Unit 1, Lesson 2
    2.6.2 du – Disk usage
    This command is used to find out how much disk space is currently occupied by the files and directories of the user.
    2.6.3 df – Disk free
    The df command tells how much free disk space is left for use.
    2.6.4 quota
    This command shows how much disk space the user's files occupy on the file system and the limit (quota) set on that usage.
    2.6.5 who – Finding out who is logged on
    The who command displays information such as the user names, terminals and login times of the users logged on to the computer. The general syntax of the command is:
        who [-q] [am i]
    The following example shows the output of the who command.
        bash> who
        singhs   :0      May 28 14:05
        achint   pts/0   May 28 14:06 (lx-ptiwari:0.0)
        anmol    pts/1   May 28 14:12 (lx-ptiwari:0.0)
    Self-Check Questions
      1. Which of the commands below are used to connect to remote computers?
          i. telnet
          ii. rlogin
          iii. rm
      2. It is not possible to log on to another machine with another username by any means. (True/False)
      3. If some files need to be transferred from a remote location to the current location, we can use the ________________ command for this purpose.
      4. If a user needs to know the usage of the write command, he can use the ____________ command to find out how the command works.
      5. There is a restriction on the disk space that a user or a group can use on a UNIX system, and this restriction can be found by using the command _____________.
      6. To know how much total disk space your files and directories have taken, issue the __________ command.
  • 26. COE Unit 1, Lesson 2
      7. On a multi-user system, more than one person may be logged onto a machine, and this sometimes chokes the machine. To find out who is logged onto the machine we can use the ______________ command.
    2.7 Printing
    UNIX provides commands for printing documents. Additionally, it is possible to control the printer queue and to cancel a printing job if required.
    2.7.1 lpr – Printing
    This command can be used to print the text in a file. A printer can be specified; otherwise the print job is sent to the default printer set by the user.
    2.7.2 lprm – Removing a printing job
    The lprm command can be used to cancel print jobs that are queued or printing. It can cancel jobs on a specified printer or on the default printer.
    2.7.3 lpq – Checking the printing queue
    This command shows the status of the printer queue on the named printer. Jobs queued on the default destination are shown if no printer or class is specified on the command line.
    2.8 Process Control
    When you run a program in UNIX, a copy of the program starts to run. This running copy is called a process. The concept of a process is fundamental to the UNIX OS, so you should understand processes in detail. If you run the same command twice, a new process is started each time. Every process is identified by a unique process ID, and this ID can be used to refer to the process or to perform further operations on it, such as killing it. We will now look at the commands that can be used to control processes.
    2.8.1 ps – Finding the process
    This command is used to list all the processes running on the machine.
        bash> ps -ef
        PID   PPID  User    Process
        …
        233   230   achint  ls -l
        345   342   anmol   ps -ef
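The idea that every run of a command is a separate process with its own process ID can be tried out with the sketch below. It is only illustrative and uses a few features that are covered later in this lesson or not at all: & (section 2.8.2), kill (section 2.8.5), and the shell's special parameters $$ and $!, the `kill -0` existence check and the wait built-in, which are standard shell features not described in this manual.

```shell
#!/bin/sh
# Run the same command twice. Each run is a separate process, so the two
# PIDs are normally different. '$$' expands to the PID of the shell that
# evaluates it, here each child 'sh'.
pid_a=$(sh -c 'echo $$')
pid_b=$(sh -c 'echo $$')
echo "first run PID: $pid_a, second run PID: $pid_b"

# Start a long-running command in the background; '$!' holds its PID.
sleep 30 &
sleep_pid=$!

# 'kill -0' sends no signal; it only reports whether the process exists.
if kill -0 "$sleep_pid"; then
    echo "process $sleep_pid is running"
fi

# Use the PID to terminate the process (default signal: TERM).
kill "$sleep_pid"

# 'wait' collects the exit status; a process ended by a signal
# reports a value above 128.
status=0
wait "$sleep_pid" || status=$?
echo "sleep ended with status $status"
```

The PIDs printed will differ from run to run and from system to system.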
  • 27. COE Unit 1, Lesson 2
    2.8.2 & - Running process in background
    If you put '&' at the end of a command, that command runs in the background. Time-consuming commands can be put into the background so that you can continue working on the same terminal.
    2.8.3 Ctrl-z – Suspending a process
    If you issue a command by mistake and want to do something else first, you can use Ctrl-z to suspend the process. This frees the terminal and the CPU for other, more important work.
    2.8.4 jobs – Finding the process in background
    To find the processes running in the background you can use the jobs command. This is different from the ps command.
    2.8.5 kill – Killing a process
    If a process has been running for a long time or is producing unwanted results, you can use the 'kill' command to kill it. The syntax of the command is
        kill [-signal] [process id]
    Sometimes a process does not get killed this way; if you still want to kill it, you can send it the -9 signal.
    2.8.6 nice – reducing the priority of process
    This command can be used to reduce the priority of a command and let other commands run ahead of it. The syntax of the command is
        nice command [command option]
    Self-Check Questions
      8. Once a print job has been submitted, it is not possible to abort the printing. (True/False)
      9. To see which print jobs are waiting in the printer queue, we can use the ____________ command.
      10. To print the text in a file, use the ______________ command.
      11. To change the priority of a job we can use the _________ command.
      12. If a process has been started that is not required at the moment and we need to start another process, we can suspend the first process using _______________ and continue with it later on.
  • 28. COE Unit 1, Lesson 2
      13. To know which processes are running on the system, we issue the ______________ command.
    2.9 Miscellaneous commands
    Besides the commands discussed in this lesson so far, there are numerous other commands in UNIX, with many options, which can be used to perform some amazing tasks. We will discuss some of these commands with their most useful and common options. For other options, readers can refer to the man pages of these commands.
    2.9.1 alias / unalias command
    These commands are used to create or remove an alias for a command. The example shows their use.
        bash> alias rm="rm -i"      # Creates an alias rm which calls rm -i
        bash> unalias rm            # Now rm will call the rm command
    2.9.2 cal (calendar) command
    This command displays the calendar.
    2.9.3 clear command
    This command clears the screen.
    2.9.4 crontab command
    It is sometimes necessary to run commands at a specific date and time. The 'crontab' command can be used for this purpose. See man crontab for details. The cron program (see man cron) maintains a file which is managed using the crontab command. This file contains the commands together with the time and date at which each is to be executed. Here is an example:
        bash> crontab -l
        0 0 * * 5 echo "This is a cron" | mail john      # Contents of crontab file.
    2.9.5 csh command
    This command is used to run the C shell or to execute a C shell script. The syntax of this command is
        csh [filename]
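The crontab entry shown in section 2.9.4 packs the schedule into five time fields followed by the command. The sketch below splits that sample entry into its parts so the fields can be read by name. The field meanings (minute, hour, day of month, month, day of week) are the standard crontab layout rather than something stated in this manual, and the variable names are chosen only for illustration. Nothing is scheduled by this script; the entry is handled as plain text.

```shell
#!/bin/sh
# The sample entry from the text, stored as a plain string.
entry='0 0 * * 5 echo "This is a cron" | mail john'

# read splits the line on whitespace: the first five words are the time
# fields, and everything that remains goes into the last variable.
read -r minute hour day_of_month month day_of_week cmd <<EOF
$entry
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "command:      $cmd"
```

Read this way, the entry runs at minute 0 of hour 0 (midnight), on any day of the month and in any month, when the day of the week is 5 (Friday).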
  • 29.
2.9.6 history command
This command lists the commands that you have typed so far.

2.9.7 date command
This command prints the system date and time. The date command has many formatting arguments. See man date for details.
  bash> date
  Friday 25 Jan 2008

2.9.8 echo command
This command echoes back the string given to it.
  bash> echo "My name is achint"
  My name is achint

2.9.9 grep command
This command searches for a pattern in a file. We will see more details of the grep command in subsequent chapters. Here is a simple example.
  bash> grep goto file.c
  /*You should not use goto in c programming */

2.9.10 unset command
The unset command removes a shell variable.

2.9.11 tar command
This command is used to create an archive of files or to extract files from an existing archive. See man tar for details.

2.9.12 tee command
This command copies text from a pipe into a file while also passing it on to its standard output. See man tee for details.

2.9.13 touch command
This command changes the date and time of a file without changing the file's content. The touch command creates the file if it does not exist.
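A few of the commands from section 2.9 can be tried together in a small script. This is only a sketch: the file names notes.txt, greeting.txt and backup.tar are made up for the illustration, and everything happens in a temporary directory created with mktemp so that no real files are touched.

```shell
#!/bin/sh
# Work in a throwaway directory (names below are illustrative only).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# touch creates an empty file when it does not exist yet.
touch notes.txt

# tee copies what it reads from the pipe into a file and also passes it on;
# here the passed-on copy is discarded.
echo "My name is achint" | tee greeting.txt > /dev/null

# grep prints the lines of the file that contain the pattern.
match=$(grep achint greeting.txt)

# tar: -c creates an archive, -f names the archive file, -t lists its contents.
tar -cf backup.tar notes.txt greeting.txt
listing=$(tar -tf backup.tar)

echo "$match"
echo "$listing"
```

Running it prints the matching line followed by the two names stored in the archive.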
  • 30.
Self-Check Questions
14. An ____________ is a short command or word that points at some path or absolute command name.
15. To change the date and time stamp on a file without reading the file, the __________ command can be used.
16. To get the text from a pipe into a file, the ______ command can be used.

2.10 Summing Up
UNIX provides a rich set of commands for file management, printing, process control, etc.

2.11 Answers to the self-check questions
1. telnet, rlogin
2. False
3. ftp
4. man
5. quota
6. du
7. who
8. False
9. lpq
10. enscript
11. nice
12. Ctrl-Z
13. ps
14. alias
15. touch
16. tee

2.12 Terminal Questions
1. Define and explain the various command classes.
2. How is communication handled in UNIX? What is FTP?
3. Describe how file management is implemented in UNIX.
4. List the commands used in process control and describe their usage.
5. Explain the various print commands in UNIX.
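The process-control commands of section 2.8 can be sketched in a short script. It assumes a POSIX shell; sleep 30 merely stands in for a time-consuming command, and kill -0, which sends no signal and only checks that the process exists, is used here in place of ps so that the result can be stored in a variable.

```shell
#!/bin/sh
# Start a long-running command in the background with a trailing '&'.
sleep 30 &
pid=$!                      # $! holds the process id of the last background command

# jobs lists the background jobs of this shell.
jobs

# kill -0 sends no signal; it only reports whether the process exists.
kill -0 "$pid" 2>/dev/null && before=running

# kill sends the default termination signal; kill -9 is the last resort.
kill "$pid"
wait "$pid" 2>/dev/null || true   # collect the terminated process

if kill -0 "$pid" 2>/dev/null; then after=running; else after=gone; fi
echo "before=$before after=$after"

# nice runs a command at reduced priority (here with niceness raised by 10).
nice -n 10 echo "low priority"
```

In an interactive shell the same background job could also be suspended with Ctrl-Z and inspected with jobs, which a script cannot demonstrate.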
  • 31. COE Unit 1, Lesson 3
LESSON 3: UNIX FILE SYSTEMS

3. UNIX FILE SYSTEM ..... 33
3.0 OBJECTIVES ..... 33
3.1 INTRODUCTION ..... 33
3.2 FILES ..... 33
  3.2.1 Filenames ..... 33
  3.2.2 Filename Extensions ..... 34
3.3 DIRECTORIES ..... 34
3.4 FILE TYPE ..... 34
  3.4.1 Links ..... 35
  3.4.2 Special Files ..... 35
3.5 PATH TO A FILE ..... 36
  3.5.1 The root directory ..... 36
  3.5.2 Absolute Path ..... 36
  3.5.3 Relative Path ..... 36
3.6 MANIPULATING FILES ..... 36
  3.6.1 Moving and Renaming Files and Directories ..... 36
  3.6.2 Copying files and directories ..... 36
  3.6.3 Removing Files and Directories ..... 37
  3.6.4 Creating a directory ..... 37
  3.6.5 Listing the files ..... 37
3.7 FILE PERMISSIONS ..... 38
  3.7.1 File Permissions ..... 38
  3.7.2 Permissions for directories ..... 39
  3.7.3 Changing the permissions on the file ..... 39
3.8 CHANGING FILE OWNER AND GROUP ..... 40
3.9 FILE SEARCH ..... 40
3.10 VIEWING BEGINNING AND END OF A FILE ..... 40
3.11 ANSWERS TO THE SELF CHECK QUESTIONS ..... 41
3.12 TERMINAL QUESTIONS ..... 42
3.13 SUGGESTED READING MATERIAL ..... 42
  • 33.
3. UNIX File System
In the UNIX operating system the basic unit of storage is known as a file. This lesson focuses on the concepts of file manipulation and handling.

3.0 Objectives
After going through this lesson, you will be able to
  - Understand the basic concepts of files and directories
  - Understand the paths and pathnames in UNIX systems
  - Understand the UNIX file types
  - Understand the basic UNIX commands related to the file system
  - Understand file manipulation and file security

3.1 Introduction
In a UNIX operating system the basic structure that stores data is known as a file. You can store data of any format in a file. Multiple files can be put together in a directory. Apart from containing files, a directory can contain other directories as well. A directory that is inside another directory is called a subdirectory. A file is analogous to a notebook; a directory is analogous to a bag that holds such notebooks.

3.2 Files
A file contains a sequence of bytes stored on a storage device, such as a disk. On the disk the file is not necessarily stored in a single sector; its pieces can be scattered across the disk. The OS keeps track of which pieces belong to which file.

3.2.1 Filenames
Each file has a name. Any name can be given to a file, and the name can be changed at any time. UNIX file names may contain spaces, but spaces are usually avoided because the shell treats a space as a separator between arguments. An important thing to remember is that UNIX is case sensitive: 'A' is different from 'a', so be careful with upper and lower case when naming files. For example, myfile.txt and myFile.txt are different files.
  • 34.
3.2.2 Filename Extensions
UNIX does not enforce any specific extensions on file names. This is unlike Windows, where extensions are used to invoke applications directly. In UNIX you can choose any extension for your files. Even multiple extensions are permitted (e.g., data.tar.gz). Files need not have extensions at all (e.g., myFileOf24Dec2007). Since extensions are not checked, one can create files whose extensions are misleading. For example, myProg.db may be a C program while myData.cpp may contain simple text data. This is obviously not desirable, and one must be careful to use proper extensions. Though UNIX itself does not enforce any extensions, many important utilities and programs expect a specific file extension. For example, the C compiler expects files with .c or .h extensions.

3.3 Directories
Files are kept in directories. Directories group files in a logical structure that depends entirely on the application and the user's requirements. A directory can contain files and other subdirectories. The figure below shows how the directory myData contains subdirectories, which in turn contain the files.

  myData/
    Investments/
      RBI Bonds
      ICICI
    Official/
      Sales plan
      customers
      Reports

Each directory in UNIX contains two special subdirectories:
  ./   (the dot directory) indicates the current directory itself.
  ../  (the dot dot directory) indicates the parent directory of the current directory.

  bash> pwd
  Investments        Current directory is Investments/ (leading path not shown)
  bash> cd ..
  bash> pwd
  myData             Current directory after cd .. is myData/ (the parent)

  • 35.
3.4 File Type
Regardless of the data contained in a file, UNIX associates a file type with each file. There are four file types: ordinary files, directories, links and special files. An ordinary file is any file that you commonly use. These include text files, executable programs, shell scripts, etc. We have already seen what directories are. Let us now look at links and special files.

3.4.1 Links
A link is not a file but a second name for a file. Linking is sometimes a better option than copying, because once a file is copied, the copies can be changed independently and drift apart. If you create a link instead, there is still only one copy of the file. A link is created using the ln command. There are two types of links, soft links and hard links. See man ln for more details.

3.4.2 Special Files
UNIX represents even devices as files. These files are called special files. For example, the audio output is typically the file /dev/audio. What can you do with such a special file? You can write into it or read from it, and UNIX hides the details of how it actually works with the device. For example, you can simply cat a music file to /dev/audio and it will be played!

Self-Check Questions
1. It is possible to have multiple filename extensions in a file name in UNIX. (True/False)
2. A file in UNIX is required to have a filename extension, which signifies the properties of that file. (True/False)
3. The filenames work and Work point to the same file in a UNIX file system. (True/False)
4. Directories act as a categorization structure for the data in a UNIX file system. (True/False)
5. __________________ is a directory under the parent directory, which can be used to categorize data further down the hierarchical file structure.
6. Which is not a UNIX file type?
   a) Links  b) Symbolic Links  c) Program files  d) Directories
7. A ______________ (soft/hard) link is only a text file that points to some other file somewhere in the file system and does not contain the data.
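The difference between the two kinds of link described in section 3.4.1 can be seen in a short experiment. This is a sketch only; the file names report.txt, report_hard.txt and report_soft.txt are made up, and the work is done in a temporary directory.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "original data" > report.txt

# Hard link: a second name for the same file contents.
ln report.txt report_hard.txt
# Soft (symbolic) link: a small file that stores the path of the target.
ln -s report.txt report_soft.txt

# A change made through one name is visible through the others,
# because there is only one copy of the data.
echo "more data" >> report_hard.txt
via_soft=$(cat report_soft.txt)

# Removing the original name leaves the hard link usable,
# but the soft link now points at a name that no longer exists.
rm report.txt
hard_ok=$(cat report_hard.txt)
if [ -e report_soft.txt ]; then soft_state=valid; else soft_state=dangling; fi

echo "soft link is $soft_state"
```

The data survives as long as at least one hard link to it remains, which is why report_hard.txt can still be read after report.txt is removed.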
  • 36.
3.5 Path to a file

3.5.1 The root directory
The UNIX OS treats the directory / as the root directory. The root directory is the ultimate parent of all other directories on a UNIX system.

3.5.2 Absolute Path
Every file on a system has a path that starts from the root. For example,
  bash> pwd
  /dtu/IT_Courses/IT_101
The absolute path to the "schedules.txt" file in this directory is /dtu/IT_Courses/IT_101/schedules.txt. The pwd command always prints the absolute path of the current directory.

3.5.3 Relative Path
When you are in a directory and know the position of a file relative to it, you need not use the absolute path to access that file. You can simply use the relative path to the desired file. For example,
  bash> pwd
  /dtu/IT_Courses/IT_999
  bash> ls ../IT_102/schedules.txt
Here ../IT_102/schedules.txt is the relative path of "schedules.txt" with respect to "/dtu/IT_Courses/IT_999".

3.6 Manipulating Files
The file manipulation operations are file deletion, file renaming, and moving files from one location to another.

3.6.1 Moving and Renaming Files and Directories
The mv command moves files and directories to specified locations.
  bash> mv -i data data.old      Moves (renames) data to data.old
  bash> mv -i data new           Moves data into the new/ directory
  bash> mv -i oldDir newDir      Moves oldDir to newDir

3.6.2 Copying files and directories
  • 37.
The cp command copies files and directories.
  bash> cp old new                               Copies file old to new. Overwrites new if it exists.
  bash> cp -R /home/joe/bread /home/jam/food     Copies all files and subdirectories to the target directory

3.6.3 Removing Files and Directories
Often you want to delete files or a directory (including its contents), for example when cleaning up your system. The rm command deletes files and directories.
  bash> rm file.txt my.txt       Removes the specified files.
  bash> rm -f file.txt           The -f option tells rm not to report an error even if the file to be deleted does not exist.
  bash> rm -r directory1         The -r option tells rm to delete all subdirectories as well.
Be careful with the rm command. A file or directory, once deleted, cannot be undeleted in UNIX. There is no such thing as a trash can in UNIX. It is advisable to use the -i option of rm all the time. See man rm for details. If a directory is empty, it can be deleted using the rmdir command. See man rmdir for details.

3.6.4 Creating a directory
The mkdir command creates a new directory.
  bash> mkdir project                Creates the directory project/
  bash> mkdir /home/anmol/data       An absolute path can be given to create a directory
  bash> mkdir ../../myDir            A relative path can be given as well

3.6.5 Listing the files
The ls command lists the files and directories in the current directory. It has a large number of options (see man ls).
  • 38.
  bash> ls -l
  drwxr--r-- 1 achint editors  4096 drafts
  -rw-r--r-- 1 achint editors 30405 edition-32
  -r-xr-xr-x 1 achint editors  8460 final_draft
In this listing achint is the file owner and editors is the group. The size of final_draft is 8460 bytes. The first field gives the file type and the file permissions; the fields are explained in the table below.

Self-Check Questions
8. The __________________ is the parent directory of all directories in the UNIX file system.
9. The name of a file starting from the root directory is called the _____________ pathname of the file.
10. The relative pathname of a file is the name of the file with respect to the parent directory. (True/False)
11. Pick the odd one out. The following operations can be performed on the file system:
    a) Building  b) Listing  c) Renaming filenames  d) Copying
12. When the 'mv' command moves one file onto an existing file, it ___________ (appends/overwrites) the contents of the moved file onto the existing file.
13. To copy one directory to another it is mandatory to use the option _______ with the command 'cp'.
14. The command 'rmdir' can be used to delete a complete hierarchical directory structure. (True/False)

3.7 File Permissions
UNIX enforces permissions for files and directories. If you are the owner of a file, you can set permissions that decide whether the file is readable by others, and so on. Let us look at file permissions in more detail.

3.7.1 File Permissions
A user of the UNIX file system can belong to one of three classes:
  - The owner of the file
  - The group to which the file belongs
  - Other users
  • 39.
  bash> ls -l
  drwxr--r-- 1 achint editors  4096 drafts
  -rw-r--r-- 1 achint editors 30405 edition-32
  -rwxr-xr-- 1 achint editors  8460 final_draft

The first field of the last entry, -rwxr-xr--, reads as follows:
  First letter (-)        - means ordinary file, d means directory, l means it is a link.
  Next 3 letters (rwx)    The file is readable, writable and can be executed by the owner.
  Next 3 letters (r-x)    People in the group can read and execute but cannot write into this file.
  Last 3 letters (r--)    Others can only read this file.

3.7.2 Permissions for directories
For directories, read permission enables the user to list the contents of the directory, write permission allows the user to create a file or a directory inside that directory, and execute permission allows the user to change the present working directory to that directory.

3.7.3 Changing the permissions on the file
The chmod command changes the permissions of a file or directory. See man chmod for details. There are several ways to change the permissions of a file. Here are a few examples:
  bash> chmod ug+rw sample       Permits user and group to read and write
  bash> ls -ld sample
  drw-rw---- 2 achint editor 96 Dec 8 12:53 sample
  bash> chmod a-rwx sample       Removes all permissions for everyone
  bash> ls -ld sample
  d--------- 2 achint editor 96 Dec 8 12:53 sample

There is another form in which the permissions can be set directly by using an octal code. With three-digit octal notation, each digit represents a different component of the permission set: user class, group class, and "others" class respectively. For example, the octal number 764 is 111 110 100 in binary.
  • 40.
  - The first octal digit, converted to binary, represents the permissions for the owner (7 in octal is 111 in binary, which implies rwx for the owner).
  - The next octal digit, converted to binary, represents the permissions for the group (6 in octal is 110 in binary, which implies rw- for the group).
  - The last octal digit, converted to binary, represents the permissions for the others (4 in octal is 100 in binary, which implies r-- for others).

3.8 Changing File Owner and Group
The chown command changes the owner of a file. See man chown for details. The chgrp command changes the group of a file. See man chgrp for details.

3.9 File Search
The find command helps in locating files and directories. This is a powerful command with lots of options. See man find for details. Here is the syntax of the find command.
  find search_directory -name file_name [-print]
The find command searches through the contents of one or more directories, including all of their subdirectories.
  bash> find / -name schedule -print       Finds all the files under '/' named schedule
  /dtu/IT_courses/IT_101/schedule
  /dtu/IT_courses/IT_102/schedule
Another example, which searches only for directories:
  bash> find . -type d -name abc -print    Finds the directory abc (and not a file named abc) under the present directory

3.10 Viewing Beginning and End of a file
UNIX provides commands that display the start or the end of a file. These are the head and tail commands.
  head - start of the file
  tail - end of the file
  • 41.
Example usage
  bash> head -n 10 file          Shows the first 10 lines of 'file'

Self-Check Questions
15. Pick the odd one out. The users in a UNIX file system can be categorized as:
    a) Owners  b) Group  c) Friends  d) Other users
16. To change the file permissions from one set to another, the command ___________ can be used.
17. The __________________ commands are used to change the owner and the group of a file.
18. The _______ command lets you search for files and directories.
19. The _______ command is useful to show the last few lines of a file.

3.11 Answers to the self check questions
1. True
2. False
3. False
4. True
5. Subdirectory
6. Program files
7. Soft link
8. Root
9. Absolute path
10. True
11. Building
12. overwrites
13. -R
14. False
15. Friends
16. chmod
17. chown, chgrp
18. find
19. tail
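The find, head and tail commands of sections 3.9 and 3.10 can be tried on a small directory tree. The sketch below builds its own tree in a temporary directory; the directory names IT_101, IT_102 and abc echo the examples in the text, and the 20-line file is generated only for the demonstration.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p IT_101 IT_102 abc
touch IT_101/schedule IT_102/schedule

# find walks the directory tree; -name matches on the file name.
found=$(find . -name schedule -print | sort)
# -type d restricts the match to directories.
dirs=$(find . -type d -name abc -print)

# Build a 20-line file to look at.
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > file

first=$(head -n 3 file)    # the first three lines
last=$(tail -n 2 file)     # the last two lines

echo "$found"
echo "$dirs"
echo "$first"
echo "$last"
```

Starting the search at . rather than / keeps it inside the temporary directory, so it finishes quickly and reports relative paths.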
  • 42.
3.12 Terminal questions
1. Write a detailed note about the hierarchical file structure.
2. Explain briefly the manipulation operations possible on the file structure.
3. Write a brief note on the permissions on files and directories in UNIX. Also explain how the permissions of files can be changed using the chmod command. Use relevant examples to explain the concepts.
4. Explain the UNIX file types, together with the salient features of each file type.

3.13 Suggested Reading Material
1. The UNIX Programming Environment, by Kernighan and Pike.
2. The Design of the UNIX Operating System, by Maurice J. Bach.
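As a worked example for the chmod question above, the octal code 764 from section 3.7.3 can be applied to a scratch file and read back with ls -l. This is a sketch; the file name final_draft is borrowed from the earlier listing and the file is created empty in a temporary directory.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch final_draft

# 764: 7 = 111 = rwx (owner), 6 = 110 = rw- (group), 4 = 100 = r-- (others)
chmod 764 final_draft
# The first ten characters of ls -l are the file type and the permissions.
octal_mode=$(ls -l final_draft | cut -c1-10)

# Symbolic form: take write permission away from the group.
chmod g-w final_draft
symbolic_mode=$(ls -l final_draft | cut -c1-10)

echo "$octal_mode"
echo "$symbolic_mode"
```

The octal form sets all nine permission bits at once, while the symbolic form changes only the bits that are named.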
  • 43. COE Unit 1, Lesson 4
LESSON 4: THE VI TEXT EDITOR

4. THE VI TEXT EDITOR ..... 45
4.0 OBJECTIVES ..... 45
4.1 INTRODUCTION ..... 45
4.2 FILES CONTAIN STREAM OF CHARACTERS ..... 45
4.3 HOW VI HANDLES THE FILES ..... 46
4.4 INVOKING VI ..... 46
4.5 MODES OF VI ..... 46
  4.5.1 Command mode ..... 46
  4.5.2 Edit mode ..... 46
  4.5.3 Switching between command mode and edit mode ..... 47
4.6 POSITIONING TEXT ON THE SCREEN ..... 47
  4.6.1 Scrolling and moving the Screen ..... 47
  4.6.2 The GOTO Command ..... 48
  4.6.3 Searching ..... 48
4.7 POSITIONING THE CURSOR: H, L, J, K COMMANDS ..... 48
4.8 EDITING USING SCOPES ..... 49
  4.8.1 Delete Text (d, D) ..... 50
  4.8.2 Change Text (c, C) ..... 50
  4.8.3 Replace Command (r, R) ..... 50
  4.8.4 Erase Command (x, X) ..... 51
  4.8.5 Undo Command (u, U) ..... 51
4.9 TEXT INSERTION ..... 51
  4.9.1 Append Command (a, A) ..... 51
  4.9.2 Insert Command (i, I) ..... 52
  4.9.3 Open Command (o, O) ..... 52
  4.9.4 Read Command (:r) ..... 52
4.10 GLOBAL SEARCH AND REPLACE FOR TEXT ..... 52
4.11 REARRANGING AND DUPLICATING TEXT ..... 53
  4.11.1 Copying Text and Moving the Copy ..... 53
  4.11.2 Deleting Text and Moving It ..... 54

  • 44.
4.12 NAMED BUFFERS ..... 54
  4.12.1 Using the named buffers ..... 55
4.13 MISCELLANEOUS INFORMATION ..... 56
  4.13.1 Creating Line Numbers ..... 56
  4.13.2 Lines and Sentences in VI ..... 56
  4.13.3 Joining Lines ..... 57
  4.13.4 Repeating a Command ..... 57
  4.13.5 Editing Multiple Files Using vi ..... 57
  4.13.6 Mark Command ..... 58
4.14 SAVING OR STORING A FILE ..... 58
  4.14.1 Writing to the file ..... 59
  4.14.2 Exiting the vi editor ..... 59
4.15 SUMMING UP ..... 60
4.16 ANSWERS TO THE SELF-CHECK QUESTIONS ..... 60
4.17 TERMINAL QUESTIONS ..... 61
  • 45.
4. The VI Text Editor
When you write programs or scripts, modify data, write mails, etc., you need a text editor. This lesson focuses on the vi text editor, one of the most commonly used text editors on UNIX systems.

4.0 Objectives
After going through this lesson, you will be able to
  - Understand how to open and edit files using vi
  - Understand various text insertion and deletion methods in vi
  - Understand the basic structure of the vi text editor
  - Understand the commands to edit text using vi and scopes
  - Understand miscellaneous other features of vi

4.1 Introduction
vi is a visual, non-graphical and interactive text editor which allows a user to create, modify, and store files on the computer. Note that in this chapter the cursor is shown by underscoring a character. For example, the cursor is at the letter 'n' in the following line.
  This is a line.

  There's an editor out there that programmers have been using to edit their programs for the last 24 years. It's called vi (say vee-eye) and it is quite powerful.
  http://www.websiterepairguy.com/articles/vi/12_learn_vi.html

4.2 Files contain stream of characters
When you type letters, digits, etc., each key is stored as an ASCII character. For example, 'a' gets recorded as ASCII 97. When you write lines like these
  This is line 1
  This is line 2
the lines are stored as a stream of characters like "This is line 1\nThis is line 2". Here \n is a special character which signifies a new line.
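The statement in section 4.2 that a file is just a stream of characters can be checked from the shell without opening vi. In this sketch the file name two_lines.txt is made up, and the commands wc and od are used only to inspect the bytes that were written.

```shell
#!/bin/sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# printf turns each \n into a single newline character (ASCII 10).
printf 'This is line 1\nThis is line 2\n' > two_lines.txt

# wc -c counts bytes: 14 characters plus 1 newline, twice.
bytes=$(wc -c < two_lines.txt)
# wc -l counts newline characters, that is, lines.
lines=$(wc -l < two_lines.txt)

# od -c shows every byte, so each newline appears as \n.
od -c two_lines.txt

# A leading quote makes printf print the numeric code of the character.
code=$(printf '%d' "'a")

echo "bytes=$bytes lines=$lines code=$code"
```

The byte count of 30 confirms that each line break costs exactly one character in the stream.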
  • 46.
4.3 How Vi Handles The Files
When you open a file in vi, the file contents are read into a buffer. All text editing is done on this buffer in memory. The file on the disk is not updated unless vi is explicitly asked to save the changes. This lets you keep changing the content of the buffer until you are satisfied, without changing the file on the disk.

4.4 Invoking vi
The vi editor can be invoked using the following command
  $ vi demo.txt
The figure below shows how the file looks when opened in vi.

  _                          The cursor
  ~
  ~                          A tilde (~) in vi represents an empty line.
  ~
  ~
  .
  .
  "myfile" [new file]        File information

4.5 Modes of vi
vi has two modes in which you will work.

4.5.1 Command mode
The command mode is the default mode. All vi commands work only in the command mode. In the command mode you cannot type new text. You can only move around in the text, delete text, modify existing text, search for text, etc.

4.5.2 Edit mode
In edit mode you can add new text in vi. In edit mode you cannot use any commands to search or navigate in the text.
  • 47.
4.5.3 Switching between command mode and edit mode
When in command mode, a few commands take you to edit mode. For example, if you press i in the command mode, you get to the edit mode and can add text. When in the edit mode, you can stop editing and return to the command mode by pressing the <Esc> key.

  This is a line       You are in command mode and the cursor is at 'a'. Press 'i'.
  This is a line       The cursor is at the same position but edit mode has started. Now press 'd'.
  This is da line      The letter 'd' is added and the cursor is at the letter 'a'. Press 'Esc'.
  This is da line      Now you are in command mode.

4.6 Positioning Text on the Screen
vi provides several ways to reach the text you want to edit in a file.

4.6.1 Scrolling and moving the Screen
By scrolling the screen we can reach the desired text. The table below explains how one can scroll the screen.

  Command    Resulting Action
  Ctrl+u     Moves the window upwards by half a screen
  Ctrl+d     Moves the window downwards by half a screen
  H          Takes the cursor to the top of the screen
  L          Takes the cursor to the bottom of the screen
  M          Takes the cursor to the middle of the screen

All these commands work only in the command mode.
  • 48.
4.6.2 The GOTO Command
Sometimes you already know the number of the line you want to reach. You can use the GOTO command in such cases. The table below explains the command and the resulting action.

  Command    Resulting Action
  G          Moves the cursor to the last line
  <N>G       Moves the cursor to the Nth line, for example 33G
  :<N>       Moves the cursor to the Nth line, for example :65

4.6.3 Searching
It is also possible to search for a pattern; the screen then moves to the occurrences of the desired pattern. Here are the commands that work for search in vi.

  Command      Resulting Action
  /pattern     Searches for the pattern forward from the current cursor position
  ?pattern     Searches for the pattern backward from the current cursor position
  :set ic      Makes the subsequent searches case insensitive (ic stands for ignore case)
  :set noic    Makes the subsequent searches case sensitive

Once you start a search you can repeat it in a simple way. On keying in 'n', vi goes to the next instance of the pattern in the file; 'N' repeats the search in the opposite direction.

4.7 Positioning the Cursor: h, l, j, k commands
This section explains finer control of the cursor. You can move the cursor with the "arrow" keys. You can also use the "direction" keys "h" (move left by one character), "j" (move down to the next line), "k" (move up to the previous line), and "l" (move right by one character). The "RETURN" key is similar to the "j" key in that it moves the cursor down one line. However, the "RETURN" key always positions the cursor at the beginning of the next line, whereas the "j" key moves the cursor straight down from its present position, which may be the middle of a line. Moving several positions can be accomplished by repeatedly pressing the "RETURN", direction or arrow key, such as "k" "k" "k" to move upward 3 lines. You can also precede any of these keys with a number and achieve the same result: "3k".

  • 49.
Self-Check Questions
1. If the cursor is resting at the 34th line of a file and is to be placed on the 74th line, the command to be issued is _____________G.
2. On searching with "?" and "/", the search will be done ______________ and ____________________ respectively. (backwards/forward)
3. To get the file statistics using the vi editor, the command to be issued is ___________.
4. On keying in "N" while searching for a pattern using "?", the cursor will reach the next instance of the pattern ________________. (backward/forward)
5. To move to the 25 word in the line while the cursor is on 18 line the command that can be issued is ___________.
6. To move to the beginning of the line on which the cursor is residing in a text file, the command that can be issued is __________.
7. The vi editor creates a temporary buffer area while editing a file, which is stored on the disk and is used later on for reference purposes by the editor. (True/False)

4.8 Editing using scopes
vi commands have scope built into them. For example, when you say 'dd', the first 'd' indicates the delete operation and the second 'd' tells vi to apply the command to a line. Similarly, 'yy' yanks a line. Commands like 'd' and 'y' can also be given other scopes, and many vi commands have upper-case versions.

  Scope    Text Unit Encompassed
  0        Beginning of line
  $        End of line
  W, w     Word right
  B, b     Word left
  E, e     End of word right

By combining operators with scopes we get more powerful commands, and we can edit very locally. This section discusses these combinations.
  • 50. 4.8.1 Delete Text (d, D)

The delete command is used in command mode to remove portions of text from the file being edited. The scope must be specified after the delete operator. Some of the most common scopes used with the delete operator are shown in the table below.

Delete operator and scope   Resulting Action
dw                          Delete word forward
d(                          Delete complete sentence backward
d)                          Delete complete sentence forward
dG                          Delete from current line to end of file
dL                          Delete from current line to end of screen
d/^xyz                      Delete from current line to first occurrence of pattern
dtx                         Delete from current place to first occurrence of 'x'

NOTE: The same scopes can be used with all the scoped text editing commands, so they are not repeated for the commands that follow; only different scopes or operators, if any, are discussed.

NOTE: The current cursor position serves as the starting point for the scope. A scoped deletion therefore starts from the current point. For example, typing "2dd" deletes two consecutive lines beginning with the current line.

4.8.2 Change Text (c, C)

The change command changes the text in a line. Scopes are applied in the same manner as with the delete command. On issuing the change command, vi enters edit mode; after the new text has been typed, pressing <ESC> returns vi to command mode. The example shows how the change command can be used.

    This is the line to watch        (cursor is positioned at 't' of "the")
    Issue the command '2cw' (change two words) and key in "new line"
    This is new line to watch        (text inserted in place of two words)

4.8.3 Replace Command (r, R)

The replace command is used to replace portions of text on the screen. The table shows the two variants of the replace command and their usage for replacing text.
  • 51.
Replace command   Text replacing action
r                 Replaces a single character at a time
R                 Replaces as many characters as there are keystrokes, until the user presses <ESC>

    This is the line to watch out for.     (cursor positioned at 'l')
    Issue the 'r' command and type 'm'
    This is the mine to watch out for.     ('l' is replaced by 'm')
    Issue the 'R' command, key in "kite" and press <ESC>
    This is the kite to watch out for.     (complete word is replaced)

4.8.4 Erase Command (x, X)

The erase command removes a character.

Erase command   Erase action
x               Erases the character on which the cursor is placed
X               Erases the character to the left of the cursor

4.8.5 Undo Command (u, U)

The undo command reverses the effect of editing operations done on a file. 'u' reverses the effect of the last editing command, whereas 'U' reverses all the changes made to the current line since the cursor moved onto it.

4.9 Text Insertion

The vi editor provides several ways to insert text into a file. Each method is discussed in some detail below, but a newcomer is advised to pick one approach and use it consistently.

4.9.1 Append Command (a, A)

The append command adds to the existing text. It has two forms, 'a' and 'A', illustrated below.

    The student laughed.
    Issue the 'a' command, type 's' and press <ESC>
    The students laughed.             (text appended after the cursor)
    With 'A', the text is appended at the end of the line:
    The students laughed. Aloud.
  • 52. 4.9.2 Insert Command (i, I)

This command inserts text into a text file. It has two forms, 'i' and 'I', illustrated below.

    The student laughed.
    Issue the 'i' command, type 'new ' and press <ESC>
    The new student laughed.          (text inserted before the cursor)
    Issue the 'I' command, type 'Again ' and press <ESC>
    Again The student laughed.        (text inserted at the beginning of the line)

4.9.3 Open Command (o, O)

The open command opens a new line for adding text. It has two forms, 'o' and 'O', illustrated below.

    The student laughed.
    Issue the 'O' command, type 'A new line is added' and press <ESC>
    A new line is added               (text inserted above the current line)
    The student laughed.
    Issue the 'o' command, type 'Another line' and press <ESC>
    A new line is added
    The student laughed.
    Another line                      (text inserted below the current line)

4.9.4 Read Command (:r)

The read command allows the user to copy the contents of another file into the current file. While in command mode, with the cursor on the line above where you want the file read in, type:

:r <File>   Reads the specified file into the current file at the cursor location

4.10 Global Search and Replace for text
  • 53. The examples below show commands that can be used for searching and replacing, each with a different purpose.

Command                      Resulting Action
:1,$s/oldText/newText/g      Replaces all the instances of oldText with newText in the file
:1,15s/oldText/newText/g     Replaces oldText with newText from line number 1 to 15
:g/oldText/s//newText/gc     Asks for confirmation before replacing the text each time

Self-Check Questions

8. To delete the word on which the cursor is placed, the "D" command can be issued. (True/False)
9. The change operator invokes the text insertion mode. (True/False)
10. The operator _______________ changes the text, yet does so in command mode and not in text insertion mode.
11. The command ______________ replaces the characters on screen one at a time as the user keys in the new characters.
12. To erase the character on which the cursor is placed, the __________ command is to be issued, whereas to delete the character to the left of the cursor, the _________ command needs to be issued.
13. To replace the name "shahs" with "mazes" in a text file, the command to be issued is ___________.

4.11 Rearranging and Duplicating Text

You can yank text in order to copy it to another place in the text file.

4.11.1 Copying Text and Moving the Copy

Step 1: Copying Text with the Yank Command (y, Y)

The yank command 'y' can be used with the same scopes that we have seen with the delete command. Yanking places the yanked content into an unnamed buffer. Some examples of yanking are:

    This is the line to be yanked.    (cursor is at character 'l')
    The command '3yw' (yank 3 words) yanks 3 words starting from the current cursor position.

    This is the line to be yanked     (cursor is at the first line)
    This is another line
    This is yet another line that can be yanked
    The command '3yy' yanks 3 lines starting from the current line.
  • 54. Step 2: Put Command (p, P)

The put command places the contents of the unnamed buffer back into the file being edited. Whole lines are handled differently from word and sentence fragments: the lower-case "p" places the line or lines below the current line, and the upper-case "P" places them above the current line. A handy feature of yank and put is the ability to insert the same copy repeatedly within a file. The sequence is yank, relocate cursor, put, relocate cursor, put, and so on until all the needed copies have been placed.

4.11.2 Deleting Text and Moving It

When you delete text, it is also yanked, so it can be put at another place in the text.

    This is the file.
    It contains text.
    This line will be deleted.               (cursor is placed on this line; it is deleted using the 'dd' command)
    Below this it will be later on pasted.
    This will be the end of file.

    After moving the cursor to "Below this it will be later on pasted." and using the 'p' command,
    the line is placed below the present cursor position:

    This is the file.
    It contains text.
    Below this it will be later on pasted.
    This line will be deleted.
    This will be the end of file.

4.12 Named Buffers

Named buffers offer another way to copy (yank) or remove (delete) text. The unnamed buffer only saves the last deleted or yanked text. vi provides 26 named buffers (a-z) for your use. Named buffers allow users to yank several pieces of text and put them at different places. These named buffers last only for the life of the current editing session; once you quit vi, they are no longer available. Here are a few examples of how named buffers are used.

Typing "g7yy in command mode implies the following:

    "      The double quote calls for a named buffer
    "g     selects the buffer named g
    7yy    yanks 7 lines into the named buffer g
  • 55. Now, if you type "gp, it implies the following:

    "g     calls for the named buffer g
    "gp    pastes the contents of the named buffer g

You can append more text to a named buffer. When you use the capital letter to yank into a named buffer, the yanked contents are appended to it. For example, "g7yy yanks 7 lines into buffer g; "G3yy then yanks 3 more lines and appends them after the 7 lines already in buffer g.

Named buffers are not write-protected. If a named buffer contains text and it is used a second time with its lower-case name, the original material is overwritten.

4.12.1 Using the named buffers

Once you have yanked contents into a named buffer g, you can paste them anywhere in the file with "gp, as described above. The two forms of put apply:

    p      puts the contents below the current line
    P      puts the contents above the current line

Note that the vi editor does not tell you which buffers are currently defined, nor which buffer contains what; you must remember the names of the buffers and their contents.

Self-Check Questions

14. To copy 10 lines of text into the unnamed buffer, the 10_____ command can be used. (Y/y)
15. The text saved in the unnamed buffer by yanking or deleting can be placed back into the text below the line where the cursor is placed by using the _________ command.
16. To append 5 more lines to the named buffer 'a', the command to be issued is __________.
17. If a named buffer is called upon again and new information is written into it, then the new information is appended to the buffer. (True/False)
18. It is possible to get the buffer name on the basis of the content stored in the buffer. (True/False)
  • 56. 4.13 Miscellaneous Information

This section covers some miscellaneous features that can make you more productive when editing files.

4.13.1 Creating Line Numbers

By default the vi editor does not show line numbers, but it can display them. The command to list the lines with their numbers is:

    :%nu

To have line numbers shown for the current session, type:

    :set number

The line numbers appear immediately and remain until you exit the editor or type:

    :set nonu

The "control s" command stops screen movement. The "control q" command releases a frozen screen. The "control l" command refreshes the vi screen without modifying the file.

The .exrc file

Many setup (set) commands can be set or changed for vi. It is advisable to put these commands into the ~/.exrc file so that vi loads the settings automatically every time it starts. For example:

    bash> cat ~/.exrc
    " Show line numbers
    set nu
    " Do not wrap around the file while searching
    set nows
    bash>

The following command shows the available setup commands.

    :set all

4.13.2 Lines and Sentences in VI

To edit successfully, you need to understand what the editor considers a line and a sentence; the two are different items to the editor. To the editor, a line begins on the left of the screen and terminates at a carriage return. The carriage return is the invisible character placed in your file every time you press the "RETURN" key. A sentence is a string of characters of unspecified length (from a few characters to many lines) terminating with one of the punctuation marks ".", "?", "!" followed by either a carriage return or two blank spaces.
  • 57. 4.13.3 Joining Lines

While editing files, you will often want to combine or join lines. This is easily done using the "J" (join) command. With the cursor on the upper of two lines, issuing "J" makes vi move the lower line up and butt it against the end of the upper line. The editor takes care of the necessary spacing for you.

4.13.4 Repeating a Command

To make life a bit easier, vi allows text alteration commands to be repeated with the "." (repeat) command. A handy way to illustrate it is to use the "cw" command to replace a single word with two new words throughout a paragraph. In this example, the first occurrence of "PU" is located with the search command "/PU". Then, with the cursor on the "P" of "PU", the "cw" command is issued, followed by "Purdue University" and <ESC>. The 'n' key is pressed to find the next occurrence of "PU". The cursor relocates to the 'P' of the next "PU", and all that is required to change it to "Purdue University" is to type ".".

4.13.5 Editing Multiple Files Using vi

The vi editor allows a user to edit multiple files by use of the ":e" (edit) command. This lets a user look at information in another file without exiting the editor. Additionally, because the files are opened within the same editor invocation, they can share the same named buffers, which makes it possible to transfer text between the files.

When vi is invoked, a work area called a buffer is created for editing purposes. A copy of the specified disk file is placed into this work space. The editor permits only one file copy in this buffer space at a time.

  • 58. Thus, after making changes to a file (delete, add, or change), you must tell the editor what to do with the current buffer contents before you are permitted to bring another file into this space. You do this with ":w" (write the current buffer contents to the opened file), ":e! newfile" (toss the current buffer contents without updating the opened file, and place a copy of the newly called file in the buffer), or ":quit!" (exit the editor and toss the buffer contents).

When you have two files open, vi permits toggling between them with ":e #". This works because whenever vi sees the character "#" in a command where a filename is expected, it substitutes the name of the previous file. For example, if you had been in fruits and then opened vegetables, the command ":e #" would return you to where you were in the fruits file. Repeat ":e #" and you would be back in vegetables.

4.13.6 Mark Command

The mark command sets a mark in vi so that, while editing, you can go back to the places you have marked. vi provides 26 marks, named 'a' to 'z'. You can put a mark "g" at a position using a command like the following:

    mg

Note that marks are not visible in vi; you have to remember the marks you have set. To go back to the marked location "g", use the following command:

    'g

4.14 Saving or Storing a File

As mentioned earlier, the vi text editor creates a temporary working area, which holds either a copy of an existing file on the disk or a new file. This area is at the disposal of the user while he edits. On saving, the contents of the buffer are written to the file stored on the disk. A file on disk, on the other hand, is removed only with the remove command of UNIX. The changes made in the buffer are not saved until you give the command to do so, so it is advisable to save your work periodically.

We will now discuss how to save work periodically.

[Figure: schematic showing how the work is saved on the disk]
  • 59. 4.14.1 Writing to the file

It is useful and safe to save your work periodically when typing text. The ':w' command writes the buffer to the file on the disk, thus saving the changes. It works in command mode.

:w <File>   Saves the changes to <File>

4.14.2 Exiting the vi editor

To exit the vi editor, use the quit command ':q'. Combined with the write command, it becomes ':wq' (write and quit). To discard the changes made, use ':q!'.

Self-Check Questions

19. The text insertion command takes vi from command mode to text insertion mode. (True/False)
20. If some text is to be added at the end of the line on which the cursor is positioned, text insertion is invoked with the command ____________.
21. If the same piece of text from one text file is to be inserted into another text file, the user can use the command _______________.
22. When using the text insertion command read ':r', the ESC key can be used to switch back from text insertion mode to command mode. (True/False)
23. Issuing the write command once in a session ensures that all the text inserted in the session, including the text inserted after the write command was issued, is saved. (True/False)
24. To store the editing work done in the editor, the command ___________ needs to be issued.
25. If one finds that the text inserted into the editor window in the present session is not needed, one is required to issue the ____________ command.
26. In some application it is required to create a file 'new' from a file 'old' with some new text, while the file 'old' is kept unchanged. The vi command that should be issued for writing the new changes is __________________ and for exiting the vi session is ____________.

  • 60.
27. The vi editor can operate in two modes. The mode which lets the user change the text in the file is the _____________________ mode.

4.15 Summing Up

In this chapter we have looked at the vi text editor in detail. We discussed many techniques and worked through examples that can help you edit text files efficiently. With these techniques at hand, you will be able to learn other advanced techniques when you work in real environments and situations.

4.16 Answers to self-check questions

1. 74G
2. backwards and forward
3. Ctrl-G
4. forward
5. 7w
6. 0 (zero)
7. False
8. False
9. True
10. r
11. R
12. x, X
13. :g/shahs/s//mazes/g
14. 10yy
15. p
16. "A5yy
17. False
18. False
19. True
20. A
21. :r
22. True
23. False
24. :w
25. :q!
26. :w new, :q!
27. Edit
  • 61. 4.17 Terminal Questions

1. Explain the processes that are used for changing text using the VI text editor.
2. Explain the processes that can be used to delete text using the VI text editor.
3. Write a note about the named buffers and explain some usage with practical examples.
4. Write briefly about the rearranging and duplicating of text in the VI text editor.
5. Explain how the VI editor functions.
6. What are the different modes for operating the VI editor? Explain in brief.
7. Explain the append, insert and quit modes of operation of the VI editor.
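The address-and-substitute syntax of section 4.10 (:1,$s/oldText/newText/g) is shared with the sed stream editor, so the same replacement can be tried from the shell prompt without opening vi. The sketch below is only an illustration: the sample file and its three lines are invented for the demonstration, and sed prints the result to standard output instead of changing the file. There is no counterpart here to the interactive 'c' (confirm) flag.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf 'oldText one\noldText two oldText\noldText three\n' > "$sample"

# Counterpart of vi's  :1,$s/oldText/newText/g  (every line, every instance)
sed '1,$s/oldText/newText/g' "$sample"

# Counterpart of vi's  :1,2s/oldText/newText/g  (only lines 1 to 2)
sed '1,2s/oldText/newText/g' "$sample"
```

The first command rewrites every instance on every line; the second leaves the third line untouched because it lies outside the address range 1,2.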
  • 63. UNIT 2: SHELL SCRIPTING

1. INTRODUCTION TO SHELL
2. SHELL SCRIPTING AND DEBUGGING
3. CONDITIONAL STATEMENTS
4. REPETITIVE TASKS
5. REGULAR EXPRESSIONS
  • 65. LESSON 1: INTRODUCTION TO SHELL

1. INTRODUCTION TO SHELL
  1.1 INTRODUCTION
  1.2 THE SHELL: COMMAND PROCESSOR
  1.3 BASH: BOURNE AGAIN SHELL
      1.3.1 Advantages of BASH
  1.4 REDIRECTION
      1.4.1 Standard Output
      1.4.2 Standard Input
      1.4.3 Standard Error
      1.4.4 Combining Streams
  1.5 VARIABLES
      1.5.1 Setting strings with the variable names having $
      1.5.2 Types of variables
      1.5.3 Exporting variables
      1.5.4 Using Shell Variables
  1.6 COMMAND SUBSTITUTION
  1.7 PATTERN MATCHING – THE WILD CARDS
      1.7.1 The * & ?
  1.8 THE CHARACTER CLASS
  1.9 MATCHING A DOT (.)
  1.10 SUMMING UP
  1.11 ANSWERS TO THE SELF-CHECK QUESTIONS
  1.12 TERMINAL QUESTIONS
  • 67. 1. Introduction to Shell

The unit on Shell Scripting starts with the shell itself. Bash is also introduced in this lesson. The subsequent lessons discuss the more advanced concepts at length.

1.0 Objectives

After going through this lesson, you will be able to:
  o Know about the different types of shell
  o See how the shell executes commands
  o Understand and use redirection, variables, pattern matching, etc.

1.1 Introduction

The shell in UNIX is the program that acts as an interface between the user and the UNIX system. It understands the user's language, interprets it, tells the kernel what the user wants, collects the results of the command execution from the kernel, and returns them to the user in a form he understands. All the wonderful things we can do with a UNIX system are possible by virtue of this program, which takes very terse input and executes commands and user instructions effectively. The shell is also known as a command processor: it processes the instructions you issue to the machine.

1.2 The Shell: Command Processor

On logging in to a UNIX system you encounter a prompt ($ or % or a custom user prompt). It may seem that nothing is happening, but a program is running and waiting for your instructions so that it can execute them. This is the shell. The shell starts when a user logs on and keeps running until the user logs out. When you issue a command, the shell is the first agency to receive it. It accepts and interprets user requests, which are generally the UNIX commands we key in. The shell examines and rebuilds the command line and then leaves the execution work to the kernel. The kernel handles the hardware on behalf of these commands and of all processes in the system.
  • 68. Users can thus afford to remain ignorant of what happens behind the scenes. This is one of the beauties of the UNIX design and philosophy. The shell is generally sleeping. It wakes up when input is keyed in at the prompt. This input is the input to the program that represents the shell. The shell typically performs the following activities:

  o It issues the prompt ($ or otherwise) and sleeps till you enter a command.
  o After a command has been entered, the shell scans the command line for special characters (metacharacters, discussed later) that have a special meaning for it. Because it permits abbreviated command lines (like the use of * to indicate all files, as in rm *), the shell has to make sure the abbreviations are expanded before the command can act upon them.
  o It then creates a simplified command line and passes it on to the kernel for execution. The shell cannot do any work while the command is being executed, and has to wait for its completion.
  o After the job is complete, the prompt reappears and the shell returns to its sleeping role to start the next "cycle". You are now free to enter another command.

Note: Commands at the lower levels do not know or understand the metacharacters, so the shell has to resolve them to their normal representations before they are passed to the kernel.

1.3 BASH: Bourne Again Shell

The Bourne Again shell is the standard GNU shell, intuitive and flexible. It is probably the most advisable shell for beginners, while at the same time being a powerful tool for advanced and professional users. On Linux, bash is the standard shell for common users. This shell is a so-called superset of the Bourne shell, with a set of add-ons and plug-ins. This means that the Bourne Again shell is compatible with the Bourne shell: commands that work in sh also work in bash. However, the reverse is not always the case. To find out which shell you are using, invoke the command echo $SHELL.
The output could show /bin/sh (Bourne shell), /bin/csh (C shell), /bin/ksh (Korn shell) or /bin/bash (bash shell). When bash is started, it reads its configuration files. The most important are:

  o /etc/profile - read at login time for all shells
  o ~/.bash_profile - read by a bash login shell window (e.g. for printing system details on screen)
  o ~/.bashrc - read by a non-login shell window
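As a small illustration of the echo $SHELL check described above, the sketch below prints the login shell and also reports whether the interpreter actually running the commands is bash. The function name describe_shell is made up for this example; BASH_VERSION is a variable that bash sets for itself, and the sketch assumes it is unset in other shells.

```shell
# describe_shell is a hypothetical helper name used only in this sketch.
describe_shell() {
    # SHELL holds the user's login shell, e.g. /bin/bash
    echo "login shell: ${SHELL:-unknown}"
    # BASH_VERSION is set by bash itself; assumed empty in other shells
    if [ -n "${BASH_VERSION:-}" ]; then
        echo "running: bash $BASH_VERSION"
    else
        echo "running: not bash"
    fi
}

describe_shell
```

The two lines can differ: SHELL names the shell started at login, whereas the second line describes the shell that is interpreting the commands at that moment.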
  • 69. 1.3.1 Advantages of BASH

Bash is an sh-compatible shell that incorporates useful features from the Korn shell (ksh) and C shell (csh). It is intended to conform to the IEEE POSIX P1003.2/ISO 9945.2 Shell and Tools standard. It offers functional improvements over sh for both programming and interactive use; these include:

  o Command line editing
  o Unlimited size command history
  o Job control
  o Shell functions and aliases
  o Indexed arrays of unlimited size
  o Integer arithmetic in any base from two to sixty-four

Bash can run most Bourne shell scripts without modification. In this course we will work with bash only. The formats and commands mentioned in this course may need slight changes to work in other shells.

1.4 Redirection

Many of the UNIX commands that we have come across send their output to the terminal. Other commands take their input from the keyboard. One might therefore think that these commands are designed to accept only fixed sources and destinations. In fact, they are designed to use character streams without knowing their source or destination. A character stream is just a sequence of bytes that many commands see as input and output. In a UNIX system these streams are treated as files, and a group of UNIX commands reads from or writes to these files. A command is usually not designed to send output to the terminal, but to this file. Likewise, it is not designed to accept input from the keyboard either, but only from a standard file which it sees as a stream. There is a third stream for all error messages thrown out by a program; this stream is the third file.

It is here that the shell comes in. The shell sets up these three standard files (for input, output and error) and attaches them to the user's terminal at the time of logging in. Any program that uses streams will find them open and available. The shell also closes these files when the user logs out.

The standard file for input is known as standard input and that for output is known as standard output. The error stream is known as standard error. By themselves, these standard files are not associated with any physical device, but the shell has set some physical devices as defaults for them:
  • 70.
Stream            Default source/destination
Standard Input    The default source is the keyboard
Standard Output   The default destination is the terminal screen
Standard Error    The default destination is the terminal screen

1.4.1 Standard Output

Commands like "more" send their output as a character stream. This stream is called the standard output stream and appears on the terminal screen by default. Using redirection, this stream can be sent to a disk file instead. For example:

    bash> more myFile > newFile

The shell looks at the >, understands that standard output has to be redirected, opens the file newFile, writes the stream into it and then closes the file. All this happens with more knowing nothing about it, because more sends its output to the stream and that stream is redirected to a disk file. With the '>' redirection operator, the shell overwrites an existing file, and creates a new file if no file with that name exists. Alternatively, you can append to an existing file by using another redirection operator, '>>'.

Operator   Action performed
>          Creates a new file, or overwrites the file if it already exists
>>         Appends to the file if it exists, or creates a new file

It is also possible to group commands together and redirect their combined output to a file. A pair of parentheses groups the commands, and a single redirection sends their output to a file. For example:

    bash> (ls -l; who) > myFile

The results can also be redirected to another program. This is the concept of pipelining, which we will discuss later on. In conclusion, the standard output has three possible destinations:

  o The terminal or screen, which is the default destination
  o A disk file
  o A pipe, to another command
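The difference between '>' and '>>', and the grouping of commands with parentheses, can be tried out safely in a scratch directory. The following is a minimal sketch; the directory created by mktemp and the file names demo.txt and grouped.txt are placeholders chosen for this illustration only.

```shell
# Scratch directory so that no real file is overwritten (placeholder names).
workdir=$(mktemp -d)

echo "first"  >  "$workdir/demo.txt"    # '>' creates demo.txt
echo "second" >  "$workdir/demo.txt"    # '>' again: "first" is overwritten
echo "third"  >> "$workdir/demo.txt"    # '>>' appends to the existing file
cat "$workdir/demo.txt"

# Parentheses group the two commands; one redirection captures both outputs.
(echo "one"; echo "two") > "$workdir/grouped.txt"
cat "$workdir/grouped.txt"
```

After these commands, demo.txt holds only the lines "second" and "third", and grouped.txt holds the output of both grouped commands.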
  • 71. NOTE: The shell creates the file before it redirects the output into it.

1.4.2 Standard Input

Some commands are designed to take their input as a stream as well. This stream represents the standard input to the command. A classic example of the use of standard input is the "wc" command for counting words:

    bash> wc
    2*4
    23 ^ 64
    [ctrl-d]
    2 10 44

With no filename provided, wc reads from the keyboard, reports the number of lines, words and characters to the standard output, and shows no filename in its output.

    bash> wc < my
    5 9 54

With a file redirected to it, the command takes its input stream from the disk file. In conclusion, the standard input has three possible sources:

  o The keyboard, used as the default standard input
  o A pipe, with input from the output of some other command
  o A file, with input from a disk file

NOTE: When a file is redirected to a command, it is the shell that opens the file, and the command does not know what is happening. But when the command is used with the file name as one of its arguments, the command itself opens the file.

1.4.3 Standard Error

When you enter an incorrect command or try to open a nonexistent file, certain diagnostic messages show up on the screen. This is the standard error stream. Like standard output, it too is destined for the terminal. Note that they are in fact two separate streams, and the shell possesses a mechanism for capturing them individually. Before we proceed any further, you should know that each of these three standard files has a number, called a file descriptor, which is used for identification:
  • 72. COE Unit 2, Lesson 1
    0  Standard input   '<' is the same as '0<'
    1  Standard output  '>' is the same as '1>'
    2  Standard error   Must be written as '2>'
These descriptors are implicitly prefixed to the redirection symbols. For instance, > and 1> mean the same thing to the shell, while < and 0< are also identical. You normally don't need to use the numbers 0 and 1 to prefix the redirection symbols because they are the default values. However, we need to use the descriptor 2> for the standard error:
    bash> cat bar > errorfile
    cat: cannot open bar: No such file or directory
    bash> cat errorfile
Without the file descriptor in front of the redirection symbol, the error message still appears on the terminal and errorfile stays empty.
    bash> cat bar 2> errorfile
    bash> cat errorfile
    cat: cannot open bar: No such file or directory
This works. You can also append diagnostic output in the same way that you append standard output:
    bash> cat bar 2>> errorfile
You can now save error messages in a separate file. This enables you to run long programs and save the error output to be viewed at the end of the day.
1.4.4 Combining Streams
In UNIX, it is also possible to redirect both the input and the output stream at the same time; the shell again keeps the command ignorant of the source and destination.
    bash> cat < infile > my
In this case both input and output are redirected. The < and > operators can be combined freely, and the order in which they appear is immaterial to the shell.
    bash> wc < infile > newfile
    bash> wc > newfile < infile
    bash> > newfile < infile wc
  • 73. COE Unit 2, Lesson 1
All three command lines perform the same task. It is also possible to redirect the standard output and the standard error in the same command line.
    bash> cat newfile nofile 2> errorfile > outfile
By default, errors are written to the standard error (stderr) and normal output is sent to the standard output (stdout). For example, if you type the following command to compile some C program, only the normal output is sent to the file; the errors still show up on the terminal.
    bash> cc x.c y.c > compile.out
    variable x is not defined.
    variable y is redefined.
    variable z is not defined.
But if you want both the errors and the usual output (e.g. any warnings, etc.) to go into a single file, you can use the following command:
    bash> cc x.c y.c > compile.out 2>&1
    # Note that no output is printed on the screen
    bash> cat compile.out
    variable x is not defined.
    Warning: variable type mismatch.
    variable y is redefined.
    variable z is not defined.
2.3 Pipeline
In UNIX, it is often desirable to feed the output of one command to another command in order to accomplish a task. For instance, the following commands work together on one task:
    bash> who > user.lst
    bash> cat user.lst
    araz tty01 May 18 09:32
    amol tty02 May 18 11:18
    achint tty03 May 18 13:21
Now, to count the number of users, we can redirect the file user.lst so that wc reads it from the standard input.
    bash> wc -l < user.lst
    3
This method of using multiple commands to accomplish a task has some obvious disadvantages:
  • 74. COE Unit 2, Lesson 1
    1. The process is slow. A later command cannot start until the earlier one has finished.
    2. An intermediate file is required, and it has to be removed after the wc command has been executed.
    3. When handling large files, temporary files can build up easily and eat up disk space.
The shell has a powerful ability to connect the flow of these commands without needing any intermediate file, so that each command takes its input from the previous one. This is accomplished using the pipe (|) operator. Using a pipe, the command sequence shown above can be compressed into the following single command:
    bash> who | wc -l
    3
Here, who is said to be piped to wc. No intermediate files are created. When a sequence of commands is combined in this way, a pipeline is said to be formed. The name is appropriate, as the connection it establishes between programs resembles a plumbing joint. It is the shell that sets up this interconnection, and the commands have no knowledge of it. The pipe acts as the destination of one command's standard output and as the source of the next command's standard input. You can now use one to count the number of files in the current directory:
    bash> ls | wc -l
    15
Note that no separate command was designed to tell you that, though the designers could easily have provided another option to ls to perform this operation. And because wc writes to standard output, you can redirect this output to a file:
    bash> ls | wc -l > fkount
There is no restriction on the number of commands you can use in a pipeline. But you must know the behavioral properties of these commands to place them there. Consider this generalized command line:
    command1 | command2 | command3 | command4
It should be clear that command2 and command3 must support both standard input and standard output. command1 only needs to write to standard output, while command4 must be able to read from standard input.
If you can ensure that, then you can have a chain of these tools connected together.
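The two approaches compared above, an intermediate file versus a pipe, can be put side by side in a small script. This is a sketch under stated assumptions: the function list_users is hypothetical and merely prints three fixed lines in place of who, so that the counts do not depend on who is logged in.

```shell
#!/bin/bash
# Sketch: the same count done with an intermediate file and with a pipe.
# list_users is a made-up stand-in for 'who' with fixed output.
list_users() {
    printf '%s\n' "araz tty01" "amol tty02" "achint tty03"
}

# Two steps, with a temporary file that must be cleaned up afterwards.
tmpfile=$(mktemp)
list_users > "$tmpfile"
count_with_file=$(wc -l < "$tmpfile")
rm -f "$tmpfile"

# One step: the shell connects the output of list_users to the input of wc.
count_with_pipe=$(list_users | wc -l)

# A longer chain: sort and cut read standard input and write standard output.
first_user=$(list_users | sort | cut -d' ' -f1 | head -n 1)

echo "with file: $count_with_file"
echo "with pipe: $count_with_pipe"
echo "first user alphabetically: $first_user"
```

In the last pipeline, sort and cut sit in the middle positions, so they have to both read standard input and write standard output, exactly as required of command2 and command3.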
  • 75. COE Unit 2, Lesson 1
The commands command2 and command3, which support both streams, are called filters. These will be discussed later.
1.5 Variables
The shell lets you define shell variables that store a value, which can later be referenced on the command line or in shell scripts (we will learn about shell scripts shortly). Shell variables are of string type, which means the value is stored in ASCII rather than in binary format. No type declaration is necessary before you can use a shell variable. A shell variable is set using the general form variable=value and is referenced by placing a '$' in front of its name. The unset command removes the variable. Example:
    bash> a=4
    bash> echo $a
    4
    bash> unset a
    bash> echo $a

    bash>
NOTE: There must be no space between the variable name, the =, and the value; otherwise the shell interprets the variable name as a command and the '=' and the value as its arguments.
By default, shell variables are initialized to a null value, but sometimes it is desirable to set them to a null value explicitly, using any one of the following constructs:
    x=    or    x=''    or    x=""
It is also possible to assign a multiple-word string to a shell variable. There are two approaches:
    1. Escape the blank spaces using the escape character '\'
    2. Use quotes
    bash> a='My name is Amrit'
    bash> echo $a
    My name is Amrit
1.5.1 Setting strings with the variable names having $
A string may contain the $ character for two reasons:
1. The string inherently contains the $ sign. Example: My salary per month is $1000
    bash> echo 'My salary per month is $1000'
    My salary per month is $1000
  • 76. COE Unit 2, Lesson 1
Here, $1000 is echoed as it is.
    bash> echo "My salary per month is $1000"
    My salary per month is 000
Here the shell assumes that $1 is a shell variable and tries to access its value, which is undefined, and so replaces it with a null string. Thus, the shell handles a string differently depending on whether it is enclosed in single quotes or in double quotes.
2. The string uses a variable name with the $ character so that the variable is replaced with its value. Example: "My salary per month is \$$x"
The variable x is replaced with the salary amount, and the escaped dollar sign in front of it is printed literally.
1.5.2 Types of variables
As a convention, variables are given uppercase names. Bash keeps a list of two types of variables:
    Global variables: Global variables or environment variables are available in all shells. The env or printenv commands can be used to display environment variables.
    Local variables: Local variables are only available in the current shell. Using the set built-in command without any options will display a list of all variables (including environment variables) and functions. The output will be sorted according to the current locale and displayed in a reusable format. A local variable is not automatically available to a subshell unless it is exported.
1.5.3 Exporting variables
A variable created like the ones in the examples above is only available to the current shell. It is a local variable. Child processes of the current shell will not be aware of this variable. In order to pass variables to a subshell, we need to export them using the export built-in command. Variables that are exported
  • 77. COE Unit 2, Lesson 1
are referred to as environment variables. Setting and exporting is usually done in one step:
    export VARNAME="value"
A subshell can change variables it inherited from the parent, but the changes made by the child do not affect the parent. This is demonstrated in the example:
    bash> full_name="Amrit Swarup"
    bash> bash
    bash> echo $full_name

    bash> exit
    bash> export full_name
    bash> bash
    bash> echo $full_name
    Amrit Swarup
    bash> export full_name="Charan Singh"
    bash> echo $full_name
    Charan Singh
    bash> exit
    bash> echo $full_name
    Amrit Swarup
1.5.4 Using Shell Variables
A variable can hold a path name, a command, or, through command substitution, the output of a command. We will look at usage examples in which variables are set to such values and then used in place of them.
    Setting the path name
    bash> x='/home/ganesh/father'
    bash> cd $x
    bash> pwd
    /home/ganesh/father
Thus, a pathname can be stored in a variable, and the cd command can then use the variable to reach that directory again and again.
NOTE: This is a useful practice in day-to-day work: long absolute pathnames can be stored in variables and reused without the trouble of memorizing them or typing them out each time.
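The export behaviour described in section 1.5.3 can also be checked from a script instead of by starting shells by hand. The following is a minimal sketch: it uses bash -c to start each child shell, which is an assumption made here for convenience and is not how the example above does it.

```shell
#!/bin/bash
# Sketch: a local variable is invisible to a child shell until it is
# exported, and a change made by the child never reaches the parent.
unset full_name                      # start from a clean slate
full_name="Amrit Swarup"

before_export=$(bash -c 'echo "$full_name"')    # child sees nothing

export full_name
after_export=$(bash -c 'echo "$full_name"')     # child sees the value

# The child changes only its own copy of the variable.
in_child=$(bash -c 'full_name="Charan Singh"; echo "$full_name"')

echo "before export: '$before_export'"
echo "after export:  '$after_export'"
echo "in child:      '$in_child'"
echo "in parent:     '$full_name'"
```

The single quotes around the bash -c argument matter: they stop the parent shell from expanding $full_name itself, so the value printed really is the one the child sees.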
  • 78. COE Unit 2, Lesson 1
1.6 Command Substitution
UNIX systems let you connect two commands: the standard output of one command can be connected to the standard input of another using pipelines or redirection. The shell also allows the argument of a command to be obtained from the output of another command; this feature is called command substitution. For example, suppose we need to print a string that tells us the number of files in the directory:
    There are 24 files in the directory.
So, how will you achieve this? The shell has this feature.
    bash> echo "There are `ls | wc -l` files in the directory."
    There are 24 files in the directory.
Here the command has been substituted into the string, which then acts as the argument to the other command (echo), by placing the command between two ` characters (backquotes or backticks). The backquote is a metacharacter that the shell looks for (we cover metacharacters ahead). When a command is enclosed in backquotes, the shell first executes it and then replaces the enclosed command text with the output of the command. We have already seen that the shell treats $ differently inside single and double quotes. Let's try the same with backquotes:
    bash> echo 'There are `ls | wc -l` files in the directory.'
    There are `ls | wc -l` files in the directory.
So, backquotes are not interpreted by the shell if they are placed between single quotes.
1.7 Pattern Matching – The Wild Cards
While working with a UNIX system we often need to apply the same operation collectively to a larger group of files. A typical case is listing the files whose names start with lesson:
    ls -l lesson01 lesson02 lesson03 ....
This can also be written as:
    ls -l lesson*
  • 79. COE Unit 2, Lesson 1
Characters like the * are called metacharacters: special characters that the shell understands and expands according to the character and its intended use. Let's now discuss the metacharacters and their attributes in some detail.
1.7.1 The * and ?
The * is one of the characters of the shell's special set. It matches any number of characters (including none). When the * is appended to the string lesson, the pattern lesson* matches filenames beginning with the string lesson, including the file lesson itself. It thus matches all the files specified in the previous command line. You can now use this pattern as an argument to ls:
    bash> ls -x lesson*
    lesson lesson01 lesson02 lesson03 lesson04 lesson05 lessonA lesson.pl lesson.c lesson.cpp
When the shell encounters this command line, it immediately identifies the * as a metacharacter. It then creates a list of files from the current directory that match this pattern, and reconstructs the command line as below:
    bash> ls -x lesson lesson01 lesson02 lesson03 lesson04 lesson05 lessonA lesson.pl lesson.c lesson.cpp
NOTE: Windows users may be surprised to know that the * may occur anywhere in a filename, and not merely at the end. Thus, *lesson* matches all the following filenames: lesson newlesson lesson03 lesson03.txt.
The next metacharacter is the '?'. It matches a single character. When used with the same string lesson (as lesson?), the shell matches all seven-character filenames beginning with lesson. Place another ? at the end of this string, and you have the pattern lesson??. Use both these expressions separately, and the meaning of the ? will be obvious:
    bash> ls -x lesson?
    lessonx lessony lessonz
    bash> ls -x lesson??
    lesson01 lesson02 lesson03 lesson04 lesson15 lesson16 lesson17
These metacharacters are also called wild cards (like a joker that can stand for any card).
In the upcoming sessions we will take a look at other wild cards.
1.8 The Character Class
  • 80. COE Unit 2, Lesson 1
The patterns we have framed in the previous examples are not very restrictive or specific. If we want to list only lessonA and lessonZ among all the lesson files, we cannot do that with the patterns we have studied so far. To do this we need a character class for specific matching. The character class uses two more metacharacters, a pair of brackets [ ]. You can have multiple characters inside this enclosure, but matching takes place for a single character in the class. For example, a single-character expression that can take one of the values 1, 2 or 4 is represented by:
    [124]    Either 1, 2 or 4
This can be combined with any string or another wild-card expression, so selecting the files lesson01, lesson02, lesson03, lesson04 becomes a simple matter:
    bash> ls -x lesson0[1234]
    lesson01 lesson02 lesson03 lesson04
1.9 Matching a dot (.)
In UNIX file systems, there are many files whose names start with a dot (.). It is sometimes desirable to apply wild-card operations to these files collectively. For example,
    bash> ls -x *
    lesson01 lesson02 ....
This will not show the files starting with a dot. To match a dot at the start of a filename, the dot must be written literally.
    bash> ls -x .*
    .exrc .encrc .profile
The * does, however, match any number of dots that occur in the middle of a filename.
    bash> ls -x my*c
    my_file.c my.c my.stored.c
NOTE: Using * with rm
  • 81. COE Unit 2, Lesson 1
Let's discuss a potential problem that every UNIX user faces at least once: the careless use of the very powerful command
    bash> rm *
To remove all the files starting with lesson we can use the command
    bash> rm lesson*
But with a bit of carelessness you can type
    bash> rm lesson *
and you have messed up everything beyond repair: the space makes the shell treat lesson and * as two separate arguments, so every file in the directory is removed. Now be ready for a scolding from the system administrator. So be careful while using this command.
1.10 Summing up
The shell is a core component of the UNIX operating system. It interprets user commands and provides powerful features like redirection, pipes, metacharacters etc. Bash is a shell that is compatible with the Bourne shell and incorporates many useful features from other shells. Bash's biggest features are its powerful history support and command line editing. In our course, we use the Bash shell to explain the examples. In other shells the implementation is slightly different.
Self-check Questions
    1. While a command is being executed the shell prompts the user for another command and puts that command in its priority queue. (True/False)
    2. Shell is in __________________ (execution/sleep) mode while there is no command keyed in on the terminal and another command is running.
    3. The redirection symbol '>' appends the redirected text to a file. (True/False)
    4. Get the odd one out: The possible sources of standard input are: a. Pipe b. Keyboard c. Printer d. File
1.11 Answers to the Self-Check Questions
  • 82. COE Unit 2, Lesson 1
    1. False
    2. Sleep
    3. False
    4. (c)
1.12 Terminal Questions
    1. What is exporting a variable and why is it used?
    2. Explain what a metacharacter is. Why do you need it?
    3. Explain the difference between pipes and redirection.
  • 83. COE Unit 2, Lesson 2
LESSON 2 SHELL SCRIPTING AND DEBUGGING
2. SHELL SCRIPTING AND DEBUGGING  85
  2.0 OBJECTIVES  85
  2.1 INTRODUCTION  85
  2.2 CREATING AND RUNNING A SCRIPT  85
    2.2.1 myScript.sh  85
    2.2.2 Writing and naming  86
    2.2.3 Executing the Script  86
  2.3 SCRIPT BASICS  88
    2.3.1 Which shell will Run the Script?  88
    2.3.2 Adding comments  88
  2.4 DEBUGGING BASH SCRIPTS  89
    2.4.1 Debugging On the Entire Script  89
    2.4.2 Debugging On Part(s) Of the Script  90
  2.5 QUOTING  93
    2.5.1 Escape Character  93
    2.5.2 Single Quotes  94
    2.5.3 Double-Quotes  94
  2.6 SPECIAL VARIABLES  95
  2.7 SUMMING UP  98
  2.8 ANSWERS TO THE SELF-CHECK QUESTIONS  98
  2.9 TERMINAL QUESTIONS  98
  • 85. COE Unit 2, Lesson 2
2. Shell Scripting and Debugging
To write effective scripts, it is important to know the structure of a script and to be able to debug it when required. These concepts form the base for the subsequent chapters.
2.0 Objectives
After going through this lesson, you will be able to:
    Write a simple script
    Define the shell type that should execute the script
    Put comments in a script
    Change permissions on a script
    Execute and debug a script
2.1 Introduction
This chapter enables the student to write scripts of low complexity. Debugging is also needed at times, and the methodology described in this chapter shows how to debug effectively.
2.2 Creating and running a script
2.2.1 myScript.sh
In this example we use the echo Bash built-in to inform the user about what is going to happen, before the task that creates the output is executed. The script welcomes the user, gives the current date and time, lists the directory contents, searches for the text "Blue" in all files whose names start with "demo", and stores the result in the file searchResult.txt.
For the scripts in this chapter we assume that they are created in the following directory: ~/scripts
  • 86. COE Unit 2, Lesson 2
myScript.sh
    #!/bin/bash
    echo ""
    echo "This is my first shell script."
    USERNAME=`whoami`
    echo "Welcome $USERNAME"
    echo ""
    CURRENT_TIME=`date +%T`
    CURRENT_DATE=`date +%D`
    echo "Date: $CURRENT_DATE Time: $CURRENT_TIME"
    echo ""
    echo ""
    echo "Here are the files in your current directory."
    echo ""
    ls
    grep Blue demo* > searchResult.txt
2.2.2 Writing and naming
To create a shell script:
    Open a new empty file in your editor (vi, vim, gvim, emacs, gedit, dtpad etc.).
    Put UNIX commands in the new empty file, just as you would enter them on the command line. As discussed in the previous chapter, commands can be shell functions, shell built-ins, UNIX commands and other scripts.
    Give your script a sensible name that gives a hint about what the script does. Make sure that your script name does not conflict with existing commands. To avoid confusion, script names often end in .sh; even so, there might be other scripts on your system with the same name as the one you chose.
    Check using which, whereis and other commands for finding information about programs and files:
        which -a script_name
        whereis script_name
        locate script_name
2.2.3 Executing the Script
The script can run like any other command:
  • 87. COE Unit 2, Lesson 2
The script should have execute permissions for the correct owners in order to be runnable. After setting permissions, check that you really obtained the permissions that you want.
    bash> chmod u+x myScript.sh
    bash> ls -l myScript.sh
    -rwxrw-r-- 1 salil salil 456 Dec 24 17:11 myScript.sh
    bash> myScript.sh
    This is my first shell script.
    Welcome salil
    Date: 12/21/07 Time: 12:26:40
    Here are the files in your current directory.
    demo.txt demo2.txt demo3.txt lab myScript.sh newfile.txt output.txt update.ppt
The scheme shown above is the most common way to execute a script. Executing the script like this, in a subshell, is preferred. The variables, functions and aliases created in this subshell are only known to the particular bash session of that subshell. When that shell exits and the parent shell regains control, everything is cleaned up.
Remember to add the scripts directory to the PATH variable. PATH is essentially a colon-separated list of directories. When you execute a command, the shell searches through each of these directories, one by one, until it finds a directory where the executable exists.
    export PATH="$PATH:$HOME/scripts"
If you did not put the scripts directory in your PATH, and the current directory is not in the PATH either, you need to specify the path of the script to run it. If the script is in the current directory, run it like this:
    ./script_name.sh
A script can also be executed explicitly by a given shell, but generally we only do this if we want to obtain special behavior, such as checking whether the script works with another shell or printing traces for debugging:
    rbash script_name.sh
  • 88. COE Unit 2, Lesson 2
    sh script_name.sh
    bash -x script_name.sh
The specified shell starts as a subshell of your current shell and executes the script. This is done when you want the script to start up with specific options or under specific conditions that are not specified in the script. If you don't want to start a new shell but want to execute the script in the current shell, you source it:
    source script_name.sh
The script does not need execute permission in this case. Commands are executed in the current shell context, so any changes made to your environment remain in effect when the script finishes execution.
2.3 Script Basics
2.3.1 Which shell will Run the Script?
When running a script in a subshell, you should define which shell should run the script. Consider, for example, that your login shell may be the C shell while your script contains bash commands. The shell type in which you wrote the script might not be the default on your system, so the commands you entered might result in errors when executed by the wrong shell.
The first line of the script determines the shell in which the script will run. The first two characters of the first line should be #!, followed by the path to the shell that should interpret the commands that follow. Blank lines are also considered to be lines, so don't start your script with an empty line. For the purpose of this course, all scripts will start with the line
    #!/bin/bash
2.3.2 Adding comments
It is good practice to add comments to your scripts. Comments help in the future, when you need to enhance or fix the script. Comments also make the scripts more readable.
  • 89. COE Unit 2, Lesson 2
commented_script1.sh
    #!/bin/bash
    # This script clears the terminal, displays a greeting and gives information
    # about currently connected users. The current directory contents are
    # displayed too
    clear                               # clear terminal window
    echo "The script starts now."
    echo "Hi, $USER!"                   # dollar sign is used to get content of variable
    echo
    echo "List of connected users:"
    echo
    w                                   # show who is logged on
    echo
    echo "Displaying the contents of this directory"
    ls                                  # To list the contents of this directory
The first line of the script determines the shell to start, Bash in this case. The lines starting with # are comments: everything the shell encounters after a hash mark on a line is ignored.
Usually, the initial few lines of a script should state the purpose of the script. You should then put comments in the code as well.
2.4 Debugging Bash Scripts
2.4.1 Debugging On the Entire Script
Bash provides extensive debugging features. The most common is to start up the subshell with the -x option, which runs the entire script in debug mode. Traces of each command plus its arguments are printed to standard error after the commands have been expanded but before they are executed.
Following is the commented_script1.sh script run in debug mode. Note again that the added comments are not visible in the output of the script.
  • 90. COE Unit 2, Lesson 2
    bash> bash -x commented_script1.sh
    + clear
    + echo 'The script starts now.'
    The script starts now.
    + echo 'Hi, salil!'
    Hi, salil!
    + echo
    + echo 'List of connected users:'
    List of connected users:
    + echo
    + w
    4:50pm up 18 days, 6:49, 4 users, load average: 0.58, 0.62, 0.40
    USER   TTY    FROM  LOGIN@   IDLE   JCPU   PCPU   WHAT
    root   tty2   -     Sat 2pm  5:36m  0.24s  0.05s  -bash
    salil  :0     -     Sat 2pm  ?      0.00s  ?      -
    salil  pts/2  -     Sat 2pm  43:13  0.13s  0.06s  /usr/bin/screen
    + echo
    + echo 'Displaying the contents of this directory'
    Displaying the contents of this directory
    + ls
    demo1.txt demo2.txt myScript.sh
2.4.2 Debugging On Part(s) Of the Script
Using the set Bash built-in you can run in normal mode those portions of the script that you are sure are without fault, and display debugging information only for troublesome zones. Say we are not sure what the w command will do in the example commented_script1.sh; then we could enclose it in the script like this:
    set -x      # activate debugging from here
    w
    set +x      # stop debugging from here
  • 91. COE Unit 2, Lesson 2
Output then looks like this:
    bash> script1.sh
    The script starts now.
    Hi, salil!
    List of connected users:
    + w
    5:00pm up 18 days, 7:00, 4 users, load average: 0.79, 0.39, 0.33
    USER   TTY    FROM  LOGIN@   IDLE   JCPU   PCPU   WHAT
    root   tty2   -     Sat 2pm  5:47m  0.24s  0.05s  -bash
    salil  :0     -     Sat 2pm  ?      0.00s  ?      -
    salil  pts/2  -     Sat 2pm  54:02  0.13s  0.06s  /usr/bin/screen
    + set +x
    Displaying the contents of this directory
    demo1.txt demo2.txt myScript.sh
    bash>
The table below gives an overview of other useful Bash options:
Table – Overview of set debugging options
    Short notation  Long notation   Result
    set -f          set -o noglob   Disable file name generation using metacharacters (globbing).
    set -v          set -o verbose  Prints shell input lines as they are read.
    set -x          set -o xtrace   Print command traces before executing command.
A dash is used to activate a shell option and a plus to deactivate it.
These modes can also be specified in the script itself, by adding the desired options to the shell declaration on the first line. Options can be combined, as is usually the case with UNIX commands:
    #!/bin/bash -xv
In the example below, we demonstrate these options on the command line:
  • 92. COE Unit 2, Lesson 2
    bash> set -v
    bash> ls
    ls
    commented-scripts.sh script1.sh
    bash> set +v
    set +v
    bash> ls *
    commented-scripts.sh script1.sh
    bash> set -f
    bash> ls *
    ls: *: No such file or directory
    bash> touch *
    bash> ls
    * commented-scripts.sh script1.sh
    bash> rm *
    bash> ls
    commented-scripts.sh script1.sh
Once you have found the buggy part of your script, you can add echo statements before each command of which you are unsure, so that you will see exactly where and why things don't work. In the example commented_script1.sh script, it could be done like this, still assuming that the displaying of users gives us problems:
    echo "debug message: now attempting to start w command"; w
In more advanced scripts, echo can be inserted to display the content of variables at different stages in the script, so that flaws can be detected:
    echo "Variable VARNAME is now set to $VARNAME."
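The two techniques described in this section, tracing one part of a script with set -x and printing the content of a variable with echo, can be combined in one short script. This is a minimal sketch; the function suspect_step and the variable result are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# Sketch: trace only the suspect part of a script, then print a variable.
# suspect_step and result are made-up names for illustration.
suspect_step() {
    result=$(( 6 * 7 ))
}

echo "normal mode: no trace is printed for this line"

set -x            # activate tracing from here
suspect_step
set +x            # stop tracing from here

# Debug message showing the content of the variable at this stage.
echo "debug message: result is now set to $result"
```

Bash writes the trace lines to standard error, so running the script with 2> and a file name (for example 2> trace.log, a name assumed here) keeps the traces apart from the normal output.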
  • 93. COE Unit 2, Lesson 2
2.5 Quoting
Quoting is used to remove the special meaning of certain characters or words to the shell. Quoting can be used to disable special treatment for special characters (to preserve their literal meaning), to prevent reserved words from being recognized as such, and to prevent parameter expansion. The following characters should be quoted if they are to represent themselves:
    | & ; < > ( ) $ ` \ " ' <space> <tab> <newline>
There are three quoting mechanisms:
    1. The escape character
    2. Single quotes
    3. Double quotes
2.5.1 Escape Character
A non-quoted backslash '\' is the Bash escape character. It preserves the literal value of the next character that follows, with the exception of newline. If a \newline pair appears, and the backslash itself is not quoted, the \newline is treated as a line continuation (that is, it is removed from the input stream and effectively ignored).
    bash> date=26122007
    bash> echo $date
    26122007
    bash> echo \$date
    $date
The variable date is created and set to hold a value. The first echo displays the value of the variable, but for the second, the dollar sign is escaped.
The following script shows the effect of the backslash on a newline:
escape.sh
    #!/bin/bash
    echo "Statement 1: This will print as
    two lines."
    echo "Statement 2: This will print as \
    one line."
  • 94. COE Unit 2, Lesson 2
On running this script:
    bash> escape.sh
    Statement 1: This will print as
    two lines.
    Statement 2: This will print as one line.
2.5.2 Single Quotes
Enclosing characters in single quotes (' ') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash. Example:
    bash> echo '$date'
    $date
2.5.3 Double-Quotes
Enclosing characters in double quotes (" ") preserves the literal value of all characters within the double quotes, with the exception of the dollar sign '$', the backquote '`' and the backslash '\'. The characters '$' and '`' retain their special meaning within double quotes. The backslash retains its special meaning only when followed by one of the following characters: '$', '`', '"', '\', or newline.
    bash> echo "$date"
    26122007
    bash> echo "`date`"
    Sun Apr 20 11:22:06 CEST 2003
    bash> echo "I'd say: \"Go for it!\""
    I'd say: "Go for it!"
    bash> echo "In DOS directories are separated by \\ character"
    In DOS directories are separated by \ character
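The three quoting mechanisms can be compared directly by applying each of them to the same variable. The sketch below assumes nothing beyond Bash itself; the variable names escaped, single, double, mixed and backslash are made up for illustration.

```shell
#!/bin/bash
# Sketch: how the three quoting mechanisms treat the same text.
date=26122007

escaped=\$date                          # backslash: the $ stays literal
single='$date'                          # single quotes: everything is literal
double="$date"                          # double quotes: $ is still expanded
mixed="I'd say: \"Go for it!\""         # \" gives a double quote inside double quotes
backslash="separated by \\ character"   # \\ gives a single backslash

printf '%s\n' "$escaped" "$single" "$double" "$mixed" "$backslash"
```

printf with the %s format is used instead of echo so that the stored values are printed exactly as they are, whatever the echo settings of the shell.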
  • 95. COE Unit 2, Lesson 2 2.6 Special Variables
    Some variables are set internally by the shell and are available to the user. The following table lists some of them:
        Variable   Definition
        $0         Expands to the name of the shell script or command currently being executed, or the name of the shell.
        $1         Positional parameter #1. Similarly for 2, 3 .. 9. For 10 use ${10}.
        $*         Expands to the positional parameters, starting from one ($1). When the expansion occurs within double quotes, it expands to a single word with the value of each parameter separated by the first character of the IFS special variable (refer to the note below).
        $@         Expands to the positional parameters, starting from one ($1). When the expansion occurs within double quotes, each parameter expands to a separate word.
        $#         Expands to the total number of positional parameters in decimal.
        $?         The exit status of the last command executed, given as a decimal string.
        $-         The option flags currently in effect for the shell (set on invocation or with set).
        $$         Expands to the process ID of the shell.
        $!         Expands to the process ID of the most recently executed background command.
    Note: $IFS, the internal field separator, is a variable which determines how Bash recognizes fields, or word boundaries, when it interprets character strings. $IFS defaults to whitespace (space, tab and newline).
    A positional parameter is a variable within a shell script whose value is set from an argument specified on the command line that invokes the script. Positional parameters are numbered and are referred to with a preceding '$': $1, $2, $3, and so on. The first nine can be written as $1 to $9; from the tenth onwards the number must be enclosed in braces, as in ${10}. If a shell program is invoked with a command line that appears like this:
        my_script.sh pp1 pp2 pp3 pp4 pp5 pp6 pp7 pp8 pp9
    then positional parameter $1 within the script is assigned the value pp1, positional parameter $2 is assigned the value pp2, and so on, at the time the shell script is invoked.
  • 96. COE Unit 2, Lesson 2
        #!/bin/bash
        # positional.sh
        # This script reads 3 positional parameters and prints them out.
        PAR1="$1"
        PAR2="$2"
        PAR3="$3"
        echo "$1 is the first positional parameter, \$1."
        echo "$2 is the second positional parameter, \$2."
        echo "$3 is the third positional parameter, \$3."
        echo
        echo "The total number of positional parameters is $#."
    Upon execution one can give any number of arguments:
        bash> positional.sh one two three four five
        one is the first positional parameter, $1.
        two is the second positional parameter, $2.
        three is the third positional parameter, $3.

        The total number of positional parameters is 5.
        bash> positional.sh one two
        one is the first positional parameter, $1.
        two is the second positional parameter, $2.
         is the third positional parameter, $3.

        The total number of positional parameters is 2.
    In the second run $3 is empty, so nothing is printed in its place.
    When a UNIX command runs, it can return a numeric exit status value to the process that called (started) it. The status tells the calling process whether the command succeeded or failed. Many (but not all) UNIX commands return a status of zero if everything was okay and non-zero (1, 2, etc.) if something went wrong. A few commands, like grep and diff, return a different non-zero status for different kinds of problems. See the online manual pages for details.
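The different non-zero statuses of grep can be observed directly. The sketch below assumes the usual grep convention (0 for a match, 1 for no match, a value greater than 1 for an error such as an unreadable file); the function name `grep_status` and the sample file are invented for this illustration.

```shell
#!/bin/bash
# grep_status prints the exit status of a quiet grep for pattern $1 in file $2.
grep_status() {
    status=0
    # "|| status=$?" stores the status only when grep does not succeed.
    grep -q "$1" "$2" 2>/dev/null || status=$?
    echo "$status"
}

# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
echo "My favourite colour is Blue." > "$sample"

grep_status Blue "$sample"          # a matching line exists
grep_status Green "$sample"         # no line matches
grep_status Blue /no/such/file      # grep cannot read the file
rm -f "$sample"
```

The first call prints 0 and the second prints 1; the third prints a value greater than 1.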
  • 97. COE Unit 2, Lesson 2 More examples:
        bash> grep dictionary /usr/share/dict/words
        dictionary
        bash> echo $$
        10662
        bash> mozilla &
        [1] 11064
        bash> echo $!
        11064
        bash> echo $0
        bash
        bash> echo $?
        0
        bash> ls abc
        ls: abc: No such file or directory
        bash> echo $?
        1
    User rahul starts by entering the grep command. The process ID of his shell is 10662. After putting a job in the background, $! holds the process ID of the backgrounded job. The shell running is bash. When a mistake is made, $? holds an exit status different from 0 (zero); otherwise the status is 0.
    The following script shows the use of the "$*" special variable:
    spl_var_eg.sh
        #!/bin/bash
        echo "My Process ID is: $$"
        echo "The number of Arguments is $#"
        echo "The Arguments are $*"
        grep "$1" "$2"
        echo "Job Over"
    Upon execution:
        bash> spl_var_eg.sh Blue demo1.txt
        My Process ID is: 23465
        The number of Arguments is 2
        The Arguments are Blue demo1.txt
        My favourite colour is Blue.
        Job Over
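The difference between "$*" and "$@" inside double quotes, as described in the table of special variables, shows up when one argument contains a space. A minimal sketch follows; the helper names `count_args` and `compare_expansions` are invented for this illustration.

```shell
#!/bin/bash
# count_args prints how many arguments it received.
count_args() {
    echo $#
}

# compare_expansions passes its own positional parameters on twice:
# once as "$*" (one single word) and once as "$@" (one word per parameter).
compare_expansions() {
    echo "with \"\$*\": $(count_args "$*")"
    echo "with \"\$@\": $(count_args "$@")"
}

compare_expansions "one two" three
```

Called with the two arguments "one two" and three, the first line reports 1 word and the second reports 2.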
  • 98. COE Unit 2, Lesson 2 2.7 Summing Up
    A shell script is a reusable series of commands put in an executable text file. Any text editor can be used to write scripts. Scripts start with #! followed by the path to the shell that executes the commands from the script. Comments are added to a script for your own future reference, and also to make it understandable for other users. It is better to have too many explanations than not enough.
    Debugging a script can be done using shell options. Shell options can be used for partial debugging or for analyzing the entire script. Inserting echo commands at strategic locations is also a common troubleshooting technique.
    Self-check Questions
    1. What do you need to add to the first line of the script to indicate the Bash shell?
    2. Why are comments needed and how do you add them?
    3. What happens when a script is executed with "bash -x"?
    2.8 Answers to the Self-Check questions
    1. #!/bin/bash
    2. Comments tell the reader what the script does and make it comprehensible. A comment is added in the format: # <the comment>
    3. The entire script runs in debug mode: each command is printed, after expansion, before it is executed.
    2.9 Terminal Questions
    1. What are the different steps for creating a shell script?
    2. How would you debug a part of the script?
    3. What are the different shell debugging options?
    4. Why is quoting used? Give examples.
  • 99. COE Unit 2, Lesson 3 LESSON 3 CONDITIONAL STATEMENTS
  • 101. COE Unit 2, Lesson 3 3. Conditional statements
    Conditional statements are used very frequently in scripts, so a clear understanding of them is important.
    3.0 Objectives
    After going through this lesson, you will know about:
    - The if statement
    - Using the exit status of a command
    - Comparing and testing input and files
    - If-then-else constructs
    - If-then-elif-else constructs
    - Using and testing the positional parameters
    - Nested if statements
    - Using case statements
    3.1 Introduction
    This chapter introduces the use of conditionals in Bash scripts. They enable the student to write scripts that are more powerful and cater to different conditions.
    3.2 Introduction to if
    3.2.1 General
    At times you need to specify different courses of action to be taken in a shell script, depending on the success or failure of a command. The if construction allows you to specify such conditions. The most compact syntax of the if command is:
        if TEST-COMMANDS; then CONSEQUENT-COMMANDS; fi
    Example: checking shell options
  • 102. COE Unit 2, Lesson 3
        # These lines will print a message if the noclobber option is set
        if [ -o noclobber ]
        then
            echo "Your files are protected against accidental overwriting using redirection."
        fi
    The TEST-COMMANDS list is executed, and if its return status is zero, the CONSEQUENT-COMMANDS list is executed. The return status is the exit status of the last command executed, or zero if no condition tested true.
    The TEST-COMMANDS often involve numerical or string comparison tests, but they can also be any command that returns a status of zero when it succeeds and some other status when it fails. Unary expressions are often used to examine the status of a file. If the FILE argument to one of the primaries is of the form /dev/fd/N, then file descriptor "N" is checked. stdin, stdout and stderr and their respective file descriptors may also be used for tests.
    Expressions used with if
    The table below contains an overview of the so-called "primaries" that make up the TEST-COMMANDS command or list of commands. These primaries are put between square brackets to indicate the test of a conditional expression.
    Table - Primary expressions
        Primary                       Meaning
        [ -a FILE ]                   True if FILE exists.
        [ -o OPTIONNAME ]             True if shell option "OPTIONNAME" is enabled.
        [ -z STRING ]                 True if the length of "STRING" is zero.
        [ -n STRING ] or [ STRING ]   True if the length of "STRING" is non-zero.
        [ STRING1 == STRING2 ]        True if the strings are equal. "=" may be used instead of "==" for strict POSIX compliance.
        [ STRING1 != STRING2 ]        True if the strings are not equal.
        [ STRING1 < STRING2 ]         True if "STRING1" sorts before "STRING2" lexicographically in the current locale.
        [ STRING1 > STRING2 ]         True if "STRING1" sorts after "STRING2" lexicographically in the current locale.
        [ ARG1 OP ARG2 ]              "OP" is one of -eq, -ne, -lt, -le, -gt or -ge. These arithmetic binary operators return true if "ARG1" is equal to, not equal to, less than, less than or equal to, greater than, or greater than or equal to "ARG2", respectively. "ARG1" and "ARG2" are integers.
    Inside single square brackets, < and > must be escaped (\< and \>), otherwise the shell treats them as redirection operators.
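A few of the primaries from the table can be tried out in a small function. This is only a sketch; the function name `check` and its three arguments (one string, two integers) are assumptions made for the illustration.

```shell
#!/bin/bash
# check tests its first argument as a string ($1) and compares
# its second and third arguments as integers ($2 and $3).
check() {
    if [ -z "$1" ]; then echo "string is empty"; fi
    if [ -n "$1" ]; then echo "string is not empty"; fi
    if [ "$2" -lt "$3" ]; then echo "$2 is less than $3"; fi
    if [ "$2" -ge "$3" ]; then echo "$2 is greater than or equal to $3"; fi
}

check "" 3 7
check "abc" 9 7
```

Quoting "$1" matters here: without the quotes an empty argument would disappear from the test altogether.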
  • 103. COE Unit 2, Lesson 3 Expressions may be combined using the following operators, listed in decreasing order of precedence:
    Table - Combining expressions
        Operation              Effect
        [ ! EXPR ]             True if EXPR is false.
        [ ( EXPR ) ]           Returns the value of EXPR. This may be used to override the normal precedence of operators.
        [ EXPR1 -a EXPR2 ]     True if both EXPR1 and EXPR2 are true.
        [ EXPR1 -o EXPR2 ]     True if either EXPR1 or EXPR2 is true.
    The [ (or test) built-in evaluates conditional expressions using a set of rules based on the number of arguments. More information about this subject can be found in the Bash documentation. Just as if is closed with fi, the opening square bracket must be closed with ] after the conditions have been listed.
    Commands following the then statement
    The CONSEQUENT-COMMANDS list that follows the then statement can be any valid UNIX command, any executable program, any executable shell script or any shell statement, with the exception of the closing fi. It is important to remember that then and fi are considered to be separate statements in the shell. Therefore, when issued on the command line, they are separated by a semicolon. In a script, the different parts of the if statement are usually well separated. Below are a couple of simple examples.
    Checking files
    The first example checks for the existence of a file:
  • 104. COE Unit 2, Lesson 3 filecheck.sh
        #!/bin/bash
        echo "This script checks the existence of the demo file."
        echo "Checking..."
        if [ -f /usr/guest/demo.txt ]
        then
            echo "/usr/guest/demo.txt file exists."
        fi
        echo
        echo "...done."
        bash> ./filecheck.sh
        This script checks the existence of the demo file.
        Checking...
        /usr/guest/demo.txt file exists.

        ...done.
    3.2.2 Simple applications of if
    Testing exit status
        bash> if [ $? -eq 0 ]
        > then echo 'That was a good job!'
        > fi
        That was a good job!
        bash>
    Numeric comparisons
        bash> num=`wc -l < demo1.txt`
        bash> echo $num
        201
        bash> if [ "$num" -gt "150" ]
        > then echo ; echo "This is a big file."
        > echo ; fi

        This is a big file.

        bash>
    The file is passed to wc through input redirection so that only the count is printed. With the file name as an argument, wc would print "201 demo1.txt", which is not an integer and would make the -gt test fail.
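The numeric comparison can also be wrapped in functions so that it is reusable in a script. The following is a sketch under the assumption that the count comes from wc reading standard input (wc given a file name prints the name after the count); the names `count_lines` and `is_big_file` and the temporary file are invented for the illustration.

```shell
#!/bin/bash
# count_lines prints only the number of lines in the file named by $1.
# Reading from standard input keeps wc from printing the file name.
count_lines() {
    wc -l < "$1"
}

# is_big_file succeeds (status 0) when file $1 has more than $2 lines.
is_big_file() {
    [ "$(count_lines "$1")" -gt "$2" ]
}

# Hypothetical three-line file, created only for this demonstration.
sample=$(mktemp)
printf 'a\nb\nc\n' > "$sample"
if is_big_file "$sample" 2; then
    echo "This is a big file."
fi
rm -f "$sample"
```

Because is_big_file returns the exit status of the [ command, it can be used directly as the TEST-COMMANDS of an if statement.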
  • 105. COE Unit 2, Lesson 3 String comparisons
        dir=`pwd`                # /tmp/proc
        updir=`dirname "$dir"`   # /tmp
        if [ "$updir"X != "/tmpX" ]; then
            echo "You need to be in a subdirectory of /tmp."
            exit 1
        fi
    3.3 More advanced if usage
    3.3.1 if-then-else constructs
    Like the CONSEQUENT-COMMANDS list following the then statement, the ALTERNATE-CONSEQUENT-COMMANDS list following the else statement can hold any UNIX-style command that returns an exit status.
    Example
    fun_weigh.sh
        #!/bin/bash
        # This script prints a message about your weight if you give it your
        # weight in kilos and height in centimeters.
        weight="$1"
        height="$2"
        idealweight=$((height - 110))
        if [ "$weight" -le "$idealweight" ] ; then
            echo "You should eat a bit more fat."
        else
            echo "You should eat a bit more fruit."
        fi
    On executing the script with the -x option we get:
        bash> bash -x fun_weigh.sh 55 169
        + weight=55
        + height=169
        + idealweight=59
        + '[' 55 -le 59 ']'
        + echo 'You should eat a bit more fat.'
        You should eat a bit more fat.
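The arithmetic in fun_weigh.sh can be isolated in a function using the $(( )) arithmetic expansion; the older $[ ] form does the same job in Bash but is deprecated. This is a sketch only, and the function names `ideal_weight` and `advice` are invented for the illustration.

```shell
#!/bin/bash
# ideal_weight prints the height ($1) minus 110, the rule used by fun_weigh.sh.
ideal_weight() {
    echo $(( $1 - 110 ))
}

# advice repeats the comparison from fun_weigh.sh
# for a weight ($1) and a height ($2).
advice() {
    if [ "$1" -le "$(ideal_weight "$2")" ]; then
        echo "You should eat a bit more fat."
    else
        echo "You should eat a bit more fruit."
    fi
}

advice 55 169
advice 70 150
```

For 55 kg and 169 cm the ideal weight is 59, so the first call prints the "fat" message; for 70 kg and 150 cm it is 40, so the second call prints the "fruit" message.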
  • 106. COE Unit 2, Lesson 3 Testing the number of arguments
    The previous script is modified so that it prints a message if more or fewer than 2 arguments are given:
    fun_weigh.sh
        #!/bin/bash
        # This script prints a message about your weight if you give it your
        # weight in kilos and height in centimeters.
        if [ $# -ne 2 ]; then
            echo "Usage: $0 weight_in_kilos length_in_centimeters"
            exit 1
        fi
        weight="$1"
        height="$2"
        idealweight=$((height - 110))
        if [ "$weight" -le "$idealweight" ] ; then
            echo "You should eat a bit more fat."
        else
            echo "You should eat a bit more fruit."
        fi
        bash> ./fun_weigh.sh 70 150
        You should eat a bit more fruit.
        bash> ./fun_weigh.sh 70 150 33
        Usage: ./fun_weigh.sh weight_in_kilos length_in_centimeters
    The first argument is referred to as $1, the second as $2 and so on. The total number of arguments is stored in $#.
    3.3.2 if-then-elif-else constructs
    This is the full form of the if statement:
        if TEST-COMMANDS; then
            CONSEQUENT-COMMANDS;
        elif MORE-TEST-COMMANDS; then
            MORE-CONSEQUENT-COMMANDS;
        else
            ALTERNATE-CONSEQUENT-COMMANDS;
        fi
  • 107. COE Unit 2, Lesson 3 testleap.sh
        #!/bin/bash
        # This script will test if we're in a leap year or not.
        year=`date +%Y`
        if [ $((year % 400)) -eq 0 ]; then
            echo "This is a leap year. February has 29 days."
        elif [ $((year % 4)) -eq 0 ]; then
            if [ $((year % 100)) -ne 0 ]; then
                echo "This is a leap year. February has 29 days."
            else
                echo "This is not a leap year. February has 28 days."
            fi
        else
            echo "This is not a leap year. February has 28 days."
        fi
        bash> date
        Fri Dec 21 17:14:28 IST 2007
        bash> testleap.sh
        This is not a leap year. February has 28 days.
    Also note the nested ifs here. You may use as many levels of nested ifs as you can logically manage.
    3.3.3 Returning the exit status using if
    Sometimes you test for a condition and find that it fails. There is no point in continuing if an essential resource is missing, say the file you want to search, so you would rather have the program terminate. The exit statement is used to terminate a program prematurely. It takes an optional argument: the integer exit status code, which is passed back to the parent and stored in the $? variable.
        #!/bin/bash
        if [ $# -ne 2 ]; then
            echo "Usage: $0 <file1> <file2>"
            exit 2
        fi
        ...<rest of script>
    In this example, if the number of arguments is not 2, a message about the usage is printed and the script exits with status 2.
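How the status given to exit reaches the caller can be sketched without a separate script file. In the example below the function body is written in parentheses, so it runs in a subshell and exit ends only that subshell, much as it would end a script of its own. The name `need_two_args` and its messages are invented for the illustration.

```shell
#!/bin/bash
# need_two_args mimics a script that insists on exactly two arguments.
# The parentheses make the body run in a subshell.
need_two_args() (
    if [ $# -ne 2 ]; then
        echo "Usage: need_two_args <file1> <file2>" >&2
        exit 2
    fi
    echo "Comparing $1 and $2"
)

status=0
# "|| status=$?" records the status only when the call fails.
need_two_args only_one 2>/dev/null || status=$?
echo "exit status: $status"
```

The caller sees the value 2 that was passed to exit, just as a parent shell would find it in $? after running a script.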
  • 108. COE Unit 2, Lesson 3 3.4 Using case statements
    Nested if statements might be nice, but as soon as you are confronted with a couple of different possible actions to take, they tend to confuse. For the more complex conditionals, use the case syntax:
        case EXPRESSION in
            CASE1) COMMAND-LIST;;
            CASE2) COMMAND-LIST;;
            ...
            CASEN) COMMAND-LIST;;
        esac
    Each case is an expression matching a pattern. The commands in the COMMAND-LIST for the first match are executed. The "|" symbol may be used for separating multiple patterns, and the ")" operator terminates a pattern list. Each case plus its corresponding commands is called a clause. Each clause must be terminated with ";;". Each case statement is ended with the esac statement.
    In the example, we demonstrate the use of case for getting the disk usage.
    disk_utility.sh
        #!/bin/bash
        echo -e "\n 1. The free disk space\n 2. Space consumed by this user\n 3. Exit\n\n SELECTION: \c"
        read selection
        case $selection in
            1) df ;;
            2) du -s $HOME ;;
            3) exit ;;
            *) echo "Not a valid option" ;;
        esac
    With the -e option, Bash's echo interprets backslash escape sequences: \n produces a newline, and \c suppresses the trailing newline, so the cursor stays immediately after the prompt instead of moving to the next line. The read statement takes input from the user, thereby making the script interactive. The input is read into a variable (selection in this case). The output is as follows:
  • 109. COE Unit 2, Lesson 3
        bash> disk_utility.sh

         1. The free disk space
         2. Space consumed by this user
         3. Exit

         SELECTION: 2
        456100 /home/pallavi
    3.5 Summary
    In this chapter we learned how to build conditions into our scripts so that different actions can be undertaken upon success or failure of a command. The actions can be determined using the if statement. This allows you to perform arithmetic and string comparisons, and to test exit codes, input and files needed by the script. A simple if-then-fi test often precedes commands in a shell script in order to prevent output generation, so that the script can easily be run in the background or through the cron facility. More complex definitions of conditions are usually put in a case statement.
    Self-check Questions
    1. What is the use of the "if" statement?
    2. What is the exit status of a command? What is its normal value and where is the value stored?
    3.6 Answers to the Self-Check questions
    1. The "if" statement takes two-way decisions depending on the fulfillment of a certain condition.
    2. The exit status is an integer that represents the success or failure of a command. It has the value 0 when the command executes successfully and is stored in the parameter $?.
    3.7 Terminal Questions
    1. List some applications of the "if-then-elif-else" statement.
    2. Give an example of "case" usage.
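As a starting point for the terminal question on case, the "|" separator described in Section 3.4 lets one clause serve several patterns. A minimal sketch, with the function name `classify` and its answers invented for the illustration:

```shell
#!/bin/bash
# classify maps several spellings of an answer ($1) onto one clause each.
classify() {
    case $1 in
        y|Y|yes) echo "affirmative" ;;
        n|N|no)  echo "negative" ;;
        *)       echo "unknown" ;;    # default clause: matches anything else
    esac
}

classify Y
classify no
classify maybe
```

The clauses are tried from top to bottom, so the catch-all pattern * is placed last.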
  • 111. COE Unit 2, Lesson 4 LESSON 4 REPETITIVE TASKS
    4. REPETITIVE TASKS ... 113
    4.0 OBJECTIVES ... 113
    4.1 INTRODUCTION ... 113
    4.2 THE FOR LOOP ... 113
        4.2.1 How does it work? ... 113
        4.2.2 Examples ... 114
    4.3 THE WHILE LOOP ... 115
        4.3.1 What is it? ... 115
        4.3.2 Examples ... 115
    4.4 THE UNTIL LOOP ... 117
        4.4.1 What is it? ... 117
        4.4.2 Example ... 118
    4.5 I/O REDIRECTION AND LOOPS ... 118
        4.5.1 Input redirection ... 119
        4.5.2 Output redirection ... 119
    4.6 BREAK AND CONTINUE ... 119
        4.6.1 The break built-in ... 120
        4.6.2 The continue built-in ... 121
        4.6.3 Examples ... 121
    4.7 MAKING MENUS WITH THE SELECT BUILT-IN ... 123
        4.7.1 General ... 123
        4.7.2 Submenus ... 126
    4.8 THE SHIFT BUILT-IN ... 126
        4.8.1 What does it do? ... 126
        4.8.2 Examples ... 126
    4.9 SUMMARY ... 127
    4.10 ANSWERS TO THE SELF-CHECK QUESTIONS ... 128
    4.11 TERMINAL QUESTIONS ... 128
  • 113. COE Unit 2, Lesson 4 4. Repetitive tasks
    Loops take scripting to the next level and come in handy in a wide variety of applications, so it is important to appreciate why scripts need them.
    4.0 Objectives
    Upon completion of this chapter, you will be able to:
    - Use for, while and until loops, and decide which loop fits which occasion.
    - Use the break and continue Bash built-ins.
    - Write scripts using the select statement.
    - Write scripts that take a variable number of arguments.
    4.1 Introduction
    This chapter teaches the student to write the different types of loops for any application that requires repetitive tasks. This is very helpful in writing useful scripts that need something to be done repeatedly.
    4.2 The for loop
    4.2.1 How does it work?
    The for loop is the first of the three shell looping constructs. This loop allows for specification of a list of values. A list of commands is executed for each value in the list. The syntax for this loop is:
        for NAME [in LIST ]; do COMMANDS; done
    If [in LIST] is not present, it is replaced with in $@ and for executes the COMMANDS once for each positional parameter that is set. The return status is the exit status of the last command that executes. If no commands are executed because LIST does not expand to any items, the return status is zero.
    NAME can be any variable name, although i is used very often. LIST can be any list of words, strings or numbers, which can be literal or generated by any command. The COMMANDS to execute can also be any operating system
  • 114. COE Unit 2, Lesson 4 commands, script, program or shell statement. The first time through the loop, NAME is set to the first item in the LIST. The second time, its value is set to the second item in the list, and so on. The loop terminates when NAME has taken on each of the values from LIST and no items are left in the LIST.
    4.2.2 Examples
    Using command substitution for specifying LIST items
    The first is a command line example, demonstrating the use of a for loop that makes a backup copy of each .xml file. After issuing the command, it is safe to start working on your sources:
        bash> ls *.xml
        file1.xml file2.xml file3.xml
        bash> ls *.xml > list
        bash> for i in `cat list`; do cp "$i" "$i".bak ; done
        bash> ls *.xml*
        file1.xml file1.xml.bak file2.xml file2.xml.bak file3.xml file3.xml.bak
    This one lists the files in /sbin that are just plain text files, and possibly scripts:
        for i in `ls /sbin`; do file /sbin/$i | grep ASCII; done
    Using the content of a variable to specify LIST items
    The following is a specific application script for converting HTML files, compliant with a certain scheme, to PHP files. The conversion is done by taking out the first 25 and the last 21 lines, replacing these with two PHP tags that provide header and footer lines:
    html2php.sh
        #!/bin/bash
        # specific conversion script for my html files to php
        LIST="$(ls *.html)"
        for i in $LIST; do
            NEWNAME=$(ls "$i" | sed -e 's/html/php/')
            cat beginfile > "$NEWNAME"
            cat "$i" | sed -e '1,25d' | tac | sed -e '1,21d' | tac >> "$NEWNAME"
            cat endfile >> "$NEWNAME"
        done
    $LIST is left unquoted in the for statement so that the shell splits it into one word per file name; inside double quotes it would be a single word and the loop would run only once. This assumes that the file names contain no whitespace.
  • 115. COE Unit 2, Lesson 4 Since we don't do a line count here, there is no way of knowing the line number from which to start deleting lines until reaching the end. The problem is solved using tac, which reverses the lines in a file.
    4.3 The while loop
    4.3.1 What is it?
    The while construct allows for repetitive execution of a list of commands, as long as the command controlling the while loop executes successfully (exit status of zero). The syntax is:
        while CONTROL-COMMAND; do CONSEQUENT-COMMANDS; done
    CONTROL-COMMAND can be any command(s) that can exit with a success or failure status. The CONSEQUENT-COMMANDS can be any program, script or shell construct. As soon as the CONTROL-COMMAND fails, the loop exits. In a script, the command following the done statement is executed next. The return status is the exit status of the last CONSEQUENT-COMMANDS command, or zero if none was executed.
    4.3.2 Examples
    Simple example using while
    Here is an example for the impatient:
        #!/bin/bash
        # This script opens 4 terminal windows.
        i="0"
        while [ $i -lt 4 ]
        do
            xterm &
            i=$((i + 1))
        done
    Nested while loops
    The example below was written to copy pictures that are made with a webcam to a web directory. Every five minutes a picture is taken. Every hour, a new directory is created, holding the images for that hour. Every day, a new
  • 116. COE Unit 2, Lesson 4 directory is created containing 24 subdirectories. The script runs in the background.
        #!/bin/bash
        # This script copies files from my homedirectory into the webserver directory.
        # (use scp and SSH keys for a remote directory)
        # A new directory is created every hour.
        PICSDIR=/home/mohan/pics
        WEBDIR=/var/www/mohan/webcam
        while true; do
            DATE=`date +%Y%m%d`
            HOUR=`date +%H`
            mkdir $WEBDIR/"$DATE"
            while [ $HOUR -ne "00" ]; do
                DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
                mkdir "$DESTDIR"
                mv $PICSDIR/*.jpg "$DESTDIR"/
                sleep 3600
                HOUR=`date +%H`
            done
        done
    Note the use of the true statement. This means: continue execution until we are forcibly interrupted (with kill or Ctrl+C).
    This small script can be used for simulation testing; it generates files:
        #!/bin/bash
        # This generates a file every 5 minutes
        while true; do
            touch pic-`date +%s`.jpg
            sleep 300
        done
    Note the use of the date command to generate all kinds of file and directory names. See the man page for more information on the date command.
    Calculating an average
  • 117. COE Unit 2, Lesson 4 This script calculates the average of user input, which is tested before it is processed: if input is not within range, a message is printed. If q is entered, the loop exits:
        #!/bin/bash
        # Calculate the average of a series of numbers.
        SCORE="0"
        AVERAGE="0"
        SUM="0"
        NUM="0"
        while true; do
            echo -n "Enter your score [0-100%] ('q' for quit): "; read SCORE;
            if (("$SCORE" < "0")) || (("$SCORE" > "100")); then
                echo "Be serious. Come on, try again: "
            elif [ "$SCORE" == "q" ]; then
                echo "Average rating: $AVERAGE%."
                break
            else
                SUM=$((SUM + SCORE))
                NUM=$((NUM + 1))
                AVERAGE=$((SUM / NUM))
            fi
        done
        echo "Exiting."
    Note how the variables in the last lines are left unquoted in order to do arithmetic.
    4.4 The until loop
    4.4.1 What is it?
    The until loop is very similar to the while loop, except that the loop executes until the TEST-COMMAND executes successfully. As long as this command fails, the loop continues. The syntax is the same as for the while loop:
        until TEST-COMMAND; do CONSEQUENT-COMMANDS; done
    The return status is the exit status of the last command executed in the CONSEQUENT-COMMANDS list, or zero if none was executed. TEST-COMMAND can, again, be any command that can exit with a success or failure status, and CONSEQUENT-COMMANDS can be any UNIX command, script or shell construct.
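The until syntax can be tried out with a small counter before moving on to the longer example. This is a sketch only; the function name `countdown` is invented for the illustration.

```shell
#!/bin/bash
# countdown prints the numbers from $1 down to 1 with an until loop.
# The loop body runs as long as the test [ "$n" -le 0 ] keeps failing.
countdown() {
    n=$1
    until [ "$n" -le 0 ]; do
        echo "$n"
        n=$((n - 1))
    done
}

countdown 3
```

The call prints 3, 2 and 1 on separate lines. The same loop written with while would need the opposite test, [ "$n" -gt 0 ].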
  • 118. COE Unit 2, Lesson 4 As was previously explained, the ";" may be replaced with one or more newlines wherever it appears.
    4.4.2 Example
    An improved version of the webcam script (see Section 4.3.2), which tests for available disk space. If there is not enough disk space, pictures from the previous months are removed:
        #!/bin/bash
        # This script copies files from my homedirectory into the webserver directory.
        # A new directory is created every hour.
        # If the pics are taking up too much space, the oldest are removed.
        PICSDIR=/home/mohan/pics
        WEBDIR=/var/www/mohan/webcam
        while true; do
            DISKFULL=$(df -h $WEBDIR | grep -v File | awk '{print $5}' | cut -d "%" -f1 -)
            until [ $DISKFULL -ge "90" ]; do
                DATE=`date +%Y%m%d`
                HOUR=`date +%H`
                mkdir $WEBDIR/"$DATE"
                while [ $HOUR -ne "00" ]; do
                    DESTDIR=$WEBDIR/"$DATE"/"$HOUR"
                    mkdir "$DESTDIR"
                    mv $PICSDIR/*.jpg "$DESTDIR"/
                    sleep 3600
                    HOUR=`date +%H`
                done
                DISKFULL=$(df -h $WEBDIR | grep -v File | awk '{ print $5 }' | cut -d "%" -f1 -)
            done
            TOREMOVE=$(find $WEBDIR -type d -a -mtime +30)
            for i in $TOREMOVE; do
                rm -rf "$i";
            done
        done
    Note the initialization of the HOUR and DISKFULL variables and the use of options with find and date in order to obtain a correct listing for TOREMOVE.
    4.5 I/O redirection and loops
  • 119. COE Unit 2, Lesson 4 4.5.1 Input redirection
    Instead of controlling a loop by testing the result of a command or by user input, you can specify a file from which to read input that controls the loop. In such cases, read is often the controlling command. As long as input lines are fed into the loop, execution of the loop commands continues. As soon as all the input lines have been read, the loop exits.
    Since the loop construct is considered to be one command structure (such as while TEST-COMMAND; do CONSEQUENT-COMMANDS; done), the redirection should occur after the done statement, so that it complies with the form
        command < file
    This kind of redirection also works with other kinds of loops.
    4.5.2 Output redirection
    In the example below, output of the find command is used as input for the read command controlling a while loop:
    archiveoldstuff.sh
        #!/bin/bash
        # This script creates a subdirectory in the current directory, to which old
        # files are moved.
        # Might be something for cron (if slightly adapted) to execute weekly or
        # monthly.
        ARCHIVENR=`date +%Y%m%d`
        DESTDIR="$PWD/archive-$ARCHIVENR"
        mkdir "$DESTDIR"
        find "$PWD" -type f -a -mtime +5 | while read file
        do
            gzip "$file"; mv "$file".gz "$DESTDIR"
            echo "$file archived"
        done
    Files are compressed by the gzip command before they are moved into the archive directory.
    4.6 Break and continue
  • 120. COE Unit 2, Lesson 4 4.6.1 The break built-in
    The break statement is used to exit the current loop before its normal ending. This is done when you don't know in advance how many times the loop will have to execute, for instance because it is dependent on user input.
    The example below demonstrates a while loop that can be interrupted. This is a slightly improved version of the wisdom.sh script from Section 4.3.2.
        #!/bin/bash
        # This script provides wisdom
        # You can now exit in a decent way.
        FORTUNE=/usr/games/fortune
        while true; do
            echo "On which topic do you want advice?"
            echo "1. politics"
            echo "2. startrek"
            echo "3. kernelnewbies"
            echo "4. sports"
            echo "5. bofh-excuses"
            echo "6. magic"
            echo "7. love"
            echo "8. literature"
            echo "9. drugs"
            echo "10. education"
            echo
            echo -n "Enter your choice, or 0 for exit: "
            read choice
            echo
            case $choice in
                1) $FORTUNE politics ;;
                2) $FORTUNE startrek ;;
                3) $FORTUNE kernelnewbies ;;
                4) echo "Sports are a waste of time, energy and money."
                   echo "Go back to your ..."
                   ;;
  • 121. COE Unit 2, Lesson 4
                5) $FORTUNE bofh-excuses ;;
                6) $FORTUNE magic ;;
                7) $FORTUNE love ;;
                8) $FORTUNE literature ;;
                9) $FORTUNE drugs ;;
                10) $FORTUNE education ;;
                0) echo "OK, see you!"
                   break ;;
                *) echo "That is not a valid choice, try a number from 0 to 10." ;;
            esac
        done
    Mind that break exits the loop, not the script. This can be demonstrated by adding an echo command at the end of the script. This echo will also be executed upon input that causes break to be executed (when the user types "0").
    In nested loops, break allows for specification of which loop to exit. See the Bash info pages for more.
    4.6.2 The continue built-in
    The continue statement resumes iteration of an enclosing for, while, until or select loop. When used in a for loop, the controlling variable takes on the value of the next element in the list. When used in a while or until construct, on the other hand, execution resumes with TEST-COMMAND at the top of the loop.
    4.6.3 Examples
    In the following example, file names are converted to lower case. If no conversion needs to be done, a continue statement restarts execution of the loop. These commands don't eat much system resources, and most likely,
  • 122. COE Unit 2, Lesson 4 similar problems can be solved using sed and a wk. However, it is useful to know about this kind of construction when executing heavy jobs, that might not even be necessary when tests are inserted at the correct locations in a script, sparing system resources. tolower.sh #!/bin/bash # This script converts all file names containing upper case characters into file # names containing LIST="$(ls)" for name in "$LIST"; do if [[ "$name" != *[[:upper:]]* ]]; then continue fi ORIG="$name" NEW=`echo $name | tr 'A−Z' 'a−z'` mv "$ORIG" "$NEW" echo "new name for $ORIG is $NEW" done This script has at least one disadvantage: it overwrites existing files. The noclobber option to Bash is only useful when redirection occurs. The −b option to the mv command provides more security, but is only safe in case of one accidental overwrite, as is demonstrated in this test: bash> rm * bash> touch test Test TEST bash> bash −x tolower.sh ++ ls + LIST=test Test TEST + [[ test != *[[:upper:]]* ]] + continue + [[ Test != *[[:upper:]]* ]] + ORIG=Test 122
  • 123. COE Unit 2, Lesson 4
        ++ echo TEST
        ++ tr A-Z a-z
        + NEW=test
        + mv -b TEST test
        + echo 'new name for TEST is test'
        new name for TEST is test
        bash> ls -a
        ./ ../ test test~

    The tr command is part of the textutils package; it can perform all kinds of character transformations.

    4.7 Making menus with the select built-in
    4.7.1 General
      • Use of select
    The select construct allows easy menu generation. The syntax is quite similar to that of the for loop:

        select WORD [in LIST]; do RESPECTIVE-COMMANDS; done

    LIST is expanded, generating a list of items. The expansion is printed to standard error; each item is preceded by a number. If in LIST is not present, the positional parameters are printed, as if in "$@" had been specified. LIST is only printed once.
    After printing all the items, the PS3 prompt is printed and one line from standard input is read. If this line consists of a number corresponding to one of the items, the value of WORD is set to the name of that item. If the line is empty, the items and the PS3 prompt are displayed again. If an EOF (End Of
  • 124. COE Unit 2, Lesson 4
    File) character is read, the loop exits. Since most users do not know which key combination produces the EOF sequence, it is more user-friendly to offer a break command as one of the items. Any other value of the line read sets WORD to the null string.
    The line read is saved in the REPLY variable.
    The RESPECTIVE-COMMANDS are executed after each selection until the number representing the break is read. This exits the loop.
      • Examples
    This is a very simple example, but as you can see, it is not very user-friendly:
  • 125. COE Unit 2, Lesson 4
    private.sh

        #!/bin/bash
        echo "This script can make any of the files in this directory private."
        echo "Enter the number of the file you want to protect:"
        select FILENAME in *; do
            echo "You picked $FILENAME ($REPLY), it is now only accessible to you."
            chmod go-rwx "$FILENAME"
        done

        bash> ./private.sh
        This script can make any of the files in this directory private.
        Enter the number of the file you want to protect:
        1) archive-20030129
        2) bash
        3) private.sh
        #? 1
        You picked archive-20030129 (1), it is now only accessible to you.
        #?

    Setting the PS3 prompt and adding a possibility to quit makes it better:

        #!/bin/bash
        echo "This script can make any of the files in this directory private."
        echo "Enter the number of the file you want to protect:"
        PS3="Your choice: "
        QUIT="QUIT THIS PROGRAM - I feel safe now."
        # The menu is built from the file names in the directory, so the quit
        # entry is created as a temporary file and removed again after the loop.
        touch "$QUIT"
        select FILENAME in *; do
            case $FILENAME in
                "$QUIT")
                    echo "Exiting."
                    break
                    ;;
                *)
                    echo "You picked $FILENAME ($REPLY)"
                    chmod go-rwx "$FILENAME"
                    ;;
            esac
        done
        rm "$QUIT"
  • 126. COE Unit 2, Lesson 4
    4.7.2 Submenus
    Any statement within a select construct can be another select loop, enabling one or more submenus within a menu. By default, the PS3 variable is not changed when entering a nested select loop. If you want a different prompt in the submenu, be sure to set it at the appropriate time(s).

    4.8 The shift built-in
    4.8.1 What does it do?
    The shift command is one of the Bourne shell built-ins that comes with Bash. This command takes one argument, a number. The positional parameters are shifted to the left by this number, N. The positional parameters from N+1 to $# are renamed to variable names from $1 to $# - N.
    Say you have a command that takes 10 arguments, and N is 3, then $4 becomes $1, $5 becomes $2 and so on. $10 becomes $7 and the original $1, $2 and $3 are thrown away.
    If N is zero or greater than $# (the total number of arguments), the positional parameters are not changed. If N is not present, it is assumed to be 1. The return status is zero unless N is greater than $# or less than zero, in which case it is non-zero.

    4.8.2 Examples
    A shift statement is typically used when the number of arguments to a command is not known in advance, for instance when users can give as many arguments as they like. In such cases, the arguments are usually processed in a while loop with a test condition of (( $# )). This condition is true as long as the number of arguments is greater than zero. The $1 variable and the shift statement process each argument. The number of arguments is reduced each time shift is executed and eventually becomes zero, upon which the while loop exits.
    The example below, cleanup.sh, uses shift statements to process each file in the list generated by find:
  • 127. COE Unit 2, Lesson 4

        #!/bin/bash
        # This script can clean up files that were last accessed over 365 days ago.
        USAGE="Usage: $0 dir1 dir2 dir3 ... dirN"
        if [ "$#" -eq 0 ]; then
            echo "$USAGE"
            exit 1
        fi
        while (( $# )); do
            # "$1" is quoted so that directory names containing spaces still work.
            if [[ "$(ls "$1")" == "" ]]; then
                echo "Empty directory, nothing to be done."
            else
                # The semicolon that ends -exec must be escaped from the shell.
                find "$1" -type f -a -atime +365 -exec rm -i {} \;
            fi
            shift
        done

    The above find command can be replaced with the following:

        find options | xargs [commands_to_execute_on_found_files]

    The xargs command builds and executes command lines from standard input. This has the advantage that the command line is filled until the system limit is reached. Only then is the command to execute called; in the above example this would be rm. If there are more arguments, a new command line is used, until that one is full or until there are no more arguments. By contrast, find -exec with \; calls the command once for every file found. Thus, using xargs can speed up your scripts considerably and reduce the load on your machine.

    4.9 Summary
    In this chapter, we discussed how repetitive commands can be incorporated in loop constructs. Most common loops are built using the for, while or until statements, or a combination of these commands. The for loop executes a task a defined number of times. If you don't know how many times a command should execute, use either until or while to specify when the loop should end.
    Loops can be interrupted or reiterated using the break and continue statements.
    A file can be used as input for a loop using the input redirection
  • 128. COE Unit 2, Lesson 4
    operator; loops can also read output from commands that is fed into the loop using a pipe.
    The select construct is used for printing menus in interactive scripts. Looping through the command line arguments to a script can be done using the shift statement.

    Self-check Questions
    1. What is the use of loops?
    2. List the different types of loops in the shell.
    3. What is the use of the "break" statement?
    4. What will the following construct do and why?
           while [ 5 ]

    4.10 Answers to the Self-Check questions
    1. Loops let the user perform a set of instructions repeatedly.
    2. for, while and until.
    3. The break statement is used to exit the current loop before its normal ending.
    4. This sets up an infinite loop: [ 5 ] tests whether the string "5" is non-empty, which is always true.

    4.11 Terminal Questions
    1. How would you decide which type of loop to use?
    2. Explain why it is so important to put the variables in between double quotes in the example from Section 4.4.2.
    3. Describe the "shift" built-in command.
    4. There are at least 6 syntactical mistakes in the following program. Locate them.
  • 129. COE Unit 2, Lesson 4

         1  ppprunning = yes
         2  while $ppprunning = yes ; do
         3  echo " INTERNET MENU\n
         4  1. Dial out
         5  2. Exit
         6
         7  Choice:
         8  read choice
         9  case choice in
        10  1) if [ -z "$ppprunning" ]
        11  echo "Enter your username and password"
        12  else
        13  chat.sh
        14  endif ;
        15  *) ppprunning=no
        16  endcase
        17  done
  • 131. COE Unit 2, Lesson 5
    LESSON 5  REGULAR EXPRESSIONS

    5. REGULAR EXPRESSIONS  133
      5.0 OBJECTIVES  133
      5.1 INTRODUCTION  133
      5.2 REGULAR EXPRESSIONS  133
        5.2.1 What are regular expressions?  133
        5.2.2 The Structure of a Regular Expression  134
        5.2.3 Regular expression metacharacters  135
        5.2.4 Creating complex regular expressions by concatenating other regEx  136
        5.2.5 Using metacharacters on regEx to create complex regEx  136
      5.3 THE GREP  137
        5.3.1 Grep and regular expressions  138
      5.4 PATTERN MATCHING USING SHELL  140
        5.4.1 Character ranges  140
        5.4.2 Character classes  141
      5.5 SUMMARY  141
      5.6 ANSWERS TO THE SELF-CHECK QUESTIONS  142
      5.7 TERMINAL QUESTIONS  142
  • 133. COE Unit 2, Lesson 5
    5. Regular Expressions
    Regular expressions are very helpful in creating powerful scripts. They are also used heavily in the advanced Unix utilities that we will study later, such as sed, awk and the Perl language.

    5.0 Objectives
    After going through this lesson, you will learn about:
      • Using regular expressions
      • Regular expression metacharacters
      • Finding patterns in files or output
      • Character ranges and classes in Bash

    5.1 Introduction
    This chapter introduces the concept of regular expressions. A regular expression is a pattern that describes a set of strings. This is a very powerful concept and can be used effectively in scripting.

    5.2 Regular expressions
    5.2.1 What are regular expressions?
    Often you will encounter conditions where you need to match specific patterns in scripts. For example, given a list of cricket players you may need to find all those players whose names begin with A or B. In other words, you need to match against a set of patterns.
    A regular expression helps you define such a pattern set in a terse way. For example, if you want to match any number in which no digit other than 9 is used (e.g., 9, 99, 999, 9999, …), it is impossible to write out the entire pattern set, but a regular expression can express the same set very easily.
    Let's see what a regular expression is and how it is used. Here are a few examples of regular expressions. You will begin to understand how they represent their pattern set as you study this chapter.

        9*       => Any number that contains only the digit 9 (e.g., 99, 9999, etc.)
        India.*  => Any string beginning with India (e.g., India, Indian, Indiana, etc.)
  • 134. COE Unit 2, Lesson 5
    A regular expression is a sequence of characters that represents patterns. The pattern can be a simple word, like "India", or can describe a more general set of patterns like "India", "Indian", "Indiana", etc. Using regular expressions you can create general patterns such as any 3 digit number that does not contain the digit 2.

    What is meant by "regular" in the term regular expression?
    The term "regular" refers to the fact that there is a pre-defined repetition that it denotes. If the repetitions are irregular, then you cannot denote the pattern with a regular expression. For example, the set of all the prime numbers cannot be denoted using a regular expression!

    What is meant by "expression" in the term regular expression?
    The "expression" in regular expression refers to the fact that, just like mathematical expressions, regular expressions can be combined to form new and more complex regular expressions.
    By the way, regular expressions are often referred to as regEx by developers.

    5.2.2 The Structure of a Regular Expression
    All single characters, including characters like 'a', '=', '3', etc., are fundamental regular expressions. They match the single character they represent. Most characters, including all letters and digits, are regular expressions that match themselves.
    The fundamental regular expressions can be combined to create more complex regular expressions. Let's see how we create more complex regEx.
    There are three important parts to a regular expression:
      • Anchors
      • Character sets
      • Modifiers
    Anchors are used to specify the position of the pattern in relation to a line of text. Character sets match one or more characters in a single position. Modifiers specify how many times the previous character set is repeated. A simple example that demonstrates all three parts is the regular expression "^#*". The caret (^) is an anchor that indicates the beginning of the line. The character "#" is a simple character set that matches the single character "#". The asterisk is a modifier. In a regular expression the asterisk specifies that the preceding character set can appear any number of times, including zero.
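    Because the asterisk also allows zero repetitions, "^#*" on its own matches every line. The following is a minimal sketch of that point using grep; the sample file and its contents are invented for illustration. Requiring at least one "#" (written "^##*") narrows the match to lines that really start with "#".

```shell
# Throwaway sample file (contents are made up for this sketch).
sample=$(mktemp)
printf '%s\n' '# a comment' '## another comment' 'echo hello  # trailing' > "$sample"

# "^#*" allows zero "#" characters at the start, so every line matches.
all=$(grep -c '^#*' "$sample")

# "^##*" requires one "#" followed by zero or more "#" at the start.
comments=$(grep -c '^##*' "$sample")

echo "lines matching ^#*:  $all"
echo "lines matching ^##*: $comments"
rm -f "$sample"
```

    The -c option makes grep print the number of matching lines instead of the lines themselves.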
  • 135. COE Unit 2, Lesson 5
    5.2.3 Regular expression metacharacters
    A few special characters do not match themselves; instead they specify a position or how often the preceding character or expression may repeat. These special characters are called metacharacters. The table below lists the metacharacters and their meanings.

    Table: Regular expression metacharacters

        Operator     Effect
        . (dot)      Matches any single character.
        ?            The preceding item is optional and will be matched at most once.
        *            The preceding item will be matched zero or more times.
        +            The preceding item will be matched one or more times.
        {N}          The preceding item will be matched exactly N times.
        {N,}         The preceding item will be matched N or more times.
        {N,M}        The preceding item will be matched at least N times, but not more than M times.
        -            Represents the range if it's not first or last in a list or the ending point of a range in a list.
        ^            Matches the empty string at the beginning of a line; also represents the characters not in the range of a list.
        $            Matches the empty string at the end of a line.
        \b           Matches the empty string at the edge of a word.
        \B           Matches the empty string provided it's not at the edge of a word.
        \<           Matches the empty string at the beginning of a word.
        \>           Matches the empty string at the end of a word.

    In the example below, the * indicates zero or more repetitions of 9.

        9*       => Any number that contains only the digit 9 (e.g., 99, 9999, etc.)

    In the example below, the . indicates any character and therefore .* indicates any number of repetitions of any character.

        India.*  => Any string beginning with India (e.g., India, Indian, Indiana, etc.)

    So, for example, India.* will also match India123, IndiaZZZ, etc.
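    A few of the repetition operators above can be tried out with grep. This is only a sketch: the helper name whole_match is hypothetical, -E selects the extended syntax (needed for ?, + and {N} without backslashes) and -x requires the pattern to match the whole line.

```shell
# whole_match PATTERN STRING
# Hypothetical helper: succeeds if the entire STRING matches PATTERN.
whole_match() {
    printf '%s\n' "$2" | grep -Eqx -- "$1"
}

whole_match '9*'       ''        && echo '9* matches the empty string'
whole_match '9+'       ''        || echo '9+ does not match the empty string'
whole_match 'India.*'  'Indiana' && echo 'India.* matches Indiana'
whole_match 'colou?r'  'color'   && echo 'colou?r matches color'
whole_match '[0-9]{3}' '1234'    || echo '[0-9]{3} does not match 1234 as a whole'
```

    Without -x, grep reports a match as soon as any part of the line fits the pattern, so '[0-9]{3}' would also be found inside 1234.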
  • 136. COE Unit 2, Lesson 5
    5.2.4 Creating complex regular expressions by concatenating other regEx
    Suppose you want to use a regular expression to match any string in which the letter 'A' repeats one or more times (e.g., A, AA, AAA, etc.). Then the regular expression for this is

        A+     => will match A, AA, AAA, etc. but will not match the empty string.

    Now suppose you want to use a regular expression to match any string in which the digit 4 repeats any number of times.

        4*     => will match 4, 44, 444, etc. and will also match the empty string.

    Now, suppose you want to create a regular expression to match any string in which first the letter 'A' repeats one or more times and then the digit 4 repeats any number of times (e.g., A4, A444, AA4, etc.). You can combine the regular expressions created earlier:

        A+4*   => will match A4, A44, AA4, etc. but will not match 4AA.

    5.2.5 Using metacharacters on regEx to create complex regEx
    Now, suppose you want to create a regular expression that denotes an unsigned real number. You can use the following regEx for it:

        [0-9]+(\.[0-9]+)?   => will match 4, 0.32, etc. but will not match -5, .33 or 7e-3.

    Let's dissect this example to understand it better:
    First, [0-9]+ will match one or more occurrences of a digit.
    To make the fractional part, we need to allow a dot (e.g., the dot in .32), so we have \. there. The backslash makes the dot literal; an unescaped dot would match any character. The fractional part, if present, must again have at least one digit, so the complete fractional part is written as \.[0-9]+.
    However, the fractional part should be optional (the regEx should match numbers without a fractional part too). So the fractional part is grouped in parentheses and made optional by putting a question mark after it, making the entire regEx [0-9]+(\.[0-9]+)?
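    The unsigned real number regEx can be checked against a handful of candidates. The sketch below assumes grep with the -E (extended) and -x (whole line) options; the function name is_unsigned_real is made up for the example. Note that the dot is written as \. so that it matches only a literal dot; the candidate 1x5 shows why that matters.

```shell
# is_unsigned_real STRING
# Hypothetical helper: succeeds if STRING is digits, optionally followed
# by a literal dot and more digits.
is_unsigned_real() {
    printf '%s\n' "$1" | grep -Eqx '[0-9]+(\.[0-9]+)?'
}

for candidate in 4 0.32 -5 .33 7e-3 1x5; do
    if is_unsigned_real "$candidate"; then
        echo "$candidate: match"
    else
        echo "$candidate: no match"
    fi
done
```

    Only 4 and 0.32 are reported as matches. With an unescaped dot in the pattern, 1x5 would be accepted as well.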
  • 137. COE Unit 2, Lesson 5
    5.3 The grep command
    Unix has a command that performs searches based on regular expressions. This command is called grep. grep searches the input for lines containing a match to a given pattern list. When it finds a match in a line, it prints the line. Note that grep does not match patterns across multiple lines.
    Here are a few examples of grep.

        bash> grep root /etc/passwd
        root:x:0:0:root:/root:/bin/bash
        operator:x:11:0:operator:/root:/sbin/nologin

        bash> grep -n root /etc/passwd        # prints line numbers of matches
        1:root:x:0:0:root:/root:/bin/bash
        12:operator:x:11:0:operator:/root:/sbin/nologin

        bash> grep -v bash /etc/passwd | grep -v nologin        # inverted match
        sync:x:5:0:sync:/sbin:/bin/sync
        shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
        halt:x:7:0:halt:/sbin:/sbin/halt
        news:x:9:13:news:/var/spool/news:
        apache:x:48:48:Apache:/var/www:/bin/false

        bash> grep -c false /etc/passwd       # returns number of matching lines
        7

        bash> grep -i root /etc/passwd        # match regardless of case
        Root:0:0:/root
        root:0:0:/sysadm

    With the first command, the user displays the lines from /etc/passwd containing the string root.
    The second command also displays the line numbers of the lines containing this search string.
    With the third command the user lists the accounts that use neither bash nor the nologin shell.
    Then the user counts the number of accounts that have /bin/false as the shell.
    The last command displays the lines containing root, Root, ROOT, etc.
    Now let's see what else we can do with grep, using regular expressions.
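    Before moving on, the options shown above can be tried on a small throwaway file instead of the real /etc/passwd. The following is a minimal sketch; the file contents are invented for illustration.

```shell
# Throwaway passwd-like file (entries are made up for this sketch).
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
news:x:9:13:news:/var/spool/news:
operator:x:11:0:operator:/root:/sbin/nologin
Root:x:500:500::/home/admin:/bin/false
EOF

plain=$(grep -c root "$sample")               # lines containing "root"
nocase=$(grep -ci root "$sample")             # same, ignoring case
inverted=$(grep -vc root "$sample")           # lines NOT containing "root"
first=$(grep -n root "$sample" | head -n 1)   # first match, prefixed by its line number

echo "root: $plain, root (any case): $nocase, without root: $inverted"
echo "first match: $first"
rm -f "$sample"
```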
  • 138. COE Unit 2, Lesson 5
    5.3.1 Grep and regular expressions
    a. Line and word anchors
    From the previous example, we now exclusively want to display lines starting with the string "root":

        bash> grep ^root /etc/passwd
        root:x:0:0:root:/root:/bin/bash

    If we want to see which accounts have no shell assigned whatsoever, we search for lines ending in ":":

        bash> grep :$ /etc/passwd
        news:x:9:13:news:/var/spool/news:

    To check that PATH is exported in ~/.bashrc, first select "export" lines and then search for lines containing a word that starts with the string "PATH", so as not to display MANPATH and other possible paths:

        bash> grep export ~/.bashrc | grep '\<PATH'
        export PATH="/bin:/usr/lib/mh:/lib:/usr/bin:/usr/local/bin:/usr/ucb:/usr/dbin:$PATH"

    If you want to find a string that is a separate word (enclosed by spaces), it is better to use the -w option, as in this example:

        bash> cat myFile.txt
        Neil Armstrong was the first man to walk on the moon.
        He had said, "this is a small step for me but a huge step for mankind".

        bash> grep -w man myFile.txt
        Neil Armstrong was the first man to walk on the moon.

    Note that the other line is not matched: mankind is a single word, so it does not match the word man when the -w option is used. If this option is not used, both lines are displayed, since "man" also occurs inside "mankind".

    b. Character classes
    A bracket expression is a list of characters enclosed by "[" and "]". It matches any single character in that list; if the first character of the list is the caret, "^", then it matches any character NOT in the list. For example, the regular expression "[0123456789]" matches any single digit. You can also write it as [0-9].
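    A bracket expression and its negated form can be compared side by side. This is a small sketch on an invented three-line file; the negated list is combined with ^ and $ anchors so that the whole line must consist of non-digits.

```shell
# Throwaway sample file (contents are made up for this sketch).
sample=$(mktemp)
printf '%s\n' 'room 101' 'lobby' '42' > "$sample"

with_digit=$(grep -c '[0-9]' "$sample")       # lines containing at least one digit
no_digit=$(grep -c '^[^0-9]*$' "$sample")     # lines made up only of non-digits

echo "lines with a digit:    $with_digit"
echo "lines without a digit: $no_digit"
rm -f "$sample"
```

    Inside the brackets the caret means "not in this list"; outside the brackets it is the beginning-of-line anchor.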
  • 139. COE Unit 2, Lesson 5
    Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, "[a-d]" is equivalent to "[abcd]". Many locales sort characters in dictionary order, and in these locales "[a-d]" is typically not equivalent to "[abcd]"; it might be equivalent to "[aBbCcDd]", for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value "C".
    Finally, certain named classes of characters are predefined within bracket expressions. See the grep man or info pages for more information about these predefined expressions.

        bash> grep '[yf]' /etc/group
        sys:x:3:root,bin,adm
        tty:x:5:
        mail:x:12:mail,postfix
        ftp:x:50:
        nobody:x:99:
        floppy:x:19:
        xfs:x:43:
        nfsnobody:x:65534:
        postfix:x:89:

        bash> ls *[1-9].xml
        app1.xml chap1.xml chap2.xml chap3.xml chap4.xml

    In the first example, all the lines containing either a "y" or an "f" character are displayed; the second example uses a range with the ls command.

    c. Wildcards
    Use the "." for a single character match. If you want to get a list of all five-character English dictionary words starting with "c" and ending in "h" (handy for solving crosswords):

        bash> grep '\<c...h\>' /usr/share/dict/words
        catch
        clash
        cloth
        coach
        couch
        cough
        crash
        crush

    If you want to display lines containing the literal dot character, use the -F option to grep.
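    The difference between the dot as a wildcard and the dot as a literal character can be seen in a short sketch. The sample lines are invented; -F tells grep to treat the whole pattern as a fixed string, and a backslash before the dot has the same effect for that one character.

```shell
# Throwaway sample file (contents are made up for this sketch).
sample=$(mktemp)
printf '%s\n' 'version 1.5' 'version 105' > "$sample"

any_char=$(grep -c '1.5' "$sample")     # "." matches any character: both lines
fixed=$(grep -cF '1.5' "$sample")       # -F: pattern is a plain string
escaped=$(grep -c '1\.5' "$sample")     # \. matches only a literal dot

echo "1.5 as regex: $any_char, with -F: $fixed, with \\.: $escaped"
rm -f "$sample"
```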
  • 140. COE Unit 2, Lesson 5
    For matching multiple characters, use the asterisk. This example selects all words starting with "c" and ending in "h" from the system's dictionary:

        bash> grep '\<c.*h\>' /usr/share/dict/words
        caliph
        cash
        catch
        cheesecloth
        cheetah

    5.4 Pattern matching using shell
    5.4.1 Character ranges
    Apart from grep and regular expressions, there's a good deal of pattern matching that you can do directly in the shell, without having to use an external program.
    As you already know, the asterisk (*) and the question mark (?) match any string or any single character, respectively. Quote these special characters to match them literally:

        bash> ls "*"

    This will not list all the files. It will list only the file named *.
    You can also use square brackets to match any enclosed character or, if pairs of characters are separated by a hyphen, any character in that range. An example:

        bash> ls -ld [a-cx-z]*
        drwxr-xr-x 2 radha radha 4096 Jul 20 2002 app-defaults/
        drwxrwxr-x 4 radha radha 4096 May 25 2002 arabic/
        drwxrwxr-x 2 radha radha 4096 Mar 4 18:30 bin/
        drwxr-xr-x 7 radha radha 4096 Sep 2 2001 crossover/
        drwxrwxr-x 3 radha radha 4096 Mar 22 2002 xml

    This lists all files in radha's home directory whose names start with "a", "b", "c", "x", "y" or "z".
    If the first character within the brackets is "!" or "^", any character not enclosed will be matched. To match the dash ("-"), include it as the first or last character in the set. The sorting depends on the current locale and on the value of the LC_COLLATE variable, if it is set. Mind that other locales might interpret "[a-cx-z]" as "[aBbCcXxYyZz]" if sorting is done in dictionary order. If you want to be sure to have the traditional interpretation of ranges, force this behavior by setting LC_COLLATE or LC_ALL to "C".
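    The shell patterns described above can be tried safely in a scratch directory. This is a sketch under stated assumptions: the file names are invented, and LC_ALL is set to C first so that the ranges have their traditional meaning regardless of the system locale.

```shell
# Force the traditional (byte-order) interpretation of ranges.
LC_ALL=C
export LC_ALL

# Scratch directory with a few made-up file names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch apple bin cat dog Xray xml zoo 2go

in_range=$(echo [a-cx-z]*)      # names starting with a, b, c, x, y or z
not_lower=$(echo [!a-z]*)       # names NOT starting with a lower-case letter
digits=$(echo [[:digit:]]*)     # names starting with a digit

echo "range:   $in_range"
echo "negated: $not_lower"
echo "digit:   $digits"

cd / && rm -rf "$workdir"
```

    Here the shell itself expands the patterns before echo runs; no external matching program is involved.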
  • 141. COE Unit 2, Lesson 5
    5.4.2 Character classes
    Character classes can be specified within the square brackets, using the syntax [:CLASS:], where CLASS is defined in the POSIX standard and has one of the values "alnum", "alpha", "ascii", "blank", "cntrl", "digit", "graph", "lower", "print", "punct", "space", "upper", "word" or "xdigit".

        bash> ls -ld [[:digit:]]*
        drwxrwxr-x 2 radha radha 4096 Apr 20 13:45 2/

        bash> ls -ld [[:upper:]]*
        drwxrwxr-- 3 radha radha 4096 Sep 30 2001 Nautilus/
        drwxrwxr-x 4 radha radha 4096 Jul 11 2002 OpenOffice.org1.0/
        -rw-rw-r-- 1 radha radha 997376 Apr 18 15:39 Schedule.sdc

    When the extglob shell option is enabled (using the shopt built-in), several extended pattern matching operators are recognized.

    5.5 Summary
    Regular expressions are powerful tools for selecting particular lines from files or output. A lot of UNIX commands and programs use regular expressions: vim, perl, the PostgreSQL database and so on. They can be made available in any language or application using external libraries, and they have even found their way to non-UNIX systems. For instance, regular expressions are used in the Excel spreadsheet that comes with the Microsoft Office suite for Windows. In this chapter we got a feel for the grep command, which is indispensable in any UNIX environment.
    Bash has built-in features for matching patterns and can recognize character classes and ranges.

    Self-check Questions
    1. What are regular expressions?
    2. What will be the result of ls -l | grep '^.....w'?
    3. What does the expression gg* signify?
    4. How do you locate lines in a file foo containing ram and raman using grep?
  • 142. COE Unit 2, Lesson 5
    5.6 Answers to the Self-Check questions
    1. A regular expression is a pattern that describes a set of strings.
    2. This locates all files which have write permission for the group (e.g. drwxrw-r-x).
    3. One or more occurrences of g.
    4. Use grep "rama*n*" foo

    5.7 Terminal Questions
    1. Describe the structure of a regular expression.
    2. Describe some regular expression operators.
    3. What is the difference between a wild card and a regular expression?
    4. What is the difference between basic and extended regular expressions?
  • 143. UNIT 3: Advanced Shell Scripting, sed, and awk
    1. FUNCTIONS IN SHELL SCRIPTS  147
    2. SED – STREAM EDITOR  159
    3. AWK BASICS  169
    4. AWK PROGRAMMING  177
  • 145. COE Unit 3, Lesson 1
    LESSON 1  FUNCTIONS IN SHELL SCRIPTS

    1. FUNCTIONS IN SHELL SCRIPTS  147
      1.0 OBJECTIVES  147
      1.1 INTRODUCTION TO SHELL FUNCTIONS  147
        1.1.1 When to use functions?  147
        1.1.2 Benefits of using functions  149
        1.1.3 Where you cannot create functions?  150
      1.2 WRITING A SHELL FUNCTION  150
        1.2.1 Function header  150
        1.2.2 Function body  151
        1.2.3 Returning from a function  152
        1.2.4 Function arguments  152
        1.2.5 IFS (internal field separators)  153
        1.2.6 Creating a utility library of shell functions  154
        1.2.7 Things to keep in mind while writing shell functions  154
      1.3 SUMMARY  155
      1.4 ANSWERS TO THE SELF CHECK QUESTIONS  155
      1.5 TERMINAL QUESTIONS  155
  • 147. COE Unit 3, Lesson 1
    1. Functions in Shell Scripts
    So far you have learnt various Unix commands, how to plumb commands together using pipes, and how to create shell scripts that carry out useful, routine and repetitive tasks. Shell scripting in Unix is never complete without knowing how to write and use functions.

    1.0 Objectives
    After going through this lesson you will know
      • When to use functions in shell scripts
      • How to write and use functions in shell scripts

    1.1 Introduction to shell functions
    Often a few lines of code need to be used at several places in a shell script. For example, suppose you are creating a shell script that reads a 3 digit STD code and a 7 digit phone number, and you need to ensure that the user types exactly 3 numeric characters for the STD code and exactly 7 numeric characters for the phone number. It is then better to create and use a function instead of replicating the same code at multiple places.
    A function is like a mini script. It can take parameters, can define its own variables, can return a value, etc. Unlike a call to a script, a function executes in the same shell. Functions in shell scripts look and work similar to functions in the C language.

    1.1.1 When to use functions?
    Consider the example described in 1.1 above, without using functions:
  • 148. COE Unit 3, Lesson 1

        #!/bin/bash
        # bash has no do ... while loop; a while loop whose flag starts at 0
        # runs its body at least once, which gives the same behaviour.
        stdOK=0
        while [ "$stdOK" -ne 1 ]; do
            echo "Please enter 3 digit STD code: "
            read std
            chkSTD=`echo $std | grep "^[0-9][0-9][0-9]$"`
            if [ "$chkSTD""X" != "X" ]; then
                stdOK=1
            else
                echo "Please enter exactly 3 digit STD code here"
            fi
        done

        phoneOK=0
        while [ "$phoneOK" -ne 1 ]; do
            echo "Please enter 7 digit phone number: "
            read phone
            chkPH=`echo $phone | grep "^[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$"`
            if [ "$chkPH""X" != "X" ]; then
                phoneOK=1
            else
                echo "Please enter exactly 7 digit phone number here"
            fi
        done

        callup $std $phone

    Script 1

    You will find that apart from the marked text below, the rest of the code is repeated.
  • 149. COE Unit 3, Lesson 1

        #!/bin/bash
        # Lines ending in "# <--" carry the text that differs between the two
        # loops (apart from the variable names); everything else is repeated.
        stdOK=0
        while [ "$stdOK" -ne 1 ]; do
            echo "Please enter 3 digit STD code: "                  # <--
            read std
            chkSTD=`echo $std | grep "^[0-9][0-9][0-9]$"`           # <--
            if [ "$chkSTD""X" != "X" ]; then
                stdOK=1
            else
                echo "Please enter exactly 3 digit STD code here"   # <--
            fi
        done

        phoneOK=0
        while [ "$phoneOK" -ne 1 ]; do
            echo "Please enter 7 digit phone number: "              # <--
            read phone
            chkPH=`echo $phone | grep "^[0-9][0-9][0-9][0-9][0-9][0-9][0-9]$"`   # <--
            if [ "$chkPH""X" != "X" ]; then
                phoneOK=1
            else
                echo "Please enter exactly 7 digit phone number here"   # <--
            fi
        done

        callup $std $phone

    Script 2

    See how much simpler it would be if you had a function that got you the desired numbers!

        #!/bin/bash
        std=`getNumber 3 "STD code"`
        phone=`getNumber 7 "phone number"`
        callup $std $phone

    Script 3

    1.1.2 Benefits of using functions
    Functions provide several benefits as listed below:
  • 150. COE Unit 3, Lesson 1
      • Functions simplify and modularize your scripts. Your scripts become more readable (compare Script 1 and Script 3 above).
      • Modularized scripts are easier to maintain and enhance.
      • Functions make debugging easier.
      • Once you enhance a function, the enhancement is automatically available at all places where the function is used.
      • You can even create a utility file containing functions and source it in your other scripts so that the utility functions are directly available for use, instead of writing them over and over again.

    1.1.3 Where you cannot create functions?
    Be aware that not all shells provide support for functions. For example, csh (C shell) does not provide support for functions. But most other shells have this support, including sh (Bourne shell), ksh (Korn shell), zsh, bash (Bourne-Again shell), etc.

    Self Check Questions
    1. When a few lines of code need to be repeated at several places, a ______ should be created for them (select one):
       a. script
       b. program
       c. function
    2. A function helps in improving the script by making it (select one or many as apply):
       d. more readable
       e. easier to debug
       f. modular
       g. more maintainable

    1.2 Writing a shell function
    A shell function in bash has the following syntax. Text in bold indicates keywords.

        <yourFunctionName>() { <commands>; }

    Or

        function <yourFunctionName> { <commands>; }

    1.2.1 Function header
  • 151. COE Unit 3, Lesson 1
You can define a function by using the function keyword, or by putting parentheses after the function name. For example, the following defines a function named "aaa":
    function aaa {
        a=1
    }
The following defines a function named "bbb":
    bbb() {
        a=1
    }
Note that parameters are not passed to functions the way they are in C. Therefore, you do not declare any parameters in the function's header. See the definition of the function "bbb" above: no parameters are ever listed within the parentheses.
1.2.2 Function body
A set of commands makes up the function body. A function can contain any shell scripting commands, including flow control commands like while and conditional commands like if. The commands can also include calls to other functions and even to other shell scripts. For example:
    getDateString() {
        echo "Date format is dd/mm/yy ?:"
        read x
        if [ "$x""X" = "yX" ]; then
            str=`date '+%d/%m/%y'`
        else
            str=`date '+%y/%m/%d'`
        fi
        echo $str
    }
The above function calls the "date" shell command.
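Script 3 earlier in this lesson calls a getNumber function without showing its body. The following is only a sketch of what such a function might look like, combining a loop and a conditional in the function body. The name and argument order come from Script 3; everything else (sending prompts to standard error, using grep -E, the sample input) is an assumption made for illustration.

```shell
#!/bin/bash
# getNumber <digits> <label>
# Hypothetical helper assumed by Script 3: keeps asking until the user
# types exactly <digits> digits, then echoes the number on standard output.
getNumber() {
    digits=$1
    label=$2
    while true
    do
        # Prompts go to standard error so that only the final number
        # is captured by the caller's command substitution.
        echo "Please enter $digits digit $label: " >&2
        read num || return 1      # give up if the input ends
        chk=$(echo "$num" | grep -E "^[0-9]{$digits}\$")
        if [ "$chk""X" != "X" ]; then
            echo "$num"
            return 0
        fi
        echo "Please enter exactly $digits digit $label here" >&2
    done
}

# Example use: two answers are piped in instead of being typed.
# The first one is too short and is rejected; the second is accepted.
std=$(printf '12\n011\n' | getNumber 3 "STD code")
echo "STD code accepted: $std"
```

In an interactive script the call would simply be std=`getNumber 3 "STD code"`, as in Script 3, with the answers typed at the keyboard.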
  • 152. COE Unit 3, Lesson 1
Self-Check Questions
3. The function keyword is a must for writing a function (true/false).
4. You must declare arguments to a function in the function header (true/false).
5. You cannot declare arguments to a function in the function header (true/false).
6. A function body can contain any of the shell commands (true/false).
1.2.3 Returning from a function
A function can hand a value back to its caller in two ways. If the function echoes a value, the caller can capture that output with backquotes. Alternatively, the return keyword ends the function immediately, without completing the rest of the body, and sets the function's exit status (a number from 0 to 255), which the caller reads from $?. For example:
    aaa() {
        a=1
        b=2
        echo $a
    }
    ret=`aaa`    # ret will be 1
    bbb() {
        a=1
        b=2
        return $a
    }
    bbb
    ret=$?       # ret will be 1
1.2.4 Function arguments
Parameters can be passed when calling a function by listing them after the function name. Inside the function, these parameters can be accessed as the shell variables $1, $2, etc. Even $# (the number of arguments passed) is available inside the function. Example:
  • 153. COE Unit 3, Lesson 1
    addTwoNums() {
        sum=`expr $1 + $2`
        echo $sum    # echo the result; return can only carry a status from 0 to 255
    }
    addAllNums() {
        sum=0
        while [ $# -gt 0 ]
        do
            sum=`expr $sum + $1`
            shift    # drop $1 so that the next argument becomes $1
        done
        echo $sum
    }
1.2.5 IFS (internal field separators)
You need to be careful while passing arguments to a function or a shell command in a shell script. The shell interprets the values that you supply. As a result, a string held in a variable can get interpreted as multiple parameters if it contains spaces and the variable is not quoted. For example:
    colour="greenish blue"
    paintObj $colour
Here you would expect to see $1 inside the function as "greenish blue", but you will get $1 as "greenish" and $2 as "blue". You can tell the shell to treat only a newline as the field separator by declaring the following in your script. Yes, the closing quote is at the very start of the next line!
IFS="
"
Therefore, if you use the following:
IFS="
"
colour="greenish blue"
paintObj $colour
you will now get $1 inside the function as "greenish blue". Quoting the variable, as in paintObj "$colour", has the same effect without changing IFS.
Self-Check Questions
7. Parameters passed to a function are accessible using the $1, $2, ... variables (true/false).
8. The $# inside a function indicates the number of parameters passed to the script (true/false).
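The effect of field splitting described in section 1.2.5 can be checked with a small experiment. This is a sketch only: showArgs is a hypothetical helper introduced here to report how many arguments a function received, and the variable names are arbitrary.

```shell
#!/bin/bash
# showArgs is a hypothetical helper that reports how it was called.
showArgs() {
    echo "count=$# first=$1"
}

colour="greenish blue"

unquoted=$(showArgs $colour)      # split on the space: two arguments
quoted=$(showArgs "$colour")      # quotes keep it as one argument

oldIFS=$IFS
IFS='
'
newlineIFS=$(showArgs $colour)    # only newlines separate fields now
IFS=$oldIFS                       # restore the default separators

echo "$unquoted"      # count=2 first=greenish
echo "$quoted"        # count=1 first=greenish blue
echo "$newlineIFS"    # count=1 first=greenish blue
```

Saving and restoring IFS, as done here, limits the change to the few lines that need it; a script-wide IFS change also affects loops such as for i in $LIST.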
  • 154. COE Unit 3, Lesson 1
1.2.6 Creating a utility library of shell functions
When you create shell functions, you would typically want to make them somewhat generic so that they can be reused in other shell scripts as well. In such cases, you can simply collect your shell functions into a single file. Such a file containing utility shell functions can be used as a library and can be sourced in other shell scripts. For example:
bash>cat a_simple_utility_library.sh
#!/bin/sh
#------------------ a_simple_utility_library ------------------
IFS="
"
# myecho function echoes the input and also writes it into multiple files
myecho() {
    for i in $FILE_LIST
    do
        echo $* >> $i
    done
    echo $*
}
mykill() {
    pid=`ps -ef | grep $1 | grep -v "grep" | awk '{print $2}'`
    kill $pid
}
#------------------ a_simple_utility_library ends ------------------
bash>cat my_application.sh
#!/bin/sh
. a_simple_utility_library.sh    # The dot in the beginning sources it
mykill junkjob                   # will kill the process running "junkjob"
1.2.7 Things to keep in mind while writing shell functions
Just like other shell commands, there are restrictions when writing shell functions.
    • The starting curly bracket must be right on the same line as the function header.
    • There must be spaces on both sides of curly brackets.
    • There must be either a semicolon or a new line before the closing curly bracket.
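The bracket rules listed in section 1.2.7 are easiest to see side by side. The sketch below defines one function in each of the two forms from section 1.2; the function names greet and shout are made up for this illustration.

```shell
#!/bin/bash
# One-line form: a space after "{" and a semicolon before "}".
greet() { echo "hello $1"; }

# Multi-line form: a newline before the closing bracket is enough.
function shout {
    echo "$1" | tr 'a-z' 'A-Z'
}

g=$(greet world)
s=$(shout ok)
echo "$g"    # hello world
echo "$s"    # OK
```

Leaving out the semicolon in the one-line form, as in greet() { echo "hello $1" }, makes the shell treat } as one more argument to echo and keep waiting for the end of the function.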
  • 155. COE Unit 3, Lesson 1
1.3 Summary
Functions help in modularizing scripts that perform repetitive tasks. If you use functions, scripts become more readable and maintainable.
1.4 Answers to the self check questions
1. (c) 2. all. 3. false. 4. false. 5. true. 6. true. 7. true. 8. false.
1.5 Terminal Questions
1. Discuss among your peers how functions are different from aliases.
2. Write a function that gets you a non-empty string.
3. Write a function that uses the function created in question 2 above to read a string and convert it to all uppercase.
4. Write a script that takes the name, middle name and family name of a person and prints them out in all uppercase or all lowercase depending on a shell variable's value.
5. Write a function that takes a number of digits as input and gets a number containing that many digits. The function must check that the user has provided a number.
6. Write a function that takes a number of digits as input and gets a number containing at most that many digits.
  • 157. COE Unit 3, Lesson 2
LESSON 2 SED – STREAM EDITOR
2. SED – STREAM EDITOR
    2.0 OBJECTIVES
    2.1 INTRODUCTION TO SED
    2.2 HOW SED OPERATES
    2.3 SYNTAX OF THE SED COMMAND
        2.3.1 Options for the sed
    2.4 COMMANDS IN SED
        2.4.1 Syntax of the commands in sed
    2.5 SUMMARY
    2.6 ANSWERS TO THE SELF CHECK QUESTIONS
    2.7 TERMINAL QUESTIONS
  • 159. COE Unit 3, Lesson 2
2. SED – Stream Editor
Sed (stream editor) is a utility program available in Unix. It is a powerful utility that transforms its input line by line, and it is commonly used in scripting.
2.0 Objectives
After going through this lesson you will know
    • What sed is, and its options and commands
    • What regular expressions are
    • Interactive use of sed
    • Using sed commands in scripts
2.1 Introduction to sed
A stream editor performs transformations on text read from a file or a pipe. Sed sends the result to the standard output, which can be redirected and collected into another file if needed. Sed does not modify the original input file. Unlike vi and ed, which are interactive editors, sed works on an input stream. Sed is therefore suitable in scripts that need text transformations, such as conversion programs. For example, if you have a file where "error" is misspelt as "erorr", you can correct it by using the sed command:
    sed 's/erorr/error/g' myfile > myfile_corrected
2.2 How sed operates
It often comes in handy to know how a utility works. Here is the detail on how sed works:
    • A line of input is copied into a pattern space.
    • All editing commands in a sed script are applied in order to the copied line.
    • The copied (and modified) line is sent to standard output.
    • By default, sed works on all the lines of input. However, its scope can be controlled by line addressing.
    • Editing commands are applied to all lines (globally) unless line addressing restricts the lines affected.
  • 160. COE Unit 3, Lesson 2
    • If a command changes the input, subsequent command addresses will be applied to the current line in the pattern space, not the original input line.
2.3 Syntax of the sed command
Sed can be invoked in one of the following forms:
    sed [options] 'command' file(s)
Or
    sed [options] -f scriptfile file(s)
The first form allows you to specify an editing command on the command line, surrounded by single quotes. The second form allows you to specify a scriptfile, a file containing sed commands. If no files are specified, sed reads from standard input.
2.3.1 Options for the sed
    • The -e option
-e <script> : Tells sed to add the commands in <script> to the set of commands to run. You can give a series of commands using the -e option. For example:
    sed -e 's/a/A/' -e 's/b/B/' < oldFile > newFile
    • The -f option
-f <scriptFile> : Tells sed to add the commands from <scriptFile> to the set of commands to run. For example, instead of just replacing 'a' and 'b', if you want to uppercase all vowels in the input, you can write a sed script file:
    bash>cat sed_script
    # sed comment - This script changes lower case vowels to upper case
    s/a/A/g
    s/e/E/g
    s/i/I/g
    s/o/O/g
    s/u/U/g
Then
    sed -f sed_script < oldFile > newFile
will uppercase all vowels. Note that in sed script files each command goes on a separate line, no trailing white space should be left at the end of lines, and the commands are not enclosed in quotes.
    • The -n option
  • 161. COE Unit 3, Lesson 2
-n : This option tells sed not to print by default. Only the items selected by explicit sed print commands are printed. For example,
    sed -n 's/pattern/&/p' file
will act like grep looking for "pattern".
Self-Check Questions
1. sed is an interactive editor like vi (true/false)
2. sed can be used in scripts (true/false)
2.4 Commands in sed
Sed supports grep-like regular expressions to find the text for pattern substitution and deletion. Sed uses vi-like commands:
    a    Appends text below the current line
    i    Inserts text above the current line
    c    Changes the text in the current line to new text
    s    Searches and replaces text
    d    Deletes text
    p    Prints text
For example, if there is a file that lists tasks like:
    bash>cat tasks
    DONE: functions
    TODO: sed
    TODO: awk
    DONE: password change
To delete all lines in the file that are marked DONE, you can use
    sed '/DONE/d' tasks > new_tasks
    bash>cat new_tasks
    TODO: sed
    TODO: awk
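The -n option and the p and s commands from this section can be tried on the same tasks file. The following is a sketch under the assumption that a temporary copy of the file is acceptable; the temporary path and the variable names are arbitrary.

```shell
#!/bin/bash
# Recreate the sample tasks file in a temporary location.
tasks=$(mktemp)
cat > "$tasks" <<'EOF'
DONE: functions
TODO: sed
TODO: awk
DONE: password change
EOF

# -n suppresses the default output; p prints only the matching lines.
todo=$(sed -n '/TODO/p' "$tasks")

# s with a pattern address: mark the sed task as done, leave other lines alone.
updated=$(sed '/sed/s/TODO/DONE/' "$tasks")

echo "$todo"
echo "----"
echo "$updated"
rm -f "$tasks"
```

The first command lists the two TODO lines, like grep TODO tasks would. The second prints all four lines, with only the sed task changed to DONE.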
  • 162. COE Unit 3, Lesson 2
2.4.1 Syntax of the commands in sed
The sed commands have the general form listed below:
    [address][,address][!]operation [arguments]
Sed commands consist of addresses and an operation. Each operation consists of a single letter. Let's take the following input file for the examples given below:
    bash>cat input_file
    This is the first line
    This is the second line of text
    This is the third line of input_file
    This is the fourth and the last line
1. If no address is specified, the operation is applied to each line. For example:
    sed 's/This/this/g' < input_file > output_file
    bash>cat output_file
    this is the first line
    this is the second line of text
    this is the third line of input_file
    this is the fourth and the last line
2. Only the first match on each line is replaced by default. For example,
    sed 's/the/a/' < input_file > output_file
    bash>cat output_file
    This is a first line
    This is a second line of text
    This is a third line of input_file
    This is a fourth and the last line
The second "the" on the last line is not modified. To tell sed to work on all the matches on a line, use "g".
    sed 's/the/a/g' < input_file > output_file
    bash>cat output_file
    This is a first line
    This is a second line of text
    This is a third line of input_file
    This is a fourth and a last line
  • 163. COE Unit 3, Lesson 2
The second "the" is also modified now.
3. A single address can be given. For example:
    sed '2s/second/SECOND/g' < input_file > output_file
    bash>cat output_file
    This is the first line
    This is the SECOND line of text
    This is the third line of input_file
    This is the fourth and the last line
4. Two addresses can be given to make a block. For example:
    sed '1,2s/line/input/g' < input_file > output_file
    bash>cat output_file
    This is the first input
    This is the second input of text
    This is the third line of input_file
    This is the fourth and the last line
5. $ can be used to denote the end of the file when specifying addresses. For example:
    sed '3,$d' < input_file > output_file
    bash>cat output_file
    This is the first line
    This is the second line of text
6. An address can also be given using patterns. For example:
    sed '/input_file/d' < input_file > output_file
    bash>cat output_file
    This is the first line
    This is the second line of text
    This is the fourth and the last line
7. An address can also be inverted.
    sed '/SAVE/!d'
this will delete all lines that do not have SAVE on them
    sed '/BEGIN/,/MID/s/erorr/error/g'
  • 164. COE Unit 3, Lesson 2
this will replace erorr by error from BEGIN to MID.
    sed '/^BEGIN/,/^END/!s/done//g'
will delete the word done on all lines except those between BEGIN and END. Addresses and patterns can include grep-like regular expressions as well. For example:
    sed -n '/This.*first/p' input_file > output_file
    bash>cat output_file
    This is the first line
Self-Check Questions
3. What argument can be used to tell sed to apply operations on all the matched patterns on a line?
    a. none. Sed already does that by default.
    b. g
    c. i
4. What character can be used to invert the address in sed?
    a. none
    b. !
    c. x
2.5 Summary
The sed stream editor is a powerful command line tool which can handle streams of data: it can take input lines from a pipe. This makes it fit for non-interactive use. The sed editor uses vi-like commands and accepts regular expressions. The sed tool can read commands from the command line or from a script. It is often used to perform find-and-replace actions on lines containing a pattern.
2.6 Answers to the self check questions
1. false. 2. true. 3. (b) 4. (b)
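The addressing forms of section 2.4.1 (line numbers, $, patterns, ranges and the ! inversion) can be compared in one short script. This is a sketch that recreates input_file in a temporary location; the variable names are arbitrary.

```shell
#!/bin/bash
# Recreate the input_file used in section 2.4.1.
input=$(mktemp)
cat > "$input" <<'EOF'
This is the first line
This is the second line of text
This is the third line of input_file
This is the fourth and the last line
EOF

first_two=$(sed -n '1,2p' "$input")          # line-number range
last_line=$(sed -n '$p' "$input")            # $ addresses the last line
only_text=$(sed '/text/!d' "$input")         # ! inverts: delete lines WITHOUT "text"
from_third=$(sed -n '/third/,$p' "$input")   # from a pattern to the end of file

echo "$first_two"
echo "$last_line"
echo "$only_text"
echo "$from_third"
rm -f "$input"
```

Note that sed -n '1,5p' and sed '5q' are two common ways to approach a head-like utility, which is relevant to the first terminal question of this lesson.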
  • 165. COE Unit 3, Lesson 2
2.7 Terminal Questions
1. Use sed to implement a head-like utility of Unix (prints only the first 5 lines).
2. Use sed to implement a tail-like utility of Unix (prints only the last 5 lines).
3. Print a list of files in your scripts directory, ending in ".sh". Mind that you might have to unalias ls. Put the result in a temporary file.
4. Make a list of files in /usr/bin that have the letter "a" as the second character. Put the result in a temporary file.
5. Delete the first 3 lines of each temporary file.
6. Print to standard output only the lines containing the pattern "an".
7. Create a file holding sed commands to perform the previous two tasks. Add an extra command to this file that adds a string like "*** This might have something to do with man and man pages ***" in the line preceding every occurrence of the string "man". Check the results.
8. A long listing of the root directory, /, is used for input. Create a file holding sed commands that check for symbolic links and plain files. If a file is a symbolic link, precede it with a line like "--This is a symlink--". If the file is a plain file, add a string on the same line, adding a comment like "<--- this is a plain file".
9. Create a script that shows lines containing trailing white spaces from a file. This script should use a sed script and show sensible information to the user.
10. Can sed be used to create a tail -f kind of utility?
11. Search the internet to find how a newline can be replaced.
12. The top 4 lines of a file contain names of students and the remaining 4 lines contain their marks:
    bash>cat file
    Mohit verma
    Sushobhit sinha
    Mukul Khan
    Naina Suman
    20
    25
    35
    28
Using sed and the paste command, create another file that will have
    Mohit verma 20
    Sushobhit sinha 25
    Mukul Khan 35
    Naina Suman 28
  • 167. COE Unit 3, Lesson 3
LESSON 3 AWK BASICS
3. AWK BASICS
    3.0 OBJECTIVES
    3.1 INTRODUCTION AND BRIEF HISTORY
    3.2 THE SYNTAX OF AWK
    3.3 USING AWK
        3.3.1 The print command in AWK
        3.3.2 Accessing fields on a line
    3.4 SUMMARY
    3.5 ANSWERS TO THE SELF CHECK QUESTIONS
    3.6 TERMINAL QUESTIONS
  • 169. COE Unit 3, Lesson 3
3. AWK Basics
AWK is a utility for performing simple text-processing tasks. Awk also provides a small but powerful language that allows the user to manipulate files containing columns of data and strings, and to print reports from the data.
3.0 Objectives
After going through this lesson you will know
    • What AWK is, and the syntax of AWK
    • How AWK is useful
    • The print command in AWK
    • How to access fields in AWK
3.1 Introduction and brief history
AWK is named after its authors: Aho, Weinberger and Kernighan. The original version of AWK was developed in 1977. In Unix it is available as awk. Advanced versions exist (e.g., nawk, gawk) that support user-defined functions, multidimensional arrays, the ?: operator, deleting elements in an array, etc. Awk operates in a cycle: get a line, process it, get the next line, process it, and so on. It is an "interpreted" language, that is, an Awk program cannot run on its own; it must be executed by the Awk utility itself. Like sed, AWK reads an input file or reads from a pipe. It does not modify the input file and writes its output onto the standard output. In addition, because AWK is a programming language in itself, awk is very useful in processing data and printing reports.
3.2 The syntax of AWK
    awk [options] '[ BEGIN {<initializations>} ] [ <program> ] [ <program> ] ... [ END {<final actions>} ]' <File Name>
  • 170. COE Unit 3, Lesson 3
Where each <program> has the format:
    [ <search pattern> ] [ {<program actions>} ]
Awk operates as listed below:
1. Perform initialization if BEGIN is given
2. Read a line of text and break it into fields
3. For each <program>
4. Perform the program as given by the user
5. Go to step 2
6. When no more lines remain, perform END calculations if specified by the user
The optional BEGIN clause performs any initializations required before Awk starts scanning the input file. The subsequent body of the Awk program consists of a series of search patterns, each with its own program action. Awk checks each line of the input file against each search pattern and performs the corresponding actions for each line that matches. Once the file has been scanned, an optional END clause can be used to perform any final actions required.
3.3 Using AWK
We will use the following example data to see how to use awk. This data is a file containing the top marks for some of the subjects along with the topper names and years.
    bash>cat toppers.txt
    Physics 92 2003 Abhay Malhotra
    Chemistry 97 2003 Suman Gupta
    Maths 99 2003 Suresh Yadav
    Physics 94.5 2004 Shriesh Jadhav
    Chemistry 98.5 2004 Shriesh Jadhav
    Maths 96 2004 Lokesh Arora
    Physics 89 2005 Vandana Agarwal
    Chemistry 92 2005 Srinivas Vardharajan
    Maths 99 2005 Anup Mathur
    Physics 98 2006 Ramakant
    Chemistry 88 2006 Raju Pandy
    Chemistry 89 2006 Rajni Kumar
    Maths 98 2006 Javed M. K. Akthar
Example 1: Since almost all of the awk syntax is optional, the simplest useful awk command has no search pattern and just a print action:
    awk '{ print }' input_file
  • 171. COE Unit 3, Lesson 3
This will work like the cat command and print the entire input_file as is. Note that here we are running AWK code using the awk command. The code is always kept within quotes.
Example 2: You can ask awk to work on specific lines. For example, you can give a search pattern.
    awk '/Physics/' toppers.txt > phy_toppers.txt
Note that AWK does not modify the input. Also note that AWK writes output to the standard output. Here we have redirected the output into a file phy_toppers.txt. Now let's see the contents of this output file:
    bash>cat phy_toppers.txt
    Physics 92 2003 Abhay Malhotra
    Physics 94.5 2004 Shriesh Jadhav
    Physics 89 2005 Vandana Agarwal
    Physics 98 2006 Ramakant
Example 3: Pattern matching is case-sensitive. For example, if you gave "physics" in place of "Physics" as the search pattern, it would not match the lines containing "Physics".
    awk '/physics/' toppers.txt > no_match.txt
The file no_match.txt will come out as an empty file.
Self-Check Questions
1. Awk is useful for processing text containing columns of data (true/false).
2. Awk is a small programming language in itself (true/false).
3. Awk does not modify the input file (true/false).
4. An Awk program cannot run on its own but needs which one to run: (a) awk (b) sed (c) grep
3.3.1 The print command in AWK
A simple print command is available in AWK. This command does not need any format specifications, and values can be printed in a simple way.
  • 172. COE Unit 3, Lesson 3
Example 4: If you use print with no arguments, it prints the input text as is.
    awk '/Physics/ { print } /Maths/ { print }' toppers.txt
will print
    Physics 92 2003 Abhay Malhotra
    Maths 99 2003 Suresh Yadav
    Physics 94.5 2004 Shriesh Jadhav
    Maths 96 2004 Lokesh Arora
    Physics 89 2005 Vandana Agarwal
    Maths 99 2005 Anup Mathur
    Physics 98 2006 Ramakant
    Maths 98 2006 Javed M. K. Akthar
Example 5: You can give arguments to print
    awk '/Maths/ { print "This is a math topper" }' toppers.txt
will print
    This is a math topper
    This is a math topper
    This is a math topper
    This is a math topper
Note an important point here. The print command prints its arguments as is, so if you need any extra text, such as words or spaces, you will need to add it in the print command itself as shown above. We will see more concrete examples of the print command in subsequent examples.
3.3.2 Accessing fields on a line
The power of AWK lies in the fact that it treats each input line as a record consisting of fields. This means that as it reads lines, it breaks up each line into fields and lets you access and manipulate the fields and the output. By default AWK uses white space (spaces and tabs) as the separator for fields, which means that when it reads a line, it breaks it up into words for you. The separator can be changed easily, as we will see later in this unit. To access the fields of the input line, awk provides the following built-in variables: $0, $1, $2, and so on. The first one, $0, gives you the entire line, as is. $1 is the first field, $2 is the second field, and so on.
Example 6: If the input line just read in by awk is
    Physics 92 2003 Abhay Malhotra
  • 173. COE Unit 3, Lesson 3
then,
    $1 will contain "Physics"
    $2 will contain 92
    $3 will contain 2003
    $4 will contain "Abhay"
    $5 will contain "Malhotra"
Note that because awk is treating space as the separator, it breaks up the name too into two separate fields.
Example 7: To print just the names of chemistry toppers, you can use the following command:
    awk '/Chemistry/ { print $4, $5, $6, $7 }' toppers.txt > chem_toppers.txt
    bash>cat chem_toppers.txt
    Suman Gupta
    Shriesh Jadhav
    Srinivas Vardharajan
    Raju Pandy
    Rajni Kumar
The commas in the print command put a space between the fields. Note that we have used $4 to $7, though the names in this output would have come even with $4 and $5 alone, because these names occupy only two fields. However, the data also has a longer name (Javed M. K. Akthar) which occupies four fields. Therefore we need to be aware of the data when printing multiple fields. AWK does not have a direct way to say things like "print all fields from $4 onwards", so we need to list additional fields. However, if you simply want to print the entire line, then you do not need to use these fields. For example,
Example 8: To print all data for math toppers, use the following
    awk '/Maths/ { print }' toppers.txt > math_toppers.txt    # Note no $1, $2 used
    bash>cat math_toppers.txt
    Maths 99 2003 Suresh Yadav
    Maths 96 2004 Lokesh Arora
    Maths 99 2005 Anup Mathur
    Maths 98 2006 Javed M. K. Akthar
The examples so far solved problems that can also be solved by a combination of grep, sed, cut, etc. However, AWK is much more capable. We will see the other features in subsequent chapters.
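As a preview of those features, one possible workaround for printing "all fields from $4 onwards" is a loop over the fields. The sketch below relies on the for loop and the NF variable (number of fields on the current line), both of which are covered in the next lesson; the three sample rows are taken from toppers.txt and the temporary file is only for illustration.

```shell
#!/bin/bash
# A few rows from toppers.txt, with names of one, two and four words.
data=$(mktemp)
cat > "$data" <<'EOF'
Chemistry 97 2003 Suman Gupta
Physics 98 2006 Ramakant
Maths 98 2006 Javed M. K. Akthar
EOF

# Build the full name by joining field 4 and every field after it.
names=$(awk '{
    name = $4
    for (i = 5; i <= NF; i++)
        name = name " " $i
    print $1 ": " name
}' "$data")

echo "$names"
rm -f "$data"
```

This prints the subject followed by the complete name on each line, however many words the name has, without leaving trailing spaces for the shorter names.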
  • 174. COE Unit 3, Lesson 3
Self-Check Questions
5. awk processes how many line(s) of input at a time? (a) 1, (b) 2, (c) depends on the available memory, (d) all lines in input.
6. awk breaks inputs into columns or words (true/false)
7. awk uses spaces to break inputs (true/false)
8. The print command can be used to print the fields of input with added text (true/false)
3.4 Summary
The awk utility is a powerful command line tool which can handle streams of data: it can take input lines from a pipe. This makes it fit for non-interactive use.
3.5 Answers to self check questions
1. true. 2. true. 3. true. 4. (a) 5. (a) 6. true. 7. true. 8. true.
3.6 Terminal Questions
1. Take the toppers.txt of this chapter. For each year and subject, print the first name of the topper, the marks and then the year.
2. Do the same as in the question above, but now print the complete name of the topper followed by the marks and then the year.
3. See the AWK syntax. Most of our examples used only one pattern and its program. Try using multiple patterns and their corresponding programs and see the outputs.
  • 175. IT 102 Unit 3, Lesson 4
LESSON 4. AWK PROGRAMMING
4. AWK PROGRAMMING
    4.0 OBJECTIVES
    4.1 INTRODUCTION
    4.2 RELATIONAL AND LOGIC OPERATORS IN AWK
    4.3 CONTROL STRUCTURES IN AWK
        4.3.1 The if-else construct
        4.3.2 The for loop
    4.4 SPECIAL VARIABLES - NF AND NR
        4.4.1 Using BEGIN and END clauses in awk
        4.4.2 Using variables in AWK
    4.5 RUNNING AWK PROGRAMS KEPT IN FILES
    4.6 GENERATING REPORTS USING AWK
        4.6.1 The printf command of AWK
        4.6.2 Format specifications in printf
        4.6.3 Printing the fields in different order than input
        4.6.4 Creating simple reports
        4.6.5 Field separator
        4.6.6 Printing heading/heading row and summary/footer
    4.7 MISCELLANEOUS FEATURES OF AWK
        4.7.1 Specifying search patterns in AWK
        4.7.2 Limiting the lines on which AWK would work
        4.7.3 Built-in variables
        4.7.4 Passing arguments to AWK
        4.7.5 Arrays and associative arrays in AWK
        4.7.6 String functions in AWK
        4.7.7 Few interesting, complex examples
    4.8 SUMMARY
    4.9 ANSWERS TO THE SELF CHECK QUESTIONS
    4.10 TERMINAL QUESTIONS
  • 177. COE Unit 3, Lesson 4
4. AWK Programming
In the previous chapter we saw how AWK can be used to process the input data and print it as needed. In this chapter we will see the programming features of AWK that make it very powerful.
4.0 Objectives
After going through this lesson you will know
    • How to use AWK programming
    • Relational and logic operators for conditions
    • Control structures
    • Use of variables, BEGIN and END clauses
    • How to generate reports using AWK
    • Miscellaneous features of AWK
4.1 Introduction
AWK provides a simple yet powerful programming language. The programming language features are similar to C language constructs. Note that we will continue to refer to the toppers.txt file from chapter 3 for examples.
4.2 Relational and logic operators in AWK
AWK supports comparing fields to create conditions. Relational operators, which compare two values, are available in awk. For example, a condition like $3 == 2006 can be used. We will see such usage in subsequent examples below. The following relational operators are available:
    ==    Compares whether the values specified are equal
    !=    Compares whether the values specified are not equal
    >     Tells whether a value is greater than the other
    >=    Tells whether a value is greater than or equal to the other
    <     Tells whether a value is less than the other
    <=    Tells whether a value is less than or equal to the other
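A relational condition can also stand in the place of a search pattern, so that the action runs only for lines where the comparison holds. The sketch below assumes this standard awk behaviour and uses four rows of toppers.txt; the temporary file and variable names are arbitrary.

```shell
#!/bin/bash
# A few rows from toppers.txt: subject, marks, year, name.
data=$(mktemp)
cat > "$data" <<'EOF'
Physics 98 2006 Ramakant
Chemistry 88 2006 Raju Pandy
Maths 96 2004 Lokesh Arora
Maths 98 2006 Javed M. K. Akthar
EOF

# Subject and marks for the year 2006 ($3 is the year field).
year_2006=$(awk '$3 == 2006 { print $1, $2 }' "$data")

# First names of toppers with at least 96 marks ($2 is the marks field).
high=$(awk '$2 >= 96 { print $4 }' "$data")

echo "$year_2006"
echo "----"
echo "$high"
rm -f "$data"
```

The first command prints three lines (Physics 98, Chemistry 88, Maths 98), and the second prints Ramakant, Lokesh and Javed.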
  • 178. COE Unit 3, Lesson 4
Multiple relational conditions can be combined using logic operators. For example, $3 == 2006 && $2 != 98. This condition will be true only when the third field is 2006 and the second field is not equal to 98. The following logic operators are available:
    &&    Logical and
    ||    Logical or
    !     Logical negation
Note an important point here. Relational operators are meant for building conditions. In current awk versions (nawk, gawk and other POSIX-compliant versions) a comparison evaluates to 1 for true and 0 for false, as in C, so an expression like ($1 == 1) + ($2 == 0) is accepted. Very old awk versions may accept comparisons only as conditions and report an error for such an expression, so portable scripts should keep comparisons inside patterns and if conditions. Examples in subsequent sections will show conditions that use relational and logic operators.
4.3 Control structures in awk
AWK provides C-like control structures as well to facilitate programming. Control structures in AWK include the following:
4.3.1 The if-else construct
The if-else construct of AWK has the following syntax.
    if (condition) statement [ else statement ]
Example 1: To print the first names of the chemistry toppers for the year 2006, we can use
    awk '/Chemistry/ { if ( $3 == 2006 ) print $4 }' toppers.txt
will print
    Raju
    Rajni
Note that there is no else in the example above. The else part of if-else is optional.
Example 2: Print whether Maths toppers had more than 97 marks.
  • 179.
        awk '/Maths/ {
            if ($2 > 98) {
                printf("In the year %s ", $3)
                print $4, "had more than 98 marks."
            } else {
                printf("In the year %s ", $3)
                print $4, "did not have more than 98 marks."
            }
        }' toppers.txt
    This will print
        In the year 2003 Suresh had more than 98 marks.
        In the year 2004 Lokesh did not have more than 98 marks.
        In the year 2005 Anup had more than 98 marks.
        In the year 2006 Javed did not have more than 98 marks.
    Javed scored exactly 98, which is not greater than 98, so the else branch runs for that line. Note that there is an else part in this example. Also note that when there is more than one statement, the statements can be grouped with curly braces, as we have done here. The printf command (see section 4.6.1) is used for the first part of each message because, unlike print, it does not add a newline.

    4.3.2 The for loop
    The for loop in AWK has the following syntax:
        for (initialization; condition; increment) statement

    Example 3: To print a colon after each of the first four fields we can use
        awk '/Maths/ { for (i = 1; i <= 4; i++) printf("%s:", $i); printf("\n") }' toppers.txt
    which will print
        Maths:99:2003:Suresh:
        Maths:96:2004:Lokesh:
        Maths:99:2005:Anup:
        Maths:98:2006:Javed:
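    The two constructs can also be nested. The following is a small sketch rather than part of the original lesson: it uses an if-else inside a for loop to join the first four fields with colons without leaving a trailing colon. The input line and the names line, joined and out are assumptions made for this illustration.

```shell
# One hypothetical input line in the assumed toppers.txt layout.
line='Maths 99 2003 Suresh'

joined=$(printf '%s\n' "$line" | awk '{
    out = ""                      # user variable, starts as an empty string
    for (i = 1; i <= 4; i++) {
        if (i < 4) {
            out = out $i ":"      # fields 1..3: append the field and a colon
        } else {
            out = out $i          # last field: no trailing colon
        }
    }
    print out
}')

echo "$joined"
```

    Placing two strings side by side, as in out $i ":", concatenates them; this is described later in section 4.7.6.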
  • 180.
    Note that $0 contains the entire input line, while $1 onwards contain the fields. Also note that we have used a variable i here. We will see details on variables in AWK later.

    Self-Check Questions
    1. AWK programs can compare two fields of the input line. (true/false)
    2. Relational operators give true or false but return value cannot be used in expressions. (true/false)
    3. The if-else construct of AWK mandates that the else part must be there. (true/false)
    4. A for loop can have a block of statements enclosed in curly brackets. (true/false)

    4.4 Special variables - NF and NR
    AWK provides internal special variables:
        NF   the number of fields in the line currently being read
        NR   the number of records (lines) read so far

    Example 4: Printing only the long lines, those with more than 5 fields:
        awk '{ if (NF > 5) print }' toppers.txt
    This will print
        Maths 98 2006 Javed M. K. Akhtar

    Example 5: For Maths toppers, if we want to skip printing the year (the third field), we can use the following AWK command:
        awk '/Maths/ { for (i = 1; i <= NF; i++) if (i != 3) printf("%s ", $i); printf("\n") }' toppers.txt
    which will print
        Maths 99 Suresh Yadav
        Maths 96 Lokesh Arora
        Maths 99 Anup Mathur
        Maths 98 Javed M. K. Akhtar
    The loop runs up to NF, the number of fields on the current line, so names of any length are printed in full.

    4.4.1 Using BEGIN and END clauses in AWK
  • 181.
    Usual programming tasks consist of
      - Initializing some variables
      - Reading inputs, performing some calculations and printing outputs
      - Finally, generating some output based on the complete input set
    The BEGIN clause of AWK lets you specify initializations; it runs once, before any input is read. The END clause lets you perform calculations based on the entire input; it runs once, after the last record has been read.

    Example 6: Suppose you want to print the total number of toppers.
        awk 'END { print "There are", NR, "toppers." }' toppers.txt
    will print
        There are 13 toppers.

    4.4.2 Using variables in AWK
    AWK provides $0, $1, $2, etc. as fields. In addition, you can use your own variables for any calculations. You need not declare a variable; simply using it is permitted.

    Example 7: Suppose we want to find the average top marks for Physics over the years.
        awk '/Physics/ { marks += $2; count++ } END { print "The average top marks in physics are " marks/count }' toppers.txt
    This will print
        The average top marks in physics are 93.375
    The sum is divided by count, the number of Physics lines, and not by NR, because NR counts every line of the file. In this example, marks and count are user-defined variables. A variable name must start with a letter or underscore and may contain letters, digits and underscores, as long as it does not conflict with a word that has a specific meaning to AWK, such as print, NR or END. There is no need to declare a variable or to initialize it. A variable handled as a string is initialized to the null string, meaning that if you try to print it, nothing will be printed. A variable handled as a number is initialized to zero.

    Self-Check Questions
  • 182.
    5. Special AWK variable NF stands for (a) Next field, (b) New Format, (c) Number of fields, (d) Next Line
    6. END is used in AWK to (a) Exit from AWK, (b) To do final calculations
    7. You can use any variable in AWK but you need to declare it first (a) true, (b) false.

    4.5 Running AWK programs kept in files
    As you must have noticed, AWK programs can easily be longer than one line. Typing long programs on the command line is cumbersome. Moreover, whenever you create programs, you will want to keep them in files so that you can use them over and over again. AWK provides a way to do this: the commands can be written into a file, and AWK can then be told to execute the commands from that file as follows:
        awk -f <awk program file name> <input file>

    Example 8: Suppose someone has a coin collection with gold and silver coins. Details of this collection are listed below.
        bash> cat coin_collection_details.txt
        Coin type   weight(gm)   year of making
        Gold        1            1945
        Gold        1            1952
        Silver      10           1948
        Gold        1            1973
        Gold        1            1973
        Gold        0.5          1945
        Gold        0.1          1933
        Silver      1            1943
        Gold        0.25         1921
    Now we can create an AWK program to print a summary of this coin collection as shown below:
  • 183.
        bash> cat show_coin_summary
        /Gold/   { num_gold++;   wt_gold   += $2 }   # Get weight of gold.
        /Silver/ { num_silver++; wt_silver += $2 }   # Get weight of silver.
        END {
            val_gold   = 485 * wt_gold;              # Compute value of gold.
            val_silver = 16 * wt_silver;             # Compute value of silver.
            total = val_gold + val_silver;
            print "Summary data for coin collection:";   # Print results.
            printf("\n");
            printf(" Gold pieces: %2d\n", num_gold);
            printf(" Weight of gold pieces: %5.2f\n", wt_gold);
            printf(" Value of gold pieces: %7.2f\n", val_gold);
            printf("\n");
            printf(" Silver pieces: %2d\n", num_silver);
            printf(" Weight of silver pieces: %5.2f\n", wt_silver);
            printf(" Value of silver pieces: %7.2f\n", val_silver);
            printf("\n");
            printf(" Total number of pieces: %2d\n", num_gold + num_silver);
            printf(" Value of collection: %7.2f\n", total);
        }
    The patterns are written as /Gold/ and /Silver/ because AWK matching is case sensitive and the data file capitalizes these words. The total is computed as num_gold + num_silver rather than NR, because NR would also count the heading line of the data file. Note that AWK programs allow you to put comments as well; see the first two lines of the show_coin_summary file listed above. You can run this AWK program as shown below:
        bash> awk -f show_coin_summary coin_collection_details.txt
    The output of this run will be:
        Summary data for coin collection:

         Gold pieces:  7
         Weight of gold pieces:  4.85
         Value of gold pieces: 2352.25

         Silver pieces:  2
         Weight of silver pieces: 11.00
         Value of silver pieces:  176.00

         Total number of pieces:  9
         Value of collection: 2528.25
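    The program-file approach of section 4.5, together with the BEGIN and END clauses of section 4.4.1, can be tried end to end with the sketch below. It is an illustration under stated assumptions, not the lesson's own program: the file names coins.txt and summary.awk are hypothetical, the four data rows are a subset of the collection above, and a temporary directory is used so that nothing is left behind.

```shell
# Work in a throwaway directory (hypothetical file names throughout).
workdir=$(mktemp -d)

# A few rows of the coin data: type, weight in grams, year.
cat > "$workdir/coins.txt" <<'EOF'
Gold 1 1945
Silver 10 1948
Gold 0.5 1945
Silver 1 1943
EOF

# The AWK program kept in its own file.
cat > "$workdir/summary.awk" <<'EOF'
# BEGIN runs once, before any input is read.
BEGIN    { print "Coin summary" }
# These actions run for each matching record; the variables start at zero.
/Gold/   { num_gold++;   wt_gold   += $2 }
/Silver/ { num_silver++; wt_silver += $2 }
# END runs once, after the last record.
END {
    printf("Gold: %d pieces, %.2f gm\n", num_gold, wt_gold)
    printf("Silver: %d pieces, %.2f gm\n", num_silver, wt_silver)
}
EOF

summary=$(awk -f "$workdir/summary.awk" "$workdir/coins.txt")
echo "$summary"

rm -r "$workdir"
```

    Keeping the program in a file makes it easy to rerun it on another data file by changing only the last argument of the awk command.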
  • 184.
    4.6 Generating reports using AWK
    AWK programs can be used to quickly process text inputs and create various reports. Because AWK splits each record into fields, it is much more helpful for creating reports than other Unix utilities such as sed.

    4.6.1 The printf command of AWK
    While the print command is available in AWK, it is quite basic. More sophisticated formatting is often needed, especially when generating reports. For this, a C-like printf command is available in AWK. printf uses format specifications like %s, %d, etc. for formatting output:
        %s   prints a string
        %d   prints a number in decimal (integer) format
        %f   prints a floating point number
    In addition, you can use the following to control spacing:
        \t   prints a tab
        \n   prints a new line
    Unlike print, printf does not add a newline by itself, so \n must be written wherever a line should end. Tabs come in very handy for printing aligned columns. The input text fields may vary in length; if you separate fields with spaces, the fields in the output may not align well. Tabs give better aligned output as long as each field is shorter than a tab stop (usually 8 columns).

    Example 1: Printing the topper name and year for Maths, with spaces.
        awk '/Maths/ { printf("%s %s\n", $4, $3) }' toppers.txt
    will print
        Suresh 2003
        Lokesh 2004
        Anup 2005
        Javed 2006
    You can see that the output columns are not aligned.

    Example 2: Printing the topper name and year for Maths, with tabs.
        awk '/Maths/ { printf("%s\t%s\n", $4, $3) }' toppers.txt
    will print
        Suresh  2003
        Lokesh  2004
        Anup    2005
        Javed   2006
    You can see that the output columns are well aligned now after using tab.
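    The difference between the two separators can be checked directly. This is a minimal sketch with three hypothetical rows in the assumed toppers.txt layout; the variable names rows, with_space and with_tab are made up for the illustration.

```shell
# Three hypothetical rows: subject marks year first-name.
rows='Maths 99 2003 Suresh
Maths 96 2004 Lokesh
Maths 99 2005 Anup'

# Name and year separated by a single space: columns drift with name length.
with_space=$(printf '%s\n' "$rows" | awk '{ printf("%s %s\n", $4, $3) }')

# Name and year separated by a tab: the year starts at the next tab stop.
with_tab=$(printf '%s\n' "$rows" | awk '{ printf("%s\t%s\n", $4, $3) }')

echo "With spaces:"
echo "$with_space"
echo "With tabs:"
echo "$with_tab"
```

    On a terminal with 8-column tab stops the years in the second listing line up, because every name here is shorter than 8 characters.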
  • 185.
    4.6.2 Format specifications in printf
    The printf command of AWK accepts many format specifiers. Moreover, for each format specifier you can control the width, padding and precision of the output, which helps make reports more readable. A number after % gives the minimum field width, a leading - left-justifies the value, a leading 0 pads with zeros, and a number after a dot gives the precision. The table below lists how values are printed with certain format specifiers:

        Format     Value             Result
        %s         "Hello"           "Hello"
        %10s       "Hello"           "     Hello"
        %-10s      "Hello"           "Hello     "
        -----------------------------------------------
        %c         100               "d"
        %10c       100               "         d"
        %010c      100               "000000000d"
        -----------------------------------------------
        %d         10                "10"
        %10d       10                "        10"
        %10.4d     10.123456789      "      0010"
        %10.8d     10.123456789      "  00000010"
        %.8d       10.123456789      "00000010"
        %010d      10.123456789      "0000000010"
        -----------------------------------------------
        %e         987.1234567890    "9.871235e+02"
        %10.4e     987.1234567890    "9.8712e+02"
        %10.8e     987.1234567890    "9.87123457e+02"
        %f         987.1234567890    "987.123457"
        %10.4f     987.1234567890    "  987.1235"
        %010.4f    987.1234567890    "00987.1235"
        %10.8f     987.1234567890    "987.12345679"
        -----------------------------------------------
        %g         987.1234567890    "987.123"
        %10g       987.1234567890    "   987.123"
        %10.4g     987.1234567890    "     987.1"
        %010.4g    987.1234567890    "00000987.1"
        %.8g       987.1234567890    "987.12346"

    Note that zero padding of non-numeric conversions such as %010c may differ between AWK implementations.

    Self-Check Questions
    8. If you use tabs in printf, the output will not be aligned (true/false)
    9. Tab is printed by putting (a) \T, (b) \tab, (c) \t, (d) tab
    10. For printing a string using print, a format specification is needed (true/false)
  • 186.
    11. If you use printf with %10s and give "worlds" as the argument, the output will be "10worlds" (true/false).

    4.6.3 Printing the fields in a different order than the input
    If you want to print some of the fields in an order that is different from the input, you can simply change the order of the $ variables in the print command. This feature is often useful when creating reports.

    Example 3: To print the year followed by the marks for the year 2006:
        awk '{ if ($3 == 2006) print $3, $2 }' toppers.txt
    will print the following:
        2006 95
        2006 88
        2006 89
        2006 96
    AWK features make it very useful for processing data and printing reports, especially when the data is arranged in columns as in our toppers.txt example. Let's see a few examples before looking at more AWK features.

    4.6.4 Creating simple reports
    Creation of simple reports is straightforward using AWK.

    Example 4: If you want to print the Physics toppers for years prior to 2005, you can use the following command (note that the year is the 3rd field in the input text):
        awk '/Physics/ { if ($3 < 2005) printf("%s %s %s\n", $3, $4, $5) }' toppers.txt > phy_toppers_before_2005.txt
        bash> cat phy_toppers_before_2005.txt
        2003 Abhay Malhotra
        2004 Shriesh Jadhav

    Example 5: If you want to print a simple yes/no answer stating whether the topper had more than 92 marks or not, you can use the following:
  • 187.
        awk '{ if ($2 > 92) printf("%s\t%s\tyes\n", $3, $1); else printf("%s\t%s\tno\n", $3, $1) }' toppers.txt > more_than_92.txt
        bash> cat more_than_92.txt
        2003    Physics     no
        2003    Chemistry   yes
        2003    Maths       yes
        2004    Physics     yes
        2004    Chemistry   yes
        2004    Maths       yes
        2005    Physics     no
        2005    Chemistry   no
        2005    Maths       yes
    and so on. You can see how quickly AWK can be used to generate reports like this.

    Example 7: For Maths toppers, if we want to put a colon between fields except within the names, we can use the following AWK command:
        awk '/Maths/ {
            for (i = 1; i <= NF; i++) {
                if (i < 4) printf("%s:", $i); else printf("%s ", $i)
            }
            printf("\n")
        }' toppers.txt
    which will print
        Maths:99:2003:Suresh Yadav
        Maths:96:2004:Lokesh Arora
        Maths:99:2005:Anup Mathur
        Maths:98:2006:Javed M. K. Akhtar
    Note that the special variable NF has been used to define the terminating condition. With NF you can work with data having a variable number of columns: we are able to print names that fit in 2 fields (e.g., Lokesh Arora) as well as names that need 4 fields (e.g., Javed M. K. Akhtar). Also note that we have used if-else inside a for loop. The if-else part ensures that there are no colons within the names.

    4.6.5 Field separator
  • 188.
    AWK works by reading one input record (one line) and breaking it up into fields. By default, AWK uses white space (spaces and tabs) as the field separator. However, you may encounter tabular data that uses some other character as separator. For example, your input data may look like the output of Example 7:
        Maths:99:2003:Suresh Yadav
        Maths:96:2004:Lokesh Arora
        Maths:99:2005:Anup Mathur
        Maths:98:2006:Javed M. K. Akhtar
    Here the colon (':') is the separator. In such cases, you can tell AWK what character to use as field separator. The field separator is an optional argument to the awk command:
        awk -F<ch>
    e.g.,
        awk -F:     tells AWK to use colon as the separator
        awk -F'|'   tells AWK to use bar as the separator
        awk -F'"'   tells AWK to use double quote as the separator

    Example 8: If the input line is
        Maths:99:2005:Anup Mathur
    and AWK is run with -F: as an argument, then
        $1 will contain Maths
        $2 will contain 99
        $3 will contain 2005
        $4 will contain "Anup Mathur"
    Note that $4 here contains the entire name because the separator has been set to colon, so the space inside the name no longer splits it.

    Example 9: You can pipe the output of one AWK command into another. So we can pipe the output of Example 7 above into another AWK.
  • 189.
        awk '/Maths/ {
            for (i = 1; i <= NF; i++) {
                if (i < 4) printf("%s:", $i); else printf("%s ", $i)
            }
            printf("\n")
        }' toppers.txt | awk -F: '{ printf("%-18s\t%d\n", $4, $3) }'
    will print
        Suresh Yadav            2003
        Lokesh Arora            2004
        Anup Mathur             2005
        Javed M. K. Akhtar      2006

    4.6.6 Printing heading/heading row and summary/footer
    The BEGIN and END clauses can also be used to print headings and a summary for reports, thus making the report more readable and attractive.

    Example 10: Here we will print the Physics toppers with headers and will print a summary at the end.
        awk 'BEGIN {
            printf("Physics toppers details:\n");
            printf("-----------------------------------------\n");
            printf("Year\tMarks\tName of the topper\n");
            printf("-----------------------------------------\n");
        }
        /Physics/ {
            printf("%d\t%s\t%s\n", $3, $2, $4);
            sum += $2; count++
        }
        END {
            printf("-----------------------------------------\n");
            printf("Avg top marks in physics were %f\n", sum/count);
            printf("-----------------------------------------\n");
        }' toppers.txt
    The marks are printed with %s so that fractional marks such as 94.5 are not truncated, and the average divides by count (the number of Physics lines) rather than NR. This will print
        Physics toppers details:
        -----------------------------------------
        Year    Marks   Name of the topper
        -----------------------------------------
        2003    92      Abhay
        2004    94.5    Shriesh
        2005    89      Vandana
        2006    98      Ramakant
        -----------------------------------------
        Avg top marks in physics were 93.375000
        -----------------------------------------
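    The field separator option and the heading/footer technique can be combined in one command. The following is a sketch under stated assumptions: the three colon-separated rows are taken from the Example 7 output, while the column widths, the report wording and the variable names data, report, sum and n are choices made only for this illustration.

```shell
# Colon-separated rows: subject:marks:year:full name
data='Maths:99:2003:Suresh Yadav
Maths:96:2004:Lokesh Arora
Maths:99:2005:Anup Mathur'

report=$(printf '%s\n' "$data" | awk -F: '
BEGIN {                                   # heading, printed once
    printf("%-15s %s\n", "Name", "Year")
    print "--------------------"
}
{                                         # one report line per record
    printf("%-15s %d\n", $4, $3)
    sum += $2                             # running total of the marks
    n++                                   # number of records seen
}
END {                                     # footer, printed once
    print "--------------------"
    printf("Average marks: %.2f\n", sum / n)
}')

echo "$report"
```

    Because -F: makes the colon the separator, $4 holds the whole name, and %-15s pads it to a fixed width so the year column lines up without relying on tab stops.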
  • 190.
    Self-Check Questions
    12. AWK always prints the fields in the same order as they appear in the input (true/false).
    13. AWK can generate reports containing only the input fields. No other items can be added. (true/false).
    14. Field separator in AWK is fixed and cannot be changed (true/false).

    4.7 Miscellaneous features of AWK

    4.7.1 Specifying search patterns in AWK
    As we have seen in several examples and in the AWK syntax, search patterns can be used in AWK along with their respective programs. So far we have used simple search patterns like the example below:
        awk '/Physics/ { print }' toppers.txt
    However, AWK also supports much more sophisticated patterns, as listed below.
        /The/       matches any line containing The. So this will match lines containing There, These, Them too. But it will not match lines containing the, these, them, etc., because AWK uses case-sensitive matching.
        /^The/      matches any line beginning with The. So this will match lines which have The, These, Them at the beginning only.
        /The$/      matches any line ending with The
        /The\$$/    matches any line ending with The$
        /[Tt][Hh][Ee]/   matches any line containing THE, The, tHe, thE, etc.
        /^[a-zA-Z][a-zA-Z0-9_]*$/   matches lines that consist of a single identifier
        /(^India)|(^Pakistan)/      matches lines beginning with India or Pakistan
    You can even use complex regular expressions in AWK. Regular expressions can be built using the following characters:
  • 191.
        ?   matches zero or one occurrence of the character before it
        +   matches one or more occurrences of the character before it
        *   matches zero or more occurrences of the character before it
        .   the dot matches any single character
    For example, the following expression matches any line containing only a signed integer. The matched line cannot contain any other characters.
        /^[+-]?[0-9]+$/   matches signed integers

    Example 1: A data file contains some text and some integer numbers. Here is the data file:
        bash> cat data_file.txt
        The number of loans given
        12399
        The number of loans fully repaid by now
        2893
        The number of defaulters
        129
        Defaulted amount (loss)
        -8929972
        Loss after adjusting procedural expenses
        -9288990.72
    The command
        awk '/^[+-]?[0-9]+$/ { print }' data_file.txt
    will print
        12399
        2893
        129
        -8929972
    The last value, -9288990.72, is not printed because it contains a decimal point and is therefore not an integer.

    4.7.2 Limiting the lines on which AWK would work
    By default, AWK works on each line of input. We have already seen that search patterns can limit the lines on which AWK works. In addition, you can limit AWK to work only on some block of input lines.
        /^India/,/^Pakistan/   operates on a block of lines, from a line starting with India through the next line starting with Pakistan (both included)
        NR == 15               operates only on the 15th line
        NR==10,NR==25          operates on lines 10 to 25
        $1 == "India"          operates on lines where the first field is "India"
        $1 ~ /India/           operates on lines where the first field contains India
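    Both kinds of line selection described above can be tried on a few lines of input. This is a minimal sketch: the six input lines and the variable names lines, ints and middle are hypothetical and chosen only to exercise the signed-integer pattern and an NR range.

```shell
# Six hypothetical input lines, one value per line.
lines='header
12399
-42
3.14
+7
footer'

# Regular expression pattern: keep only lines that are a signed integer.
ints=$(printf '%s\n' "$lines" | awk '/^[+-]?[0-9]+$/ { print }')

# Range pattern on NR: work only on lines 2 to 4, prefixing the line number.
middle=$(printf '%s\n' "$lines" | awk 'NR == 2, NR == 4 { print NR ": " $0 }')

echo "Signed integers:"
echo "$ints"
echo "Lines 2 to 4:"
echo "$middle"
```

    The value 3.14 is selected by the line range but rejected by the regular expression, which shows that the two mechanisms filter on different things: position in the input versus content of the line.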
  • 192.
    You can even create complex conditions using the && and || operators, e.g.,
        ((NR >= 30) && ($1 == "India")) || ($1 == "Pakistan")

    Example 2: If you know that your input data has some header text and some footer text, and the data of interest lies in between, then you should use such patterns to limit AWK to work only on the data and not on the header and footer.
        bash> cat data.txt
        -------------------------------------------------
        The weather report for 24.05.2007
        -------------------------------------------------
        City     Humidity   Max Temp
        Agra     92         38
        Delhi    93         39
        Mumbai   98         34
        Copyright CNN world
        Data from 2pm IST
    The city data is on lines 5 to 7, so the command
        awk 'NR > 4 && NR < 8 { printf("%s\tTemp=%d\n", $1, $3) }' data.txt
    will print
        Agra    Temp=38
        Delhi   Temp=39
        Mumbai  Temp=34

    Self-Check Questions
    15. AWK search patterns are case-insensitive. (true/false)
    16. /NASA/ will match only lines containing NASA. (true/false)
    17. AWK will work on each line of input. There is no way to limit the scope. (true/false)

    4.7.3 Built-in variables
    We have used many of the built-in variables of AWK, such as $0, $1, $2, etc. and NF, NR. In addition, AWK has a few other built-in variables, as listed below. Note that these variables are not read-only. That means that, during an AWK program's run, the program itself can change the value of the variable!
      - FS: Field separator. By default AWK uses white space (spaces and tabs) as field separator and we have seen the -F option that can be used on the command line to specify the
  • 193.
        field separator to be used by AWK. In addition, AWK has a built-in variable FS that specifies the field separator.
      - RS: Record separator. By default AWK reads each line as one input record, which means the default record separator is the newline. However, you can use RS to change the record separator.
      - OFS: Stores the "output field separator", which separates the fields when AWK prints them. The default is a space character.
      - ORS: Stores the "output record separator", which separates the output lines when AWK prints them. The default is a newline character.
      - FILENAME: Contains the name of the current input file.

    4.7.4 Passing arguments to AWK
    So far we have seen AWK programs and commands where the values were fixed. For example, consider Example 2 of section 4.3.1, where a fixed value is being used:

    Example 3: Print whether Maths toppers had more than 98 marks.
        awk '/Maths/ {
            if ($2 > 98) {
                printf("In the year %s ", $3)
                print $4, "had more than 98 marks."
            } else {
                printf("In the year %s ", $3)
                print $4, "did not have more than 98 marks."
            }
        }' toppers.txt
    This will print
        In the year 2003 Suresh had more than 98 marks.
        In the year 2004 Lokesh did not have more than 98 marks.
        In the year 2005 Anup had more than 98 marks.
        In the year 2006 Javed did not have more than 98 marks.
    Now, you may be asked to print the same report but for 94 marks. In that case, you would need to copy and modify the same script to replace 98 by 94. Such copying must be avoided because (a) it creates multiple scripts doing nearly the same thing, (b) if you fix an error in one file you will need to fix it in all the files of the same type, and (c) the operation of copying and modifying is very error prone (what if the change from 98 to 94 is made in all places but accidentally left out in one place?). Therefore, it is safer to make your
  • 194.
    scripts in a generic way. Consider Example 3 again, generalized as Example 4 below:

    Example 4: Print whether Maths toppers had more than N marks.
  • 195.
        bash> cat report_script
        /Maths/ {
            if ($2 > N) {
                printf("In the year %s ", $3)
                printf("%s had more than %d marks.\n", $4, N)
            } else {
                printf("In the year %s ", $3)
                printf("%s did not have more than %d marks.\n", $4, N)
            }
        }
    It is invoked as
        awk -f report_script N=94 toppers.txt
    Note that we are passing N=94 on the command line. The assignment must come before the name of the input file, because AWK performs it when it reaches that argument. So if another report is needed with N=55, we need not copy or modify the file; we can simply pass N=55 on the command line.

    4.7.5 Arrays and associative arrays in AWK
    Any user-defined variable can work as an array in AWK. You can simply assign values with indexing. For example,
        Field[1] = $1
        Field[3] = $3
    AWK also supports associative arrays, where the index is a string rather than a number. In fact, all AWK arrays are associative, so the index may be a number or any string. For example, if $i contains the name of a city and $j contains the city's temperature, you can store this information in an associative array:
        Temperature[$i] = $j

    4.7.6 String functions in AWK
    If you place multiple strings side by side, they will be joined.
        a = "DTU" "Delhi"    # a will become "DTUDelhi"
    The length(str) function returns the length of the given string.
    The substr(str, startIndex, length) function takes out a substring. For example, substr("handbag", 5, 3) will return "bag".
  • 196.
    Note that the index starts from 1, not 0.
    index(str, searchStr) gives the position of searchStr within str, or 0 if it is not found. For example, index("handbag", "bag") will return 5 and index("handbag", "DEI") will return 0.
    split(str, array [, separator]) splits a string at the separator and fills the pieces into an array. For example, split("mera bharat mahan", slogan) will set slogan[1] to "mera", slogan[2] to "bharat", etc.

    Self-Check Questions
    18. AWK provides a built in variable for field separator (true/false).
    19. Built in variables are read only (true/false).
    20. Variables passed to AWK are accessed as $1, $2, etc. (true/false)
    21. AWK does not support complex structures but supports associative arrays (true/false).

    4.7.7 A few interesting, complex examples
    A few interesting examples are listed below. These exemplify the power of AWK.

    Example 5: Counting non-blank lines in a file:
        awk 'NF != 0 {++count} END {print count}' input_file.txt

    Example 6: Computing the average size of files in a directory (the first line of ls -l output is a total and is skipped):
        ls -l | awk 'NR!=1 {s+=$5} END {print "Average: " s/(NR-1)}'

    Example 7: Print Fibonacci numbers (this prints ten numbers of the sequence, starting 1, 2, 3, 5, 8):
        awk 'BEGIN {a=1;b=1; while(++x<=10){print a; t=a;a=a+b;b=t}; exit}'

    Example 8: Sometimes we may repeat words unintentionally, as in "When I was was going there". Detecting these manually is difficult, but we can write an AWK program to do this!
        BEGIN { dups = 0; w = "xy-zzy" }
        {
            for (n = 1; n <= NF; n++) {
                if (w == $n) { print w, "::", $0; dups = 1 }
                w = $n
            }
        }
        END { if (dups == 0) print "No duplicates found." }
    The variable w always holds the previous word, so a word is reported when it equals the word just before it. Its initial value "xy-zzy" is just a string that is unlikely to appear as the first word of the input.
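    The array and string features of sections 4.7.5 and 4.7.6 can be checked with a short run. The following is a sketch, not part of the original lesson: the city and temperature rows come from the weather example of section 4.7.2, "handbag" is just an illustrative string, and the names result, temperature and slogan are assumptions made for this example.

```shell
result=$(printf 'Agra 38\nDelhi 39\nMumbai 34\n' | awk '
{ temperature[$1] = $2 }            # associative array indexed by city name
END {
    print "Delhi:", temperature["Delhi"]
    print length("handbag")         # number of characters in the string
    print substr("handbag", 5, 3)   # 3 characters starting at position 5
    print index("handbag", "bag")   # position of "bag", or 0 if absent
    n = split("mera bharat mahan", slogan, " ")   # n is the number of pieces
    print n, slogan[2]
}')

echo "$result"
```

    The array is filled while the input is read and is only looked up in the END clause, which is the usual shape of an AWK program that collects values first and reports afterwards.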
  • 197.
    4.8 Summary
    AWK is a very powerful utility in Unix. It helps in scripting and report generation.

    4.9 Answers to the self check questions
    1 true, 2 false (in POSIX AWK a comparison yields 1 or 0, which can be used in expressions), 3 false, 4 true, 5 (c), 6 (b), 7 (b), 8 false, 9 (c), 10 false, 11 false, 12 false, 13 false, 14 false, 15 false, 16 false, 17 false, 18 true, 19 false, 20 false, 21 true

    4.10 Terminal Questions
    1. Take the toppers.txt of this chapter. For each year and subject, print the first name of the topper, the marks and then the year.
    2. Do the same as in the question above, but now print the complete name of the topper followed by the marks and then the year.
    3. Print the Chemistry toppers' marks, year and names for even years.
    4. Print the years in which the toppers scored >= 97 marks.
    5. The input contains name and phone number records. To simplify, assume there is only one name (first name) and only one phone number per name. Use associative arrays to store the numbers and names, and print them at the end.
    6. Upgrade Example 8 of section 4.7.7 to also print the line number where the repeated word occurs.
  • 198.
    7. See the AWK syntax. We have used only one pattern and its program in most of our examples. Try using multiple patterns and their corresponding programs and see the outputs.
    8. Generalize the coins example of section 4.5 by passing the per-gram values of gold and silver on the command line in place of the hard-coded values used in that example.