UNIT 6 - FILE MANAGEMENT
DR USHA RAGHAVAN
INTRODUCTION
 A file is a named collection of related information that is recorded on secondary storage such as
magnetic disks, magnetic tapes and optical disks.
 A file is a sequence of bits, bytes, lines or records whose meaning is defined by the file's creator and
user.
 Data files may be numeric, alphabetic, alphanumeric or binary.
FILE STRUCTURE
 A file's structure must follow a format that the operating system can understand.
• A file has a certain defined structure according to its type.
• A text file is a sequence of characters organized into lines.
• A source file is a sequence of procedures and functions.
• An object file is a sequence of bytes organized into blocks that are understandable by the machine.
• When an operating system defines different file structures, it must also contain the code to support
them. UNIX and MS-DOS support only a minimal number of file structures.
FILE TYPE
 File type refers to the ability of the operating system to distinguish between different kinds of
files, such as text files, source files and binary files. Many operating systems support several
file types. Operating systems such as MS-DOS and UNIX have the following types of files:
 Ordinary files
• These are the files that contain user information.
• They may contain text, databases or executable programs.
• The user can apply various operations to such files, such as adding, modifying or deleting
content, or even removing the entire file.
 Directory files
• These files contain a list of file names and other information related to those files.
 Special files
• These files are also known as device files.
• These files represent physical devices such as disks, terminals, printers,
networks and tape drives.
These files are of two types:
• Character special files − data is handled character by character as in
case of terminals or printers.
• Block special files − data is handled in blocks as in the case of disks
and tapes.
FILE ATTRIBUTES
 A file has a name and data. In addition, the system stores meta-information about it, such as
the creation date and time, current size and last-modified date. This information makes up
the attributes of the file.
 File attributes used in OS are:
• Name: The only information kept in human-readable form. It is usually
followed by an extension that indicates the type of file. E.g. in OS.doc, OS is the file
name, .doc is the extension and '.' is the separator.
• Identifier: Every file is identified within a file system by a unique tag, usually a number,
known as its identifier. It is not human-readable.
• Location: A pointer to the device and to the address of the file on that device.
• Type: Required on systems that support several types of files.
The type is often indicated by the file extension.
• Size: The current size of the file, i.e. the number of bytes
its contents occupy on the storage device, e.g. 10 MB.
• Protection: Controls the access rights for reading,
writing and executing the file.
• Time, date and security: Records the date and time of the file's creation, last
modification and last use. This information is useful for protection, security and
usage monitoring.
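As a rough illustration of the attributes listed above, the following sketch reads them for one file with Python's standard `os.stat` call. The helper names `describe` and `make_sample` are made up for this example, and the mapping of "identifier" to the inode number is an assumption that holds on UNIX-like systems.

```python
import os
import stat
import tempfile
import time


def describe(path):
    """Return a dict of the attributes discussed above for one file."""
    info = os.stat(path)
    name, extension = os.path.splitext(os.path.basename(path))
    return {
        "name": name,
        "extension": extension,              # type hint carried in the name
        "identifier": info.st_ino,           # inode number on UNIX-like systems
        "size_bytes": info.st_size,
        "protection": stat.filemode(info.st_mode),   # e.g. '-rw-------'
        "modified": time.ctime(info.st_mtime),
        "accessed": time.ctime(info.st_atime),
    }


def make_sample():
    """Create a small throwaway file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write("hello")
    return path


if __name__ == "__main__":
    sample = make_sample()
    for key, value in describe(sample).items():
        print(f"{key:12} {value}")
    os.remove(sample)
```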
OPERATIONS ON FILE
• Create: find space on disk and make an entry in the directory.
• Write: requires positioning within the file.
• Read: also involves positioning within the file.
• Delete: remove the directory entry and reclaim the disk space.
• Reposition: move the read/write position.
CREATE A FILE
 The create operation creates a file by reserving space on the
storage device. It involves 2 steps:
1. Find free space in the file system.
2. Make an entry for the file in its directory.
 Creating a file requires giving it a name that is unique within its directory.
WRITE INTO A FILE
 Writing to a file requires a system call with 2 parameters: the first
specifies the name of the file, and the second specifies the information
or data to be written.
 Using the file name, the system searches the directory to find the file's
location. It keeps a write pointer to the position in the file where the next write
will take place. After every write operation, the pointer must be updated for the next write.
READING A FILE
 Reading a file requires a system call with 2 parameters: the first specifies the name of
the file, and the second specifies where in memory the data read from the file
should be placed.
 Using the file name, the system searches the directory for the file and uses a read pointer
to read data from it. After every read operation, the read pointer is
updated for the next read.
REPOSITIONING WITHIN THE FILE
 The directory is searched for the file's entry, and the current
position pointer is set to a given value.
 Repositioning need not involve any actual I/O.
 This operation is also called a file seek.
DELETING A FILE
 To delete a file, the OS needs the location of the file. It searches the directory for the named
file and then releases the storage space allocated to it, so that the space can be reused.
 It also deletes the file's entry from the directory table.
Other common operations include appending new information to the end of a file
and renaming an existing file. The primitive operations can be combined to perform
other operations, such as creating a copy of a file, moving a file from one location
to another, or copying a file to an I/O device such as a printer or a display.
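The operations described above can be tried out with the low-level calls in Python's `os` module, which wrap the corresponding operating-system calls. This is only a sketch: the file name `demo.dat` and the function `demo_file_operations` are made up, and, unlike the name-based description in the slides, POSIX-style interfaces open the file by name once and then pass a file descriptor to each later call.

```python
import os
import tempfile


def demo_file_operations(directory):
    """Create, write, reposition, read, append to and delete one file."""
    path = os.path.join(directory, "demo.dat")   # hypothetical file name

    # Create: O_EXCL makes the call fail if the name already exists.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, 0o600)

    # Write: each call advances the file's current position.
    os.write(fd, b"hello ")
    os.write(fd, b"world")

    # Reposition (seek): move the pointer to byte 6; no data is transferred.
    os.lseek(fd, 6, os.SEEK_SET)

    # Read: starts at the current position and advances it.
    tail = os.read(fd, 5)

    # Append: seek to the end, then write.
    os.lseek(fd, 0, os.SEEK_END)
    os.write(fd, b"!")
    size = os.fstat(fd).st_size
    os.close(fd)

    # Delete: remove the directory entry so the space can be reclaimed.
    os.remove(path)
    return tail, size, os.path.exists(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_file_operations(tmp))
```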
FILE TYPES
 The operating system recognises and supports various file types.
 Once it recognises the type of a file, the OS can perform the appropriate operations on it.
 The file type can be given as part of the file name, which then consists of 2 parts: the name
of the file and the file extension, separated by a '.' character.
 From the extension, the OS recognises the type of file, e.g. .doc for a document file and .exe for an
executable file.
 In MS-DOS, a name consists of up to 8 characters followed by a '.' character and
an extension of up to 3 characters.
 UNIX uses a magic number stored at the beginning of some files to indicate the type of
the file, such as an executable program.
Common file types
File Type        Extension                   Function
Executable       exe, com, bin or none       ready-to-run machine-language program
Object           obj, o                      compiled, machine language, not linked
Source code      c, p, pas, f77, asm, a      source code in various languages
Batch            bat, sh                     series of commands to the command interpreter
Text             txt, doc                    textual data, documents
Word processor   doc, docx, tex, rrf, etc.   various word-processor formats
Library          lib, h                      libraries of routines
Archive          arc, zip, tar               related files grouped into one file, sometimes compressed
Multimedia       mpeg, mp3                   binary files containing audio/video information
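The two ways of recognising a file type described above, by extension and by magic number, might be sketched as follows. The lookup tables are small illustrative samples chosen for this example, not a complete list, and the function names are made up.

```python
import os

# A few well-known signatures; the table is illustrative, not exhaustive.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF executable or object file",
    b"#!": "script with an interpreter line",
    b"PK\x03\x04": "zip archive",
}

EXTENSIONS = {
    ".exe": "executable",
    ".o": "object",
    ".c": "source code",
    ".sh": "batch",
    ".txt": "text",
    ".zip": "archive",
}


def type_from_extension(filename):
    """Extension style: split at the last '.' and look up the extension."""
    _, extension = os.path.splitext(filename)
    return EXTENSIONS.get(extension.lower(), "unknown")


def type_from_magic(header):
    """Magic-number style: compare the first bytes with known signatures."""
    for signature, description in MAGIC_NUMBERS.items():
        if header.startswith(signature):
            return description
    return "unknown"
```

In practice `header` would be the first few bytes read from the file; it is passed in directly here so that the sketch stays self-contained.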
FILE ACCESS METHODS
 File access mechanism refers to the manner in which the records of a file may be accessed. There are
several ways to access files −
• Sequential access
• Direct/Random access
• Indexed sequential access
SEQUENTIAL ACCESS
 A sequential access is that in which the records are accessed in some sequence, i.e., the information in
the file is processed in order, one record after the other.
 This access method is the most primitive one. Example: Compilers usually access files in this fashion.
DIRECT/RANDOM ACCESS
• Random access file organization allows records to be accessed directly.
• Each record has its own address in the file, by means of which it can be directly
read or written.
• The records need not be in any sequence within the file, and they need not be in adjacent locations on
the storage medium.
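A minimal sketch of direct access, assuming fixed-length records so that the address of record n is simply n times the record size. The record size of 16 bytes and the function names are assumptions for the example; an in-memory `io.BytesIO` stands in for the file.

```python
import io

RECORD_SIZE = 16  # bytes per record; an assumed fixed length


def write_record(stream, number, payload):
    """Store payload as record `number`, padded to RECORD_SIZE bytes."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload longer than one record")
    stream.seek(number * RECORD_SIZE)
    stream.write(payload.ljust(RECORD_SIZE, b"\x00"))


def read_record(stream, number):
    """Jump straight to record `number` without reading the ones before it."""
    stream.seek(number * RECORD_SIZE)
    return stream.read(RECORD_SIZE).rstrip(b"\x00")


def demo():
    stream = io.BytesIO()
    # Records may be written in any order.
    write_record(stream, 2, b"gamma")
    write_record(stream, 0, b"alpha")
    write_record(stream, 1, b"beta")
    return read_record(stream, 1), read_record(stream, 2)
```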
INDEXED SEQUENTIAL ACCESS
• This mechanism is built on top of sequential access.
• An index is created for each file, containing pointers to its various blocks.
• The index is searched sequentially, and the pointer found is used to access the file directly.
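The idea can be sketched with a small in-memory file. The layout below (blocks sorted by key, and an index that stores the highest key of each block) is one common arrangement and is assumed here for illustration; the records themselves are made up.

```python
# The file: blocks of records, each block sorted by key, blocks in key order.
BLOCKS = [
    [(3, "cat"), (7, "dog")],
    [(12, "emu"), (18, "fox")],
    [(25, "gnu"), (31, "hen")],
]

# The index: the highest key in each block, paired with that block's number.
INDEX = [(block[-1][0], number) for number, block in enumerate(BLOCKS)]


def lookup(key):
    """Scan the index in order, then search only the block it points to."""
    for highest_key, block_number in INDEX:
        if key <= highest_key:
            for record_key, value in BLOCKS[block_number]:
                if record_key == key:
                    return value
            return None  # the key would belong in this block but is absent
    return None  # the key is larger than every key in the file
```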
FILE ALLOCATION METHODS
 The operating system allocates disk space to files. Operating systems use the following three main
methods to allocate disk space to files.
• Contiguous Allocation
• Linked Allocation
• Indexed Allocation
CONTIGUOUS ALLOCATION
• Each file occupies a contiguous address space on disk.
• Disk addresses are assigned in linear order.
• Easy to implement.
• External fragmentation is a major issue with this type of allocation technique.
A single continuous set of blocks is allocated to a file at
the time of file creation. Thus, this is a pre-allocation
strategy, using variable size portions. The file allocation
table needs just a single entry for each file, showing the
starting block and the length of the file. This method is
best from the point of view of the individual sequential
file. Multiple blocks can be read in at a time to improve
I/O performance for sequential processing. It is also
easy to retrieve a single block. For example, if a file
starts at block b, and the ith block of the file is wanted,
its location on secondary storage is simply b+i-1.
Disadvantages
•External fragmentation will occur, making it difficult to
find contiguous blocks of space of sufficient length.
A compaction algorithm will be necessary to consolidate
free space on the disk.
•Also, with pre-allocation, it is necessary to declare the
size of the file at the time of creation.
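The address calculation b+i-1 and the effect of external fragmentation can be sketched as follows. The first-fit search is one simple placement policy chosen for the example, not something the slides prescribe, and the function names are made up.

```python
def block_address(start, length, i):
    """Disk block holding the i-th block (counting from 1) of a contiguous file."""
    if not 1 <= i <= length:
        raise IndexError("block number outside the file")
    return start + i - 1


def first_fit(free_map, length):
    """Return the first start block of `length` consecutive free blocks, or None.

    free_map is a list of booleans, True meaning the block is free.
    """
    run = 0
    for block, is_free in enumerate(free_map):
        run = run + 1 if is_free else 0
        if run == length:
            return block - length + 1
    return None
```

With the free map `[True, False, True, True, False, True, True, True]`, six blocks are free in total, yet `first_fit(free_map, 4)` returns `None` because no four of them are adjacent. That is external fragmentation.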
LINKED ALLOCATION
• Each file is a linked list of disk blocks.
• The directory contains a link / pointer to the first block of a file.
• No external fragmentation.
• Works well for sequential-access files.
• Inefficient for direct-access files.
Allocation is on an individual block basis. Each block contains a
pointer to the next block in the chain. Again the file table needs
just a single entry for each file, showing the starting block and the
length of the file. Although pre-allocation is possible, it is more
common simply to allocate blocks as needed. Any free block can
be added to the chain. The blocks need not be continuous.
Increase in file size is always possible as long as a free disk block is
available. There is no external fragmentation because only one
block at a time is needed. There can be internal fragmentation,
but only in the last disk block of the file.
Disadvantages:
•Internal fragmentation exists in the last disk block of the file.
•There is the overhead of maintaining a pointer in every disk
block.
•If the pointer in any disk block is lost, the file will be truncated.
•It supports only sequential access to files.
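A small sketch of a block chain, with a dictionary standing in for the pointers stored in the disk blocks. The block numbers and the `END` sentinel are arbitrary values chosen for the example.

```python
END = -1  # marks the last block of a chain (an arbitrary sentinel)

# next_block[b] is the pointer stored in disk block b.
# Here one file occupies blocks 9 -> 16 -> 1 -> 10 -> 25.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}


def nth_block(start, n):
    """Follow the chain from `start` to the n-th block of the file (n from 1)."""
    block = start
    for _ in range(n - 1):
        block = next_block[block]
        if block == END:
            raise IndexError("file has fewer blocks than requested")
    return block


def append_block(start, free_block):
    """Grow the file by linking any free block onto the end of the chain."""
    block = start
    while next_block[block] != END:
        block = next_block[block]
    next_block[block] = free_block
    next_block[free_block] = END
```

Reaching the n-th block means following n-1 pointers, which is why direct access is inefficient with this method.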
INDEXED ALLOCATION
• Provides solutions to the problems of contiguous and linked allocation.
• An index block is created that holds pointers to all the blocks of a file.
• Each file has its own index block, which stores the addresses of the disk space occupied by the file.
• The directory contains the addresses of the index blocks of files.
It addresses many of the problems of contiguous and
chained allocation. In this case, the file allocation table
contains a separate one-level index for each file. The
index has one entry for each portion allocated to the file.
Allocation may be on the basis of fixed-size blocks or
variable-size portions. Allocation by fixed-size blocks eliminates
external fragmentation, whereas allocation by variable-size
portions improves locality. This allocation technique
supports both sequential and direct access to the file
and thus is the most popular form of file allocation.
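A sketch of the lookup, assuming fixed-size blocks and a single-level index. The file name, the index block number 19 and the block numbers in the index are made-up values.

```python
# Directory: file name -> number of the file's index block (hypothetical values).
directory = {"notes.txt": 19}

# Contents of each index block: the disk blocks of the file, in file order.
index_blocks = {19: [9, 16, 1, 10, 25]}


def block_address(name, i):
    """Disk block holding the i-th block (counting from 1) of the named file."""
    index = index_blocks[directory[name]]
    if not 1 <= i <= len(index):
        raise IndexError("block number outside the file")
    return index[i - 1]


def sequential_blocks(name):
    """Sequential access simply walks the index from its first entry."""
    return list(index_blocks[directory[name]])
```

Unlike the linked scheme, finding the i-th block takes a single lookup in the index, whatever the value of i.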
DIRECTORY STRUCTURE
A file directory is a collection of information about files. The directory contains information about the
files, including attributes, location and ownership. Much of this information,
especially that concerned with storage, is managed by the operating system. The
directory is itself a file, accessible by various file management routines.
Information contained in a device directory includes:
•Name
•Type
•Address
•Current length
•Maximum length
•Date last accessed
•Date last updated
•Owner id
Operations performed on a directory are:
•Search for a file
•Create a file
•Delete a file
•List a directory
•Rename a file
•Traverse the file system
Advantages of maintaining directories are:
•Efficiency: A file can be located more quickly.
•Naming: It is convenient for users, as two users can use the same name
for different files or different names for the same file.
•Grouping: Files can be grouped logically by their properties, e.g. all Java
programs, all games, etc.
SINGLE-LEVEL DIRECTORY
 A single directory is maintained for all users.
• Naming problem: Two users cannot use the same name for their files.
• Grouping problem: Users cannot group files according to their needs.
Single-level directory system
The single directory is also called the root directory.
In the example, the single-level directory has 5 files owned by 3 different
users P, Q and R: user P has 2 files, user Q has 2 files and user R has 1 file.
Advantages
Simple to implement
Locating files is very fast
Limitations
If a single user has a large number of files, it becomes
difficult to remember the name of each file.
If more than one user keeps files in the same directory,
different users may give the same names to their
files, violating the rule that names must be unique.
[Figure: root directory containing five files, owned by P, P, Q, Q and R]
TWO-LEVEL DIRECTORY
 A separate directory is maintained for each user.
• Path name: Because there are two levels, every file has a path name that locates it.
• Different users can now use the same file name.
• Searching is efficient in this method.
Two-level directory systems
Each user is given a private directory. Files with the same name belonging to different users do not interfere with each other.
When a user attempts to open a file, the system knows which user it is and therefore which directory
to search for the file.
Advantages
Solves the name-collision problem
Users are isolated from each other
Limitations
Isolation is a drawback when users want to cooperate: some systems do not allow access to another user's files
It is not convenient for users with a large number of files
TREE-STRUCTURED DIRECTORY
 The directory is maintained in the form of a tree. Searching is efficient and
files can be grouped. A file can be named by an absolute or a relative path name.
The tree is the most common directory structure.
Each user can have as many directories as needed, so that files can be grouped
in whatever way is required.
Every file has a unique path name.
Most modern file systems use this mechanism.
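Absolute and relative path names can be illustrated with a toy tree in which directories are dictionaries and files are strings. The directory and file names are invented for the example, and `resolve` is a hypothetical helper, not an operating-system call.

```python
# Directories are dicts, files are strings; the names are made up for the demo.
ROOT = {
    "home": {
        "p": {"notes.txt": "P's notes", "src": {"main.c": "int main..."}},
        "q": {"notes.txt": "Q's notes"},
    }
}


def resolve(path, cwd=()):
    """Return the node named by `path`.

    A path starting with '/' is absolute; any other path is relative to
    `cwd`, a tuple of directory names leading down from the root.
    """
    parts = [] if path.startswith("/") else list(cwd)
    for name in path.split("/"):
        if name in ("", "."):
            continue
        if name == "..":
            if parts:
                parts.pop()
        else:
            parts.append(name)
    node = ROOT
    for name in parts:
        node = node[name]  # raises KeyError if the name does not exist
    return node
```

Both users have a file called `notes.txt`, but the full path names `/home/p/notes.txt` and `/home/q/notes.txt` are distinct.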
DISK ORGANIZATION AND DISK STRUCTURE
 The magnetic disk is widely used as the main secondary storage device.
 A magnetic disk drive contains several physical disks.
 Each disk is called a platter. The platters are coated with a special magnetic material.
Platter
•One or more round, flat disks that actually hold the data in the drive. Each platter
has two surfaces (top and bottom) that are capable of holding data.
•Each surface has one read/write head, so each platter has two heads, one above
it and one below it.
•A hard disk with three platters has six surfaces and six heads in total. Normally both
surfaces of each platter are used.
•In some drives, the outer surfaces of the top and bottom platters are not used.
•Platter size determines the form factor.
• Disks are sometimes referred to by a size specification, for example "3.5-inch hard
disk".
• The first PCs used hard disks that had a nominal size of 5.25".
• The most common platter size for desktop hard disks is 3.5".
•Laptop drives are usually smaller. Their platters are usually 2.5" in
diameter or less; 2.5" is the standard form factor, but drives with 1.8" and even 1.0"
platters also exist.
• PC hard disks usually have 1 to 5 platters.
TRACKS AND SECTORS
 Each platter has its information recorded
in concentric circles called tracks.
 Each track is further divided into
smaller pieces called sectors, each of
which typically holds 512 bytes of information.
STORAGE OF DATA IN PLATTERS
 A sector contains a fixed number of bytes -- for example, 256 or 512. Each track
typically holds between 100 and 300 sectors.
 Larger outer tracks hold more sectors than the smaller inner ones.
 All information stored on a hard disk is recorded in tracks.
 The tracks are numbered from zero, starting at the outside of the platter.
 A hard disk has several thousand tracks on each platter.
 Either at the drive or the operating system level, sectors are often grouped
together into clusters.
The same track on each of the platters together forms an imaginary cylinder.
Data is stored cylinder by cylinder.
All tracks of a cylinder are written before the R/W heads move to the next cylinder. This reduces
head movement and increases the speed of read and write operations.
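The cylinder-by-cylinder ordering can be sketched as a mapping from a linear sector number to a (cylinder, head, sector) position. The geometry below is assumed for illustration: six heads, and the same number of sectors on every track, which simplifies away the fact that outer tracks hold more sectors.

```python
HEADS = 6                # surfaces, e.g. three platters with two heads each
SECTORS_PER_TRACK = 100  # assumed constant across tracks for simplicity
BYTES_PER_SECTOR = 512


def locate(sector_number):
    """Map a linear sector number to (cylinder, head, sector), all from 0."""
    track_number, sector = divmod(sector_number, SECTORS_PER_TRACK)
    cylinder, head = divmod(track_number, HEADS)
    return cylinder, head, sector


def capacity_bytes(cylinders):
    """Total capacity of a drive with the given number of cylinders."""
    return cylinders * HEADS * SECTORS_PER_TRACK * BYTES_PER_SECTOR
```

Sectors 0 to 599 all fall in cylinder 0, so they can be read or written by switching heads without moving the arm; only sector 600 requires a move to cylinder 1.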
CONSTRUCTION OF HDD
The components of the Hard Disk
 Disk Platter
 Read/Write head
 Head Arm/ Head Slider
 Head Actuator mechanisms
 Spindle motor
 Bezel
 Cable & connectors
 Logic board
 Air filter
The read/write (R-W) head moves over the rotating disk. It performs all the
read and write operations on the disk, so its position is a major concern.
To read or write a location on the disk, the R-W head must be placed over that
position. Some important terms:
1.Seek time – The time taken by the R-W head to reach the desired track from its current position.
2.Rotational latency – The time taken for the desired sector to come under the R-W head.
3.Data transfer time – The time taken to transfer the required amount of data. It depends on the rotational
speed.
4.Controller time – The processing time taken by the controller.
5.Average access time = seek time + average rotational latency + data transfer time + controller time.
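The access-time formula can be worked through numerically. The sketch below takes the average rotational latency to be half of one rotation and expresses the transfer time through a sustained transfer rate; the parameter names and the sample figures in the note are assumptions for illustration.

```python
def average_rotational_latency_ms(rpm):
    """Half of one full rotation, in milliseconds."""
    return 0.5 * 60_000 / rpm


def transfer_time_ms(num_bytes, rate_mb_per_s):
    """Time to move num_bytes at a sustained rate given in MB/s (10**6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1_000_000) * 1000


def average_access_time_ms(seek_ms, rpm, num_bytes, rate_mb_per_s, controller_ms):
    """Seek + average rotational latency + transfer + controller time."""
    return (seek_ms
            + average_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s)
            + controller_ms)
```

For a hypothetical 7200 rpm drive with a 5 ms seek, a 4096-byte transfer at 100 MB/s and 0.2 ms of controller time, the result is about 9.4 ms, of which roughly 4.2 ms is rotational latency.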
LOGICAL STRUCTURE
File systems are stored on disks. The figure
depicts a possible file-system layout.
•MBR: The Master Boot Record, used to boot the
computer.
•Partition table: Located at the end of the
MBR. It gives the starting and ending addresses
of each partition.
•Boot block: When the computer is booted, the BIOS
reads in and executes the MBR. The first thing the MBR
program does is locate the active partition, read in its
first block, which is called the boot block, and execute
it. The program in the boot block loads the operating
system contained in that partition. Every partition
starts with a boot block, even if it does not
contain a bootable operating system.
•Super block: Contains all the key parameters of
the file system and is read into memory when the
computer is booted or the file system is first touched.
•Free-space management: To keep track of free disk space, the system maintains a free-space list that records
all free blocks.
•I-node: The information about each file in the file system is kept in a data structure called an i-node. There is
one i-node for each file.
•Root directory: The top of the file-system tree.
•Files and directories: All the other files and directories on the disk.
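The free-space list mentioned above can be represented in several ways. The sketch below uses one flag per block (a bitmap-style list), which is a common choice but only one possibility; the class and method names are made up for the example.

```python
class FreeSpaceList:
    """Tracks free disk blocks with one boolean per block."""

    def __init__(self, total_blocks):
        self.free = [True] * total_blocks

    def allocate(self):
        """Claim and return the lowest-numbered free block, or None if full."""
        for block, is_free in enumerate(self.free):
            if is_free:
                self.free[block] = False
                return block
        return None

    def release(self, block):
        """Return a block to the free list, e.g. when a file is deleted."""
        self.free[block] = True

    def count_free(self):
        """Number of blocks currently free."""
        return sum(self.free)
```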
RAID(REDUNDANT ARRAY OF INDEPENDENT DISKS) STRUCTURE OF DISK
 RAID, or “Redundant Array of Independent Disks”, is a technique that uses a combination of
multiple disks instead of a single disk for increased performance, data redundancy or both.
 Data redundancy, although it takes up extra space, adds to disk reliability. If a disk fails
and the same data is also stored on another disk, the data can be retrieved and
operation can continue. On the other hand, if data is simply spread across multiple disks without the RAID technique,
the loss of a single disk can affect all of the data.
Key evaluation points for a RAID system
•Reliability: How many disk faults can the system tolerate?
•Availability: What fraction of the total session time is the system up, i.e. how available is the
system for actual use?
•Performance: How good is the response time? How high is the throughput (rate of processing work)?
•Capacity: Given a set of N disks each with B blocks, how much useful capacity is available to the user?
• RAID is transparent to the host system: it appears as a
single big disk presenting itself as a linear array of blocks. This allows older technologies to be replaced
by RAID without making too many changes to existing code.
•In the figure, blocks “0,1,2,3” form a stripe.
•Instead of placing just one block on a disk at a
time, two (or more) blocks can be
placed on a disk before moving on to the next
one.
RAID-0 (Striping)
•Blocks are “striped” across disks.
Evaluation:
•Reliability: 0
There is no duplication of data. Hence, a block
once lost cannot be recovered.
•Capacity: N*B
The entire space is used to store data. Since
there is no duplication, N disks each having B
blocks are fully utilized.
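Striping amounts to a simple mapping from a logical block number to a disk and a position on that disk. The sketch below assumes round-robin placement; `chunk_size` is a made-up parameter name for the number of consecutive blocks placed on one disk before moving on to the next.

```python
def locate_block(block, num_disks, chunk_size=1):
    """Map a logical block number to (disk, block offset on that disk).

    chunk_size is how many consecutive blocks go to one disk before
    moving on to the next one.
    """
    chunk, within_chunk = divmod(block, chunk_size)
    stripe, disk = divmod(chunk, num_disks)
    return disk, stripe * chunk_size + within_chunk


def capacity(num_disks, blocks_per_disk):
    """RAID-0 stores no redundant data, so every block is usable."""
    return num_disks * blocks_per_disk
```

With 4 disks and a chunk size of 1, blocks 0 to 3 land on disks 0 to 3 and block 4 wraps around to disk 0. With a chunk size of 2, blocks 0 and 1 both go to disk 0 before block 2 moves to disk 1.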
RAID-1 (Mirroring)
More than one copy of each block is stored, each on a separate disk. Thus, every block has
two (or more) copies, lying on different disks.
•RAID 0 cannot tolerate any disk failure, but RAID 1 can.
Evaluation:
Assume a RAID system with mirroring level 2.
•Reliability: 1 to N/2
1 disk failure can be handled for certain, because the blocks of that disk have
duplicates on some other disk. If we are lucky and, say, disks 0 and 2 fail, this can also
be handled, as the blocks of these disks have duplicates on disks 1 and 3. So,
in the best case, N/2 disk failures can be handled.
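The "1 to N/2" range can be checked with a short sketch. It assumes mirroring level 2 with the disks paired as (0, 1), (2, 3) and so on, which matches the example above; the function name is made up.

```python
def survives(failed_disks, num_disks):
    """True if every mirrored pair still has at least one working disk.

    Disks are paired as (0, 1), (2, 3), ... which requires an even num_disks.
    """
    failed = set(failed_disks)
    return all(
        not (first in failed and first + 1 in failed)
        for first in range(0, num_disks, 2)
    )
```

With 4 disks, losing disks 0 and 2 is survivable because each pair keeps one copy, whereas losing disks 0 and 1 destroys both copies of the same blocks.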
RAID 2
•This uses bit-level striping, i.e. instead of
striping blocks across the disks, it stripes
bits across the disks.
•In the diagram, b1, b2, b3 are bits and E1,
E2, E3 are error correction codes.
•Two groups of disks are needed: one group
is used to write the data, the other
to write the error correction
codes.
•When data is read from the disks, the
corresponding ECC is also read from the
redundancy disks and used to check whether the
data is consistent. If required, appropriate
corrections are made.
•This level is no longer used: it is expensive,
and implementing it in a RAID controller is
complex.
RAID 3
•This uses byte-level striping, i.e.
instead of striping blocks across
the disks, it stripes bytes across
the disks.
•In the diagram, B1, B2, B3 are
bytes and p1, p2, p3 are parities.
•Uses multiple data disks and a
dedicated disk to store parity.
•Sequential reads and writes
perform well.
•Random reads and writes
perform poorly.
RAID 4
•This uses block-level striping.
•In the diagram, A, B, C are blocks and p1,
p2, p3 are parities.
•Uses multiple data disks and a dedicated
disk to store parity.
•Requires a minimum of 3 disks (2 for data and 1
for parity).
•Good random reads, as the data blocks are
striped.
•Bad random writes, as every write must also
update the single parity disk.
•Like RAID 3, it has a dedicated parity disk,
but it stripes blocks rather than bytes.
•Like RAID 5, it stripes blocks across the
data disks, but it keeps all parity on one
disk instead of distributing it.
RAID 5
This is a slight modification of RAID 4;
the only difference is that the
parity rotates among the drives.
•Reliability: 1
RAID-5 can recover from at most 1 disk
failure (because of the way parity works). If
more than one disk fails, there is no way to
recover the data. This is identical to RAID-4.
•Capacity: (N-1)*B
Overall, space equivalent to one disk is
used to store the parity. Hence, the equivalent of (N-1)
disks is available for data storage,
each disk having B blocks.
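Why parity allows recovery from exactly one failure can be seen in a short sketch, assuming the parity block is the byte-wise XOR of the data blocks in a stripe. The rotation rule in `parity_disk` is just one simple possibility; real layouts vary, and all names here are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe, num_disks):
    """Disk holding the parity of the given stripe (one simple rotation)."""
    return (num_disks - 1 - stripe) % num_disks


def rebuild(surviving_blocks):
    """Recover the one missing block of a stripe from all the others."""
    return xor_blocks(surviving_blocks)
```

XOR-ing the parity with the surviving data blocks cancels them out and leaves the missing block. With two blocks missing there is a single equation with two unknowns, so the data cannot be recovered.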
RAID 6
•Just like RAID 5, this uses block-level
striping. However, it uses
dual parity.
•In the diagram, A, B, C are
blocks and p1, p2, p3 are parities.
•This creates two parity blocks for
each stripe of data blocks.
•Can handle two disk failures.
•This RAID configuration is
complex to implement in a RAID
controller, as it has to calculate
two sets of parity data for each
stripe.

More Related Content

PDF
File Systems
PPTX
file_concept.pptx file presentation directories
PPTX
Learn about the File Concept in operating systems ppt
PPTX
file_concept.pptx file presentation directories
PPTX
The Operating System concepts.. -os.pptx
PPTX
File management
PPTX
(file systems)12312321321321312312312.pptx
PPTX
operating system notes for file managment.pptx
File Systems
file_concept.pptx file presentation directories
Learn about the File Concept in operating systems ppt
file_concept.pptx file presentation directories
The Operating System concepts.. -os.pptx
File management
(file systems)12312321321321312312312.pptx
operating system notes for file managment.pptx

Similar to Unit 6 OSY.pptx aaaaaaaaaaaaaaaaaaaaaaaa (20)

PPTX
8 File Management system project .pptx
PPTX
File concept and access method
PPTX
File Management & Access Control
PPT
PPT
Unit 3 file management
PDF
Unit ivos - file systems
PDF
Chapter 5
PPTX
Introduction to File System
PPT
Unit 3 chapter 1-file management
PDF
File system in operating system e learning
PPTX
File Management – File Concept, access methods, File types and File Operation
PPTX
Chapter 12.pptx
PPTX
Chapter 3
PPTX
Operating System Unit 4(RTU Syllabus).pptx
PPTX
File Concept.pptx fa s fasfasfasfsfsfasfasfas
PPT
Operating Systems - File Space Allocation
PPT
Operating System - File Management concepts
PPTX
File Management
PPT
File organisation
8 File Management system project .pptx
File concept and access method
File Management & Access Control
Unit 3 file management
Unit ivos - file systems
Chapter 5
Introduction to File System
Unit 3 chapter 1-file management
File system in operating system e learning
File Management – File Concept, access methods, File types and File Operation
Chapter 12.pptx
Chapter 3
Operating System Unit 4(RTU Syllabus).pptx
File Concept.pptx fa s fasfasfasfsfsfasfasfas
Operating Systems - File Space Allocation
Operating System - File Management concepts
File Management
File organisation
Ad

Recently uploaded (20)

PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PPTX
Cell Types and Its function , kingdom of life
PDF
Classroom Observation Tools for Teachers
PPTX
master seminar digital applications in india
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPTX
Pharma ospi slides which help in ospi learning
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PPTX
PPH.pptx obstetrics and gynecology in nursing
PPTX
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PDF
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
PDF
Insiders guide to clinical Medicine.pdf
PPTX
The Healthy Child – Unit II | Child Health Nursing I | B.Sc Nursing 5th Semester
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PDF
Basic Mud Logging Guide for educational purpose
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
Module 4: Burden of Disease Tutorial Slides S2 2025
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
Cell Types and Its function , kingdom of life
Classroom Observation Tools for Teachers
master seminar digital applications in india
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
Pharma ospi slides which help in ospi learning
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PPH.pptx obstetrics and gynecology in nursing
Introduction to Child Health Nursing – Unit I | Child Health Nursing I | B.Sc...
Renaissance Architecture: A Journey from Faith to Humanism
Physiotherapy_for_Respiratory_and_Cardiac_Problems WEBBER.pdf
Insiders guide to clinical Medicine.pdf
The Healthy Child – Unit II | Child Health Nursing I | B.Sc Nursing 5th Semester
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
102 student loan defaulters named and shamed – Is someone you know on the list?
Basic Mud Logging Guide for educational purpose
O5-L3 Freight Transport Ops (International) V1.pdf
Ad

Unit 6 OSY.pptx aaaaaaaaaaaaaaaaaaaaaaaa

  • 1. UNIT 6 -FILE MANAGEMENT DR USHA RAGHAVAN
  • 2. INTRODUCTION  A file is a named collection of related information that is recorded on secondary storage such as magnetic disks, magnetic tapes and optical disks.  A file is a sequence of bits, bytes, lines or records whose meaning is defined by the files creator and user.  Data files may be numeric, alphabetic, alphanumeric or binary
  • 3. FILE STRUCTURE  A File Structure should be according to a required format that the operating system can understand. • A file has a certain defined structure according to its type. • A text file is a sequence of characters organized into lines. • A source file is a sequence of procedures and functions. • An object file is a sequence of bytes organized into blocks that are understandable by the machine. • When operating system defines different file structures, it also contains the code to support these file structure. Unix, MS-DOS support minimum number of file structure.
  • 4. FILE TYPE  File type refers to the ability of the operating system to distinguish different types of file such as text files, source files and binary files etc. Many operating systems support many types of files. Operating system like MS-DOS and UNIX have the following types of files −  Ordinary files • These are the files that contain user information. • These may have text, databases or executable program. • The user can apply various operations on such files like add, modify, delete or even remove the entire file.  Directory files • These files contain list of file names and other information related to these files.
  • 5. FILE TYPE  Special files • These files are also known as device files. • These files represent physical device like disks, terminals, printers, networks, tape drive etc. These files are of two types − • Character special files − data is handled character by character as in case of terminals or printers. • Block special files − data is handled in blocks as in the case of disks and tapes.
  • 6. FILE ATTRIBUTES  A file has a name and data. Moreover, it also stores meta information like file creation date and time, current size, last modified date, etc. All this information is called the attributes of a file system.  File attributes used in OS are: • Name: It is the only information stored in a human-readable form. It is always followed by an extension name. It specifies the type of file . Eg.- OS .doc OS is file name and .doc is extension name. ‘.’ is a separator • Identifier: Every file is identified by a unique tag number within a file system known as an identifier. It is not human readable
  • 7. FILE ATTRIBUTES • Location: Points to file location on device. It is a pointer that points to address the file on storage device • Type: This attribute is required for systems that support various types of files. Type is indicated with file extension • Size. Attribute used to display the current file size.It is the number of bytes occupied by the contents of the file on storage device – Eg. -10 MB • Protection. This attribute assigns and controls the access rights of reading, writing, and executing the file. • Time, date and security: It is used for protection, security, and also used for monitoring. It specifies information about date and time of creation of the file, last modification of file and last use of file. It is useful for protection and security and usage monitoring
  • 8. OPERATIONS ON FILE • Create file, find space on disk, and make an entry in the directory. • Write to file, requires positioning within the file • Read from file involves positioning within the file • Delete directory entry, regain disk space. • Reposition: move read/write position.
  • 9. CREATE A FILE  The create operation creates a file by reserving space on the storage device. It involves 2 steps: 1. Find free space in the file system. 2. Make an entry for the file in its directory.  Creating a file requires giving it a name that is unique within its directory.
  • 10. WRITE INTO A FILE  A system call with 2 parameters is required to write into a file. The first parameter specifies the name of the file and the second specifies the information or data to be written.  Using the file name, the system searches the directory to find the file’s location. A write pointer marks the position in the file where the next write takes place. After every write operation, the pointer must be updated for the next write.
  • 11. READING A FILE  To read a file, a system call with 2 parameters is required: the name of the file and a second, optional parameter that specifies the data to be read from the file.  Using the file name, the system searches the directory for the file, and a read pointer marks the position where the next read takes place. After every read operation, the read pointer is updated for the next read.
  • 12. REPOSITIONING WITHIN THE FILE  The directory is searched for the file’s entry and the current-position pointer is set to a given value.  Repositioning need not involve any actual I/O.  This operation is also called a file seek.
  • 13. DELETING A FILE  To delete a file, the OS first searches the directory for the file’s location. It then releases the storage space allocated to the file.  It also deletes the file’s entry from the directory table.
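The five primitive operations can be tried out through Python's thin wrappers around the operating system's calls. This is a minimal sketch: the file name `notes.txt` and the temporary directory are assumptions made for the example. Note that real systems pass an open-file descriptor to read and write rather than the file name each time.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "notes.txt")

# Create: the OS finds space and adds an entry to the directory.
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)

# Write: the current position advances after every write.
os.write(fd, b"hello ")
os.write(fd, b"world")
position_after_write = os.lseek(fd, 0, os.SEEK_CUR)  # query the position

# Reposition (seek): only the position pointer changes, no data is moved.
os.lseek(fd, 6, os.SEEK_SET)

# Read: starts at the current position and advances it.
data = os.read(fd, 5)

os.close(fd)

# Delete: remove the directory entry so the space can be reused.
os.remove(path)
still_exists = os.path.exists(path)
os.rmdir(workdir)
```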
  • 14. Other common operations include appending new information to the end of a file and renaming an existing file. The primitive operations can be combined to perform other operations, such as creating a copy of a file, moving a file from one location to another, or copying a file to an I/O device such as a printer or display.
  • 15. FILE TYPES  The operating system recognises and supports various file types.  Once it has recognised the type of a file, the OS can operate on it appropriately.  The file type can be given as part of the file name. The name then consists of 2 parts: the name proper and the file extension, separated by a ‘.’ character.  From the extension the OS recognises the type of file, e.g. .doc for a document file, .exe for an executable file, etc.  In MS-DOS, a name consists of up to 8 characters followed by a ‘.’ character and an extension of up to 3 characters.  UNIX uses a magic number stored at the beginning of some files to indicate the type of the file, such as an executable program.
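The two approaches on this slide, type by extension and type by magic number, can be compared in a short sketch. The function names and the small signature table are assumptions made for illustration; the three signatures themselves (ELF, the `#!` interpreter line, and ZIP) are well-known real values.

```python
import os

# A few well-known signatures; the table is illustrative, not exhaustive.
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF executable or object file",
    b"#!": "script with an interpreter line",
    b"PK\x03\x04": "ZIP archive",
}

def type_from_extension(filename):
    """MS-DOS style: trust the part of the name after the last '.'."""
    _, ext = os.path.splitext(filename)
    return ext.lstrip(".").lower() or None

def type_from_magic(first_bytes):
    """UNIX style: compare the start of the contents with known signatures."""
    for magic, description in MAGIC_NUMBERS.items():
        if first_bytes.startswith(magic):
            return description
    return None
```

An extension is only a hint that the user can change by renaming the file, whereas a magic number is part of the file's contents.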
  • 16. Common file types, listed as file type (usual extensions): function • Executable (exe, com, bin or none): ready-to-run machine-language program • Object (obj, o): compiled, machine language, not linked • Source code (c, p, pas, f77, asm, a): source code in various languages • Batch (bat, sh): series of commands to the command interpreter • Text (txt, doc): textual data, documents • Word processor (doc, docx, tex, rtf, etc.): various word-processor formats • Library (lib, h): libraries of routines • Archive (arc, zip, tar): related files grouped into one file, sometimes compressed • Multimedia (mpeg, mp3): binary files containing audio / video information
  • 17. FILE ACCESS METHODS  File access mechanism refers to the manner in which the records of a file may be accessed. There are several ways to access files − • Sequential access • Direct/Random access • Indexed sequential access
  • 18. SEQUENTIAL ACCESS  In sequential access, the records are accessed in some sequence, i.e., the information in the file is processed in order, one record after the other.  This is the most primitive access method. Example: compilers usually access files in this fashion.
  • 19. DIRECT/RANDOM ACCESS • Random access file organization allows records to be accessed directly. • Each record has its own address in the file, by means of which it can be read or written directly. • The records need not be in any sequence within the file and need not be in adjacent locations on the storage medium.
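A common way to give each record its own address is to use fixed-length records, so that record n starts at byte n × record size. The sketch below assumes a 16-byte record and uses an in-memory `io.BytesIO` object to stand in for a disk file; both choices are for illustration only.

```python
import io

RECORD_SIZE = 16  # bytes per record; an assumed fixed length

def write_record(f, n, payload):
    """Write record number n (0-based) without touching any other record."""
    f.seek(n * RECORD_SIZE)
    f.write(payload.ljust(RECORD_SIZE, b"\x00")[:RECORD_SIZE])

def read_record(f, n):
    """Jump straight to record n and read it."""
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\x00")

disk_file = io.BytesIO()
write_record(disk_file, 3, b"record three")  # records can be written in any order
write_record(disk_file, 0, b"record zero")
```

Reading record 3 needs one seek and one read, regardless of how many records come before it.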
  • 20. INDEXED SEQUENTIAL ACCESS • This mechanism is built on top of sequential access. • An index is created for each file, containing pointers to its various blocks. • The index is searched sequentially and the pointer found is used to access the file directly.
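The lookup described above can be sketched with a small in-memory model. The sample records, the two-records-per-block layout, and the choice to index each block by its first key are all assumptions made for this example.

```python
# Records are kept sorted by key and grouped into blocks.
blocks = [
    [(10, "ann"), (20, "bob")],
    [(30, "cat"), (40, "dan")],
    [(50, "eve"), (60, "fay")],
]

# The index holds one entry per block: (first key in the block, block number).
index = [(block[0][0], i) for i, block in enumerate(blocks)]

def lookup(key):
    """Search the index sequentially, then go directly to one block."""
    target = None
    for first_key, block_no in index:
        if first_key <= key:
            target = block_no
        else:
            break
    if target is None:
        return None
    for k, value in blocks[target]:  # short scan inside the chosen block
        if k == key:
            return value
    return None
```

Only the small index is scanned in order; the data blocks before the target block are never read.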
  • 21. FILE ALLOCATION METHODS  The operating system allocates disk space to files. Operating systems use three main methods to do so: • Contiguous Allocation • Linked Allocation • Indexed Allocation
  • 22. CONTIGUOUS ALLOCATION • Each file occupies a contiguous address space on disk. • Assigned disk address is in linear order. • Easy to implement. • External fragmentation is a major issue with this type of allocation technique.
  • 23. A single contiguous set of blocks is allocated to a file at the time of file creation. Thus, this is a pre-allocation strategy, using variable-size portions. The file allocation table needs just a single entry for each file, showing the starting block and the length of the file. This method is best from the point of view of the individual sequential file. Multiple blocks can be read in at a time to improve I/O performance for sequential processing. It is also easy to retrieve a single block. For example, if a file starts at block b, and the i-th block of the file is wanted, its location on secondary storage is simply b+i-1. Disadvantages: • External fragmentation will occur, making it difficult to find contiguous blocks of space of sufficient length. A compaction algorithm will be necessary to free up additional space on disk. • Also, with pre-allocation, it is necessary to declare the size of the file at the time of creation.
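Both the b+i-1 address rule and the external fragmentation problem can be seen in a few lines. This is a minimal sketch: the first-fit search and the six-block free map are assumptions chosen for illustration, not the strategy of any particular file system.

```python
def block_address(start, i):
    """Disk block holding the i-th block (1-based) of a file that starts at `start`."""
    return start + i - 1

def first_fit(free_map, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run = 0
    for pos, is_free in enumerate(free_map):
        run = run + 1 if is_free else 0
        if run == length:
            return pos - length + 1
    return None

# True = free. Four blocks are free in total, but never more than two in a row.
free_map = [True, True, False, True, False, True]
```

With this map, a request for three blocks fails even though four blocks are free. That gap between free space and usable space is external fragmentation.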
  • 24. LINKED ALLOCATION • Each file carries a list of links to disk blocks. • Directory contains link / pointer to first block of a file. • No external fragmentation • Effectively used in sequential access file. • Inefficient in case of direct access file.
  • 25. Allocation is on an individual block basis. Each block contains a pointer to the next block in the chain. Again, the file allocation table needs just a single entry for each file, showing the starting block and the length of the file. Although pre-allocation is possible, it is more common simply to allocate blocks as needed. Any free block can be added to the chain, and the blocks need not be contiguous. A file can always grow as long as a free disk block is available. There is no external fragmentation because only one block at a time is needed; internal fragmentation can occur, but only in the last disk block of the file. Disadvantages: • Internal fragmentation exists in the last disk block of the file. • There is an overhead of maintaining a pointer in every disk block. • If the pointer in any disk block is lost, the file is truncated. • It supports only sequential access to files.
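The chain of pointers can be modelled with a dictionary that maps each disk block to the block that follows it. The block numbers and the `END` marker below are assumptions made for the example.

```python
END = -1  # marker for "no next block" (an assumed convention)

# next_block[b] is the pointer stored in disk block b.
next_block = {9: 16, 16: 1, 1: 10, 10: 25, 25: END}

def blocks_of_file(start):
    """Follow the chain from the starting block recorded in the file table."""
    chain = []
    b = start
    while b != END:
        chain.append(b)
        b = next_block[b]
    return chain

def nth_block(start, i):
    """Reach the i-th block (1-based) by following i-1 pointers."""
    b = start
    for _ in range(i - 1):
        b = next_block[b]
    return b
```

`nth_block` shows why direct access is inefficient here: reaching block i costs i-1 pointer reads, each of which would be a disk access on a real system.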
  • 26. INDEXED ALLOCATION • Provides solutions to the problems of contiguous and linked allocation. • An index block is created that holds all the pointers to a file’s blocks. • Each file has its own index block, which stores the addresses of the disk space occupied by the file. • The directory contains the addresses of the index blocks of files.
  • 27. It addresses many of the problems of contiguous and chained allocation. In this case, the file allocation table contains a separate one-level index for each file; the index has one entry for each block allocated to the file. Allocation may be on the basis of fixed-size blocks or variable-size blocks. Allocation by fixed-size blocks eliminates external fragmentation, whereas allocation by variable-size blocks improves locality. This technique supports both sequential and direct access to the file and is thus the most popular form of file allocation.
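With an index block, finding the i-th block of a file becomes a single table lookup. The sketch below assumes fixed-size blocks; the file name, index block number, and data block numbers are made up for illustration.

```python
# The directory maps a file name to the number of its index block.
directory = {"report.txt": 19}

# Each index block lists the data blocks of one file, in order.
index_blocks = {19: [9, 16, 1, 10, 25]}

def indexed_block_address(name, i):
    """Disk block holding the i-th block (1-based) of the named file."""
    index = index_blocks[directory[name]]
    return index[i - 1]

def all_blocks(name):
    """Sequential access simply walks the index from start to end."""
    return list(index_blocks[directory[name]])
```

The data blocks are scattered as in linked allocation, yet any block is reachable in one step, which is why both sequential and direct access are supported.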
  • 28. DIRECTORY STRUCTURE A file directory is a collection of files. The directory contains information about the files, including attributes, location and ownership. Much of this information, especially that concerned with storage, is managed by the operating system. The directory is itself a file, accessible by various file management routines. The information contained in a device directory is: • Name • Type • Address • Current length • Maximum length • Date last accessed • Date last updated • Owner id
  • 29. The operations performed on a directory are: • Search for a file • Create a file • Delete a file • List a directory • Rename a file • Traverse the file system. The advantages of maintaining directories are: • Efficiency: A file can be located more quickly. • Naming: It is convenient for users, as two users can use the same name for different files or different names for the same file. • Grouping: Files can be grouped logically by their properties, e.g. all Java programs, all games, etc.
  • 30. SINGLE-LEVEL DIRECTORY  A single directory is maintained for all users. • Naming problem: Two files cannot have the same name, even if they belong to different users. • Grouping problem: Users cannot group files according to their needs.
  • 31. Single-level directory system • The single directory is also called the root directory. • In the figure, the single-level directory has 5 files owned by 3 different users P, Q and R: user P has 2 files, user Q has 2 files and user R has 1 file. Advantages: • Simple to implement. • Locating files is very fast. Limitations: • If a single user has a large number of files, it becomes difficult to remember the name of each file. • If more than one user keeps files in the same directory, different users may give the same names to their files, violating the rule that names must be unique. [Figure: root directory with five file entries labelled P, P, Q, Q, R]
  • 32. TWO-LEVEL DIRECTORY  A separate directory is maintained for each user. • Path name: Because there are two levels, every file has a path name that is used to locate it. • Different users can now have files with the same name. • Searching is efficient in this method.
  • 33. Two-level directory systems • A private directory is given to each user. • Files with the same name belonging to different users do not interfere with each other. • When a user attempts to open a file, the system must know which user it is in order to know which directory to search. Advantages: • Solves the name collision problem. • Users are isolated from each other. Limitations: • Users who want to co-operate are hindered, because some systems do not allow access to another user’s files. • It is not convenient for users with a large number of files.
  • 34. TREE-STRUCTURED DIRECTORY  The directory is maintained in the form of a tree. • Searching is efficient and there is also grouping capability. • A file can be named by an absolute or a relative path name. • The tree is the most common directory structure. • Each user can have as many directories as needed, so that files can be grouped together in whatever way is required. • Every file has a unique path name. • All modern file systems use this mechanism.
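Absolute and relative path names can be illustrated with a directory tree built from nested dictionaries. The tree contents and the `resolve` helper are assumptions made for this sketch; it handles neither `.` nor `..`, and a missing name simply raises `KeyError`.

```python
# Directories are dicts, files are strings holding their contents.
root = {
    "home": {
        "p": {"notes.txt": "P's notes"},
        "q": {
            "notes.txt": "Q's notes",
            "src": {"main.c": "int main(void) { return 0; }"},
        },
    }
}

def resolve(path, cwd=()):
    """Resolve an absolute path, or a relative one starting from `cwd`.

    `cwd` is a tuple of directory names leading from the root.
    """
    parts = [p for p in path.split("/") if p]
    if not path.startswith("/"):
        parts = list(cwd) + parts  # relative: prepend the current directory
    node = root
    for name in parts:
        node = node[name]
    return node
```

Both users own a file called `notes.txt`, yet the two absolute path names differ, so every file still has a unique path name.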
  • 35. DISK ORGANIZATION AND DISK STRUCTURE  The magnetic disk is used as the main secondary storage device.  It stores data magnetically.  A single magnetic disk drive contains several physical disks.  Each disk is called a platter. The platters are coated with a special magnetic material.
  • 36. Platter • One or more round, flat disks that actually hold the data in the drive. Each platter has two surfaces (top and bottom) that are capable of holding data. • Each surface has one read/write head, so each platter has two heads, one above and one below. • A hard disk with three platters has six surfaces and six heads in total. Normally both surfaces of each platter are used. • In some drives, however, the outer surfaces of the top and bottom platters are not used. • Platter size determines the form factor. • Disks are sometimes referred to by a size specification, for example "3.5-inch hard disk". • The first PCs used hard disks with a nominal size of 5.25". • Today, by far the most common hard disk platter size is 3.5". • Laptop drives are usually smaller: their platters are usually 2.5" in diameter or less. 2.5" is the standard form factor, but drives with 1.8" and even 1.0" platters are becoming more common. • PCs usually have 1 to 5 platters.
  • 37. TRACKS AND SECTORS  Each platter has its information recorded in concentric circles called tracks.  Each track is further divided into smaller pieces called sectors, each of which typically holds 512 bytes of information.
  • 38. STORAGE OF DATA IN PLATTERS  A sector contains a fixed number of bytes -- for example, 256 or 512. Each track typically holds between 100 and 300 sectors.  Larger outer tracks hold more sectors than the smaller inner ones.  All information stored on a hard disk is recorded in tracks.  The tracks are numbered, starting from zero, starting at the outside of the platter.  A hard disk has several thousand tracks on each platter.  Either at the drive or the operating system level, sectors are often grouped together into clusters.
  • 39. • The same tracks on different platters form an imaginary cylinder-like structure. • Data is stored cylinder by cylinder. • All tracks of a cylinder are written before the R/W head moves to the next cylinder. • This reduces movement of the R/W head and increases the speed of read and write operations.
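Storing data cylinder by cylinder corresponds to the classic mapping between a (cylinder, head, sector) address and a linear block number. The sketch below assumes 6 heads and the same 100 sectors on every track; both figures are illustrative, and the uniform sector count is a simplification of real drives.

```python
HEADS = 6                # surfaces per cylinder (assumed)
SECTORS_PER_TRACK = 100  # assumed identical for every track

def chs_to_lba(c, h, s):
    """Linear block number of cylinder c, head h, sector s (sectors count from 1)."""
    return (c * HEADS + h) * SECTORS_PER_TRACK + (s - 1)

def lba_to_chs(lba):
    """Inverse mapping: recover (cylinder, head, sector) from a block number."""
    c, rest = divmod(lba, HEADS * SECTORS_PER_TRACK)
    h, s0 = divmod(rest, SECTORS_PER_TRACK)
    return c, h, s0 + 1
```

Consecutive block numbers fill every track of cylinder 0 before the first block of cylinder 1, so the heads move only once per 600 blocks in this example.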
  • 40. CONSTRUCTION OF HDD The components of the Hard Disk  Disk Platter  Read/Write head  Head Arm/ Head Slider  Head Actuator mechanisms  Spindle motor  Bezel  Cable & connectors  Logic board  Air filter
  • 41. The Read-Write (R-W) head moves over the rotating hard disk. The R-W head performs all the read and write operations on the disk, so its position is a major concern. To perform a read or write operation on a location, the R-W head must be placed over that position. Some important terms must be noted here: 1. Seek time: The time taken by the R-W head to reach the desired track from its current position. 2. Rotational latency: The time taken by the desired sector to come under the R-W head. 3. Data transfer time: The time taken to transfer the required amount of data. It depends on the rotational speed. 4. Controller time: The processing time taken by the controller. 5. Average access time = average seek time + average rotational latency + data transfer time + controller time.
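The access-time formula can be worked through numerically. On average the desired sector is half a revolution away, so the average rotational latency is half the time of one rotation. The drive parameters used in the test values (7200 RPM, 9 ms seek, 100 MB/s transfer rate, 0.2 ms controller time, a 4096-byte request) are assumptions chosen for illustration.

```python
def average_rotational_latency_ms(rpm):
    """Half of one full rotation, in milliseconds."""
    return 0.5 * 60_000 / rpm

def transfer_time_ms(nbytes, rate_mb_per_s):
    """Time to move nbytes at the given sustained rate (1 MB = 10**6 bytes)."""
    return nbytes / (rate_mb_per_s * 1_000_000) * 1000

def average_access_time_ms(seek_ms, rpm, nbytes, rate_mb_per_s, controller_ms):
    """Seek time + average rotational latency + transfer time + controller time."""
    return (seek_ms
            + average_rotational_latency_ms(rpm)
            + transfer_time_ms(nbytes, rate_mb_per_s)
            + controller_ms)

example_ms = average_access_time_ms(9, 7200, 4096, 100, 0.2)
```

For a small request like this one, the seek time and rotational latency dominate; the transfer itself takes only a few hundredths of a millisecond.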
  • 42. LOGICAL STRUCTURE File systems are stored on disks. The figure depicts a possible file-system layout. • MBR: The Master Boot Record is used to boot the computer. • Partition Table: The partition table is located at the end of the MBR. It gives the starting and ending addresses of each partition. • Boot Block: When the computer is booted, the BIOS reads in and executes the MBR. The first thing the MBR program does is locate the active partition, read in its first block, which is called the boot block, and execute it. The program in the boot block loads the operating system contained in that partition. Every partition starts with a boot block, even if it does not contain a bootable operating system. • Super Block: It contains all the key parameters of the file system and is read into memory when the computer is booted or the file system is first touched.
  • 43. • Free space management: To keep track of free disk space, the system maintains a free-space list that records all free blocks. • I-node: The information about each file in the file system is kept in a data structure called an i-node. There is one i-node for each file. • Root directory: The top of the file-system tree. • Files and directories: The files and directories stored on the disk.
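One simple way to implement a free-space list is a map with one flag per disk block. The class below is a minimal sketch; its name, its methods, and the lowest-numbered-block policy are assumptions, and real file systems typically pack the flags into a bit vector.

```python
class FreeSpaceMap:
    """Tracks which disk blocks are free, one boolean per block."""

    def __init__(self, total_blocks):
        self.free = [True] * total_blocks  # every block starts out free

    def allocate(self):
        """Hand out the lowest-numbered free block and mark it as used."""
        for block, is_free in enumerate(self.free):
            if is_free:
                self.free[block] = False
                return block
        raise OSError("no free blocks left")

    def release(self, block):
        """Return a block to the free list, e.g. when a file is deleted."""
        self.free[block] = True

    def free_count(self):
        return sum(self.free)
```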
  • 44. RAID (REDUNDANT ARRAY OF INDEPENDENT DISKS) STRUCTURE OF DISK  RAID, or “Redundant Array of Independent Disks”, is a technique that uses a combination of multiple disks instead of a single disk for increased performance, data redundancy, or both.  Data redundancy, although it takes up extra space, adds to disk reliability. If a disk fails and the same data is also stored on another disk, we can retrieve the data and carry on with the operation. On the other hand, if the data is simply spread across multiple disks without any redundancy, the loss of a single disk can affect all of the data.
  • 45. Key evaluation points for a RAID System •Reliability: How many disk faults can the system tolerate? •Availability: What fraction of the total session time is a system in uptime mode, i.e. how available is the system for actual use? •Performance: How good is the response time? How high is the throughput (rate of processing work)? •Capacity: Given a set of N disks each with B blocks, how much useful capacity is available to the user? • RAID is very transparent to the underlying system. This means, to the host system, it appears as a single big disk presenting itself as a linear array of blocks. This allows older technologies to be replaced by RAID without making too many changes in the existing code.
  • 46. RAID-0 (Striping) • Blocks are “striped” across disks. • In the figure, blocks “0,1,2,3” form a stripe. • Instead of placing just one block on a disk at a time, we can place two (or more) blocks on a disk before moving on to the next one. Evaluation: • Reliability: 0. There is no duplication of data, so a block once lost cannot be recovered. • Capacity: N*B. The entire space is used to store data. Since there is no duplication, N disks each having B blocks are fully utilized.
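Striping amounts to a small piece of arithmetic that maps a logical block number to a disk and a position on that disk. The function name and the `chunk` parameter (the number of consecutive blocks placed on one disk before moving on) are assumptions made for this sketch.

```python
def raid0_locate(block, n_disks, chunk=1):
    """Return (disk, offset) for a logical block under RAID-0 striping."""
    unit = block // chunk      # which chunk the block belongs to
    disk = unit % n_disks      # chunks are dealt out round-robin
    row = unit // n_disks      # how many full stripes come before it
    offset = row * chunk + block % chunk
    return disk, offset

def raid0_capacity(n_disks, blocks_per_disk):
    """All N*B blocks hold user data."""
    return n_disks * blocks_per_disk
```

With four disks and `chunk=1`, blocks 0 to 3 land at offset 0 of disks 0 to 3 and form the first stripe; block 4 starts the next stripe on disk 0.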
  • 47. RAID-1 (Mirroring) More than one copy of each block is stored, each on a separate disk. Thus, every block has two (or more) copies, lying on different disks. • RAID 0 cannot tolerate any disk failure, but RAID 1 can. Evaluation: Assume a RAID system with mirroring level 2. • Reliability: 1 to N/2. One disk failure can be handled for certain, because the blocks of that disk have duplicates on some other disk. If we are lucky and, say, disks 0 and 2 fail, this can again be handled, as the blocks of these disks have duplicates on disks 1 and 3. So, in the best case, N/2 disk failures can be handled.
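The "1 to N/2" reliability figure can be checked with a short function. It assumes mirroring level 2 with disks paired as (0,1), (2,3), and so on, which matches the example of disks 0 and 2 failing; the function name is hypothetical.

```python
def raid1_survives(failed_disks, n_disks):
    """True if every mirrored pair still has at least one working disk.

    Assumes mirroring level 2 with pairs (0,1), (2,3), ...
    """
    failed = set(failed_disks)
    for pair in range(n_disks // 2):
        if {2 * pair, 2 * pair + 1} <= failed:  # both copies are gone
            return False
    return True
```

Any single failure is survivable, two failures are survivable only if they hit different pairs, and losing both disks of one pair loses data.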
  • 48. RAID 2 • This uses bit-level striping, i.e. instead of striping blocks across the disks, it stripes bits across the disks. • In the above diagram b1, b2, b3 are bits. E1, E2, E3 are error correction codes. • Two groups of disks are needed: one group stores the data, the other stores the error correction codes. • When data is read from the disks, the corresponding ECC is also read from the redundancy disks and used to check whether the data is consistent. If required, appropriate corrections are made. • This level is not used anymore. It is expensive, and implementing it in a RAID controller is complex.
  • 49. RAID 3 • This uses byte-level striping, i.e. instead of striping blocks across the disks, it stripes bytes across the disks. • In the above diagram B1, B2, B3 are bytes. p1, p2, p3 are parities. • Uses multiple data disks and a dedicated disk to store parity. • Sequential reads and writes have good performance. • Random reads and writes have poor performance.
  • 50. RAID 4 • This uses block-level striping. • In the above diagram A, B, C are blocks. p1, p2, p3 are parities. • Uses multiple data disks and a dedicated disk to store parity. • Needs a minimum of 3 disks (2 disks for data and 1 for parity). • Good random reads, as the data blocks are striped. • Bad random writes, as every write also has to write to the single parity disk. • It sits between RAID 3 and RAID 5: • like RAID 3, it has a dedicated parity disk, but it stripes blocks rather than bytes; • like RAID 5, it stripes blocks across the data disks, but it keeps all parity on one disk.
  • 51. RAID 5 This is a slight modification of the RAID-4 system; the only difference is that the parity rotates among the drives. • Reliability: 1. RAID-5 can recover from at most 1 disk failure (because of the way parity works). If more than one disk fails, there is no way to recover the data. This is identical to RAID-4. • Capacity: (N-1)*B. Overall, space equivalent to one disk is used to store the parity. Hence, the equivalent of (N-1) disks, each having B blocks, is available for data storage.
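The reason exactly one failure can be recovered is that the parity block is the bitwise XOR of the data blocks in its stripe, so any single missing block equals the XOR of all the others. The two-byte sample blocks and the particular rotation rule in `parity_disk` are assumptions made for this sketch; real arrays use several different rotation layouts.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def parity_disk(stripe, n_disks):
    """One possible rotation: parity moves one disk to the left per stripe."""
    return (n_disks - 1 - stripe) % n_disks

# One stripe on a 4-disk array: three data blocks plus their parity.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails: rebuild it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
```

If two blocks of the same stripe were lost, the single XOR equation would have two unknowns, which is why a second failure cannot be recovered.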
  • 52. RAID 6 • Just like RAID 5, this does block-level striping. However, it uses dual parity. • In the above diagram A, B, C are blocks. p1, p2, p3 are parities. • It creates two parity blocks for each stripe of data blocks. • It can handle two disk failures. • This RAID configuration is complex to implement in a RAID controller, as it has to calculate two parity values for each stripe.
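The capacity and reliability figures for the different levels can be collected in one place. The RAID 0 and RAID 5 values are the ones given on the slides. The RAID 1 value assumes mirroring level 2, and the RAID 4 and RAID 6 values follow from their one and two disks' worth of parity; these are not stated on the slides. N is the number of disks and B the number of blocks per disk.

```python
def usable_blocks(level, n, b):
    """Blocks available for user data on n disks of b blocks each."""
    if level == 0:
        return n * b            # no redundancy
    if level == 1:
        return n * b // 2       # assumes two copies of every block
    if level in (4, 5):
        return (n - 1) * b      # one disk's worth of parity
    if level == 6:
        return (n - 2) * b      # two disks' worth of parity
    raise ValueError("level not covered by this sketch")

# Disk failures that are always survivable (worst case).
FAULTS_TOLERATED = {0: 0, 1: 1, 4: 1, 5: 1, 6: 2}
```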