5. 5
File Concepts:
A file is a collection of similar records. The file is
treated as a single entity by users and applications
and may be referred to by name. Files have unique
file names and may be created and deleted.
A file is a container for a collection of
information. The file manager provides a
protection mechanism that lets users administer
how processes executing on behalf of different
users can access the information in a file.
Files represent programs and data. Data files
may be numeric, alphabetic, binary or
alphanumeric.
6. 6
A file has a certain defined structure according to its
type.
Text File
Source File
Executable File
Object File.
A text file is a sequence of characters organized into
lines.
A source file is a sequence of subroutines and
functions.
An object file is a sequence of bytes organized into
blocks understandable by the system's linker.
An executable file is a series of code sections that the
loader can bring into memory and execute.
7. 7
File Attributes:
File attributes vary from one operating system to
another. The common file attributes are
Name
Identifier
Type
Location
Size
Protection
Time, date and user identification
The symbolic file name is the only information
kept in human-readable form.
8. 8
Identifier is the unique tag that identifies the
file within the file system.
It is usually a number.
Type information is needed by systems that
support different types of files.
Location information is a pointer to a
device and to the location of the file on that
device.
Protection is a fundamental attribute of the
file: access control information determines who
can read, write, execute and so on.
Time, date and user identification may be kept
for creation, last modification and last use.
9. 9
File Operations:
Basic operations on files are
Creating a file
Writing a file
Reading a file
Deleting a file
Truncating a file
Repositioning within a file
10. 10
Creating a file: Space for the file must first be
found in the file system. An entry for the new file
is then made in the directory. The directory entry
records the name of the file and its location in
the file system.
Writing a file: A system call is used to write to a
file. The call specifies the name of the file and
the information to be written. Using the file name,
the system searches the directory to find the
location of the file.
Deleting a file: The system searches the directory
for the named file. If the directory entry is found,
the system releases all of the file's space and
erases the entry. The freed space can be reused by
other files.
11. 11
Truncating a file: A user may want to erase the
contents of a file but keep its attributes. Rather
than forcing the user to delete the file and then
recreate it, the truncation function leaves all
attributes unchanged except for the file
length, which is reset to zero.
Repositioning within a file: The directory is
searched for the appropriate entry, and the
current file position is set to a given value.
Repositioning within a file does not need to
involve any actual I/O. This file operation is also
known as a file seek.
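The basic operations above can be tried out with a short Python sketch. It uses the standard os and tempfile modules; the file name notes.txt and the function name are made up for illustration, and the comments describe what the operating system does conceptually rather than any particular implementation.

```python
import os
import tempfile


def demo_file_operations():
    # Work in a throwaway directory so nothing else on disk is touched.
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "notes.txt")  # hypothetical file name

        # Create and write: a directory entry is made for the new name.
        with open(path, "wb") as f:
            f.write(b"hello world")

        with open(path, "r+b") as f:
            # Reposition (seek): only the current file position changes.
            f.seek(6)
            tail = f.read()  # read from the new position to the end
            # Truncate: the length changes, the file and its name remain.
            f.truncate(5)
        size_after_truncate = os.path.getsize(path)

        # Delete: the directory entry is removed and the space released.
        os.remove(path)
        still_exists = os.path.exists(path)
    return tail, size_after_truncate, still_exists
```

Calling demo_file_operations() returns the bytes read after the seek, the file size after truncation, and whether the name is still present after deletion.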
12. 12
File Types:
A common technique for implementing file
types is to include the type as part of the file
name. The name is split into two parts: a name
and an extension.
The following table gives each file type with its usual
extension and function.
13. 13
File Organization and Mechanism:
The file organizations are as follows:
The Pile
Sequential access
Direct
Indexed
Indexed Sequential
14. 14
The Pile:
The pile is the least complicated organization: data are
collected in the order in which they arrive. Each record
consists of one burst of data.
Records may have different fields or similar fields.
Record access is by exhaustive search because there is
no structure to the pile file.
Pile files are encountered when data are collected and
stored before processing or when data are not easy to
organize.
This organization is suitable for exhaustive searches and
is easy to update.
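A minimal Python sketch of a pile, assuming records are stored as dictionaries so that each one can carry different fields. The record contents and function names are invented for illustration.

```python
pile = []  # records are kept purely in arrival order


def add_record(**fields):
    # Updating a pile is easy: the new record is simply appended.
    pile.append(dict(fields))


def exhaustive_search(field, value):
    # There is no structure to exploit, so every record is examined.
    return [record for record in pile if record.get(field) == value]


add_record(name="pump", price=120)
add_record(sensor="t1", reading=21.5, unit="C")
add_record(name="valve", price=120, supplier="Acme")
matches = exhaustive_search("price", 120)
```

Here matches holds the two records that have a price field equal to 120; the sensor record is skipped because it has no such field.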
17. 17
Sequential access:
Sequential access is the simplest method.
Information in the file is accessed in order, i.e.
one record after another.
Editors and compilers usually access files in this
fashion.
Normally, read and write operations are performed on
the files.
A read operation reads the next portion of the file
and automatically advances a file pointer, which
tracks the I/O location. A write operation appends to
the end of the file, and such a file can be reset to
the beginning.
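The read, write and reset behaviour can be illustrated with an in-memory file from Python's io module. This is only a sketch: the three record values are placeholders, and one text line stands in for one record.

```python
import io

f = io.StringIO()

# Each write appends at the end of the file and advances the pointer.
for record in ("alpha", "beta", "gamma"):
    f.write(record + "\n")

# Reset the file to the beginning before reading it back.
f.seek(0)

records_read = []
while True:
    line = f.readline()  # reads the next record and advances the pointer
    if line == "":       # an empty string signals end of file
        break
    records_read.append(line.rstrip("\n"))
```

The records come back in exactly the order in which they were written.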
18. 18
Direct Access:
Direct access allows random access to any file block.
This method is based on a disk model of a file.
A file is made up of fixed length logical records.
It allows programs to read and write records rapidly
in no particular order.
Direct access allows arbitrary blocks to be read or
written.
For example, a user may read block 13, then read block
99, then write block 12.
In a direct access file, there is no restriction on the
order in which the file is read or written.
Direct access methods are suitable for searching large
amounts of information with immediate results.
Databases are often of this type.
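Because the logical records have a fixed length, the position of record n can be computed directly as n times the record size. The sketch below assumes a 16-byte record and a 100-record in-memory file; both numbers and the helper names are illustrative choices, not part of the slides.

```python
import io

RECORD_SIZE = 16  # assumed fixed logical record length, in bytes


def write_record(f, n, payload):
    # Jump straight to record n; no earlier record is touched.
    f.seek(n * RECORD_SIZE)
    f.write(payload.ljust(RECORD_SIZE, b"\0")[:RECORD_SIZE])


def read_record(f, n):
    f.seek(n * RECORD_SIZE)
    return f.read(RECORD_SIZE).rstrip(b"\0")


# A file of 100 empty records, held in memory for the demonstration.
disk_file = io.BytesIO(bytes(RECORD_SIZE * 100))

write_record(disk_file, 13, b"thirteen")
write_record(disk_file, 99, b"ninety-nine")

# Read block 13, then read block 99, then write block 12.
first = read_record(disk_file, 13)
second = read_record(disk_file, 99)
write_record(disk_file, 12, b"twelve")
```

The three accesses happen in no particular order, and each costs one seek regardless of where the previous access was.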
19. 19
Indexed File:
Two types of indexes are used.
An exhaustive index contains one entry for every
record in the main file.
The index is itself organized as a sequential file
for ease of searching.
A partial index contains entries only for records
in which the field of interest exists. With records
of variable length, some records will not contain
all fields.
When a new record is added to the main file, all
the index files must be updated.
20. 20
Indexed files are used mostly in applications where
timeliness of information is critical and where data
are rarely processed exhaustively. Examples are
airline reservation systems and inventory control
systems.
21. 21
Indexed Sequential:
It maintains the key characteristic of the sequential
file.
Records are organized in sequence based on the key
field.
A single level of indexing is used in simple indexed
sequential structures.
Each record in the index file consists of two fields: a
key field and pointer into the main file.
To find a specific record, the index is searched to find
the highest key value that is equal to or precedes the
desired key value.
The search continues in the main file at the location
indicated by the pointer.
22. 22
Each record in the main file contains an additional field, not visible to
the application, which is a pointer to the overflow file.
When a new record is inserted into the file, it is added to the overflow file.
The record in the main file that immediately precedes the new record
in logical sequence is updated to contain a pointer to the new record
in the overflow file.
If the immediately preceding record is itself in the overflow file, then
the pointer in that record is updated.
This scheme greatly reduces the time required to access a single record
without sacrificing the sequential nature of the file.
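The lookup and insertion rules above can be sketched in Python. This is a simplified model under stated assumptions: the main and overflow files are Python lists, the index holds every second key (the stride is an arbitrary choice), and inserting a key smaller than the first record is not handled. The class and method names are hypothetical.

```python
from bisect import bisect_right


class IndexedSequentialFile:
    def __init__(self, sorted_records, stride=2):
        # Main file record: [key, value, pointer into overflow or None].
        self.main = [[k, v, None] for k, v in sorted_records]
        self.overflow = []  # same record layout as the main file
        # Single-level index: (key, position) for every stride-th record.
        self.index_ptrs = list(range(0, len(self.main), stride))
        self.index_keys = [self.main[p][0] for p in self.index_ptrs]

    def _start(self, key):
        # Highest index key that is equal to or precedes the desired key.
        i = bisect_right(self.index_keys, key) - 1
        return self.index_ptrs[max(i, 0)]

    def _logical_scan(self, start):
        # Yield records in key order, following overflow pointers.
        for pos in range(start, len(self.main)):
            rec = self.main[pos]
            yield rec
            ptr = rec[2]
            while ptr is not None:
                rec = self.overflow[ptr]
                yield rec
                ptr = rec[2]

    def find(self, key):
        for rec in self._logical_scan(self._start(key)):
            if rec[0] == key:
                return rec[1]
            if rec[0] > key:
                break
        return None

    def insert(self, key, value):
        prev = None
        for rec in self._logical_scan(self._start(key)):
            if rec[0] > key:
                break
            prev = rec  # last record that logically precedes the new one
        if prev is None:
            raise ValueError("key precedes the first record (not handled)")
        # The new record goes to the overflow file; the preceding record,
        # whether in the main or the overflow file, now points to it.
        self.overflow.append([key, value, prev[2]])
        prev[2] = len(self.overflow) - 1


isf = IndexedSequentialFile(
    [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e")]
)
isf.insert(25, "x")
isf.insert(22, "y")
isf.insert(35, "z")
```

After the three inserts the main file is unchanged, the overflow file holds three records, and a lookup such as isf.find(25) starts from the index entry for key 10 and follows the chain 20, 22, 25.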
24. 24
Directory structure
Directories are basically symbol tables of files.
A single flat directory can contain a list of all
files in a system.
A directory contains information about the files,
including attributes, location and ownership.
The operating system manages this information.
25. 25
Operations on Directory:
Create a file: When a new file is created, an
entry must be added to the directory.
Delete a file: When a file is deleted, an entry
must be removed from the directory.
Rename a file: The name of a file must be
changeable when the content or use of the file
changes. Renaming a file may also allow its position
within the directory structure to be changed.
List directory: All or a portion of the directory
may be requested. The request is made by a user and
results in a listing of all files owned by that user
plus some of the attributes of each file.
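Treating the directory as a symbol table, the four operations can be sketched with a Python dictionary that maps each file name to its attributes. The attribute names (owner, location) and the sample files are assumptions made for the example.

```python
directory = {}  # symbol table: file name -> attributes


def create(name, owner, location):
    if name in directory:
        raise FileExistsError(name)
    directory[name] = {"owner": owner, "location": location}  # add entry


def delete(name):
    del directory[name]  # remove the entry


def rename(old_name, new_name):
    if new_name in directory:
        raise FileExistsError(new_name)
    # The attributes are unchanged; only the name they are filed under moves.
    directory[new_name] = directory.pop(old_name)


def list_directory(owner):
    # List the files owned by one user.
    return sorted(n for n, attrs in directory.items() if attrs["owner"] == owner)


create("a.txt", "alice", 10)
create("b.txt", "bob", 20)
rename("a.txt", "notes.txt")
alice_files = list_directory("alice")
delete("b.txt")
```

After these calls the directory contains only notes.txt, which is still owned by alice and still located at block 10.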
26. 26
Different types of directory structures are given
below
Single level directory
Two level directory
Tree structured directory
Single Level Directory:
A single level directory is the simplest directory
structure. All files are contained in the same
directory.
The below diagram shows the single level directory
structure. It is easy to implement and maintain.
27. 27
Disadvantages of single level directory are as
follows:
It is not suitable for a large number of files or for
more than one user.
Because there is a single directory, every file requires
a unique file name.
It is difficult to remember the names of all the
files as the number of files increases.
The MS-DOS operating system allows only 11-character
file names, whereas UNIX allows 255 characters.
28. 28
Two Level Directory:
In a two level directory, each user has a separate
directory.
It is called the user file directory (UFD).
Each user file directory has a similar structure.
The below diagram shows the two level
directory. When a user refers to a particular file,
only his own UFD is searched.
Different users may have files with the same
name, as long as all the file names within each
UFD are unique.
29. 29
To create a file for a user, the operating system
searches only that user's directory to ascertain
whether another file of that name exists.
To delete a file, the operating system confines the
search to the local UFD.
The operating system therefore cannot accidentally
delete another user's file that has the same name.
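A two level directory can be modelled as a dictionary of dictionaries: the outer level selects the user's UFD and the inner level maps file names to locations. The sketch below is illustrative only; the class name, the mfd attribute (short for master file directory, a term not used in these slides) and the sample users are assumptions.

```python
class TwoLevelDirectory:
    def __init__(self):
        self.mfd = {}  # user name -> that user's UFD (name -> location)

    def create(self, user, name, location):
        ufd = self.mfd.setdefault(user, {})
        if name in ufd:  # only this user's UFD is searched
            raise FileExistsError(name)
        ufd[name] = location

    def lookup(self, user, name):
        return self.mfd[user][name]

    def delete(self, user, name):
        del self.mfd[user][name]  # the search is confined to the local UFD


d = TwoLevelDirectory()
d.create("alice", "report.txt", 17)
d.create("bob", "report.txt", 42)  # same name, different UFD: allowed
d.delete("alice", "report.txt")
bob_location = d.lookup("bob", "report.txt")
```

Deleting alice's report.txt leaves bob's file of the same name untouched.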
30. 30
Tree Structured Directories:
The MS-DOS file system is a tree structured directory. It
allows users to create their own subdirectories
and to organize their files accordingly.
A subdirectory contains a set of files or
subdirectories.
A directory is simply another file, but it is treated
in a special way.
All the directories have the same internal format.
One bit in each directory entry defines the entry
as a file (0) or as a subdirectory (1).
Special system calls are used to create and
delete directories.
31. 31
The current directory should contain most of the files
that are of current interest to the user.
When a reference is made to a file, the current
directory is searched. If the file is in another
directory, a path name must be used to search for it
or to perform any operation on it.
32. 32
Path names can be of two types:
Absolute path name
Relative path name
An absolute path name begins at the root and
follows a path down to the specified file, giving
the directory names on the path.
A relative path name defines a path from the
current directory.
MS-DOS will not delete a directory unless it is
empty. For deleting a directory, two approaches
can be taken.
33. 33
The user must first delete all the files in the
directory so that it is empty.
In UNIX, the rm command is used with an option
(such as -r) that deletes a directory together
with its contents.
Advantages:
It allows users to create their own directories.
A user can access the files of other users by path name.
It allows users to define their own search paths.
Disadvantages:
Special system calls are required to create and
delete directories.
A file or directory cannot be shared, i.e. it cannot
appear in more than one directory.
The path to a file is longer than in a two level directory.
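The difference between absolute and relative path names in a tree structured directory can be shown with Python's posixpath module, which manipulates UNIX-style path strings without touching the disk. The resolve helper and the example paths are hypothetical.

```python
import posixpath


def resolve(path, current_directory):
    # An absolute path name starts at the root and is used as given.
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    # A relative path name is interpreted from the current directory.
    return posixpath.normpath(posixpath.join(current_directory, path))


from_root = resolve("/usr/ast/mailbox", "/home")
from_current = resolve("mailbox", "/usr/ast")
via_parent = resolve("../bin/ls", "/usr/ast")
```

The first two calls name the same file, /usr/ast/mailbox, once from the root and once from the current directory; the third reaches /usr/bin/ls by going up one level first.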
35. 35
Directory Implementation:
A directory can be implemented in two ways.
Linear list
Hash Table
Linear List:
A linear list is the simplest method.
It uses a linear list of file names with pointers to the
data blocks.
It is simple to program but time consuming to
execute.
To create a new file, the directory is first searched to
check that no existing file has the same name.
The linear search needed to find a file is the main
disadvantage.
Directory information is used frequently, and users
would notice a slow implementation of access to it.
36. 36
Hash Table:
A hash table decreases the directory search time.
Insertion and deletion are also fairly
straightforward.
The hash table takes a value computed from the
file name
and returns a pointer to the file name in the
linear list.
A drawback is that a hash table generally has a
fixed size.
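The two implementations can be compared in a small Python sketch. The directory contents, the table size of 8, the toy hash function and the use of chaining to handle collisions are all assumptions made for illustration; the slides do not specify them.

```python
# Linear list: (file name, pointer to first data block).
linear_directory = [("a.txt", 7), ("b.txt", 12), ("c.txt", 3)]


def linear_lookup(name):
    # Every entry may have to be compared before the name is found.
    for position, (entry_name, _block) in enumerate(linear_directory):
        if entry_name == name:
            return position
    return None


TABLE_SIZE = 8  # fixed size, chosen arbitrarily


def hash_name(name):
    # Toy hash: sum of the bytes of the name, modulo the table size.
    return sum(name.encode()) % TABLE_SIZE


# Each slot holds positions in the linear list (chaining on collision).
hash_table = [[] for _ in range(TABLE_SIZE)]
for position, (name, _block) in enumerate(linear_directory):
    hash_table[hash_name(name)].append(position)


def hashed_lookup(name):
    # Only the entries in one slot are compared.
    for position in hash_table[hash_name(name)]:
        if linear_directory[position][0] == name:
            return position
    return None
```

Both lookups return the same position in the linear list; the hashed version simply inspects far fewer entries when the directory is large.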
38. 38
Allocation Methods:
A good space allocation strategy must take into
consideration several related and interactive
factors such as
Processing speed of sequential access to files,
random access to files and allocation and
deallocation of blocks
Disk space utilization
Two Types of Allocation
Contiguous Allocation
Linked Allocation
39. 39
Contiguous Allocation:
Contiguous allocation requires each file to occupy a
set of contiguous blocks on the disk.
It suffers from external fragmentation.
Depending on the total amount of disk storage
and the average file size, external fragmentation
may be a minor or a major problem. Compaction is
used to solve the problem of external
fragmentation.
The below diagram shows the contiguous
allocation of disk space after compaction.
40. 40
Characteristics of contiguous file allocation:
It supports variable size portions.
Pre-allocation is required.
It requires only a single entry for a file.
Allocation frequency is only once.
Advantages:
It supports variable size portions.
It is easy to retrieve a single block.
Accessing a file is easy.
It provides good performance.
Disadvantages:
It suffers from external fragmentation.
Pre-allocation is required.
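External fragmentation and compaction can be demonstrated with a toy disk of ten blocks. The sketch assumes a first-fit placement policy, which the slides do not name, and it tracks only which blocks are occupied; a real compaction would also move file data and update directory entries.

```python
def first_fit(disk, length):
    """Return the start of the first run of `length` free blocks, or None."""
    run_start, run_length = None, 0
    for i, used in enumerate(disk):
        if used:
            run_length = 0
        else:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length == length:
                return run_start
    return None


def allocate(disk, length):
    start = first_fit(disk, length)
    if start is not None:
        for i in range(start, start + length):
            disk[i] = True
    return start


def compact(disk):
    # Slide all allocated blocks to the front, leaving one free run.
    used = sum(disk)
    disk[:] = [True] * used + [False] * (len(disk) - used)


disk = [False] * 10        # False = free, True = allocated
a = allocate(disk, 3)      # blocks 0-2
b = allocate(disk, 3)      # blocks 3-5
c = allocate(disk, 3)      # blocks 6-8
for i in range(b, b + 3):  # delete the middle file
    disk[i] = False

# Four blocks are free (3, 4, 5 and 9) but they are not contiguous.
before_compaction = allocate(disk, 4)
compact(disk)
after_compaction = allocate(disk, 4)
```

The request for four blocks fails before compaction even though four blocks are free, and succeeds afterwards.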
41. 41
Linked Allocation
Linked allocation solves the problems of
contiguous allocation.
This allocation is on the basis of an individual
block.
Each block contains a pointer to the next block in
the chain.
To create a new file, simply create a new entry in
the directory. With linked allocation, each
directory entry has a pointer to the first block of
the file.
42. 42
The pointer is initialized to nil to signify an empty file.
The size field is also set to 0. There is no external
fragmentation to worry about because only one block at a
time is needed.
The size of a file does not need to be declared when the
file is created.
A file can continue to grow as long as free blocks are
available. It is never necessary to compact disk space.
43. 43
Characteristics:
It supports fixed size portions.
Pre-allocation is possible.
File allocation table size is one entry for a file.
Allocation frequency is low to high.
Advantages:
There is no external fragmentation
It is never necessary to compact disk space.
Pre-allocation is not required.
Disadvantages:
Files can be accessed only sequentially, so direct
access is not supported efficiently.
Space is required for the pointers.
Reliability is poor: a lost or damaged pointer breaks
the chain.
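A minimal Python sketch of linked allocation follows. As a simplification, the next-block pointers are kept in a separate list rather than inside each block, free blocks are handed out in numerical order, and the class, method and file names are hypothetical.

```python
NIL = -1


class LinkedDisk:
    def __init__(self, block_count):
        self.data = [None] * block_count
        self.next = [NIL] * block_count  # pointer to the next block in a chain
        self.free = list(range(block_count))
        self.directory = {}  # file name -> [first block, size in blocks]

    def create(self, name):
        # A new file is just a directory entry: nil pointer, size 0.
        self.directory[name] = [NIL, 0]

    def append(self, name, payload):
        block = self.free.pop(0)  # any free block will do
        self.data[block] = payload
        self.next[block] = NIL
        entry = self.directory[name]
        if entry[0] == NIL:
            entry[0] = block
        else:
            last = entry[0]
            while self.next[last] != NIL:  # walk to the end of the chain
                last = self.next[last]
            self.next[last] = block
        entry[1] += 1

    def read_block(self, name, n):
        # Reaching block n means following n pointers from the start.
        block = self.directory[name][0]
        for _ in range(n):
            block = self.next[block]
        return self.data[block]


ldisk = LinkedDisk(8)
ldisk.create("a")
ldisk.create("b")
ldisk.append("a", "a0")
ldisk.append("b", "b0")
ldisk.append("a", "a1")
ldisk.append("a", "a2")
third_block_of_a = ldisk.read_block("a", 2)
```

File a ends up in blocks 0, 2 and 3, which are not contiguous, and reading its third block requires following two pointers, which is why direct access is slow.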
46. 46
Free space Management
• Disk space is limited.
• It is necessary to reuse the space freed when files are
deleted for new files.
• To keep track of free disk space, a free space list is
maintained by the system.
• The free space list records all disk blocks that are free.
• Free blocks are those that are not allocated to any file
or directory.
When a new file is created
1. Search the free-space list for the required amount of space.
2. Allocate that space to the new file.
When the file is deleted
1. Its disk space is added to the free-space list.
47. 47
The free space list can be implemented mainly as
1. Bitmap or Bit vector
2. Linked List
3. Grouping
4. Counting
48. 48
1. Bitmap or Bit vector:
A bitmap or bit vector is a series of bits
in which each bit corresponds to a disk block.
A bit can take two values, 0 and 1: 0 indicates that
the block is allocated and 1 indicates a free block.
The given instance of disk blocks on the disk
in Figure 1 (where green blocks are allocated) can be
represented by a bitmap of 16 bits
as: 0000111000000110.
49. 49
Advantages:
Simple to understand.
Finding the first free block is efficient.
It requires scanning the words (groups of 8 bits)
of the bitmap for a non-zero word.
The first free block is then found by scanning for
the first 1 bit in the non-zero word.
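The two-step scan can be sketched in Python using the 16-bit bitmap from the text. The sketch assumes that blocks are numbered from 1 with the leftmost bit as block 1, and that the bitmap length is a multiple of the 8-bit word size.

```python
WORD_BITS = 8  # word size used in the text

bitmap = "0000111000000110"  # 1 = free block, 0 = allocated block


def first_free_block(bits):
    # Split the bitmap into words and treat each as an integer.
    words = [int(bits[i:i + WORD_BITS], 2) for i in range(0, len(bits), WORD_BITS)]
    for word_number, word in enumerate(words):
        if word != 0:  # first non-zero word contains a free block
            # Offset of the first 1 bit, counted from the left of the word.
            offset = WORD_BITS - word.bit_length()
            return word_number * WORD_BITS + offset + 1  # 1-based block number
    return None  # no free block at all


first_free = first_free_block(bitmap)
```

For this bitmap the first word, 00001110, is already non-zero and its first 1 bit is in the fifth position, so the first free block is block 5.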
50. 50
2. Linked List
In this approach, the free disk blocks are linked together
i.e. a free block contains a pointer to the next free block.
The block number of the first free disk block is stored at a
separate location on disk and is also cached in memory.
In Figure-2, the free space list head points to Block 5
which points to Block 6, the next free block and so on.
The last free block would contain a null pointer indicating
the end of free list.
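A linked free list can be sketched as follows. The pointers are held in a dictionary rather than inside the disk blocks, and the set of free blocks (5, 6, 7, 14, 15) is taken from the earlier bitmap example as an assumption, since Figure-2 is not reproduced here. The class name is hypothetical.

```python
class FreeList:
    def __init__(self, free_blocks):
        self.next = {}    # free block -> next free block (None = end of list)
        self.head = None  # block number of the first free block
        for block in reversed(free_blocks):
            self.next[block] = self.head
            self.head = block

    def allocate(self):
        block = self.head
        if block is None:
            raise RuntimeError("no free blocks")
        self.head = self.next.pop(block)  # the head moves to the next free block
        return block

    def release(self, block):
        # A freed block is pushed onto the front of the list.
        self.next[block] = self.head
        self.head = block


free_list = FreeList([5, 6, 7, 14, 15])
head_before = free_list.head
allocated = free_list.allocate()
head_after = free_list.head
```

Allocating takes the block at the head of the list (block 5) and leaves the head pointing at block 6.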
51. 51
3. Grouping
This approach stores the addresses of the free
blocks in the first free block.
The first free block stores the addresses of some,
say n, free blocks.
Of these n blocks, the first n-1 are
actually free and the last block contains the
addresses of the next n free blocks.
An advantage of this approach is that the
addresses of a group of free disk blocks can be
found easily.
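The grouping scheme can be sketched with a dictionary that records which addresses are stored in which block. The free blocks (5, 6, 7, 14, 15) and the group size n = 3 are assumptions chosen for the example, and the function names are hypothetical.

```python
def build_groups(free_blocks, n):
    """Return {block: addresses stored in that block} for the grouping scheme."""
    stored = {}
    holder = free_blocks[0]  # the first free block holds the first group
    rest = free_blocks[1:]
    while True:
        stored[holder] = rest[:n]
        if len(rest) <= n:
            break
        holder = rest[n - 1]  # the last block of the group holds the next group
        rest = rest[n:]
    return stored


def all_free_blocks(stored, head):
    """Recover the full list of free blocks by following the groups."""
    result = [head]
    holder = head
    while stored.get(holder):
        group = stored[holder]
        result.extend(group)
        holder = group[-1]
    return result


groups = build_groups([5, 6, 7, 14, 15], n=3)
recovered = all_free_blocks(groups, head=5)
```

Block 5 stores the addresses 6, 7 and 14; blocks 6 and 7 are simply free, while block 14 stores the address of the next group, which here contains only block 15.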
52. 52
4. Counting
This approach stores the address of the first free
disk block and a number n of free contiguous
disk blocks that follow the first block.
Every entry in the list would contain:
Address of first free disk block
A number n
For example, in Figure-1, the first entry of the free
space list would be: ([Address of Block 5], 2),
because 2 contiguous free blocks follow block 5.
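The counting entries can be derived from the 16-bit bitmap given earlier (0000111000000110, where 1 means free). The sketch assumes blocks are numbered from 1, which matches the Block 5 example, and the function name is hypothetical.

```python
def counting_entries(bits):
    """Return (first free block, number of free blocks that follow it) pairs."""
    entries = []
    i = 0
    while i < len(bits):
        if bits[i] == "1":
            j = i
            # Extend the run while the next block is also free.
            while j + 1 < len(bits) and bits[j + 1] == "1":
                j += 1
            entries.append((i + 1, j - i))  # 1-based block number, followers
            i = j + 1
        else:
            i += 1
    return entries


entries = counting_entries("0000111000000110")
```

The first entry is (5, 2), as in the text: block 5 is free and two contiguous free blocks follow it. The second entry, (14, 1), covers blocks 14 and 15.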