Chapter 4
Memory Management
Memory
•       Memory is an important resource that must be carefully
        managed.
•       What every programmer would like is an infinitely large,
        infinitely fast memory that is also non-volatile (as in,
        memory that does not lose its contents when power is cut).
•       The part of the OS that manages the memory hierarchy is
        called the memory manager.
    –     Its job is to keep track of which parts of memory are in use and which
          parts are not in use.
    –     To allocate memory to processes when they need it.
    –     To deallocate it when they’re done.
    –     To manage swapping between main memory and disk when main memory
          is too small to hold all the processes.
Basic Memory Management
•       Memory management systems can be divided into two
        basic classes:
    –     Those that move processes back and forth between main memory and disk
          during execution (swapping and paging) and
    –     Those that don’t.
•       The latter are simpler, so we will study them first.
•       Later in the chapter we will examine swapping and paging.
•       For now, keep in mind: swapping and paging are largely
        artifacts caused by the lack of sufficient main memory to
        hold all programs and data at once.
•       Btw, we finally “carbon-dated” the book: It’s ancient!!!
    –     “Now Microsoft recommends having at least 128MB for a single-user
          Windows XP system” …no wonder they keep banging on about floppies
          and tape drives!
Monoprogramming without Swapping
               or Paging
•     The simplest possible memory management scheme is to
      run just one program at a time, sharing the memory
      between that program and the OS.
•     Three variations on this theme are shown below:




    Figure 4-1. Three simple ways of organizing memory with an
       operating system and one user process. Other possibilities
                               also exist.
Monoprogramming without Swapping
               or Paging
•    The OS may be at the bottom of memory in RAM (a). Or it
     may be in ROM at the top of memory (b) or the device
     drivers may be at the top of memory in a ROM and the rest
     of the system in RAM down below (c).
Monoprogramming without Swapping
               or Paging
•    The first model was formerly used on mainframes and minicomputers
     but is rarely used any more.
•    The second model is used on some palmtop computers and embedded
     systems.
•    The third model was used by early personal computers (e.g., running
     MS-DOS), where the portion of the system in the ROM is called the
     BIOS.

•    When the system is organised in this way, only one process at a time
     can be running.
•    As soon as the user types a command, the OS copies the requested
     program from disk to memory and executes it.
•    When the process finishes, the OS displays a prompt character and
     waits for a new command.
•    When it receives the command, it loads a new program into memory,
     overwriting the first one.
Multiprogramming with Fixed Partitions
•       Except on very simple embedded systems,
        monoprogramming is hardly used any more.
•       Most modern systems allow multiple processes to run at
        the same time.
•       Having multiple processes running at once means that
        when one process is blocked waiting for I/O to finish,
        another one can use the CPU.
    –     Multiprogramming increases the CPU utilisation.


•       The easiest way to achieve multiprogramming is simply to
        divide memory up into n (possibly unequal) partitions.
•       This partitioning can, for example, be done manually when
        the system is started up.
Multiprogramming with Fixed Partitions
•       When a job arrives, it can be put into the input queue for
        the smallest partition large enough to hold it.
•       Since the partitions are fixed in this scheme, any space in
        a partition not used by a job is wasted while that job runs.
•       In the next figure (a) we see how this system of fixed
        partitions and separate input queues looks.
    –     The disadvantage of sorting the incoming jobs into separate queues
          becomes apparent when the queue for a large partition is empty but the
          queue for a small partition is full, as is the case for partitions 1 & 3 in (a).
•       An alternative organisation is to maintain a single queue as
        in (b).
    –     Whenever a partition becomes free, the job closest to the front of the
          queue that fits in it could be loaded into the empty partition and run.
    –     Since it’s undesirable to waste a large partition on a small job, a different
          strategy is to search the whole input queue whenever a partition becomes
          free and pick the largest job that fits.
Multiprogramming with Fixed Partitions (1)



Figure 4-2. (a) Fixed
memory partitions with
separate input queues
for each partition.
Multiprogramming with Fixed Partitions (2)




   Figure 4-2. (b) Fixed
   memory partitions with
   a single input queue.
Multiprogramming with Fixed Partitions
•       Note that the latter algorithm discriminates against small
        jobs as being unworthy of having a whole partition,
        whereas usually it is desirable to give the smallest jobs
        (often interactive jobs) the best service, not the worst.
    –      One way out is to have at least one small partition around.
          • Such a partition will allow small jobs to run without having to allocate
              a large partition for them.
    –      Another approach is to have a rule stating that a job that is eligible to run
           may not be skipped over more than k times.
          • Each time it’s skipped over, it gets one point. When it has acquired k
              points, it may not be skipped again.

•       This system, with fixed partitions set up by the operator in the morning
        and not changed thereafter, was used by OS/360 on large IBM
        mainframes for many years – it was called MFT (Multiprogramming
        with a Fixed number of Tasks or OS/MFT).
Relocation and Protection
•       Multiprogramming introduces two essential problems that
        must be solved:
    –      Relocation and protection.
•       From the previous two figures it is clear that different jobs
        will be run at different addresses.
    –      When a program is linked (i.e., the main program, user-written procedures,
           and library procedures are combined into a single address space), the
           linker must know at what address the program will begin in memory.
    –      For example, suppose that the first instruction is a call to a procedure at
           absolute address 100 within the binary file produced by the linker.
    –      If this program is loaded in partition 1 (at address 100K), that instruction
           will jump to absolute address 100, which is inside the OS.
    –      What is needed is a call to 100K + 100.
    –      If the program is loaded into partition 2, it must be carried out as a call to
           200K + 100, and so on. This is the relocation problem.
Relocation and Protection
•       A solution for this is to equip the machine with two special
        hardware registers, called the base and limit registers.
    –     When a process is scheduled, the base register is loaded with the address
          of the start of its partition, and the limit register is loaded with the length
          of the partition.
    –     Every memory address generated automatically has the base register
          contents added to it before being sent to memory.
    –     Thus if the base register contains the value 100K, a CALL 100 instruction
          is effectively turned into a CALL 100K + 100 instruction, without the
          instruction itself being modified.
    –     Addresses are also checked against the limit register to make sure that they
          do not attempt to address memory outside the current partition.
    –     The hardware protects the base and limit registers to prevent user
          programs from modifying them.
    –     A disadvantage of this scheme is the need to perform an addition and a
          comparison on every memory reference.
Relocation and Protection
    –     Comparisons can be done fast, but additions are slow due to carry
          propagation time unless special addition circuits are used.


•       The CDC 6600 – the world’s first supercomputer – used
        this scheme.

•       The Intel 8088 CPU used for the original IBM PC used a
        slightly weaker version of this scheme – base registers, but
        no limit registers.

•       Few computers use it now.
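•       To make the base/limit mechanism concrete, here is a minimal
        sketch (not from the book; all names are illustrative) of what
        the hardware does on every memory reference: add the base,
        check against the limit, and fault on a violation.

        #include <stdint.h>
        #include <stdio.h>
        #include <stdlib.h>

        /* Illustrative per-process relocation/protection registers. */
        struct mmu_regs {
            uint32_t base;   /* start of the partition, loaded when the process is scheduled */
            uint32_t limit;  /* length of the partition */
        };

        /* Done (in hardware) on every memory reference. */
        uint32_t translate(const struct mmu_regs *r, uint32_t virt)
        {
            if (virt >= r->limit) {              /* protection check against the limit */
                fprintf(stderr, "protection fault at %u\n", virt);
                exit(1);
            }
            return r->base + virt;               /* relocation: add the base register */
        }

        int main(void)
        {
            struct mmu_regs p1 = { 100 * 1024, 100 * 1024 };  /* partition 1 at 100K */
            /* CALL 100 in the program becomes a reference to 100K + 100. */
            printf("physical address = %u\n", translate(&p1, 100));
            return 0;
        }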
Swapping
•   With a batch system, organising memory into fixed
    partitions is simple and effective.

•   Each job is loaded into a partition when it gets to the head
    of the queue.

•   It stays in memory until it has finished.

•   As long as enough jobs can be kept in memory to keep the
    CPU busy all the time, there is no reason to use anything
    more complicated.
Swapping
•       With timesharing systems or graphics-orientated personal
        computers, the situation is different.

•       Sometimes there is not enough main memory to hold all
        the currently active processes, so excess processes must
        be kept on disk and brought in to run dynamically.

•       Two general approaches to memory management can be
        used, depending on the available hardware:
    –     Swapping (the simplest strategy that consists of bringing in each process in
          its entirety, running it for a while, then putting it back on the disk) and
    –     Virtual memory (which allows programs to run even when they are only
          partially in main memory).
Swapping
•   The operation of a swapping system is shown below:




Figure 4-3. Memory allocation changes as processes come into
memory and leave it. The shaded regions are unused memory.
Swapping
•   Initially, only process A is in memory.
•   Then processes B and C are created or swapped in from disk.
•   In (d) A is swapped out to disk.
•   Then D comes in and B goes out.
•   Finally A comes in again.
•   Since A is now at a different location, addresses contained in it must be
    relocated, either by software when it is swapped in or (more likely) by
    hardware during program execution.
Swapping
•   The main difference between the fixed partitions of the second figure
    (Fig. 4-2) and the variable partitions shown here is that the number,
    location, and size of the partitions vary dynamically in the latter as
    processes come and go, whereas they are fixed in the former.
•   The flexibility of not being tied to a fixed number of partitions that may
    be too large or too small improves memory utilization, but it also
    complicates allocating and deallocating memory, as well as keeping
    track of it.
Swapping
•       When swapping creates multiple holes in memory, it is
        possible to combine them all into one big one by moving all
        the processes downward as far as possible.

•       This technique is known as memory compaction.
    –     It is usually not done because it requires a lot of CPU time.


•       Also, when swapping processes to disk, only the memory
        actually in use should be swapped.

•       It is wasteful to swap the extra memory as well.
•       In Fig 4.4 (a) we see a memory configuration in which
        space for growth has been allocated to two processes.
Swapping
•          If processes can have two growing segments, for example,
       –      the data segment being used as a heap for variables that are dynamically
              allocated and released
       –      and a stack segment for the normal local variables and return addresses,
            an alternative arrangement suggests itself, namely that of
           (b).




    Figure 4-4. (a) Allocating space for a      Figure 4-4. (b) Allocating space for a growing
            growing data segment.                       stack and a growing data segment.
Swapping
•       In (b) we see that each process illustrated has a stack at
        the top of its allocated memory that is growing downward.
    –     And a data segment just beyond the program text that is growing upward.
•       The memory between them can be used for either
        segment.
•       If it runs out, either the process will have to be moved to a
        hole with sufficient space, swapped out of memory until a
        large enough hole can be created, or killed.
Swapping
•       Memory management with Bitmaps
    –     When memory is assigned dynamically, the OS must manage it.
    –     In general terms, there are two ways to keep track of memory usage:
          bitmaps and free lists.
    –     In this section and the next one we will look at these two methods in turn.

    –     With a bitmap, memory is divided up into allocation units, perhaps as
          small as a few words and perhaps as large as several kilobytes.
    –     Corresponding to each allocation unit is a bit in the bitmap, which is 0 if
          the unit is free and 1 if it is occupied (or vice versa).

    –     The next figure shows part of memory and the corresponding bitmap.
Memory Management with Bitmaps
    –      The size of the allocation unit is an important design issue.
    –      The smaller the allocation unit, the larger the bitmap.
    –      However, even with an allocation unit as small as 4 bytes, 32 bits of
           memory will require only 1 bit of the map.
    –      A memory of 32n bits will use n map bits, so the bitmap will take up only
           1/33 of memory.
    –      If the allocation unit is chosen large, the bitmap will be smaller.




                       Figure 4-5. (a) A part of memory with five processes and three
                        holes. The tick marks show the memory allocation units. The
shaded regions (0 in the bitmap) are free. (b) The corresponding bitmap. (c) The same information as a list.
Memory Management with Bitmaps
    –      But, appreciable memory may be wasted in the last unit of the process if
           the process size is not an exact multiple of the allocation unit.
    –      A bitmap provides a simple way to keep track of memory words in a fixed
           amount of memory because the size of the bitmap depends only on the size
           of memory and the size of the allocation unit.
    –      The main problem with it is that when it has been decided to bring a k unit
           process into memory, the memory manager must search the bitmap to find a
           run of k consecutive 0 bits in the map.
    –      And searching a bitmap for a run of a given length is a slow operation.




                       Figure 4-5. (a) A part of memory with five processes and three
                        holes. The tick marks show the memory allocation units. The
shaded regions (0 in the bitmap) are free. (b) The corresponding bitmap. (c) The same information as a list.
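•       As a small illustration (my own sketch, not the book's code) of
        why this search is slow, the loop below scans a packed bitmap
        for a run of k consecutive free units.

        #include <stdint.h>
        #include <stdio.h>

        #define UNITS 64                        /* number of allocation units tracked */
        static uint8_t bitmap[UNITS / 8];       /* one bit per unit: 0 = free, 1 = in use */

        static int unit_in_use(int u) { return (bitmap[u / 8] >> (u % 8)) & 1; }

        /* Find the first run of k consecutive free units; -1 if none exists.
           This linear scan is exactly the slow operation described above. */
        int find_run(int k)
        {
            int run = 0;
            for (int u = 0; u < UNITS; u++) {
                run = unit_in_use(u) ? 0 : run + 1;
                if (run == k)
                    return u - k + 1;           /* index of the first unit in the run */
            }
            return -1;
        }

        int main(void)
        {
            bitmap[0] = 0x0F;                   /* mark units 0-3 as allocated */
            printf("4 free units start at unit %d\n", find_run(4));
            return 0;
        }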
Memory Management with Linked Lists
    –      Another way of keeping track of memory is to maintain a linked list of
           allocated and free memory segments, where a segment is either a process
           or a hole between two processes.
    –      The memory of (a) is represented in (c) as a linked list of segments.
    –      Each entry in the list specifies a hole (H) or process (P), the address at
           which it starts, the length, and a pointer to the next entry.
    –      In this example, the segment list is kept sorted by address.
    –      Sorting this way has the advantage that when a process terminates or is
           swapped out, updating the list is straightforward.




                       Figure 4-5. (a) A part of memory with five processes and three
                        holes. The tick marks show the memory allocation units. The
shaded regions (0 in the bitmap) are free. (b) The corresponding bitmap. (c) The same information as a list.
Memory Management with Linked Lists
  –   A terminating process normally has two neighbours (except when it is at
      the very top or very bottom of memory).
  –   These may be either processes or holes, leading to the four combinations
      shown below.
  –   In (a) updating the list requires replacing a P by an H.
  –   In (b) and also in (c), two entries are coalesced into one, and the list
      becomes one entry shorter.
  –   In (d), three entries are merged and two items are removed from the list.




       Figure 4-6. Four neighbor combinations for the terminating process, X.
Memory Management with Linked Lists
  –   Since the process table slot for the terminating process will normally point
      to the list entry for the process itself, it may be more convenient to have
      the list as a doubly linked list, rather than the singly linked list of Fig 4.5
      (c).
  –   This structure makes it easier to find the previous entry and to see if a
      merge is possible.




       Figure 4-6. Four neighbor combinations for the terminating process, X.
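•       The merge step just described can be sketched as follows
        (illustrative only; the record layout and names are mine, not
        MINIX's), assuming the doubly linked segment list mentioned above.

        #include <stdlib.h>

        /* Illustrative segment record: a process (P) or a hole (H). */
        struct seg {
            int is_hole;                    /* 1 = hole, 0 = process */
            unsigned start, length;
            struct seg *prev, *next;
        };

        /* Terminating process x becomes a hole, then is coalesced with any
           neighbouring holes (the four cases of Fig. 4-6). */
        void free_segment(struct seg *x)
        {
            x->is_hole = 1;
            if (x->next && x->next->is_hole) {      /* case: hole after x */
                struct seg *n = x->next;
                x->length += n->length;
                x->next = n->next;
                if (n->next) n->next->prev = x;
                free(n);
            }
            if (x->prev && x->prev->is_hole) {      /* case: hole before x */
                struct seg *p = x->prev;
                p->length += x->length;
                p->next = x->next;
                if (x->next) x->next->prev = p;
                free(x);
            }
        }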
Memory Management with Linked Lists
•       When the processes and holes are kept on a list sorted by
        address, several algorithms can be used to allocate
        memory for a newly created process (or an existing
        process being swapped in from disk).
•       We assume that the memory manager knows how much
        memory to allocate.
•       The simplest algorithm is first fit.
    –     The memory manager scans along the list of segments until it finds a hole
          that is big enough.
          that is big enough.
    –     The hole is then broken up into two pieces, one for the process and one for
          the unused memory, except in the statistically unlikely case of an exact fit.
    –     First fit is a fast algorithm because it searches as little as possible.
Memory Management with Linked Lists
•       Another well-known algorithm is next fit.
    –     It works the same way as first fit, except that it keeps track of where it is
          whenever it finds a suitable hole.
    –     The next time it is called to find a hole, it starts searching the list from the
          place where it left off last time, instead of always beginning, as first fit
          does.
    –     Simulations by Bays (1977) show that next fit gives slightly worse
          performance than first fit.
•       Then, best fit.
    –     Best fit searches the entire list and takes the smallest hole that is adequate.
    –     Rather than breaking up a big hole that might be needed later, best fit tries
          to find a hole that is close to the actual size needed.
    –     Best fit is slower than first fit because it must search the entire list every
          time it is called.
    –     Somewhat surprisingly, it also results in more wasted memory than first fit
          or next fit because it tends to fill up memory with tiny, useless holes (first
          fit creates larger holes on average).
Memory Management with Linked Lists
•       Then there’s worst fit.
    –      To get around the problem of breaking up nearly exact matches into a process and
           a tiny hole, one could think about worst fit, that is, always take the largest available
           hole, so that the hole broken off will be big enough to be useful.
    –      Simulation has shown that worst fit is not a very good idea either.
•       Then, quick fit.
    –      Quick fit maintains separate lists for some of the more common sizes requested.
    –      For example, it might have a table with n entries, in which the first entry is a
           pointer to the head of a list of 4-KB holes, the second entry is a pointer to a list of
           8-KB holes, the third entry a pointer to 12-KB holes, and so on.
    –      Holes of, say, 21 KB could either be put on the 20-KB list or on a special list of
           odd-sized holes.
    –      With quick fit, finding a hole of the required size is extremely fast, but it has the
           same disadvantage as all the other schemes that sort by hole size, namely: when a
           process terminates or is swapped out, finding its neighbours to see if a merge is
           possible, is expensive.
    –      If merging is not done, memory will quickly fragment into a large number of small
           holes into which no processes fit.
Memory Allocation Algorithms
•    First fit
     Use first hole big enough
•    Next fit
     Use next hole big enough
•    Best fit
     Search list for smallest hole big enough
•    Worst fit
     Search list for largest hole available
•    Quick fit
     Separate lists of commonly requested sizes
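•    As an illustration of the first entry in the summary above, here is
     a hedged first-fit sketch over a singly linked segment list (best
     fit would differ only in scanning the whole list for the smallest
     adequate hole). All names are illustrative.

     #include <stdlib.h>

     struct seg {                        /* a process (P) or a hole (H) */
         int is_hole;
         unsigned start, length;
         struct seg *next;
     };

     /* First fit: take the first hole big enough, splitting off the rest. */
     struct seg *first_fit(struct seg *head, unsigned need)
     {
         for (struct seg *s = head; s != NULL; s = s->next) {
             if (!s->is_hole || s->length < need)
                 continue;
             if (s->length > need) {             /* split: the leftover stays a hole */
                 struct seg *rest = malloc(sizeof *rest);
                 rest->is_hole = 1;
                 rest->start = s->start + need;
                 rest->length = s->length - need;
                 rest->next = s->next;
                 s->next = rest;
             }
             s->is_hole = 0;                     /* this piece now holds the process */
             s->length = need;
             return s;
         }
         return NULL;                            /* no hole big enough */
     }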
Virtual Memory
•   Many years ago people were first confronted with
    programs that were too big to fit in the available memory.
•   The solution usually adopted was to split the program into
    pieces, called overlays.
•   Overlay 0 would start running first.
•   When it was done, it would call another overlay.
•   Some overlay systems were highly complex, allowing
    multiple overlays in memory at once.
•   The overlays were kept on the disk and swapped in and
    out of memory by the OS, dynamically, as needed.
•   Although the actual work of swapping overlays was done
    by the system, the decision of how to split the program into
    pieces had to be done by the programmer.
Virtual Memory
•       Splitting up large programs into small, modular pieces was
        time consuming and boring.
•       It did not take long before someone thought of a way to
        turn the whole job over to the computer:
    –     This method: virtual memory.
•       The basic idea behind virtual memory:
    –     The combined size of the program, data, and stack may exceed the amount
          of physical memory available for it.
    –     The OS keeps those parts of the program currently in use in main memory,
          and the rest on the disk.
    –     For example, a 512MB program can run on a 256MB machine by
          carefully choosing which 256MB to keep in memory at each instant, with
          pieces of the program being swapped between disk and memory as
          needed.
Paging
•       Most virtual memory systems use a technique called
        paging, which we will now describe.
•       On any computer, there exists a set of memory addresses
        that programs can produce.
•       When a program uses an instruction like:
    –     MOV REG, 1000
•       It does this to copy the contents of memory address 1000
        to REG.
•       Addresses can be generated using indexing, base
        registers, segment registers, etc.
Paging
–   These program-generated addresses are called virtual addresses and form
    the virtual address space.
–   On computers without virtual memory, the virtual address is put directly
    onto the memory bus and causes the physical memory word with the same
    address to be read or written.
–   When virtual memory is used, the virtual addresses do not directly go to
    the memory bus.
–   Instead, they go to an MMU (Mem. Management Unit) that maps the
    virtual addresses onto the physical memory addresses:




              Figure 4-7. The position and function of the MMU. Here the MMU
              is shown as being a part of the CPU chip because it commonly is
                nowadays. However, logically it could be a separate chip and
                                    was in years gone by.
Paging (2)
•   An example of how this mapping works is shown on the
    RHS.

•   Here we have a computer that can generate 16-bit
    addresses, from 0 up to 64-K.

•   These are the virtual addresses.

•   The computer, however, only has 32-KB of physical
    memory, so although 64-KB programs can be
    written, they cannot be loaded into memory in their
    entirety and run.

•   A complete copy of a program’s memory image, up
    to 64-KB, must be present on the disk, however, so
    that pieces can be brought in as needed.
                                                          Figure 4-8. The relation between
                                                          virtual addresses and physical
                                                          memory addresses is given by
                                                          the page table.
Paging (2)
•   The virtual address space is divided up into units
    called pages.

•   The corresponding units in the physical memory are
    called “page frames”.

•   The pages and page frames are always the same
    size.

•   In this example they are 4-KB, but page sizes from
    512 bytes to 1 MB have been used in real systems.

•   With 64KB of virtual address space and 32KB of
    physical memory, we get 16 virtual pages and 8
    page frames.

•   Transfers between RAM and disk are always in units   Figure 4-8. The relation between
    of a page.                                           virtual addresses and physical
                                                         memory addresses is given by
                                                         the page table.
Paging (2)
•   When the program tries to access address 0, for
    example, using the instruction
        MOV REG, 0
    virtual address 0 is sent to the MMU.

•   The MMU sees that this virtual address falls in page
    0 (0 – 4095), which according to its mapping is page
    frame 2 (8192 to 12287)

•   It thus transforms the address to 8192 and outputs
    address 8192 onto the bus.

•   The memory knows nothing at all about the MMU
    and just sees a request for reading or writing
    address 8192, which it honours.

•   Thus, the MMU has effectively mapped all virtual       Figure 4-8. The relation between
    addresses between 0 and 4095 onto physical             virtual addresses and physical
    addresses 8192 to 12287.                               memory addresses is given by
                                                           the page table.
Paging (2)
•       By itself, the ability to map the 16 virtual pages onto
        any of the 8 page frames by setting the MMU’s map
        appropriately does not solve the problem that the
        virtual address space is larger than the physical
        memory.

•       Since we have only 8 physical page frames, only 8
        of the virtual pages in the figure are mapped onto
        physical memory.

•       In the RHS figure, we see an example of a virtual
        address 8196 (0010000000000100) being mapped
        using the MMU map of the previous figure.
    –       The incoming 16-bit virtual address is split into a 4-bit page
            number and a 12-bit offset.
    –       With 4 bits for the page number, we can have 16 pages
    –       And with 12 bits for the offset, we can address all 4096
            bytes within a page.
                                                                             Figure 4-9. The internal
•       The page number is used as an index into the page
        table, yielding the number of the page frame                         operation of the MMU
        corresponding to that virtual page.                                  with 16 4-KB pages.
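•       A minimal sketch of the MMU operation just described, with 4-KB
        pages and 16-bit virtual addresses; the table below is assumed
        to follow the mapping of Fig. 4-8 (page 0 -> frame 2, page 2 ->
        frame 6, and so on), and the code is illustrative, not real MMU
        hardware.

        #include <stdint.h>
        #include <stdio.h>

        #define PAGE_SHIFT 12                       /* 4-KB pages -> 12-bit offset */

        /* Illustrative 16-entry page table; -1 marks a page not in memory. */
        static int page_table[16] = {  2,  1,  6,  0,  4,  3, -1, -1,
                                      -1,  5, -1,  7, -1, -1, -1, -1 };

        /* Translate a 16-bit virtual address into a physical address. */
        int translate(uint16_t virt)
        {
            unsigned page   = virt >> PAGE_SHIFT;               /* top 4 bits  */
            unsigned offset = virt & ((1u << PAGE_SHIFT) - 1);  /* low 12 bits */
            if (page_table[page] < 0)
                return -1;                                       /* page fault */
            return (page_table[page] << PAGE_SHIFT) | offset;
        }

        int main(void)
        {
            printf("virtual 0    -> physical %d\n", translate(0));     /* 8192  */
            printf("virtual 8196 -> physical %d\n", translate(8196));  /* 24580 if page 2 -> frame 6 */
            return 0;
        }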
Page Tables

•   Purpose : map virtual pages onto page
    frames

•   Major issues to be faced
    1. The page table can be extremely large
    2. The mapping must be fast.
Multilevel Page Tables
•   To get around the problem of having to store huge page
    tables in memory all the time, many computers use a
    multilevel page table.
•   A simple example is shown:


     Figure 4-10. (a) A 32-bit
     address with two page table
     fields. (b) Two-level page
     tables.
Multilevel Page Tables
•       In (a) we have a 32-bit virtual address that is
        partitioned into a 10-bit PT1 field, a 10-bit PT2 field,
        and a 12-bit Offset field.
•       Since offsets are 12 bits, pages are 4KB, and there
        are a total of 2^20 of them.
•       The secret to the multilevel page table method is to
        avoid keeping all the page tables in memory all the
        time.
•       In particular, those that are not needed should not
        be kept around.
•       In (b) we see how the two-level page table works.
    –       On the left we have the top-level page table, with 1024
            entries, corresponding to the 10-bit PT1 field.
    –       When a virtual address is presented to the MMU, it first
            extracts the PT1 field and uses this value as an index into the
            top-level page table.
    –       Each of these 1024 entries represents 4M because the entire
            4-gigabyte virtual address space has been chopped into 1024
            chunks of 4 MB each.                                              Figure 4-10. (a) A 32-bit
•       The entry located by indexing into the top-level page                 address with two page table
        table yields the address or page frame number of a                   fields. (b) Two-level page
        second-level page table.                                              tables.
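•       A hedged sketch of the two-level lookup just described (10-bit
        PT1, 10-bit PT2, 12-bit offset). Structure and names are
        illustrative; second-level tables are only allocated when that
        4-MB chunk of the address space is actually used.

        #include <stdint.h>
        #include <stdlib.h>

        #define ENTRIES 1024                    /* 10 bits of index at each level */

        /* Top-level page table: each slot points to a second-level table of
           1024 frame numbers, or is NULL if that chunk has never been touched. */
        static uint32_t *top_level[ENTRIES];

        /* Returns the physical address, or -1 to signal a (simulated) fault. */
        long translate(uint32_t virt)
        {
            uint32_t pt1    = (virt >> 22) & 0x3FF;   /* top 10 bits  */
            uint32_t pt2    = (virt >> 12) & 0x3FF;   /* next 10 bits */
            uint32_t offset =  virt        & 0xFFF;   /* low 12 bits  */

            if (top_level[pt1] == NULL)
                return -1;                            /* second-level table not present */
            return ((long)top_level[pt1][pt2] << 12) | offset;
        }

        int main(void)
        {
            top_level[0] = calloc(ENTRIES, sizeof(uint32_t));
            top_level[0][1] = 5;                      /* map virtual page 1 to frame 5 */
            return translate(0x00001004) == ((5L << 12) | 4) ? 0 : 1;
        }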
Structure of a Page Table Entry
•       The exact layout of a page table entry is highly machine dependent, but the
        kind of information present is roughly the same from machine to machine.
•       The figure below shows a sample page entry.
•       The size varies from computer to computer, but 32 bits is a common size.
•       The most important field is the page frame number.
    –      The goal of the page mapping is to locate this value.
•       Next to it we have the present/absent bit.
    –      If this bit is 1, the entry is valid and can be used.
    –      If it is 0, the virtual page to which the entry belongs is not currently in memory.
    –      Accessing a page table entry with this bit set to 0 causes a page fault.
•       The protection bit tells what kinds of access are permitted.




                            Figure 4-11. A typical page table entry.
Structure of a Page Table Entry
•       In the simplest form, the protection bit is 0 for read/write and 1 for read
        only.
•       A more sophisticated arrangement is having 3 independent bits, one
        bit each for individually enabling reading, writing and executing the
        page.
•       The modified and referenced bits keep track of page usage.
    –      When a page is written to, the hardware automatically sets the modified bit.
    –      This bit is used when the OS decides to reclaim a page frame.
    –      If the page in it has been modified (i.e. is “dirty”), it must be written back to the
           disk.
    –      If it has not been modified (i.e. is “clean”), it can just be abandoned, since the disk
           copy is still valid.
    –      The bit is sometimes called the “dirty bit”, since it reflects the page’s state.
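•       One way to pack the fields of Fig. 4-11 into a 32-bit entry is
        sketched below; the bit positions are made up for illustration
        and do not correspond to any particular machine.

        #include <stdint.h>

        /* Illustrative 32-bit page table entry with the fields discussed above. */
        #define PTE_PRESENT     (1u << 0)   /* 1 = entry valid, 0 = page fault on access */
        #define PTE_PROT_RO     (1u << 1)   /* 1 = read only, 0 = read/write */
        #define PTE_MODIFIED    (1u << 2)   /* the "dirty" bit, set by hardware on a write */
        #define PTE_REFERENCED  (1u << 3)   /* set by hardware whenever the page is used */
        #define PTE_FRAME_SHIFT 12          /* page frame number kept in the high bits */

        static inline uint32_t pte_frame(uint32_t pte) { return pte >> PTE_FRAME_SHIFT; }

        static inline uint32_t make_pte(uint32_t frame, uint32_t flags)
        {
            return (frame << PTE_FRAME_SHIFT) | flags | PTE_PRESENT;
        }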
TLBs—Translation Lookaside Buffers
•        In most paging schemes, the page tables are kept in memory, due to
         their large size.
•        Potentially, this design has an enormous impact on performance.
•        The solution is to equip computers with a small hardware device for
         rapidly mapping virtual addresses to physical addresses without going
         through the page table.
     –      This device, called the TLB, or associative memory, is shown below:




                            Figure 4-12. A TLB to speed up paging.
TLBs—Translation Lookaside Buffers
•     It’s usually inside the MMU and consists of a small number of entries,
      eight in this case, but rarely more than 64.
•     Each entry contains information about one page, including the virtual
      page number, a bit that is set when the page is modified, the
      protection code (read/write/execute permissions), and the physical
      page frame in which the page is located.
•     These fields have a one-to-one correspondence with the fields in the
      page table.




                       Figure 4-12. A TLB to speed up paging.
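•     A hedged sketch of the lookup order: consult the TLB first and
      fall back to the page table only on a miss, refilling one TLB
      entry. The entry format and the replacement choice are
      illustrative, and a real TLB checks all entries in parallel.

      #include <stdint.h>

      #define TLB_SIZE 8                          /* small, as in Fig. 4-12 */

      struct tlb_entry {
          int      valid;
          uint32_t vpage;                         /* virtual page number */
          uint32_t frame;                         /* physical page frame */
      };

      static struct tlb_entry tlb[TLB_SIZE];
      static uint32_t page_table[1 << 20];        /* assumed already filled in */

      uint32_t translate(uint32_t virt)
      {
          uint32_t vpage  = virt >> 12;
          uint32_t offset = virt & 0xFFF;

          for (int i = 0; i < TLB_SIZE; i++)      /* TLB hit: no page table access */
              if (tlb[i].valid && tlb[i].vpage == vpage)
                  return (tlb[i].frame << 12) | offset;

          /* TLB miss: do the full page table walk, then cache the mapping. */
          uint32_t frame = page_table[vpage];
          tlb[vpage % TLB_SIZE] = (struct tlb_entry){ 1, vpage, frame };
          return (frame << 12) | offset;
      }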
Inverted Page Tables
•       Traditional page tables, like the one described, require one entry per
        virtual page, since they are indexed by virtual page number.
•       If the address space consists of 2^32 bytes, with 4096 bytes per page,
        then over 1 million page table entries are needed.
•       As a bare minimum, the page table will have a size of 4 MB (doable).
•       On 64-bit computers, this situation changes drastically.
    –      If the address space is 2^64 bytes, with 4KB pages, we need a page table with
           2^52 entries.




             Figure 4-13. Comparison of a traditional page table with an inverted page table.
Inverted Page Tables
•       If each page entry is 8 bytes, the table is over 30 million
        gigabytes.

•       Consequently, a different solution is needed for 64-bit
        paged virtual address spaces
    –     One such solution is the inverted page table.




            Figure 4-13. Comparison of a traditional page table with an inverted page table.
Inverted Page Tables
•   The inverted page table (IPT) is best thought of as an off-
    chip extension of the TLB which uses normal system
    RAM. Unlike a true page table, it is not necessarily able to
    hold all current mappings. The OS must be prepared to
    handle misses, just as it would with a MIPS-style software-
    filled TLB.

•   The IPT combines a page table and a frame table into one
    data structure. At its core is a fixed-size table with the
    number of rows equal to the number of frames in memory.
    If there are 4000 frames, the inverted page table has 4000
    rows. For each row there is an entry for the virtual page
    number (VPN), the physical page number (not the physical
    address), some other data and a means for creating a
    collision chain, as we will see later.
Inverted Page Tables
•       To search through all entries of the core IPT structure is
        inefficient, so we use a hash table mapping virtual
        addresses (and address space/PID information if need be)
        to an index in the IPT - this is where the collision chain is
        used.

•       This hash table is known as a hash anchor table.
    –     The hashing function is not generally optimized for coverage - raw speed
          is more desirable.
    –     Of course, hash tables experience collisions.
    –     Because the hashing function is chosen for speed rather than coverage, we
          may experience many collisions in practice, so each entry in the table stores
          the VPN, which lets us check whether it is the entry being searched for or a
          collision.
Inverted Page Tables
•       In searching for a mapping, the hash anchor table is used.
        If no entry exists, a page fault occurs.
    –     Otherwise, the entry is found.


•       Depending on the architecture, the entry may be placed in
        the TLB again and the memory reference is restarted, or
        the collision chain may be followed until it has been
        exhausted and a page fault occurs.

•       A virtual address in this scheme is split into two parts,
    –     the first part being the virtual page number and the second part being the
          offset within that page.
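•       A minimal sketch of the hash anchor table plus collision chain
        just described; the structure and field names are mine, and a
        real implementation also records protection and other per-page
        data in each row.

        #include <stdint.h>

        #define FRAMES  4000                /* one IPT row per physical page frame */
        #define ANCHORS 1024                /* size of the hash anchor table */

        struct ipt_entry {
            uint32_t vpn;                   /* virtual page number held in this frame */
            int      pid;                   /* owning process */
            int      next;                  /* next frame on the collision chain, -1 = end */
        };

        static struct ipt_entry ipt[FRAMES];
        static int anchor[ANCHORS];         /* every slot initialised to -1 at startup */

        static unsigned hash(uint32_t vpn, int pid) { return (vpn ^ (unsigned)pid) % ANCHORS; }

        /* Returns the frame holding (pid, vpn), or -1: a miss that the OS must
           handle, for example by taking a page fault. */
        int ipt_lookup(int pid, uint32_t vpn)
        {
            for (int f = anchor[hash(vpn, pid)]; f != -1; f = ipt[f].next)
                if (ipt[f].vpn == vpn && ipt[f].pid == pid)
                    return f;
            return -1;
        }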
Inverted Page Tables
•   A major problem with this design is poor cache locality
    caused by the hash function.

•   Tree-based designs avoid this by placing the page table
    entries for adjacent pages in adjacent locations, but an
    inverted page table destroys spatial locality of reference
    by scattering entries all over.

•   An operating system may minimise the size of the hash
    table to reduce this problem, with the tradeoff being an
    increased miss rate.
Page Replacement Algorithms

•   Optimal replacement
•   Not recently used (NRU) replacement
•   First-in, first-out (FIFO) replacement
•   Second chance replacement
•   Clock page replacement
•   Least recently used (LRU) replacement
Page Replacement Algorithms
•       Page replacement algorithms decide which memory pages
        to page out (swap out, write to disk) when a page of
        memory needs to be allocated.

•       Paging happens when a page fault occurs and a free page
        cannot be used to satisfy the allocation, either because
        there are none, or because the number of free pages is
        lower than some threshold.

•       When the page that was selected for replacement and
        paged out is referenced again it has to be paged in (read
        in from disk), and this involves waiting for I/O completion.
    –     This determines the quality of the page replacement algorithm: the less
          time waiting for page-ins, the better the algorithm.
Page Replacement Algorithms
•       A page replacement algorithm looks at:
    –     the limited information about accesses to the pages provided by hardware,
    –     and tries to guess which pages should be replaced to minimize the total
          number of page misses,
    –     while balancing this with the costs (primary storage and processor time)
          of the algorithm itself.
The theoretically optimal page
               replacement algorithm
•       The theoretically optimal page replacement algorithm (also
        known as OPT, clairvoyant replacement algorithm, or
        Bélády's optimal page replacement policy) is an algorithm
        that works as follows:
    –     when a page needs to be swapped in, the operating system swaps out the
          page whose next use will occur farthest in the future.
    –     For example, a page that is not going to be used for the next 6 seconds will
          be swapped out over a page that is going to be used within the next 0.4
          seconds.
•       This algorithm cannot be implemented in a general-purpose
        operating system because it is impossible to compute reliably
        how long it will be before a page is going to be used, except
        when all software that will run on the system is known
        beforehand and is amenable to static analysis of its memory
        reference patterns, or when only a class of applications
        allowing run-time analysis is permitted.
Not recently used
•       At a certain fixed time interval, the clock interrupt triggers
        and clears the referenced bit of all the pages, so only pages
        referenced within the current clock interval are marked with
        a referenced bit. When a page needs to be replaced, the
        operating system divides the pages into four classes:
          0. not referenced, not modified
          1. not referenced, modified
          2. referenced, not modified
          3. referenced, modified
•       Although it does not seem possible for a page to be not
        referenced yet modified, this happens when a class 3 page
        has its referenced bit cleared by the clock interrupt.
    –     The NRU algorithm picks a random page from the lowest category for
          removal. Note that this algorithm implies that a modified (within clock
          interval) but not referenced page is less important than a not modified page
          that is intensely referenced.
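•       A hedged sketch of the selection step: classify every resident
        page by its R and M bits and take a victim from the lowest
        non-empty class (a real NRU picks randomly within that class;
        this sketch simply takes the first one, and the data structures
        are illustrative).

        struct page {
            int present;
            int referenced;   /* R bit, cleared periodically by the clock interrupt */
            int modified;     /* M bit, set by hardware when the page is written */
        };

        /* NRU: class = 2*R + M, so class 0 = not referenced, not modified, etc. */
        int nru_victim(const struct page pages[], int n)
        {
            int victim = -1, best_class = 4;
            for (int i = 0; i < n; i++) {
                if (!pages[i].present)
                    continue;
                int class = 2 * pages[i].referenced + pages[i].modified;
                if (class < best_class) {
                    best_class = class;
                    victim = i;
                }
            }
            return victim;                  /* -1 if no page is resident */
        }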
First-in, first-out
•       The simplest page-replacement algorithm is a FIFO
        algorithm.
•       The first-in, first-out (FIFO) page replacement algorithm is a
        low-overhead algorithm that requires little book-keeping on
        the part of the operating system.
•       The idea is obvious from the name - the operating system
        keeps track of all the pages in memory in a queue, with the
        most recent arrival at the back, and the earliest arrival in
        front.
•       When a page needs to be replaced, the page at the front of
        the queue (the oldest page) is selected.
    –     While FIFO is cheap and intuitive, it performs poorly in practical
          application. Thus, it is rarely used in its unmodified form. This algorithm
          experiences Bélády's anomaly.
Second-chance
•       A modified form of the FIFO page replacement algorithm,
        known as the Second-chance page replacement algorithm,
        fares relatively better than FIFO at little cost for the
        improvement.
•       It works by looking at the front of the queue as FIFO does,
        but instead of immediately paging out that page, it checks
        to see if its referenced bit is set.
    –     If it is not set, the page is swapped out.
    –     Otherwise, the referenced bit is cleared, the page is inserted at the back of
          the queue (as if it were a new page) and this process is repeated.
    –     This can also be thought of as a circular queue.
    –     If all the pages have their referenced bit set, on the second encounter of the
          first page in the list, that page will be swapped out, as it now has its
          referenced bit cleared.
    –     If all the pages have their reference bit set then second chance algorithm
          degenerates into pure FIFO.
Second Chance Replacement




Figure 4-14. Operation of second chance. (a) Pages sorted
in FIFO order. (b) Page list if a page fault occurs at time 20
 and A has its R bit set. The numbers above the pages are
                     their loading times.
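•       A minimal sketch of the second-chance step described above,
        using an illustrative singly linked FIFO queue (head = oldest
        page, assumed non-empty); names and layout are mine.

        struct qpage {
            int id;
            int referenced;                      /* R bit */
            struct qpage *next;
        };

        /* Inspect the oldest page. If R is set, clear it and requeue the page
           at the tail (its "second chance"); otherwise evict it. */
        struct qpage *second_chance_evict(struct qpage **head, struct qpage **tail)
        {
            for (;;) {
                struct qpage *p = *head;
                if (!p->referenced) {            /* old and unreferenced: evict it */
                    *head = p->next;
                    if (*head == NULL)
                        *tail = NULL;
                    return p;
                }
                p->referenced = 0;               /* give it a second chance */
                if (p->next != NULL) {           /* move it to the back of the queue */
                    *head = p->next;
                    p->next = NULL;
                    (*tail)->next = p;
                    *tail = p;
                }                                /* a lone page is simply taken next pass */
            }
        }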
Clock
•   Clock is a more efficient version of FIFO than Second-
    chance because pages don't have to be constantly pushed
    to the back of the list, but it performs the same general
    function as Second-Chance.

•   The clock algorithm keeps a circular list of pages in
    memory, with the "hand" (iterator) pointing to the last
    examined page frame in the list.
•   When a page fault occurs and no empty frames exist, then
    the R (referenced) bit is inspected at the hand's location.
•   If R is 0, the new page is put in place of the page the "hand"
    points to, otherwise the R bit is cleared.
•   Then, the clock hand is incremented and the process is
    repeated until a page is replaced.
Clock Page Replacement




Figure 4-15. The clock page replacement algorithm.
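•   The same policy without the constant requeueing: a hedged sketch
    of the hand sweeping a circular array of page frames (illustrative
    data structures only).

    struct cframe {
        int page;                 /* page currently loaded in this frame */
        int referenced;           /* R bit */
    };

    /* Advance the hand until a frame with R == 0 is found, clearing R bits
       on the way past. Returns the frame chosen to receive the new page. */
    int clock_select(struct cframe frames[], int nframes, int *hand)
    {
        for (;;) {
            struct cframe *f = &frames[*hand];
            if (!f->referenced) {
                int victim = *hand;
                *hand = (*hand + 1) % nframes;   /* leave the hand just past the victim */
                return victim;
            }
            f->referenced = 0;                   /* give the page one more sweep */
            *hand = (*hand + 1) % nframes;
        }
    }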
Least recently used
•   The least recently used page (LRU) replacement algorithm,
    though similar in name to NRU, differs in the fact that LRU
    keeps track of page usage over a short period of time, while
    NRU just looks at the usage in the last clock interval.
•   LRU works on the idea that pages that have been most
    heavily used in the past few instructions are most likely to
    be used heavily in the next few instructions too.
•   While LRU can provide near-optimal performance in theory
    (almost as good as Adaptive Replacement Cache), it is
    rather expensive to implement in practice.
•   There are a few implementation methods for this algorithm
    that try to reduce the cost yet keep as much of the
    performance as possible.
Least recently used
•       The most expensive method is the linked list method, which
        uses a linked list containing all the pages in memory.
•       At the back of this list is the least recently used page, and
        at the front is the most recently used page.
•       The cost of this implementation lies in the fact that items in
        the list will have to be moved about every memory
        reference, which is a very time-consuming process.
•       Another method that requires hardware support is as
        follows: suppose the hardware has a 64-bit counter that is
        incremented at every instruction.
    –     Whenever a page is accessed, its page table entry is stamped with the
          value of the counter at the time of the access.
    –     Whenever a page needs to be replaced, the operating system selects the
          page with the lowest counter and swaps it out.
    –     With present hardware, this is not feasible because the OS needs to
          examine the counter for every page in memory.
Simulating LRU in Software (1)
                    Read through pg 401 - 403




Figure 4-16. LRU using a matrix when pages are referenced in the
                 order 0, 1, 2, 3, 2, 1, 0, 3, 2, 3.
Simulating LRU in Software (2)




 Figure 4-17. The aging algorithm simulates LRU in software.
Shown are six pages for five clock ticks. The five clock ticks are
                  represented by (a) to (e).
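•   A hedged sketch of the aging step shown in Fig. 4-17: at every
    clock tick each page's counter is shifted right one bit and the R
    bit is added at the left, so the page with the smallest counter is
    the least recently used. Array sizes and names are illustrative.

    #include <stdint.h>

    #define NPAGES 6                       /* six pages, as in Fig. 4-17 */

    static uint8_t age[NPAGES];            /* 8-bit aging counters */
    static int     referenced[NPAGES];     /* R bits gathered since the last tick */

    /* Run once per clock tick. */
    void age_tick(void)
    {
        for (int i = 0; i < NPAGES; i++) {
            age[i] = (uint8_t)((age[i] >> 1) | (referenced[i] ? 0x80 : 0));
            referenced[i] = 0;             /* R bits are cleared every tick */
        }
    }

    /* On a page fault, evict the page with the smallest counter. */
    int aging_victim(void)
    {
        int victim = 0;
        for (int i = 1; i < NPAGES; i++)
            if (age[i] < age[victim])
                victim = i;
        return victim;
    }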
Design Issues for Paging Systems
•    Knowing the bare mechanics of paging is not
     enough.

•    To design a system, you have to know a lot more
     to make it work well.

•    In the following sections, we will look at other
     issues that OS designers must consider in order to
     get good performance from a paging system.
The Working Set Model
•   The working set of a process is the set of pages expected
    to be used by that process during some time interval.

•   The "working set model" isn't a page replacement algorithm
    in the strict sense (it's actually a kind of medium-term
    scheduler)

•   Working set is a concept in computer science which defines
    what memory a process requires in a given time interval.
The Working Set Model
•   The working set of information W(t, tau) of a process at time
    t is defined to be the collection of information referenced by the
    process during the process time interval (t - tau, t).

•   Typically the units of information in question are considered
    to be memory pages.

•   This is suggested to be an approximation of the set of
    pages that the process will access in the future (say during
    the next tau time units), and more specifically is suggested
    to be an indication of what pages ought to be kept in main
    memory to allow most progress to be made in the execution
    of that process.
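•   Written out as a formula (a restatement of the definition above,
    with tau measured in process time):

        W(t, \tau) = \{\, p \mid p \text{ was referenced during } (t - \tau,\; t) \,\}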
The Working Set Model
•       The effect of choice of what pages to be kept in main
        memory (as distinct from being paged out to auxiliary
        storage) is important:
    –      if too many pages of a process are kept in main memory, then fewer other
           processes can be ready at any one time.
    –      If too few pages of a process are kept in main memory, then the page fault
           frequency is greatly increased and the number of active (non-suspended)
           processes currently executing in the system approaches zero.
•       The working set model states that a process can be in RAM
        if and only if all of the pages that it is currently using (often
        approximated by the most recently used pages) can be in
        RAM.
•       The model is an all-or-nothing model, meaning that if the number
        of pages the process needs to use increases and there is no room
        in RAM, the process is swapped out of memory to free the memory
        for other processes to use.
The Working Set Model
•       Often a heavily loaded computer has so many processes
        queued up that, if all the processes were allowed to run for
        one scheduling time slice, they would refer to more pages
        than there is RAM, causing the computer to "thrash".
•       By swapping some processes from memory, the result is
        that processes -- even processes that were temporarily
        removed from memory -- finish much sooner than they
        would if the computer attempted to run them all at once.
•       The processes also finish much sooner than they would if
        the computer only ran one process at a time to completion,
    –     since it allows other processes to run and make progress during times that
          one process is waiting on the hard drive or some other global resource.
•       In other words, the working set strategy prevents thrashing
        while keeping the degree of multiprogramming as high as
        possible. Thus it optimizes CPU utilization and throughput.
The Working Set Model


•       Thrashing?
    –     describes a computer whose virtual memory subsystem is in a constant state
          of paging, rapidly exchanging data in memory for data on disk, to the
          exclusion of most application-level processing.
    –     This causes the performance of the computer to degrade or collapse. The
          situation may not resolve itself quickly, but can continue indefinitely until
          the underlying cause is addressed.
The Working Set Model






 Figure 4-18. The working set is the set of pages used by the k
most recent memory references. The function w(k, t) is the size of
                   the working set at time t.
Local versus Global Allocation Policies
•     In the preceding sections we have discussed
      several algorithms for choosing a page to replace
      when a fault occurs.
•     A major issue associated with this choice is how
      memory should be allocated among the competing
      runnable processes.
•     Local algorithms:
     –   Allocate every process a fixed fraction of memory.
•     Global algorithms:
     –   Dynamically allocate page frames among the runnable
         processes
     –   Thus the number of page frames assigned to each process
         varies in time.
Page Fault Frequency




Figure 4-20. Page fault rate as a function of the
       number of page frames assigned.
Page Size

•   The page size is often a parameter that can
    be chosen by the OS.

•   Determining the best page size requires
    balancing several competing factors.

•   As a result, there is no overall optimum.
Virtual Memory Interface
•   The use of virtual memory addressing (such as paging or
    segmentation) means that the kernel can choose what
    memory each program may use at any given time, allowing
    the operating system to use the same memory locations for
    multiple tasks.

•   If a program tries to access memory that isn't in its current
    range of accessible memory, but nonetheless has been
    allocated to it, the kernel will be interrupted in the same way
    as it would if the program were to exceed its allocated
    memory.

•   Under UNIX this kind of interrupt is referred to as a page
    fault.
Distributed Shared Memory
•   Distributed Shared Memory (DSM) is a form of
    memory architecture where the (physically
    separate) memories can be addressed as one
    (logically shared) address space.

•   Here, the term shared does not mean that there is
    a single centralised memory but shared essentially
    means that the address space is shared (same
    physical address on two processors refers to the
    same location in memory)
Segmentation
•   The virtual memory discussed so far is one-
    dimensional because the virtual addresses go from
    0 to some maximum address, one address after
    another.

•   For many problems, having two or more separate
    virtual address spaces may be much better than
    having only one.

•   For example, a compiler has many tables that are
    built up as compilation proceeds…
Segmentation (1)
Examples of tables saved by a compiler …

1.   The source text being saved for the printed listing (on
     batch systems).
2.   The symbol table, containing the names and attributes of
     variables.
3.   The table containing all the integer and floating-point
     constants used.
4.   The parse tree, containing the syntactic analysis of the
     program.
5.   The stack used for procedure calls within the compiler.

These will vary in size dynamically during the compile process
Segmentation (2)
• Each of the first four tables
  grows continuously as
  compilation proceeds.

• The last one grows and shrinks
  in unpredictable ways during
  compilation.

• In a one-dimensional memory,
  these five tables would have to
  be allocated neighbouring
  chunks of virtual address space




Figure 4-21. In a one-dimensional address space with growing
             tables, one table may bump into another.
Segmentation
•   Consider what happens if a program has an
    exceptionally large number of variables but a
    normal amount of everything else.
•   The chunk of address space allocated for the
    symbol table may fill up, but there may be lots of
    room in the other tables.
•   A straightforward and extremely general solution is
    to provide the machine with many completely
    independent address spaces, called segments.
•   Each segment consists of a linear sequence of
    addresses, from 0 to some maximum.
Segmentation (3)




Figure 4-22. A segmented memory allows each table to grow or
              shrink independently of the other tables.
Segmentation (4)




                                          ...
Figure 4-23. Comparison of paging and segmentation.
Segmentation (4)
                                          ...




Figure 4-23. Comparison of paging and segmentation.
Implementation of Pure Segmentation
•    The implementation of segmentation differs from
     paging in an essential way:
    –   Pages are of fixed size and segments are not.
•    Figure 4-24(a) shows an example of physical
     memory initially containing five segments.
    –   Now consider what happens if segment 1 is evicted and
        segment 7, which is smaller, is put in its place.
    –   We arrive at the memory configuration of (b).
    –   Between segment 7 and segment 2 is an unused area – a hole.
    –   Then segment 4 is replaced by segment 5 (as in (c))
    –   And segment 3 is replaced by segment 6, as in (d).
Implementation of Pure Segmentation
• After the system has been running for a while, memory will be
  divided up into a number of chunks, some containing segments and
  some containing holes.
• This phenomenon, called checker-boarding or external
  fragmentation, wastes memory in the holes (can be dealt with by
  compaction (e)).




     Figure 4-24. (a)-(d) Development of checkerboarding.
      (e) Removal of the checkerboarding by compaction.
Segmentation with Paging:
   The Intel Pentium


      See p415 - 420
Overview of the MINIX 3 Process
                   Manager
•       Memory management in MINIX 3 is simple:
    –     Paging is not used at all.
    –     Memory management doesn’t include swapping either.
    –     MINIX 3 works on a system with limited physical memory.
    –     In practice, memories are so large now that swapping is rarely needed.


•       A user-space server designated the process manager (or
        PM) does, however, exist.
    –     It handles system calls relating to process management.
    –     Of these some are intimately involved with memory management.
    –     Process management also includes processing system calls related to
          signals, setting and examining process properties such as user and group
          ownership, and reporting CPU usage times.
    –     The MINIX 3 process manager also handles setting and querying the real
          time clock.
Memory Layout
•       In normal MINIX 3 operation, memory is allocated on two
        occasions.
    –     First, when a process forks (the amount of memory needed by the child is
          allocated).
    –     Second, when a process changes its memory image via the exec system
          call, the space occupied by the old image is returned to the free list as a
          hole, and memory is allocated for the new image.
•       The new image may be in a part of memory different from
        the released memory
•       Its location will depend upon where an adequate hole is
        found.
•       Memory is also released whenever a process terminates,
        either by exiting or by being killed by a signal.
Memory Layout (1)
• The figure shows memory allocation during a fork and an exec. In
  (a) we see two processes, A and B, in memory.
• If A forks, we get the situation of (b). The child is an exact copy of
  A.
• If the child now execs the file C, the memory looks like (c).
• The child’s image is replaced by C.




             Figure 4-30. Memory allocation (a) Originally. (b) After a fork.
  (c) After the child does an exec. The shaded regions are unused memory. The
                               process is a common I&D one.
Memory Layout (2)




                                    The data part of the image is enlarged by the
                                    amount specified in the bss field in the header

Figure 4-31. (a) A program as stored in a disk file. (b) Internal
      memory layout for a single process. In both parts of the
   figure the lowest disk or memory address is at the bottom and
                  the highest address is at the top.
Message Handling
•       The process manager is message driven.
•       After the system has been initialised
    –     PM enters its main loop, which consists of waiting for a message, carrying
          out the request contained in the message, and sending a reply.


•       Two message categories may be received by the process
        manager.
    –      For high-priority communication between the kernel and system servers
           such as PM, a system notification message is used
           (these are special cases).
    –      The majority of messages received by the process manager result from
           system calls made by user processes.
          • For this category, the next figure gives a list of legal message types,
               input parameters and values sent back in the reply message.
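
Either way, the overall shape of PM's main loop is the same. A minimal sketch follows; the message structure and the request/dispatch/reply calls are simplified stand-ins for illustration, not the real MINIX 3 message primitives:

    /* Simplified stand-ins; the real PM uses the MINIX message primitives. */
    struct message { int m_source; int m_type; /* plus call-specific fields */ };

    extern int  get_request(struct message *m);      /* block until a request arrives   */
    extern int  handle_request(struct message *m);   /* dispatch on the message type    */
    extern void send_reply(int who, int result);     /* return the result to the caller */

    void pm_main_loop(void)
    {
        struct message m;
        for (;;) {
            if (get_request(&m) != 0)        /* wait for the next message */
                continue;
            int result = handle_request(&m); /* carry out the request     */
            send_reply(m.m_source, result);  /* send the reply            */
        }
    }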
Process Manager Data Structures
         and Algorithms (1)




                                                     ...
Figure 4-32. The message types, input parameters, and reply
          values used for communicating with the PM.
Process Manager Data Structures
         and Algorithms (2)
  ...




Figure 4-32. The message types, input parameters, and reply
          values used for communicating with the PM.
Processes in Memory and Shared Text
•   See p428-431
•   "The PM’s process table is called mproc and its
    definition is given in src/servers/pm/mproc.h"

•   It contains all the fields related to a process’
    memory allocation, as well as some additional
    items.

•   The most important field is the array mp_seg

•   Etc…
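
A heavily simplified sketch of what such a process-table entry might look like is shown below. The field names and layout are illustrative only, not the actual mproc definition (see the header file referenced above for that):

    #define NR_SEGS 3                     /* text, data, and stack segments */

    struct mem_map {                      /* one segment descriptor (illustrative) */
        unsigned long mem_vir;            /* virtual address of the segment        */
        unsigned long mem_phys;           /* physical address of the segment       */
        unsigned long mem_len;            /* length of the segment                 */
    };

    struct mproc_sketch {                 /* one PM process-table slot (sketch)    */
        struct mem_map mp_seg[NR_SEGS];   /* memory map: text, data, stack         */
        int mp_pid;                       /* process id                            */
        int mp_parent;                    /* slot number of the parent             */
        /* ... signal state, exit status, uid/gid, and other fields ...            */
    };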
The Hole List
• The other major process manager data structure is the hole table,
  hole, defined in src/servers/pm/alloc.c, which lists every hole in
  memory in order of increasing memory address.

• The gaps between the data and stack segments are not considered
  holes; they have already been allocated to processes.




       Figure 4-35. The hole list is an array of struct hole.
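
A hole list kept in order of increasing address lends itself to a simple first-fit allocator. The sketch below is illustrative only; the real alloc.c differs in detail (for instance, it manages memory in clicks and keeps its holes in a fixed-size array rather than a dynamically linked list):

    #include <stddef.h>

    struct hole {                     /* one free region of memory (sketch) */
        unsigned long h_base;         /* start of the hole                  */
        unsigned long h_len;          /* length of the hole                 */
        struct hole  *h_next;         /* next hole, by increasing address   */
    };

    /* First fit: take the request out of the first hole that is big enough.
     * Returns the base of the allocated block, or (unsigned long)-1 on failure. */
    unsigned long alloc_mem(struct hole **head, unsigned long len)
    {
        for (struct hole **pp = head; *pp != NULL; pp = &(*pp)->h_next) {
            struct hole *h = *pp;
            if (h->h_len < len)
                continue;
            unsigned long base = h->h_base;
            h->h_base += len;          /* shrink the hole from the bottom    */
            h->h_len  -= len;
            if (h->h_len == 0)         /* hole used up: unlink it            */
                *pp = h->h_next;
            return base;
        }
        return (unsigned long)-1;      /* no hole is large enough            */
    }

Freeing memory is the mirror image: insert the returned block into the list at the right address and merge it with an adjacent hole on either side, so the list never contains two neighbouring holes.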
FORK System Call
• When processes are created or destroyed, memory must be
  allocated or deallocated.
• Also, the process table must be updated, including the parts held by
  the kernel and FS.
• The PM coordinates this activity.




Figure 4-36. The steps required to carry out the fork system call.
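
Figure 4-36 itself is not reproduced here, but the flavour of the work can be sketched as below. The helper functions are hypothetical placeholders, not real MINIX 3 routines, and the authoritative sequence is the one in the figure:

    /* Hypothetical helpers standing in for work split across PM, kernel, and FS. */
    extern int  pm_slot_available(void);
    extern long allocate_child_memory(int parent_slot);     /* returns base, or -1 */
    extern void copy_parent_image(int parent_slot, long base);
    extern int  clone_process_table_slot(int parent_slot, long base);
    extern void inform_kernel_and_fs(int child_slot);
    extern void send_replies(int parent_slot, int child_slot);

    int fork_outline(int parent_slot)
    {
        if (!pm_slot_available())
            return -1;                               /* no free process-table slot  */
        long base = allocate_child_memory(parent_slot);
        if (base < 0)
            return -1;                               /* not enough memory           */
        copy_parent_image(parent_slot, base);        /* child is a copy of parent   */
        int child_slot = clone_process_table_slot(parent_slot, base);
        inform_kernel_and_fs(child_slot);            /* their table parts change too */
        send_replies(parent_slot, child_slot);       /* both processes return        */
        return child_slot;
    }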
EXEC System Call (1)
• EXEC is the most complex system call in MINIX 3.
• It must replace the current memory image with a new one,
  including setting up a new stack.
• The new image must be a binary executable file, of course.
• Exec carries out its job in a series of steps:




    Figure 4-37. The steps required to carry out the exec system call.
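
Again the figure carries the authoritative step list; the sketch below only conveys the general shape of the job, using hypothetical placeholder helpers rather than real MINIX 3 routines:

    /* Hypothetical helpers; not real MINIX 3 routines. */
    extern int  read_exec_header(const char *path, long *text, long *data, long *bss);
    extern long save_args_and_env(int slot, char *const argv[], char *const envp[]);
    extern int  swap_memory_image(int slot, long text, long data, long bss);
    extern int  load_text_and_data(int slot, const char *path);
    extern void build_initial_stack(int slot, long saved_args);
    extern void mark_runnable(int slot);

    int exec_outline(int slot, const char *path, char *const argv[], char *const envp[])
    {
        long text, data, bss;
        if (read_exec_header(path, &text, &data, &bss) < 0)
            return -1;                        /* not a valid binary executable       */
        long args = save_args_and_env(slot, argv, envp); /* copy before the old image goes */
        if (swap_memory_image(slot, text, data, bss) < 0)
            return -1;                        /* release old image, allocate new     */
        load_text_and_data(slot, path);       /* copy the new image from the file    */
        build_initial_stack(slot, args);      /* set up the new stack                */
        mark_runnable(slot);                  /* tell the kernel it may now run      */
        return 0;
    }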
Signal Handling (1)




           See p438 - 446


Figure 4-40. Three phases of dealing with signals.
Signal Handling (2)




Figure 4-41. The sigaction structure.
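
Figure 4-41 is not reproduced here, but the POSIX sigaction interface it describes can be exercised directly. A minimal, self-contained usage example (standard POSIX C, not MINIX-specific code):

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_sigint(int sig)
    {
        (void)sig;
        /* Only async-signal-safe calls belong in a handler, hence write(). */
        write(STDOUT_FILENO, "caught SIGINT\n", 14);
    }

    int main(void)
    {
        struct sigaction sa;
        sa.sa_handler = on_sigint;      /* function to run on delivery            */
        sigemptyset(&sa.sa_mask);       /* block no additional signals meanwhile  */
        sa.sa_flags = 0;                /* no special options                     */
        if (sigaction(SIGINT, &sa, NULL) < 0) {
            perror("sigaction");
            return 1;
        }
        pause();                        /* wait until some signal is delivered    */
        return 0;
    }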
Signal Handling (3)



Figure 4-42. Signals defined by POSIX and MINIX 3. Signals
    indicated by (*) depend on hardware support. Signals marked
    (M) are not defined by POSIX, but are defined by MINIX 3 for
    compatibility with older programs. Kernel signals are MINIX 3
    specific signals generated by the kernel, used to inform system
    processes about system events. Several obsolete names and
    synonyms are not listed here.
Signal Handling (4)



Figure 4-42. Signals defined by POSIX and MINIX 3. Signals
    indicated by (*) depend on hardware support. Signals marked
    (M) are not defined by POSIX, but are defined by MINIX 3 for
    compatibility with older programs. Kernel signals are MINIX 3
    specific signals generated by the kernel, used to inform system
    processes about system events. Several obsolete names and
    synonyms are not listed here.
IMPLEMENTATION OF THE
MINIX 3 PROCESS MANAGER


Read Through and Generally Grasp
        Detail on p447 - 475

Ch4 memory management

  • 2. Memory • Memory is an important resource that must be carefully managed. • What every programmer would like is an infinitely large, infinitely fast memory that is also non-volatile (as in, memory that does not lose its contents when power is cut). • The part of the OS that manages the memory hierarchy is called the memory manager. – Its job is to keep track of which parts of memory are in use and which parts are not in use. – To allocate memory to processes when they need it. – To deallocate it when they’re done. – To manage swapping between main memory and disk when main memory is too small to hold all the processes.
  • 3. Basic Memory Management • Memory management systems can be divided into two basic classes: – Those that move processes back and forth between main memory and disk during execution (swapping and paging) and – Those that don’t. • The latter are simpler, so we will study them first. • Later in the chapter we will examine swapping and paging. • For now, keep in mind: swapping and paging are largely artifacts caused by the lack of sufficient main memory to hold all programs and data at once. • Btw, we finally ―carbon-dated‖ the book: It’s ancient!!! – “Now Microsoft recommends having at least 128MB for a single-user Windows XP system” …no wonder they keep banging on about floppies and tape drives!
  • 4. Monoprogramming without Swapping or Paging • The simplest possible memory management scheme is to run just one program at a time, sharing the memory between that program and the OS. • Three variations on this theme are shown below: Figure 4-1. Three simple ways of organizing memory with an operating system and one user process. Other possibilities also exist.
  • 5. Monoprogramming without Swapping or Paging • The OS may be at the bottom of memory in RAM (a). Or it may be in ROM at the top of memory (b) or the device drivers may be at the top of memory in a ROM and the rest of the system in RAM down below (c).
  • 6. Monoprogramming without Swapping or Paging • The first model was formerly used on mainframes and minicomputers but is rarely used any more. • The second model is used on some palmtop computers and embedded systems. • The third model was used by early personal computers (e.g., running MS-DOS), where the portion of the system in the ROM is called the BIOS. • When the system is organised in this way, only one process at a time can be running. • As soon as the user types a command, the OS copies the requested program from disk to memory and executes it. • When the process finishes, the OS displays a prompt character and waits for a new command. • When it receives the command, it loads a new program into memory, overwriting the first one.
  • 7. Multiprogramming with Fixed Partitions • Except on very simple embedded systems, monoprogramming is hardly used any more. • Most modern systems allow multiple processes to run at the same time. • Having multiple processes running at once means that when one process is blocked waiting for I/O to finish, another one can use the CPU. – Multiprogramming increases the CPU utilisation. • The easiest way to achieve multiprogramming is simply to divide memory up into n (possibly unequal) partitions. • This partitioning can, for example, be done manually when the system is started up.
  • 8. Multiprogramming with Fixed Partitions • When a job arrives, it can be put into the imput queue for the smallest partition large enough to hold it. • Since the partitions are fixed in this scheme, any space in a partition not used by a job is wasted while that job runs. • In the next figure (a) we see how this system of fixed partitions and separate input queues look. – The disadvantage of sorting the incoming jobs into separate queues becomes apparent when the queue for a large partition is empty but the queue for a small partition is full, as is the case for partitions 1 & 3 in (a). • An alternative organisation is to maintain a single queue as in (b). – Whenever a partition becomes free, the job closest to the front of the queue that fits in it could be loaded into the empty partition and run. – Since it’s undesirable to waste a large partition on a small job, a different strategy is to search the whole input queue whenever a partition becomes free and pick the largest job that fits.
  • 9. Multiprogramming with Fixed Partitions (1) Figure 4-2. (a) Fixed memory partitions with separate input queues for each partition.
  • 10. Multiprogramming with Fixed Partitions (2) Figure 4-2. (b) Fixed memory partitions with a single input queue.
  • 11. Multiprogramming with Fixed Partitions • Note that the latter algorithm discriminates against small jobs as being unworthy of having a whole partition, whereas usually it is desirable to give the smallest jobs (often interactive jobs) the best service, not the worst. – One way out is to have at least one small partition around. • Such a partition will allow small jobs to run without having to allocate a large partition for them. – Another approach is to have a rule stating that a job that is eligible to run may not be skipped over more than k times. • Each time it’s skipped over, it gets one point. When it has aquired k points, it may not be skipped again. • This system, with fixed partitions set up by the operator in the morning and not changed thereafter, was used by OS/360 on large IBM mainframes for many years – it was called MFT (Multiprogramming with a Fixed number of Tasks or OS/MFT).
  • 12. Relocation and Protection • Multiprogramming introduces two essential problems that must be solved: – Relocation and protection. • From the previous two figures it is clear that different jobs will be run at different addresses. – When a program is linked (i.e., the main program, user-written procedures, and library procedures are combined into a single address space), the linker must know at what address the program will begin in memory. – For example, suppose that the first instruction is a call to a procedure at absolute address 100 within the binary fire produced by the linker. – If this program is loaded in partition 1 (at address 100K), that instruction will jump to absolute address 100, which is inside the OS. – What is needed is a call to 100K + 100. – If the program is loaded into partition 2, it must be carried out as a call to 200K + 100, and so on.  this is the relocation problem.
  • 13. Relocation and Protection • A solution for this is to equip the machine with two special hardware registers, called the base and limit registers. – When a process is scheduled, the base register is loaded with the address of the start of its partition, and the limit register is loaded with the length of the partition. – Every memory address generated automatically has the base register contents added to it before being sent to memory. – Thus if the base register contains the value 100K, a CALL 100 instruction is effectively turned into a CALL 100K + 100 instruction, without the instruction itself being modified. – Addresses are also checked against the limit register to make sure that they do not attempt to address memory outside the current partition. – The hardware protects the base and limit registers to prevent user programs from modifying them. – A disadvantage of this scheme is the need to perform an addition and a comparison on every memory reference.
  • 14. Relocation and Protection – Comparisons can be done fast, but additions are slow due to carry propagation time unless special addition circuits are used. • The CDC 6600 – the world’s first supercomputer – used this scheme. • The Intel 8088 CPU used for the original IBM PC used a slightly weaker version of this scheme – base registers, but no limit registers. • Few computers use it now.
  • 15. Swapping • With a batch system, organising memory into fixed partitions is simple and effective. • Each job is loaded into a partition when it gets to the heard of the queue. • It stays in memory until it has finished. • As long as enough jobs can be kept in memory to keep the CPU busy all the time, there is no reason to use anything more complicated.
  • 16. Swapping • With timesharing systems or graphics-orientated personal computers, the situation is different. • Sometimes there is not enough main memory to hold all the currently active processes, so excess processes must be kept on disk and brought in to run dynamically. • Two general approaches to memory management can be used, depending on the available hardware: – Swapping (the simplest strategy that consists of bringing in each process in its entirety, running it for a while, then putting it back on the disk) and – Virtual memory (which allows programs to run even when they are only partially in main memory).
  • 17. Swapping • The operation of a swapping system is shown below: Figure 4-3. Memory allocation changes as processes come into memory and leave it. The shaded regions are unused memory.
  • 18. Swapping • Initially, only process A is in memory. • Then process B and C are created or swapped in from disk. • In (d) A is swapped out to disk. • Then D comes in and B goes out. • Finally A comes in again. • Since A is now at a different location, addresses contained in it must be relocated, either by software when it is swapped in or (more likely) by hardware during program execution.
  • 19. Swapping • The main difference between the fixed partitions of the second figure (Fig. 4-2) and the variable partitions shown here is that the number, location, and size of the partitions vary dynamically in the latter as processes come and go, whereas they are fixed in the former. • The flexibility of not being tied to a fixed number of partitions that may be too large or too small improves memory utilization, but it also complicates allocating and deallocating memory, as well as keeping track of it.
  • 20. Swapping • When swapping creates multiple holes in memory, it is possible to cimbine them all into one big one by moving all the processes downward as far as possible. • This technique is known as memory compaction. – It is usually not done because it requires a lot of CPU time. • Also, when swapping processes to disk, only the memory actually in use should be swapped. • It is wasteful to swap the extra memory as well. • In Fig 4.4 (a) we see a memory configuration in which space for growth has been allocated to two processes.
  • 21. Swapping • If processes can have two growing segments, for example, – the data segment being used as a heap for variables that are dynamically allocated and released – and a stack segment for the normal local variables and return addresses, an alternative arrangement suggest itself, namely that of (b). Figure 4-4. (a) Allocating space for a Figure 4-4. (b) Allocating space for a growing growing data segment. stack and a growing data segment.
  • 22. Swapping • In (b) we see that each process illustrated has a stack at the top of its allocated memory that is growing downward. – And a data segment just beyond the program text that is growing upward. • The memory between them can be used for either segment. • If it runs out, either the process will have to be moved to a hole with sufficient space, swapped out of memory until a large enough hole can be created, or killed.
  • 23. Swapping • Memory management with Bitmaps – When memory is assigned dynamically, the OS must manage it. – In general terms, there are two ways to keep track of memory usage: bitmaps and free lists. – In this section and the next one we will look at these two methods in turn. – With a bitmap, memory is divided up into allocation units, perhaps as small as a few words and perhaps as large as several kilobytes. – Corresponding to each allocation unit is a bit in the bitmap, which is 0 if the unit is free and 1 if it is occupied (or vice versa). – The next figure shows part of memory and the corresponding bitmap.
  • 24. Memory Management with Bitmaps – The size of the allocation unit is an important design issue. – The smaller the allocation unit, the larger the bitmap. – However, even with an allocation unit as small as 4 bytes, 32 bits of memory will require only 1 bit of the map. – A memory of 32n bits will use n map bits, so the bitmap will take up only 1/33 of memory. – If the alloc unit is chosen large, the bitmap will be smaller. Figure 4-5. (a) A part of memory with five processes and three holes. The tick marks show the memory allocation units. The shaded regions (0 in the bitmap) are free. (b) The corresponding bitmap. (c) The same information as a list.
  • 25. Memory Management with Bitmaps – But, appreciable memory may be wasted in the last unit of the process if the process size is not an exact multiple of the allocation unit. – A bitmap provides a simple way to keep track of memory words in a fixed amount of memory because the size of the bitmap depends only on the size of memory and the size of the allocation unit. – The main problem with it is that when it has been decided to bring a k unit process into memory, the mem manager must search the bitmap to find a run of k consecutive 0 bits in the map – And searching a bitmap for a run of a given length is a slow operation Figure 4-5. (a) A part of memory with five processes and three holes. The tick marks show the memory allocation units. The shaded regions (0 in the bitmap) are free. (b) The corresponding bitmap. (c) The same information as a list.
  • 26. Memory Management with Linked Lists – Another way of keeping track of memory is to maintain a linked list of allocated and free memory segments, where a segment is either a process or a hole between two processes. – The memory of (a) is represented in (c) as a linked list of segments. – Each entry in the list specifies a hole (H) or process (P), the address at which it starts, the length, and a pointer to the next entry. – In this example, the segment list is kept sorted by address. – Sorting this way has the advantage that when a process terminates or is swapped out, updating the list is straightforward. Figure 4-5. (a) A part of memory with five processes and three holes. The tick marks show the memory allocation units. The shaded regions (0 in the bitmap) are free. (b) The corresponding bitmap. (c) The same information as a list.
  • 27. Memory Management with Linked Lists – A terminating process normally has two neighbours (except when it is at the very top or very bottom of memory). – These may be either processes or holes, leading to the four combinations shown below. – In (a) updating the list requires replacing a P by an H. – In (b) and also in (c), two entries are coalesced into one, and the list becomes one entry shorted. – In (d), three entries are merged and two items are removed from the list. Figure 4-6. Four neighbor combinations for the terminating process, X.
  • 28. Memory Management with Linked Lists – Since the process table slot for the terminating process will normally point to the list entry for the process itself, it may be more convenient to have the list as a double-linked list, rather than the single-linked list of Fig 4.5 (c). – This structure makes it easier to find the previous entry and to see if a merge is possible. Figure 4-6. Four neighbor combinations for the terminating process, X.
  • 29. Memory Management with Linked Lists • When the processes and holes are kept on a list sorted by address, several algorithms can be used to allocate memory for a newly created process (or an existing process being swapped in from disk). • We assume that the memory manager knows how much memory to allocate. • The simples algorithm is first fit. – The process manager scans along the list of segments until it finds a hole that is big enough. – The hole is then broken up into two pieces, one for the process and one for the unused memory, except in the statistically unlikely case of an exact fit. – First fit is a fast algorithm because it searches as little as possible.
  • 30. Memory Management with Linked Lists • Another well-known algorithm is next fit. – It works the same way as first fit, except that it keeps track of where it is whenever it finds a suitable hole. – The next time it is called to find a hole, it starts searching the list from the place where it left off last time, instead of always beginning, as first fit does. – Simulations by Bays (1977) show that next fit gives slightly worse performance than first fit. • Then, best fit. – Best fit searches the entire list and takes the smallest hole that is adequate. – Rather than breaking up a big hole that might be needed later, best fit tries to find a hole that is close to the actual size needed. – Best fit is slower that first fit because it must search the entire list every time it is called. – Somewhat surprisingly, it also results in more wasted memory than first fit or next fit because it tends to fill up memory with tiny, useless holes (first fit creates larger holes on average).
  • 31. Memory Management with Linked Lists • Then there’s worst fit. – To get around the problem of breaking up nearly exact matches into a process and a tiny hole, one could think about worst fit, that is, always take the largest available hole, so that the hole broken off will be big enough to be useful. – Simulation has shown that worst fit is not a very good idea either. • Then, quick fit. – Quick fit maintains separate lists for some of the more common sizes requested. – For example, it might have a table with n entries, in which the first entry is a pointer to the head of a list of 4-KB holes, the second entry is a pointer to a list of 8-KB holes, the third entry a pointer to 12-KB holes, and so on. – Holes of say, 21-KB, could either be put on the 20-KB list or on a special list of odd-sized holes. – With quick fit, finding a hole of the required size is extremely fast, but it has the same disadvantage as all the other scheme that sort by hole size, namely: when a process terminates or is swapped out, finding its neighbours to see if a merge is possible, is expensive. – If merging is not done, mem will quickly fragment into a large number of small holes into which no processes fit.
  • 32. Memory Allocation Algorithms • First fit Use first hole big enough • Next fit Use next hole big enough • Best fit Search list for smallest hole big enough • Worst fit Search list for largest hole available • Quick fit Separate lists of commonly requested sizes
  • 33. Virtual Memory • Many years ago people were first confronted with programs that were too big to fit in the available memory. • The solution usually adopted was to split the program into pieces, called overlays. • Overlay 0 would start running first. • When it was done, it would call another overlay. • Some overlay systems were highly complex, allowing multiple overlays in memory at once. • The overlays were kept on the disk and swapped in and out of memory by the OS, dynamically, as needed. • Although the actual work of swapping overlays was done by the system, the decision of how to split the program into pieces had to be done by the programmer.
  • 34. Virtual Memory • Splitting up large programs into small, modular pieces was time consuming and boring. • It did not take long before someone thought of a way to turn the whole job over to the computer: – This method: virtual memory. • The basic idea behind virtual memory: – The combined size of the program, data, and stack may exceed the amount of physical memory available for it. – The OS keeps those parts of the program currently in use in main memory, and the rest on the disk. – For example, a 512MB program can run on a 256MB machine by carefully choosing which 256MB to keep in memory at each instant, with pieces of the program being swapped between disk and memory as needed.
  • 35. Paging • Most virtual memory systems use a technique called paging, which we will now describe. • One any computer, there exists a set of memory addresses that programs can produce. • When a program uses an instruction like: – MOV REG, 1000 • It does this to copy the contents of memory address 1000 to REG. • Addresses can be generated using indexing, base registers, segment registers, etc.
  • 36. Paging – These program-generated addresses are called virtual addresses and form the virtual address space. – On computers without virtual memory, the virtual address is put directly onto the memory bus and causes the physical memory word with the same address to be read or written. – When virtual memory is used, the virtual addresses do not directly go to the memory bus. – Instead, they go to an MMU (Mem. Management Unit) that maps the virtual addresses onto the physical memory addresses: Figure 4-7. The position and function of the MMU. Here the MMU is shown as being a part of the CPU chip because it commonly is nowadays. However, logically it could be a separate chip and was in years gone by.
  • 37. Paging (2) • An e.g. of how this mapping works is shown on the RHS. • Here we have a computer that can generate 16-bit addresses, from 0 up to 64-K. • These are the virtual addresses. • The computer, however, only has 32-KB of physical memory, so although 64-KB programs can be written, they cannon be loaded into memory in their entirety and run. • A complete copy of a program’s memory image, up to 64-KB, must be present on the disk, however, so that pieces can be brought in as needed. Figure 4-8. The relation between virtual addresses and physical memory addresses is given by the page table.
  • 38. Paging (2) • The virtual address space is divided up into units called pages. • The corresponding units in the physical memory are called ―page frames‖. • The pages and page frames are always the same size. • In this example they are 4-KB, but page sizes from 512 bytes to 1 MB have been used in real systems. • With 64KB of virtual address space and 32KB of physical memory, we get 16 virtual pages and 8 page frames. • Transfers between Ram and disk are always in units Figure 4-8. The relation between of a page. virtual addresses and physical memory addresses is given by the page table.
  • 39. Paging (2) • When the program tries to access address 0, for example, using the instruction MOV REG, 0 virtual address 0 is sent to the MMU. • The MMU sees that this virtual address falls in page 0 (0 – 4095), which according to its mapping is page frame 2 (8192 to 12287) • It thus transforms the address to 8192 and outputs address 8192 onto the bus. • The memory knows nothing at all about the MMU and just sees a request for reading or writing address 8192, which it honours. • Thus, the MMU has effectively mapped all virtual Figure 4-8. The relation between addresses between 0 and 4095 onto physical virtual addresses and physical addresses 8192 to 121287. memory addresses is given by the page table.
  • 40. Paging (2) • By itself, the ability to map the 16 virtual pages onto any of the 8 page frames by setting the MMU’s map appropriately does not solve the problem that the virtual address space is larger than the physical memory. • Since we have only 8 physical page frames, only 8 of the virtual pages in the figure are mapped onto physical memory. • In the RHS figure, we see an example of a virtual address 8196 (0010000000000100) being mapped using the MMU map op the previous figure. – The incoming 16-bit virtual address is split into a 4-bit page number and a 12-bit offset. – With 4 bits for the page number, we can have 16 pages – And with 12 bits for the offset, we can address all 4096 bytes within a page. Figure 4-9. The internal • The page number is used as an index into the page table, yielding the number of the page frame operation of the MMU corresponding to that virtual page. with 16 4-KB pages.
  • 41. Page Tables • Purpose : map virtual pages onto page frames • Major issues to be faced 1. The page table can be extremely large 2. The mapping must be fast.
  • 42. Multilevel Page Tables • To get around the problem of having to store huge page tables in memory all the time, many computers use a multilevel page table. • A simple example is shown: Figure 4-10. (a) A 32-bit address with two page table fields. (b) Two-level page tables.
  • 43. Multilevel Page Tables • In (a) we have a 32-bit virtual address that is partitioned into a 10-bit PT1 field, a 10-bit PT2 field, and a 12-bit Offset field. • Since offsets are 12 bits, pages are 4KB, and there are a total of 2^20 of them. • The secret to the multilevel page table method is to avoid keeping all the page tables in memory all the time. • In particular, those that are not needed should not be kept around. • In (b) we see how the two-level page table works. – On the left we have the top-level page table, with 1024 entries, corresponding to the 10-bit PT1 field. – When a virtual address is presented to the MMU, it first extracts the PT1 field and uses this value as an index into the top-level page table. – Each of these 1024 entries represents 4M because the entire 4-gigabyte virtual address space has been chopped into chunks of 1024 bytes. Figure 4-10. (a) A 32-bit • The entry located by indexing into the top-level page address with two page table table yields the address of the page frame # of a fields. (b) Two-level page second-level page table. tables.
  • 44. Structure of a Page Table Entry • The exact layout of a page table entry is highly machine dependent, but the kind of information present is roughly the same from machine to machine. • The figure below shows a sample page entry. • The size varies from computer to computer, but 32 bits is a common size. • The most important field is the page frame number. – The goal of the page mapping is to locate this value. • Next to it we have the present/absent bit. – If this bit is 1, the entry is valid and can be used. – If it is 0, the virtual page to which the entry belongs is not currently in memory. – Accessing a page table entry with this bit set to 0 causes a page fault. • The protection bit tells what kinds of access are permitted. Figure 4-11. A typical page table entry.
  • 45. Structure of a Page Table Entry • In the simplest form, the protection bit is 0 for read/write and 1 for read only. • A more sophisticated arrangement is having 3 independent bits, one bit each for individually enabling reading, writing and executing the page. • The modified and referenced bits keep track of page usage. – When a page is written to, the hardware automatically sets the modified bit. – This bit is used when the OS decided to reclaim a page frame. – If the page in it has been modified (i.e. is “dirty”), it must be written back to the disk – If it has not been modified (i.e. is “clean”), it can just be abandoned, since the disk copy is still valid. – The bit is sometimes called the “dirty bit”, since it reflects the page’s state.
  • 46. TLBs—Translation Lookaside Buffers • In most page schemes, the page tables are kept in memory, due to their large size. • Potentially, this design has an enormous impact on performance. • The solution is to equip computers with a small hardware device for rapidly mapping virtual addresses to physical addresses without going through the page table. – This device, called the TLB, or associated memory, is shown below: Figure 4-12. A TLB to speed up paging.
  • 47. TLBs—Translation Lookaside Buffers • It’s usually inside the MMU and consists of a small number of entries, eight in this case, bur rarely more than 64. • Each entry contains information about one page, including the virtual page number, a bit that is set when the page is modified, the protection code (read/write/execute permisions), and the physical page frame in which the page is located. • These fields have a one-to-one correspondence with the fields in the page table. Figure 4-12. A TLB to speed up paging.
  • 48. Inverted Page Tables • Traditional page tables, like the one described, require one entry per virtual page, since they are indexed by virtual page number. • If the address space consists of 2^32 bytes, with 4096 bytes per page, then over 1 million page table entries are needed. • As a bare minimum, the page table will have a size of 4 MB (doable). • On 64-bit computes, this situation changes drastically – If the address space is 2^64 bytes, with 4KB pages, we need a page table with 2^52 entries Figure 4-13. Comparison of a traditional page table with an inverted page table.
  • 49. Inverted Page Tables • If each page entry is 8 bytes, the table is over 30 million gigabytes. • Consequently, a different solution is needed for 64-bit paged virtual address spaces – One such solution is the inverted page table. Figure 4-13. Comparison of a traditional page table with an inverted page table.
  • 50. Inverted Page Tables • The inverted page table (IPT) is best thought of as an off- chip extension of the TLB which uses normal system RAM. Unlike a true page table, it is not necessarily able to hold all current mappings. The OS must be prepared to handle misses, just as it would with a MIPS-style software- filled TLB. • The IPT combines a page table and a frame table into one data structure. At its core is a fixed-size table with the number of rows equal to the number of frames in memory. If there are 4000 frames, the inverted page table has 4000 rows. For each row there is an entry for the virtual page number (VPN), the physical page number (not the physical address), some other data and a means for creating a collision chain, as we will see later.
  • 51. Inverted Page Tables • To search through all entries of the core IPT structure is inefficient, so we use a hash table mapping virtual addresses (and address space/PID information if need be) to an index in the IPT - this is where the collision chain is used. • This hash table is known as a hash anchor table. – The hashing function is not generally optimized for coverage - raw speed is more desirable. – Of course, hash tables experience collisions. – Due to this chosen hashing function, we may experience a lot of collisions in usage, so for each entry in the table the VPN is provided to check if it is the searched entry or a collision.
  • 52. Inverted Page Tables • In searching for a mapping, the hash anchor table is used. If no entry exists, a page fault occurs. – Otherwise, the entry is found. • Depending on the architecture, the entry may be placed in the TLB again and the memory reference is restarted, or the collision chain may be followed until it has been exhausted and a page fault occurs. • A virtual address in this schema could be split into two, – the first half being a virtual page number and the second half being the offset in that page.
  • 53. Inverted Page Tables • A major problem with this design is poor cache locality caused by the hash function. • Tree-based designs avoid this by placing the page table entries for adjacent pages in adjacent locations, but an inverted page table destroys spatial locality of reference by scattering entries all over. • An operating system may minimise the size of the hash table to reduce this problem, with the tradeoff being an increased miss rate.
  • 54. Page Replacement Algorithms • Optimal replacement • Not recently used (NRU) replacement • First-in, first-out (FIFO) replacement • Second chance replacement • Clock page replacement • Least recently used (LRU) replacement
  • 55. Page Replacement Algorithms • Page replacement algorithms decide which memory pages to page out (swap out, write to disk) when a page of memory needs to be allocated. • Paging happens when a page fault occurs and a free page cannot be used to satisfy the allocation, either because there are none, or because the number of free pages is lower than some threshold. • When the page that was selected for replacement and paged out is referenced again it has to be paged in (read in from disk), and this involves waiting for I/O completion. – This determines the quality of the page replacement algorithm: the less time waiting for page-ins, the better the algorithm.
  • 56. Page Replacement Algorithms • A page replacement algorithm looks at: – the limited information about accesses to the pages provided by hardware, – and tries to guess which pages should be replaced to minimize the total number of page misses, – while balancing this with the costs (primary storage and processor time) of the algorithm itself.
  • 57. The theoretically optimal page replacement algorithm • The theoretically optimal page replacement algorithm (also known as OPT, clairvoyant replacement algorithm, or Bélády's optimal page replacement policy) is an algorithm that works as follows: – when a page needs to be swapped in, the operating system swaps out the page whose next use will occur farthest in the future. – For example, a page that is not going to be used for the next 6 seconds will be swapped out over a page that is going to be used within the next 0.4 seconds. • This algorithm cannot be implemented in the general purpose operating system because it is impossible to compute reliably how long it will be before a page is going to be used, except when all software that will run on a system is either known beforehand and is amenable to the static analysis of its memory reference patterns, or only a class of applications allowing run-time analysis.
  • 58. Not recently used • At a certain fixed time interval, the clock interrupt triggers and clears the referenced bit of all the pages, so only pages referenced within the current clock interval are marked with a referenced bit. When a page needs to be replaced, the operating system divides the pages into four classes: 0. not referenced, not modified 1. not referenced, modified 2. referenced, not modified 3. referenced, modified • Although it does not seem possible for a page to be not referenced yet modified, this happens when a class 3 page has its referenced bit cleared by the clock interrupt. – The NRU algorithm picks a random page from the lowest category for removal. Note that this algorithm implies that a modified (within clock interval) but not referenced page is less important than a not modified page that is intensely referenced.
  • 59. First-in, first-out • The simplest page-replacement algorithm is a FIFO algorithm. • The first-in, first-out (FIFO) page replacement algorithm is a low-overhead algorithm that requires little book-keeping on the part of the operating system. • The idea is obvious from the name - the operating system keeps track of all the pages in memory in a queue, with the most recent arrival at the back, and the earliest arrival in front. • When a page needs to be replaced, the page at the front of the queue (the oldest page) is selected. – While FIFO is cheap and intuitive, it performs poorly in practical application. Thus, it is rarely used in its unmodified form. This algorithm experiences Bélády's anomaly.
  • 60. Second-chance • A modified form of the FIFO page replacement algorithm, known as the Second-chance page replacement algorithm, fares relatively better than FIFO at little cost for the improvement. • It works by looking at the front of the queue as FIFO does, but instead of immediately paging out that page, it checks to see if its referenced bit is set. – If it is not set, the page is swapped out. – Otherwise, the referenced bit is cleared, the page is inserted at the back of the queue (as if it were a new page) and this process is repeated. – This can also be thought of as a circular queue. – If all the pages have their referenced bit set, on the second encounter of the first page in the list, that page will be swapped out, as it now has its referenced bit cleared. – If all the pages have their reference bit set then second chance algorithm degenerates into pure FIFO.
  • 61. Second Chance Replacement Figure 4-14. Operation of second chance. (a) Pages sorted in FIFO order. (b) Page list if a page fault occurs at time 20 and A has its R bit set. The numbers above the pages are their loading times.
  • 62. Clock • Clock is a more efficient version of FIFO than Second- chance because pages don't have to be constantly pushed to the back of the list, but it performs the same general function as Second-Chance. • The clock algorithm keeps a circular list of pages in memory, with the "hand" (iterator) pointing to the last examined page frame in the list. • When a page fault occurs and no empty frames exist, then the R (referenced) bit is inspected at the hand's location. • If R is 0, the new page is put in place of the page the "hand" points to, otherwise the R bit is cleared. • Then, the clock hand is incremented and the process is repeated until a page is replaced.
  • 63. Clock Page Replacement Figure 4-15. The clock page replacement algorithm.
  • 64. Least recently used • The least recently used page (LRU) replacement algorithm, though similar in name to NRU, differs in the fact that LRU keeps track of page usage over a short period of time, while NRU just looks at the usage in the last clock interval. • LRU works on the idea that pages that have been most heavily used in the past few instructions are most likely to be used heavily in the next few instructions too. • While LRU can provide near-optimal performance in theory (almost as good as Adaptive Replacement Cache), it is rather expensive to implement in practice. • There are a few implementation methods for this algorithm that try to reduce the cost yet keep as much of the performance as possible.
  • 65. Least recently used • The most expensive method is the linked list method, which uses a linked list containing all the pages in memory. • At the back of this list is the least recently used page, and at the front is the most recently used page. • The cost of this implementation lies in the fact that items in the list will have to be moved about every memory reference, which is a very time-consuming process. • Another method that requires hardware support is as follows: suppose the hardware has a 64-bit counter that is incremented at every instruction. – Whenever a page is accessed, it gains a value equal to the counter at the time of page access. – Whenever a page needs to be replaced, the operating system selects the page with the lowest counter and swaps it out. – With present hardware, this is not feasible because the OS needs to examine the counter for every page in memory.
  • 66. Simulating LRU in Software (1) Read through pg 401 - 403 Figure 4-16. LRU using a matrix when pages are referenced in the order 0, 1, 2, 3, 2, 1, 0, 3, 2, 3.
  • 67. Simulating LRU in Software (2) Figure 4-17. The aging algorithm simulates LRU in software. Shown are six pages for five clock ticks. The five clock ticks are represented by (a) to (e).
  • 68. Design Issues for Paging Systems • Knowing the bare mechanics of paging is not enough. • To design a system, you have to know a lot more to make it work well. • In the following sections, we will look at other issues that OS designers must consider in order to get good performance from a paging system.
  • 69. The Working Set Model • The working set of a process is the set of pages expected to be used by that process during some time interval. • The "working set model" isn't a page replacement algorithm in the strict sense (it's actually a kind of medium-term scheduler) • Working set is a concept in computer science which defines what memory a process requires in a given time interval.
  • 70. The Working Set Model • The working set of information W(t, tau) of a process at time t to be the collection of information referenced by the process during the process time interval (t - tau, t). • Typically the units of information in question are considered to be memory pages. • This is suggested to be an approximation of the set of pages that the process will access in the future (say during the next tau time units), and more specifically is suggested to be an indication of what pages ought to be kept in main memory to allow most progress to be made in the execution of that process.
  • 71. The Working Set Model • The effect of choice of what pages to be kept in main memory (as distinct from being paged out to auxiliary storage) is important: – if too many pages of a process are kept in main memory, then fewer other processes can be ready at any one time. – If too few pages of a process are kept in main memory, then the page fault frequency is greatly increased and the number of active (non-suspended) processes currently executing in the system approaches zero. • The working set model states that a process can be in RAM if and only if all of the pages that it is currently using (often approximated by the most recently used pages) can be in RAM. • The model is an all or nothing model, meaning if the pages it needs to use increases, and there is no room in RAM, the process is swapped out of memory to free the memory for other processes to use.
  • 72. The Working Set Model • Often a heavily loaded computer has so many processes queued up that, if all the processes were allowed to run for one scheduling time slice, they would refer to more pages than there is RAM, causing the computer to "thrash". • By swapping some processes from memory, the result is that processes -- even processes that were temporarily removed from memory -- finish much sooner than they would if the computer attempted to run them all at once. • The processes also finish much sooner than they would if the computer only ran one process at a time to completion, – since it allows other processes to run and make progress during times that one process is waiting on the hard drive or some other global resource. • In other words, the working set strategy prevents thrashing while keeping the degree of multiprogramming as high as possible. Thus it optimizes CPU utilization and throughput.
  • 73. The Working Set Model • Thrashing? – describes a computer whose virtual memory subsystem is in a constant state of paging, rapidly exchanging data in memory for data on disk, to the exclusion of most application-level processing. – This causes the performance of the computer to degrade or collapse. The situation may not resolve itself quickly, but can continue indefinitely until the underlying cause is addressed.
  • 74. The Working Set Model k Figure 4-18. The working set is the set of pages used by the k most recent memory references. The function w(k, t) is the size of the working set at time t.
  • 75. Local versus Global Allocation Policies • In the preceding sections we have discussed several algorithms for choosing a page to replace when a fault occurs. • A major issue associated with this choice is how memory should be allocated among the competing runnable processes. • Local algorithms: – Allocate every process a fixed fraction of memory. • Global algorithms: – Dynamically allocate page frames among the runable processes – Thus the number of page frames assigned to each process varies in time.
  • 76. Page Fault Frequency Figure 4-20. Page fault rate as a function of the number of page frames assigned.
  • 77. Page Size • The page size is often a parameter that can be chosen by the OS. • Determining the best page size requires balancing several competing factors. • As a result, there is no overall optimum.
  • 78. Virtual Memory Interface • The use of virtual memory addressing (such as paging or segmentation) means that the kernel can choose what memory each program may use at any given time, allowing the operating system to use the same memory locations for multiple tasks. • If a program tries to access memory that isn't in its current range of accessible memory, but nonetheless has been allocated to it, the kernel will be interrupted in the same way as it would if the program were to exceed its allocated memory. • Under UNIX this kind of interrupt is referred to as a page fault.
  • 79. Distributed Shared Memory • Distributed Shared Memory (DSM) is a form of memory architecture where the (physically separate) memories can be addressed as one (logically shared) address space. • Here, the term shared does not mean that there is a single centralised memory but shared essentially means that the address space is shared (same physical address on two processors refers to the same location in memory)
  • 80. Segmentation • The virtual memory discussed so far is one- dimensional because the virtual addresses go from 0 to some maximum address, one address after another. • For many problems, having two or more separate virtual address spaces may be much better than having only one. • For example, a compiler has many tables that are built up as compilation proceeds…
  • 81. Segmentation (1) Examples of tables saved by a compiler … 1. The source text being saved for the printed listing (on batch systems). 2. The symbol table, containing the names and attributes of variables. 3. The table containing all the integer and floating-point constants used. 4. The parse tree, containing the syntactic analysis of the program. 5. The stack used for procedure calls within the compiler. These will vary in size dynamically during the compile process
  • 82. Segmentation (2) • Each of the first four tables grows continuously as compilation proceeds. • The last one grows and shrinks in unpredictable ways during compilation. • In a one-dimensional memory, these five tables would have to be allocated neighbouring chunks of virtual address space Figure 4-21. In a one-dimensional address space with growing tables, one table may bump into another.
  • 83. Segmentation • Consider what happens if a program has an exceptionally large number of variables but a normal amount of everything else. • The chunk of address space allocated for the symbol table may fill up, but there may be lots of room in the other tables. • A straightforward and extremely general solution is to provide the machine with many completely independent address spaces, called segments. • Each segment consists of a linear sequence of addresses, from 0 to some maximum.
  • 84. Segmentation (3) Figure 4-22. A segmented memory allows each table to grow or shrink independently of the other tables.
  • 85. Segmentation (4) ... Figure 4-23. Comparison of paging and segmentation.
  • 86. Segmentation (4) ... Figure 4-23. Comparison of paging and segmentation.
  • 87. Implementation of Pure Segmentation • The implementation of segmentation differs from paging in an essential way: – Pages as fixed size and segments are not. • Figure 4-24(a) shows an example of physical memory initially containing five segments. – Now consider what happens if segment 1 is evicted and segment 7, which is smaller, is put in its place. – We arrive at the memory configuration of (b). – Between segment 7 and segment 2 is an unused area – a hole. – Then segment 4 is replaced by segment 5 (as in (c)) – And segment 3 is replaced by segment 6, as in (d).
  • 88. Implementation of Pure Segmentation • After the system has been running for a while, memory will be divided up into a number of chunks, some containing segments and some containing holes. • This phenomenon, called checker-boarding or external fragmentation, wastes memory in the holes (can be dealt with by compaction (e)). Figure 4-24. (a)-(d) Development of checkerboarding. (e) Removal of the checkerboarding by compaction.
  • 89. Segmentation with Paging: The Intel Pentium See p415 - 420
  • 90. Overview of the MINIX 3 Process Manager • Memory management in MINIX 3 is simple: – Paging is not used at all. – Memory management doesn’t include swapping either. – MINIX 3 works on a system with limited physical memory. – In practice, memories are so large now that swapping is rarely needed. • A user-space server designated the process manager (or PM) does, however, exist. – It handles system calls relating to process management. – Of these some are intimately involved with memory management. – Process management also includes processing system calls related to signals, setting and examining process properties such as user and group ownership, and reporting CPU usage times. – The MINIX 3 process manager also handles setting and querying the real time clock.
  • 91. Memory Layout • In normal MINIX 3 operation, memory is allocated on two occasions. – First, when a process forks (the amount of memory needed by the child is allocated). – Second, when a process changes its memory image via the exec system call, the space occupied by the old image is returned to the free list as a hole, and memory is allocated for the new image. • The new image may be in a part of memory different from the released memory • Its location will depend upon where an adequate hole is found. • Memory is also released whenever a process terminates, either by exiting or by being killed by a signal.
• 92. Memory Layout (1) • The figure shows memory allocation during a fork and an exec. In (a) we see two processes, A and B, in memory. • If A forks, we get the situation of (b). The child is an exact copy of A. • If the child now execs the file C, the memory looks like (c). • The child’s image is replaced by C. Figure 4-30. Memory allocation (a) Originally. (b) After a fork. (c) After the child does an exec. The shaded regions are unused memory. The processes here use common I & D (combined instruction and data) space.
  • 93. Memory Layout (2) The data part of the image is enlarged by the amount specified in the bss field in the header Figure 4-31. (a) A program as stored in a disk file. (b) Internal memory layout for a single process. In both parts of the figure the lowest disk or memory address is at the bottom and the highest address is at the top.
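As a minimal illustration, the sketch below computes the size of the in-memory data part from an a.out-style header: the initialised data stored in the file plus the bss amount, which exists only in memory and is zeroed when the image is set up. The field names are conventional a.out-style names used for illustration, not the exact MINIX definitions.

    #include <stdio.h>

    struct image_header {
        long a_text;   /* size of the text (code) part stored in the file    */
        long a_data;   /* size of the initialised data stored in the file    */
        long a_bss;    /* size of uninitialised data; present only in memory */
    };

    /* The data segment in memory = initialised data from the file + zeroed bss. */
    long data_segment_size(const struct image_header *hdr)
    {
        return hdr->a_data + hdr->a_bss;
    }

    int main(void)
    {
        struct image_header hdr = { 4096, 1024, 2048 };
        printf("data part in memory: %ld bytes\n", data_segment_size(&hdr));  /* 3072 */
        return 0;
    }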
  • 94. Message Handling • The process manager is message driven. • After the system has been initialised – PM enters its main loop, which consists of waiting for a message, carrying out the request contained in the message, and sending a reply. • Two message categories may be received by the process manager. – For high priority communication between the kernel and system servers such as PM, a system notification message is used (these are special cases) – The majority of messages received by the process manager result from system calls originated by user processes. • For this category, the next figure gives a list of legal message types, input parameters and values sent back in the reply message.
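A hedged sketch of such a message-driven main loop is given below. It is a paraphrase, not the actual src/servers/pm/main.c code: the handler table, the message format, and the simulated requests are all assumptions used only to show the receive / dispatch / reply structure.

    #include <stdio.h>

    #define NCALLS 4

    static int do_fork(void) { puts("handling fork"); return 0; }
    static int do_exit(void) { puts("handling exit"); return 0; }

    /* One handler per system-call number; unknown slots stay NULL. */
    static int (*call_vec[NCALLS])(void) = { do_fork, do_exit };

    int main(void)
    {
        /* Simulated stream of incoming requests: (caller, call number). */
        int requests[][2] = { {7, 0}, {8, 1}, {9, 3} };

        for (int i = 0; i < 3; i++) {               /* stands in for "while (TRUE)" */
            int who     = requests[i][0];
            int call_nr = requests[i][1];
            int result;

            if (call_nr < 0 || call_nr >= NCALLS || call_vec[call_nr] == NULL)
                result = -1;                         /* unknown call: error reply   */
            else
                result = (*call_vec[call_nr])();     /* carry out the request       */

            printf("reply to %d: %d\n", who, result);   /* send the reply message */
        }
        return 0;
    }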
  • 95. Process Manager Data Structures and Algorithms (1) ... Figure 4-32. The message types, input parameters, and reply values used for communicating with the PM.
  • 96. Process Manager Data Structures and Algorithms (2) ... Figure 4-32. The message types, input parameters, and reply values used for communicating with the PM.
• 97. Processes in Memory and Shared Text • See p428-431 • “The PM’s process table is called mproc and its definition is given in src/servers/pm/mproc.h” • It contains all the fields related to a process’s memory allocation, as well as some additional items. • The most important field is the array mp_seg, which holds the process’s memory map. • The remaining fields are covered in the text.
• 98. The Hole List • The other major process manager data structure is the hole table, hole, defined in src/servers/pm/alloc.c, which lists every hole in memory in order of increasing memory address. • The gaps between the data and stack segments of processes are not considered holes; that memory has already been allocated to the processes. Figure 4-35. The hole list is an array of struct hole.
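A simplified sketch of a hole list with first-fit allocation, in the spirit of the description above, is shown below. The type names and the sample holes are assumptions for illustration; this is not the actual MINIX 3 alloc.c code.

    #include <stdio.h>
    #include <stddef.h>

    typedef unsigned long clicks_t;   /* memory measured in clicks (allocation units) */

    struct hole {
        struct hole *h_next;   /* next hole, in order of increasing base address */
        clicks_t h_base;       /* where the hole starts                          */
        clicks_t h_len;        /* length of the hole in clicks                   */
    };

    static struct hole holes[2] = {
        { &holes[1], 100, 50  },   /* hole at click 100, 50 clicks long  */
        { NULL,      300, 200 },   /* hole at click 300, 200 clicks long */
    };
    static struct hole *hole_head = &holes[0];

    /* First fit: walk the list and carve the request out of the first hole big
     * enough; return the base of the allocated block, or (clicks_t)-1 on failure. */
    clicks_t alloc_mem(clicks_t clicks)
    {
        for (struct hole *hp = hole_head; hp != NULL; hp = hp->h_next) {
            if (hp->h_len >= clicks) {
                clicks_t base = hp->h_base;
                hp->h_base += clicks;   /* shrink the hole from the bottom         */
                hp->h_len  -= clicks;
                return base;            /* a now-empty hole could be unlinked here */
            }
        }
        return (clicks_t)-1;            /* no hole large enough */
    }

    int main(void)
    {
        printf("allocated at click %lu\n", alloc_mem(40));    /* 100 */
        printf("allocated at click %lu\n", alloc_mem(100));   /* 300 */
        return 0;
    }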
  • 99. FORK System Call • When processes are created or destroyed, memory must be allocated or deallocated. • Also, the process table must be updated, including the parts held by the kernel and FS. • The PM coordinates this activity. Figure 4-36. The steps required to carry out the fork system call.
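To make the flow concrete, here is a hedged C sketch of the PM's side of fork. It is one plausible ordering of the work described above (allocate memory for the child, copy the parent's image, update the process table, and inform the kernel and FS before replying); every helper function name is hypothetical, not a real MINIX routine. The only detail taken as given is the standard UNIX convention that the parent's reply carries the child's PID while the child's reply carries 0.

    #include <stdio.h>

    static int  proc_table_has_room(void)       { return 1; }
    static long allocate_child_memory(void)     { return 100; }  /* base in clicks  */
    static void copy_parent_image(long base)    { (void)base; }
    static int  fill_child_slot(void)           { return 5; }    /* child slot no.  */
    static int  choose_pid(void)                { return 42; }
    static void tell_kernel_and_fs(int pid)     { (void)pid; }
    static void reply(const char *who, int val) { printf("reply to %s: %d\n", who, val); }

    int pm_do_fork(void)
    {
        if (!proc_table_has_room()) return -1;   /* is there a free process slot?    */
        long base = allocate_child_memory();     /* allocate the child's data+stack  */
        if (base < 0) return -1;
        copy_parent_image(base);                 /* child starts as a copy of parent */
        int slot = fill_child_slot();            /* new table entry and memory map   */
        (void)slot;
        int pid = choose_pid();                  /* pick a PID for the child         */
        tell_kernel_and_fs(pid);                 /* kernel and FS update their parts */
        reply("parent", pid);                    /* parent learns the child's PID    */
        reply("child", 0);                       /* the child gets 0                 */
        return 0;
    }

    int main(void) { return pm_do_fork() == 0 ? 0 : 1; }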
  • 100. EXEC System Call (1) • EXEC is the most complex system call in MINIX 3. • It must replace the current memory image with a new one, including setting up a new stack. • The new image must be a binary executable file, of course. • Exec carries out its job in a series of steps: Figure 4-37. The steps required to carry out the exec system call.
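Below is a hedged outline of exec based only on what the slide states: the old image is released, a new image is built from a binary executable file, and a new stack is set up. All helper names are hypothetical and stand in for the real steps of Figure 4-37.

    #include <stdio.h>

    static int  file_is_executable(const char *path) { (void)path; return 1; }
    static void release_old_image(void)              { puts("old image freed"); }
    static long allocate_new_image(const char *path) { (void)path; return 200; }
    static void build_initial_stack(long base)       { (void)base; puts("stack set up"); }

    int pm_do_exec(const char *path)
    {
        if (!file_is_executable(path))     /* the new image must be a binary executable */
            return -1;
        release_old_image();               /* old image returned to the free list        */
        long base = allocate_new_image(path);   /* the new image may land elsewhere      */
        if (base < 0)
            return -1;
        build_initial_stack(base);         /* arguments and environment go on the stack  */
        return 0;                          /* the process resumes in the new image       */
    }

    int main(void) { return pm_do_exec("/bin/ls"); }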
  • 101. Signal Handling (1) See p438 - 446 Figure 4-40. Three phases of dealing with signals.
  • 102. Signal Handling (2) Figure 4-41. The sigaction structure.
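For reference, here is a standard POSIX usage example of the sigaction structure shown in Figure 4-41: it fills in sa_handler, sa_mask, and sa_flags and installs a handler for SIGINT. This is ordinary POSIX application code rather than MINIX-internal code.

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_int(int sig)
    {
        (void)sig;
        write(1, "caught SIGINT\n", 14);   /* async-signal-safe output */
    }

    int main(void)
    {
        struct sigaction sa;

        sa.sa_handler = on_int;        /* function to run when the signal arrives       */
        sigemptyset(&sa.sa_mask);      /* no extra signals blocked during the handler   */
        sa.sa_flags = 0;               /* default behaviour                             */

        if (sigaction(SIGINT, &sa, NULL) < 0) {
            perror("sigaction");
            return 1;
        }
        pause();                       /* wait until a signal is delivered */
        return 0;
    }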
• 103. Signal Handling (3) Figure 4-42. Signals defined by POSIX and MINIX 3. Signals indicated by (*) depend on hardware support. Signals marked (M) are not defined by POSIX, but are defined by MINIX 3 for compatibility with older programs. Kernel signals are MINIX 3-specific signals generated by the kernel and are used to inform system processes about system events. Several obsolete names and synonyms are not listed here.
• 104. Signal Handling (4) Figure 4-42. Signals defined by POSIX and MINIX 3. Signals indicated by (*) depend on hardware support. Signals marked (M) are not defined by POSIX, but are defined by MINIX 3 for compatibility with older programs. Kernel signals are MINIX 3-specific signals generated by the kernel and are used to inform system processes about system events. Several obsolete names and synonyms are not listed here.
• 105. IMPLEMENTATION OF THE MINIX 3 PROCESS MANAGER • Read through and generally grasp the detail on p447 - 475.