Operating System
|
|
|
|
Is a program that controls the
execution of application programs |
|
OS must relinquish control to user
programs and regain it safely and efficiently |
|
Tells the CPU when to execute other
programs |
|
Is an interface between the user and
hardware |
|
Masks the details of the hardware to
application programs |
|
Hence OS must deal with hardware
details |
Evolution of an Operating
System
|
|
|
|
Must adapt to hardware upgrades and new
types of hardware. Examples: |
|
Character vs graphic terminals |
|
Introduction of paging hardware |
|
Must offer new services, eg: internet
support |
|
The need to change the OS on a regular
basis places requirements on its design |
|
modular construction with clean
interfaces |
|
object oriented methodology |
Simple Batch Systems
|
|
|
Are the first operating systems
(mid-50s) |
|
The user submits a job (written on cards
or tape) to a computer operator |
|
The computer operator places a batch of
several jobs on an input device |
|
A special program, the monitor, manages
the execution of each program in the batch |
|
The resident monitor is in main memory and
available for execution |
|
Monitor utilities are loaded when
needed |
The Monitor
|
|
|
|
Monitor reads jobs one at a time from
the input device |
|
Monitor places a job in the user
program area |
|
A monitor instruction branches to the
start of the user program |
|
Execution of the user program continues until: |
|
end-of-pgm occurs |
|
error occurs |
|
Either event causes the CPU to fetch its next
instruction from the monitor |
|
OS: |
|
Alternates execution between user
program and the monitor program |
|
Relies on available hardware to
effectively alternate execution between various parts of memory |
|
|
|
|
Desirable Hardware
Features
|
|
|
|
Memory protection |
|
do not allow the memory area containing
the monitor to be altered by user programs |
|
Timer |
|
prevents a job from monopolizing the
system |
|
an interrupt occurs when time expires |
|
Privileged instructions |
|
can be executed only by the monitor |
|
an interrupt occurs if a program tries
these instructions |
|
Interrupts |
|
provide flexibility for relinquishing
control to, and regaining control from, user programs |
|
manage resource contention |
|
either hardware support through special
instructions (lock, swap, …) |
|
or software support: semaphores, monitors,
… |
Multiprogrammed Batch
Systems
|
|
|
I/O operations are exceedingly slow
(compared to instruction execution) |
|
A program containing even a very small
number of I/O operations will spend most of its time waiting for them |
|
Hence: poor CPU usage when only one
program is present in memory |
Multiprogrammed Batch
Systems
|
|
|
If memory can hold several programs,
then the CPU can switch to another one whenever a program is waiting for an I/O
operation to complete |
|
This is multitasking (multiprogramming) |
Example:
Three Jobs Submitted
|
|
|
Almost no contention for resources |
|
All 3 can run in minimum time in a
multitasking environment (assuming JOB2/3 have enough CPU time to keep their
I/O operations active) |
Advantages of
Multiprogramming
Time Sharing Systems
(TSS)
|
|
|
Batch multiprogramming does not support
interaction with users |
|
TSS extends multiprogramming to handle
multiple interactive jobs |
|
Processor’s time is shared among
multiple users |
|
Multiple users simultaneously access
the system through terminals |
|
Because of slow human reaction time, a
typical user needs 2 sec of processing time per minute |
|
Then about 30 users (60 s ÷ 2 s per user) should be able to
share the same system without noticeable delay in the computer's reaction time |
|
The file system must be protected
(multiple users…) |
|
|
Pseudo-Parallelism
|
|
|
|
|
Operating systems strive to do many
things at the same time |
|
Run a user program |
|
Read from or write to a disk |
|
Print on a printer |
|
Would like to keep as many resources
busy as possible to increase the system’s efficiency |
|
Multi-programming or time-sharing |
|
CPU switches from program to program |
|
Switch when |
|
A time slice (tens or hundreds of
milliseconds) passes, or the process performs an I/O operation |
|
Gives the appearance of concurrency |
|
Each user feels as if they have
exclusive access to the machine |
|
Process model supports this approach |
Process
|
|
|
|
|
|
Introduced to obtain a systematic way
of monitoring and controlling program execution |
|
A process is an executable program
with: |
|
associated data (variables, buffers…) |
|
execution context: i.e. all the information that |
|
the CPU needs to execute the process
(the content of the processor registers) |
|
the OS needs to manage the process: |
|
priority of the process |
|
the event (if any) after which the
process is waiting |
|
other data (that we will introduce
later) |
Process Model
|
|
|
|
|
Process model: states |
|
Running |
|
Blocked |
|
Ready |
|
The scheduler (kernel) is the lowest level of the operating system |
|
Handles all the interrupts |
|
Does context switching |
|
Handles interprocess communication |
|
The rest of the OS sits on top of the scheduler
(kernel), implemented as processes |
|
For example |
|
When a disk interrupt occurs |
|
The system stops running the current
process |
|
Makes the disk process, which was
blocked waiting for the interrupt, ready to run |
|
Runs another process (possibly resuming
the original process) |
Processes and Interrupt
Handling Implementation
|
|
|
To implement processes the OS maintains
a process table |
|
Examples of important information in
the table |
|
Process management       Memory management      File management
Registers                Pointers to segments   Root directory
Program counter          Process id             Working directory
Stack pointer            Exit status            File descriptors
Process start time       ....                   ....
CPU time used
Message queue pointers
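
As a concrete illustration, one entry of this table might look like the
following C structure; this is a sketch only, and all field names and sizes
are assumptions, not from the course: |

    /* Hypothetical process table entry (PCB); grouping is illustrative. */
    typedef struct pcb {
        /* process management */
        unsigned long registers[16];     /* saved general-purpose registers */
        unsigned long program_counter;
        unsigned long stack_pointer;
        int           pid;               /* process id */
        int           exit_status;
        long          start_time;        /* process start time */
        long          cpu_time_used;
        struct msg   *msg_queue;         /* message queue pointers */
        /* memory management */
        void *text_seg, *data_seg, *stack_seg;   /* pointers to segments */
        /* file management */
        int           root_dir;          /* root directory handle */
        int           working_dir;       /* working directory handle */
        int           fds[20];           /* open file descriptors */
    } pcb_t;
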
|
|
|
Interrupt handling is accomplished as
follows |
|
1. Interrupt: hardware stacks the program
counter, etc., of the current program |
|
2. Hardware loads new program counter
from interrupt vector (table) |
|
3. Assembly language routine saves
registers in process table (above) |
|
4. Assembly language routine sets up a
new stack for the C handler |
|
5. C procedure marks interrupt service
process as ready in the above table |
|
6. Scheduler decides which process to
run next (choose highest priority) |
|
7. C procedure returns to assembly code |
|
8. Assembly language routine starts
up the next process |
Process Scheduling
|
|
|
|
|
Criteria for scheduling |
|
Fairness: make sure each process gets
its fair share of the CPU |
|
Efficiency: keep the CPU busy 100% of
the time |
|
Response time: minimize response times
for interactive users |
|
Turnaround: minimize the time batch
users must wait for output |
|
Throughput: maximize the number of jobs
processed per hour |
|
Notes: |
|
Any scheduling algorithm will favor
some class of processes over others |
|
Need to find a balance of interests for
each specific environment |
|
Maximize throughput |
|
Minimize response times |
|
Some combination of the two |
Scheduling Methods
|
|
|
|
|
Examples of scheduling methods |
|
Preemptive |
|
Willing to context-switch out the current
job if a higher-priority job is ready to run (vs. non-preemptive) |
|
Round robin |
|
Define a time quantum, say x
milliseconds |
|
Preempt the current process every time
quantum |
|
Run the next ready to run process |
|
Priority |
|
Assign priorities to processes |
|
Run process with the highest priority |
|
Shortest job first |
|
If you know how long programs will take to
run, you can use this information to maximize throughput |
|
Run the shortest job first! |
|
Shortest job first also gives the
minimum average response time (this can be proven) |
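
A small worked example of this claim (the job lengths 9, 3 and 6 are made
up): with all jobs present at time zero, running shortest-first lowers the
average completion (turnaround) time from 13 to 10 time units: |

    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b) {
        return *(const int *)a - *(const int *)b;
    }

    /* mean completion time when jobs run back to back in the given order */
    static double avg_completion(const int *run, int n) {
        double t = 0, sum = 0;
        for (int i = 0; i < n; i++) { t += run[i]; sum += t; }
        return sum / n;
    }

    int main(void) {
        int jobs[] = {9, 3, 6};               /* hypothetical job lengths */
        printf("submission order: %.1f\n", avg_completion(jobs, 3)); /* 13.0 */
        qsort(jobs, 3, sizeof jobs[0], cmp);  /* shortest job first */
        printf("SJF order:        %.1f\n", avg_completion(jobs, 3)); /* 10.0 */
        return 0;
    }
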
Concurrency
|
|
|
|
|
What is concurrency |
|
Permit many activities to proceed in
parallel |
|
The activities synchronize and
communicate at well defined points |
|
Why concurrency? |
|
Easier to describe a system using
several independent cohesive activities than to have one monolithic coupled
function |
|
Link the activities using well-defined
interaction mechanisms |
|
Exploit multiple resources |
|
For example: many processors |
|
How can we time share between
activities? |
|
Need to switch between concurrent
activities |
|
Switching is referred to as context
switching |
|
Examples of concurrent programming
paradigms |
|
Event-driven environments |
|
Co-routines |
|
Threads |
Event-driven
Environments:
Window Interface Environments
|
|
|
|
|
Windowing environments |
|
Callback system |
|
Runtime environment catches events |
|
Mouse clicks |
|
Interprocess messages |
|
User defined events |
|
Application programs |
|
Define/register events with the runtime |
|
Associate each event with a user
written subprogram |
|
When an event occurs the runtime |
|
Causes the corresponding subprogram to
execute |
|
Subprogram may cause other
events, then returns control to the runtime system |
|
Runtime may or may not be
multi-threaded |
|
If it is, it could associate each of the
subprograms with a new thread |
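
A minimal sketch of such a callback mechanism in C; the event numbering,
names, and dispatch function are illustrative assumptions, not a real
windowing API: |

    #include <stdio.h>

    typedef void (*handler_t)(void *arg);   /* a user-written subprogram */

    #define MAX_EVENTS 16
    static handler_t handlers[MAX_EVENTS];  /* event -> subprogram table */

    /* the application registers an event with the runtime */
    static void register_handler(int event, handler_t h) {
        if (event >= 0 && event < MAX_EVENTS)
            handlers[event] = h;
    }

    /* the runtime catches an event and causes the subprogram to execute */
    static void dispatch(int event, void *arg) {
        if (event >= 0 && event < MAX_EVENTS && handlers[event])
            handlers[event](arg);    /* control returns here afterwards */
    }

    static void on_click(void *arg) {
        printf("mouse click at %s\n", (const char *)arg);
    }

    int main(void) {
        register_handler(0, on_click);  /* event 0: mouse click (assumed id) */
        dispatch(0, "(10, 20)");        /* the runtime delivers the event */
        return 0;
    }
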
Concurrency:
Co-routines
|
|
|
|
|
|
Co-routines, a simple strategy |
|
When the current activity is ready to
transfer control it: |
|
Saves its context locally |
|
Context includes the registers: stack pointer,
base pointer, instruction pointer, flags, etc. |
|
Branches to an entry point of the next
routine |
|
The first instructions of the called
routine restore its context and resume execution |
|
The schedule is often well defined and
hard coded |
|
Could also use an intermediate routine
to implement a scheduler |
|
Co-routines with a user implemented
scheduler |
Co-routines (cont.)
|
|
|
|
|
|
A routine gives up control explicitly
by branching to another routine |
|
In the example: |
|
Each routine is only aware of one entry
point for each other routine |
|
There could be more |
|
The scheduler decides on the order of
execution |
|
Could describe the scheduler using a
finite state machine/state transition diagram |
|
These details could be hard-coded into
the routines |
|
Advantage: faster execution |
|
Disadvantage: more difficult to
maintain |
|
Communication can take place via
variables in a data segment |
|
Co-routines of this nature are usually
written in assembler |
|
Need access to registers to save
context and do branching |
|
A single stack used in a high level
language would get in the way |
|
Why? We have not explicitly
distinguished threads of control |
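
For comparison, a minimal co-routine sketch using the (obsolescent but
widely available) POSIX ucontext API; it avoids the single-stack problem by
giving each co-routine its own stack, and each transfer of control is an
explicit branch, as described above: |

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, co_a, co_b;
    static char stack_a[16384], stack_b[16384];

    static void task_a(void) {
        puts("A: part 1");
        swapcontext(&co_a, &co_b);    /* save own context, branch to B */
        puts("A: part 2");
        swapcontext(&co_a, &main_ctx);
    }

    static void task_b(void) {
        puts("B: part 1");
        swapcontext(&co_b, &co_a);    /* branch back to A */
    }

    int main(void) {
        getcontext(&co_a);            /* give each co-routine its own stack */
        co_a.uc_stack.ss_sp = stack_a;
        co_a.uc_stack.ss_size = sizeof stack_a;
        co_a.uc_link = &main_ctx;
        makecontext(&co_a, task_a, 0);

        getcontext(&co_b);
        co_b.uc_stack.ss_sp = stack_b;
        co_b.uc_stack.ss_size = sizeof stack_b;
        co_b.uc_link = &main_ctx;
        makecontext(&co_b, task_b, 0);

        swapcontext(&main_ctx, &co_a);   /* hard-coded schedule: start with A */
        return 0;
    }
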
Threads
|
|
|
|
A thread has an execution state (running, ready,
etc.) |
|
Saves thread context when not running |
|
Has an execution stack and some
per-thread static storage for local variables |
|
Has access to the memory address space
and resources of its process |
|
all threads of a process share this |
|
when one thread alters a (non-private)
memory item, all other threads (of the process) see the change |
|
a file opened by one thread is
available to the others |
Benefits of Threads vs
Processes
|
|
|
Takes less time to create a new thread
than a process |
|
Less time to terminate a thread than a
process |
|
Less time to switch between two threads
within the same process |
|
Since threads within the same process
share memory and files, they can communicate with each other without invoking
the kernel |
|
It is therefore necessary to synchronize the
activities of the various threads so that they do not obtain inconsistent views
of the data |
Benefits of Threads
|
|
|
Example 1: a file server on a LAN |
|
It needs to handle several file
requests over a short period |
|
Hence it is more efficient to create (and
destroy) a thread for each request |
|
On an SMP machine, multiple threads can
execute simultaneously on different processors |
|
Example 2: one thread displays the menu and
reads user input while the other threads execute user commands |
Problem: Inconsistent
Data Views
|
|
|
|
Three variables A, B, and C are shared
by thread T1 and thread T2 |
|
T1 computes C = A+B |
|
T2 transfers amount X from A to B |
|
T2 must do: A = A-X and B = B+X (so
that A+B is unchanged) |
|
But if T1 computes A+B after T2 has
done A = A-X but before B = B+X, T1 will not obtain the correct result for C
= A + B |
|
Solution: critical sections (later) |
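
A sketch of this race with two POSIX threads; the values 100, 200, and
X = 50 are made up, and the bad interleaving is rare in practice but
possible: |

    #include <pthread.h>
    #include <stdio.h>

    static long A = 100, B = 200, C;

    static void *t1(void *arg) {
        C = A + B;                 /* may run between T2's two updates */
        return NULL;
    }

    static void *t2(void *arg) {
        long X = 50;
        A = A - X;                 /* if T1 reads here: A = 50, B = 200 */
        B = B + X;
        return NULL;
    }

    int main(void) {
        pthread_t h1, h2;
        pthread_create(&h2, NULL, t2, NULL);
        pthread_create(&h1, NULL, t1, NULL);
        pthread_join(h1, NULL);
        pthread_join(h2, NULL);
        printf("C = %ld (should be 300; may be 250)\n", C);
        return 0;
    }
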
Threads States
|
|
|
Three key states: running, ready,
blocked |
|
They have no suspend state because all
threads within the same process share the same address space |
|
Indeed: suspending (i.e., swapping out) a
single thread involves suspending all threads of the same process |
|
Termination of a process terminates
all threads within the process |
User-Level Threads (ULT)
|
|
|
The kernel is not aware of the
existence of threads |
|
All thread management is done by the
application by using a thread library |
|
Thread switching does not require
kernel mode privileges (no mode switch) |
|
Scheduling is application specific |
Threads library
|
|
|
|
Contains code for: |
|
creating and destroying threads |
|
passing messages and data between
threads |
|
scheduling thread execution |
|
saving and restoring thread contexts |
Kernel activity for
User-Level Threads
|
|
|
The kernel is not aware of thread
activity but it is still managing process activity |
|
When a thread makes a system call, the
whole process will be blocked |
|
but for the thread library that thread
is still in the running state |
|
So thread states are independent of
process states |
Advantages and
Inconveniences of ULT
|
|
|
|
Advantages |
|
Thread switching does not involve the
kernel: no mode switching |
|
Scheduling can be application specific:
choose the best algorithm. |
|
ULTs can run on any OS; only a
thread library is needed |
|
Inconveniences |
|
Most system calls are blocking and the
kernel blocks processes. So all threads within the process will be blocked |
|
The kernel can only assign processes to
processors. Two threads within the same process cannot run simultaneously on
two processors |
Kernel-Level Threads
(KLT)
|
|
|
All thread management is done by the kernel |
|
No thread library but an API to the
kernel thread facility |
|
Kernel maintains context information
for the process and the threads |
|
Switching between threads requires the
kernel |
|
Scheduling on a thread basis |
|
Ex: Windows NT and OS/2 |
Advantages and
Inconveniences of KLT
|
|
|
|
Advantages |
|
the kernel can simultaneously schedule
many threads of the same process on many processors |
|
blocking is done on a thread level |
|
kernel routines can be multithreaded |
|
Inconveniences |
|
thread switching within the same
process involves the kernel. We have 2 mode switches per thread switch |
|
this results in a significant slowdown |
Combined ULT/KLT
Approaches
|
|
|
Thread creation done in the user space |
|
Bulk of scheduling and synchronization
of threads done in the user space |
|
The programmer may adjust the number of
KLTs |
|
May combine the best of both approaches |
|
Example is Solaris |
Language with Support for
Concurrency: Ada
|
|
|
|
|
|
Explicit language support for
concurrency |
|
Has a programming construct named a task |
|
Each task is a thread with its own
stack |
|
Tasks are objects with methods named entries |
|
    task Stack is
       entry Init(....);
       entry Push(...);
       entry Pop(...);
    end Stack;

    task body Stack is
    begin
       loop
          select
             accept Init(....) do
                ......
             end Init;
          or
             accept Push(...) do
                .....
             end Push;
             ....
          end select;
          -- other work ....
       end loop;
    end Stack;
Ada Rendezvous
|
|
|
|
|
Interaction mechanism |
|
Rendezvous |
|
Looks like a procedure call in the
source code |
|
Stack.Init(...), Stack.Push(....) |
|
Rendezvous as a call-accept-release
sequence |
|
The serving task can only serve one
caller at a time; if it is busy when a call arrives, the calling task is placed
in a queue for the entry |
Stacks in a Concurrent
Program
|
|
|
|
|
What does the stack look like in a
concurrent program? |
|
No longer a single stack |
|
Each thread needs its own stack |
|
Stack space can be allocated from a
heap |
|
Issue becomes complex in a language
like Ada with sophisticated scoping rules |
|
Task A specification and body |
|
Local variable a |
|
Task B specification and body |
|
Task C specification and body |
|
Procedure D |
|
End Task A |
|
Local variables are placed on the
stack; how are they linked together? |
|
A complex multi-linked list data
structure called a cactus stack! |
|
What would the assembler sequences look
like for accessing a local variable of the main program from task C? |
PThreads (POSIX Threads)
|
|
|
|
Instead of extending the language, a
package of routines is used |
|
Environment usually initialized using a
subroutine call in the main environment |
|
Interaction mechanisms |
|
Shared memory |
|
Mutex for synchronization |
|
Fork and join primitives |
|
Examples of operations: |
|
pthread_create(thread_id, attributes,
start_routine, argument_for_start_routine) |
|
pthread_yield(void) |
|
pthread_mutex_init(mutex, attributes) |
|
pthread_mutex_lock(mutex) |
|
pthread_mutex_unlock(mutex) |
|
Others..... |
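
A minimal usage sketch of these operations: two threads increment a shared
counter, with the mutex guaranteeing a consistent result: |

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long counter = 0;

    static void *worker(void *arg) {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);     /* pthread_mutex_lock as above */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);  /* last arg goes to worker */
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld\n", counter);       /* always 200000 */
        return 0;
    }
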
Problems with Concurrent
Execution
|
|
|
Concurrent processes (or threads) often
need to share data (maintained either in shared memory or files) and
resources |
|
If there is no controlled access to
shared data, some processes will obtain an inconsistent view of this data |
|
The action performed by concurrent
processes will then depend on the order in which their execution is
interleaved |
An Example
|
|
|
Processes P1 and P2 run the same
procedure and have access to the same variable “a” |
|
Processes can be interrupted anywhere |
|
If P1 is first interrupted after user
input and P2 executes entirely |
|
Then the character echoed by P1 will be
the one read by P2 !! |
The Critical Section
Problem
|
|
|
When a process executes code that
manipulates shared data (or a resource), we say that the process is in its critical
section (CS) (for that shared data) |
|
The execution of critical sections must
be mutually exclusive: at any time, only one process is allowed to execute in
its critical section (even with multiple CPUs) |
|
Each process must therefore request permission to enter its critical section
(CS) |
The Critical Section
Problem
|
|
|
The section of code implementing this
request is called the entry section |
|
The critical section (CS) might be
followed by an exit section |
|
The remaining code is the remainder
section |
|
The critical section problem is to
design a protocol that the processes can use so that their action will not
depend on the order in which their execution is interleaved (possibly on many
processors) |
Framework for Analysis of
Solutions
|
|
|
Each process executes at nonzero speed,
but no assumption is made about the relative speed of the n processes |
|
General structure of a process: |
|
many CPUs may be present, but the memory
hardware prevents simultaneous access to the same memory location |
|
No assumption about order of
interleaved execution |
|
For solutions: we need to specify entry
and exit sections |
|
|
Software Solutions:
Algorithm 1
|
|
|
The shared variable turn is initialized
(to 0 or 1) before executing any Pi |
|
Pi’s critical section is executed iff
turn = i |
|
Pi is busy waiting if Pj is in CS:
mutual exclusion is satisfied |
|
The progress requirement is not satisfied
since the algorithm requires strict alternation of CSs |
|
Ex: if turn = 0 and P0 is in its RS, P1
can never enter its CS |
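
A sketch of Algorithm 1 in C; volatile merely stands in for the atomicity
and ordering guarantees that real hardware would need: |

    volatile int turn;           /* shared; initialized to 0 or 1 beforehand */

    void alg1_process(int i) {   /* i is 0 or 1 */
        int j = 1 - i;
        for (;;) {
            while (turn != i)    /* entry section: busy wait for our turn */
                ;
            /* critical section */
            turn = j;            /* exit section: hand the turn over */
            /* remainder section (RS) */
        }
    }
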
Software
Solutions:
Algorithm 2
|
|
|
|
Keep one Boolean variable for each process: flag[0] and flag[1] |
|
Pi signals that it is ready to enter
its CS by setting flag[i] := true |
|
Mutual Exclusion is satisfied but not
the progress requirement |
|
If we have the sequence: |
|
T0: flag[0]:=true |
|
T1: flag[1]:=true |
|
Both processes will wait forever to enter
their CS: we have a deadlock |
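
A sketch of Algorithm 2 in C, with the deadlocking interleaving marked
(volatile is again only illustrative): |

    volatile int flag[2] = {0, 0};   /* flag[i]: Pi is ready to enter its CS */

    void alg2_process(int i) {
        int j = 1 - i;
        flag[i] = 1;                 /* T0/T1: both may set their flags... */
        while (flag[j])              /* ...and then both spin here: deadlock */
            ;
        /* critical section */
        flag[i] = 0;
    }
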
|
|
Software
Solutions:
Algorithm 3
|
|
|
Initialization: flag[0] := flag[1] := false; turn := 0 or 1 |
|
Willingness to enter the CS is specified by
flag[i] := true |
|
If both processes attempt to enter
their CS simultaneously, only one turn value will last |
|
Exit section: specifies that Pi is
unwilling to enter CS |
|
This algorithm is also known as
“Peterson’s Algorithm” |
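
A sketch of Peterson's algorithm in C; as before, volatile is only
illustrative, and modern CPUs would additionally need memory barriers: |

    volatile int flag[2] = {0, 0};
    volatile int turn = 0;

    void enter_cs(int i) {
        int j = 1 - i;
        flag[i] = 1;                  /* willingness to enter the CS */
        turn = j;                     /* defer to the other process */
        while (flag[j] && turn == j)  /* only one turn value survives */
            ;
    }

    void exit_cs(int i) {
        flag[i] = 0;                  /* exit section: no longer willing */
    }
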
Semaphores
|
|
|
|
A synchronization tool (provided by the
OS) that does not require busy waiting |
|
A semaphore S is an integer variable
that, apart from initialization, can only be accessed through 2 atomic and
mutually exclusive operations: |
|
wait(S) |
|
signal(S) |
|
To avoid busy waiting: when a process
has to wait, it will be put in a blocked queue of processes waiting for the
same event |
Semaphores
|
|
|
Hence, in fact, a semaphore is a record
(structure) holding a counter and a queue of blocked processes: |
|
When a process must wait for a
semaphore S, it is blocked and put on the semaphore’s queue |
|
The signal operation removes (according
to a fair policy like FIFO) one process from the queue and puts it in the
list of ready processes |
Semaphore Operations
(atomic)
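
The original figure is not reproduced here; the following C sketch emulates
the two atomic operations with a POSIX mutex and condition variable. Unlike
the record above, this variant keeps count non-negative, so the number of
blocked processes is tracked by the condition queue rather than by a
negative count: |

    #include <pthread.h>

    typedef struct {
        int             count;   /* the integer from the record above */
        pthread_mutex_t lock;    /* makes wait/signal mutually exclusive */
        pthread_cond_t  queue;   /* where blocked processes wait */
    } sem;

    void wait_op(sem *s) {
        pthread_mutex_lock(&s->lock);
        while (s->count == 0)                      /* block, don't busy wait */
            pthread_cond_wait(&s->queue, &s->lock);
        s->count--;
        pthread_mutex_unlock(&s->lock);
    }

    void signal_op(sem *s) {
        pthread_mutex_lock(&s->lock);
        s->count++;
        pthread_cond_signal(&s->queue);   /* move one waiter to the ready list */
        pthread_mutex_unlock(&s->lock);
    }
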
Semaphores: Observations
|
|
|
|
When S.count >=0: the number of processes that can execute
wait(S) without being blocked = S.count |
|
When S.count < 0: the number of
processes waiting on S is |S.count| |
|
Atomicity and mutual exclusion: no 2
processes can be in wait(S) and signal(S) (on the same S) at the same time
(even with multiple CPUs) |
|
This is a critical section problem:
critical sections are blocks of code defining wait(S) and signal(S) |
|
Hence: the critical sections here are
very short: typically 10 instructions |
|
Solutions: |
|
uniprocessor: disable interrupts during
these operations (i.e., for a very short period). This does not work on a
multiprocessor machine. |
|
multiprocessor: use previous software
or hardware schemes. The amount of busy waiting should be small. |
Semaphores and Critical
Sections
|
|
|
For n processes |
|
Initialize S.count to 1 |
|
Then only 1 process is allowed into CS
(mutual exclusion) |
|
To allow k processes into CS, we
initialize S.count to k |
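
Using the standard POSIX semaphore API, whose sem_wait/sem_post correspond
to wait/signal, this looks like the following sketch: |

    #include <semaphore.h>

    static sem_t S;

    void process_body(void) {
        sem_wait(&S);          /* wait(S): entry section */
        /* critical section */
        sem_post(&S);          /* signal(S): exit section */
        /* remainder section */
    }

    int main(void) {
        sem_init(&S, 0, 1);    /* count = 1: mutual exclusion;
                                  initialize to k to admit k processes */
        process_body();
        return 0;
    }
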
Process
Synchronization:
Monitors
|
|
|
|
|
Monitors |
|
A package of subprograms with conditions |
|
conditions are basically semaphores |
|
Subprograms in the monitor are
protected by a single mutex variable so only one caller can enter the monitor
at a time |
|
Clients can call the subprograms and
then wait on a condition within the monitor |
|
For example: cwait (buffer_full) |
|
When waiting, a client gives up its
mutex on the monitor as well |
|
A new caller may enter and signal the
condition |
|
For example: csignal(buffer_full) |
|
When the new caller leaves the monitor,
a single client waiting on buffer_full will resume where it left off |
|
Note: the resumed client will hold the mutex
over the monitor’s subprograms |
Monitor
|
|
|
Waiting processes are either in the
entrance queue or in a condition queue |
|
A process puts itself into condition
queue cn by issuing cwait(cn) |
|
csignal(cn) brings into the monitor one
process from the cn condition queue |
|
Hence csignal(cn) blocks the calling
process and puts it in the urgent queue (unless csignal is the last operation
of the monitor procedure) |
Monitor Example:
Producer-Consumer
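
The original example slide is not reproduced here; the following sketch
emulates a bounded-buffer monitor with POSIX primitives (the buffer size
N = 8 and the condition names are assumptions). pthread_cond_wait releases
the monitor mutex while waiting, exactly as cwait does above: |

    #include <pthread.h>

    #define N 8                               /* buffer size: an assumption */
    static int buf[N], in = 0, out = 0, count = 0;
    static pthread_mutex_t mon = PTHREAD_MUTEX_INITIALIZER; /* monitor mutex */
    static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

    void append(int x) {              /* monitor procedure: producer side */
        pthread_mutex_lock(&mon);
        while (count == N)
            pthread_cond_wait(&not_full, &mon);  /* cwait(not_full) */
        buf[in] = x; in = (in + 1) % N; count++;
        pthread_cond_signal(&not_empty);         /* csignal(not_empty) */
        pthread_mutex_unlock(&mon);
    }

    int take(void) {                  /* monitor procedure: consumer side */
        int x;
        pthread_mutex_lock(&mon);
        while (count == 0)
            pthread_cond_wait(&not_empty, &mon); /* cwait(not_empty) */
        x = buf[out]; out = (out + 1) % N; count--;
        pthread_cond_signal(&not_full);
        pthread_mutex_unlock(&mon);
        return x;
    }
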
Managing Memory within a
Program
|
|
|
|
|
Three ways memory is allocated |
|
Static |
|
Allocated as soon as execution begins |
|
Lasts until termination |
|
Usually reside in fixed data segments |
|
Note: a variable can be static yet have
a local scope |
|
Local |
|
Created on the stack or in a register
when an enclosing block or function is entered |
|
Disappear when the block or function is
exited |
|
Dynamic |
|
Created and destroyed by specific
function calls during a program |
|
Can be called by application source or
the runtime |
|
Operations: malloc or new, and free or
delete |
|
How does dynamic memory allocation
work? |
|
Use a datatype we call a heap |
Heap
|
|
|
|
|
Heap: |
|
Pre-allocated area of memory
partitioned into blocks |
|
Linked list of blocks of various sizes |
|
Could even start with a single large
block |
|
Operations: malloc or new and free or
delete |
|
Time complexity for operations depends
on the choice of data structure |
|
Could have a single list, multiple
lists, doubly linked lists |
|
List(s) may be sorted, or not |
|
May split blocks if they are large
compared to request sizes |
|
Can lead to fragmentation: all blocks
become small |
|
May have to merge blocks if none are
large enough to satisfy a request |
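
A minimal first-fit heap sketch along these lines; it uses a fixed arena,
ignores alignment, and merges only with the next block, and all names are
illustrative: |

    #include <stddef.h>

    #define ARENA_SIZE 4096
    static unsigned char arena[ARENA_SIZE];

    typedef struct block {
        size_t size;                /* payload size in bytes */
        int    free;
        struct block *next;
    } block_t;

    static block_t *head = NULL;

    static void heap_init(void) {
        head = (block_t *)arena;    /* start with a single large free block */
        head->size = ARENA_SIZE - sizeof(block_t);
        head->free = 1;
        head->next = NULL;
    }

    void *my_malloc(size_t n) {
        if (!head) heap_init();
        for (block_t *b = head; b != NULL; b = b->next) {
            if (b->free && b->size >= n) {       /* first fit */
                /* split if the leftover can hold a header plus payload */
                if (b->size >= n + sizeof(block_t) + 16) {
                    block_t *rest = (block_t *)((unsigned char *)(b + 1) + n);
                    rest->size = b->size - n - sizeof(block_t);
                    rest->free = 1;
                    rest->next = b->next;
                    b->size = n;
                    b->next = rest;
                }
                b->free = 0;
                return b + 1;       /* payload begins after the header */
            }
        }
        return NULL;                /* nothing large enough: would need merging */
    }

    void my_free(void *p) {
        if (p == NULL) return;
        block_t *b = (block_t *)p - 1;
        b->free = 1;
        if (b->next && b->next->free) {   /* merge with next free block */
            b->size += sizeof(block_t) + b->next->size;
            b->next = b->next->next;
        }
    }
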
Garbage Collection
|
|
|
|
|
Garbage collection is a reorganization
of the heap to decrease fragmentation |
|
We can merge adjacent free blocks to
make larger blocks (easy to do) |
|
Theoretically you could move data and change
pointers to the data in the application program to decrease fragmentation
(note: this is difficult) |
|
Tools for managing a heap |
|
Heap statistics |
|
Give frequency with which each block
size was used |
|
Number of times blocks are
fragmented and merged |
|
Configuration utilities |
|
Setting an initial configuration for a
heap |