|
|
|
|
Is a program that controls the execution of
application programs |
|
OS must relinquish control to user programs and
regain it safely and efficiently |
|
Tells the CPU when to execute other programs |
|
Is an interface between the user and hardware |
|
Masks the details of the hardware to application
programs |
|
Hence the OS must deal with hardware details |
|
|
|
|
|
Must adapt to hardware upgrades and new types of
hardware. Examples: |
|
Character vs graphic terminals |
|
Introduction of paging hardware |
|
Must offer new services, eg: internet support |
|
The need to change the OS on a regular basis
places requirements on its design |
|
modular construction with clean interfaces |
|
object oriented methodology |
|
|
|
|
Were the first operating systems (mid-1950s) |
|
The user submits a job (written on cards or tape)
to a computer operator |
|
The computer operator places a batch of several
jobs on an input device |
|
A special program, the monitor, manages the
execution of each program in the batch |
|
Resident monitor is in main memory and available
for execution |
|
Monitor utilities are loaded when needed |
|
|
|
|
|
Monitor reads jobs one at a time from the input
device |
|
Monitor places a job in the user program area |
|
A monitor instruction branches to the start of
the user program |
|
Execution of the user program continues until: |
|
end of program occurs |
|
an error occurs |
|
Either event causes the CPU to fetch its next
instruction from the monitor |
|
OS: |
|
Alternates execution between user program and
the monitor program |
|
Relies on available hardware to alternate
execution effectively between different parts of memory |
|
|
|
|
|
|
|
|
|
Memory protection |
|
do not allow the memory area containing the
monitor to be altered by user programs |
|
Timer |
|
prevents a job from monopolizing the system |
|
an interrupt occurs when time expires |
|
Privileged instructions |
|
can be executed only by the monitor |
|
an interrupt occurs if a user program tries to
execute these instructions |
|
Interrupts |
|
provide flexibility for relinquishing control
to and regaining control from user programs |
|
Managing resource contention requires either: |
|
hardware support through special
instructions (lock, swap, …) |
|
or software support: semaphores, monitors, … |
|
|
|
|
I/O operations are exceedingly slow (compared to
instruction execution) |
|
A program containing even a very small number of
I/O operations will spend most of its time waiting for them to complete |
|
Hence: poor CPU usage when only one program is
present in memory |
|
|
|
|
If memory can hold several programs, then the CPU
can switch to another one whenever a program is waiting for an I/O
operation to complete |
|
This is multitasking (multiprogramming) |
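A rough, illustrative model (an assumption of this note, not from the original slides) makes the gain concrete: if each program spends a fraction p of its time waiting for I/O, and the waits are independent, the CPU sits idle only when all n resident programs are waiting at once.

/* Simple multiprogramming model (illustrative assumption, not from the slides):
 * if each program waits for I/O a fraction p of the time, and waits are
 * independent, the CPU is idle only when all n programs wait at once. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double p = 0.80;                      /* 80% of the time spent waiting on I/O */
    for (int n = 1; n <= 5; n++)
        printf("%d program(s): CPU utilization ~ %.0f%%\n",
               n, 100.0 * (1.0 - pow(p, n)));
    return 0;
}
/* With one program the CPU is busy only ~20% of the time;
 * with four programs utilization already reaches ~59%. */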
|
|
|
|
Almost no contention for resources |
|
All three jobs can run in minimum time in a multitasking
environment (assuming JOB2 and JOB3 get enough CPU time to keep their I/O
operations active) |
|
|
|
|
|
Batch multiprogramming does not support
interaction with users |
|
TSS extends multiprogramming to handle multiple
interactive jobs |
|
Processor’s time is shared among multiple users |
|
Multiple users simultaneously access the system
through terminals |
|
Because of slow human reaction time, a typical
user needs 2 sec of processing time per minute |
|
Then about 30 users (60 s ÷ 2 s) should be able to share
the same system without noticeable delay in the computer's response time |
|
The file system must be protected (multiple users…) |
|
|
|
|
|
|
|
|
Operating systems strive to do many things at
the same time |
|
Run a user program |
|
Read from or write to a disk |
|
Print on a printer |
|
Would like to keep as many resources busy as
possible to increase the system’s efficiency |
|
Multi-programming or time-sharing |
|
CPU switches from program to program |
|
Switch when |
|
A time slice (tens or hundreds of milliseconds) passes, or
a process performs an I/O operation |
|
Gives the appearance of concurrency |
|
Many users feel like they have exclusive access
to the machine |
|
Process model supports this approach |
|
|
|
|
|
|
|
Introduced to obtain a systematic way of
monitoring and controlling program execution |
|
A process is an executable program with: |
|
associated data (variables, buffers…) |
|
execution context: i.e. all the information that |
|
the CPU needs to execute the process: |
|
content of the processor registers |
|
the OS needs to manage the process: |
|
priority of the process |
|
the event (if any) for which the process is
waiting |
|
other data (that we will introduce later) |
|
|
|
|
|
|
Process model: states |
|
Running |
|
Blocked |
|
Ready |
|
|
|
|
|
|
|
The scheduler (kernel) is the lowest level of the operating system |
|
Handles all the interrupts |
|
Does context switching |
|
Handles interprocess communication |
|
The rest of the OS sits on top of the scheduler (kernel),
implemented as processes |
|
For example |
|
When a disk interrupt occurs |
|
The system stops running the current process |
|
Makes the disk process, which was blocked
waiting for the interrupt, ready to run |
|
Runs another process (possibly resuming the
original process) |
|
|
|
|
To implement processes the OS maintains a
process table |
|
Examples of important information in the table |
|
Process management     | Memory management     | File management
-----------------------+-----------------------+-------------------
Registers              | Pointers to segments  | Root directory
Program counter        | Process id            | Working directory
Stack pointer          | Exit status           | File descriptors
Process start time     | ....                  | ....
CPU time used          |                       |
Message queue pointers |                       |
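The table above can be read as one per-process record (the process control block). The C struct below is only a sketch of such an entry; the field names are illustrative and simply follow the three columns listed.

/* Sketch of one entry in the process table (field names are illustrative). */
struct process_entry {
    /* Process management */
    unsigned long registers[16];     /* saved general-purpose registers    */
    unsigned long program_counter;
    unsigned long stack_pointer;
    long          start_time;        /* process start time                 */
    long          cpu_time_used;
    struct msg   *message_queue;     /* message queue pointers             */

    /* Memory management */
    void         *text_segment;      /* pointers to the process's segments */
    void         *data_segment;
    void         *stack_segment;
    int           pid;               /* process id                         */
    int           exit_status;

    /* File management */
    const char   *root_directory;
    const char   *working_directory;
    int           file_descriptors[32];
};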
|
|
|
Interrupt handling is accomplished as follows |
|
1. Interrupt: hardware stacks program counter,
etc.. of current program |
|
2. Hardware loads new program counter from
interrupt vector (table) |
|
3. Assembly language routine saves registers in
process table (above) |
|
4. Assembly language routine sets up a new stack
for the C handler |
|
5. C procedure marks interrupt service process
as ready in the above table |
|
6. Scheduler decides which process to run next
(chooses the highest priority) |
|
7. C procedure returns to assembly code |
|
8. Assembly language procedure starts up the next
process |
|
|
|
|
|
|
Criteria for scheduling |
|
Fairness: make sure each process gets its fair
share of the CPU |
|
Efficiency: keep the CPU busy 100% of the time |
|
Response time: minimize response times for
interactive users |
|
Turnaround: minimize the time batch users must
wait for output |
|
Throughput: maximize the number of jobs
processed per hour |
|
Notes: |
|
Any scheduling algorithm will favor some class
of processes over others |
|
Need to find a balance of interests for each
specific environment |
|
Maximize throughput |
|
Minimize response times |
|
Some combination of the two |
|
|
|
|
|
|
Examples of scheduling methods |
|
Preemptive |
|
Willing to context switch out current job if a
higher priority job is ready to run (vs. non-preemptive) |
|
Round robin |
|
Define a time quantum, say x milliseconds |
|
Preempt the current process every time quantum |
|
Run the next ready-to-run process |
|
Priority |
|
Assign priorities to processes |
|
Run process with the highest priority |
|
Shortest job first |
|
If you
know how long programs will take to run, you can use this information to
maximize throughput |
|
Run the shortest job first! |
|
Shortest job first also gives the minimum
average response time (this can be proven) |
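To make the shortest-job-first claim concrete, here is a small self-contained C sketch (the jobs and burst times are invented): sorting the ready jobs by burst time and running them in that order minimizes the average waiting time.

/* Shortest-job-first (non-preemptive) sketch with made-up burst times. */
#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;   /* ascending burst time */
}

int main(void)
{
    int burst[] = { 8, 4, 9, 5 };               /* CPU time each job needs */
    int n = sizeof burst / sizeof burst[0];

    qsort(burst, n, sizeof burst[0], cmp);      /* run the shortest job first */

    int waited = 0, total_wait = 0;
    for (int i = 0; i < n; i++) {
        total_wait += waited;                   /* time this job spent waiting */
        waited     += burst[i];
    }
    printf("average waiting time = %.2f\n", (double)total_wait / n);
    /* order 4,5,8,9 -> waits 0,4,9,17 -> average 7.5; any other order is worse */
    return 0;
}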
|
|
|
|
|
|
What is concurrency |
|
Permit many activities to proceed in parallel |
|
The activities synchronize and communicate at
well defined points |
|
Why concurrency? |
|
Easier to describe a system using several
independent cohesive activities than to have one monolithic coupled
function |
|
Link the activities using well-defined
interaction mechanisms |
|
Exploit multiple resources |
|
For example: many processors |
|
How can we time share between activities? |
|
Need to switch between concurrent activities |
|
Switching is referred to as context switching |
|
Examples of concurrent programming paradigms |
|
Event-driven environments |
|
Co-routines |
|
Threads |
|
|
|
|
|
|
Windowing environments |
|
Callback system |
|
Runtime environment catches events |
|
Mouse clicks |
|
Interprocess messages |
|
User defined events |
|
Application programs |
|
Define/register events with the runtime |
|
Associate each event with a user written
subprogram |
|
When an event occurs the runtime |
|
Causes the corresponding subprogram to execute |
|
Subprogram may cause other events....then
returns control to the runtime system |
|
Runtime may or may not be multi-threaded |
|
If it is, it could associate each of the
subprograms with a new thread |
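A minimal sketch of the callback idea in C (the event identifiers, table and handler are invented for illustration): the application registers a subprogram for each event, and the runtime dispatches to it when the event occurs.

/* Minimal callback dispatch sketch; event ids and handlers are invented. */
#include <stdio.h>

enum { EV_MOUSE_CLICK, EV_USER_DEFINED, EV_COUNT };

typedef void (*handler_t)(int event);
static handler_t table[EV_COUNT];               /* one handler per event id */

static void register_handler(int event, handler_t h) { table[event] = h; }

static void dispatch(int event)                 /* called by the "runtime" */
{
    if (event >= 0 && event < EV_COUNT && table[event])
        table[event](event);                    /* run the user's subprogram */
}

static void on_click(int ev) { printf("mouse click handled (event %d)\n", ev); }

int main(void)
{
    register_handler(EV_MOUSE_CLICK, on_click); /* application registers the event */
    dispatch(EV_MOUSE_CLICK);                   /* runtime catches it and dispatches */
    return 0;
}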
|
|
|
|
|
|
|
Co-routines, a simple strategy |
|
When the current activity is ready to transfer
control it: |
|
Saves its context locally |
|
Context includes: registers including the stack
pointer, base pointer, instruction pointer, flags etc.. |
|
Branches to an entry point of the next routine |
|
The first instructions of the called routine
restore its context and then resume execution |
|
The schedule is often well defined and hard
coded |
|
Could also use an intermediate routine to
implement a scheduler |
|
Co-routines with a user implemented scheduler |
|
|
|
|
|
|
|
A routine gives up control explicitly by
branching to another routine |
|
In the example: |
|
Each routine is only aware of one entry point
for each other routine |
|
There could be more |
|
The scheduler decides on the order of execution |
|
Could describe the scheduler using a finite
state machine/state transition diagram |
|
These details could be hard-coded into the
routines |
|
Advantage: faster execution |
|
Disadvantage: more difficult to maintain |
|
Communication can take place via variables in a
data segment |
|
Co-routines of this nature are usually written
in assembler |
|
Need access to registers to save context and do
branching |
|
A single stack used in a high level language
would get in the way |
|
Why? We have not explicitly distinguished
threads of control |
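The save-context / branch / restore-context cycle described above can also be done with a small library instead of raw assembler. The sketch below uses the POSIX ucontext routines (getcontext, makecontext, swapcontext); each co-routine gets its own stack, which sidesteps the single-stack problem just mentioned.

/* Two co-routines passing control back and forth with POSIX ucontext. */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_a, co_b;
static char stack_a[16384], stack_b[16384];      /* each co-routine has its own stack */

static void routine_a(void)
{
    printf("A: first part\n");
    swapcontext(&co_a, &co_b);                   /* save A's context, branch to B */
    printf("A: resumed\n");
    swapcontext(&co_a, &main_ctx);
}

static void routine_b(void)
{
    printf("B: running\n");
    swapcontext(&co_b, &co_a);                   /* save B's context, resume A */
}

int main(void)
{
    getcontext(&co_a);
    co_a.uc_stack.ss_sp   = stack_a;
    co_a.uc_stack.ss_size = sizeof stack_a;
    co_a.uc_link          = &main_ctx;
    makecontext(&co_a, routine_a, 0);

    getcontext(&co_b);
    co_b.uc_stack.ss_sp   = stack_b;
    co_b.uc_stack.ss_size = sizeof stack_b;
    co_b.uc_link          = &main_ctx;
    makecontext(&co_b, routine_b, 0);

    swapcontext(&main_ctx, &co_a);               /* hard-coded schedule: start with A */
    return 0;
}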
|
|
|
|
|
Has an execution state (running, ready, etc.) |
|
Saves thread context when not running |
|
Has an execution stack and some per-thread
static storage for local variables |
|
Has access to the memory address space and
resources of its process |
|
all threads of a process share this |
|
when one thread alters a (non-private) memory
item, all other threads (of the process) see that change |
|
a file opened by one thread is available to the
others |
|
|
|
|
Takes less time to create a new thread than a
process |
|
Less time to terminate a thread than a process |
|
Less time to switch between two threads within
the same process |
|
Since threads within the same process share
memory and files, they can communicate with each other without invoking the
kernel |
|
It is therefore necessary to synchronize the
activities of the various threads so that they do not obtain inconsistent views
of the data |
|
|
|
|
Example 1: a file server on a LAN |
|
It needs to handle several file requests over a
short period |
|
Hence it is more efficient to create (and destroy)
a thread for each request |
|
On an SMP machine: multiple threads can possibly
be executing simultaneously on different processors |
|
Example 2: one thread displays a menu and reads user
input while the other threads execute the user's commands |
|
|
|
|
|
3 variables: A, B, C which are shared by thread
T1 and thread T2 |
|
T1 computes C = A+B |
|
T2 transfers amount X from A to B |
|
T2 must do: A = A-X and B = B+X (so that A+B is
unchanged) |
|
But if T1 computes A+B after T2 has done A = A-X
but before B = B+X, T1 will not obtain the correct result for C = A + B |
|
Solution: critical sections (later) |
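A sketch of the example with Pthreads (one common way to realize the critical sections announced above is a mutex; the variable names follow the example): holding one lock around both the transfer and the sum guarantees that T1 always sees A+B unchanged.

/* The A/B/C example with a mutex so T1 never sees the transfer half-done. */
#include <pthread.h>
#include <stdio.h>

static long A = 100, B = 50, C;
static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

static void *t1_sum(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&m);
    C = A + B;                     /* must not run between A-=X and B+=X */
    pthread_mutex_unlock(&m);
    return NULL;
}

static void *t2_transfer(void *arg)
{
    long X = *(long *)arg;
    pthread_mutex_lock(&m);
    A = A - X;                     /* without the lock, T1 could read here */
    B = B + X;                     /* ...and compute a wrong C             */
    pthread_mutex_unlock(&m);
    return NULL;
}

int main(void)
{
    long X = 30;
    pthread_t t1, t2;
    pthread_create(&t2, NULL, t2_transfer, &X);
    pthread_create(&t1, NULL, t1_sum, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("A=%ld B=%ld C=%ld (A+B stays 150)\n", A, B, C);
    return 0;
}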
|
|
|
|
Three key states: running, ready, blocked |
|
They have no suspend state because all threads
within the same process share the same address space |
|
Indeed: suspending (i.e.: swapping) a single
thread involves suspending all threads of the same process |
|
Termination of a process terminates all threads
within the process |
|
|
|
|
The kernel is not aware of the existence of
threads |
|
All thread management is done by the application
by using a thread library |
|
Thread switching does not require kernel mode
privileges (no mode switch) |
|
Scheduling is application specific |
|
|
|
|
|
Contains code for: |
|
creating and destroying threads |
|
passing messages and data between threads |
|
scheduling thread execution |
|
saving and restoring thread contexts |
|
|
|
|
The kernel is not aware of thread activity but
it is still managing process activity |
|
When a thread makes a system call, the whole
process will be blocked |
|
but for the thread library that thread is still
in the running state |
|
So thread states are independent of process
states |
|
|
|
|
|
Advantages |
|
Thread switching does not involve the kernel: no
mode switching |
|
Scheduling can be application specific: choose
the best algorithm. |
|
ULTs can run on any OS; they only need a thread
library |
|
Disadvantages |
|
Most system calls are blocking and the kernel
blocks processes. So all threads within the process will be blocked |
|
The kernel can only assign processes to
processors. Two threads within the same process cannot run simultaneously
on two processors |
|
|
|
|
All thread management is done by kernel |
|
No thread library but an API to the kernel
thread facility |
|
Kernel maintains context information for the
process and the threads |
|
Switching between threads requires the kernel |
|
Scheduling is done on a thread basis |
|
Ex: Windows NT and OS/2 |
|
|
|
|
|
Advantages |
|
the kernel can simultaneously schedule many
threads of the same process on many processors |
|
blocking is done on a thread level |
|
kernel routines can be multithreaded |
|
Disadvantages |
|
thread switching within the same process
involves the kernel. We have 2 mode switches per thread switch |
|
this results in a significant slowdown |
|
|
|
|
Thread creation done in the user space |
|
Bulk of scheduling and synchronization of
threads done in the user space |
|
The programmer may adjust the number of KLTs |
|
May combine the best of both approaches |
|
Example is Solaris |
|
|
|
|
|
|
|
Explicit language support for concurrency |
|
Has a programming construct named a task |
|
Each task is a thread with its own stack |
|
Tasks are objects with methods named entries |
|
task Stack is
   entry Init(....);
   entry Push(...);
   entry Pop(...);
end Stack;

task body Stack is
begin
   loop
      select
         accept Init(....) do ......
         end Init;
      or
         accept Push(...) do .....
         end Push;
      ....
      end select;
      -- other work....
   end loop;
end Stack;
|
|
|
|
|
|
Interaction mechanism |
|
Rendezvous |
|
Looks like a procedure call in the source code |
|
Stack.Init(...), Stack.Push(....) |
|
Rendezvous as a call-accept-release sequence |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The serving task can only serve one caller at a
time; if it is busy when a call arrives, the calling task is placed in a queue
for that entry |
|
|
|
|
|
|
What does the stack look like in a concurrent
program? |
|
No longer a single stack |
|
Each thread needs its own stack |
|
Stack space can be allocated from a heap |
|
Issue becomes complex in a language like Ada
with sophisticated scoping rules |
|
Task A specification and body |
|
Local variable a |
|
Task B specification and body |
|
Task C specification and body |
|
Procedure D |
|
End Task A |
|
Local variables are placed on the stack; how are
they linked together? |
|
A complex multi-linked list data structure
called a cactus stack! |
|
What would the assembler sequences look like for
accessing a local variable in main from C? |
|
|
|
|
|
Instead of extensions to languages a package of
routines is used |
|
Environment usually initialized using a
subroutine call in the main environment |
|
Interaction mechanisms |
|
Shared memory |
|
Mutex for synchronization |
|
Fork and join primitives |
|
Examples of operations: |
|
pthread_create(thread_id, attributes,
start_routine, argument_for_start_routine) (a custom stack is requested via the attributes) |
|
pthread_yield(void) |
|
pthread_mutex_init(mutex, attributes) |
|
pthread_mutex_lock(mutex) |
|
pthread_mutex_unlock(mutex) |
|
Others..... |
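A short usage sketch of the package (note that the fourth pthread_create parameter is the argument handed to the start routine; a custom stack would instead be requested through the attributes object):

/* Creating and joining threads with the Pthreads package. */
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    int id = *(int *)arg;                  /* argument passed via pthread_create */
    printf("thread %d running\n", id);
    return NULL;
}

int main(void)
{
    pthread_t tid[3];
    int ids[3] = { 0, 1, 2 };

    for (int i = 0; i < 3; i++)            /* NULL attributes: default stack etc. */
        pthread_create(&tid[i], NULL, worker, &ids[i]);

    for (int i = 0; i < 3; i++)            /* wait for all threads to finish */
        pthread_join(tid[i], NULL);
    return 0;
}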
|
|
|
|
Concurrent processes (or threads) often need to
share data (maintained either in shared memory or files) and resources |
|
If there is no controlled access to shared data,
some processes will obtain an inconsistent view of this data |
|
The actions performed by concurrent processes
will then depend on the order in which their execution is interleaved |
|
|
|
|
Processes P1 and P2 are running the same
procedure and have access to the same variable “a” |
|
Processes can be interrupted anywhere |
|
If P1 is first interrupted after user input and
P2 executes entirely |
|
Then the character echoed by P1 will be the one
read by P2 !! |
|
|
|
|
When a process executes code that manipulates
shared data (or a resource), we say that the process is in its critical
section (CS) (for that shared data) |
|
The execution of critical sections must be mutually
exclusive: at any time, only one process is allowed to execute in its
critical section (even with multiple CPUs) |
|
Then each process must request permission to enter its critical
section (CS) |
|
|
|
|
The section of code implementing this request is
called the entry section |
|
The critical section (CS) might be followed by
an exit section |
|
The remaining code is the remainder section |
|
The critical section problem is to design a
protocol that the processes can use so that their action will not depend on
the order in which their execution is interleaved (possibly on many
processors) |
|
|
|
|
Each process executes at a nonzero speed, but no
assumption can be made about the relative speed of the n processes |
|
General structure of a process: |
|
many CPUs may be present, but memory hardware
prevents simultaneous access to the same memory location |
|
No assumption about order of interleaved
execution |
|
For solutions: we need to specify entry and exit
sections |
|
|
|
|
|
|
The shared variable turn is initialized (to 0 or
1) before executing any Pi |
|
Pi’s critical section is executed iff turn = i |
|
Pi is busy waiting if Pj is in CS: mutual
exclusion is satisfied |
|
Progress requirement is not satisfied since it
requires strict alternation of CSs |
|
Ex: if turn=0 and P0 is in its RS, P1 can never
enter its CS |
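A sketch of this first algorithm for two threads, with C11 atomics standing in for the shared variable; the code follows the usual entry section / critical section / exit section / remainder section structure:

/* First algorithm: strict alternation on a shared 'turn' variable (sketch). */
#include <stdatomic.h>

static _Atomic int turn = 0;          /* initialized to 0 or 1 before P0/P1 start */

void process(int i)                   /* i is 0 or 1; j = 1 - i                   */
{
    int j = 1 - i;
    for (;;) {
        while (atomic_load(&turn) != i)
            ;                         /* entry section: busy wait while it is Pj's turn */
        /* critical section: Pi runs here iff turn == i                            */
        atomic_store(&turn, j);       /* exit section: hand the turn to Pj         */
        /* remainder section; if Pi stays here, Pj can enter at most once more     */
    }
}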
|
|
|
|
|
Keep one boolean variable for each process: flag[0] and flag[1] |
|
Pi signals that it is ready to enter its CS by
setting flag[i]:=true |
|
Mutual Exclusion is satisfied but not the
progress requirement |
|
If we have the sequence: |
|
T0: flag[0]:=true |
|
T1: flag[1]:=true |
|
Both processes will wait forever to enter their
CS: we have a deadlock |
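The same two-process setting sketched with the flag variables only; the comments mark the interleaving (T0, T1) that leads to the deadlock:

/* Second algorithm: one flag per process.  Mutual exclusion holds, but if both
 * processes set their flag before either tests the other's, both spin forever. */
#include <stdatomic.h>
#include <stdbool.h>

static _Atomic bool flag[2];              /* both start false */

void process(int i)                       /* i is 0 or 1 */
{
    int j = 1 - i;
    for (;;) {
        atomic_store(&flag[i], true);     /* T0/T1: both may get this far...          */
        while (atomic_load(&flag[j]))
            ;                             /* ...and then wait on each other: deadlock */
        /* critical section */
        atomic_store(&flag[i], false);    /* exit section */
        /* remainder section */
    }
}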
|
|
|
|
|
|
Initialization: flag[0] := flag[1] := false; turn := 0 or 1 |
|
Willingness to enter CS specified by
flag[i]:=true |
|
If both processes attempt to enter their CS
simultaneously, only one turn value will last |
|
Exit section: Pi sets flag[i] := false, indicating it
no longer wants to enter its CS |
|
This algorithm is also known as “Peterson’s
Algorithm” |
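Peterson's algorithm sketched with sequentially consistent C11 atomics (needed in practice, since modern processors and compilers may otherwise reorder the two stores):

/* Peterson's algorithm for two processes (sketch). */
#include <stdatomic.h>
#include <stdbool.h>

static _Atomic bool flag[2];              /* flag[i]: Pi wants to enter its CS */
static _Atomic int  turn;                 /* whose turn it is to go first      */

void process(int i)                       /* i is 0 or 1 */
{
    int j = 1 - i;
    for (;;) {
        atomic_store(&flag[i], true);     /* I am willing to enter my CS       */
        atomic_store(&turn, j);           /* but I let Pj go first if it wants */
        while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
            ;                             /* busy wait                         */
        /* critical section */
        atomic_store(&flag[i], false);    /* exit section: no longer willing   */
        /* remainder section */
    }
}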
|
|
|
|
|
Synchronization tool (provided by the OS) that
does not require busy waiting |
|
A semaphore S is an integer variable that, apart
from initialization, can only be accessed through 2 atomic and mutually
exclusive operations: |
|
wait(S) |
|
signal(S) |
|
To avoid busy waiting: when a process has to
wait, it will be put in a blocked queue of processes waiting for the same
event |
|
|
|
|
Hence, in fact, a semaphore is a record
(structure): |
|
When a process must wait for a semaphore S, it
is blocked and put on the semaphore’s queue |
|
The signal operation removes (according to a
fair policy like FIFO) one process from the queue and puts it in the list
of ready processes |
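The record can be sketched as follows (this follows the usual textbook definition; the queue type and the block/wakeup steps are left as comments because they belong to the OS, and the operations are renamed wait_sem/signal_sem so they do not clash with the POSIX wait call):

/* Sketch of a semaphore as a record with a count and a queue of blocked
 * processes.  wait_sem() and signal_sem() must themselves execute atomically. */
typedef struct {
    int          count;
    struct pcb  *queue;        /* FIFO queue of processes blocked on S */
} semaphore;

void wait_sem(semaphore *S)
{
    S->count--;
    if (S->count < 0) {
        /* put the calling process on S->queue and block it */
    }
}

void signal_sem(semaphore *S)
{
    S->count++;
    if (S->count <= 0) {
        /* remove one process from S->queue (e.g. FIFO) and move it
           to the list of ready processes                            */
    }
}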
|
|
|
|
|
|
When S.count >=0: the number of processes that can execute wait(S) without
being blocked = S.count |
|
When S.count<0: the number of processes
waiting on S is = |S.count| |
|
Atomicity and mutual exclusion: no 2 processes
can be in wait(S) and signal(S) (on the same S) at the same time (even with
multiple CPUs) |
|
This is a critical section problem: critical
sections are blocks of code defining wait(S) and signal(S) |
|
Hence: the critical sections here are very
short: typically 10 instructions |
|
Solutions: |
|
uniprocessor: disable interrupts during these
operations (ie: for a very short period). This does not work on a
multiprocessor machine. |
|
multiprocessor: use previous software or
hardware schemes. The amount of busy waiting should be small. |
|
|
|
|
For n processes |
|
Initialize S.count to 1 |
|
Then only 1 process is allowed into CS (mutual
exclusion) |
|
To allow k processes into CS, we initialize
S.count to k |
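With POSIX semaphores (sem_wait and sem_post playing the roles of wait(S) and signal(S)), mutual exclusion among several threads looks like this sketch:

/* Mutual exclusion with a semaphore initialized to 1 (POSIX sketch). */
#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

static sem_t S;
static long shared = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int k = 0; k < 100000; k++) {
        sem_wait(&S);          /* wait(S): at most one thread past this point */
        shared++;              /* critical section                            */
        sem_post(&S);          /* signal(S)                                   */
    }
    return NULL;
}

int main(void)
{
    sem_init(&S, 0, 1);        /* S.count = 1 -> mutual exclusion;
                                  initializing to k would admit k threads     */
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("shared = %ld (expected 400000)\n", shared);
    return 0;
}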
|
|
|
|
|
|
Monitors |
|
A package of subprograms with conditions |
|
conditions are basically semaphores |
|
Subprograms in the monitor are protected by a
single mutex variable so only one caller can enter the monitor at a time |
|
Clients can call the subprograms and then wait
on a condition within the monitor |
|
For example: cwait (buffer_full) |
|
When waiting, a client gives up its mutex on the
monitor as well |
|
A new caller may enter and signal the condition |
|
For example: csignal(buffer_full) |
|
When the new caller leaves the monitor, a single
client waiting on buffer_full will resume where it left off |
|
Note the resumed client will have mutex over the
monitor’s subprograms |
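Pthreads provides the same ingredients: the sketch below builds a small "monitor" around one mutex and one condition variable. Note that pthread_cond_signal has signal-and-continue semantics rather than the urgent-queue behaviour described here, so the waiter rechecks its condition in a while loop.

/* Monitor-style protection of a one-slot buffer with Pthreads (sketch). */
#include <pthread.h>

static pthread_mutex_t mon = PTHREAD_MUTEX_INITIALIZER;   /* monitor mutex */
static pthread_cond_t  buffer_full = PTHREAD_COND_INITIALIZER;
static int buffer, has_item = 0;

void deposit(int x)                       /* monitor "subprogram" */
{
    pthread_mutex_lock(&mon);             /* only one caller inside at a time */
    buffer = x;
    has_item = 1;
    pthread_cond_signal(&buffer_full);    /* csignal(buffer_full)             */
    pthread_mutex_unlock(&mon);
}

int withdraw(void)                        /* monitor "subprogram" */
{
    pthread_mutex_lock(&mon);
    while (!has_item)                     /* cwait(buffer_full): releases the   */
        pthread_cond_wait(&buffer_full, &mon);   /* monitor mutex while waiting */
    has_item = 0;
    int x = buffer;
    pthread_mutex_unlock(&mon);
    return x;
}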
|
|
|
|
Waiting processes are either in the entrance
queue or in a condition queue |
|
A process puts itself into condition queue cn by
issuing cwait(cn) |
|
csignal(cn) brings into the monitor one process from
the condition cn queue |
|
Hence csignal(cn) blocks the calling process and
puts it in the urgent queue (unless csignal is the last operation of the
monitor procedure) |
|
|
|
|
|
|
|
Three ways memory is allocated |
|
Static |
|
Allocated as soon as execution begins |
|
Lasts until termination |
|
Usually reside in fixed data segments |
|
Note: a variable can be static yet have a local
scope |
|
Local |
|
Created on the stack or in a register when an
enclosing block or function is entered |
|
Disappear when the block or function is exited |
|
Dynamic |
|
Created and destroyed by specific function calls
during a program |
|
Can be called by application source or the
runtime |
|
Operations: malloc or new, and free or delete |
|
How does dynamic memory allocation work? |
|
Uses a data type we call a heap |
|
|
|
|
|
|
Heap: |
|
Pre-allocated area of memory partitioned into
blocks |
|
Linked list of blocks of various sizes |
|
Could even start with a single large block |
|
Operations: malloc or new and free or delete |
|
|
|
|
|
|
|
|
|
|
|
|
|
Time complexity for operations depends on the
choice of data structure |
|
Could have a single list, multiple lists, doubly
linked lists |
|
List(s) may be sorted, or not |
|
May split blocks if they are large compared to
request sizes |
|
Can lead to fragmentation: all blocks become
small |
|
May have to merge blocks if none are large
enough to satisfy a request |
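A toy first-fit allocator over a single list of blocks (purely illustrative: fixed arena, no rounding of request sizes, and no merging of freed neighbours, so it exhibits exactly the fragmentation problem described above):

/* Toy first-fit allocator: one free list over a static arena (sketch). */
#include <stddef.h>

#define ARENA_SIZE 4096

struct block {
    size_t        size;     /* usable bytes after the header */
    int           free;     /* 1 if the block is available   */
    struct block *next;     /* next block in the list        */
};

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static struct block *head = NULL;

void *my_alloc(size_t n)    /* n should be rounded up to an alignment boundary */
{
    if (head == NULL) {                          /* start with one large block */
        head = (struct block *)arena;
        head->size = ARENA_SIZE - sizeof(struct block);
        head->free = 1;
        head->next = NULL;
    }
    for (struct block *b = head; b != NULL; b = b->next) {
        if (b->free && b->size >= n) {           /* first fit */
            if (b->size >= n + sizeof(struct block) + 16) {
                /* split: carve a new free block out of the tail */
                struct block *rest =
                    (struct block *)((unsigned char *)(b + 1) + n);
                rest->size = b->size - n - sizeof(struct block);
                rest->free = 1;
                rest->next = b->next;
                b->size = n;
                b->next = rest;
            }
            b->free = 0;
            return b + 1;                        /* memory just after the header */
        }
    }
    return NULL;                                 /* nothing large enough */
}

void my_free(void *p)
{
    if (p != NULL)
        ((struct block *)p - 1)->free = 1;       /* no merging: fragmentation! */
}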
|
|
|
|
|
|
Garbage collection is a reorganization of the
heap to decrease fragmentation |
|
We can merge adjacent free blocks to make larger
blocks (easy to do) |
|
Theoretically you could move data and change
pointers to the data in the application program to decrease fragmentation
(note: this is difficult) |
|
|
|
|
|
|
|
|
|
|
|
|
|
Tools for managing a heap |
|
Heap statistics |
|
Give frequency with which each block size was
used |
|
Number of times blocks are fragmented and merged |
|
Configuration utilities |
|
Setting an initial configuration for a heap |
|