Notes
Outline
   Basic Components
Processor (CPU)
Main Memory (aka real memory, aka primary memory)
holds data in code
I/O modules (I/O controllers, I/O channels, I/O processors...)
hardware (with registers called I/O ports) that moves data between cpu and peripherals like:
secondary memory devices (eg: hard disks)
keyboard, display...
communications equipment
System interconnection (ie: Buses)
communication among processors, memory, and I/O modules
Secondary memory (aka stable memory, aka storage)
preserves contents even with computer powered down
Processor Model
Processor system
Collection of registers with information to manipulate
A register is a memory location that may contain program instructions and/or data, sequence of bits [0|1]
Different processors have different numbers and sizes of registers
Some registers act as interface to outside world (via pins)
The pins are directly connected to special purpose bus interface registers within the processor
Active behaviour: perform instructions
Instructions operate on the registers
For example: add the contents of two registers and place the result in a register
Buses/lines are wires that physically connect the pins of processors, memories, and other devices
   Digital Logic
Registers are made up of and connected by digital logic circuitry
And, or, and not gates
Values in registers + values on pins is input, new set of values in registers + pins is output
Clock used to activate changes in processor state
Several clock cycles are often needed to accomplish what we think of as an instruction
In general:
The faster the clock the faster we can execute instructions
The more complex the instruction the more cycles that are needed
Some processors can execute parts of several instructions in parallel (pipeline) to increase their instructions processed per second
   Data Representation
Registers: hold addresses and data
Fixed width memory word that hold binary values
Width = M bits => 2M different binary values
16 bit register can point to 65536 locations
32 bit register can point to 4294967296
64 bit register can point to 1.84467E+19 locations
Hexadecimal notation 0...9, A..F
16 values, 1 hex digit is 4 binary bits
0000 => 0  0100 => 4  1000 => 8   1100 => C
0001 => 1  0101 => 5  1001 => 9   1101 => D
0010 => 2  0110 => 6  1010 => A  1110 => E
0011 => 3  0111 => 7  1011 => B  1111 => F
Data Representation (cont.)
Example
Can represent long binary values using shorter hex notation
1011 0110 0010 1101 16 digit binary value (base 2)
B       6       2       D        4 digit hex value (base 16)
How can we describe abstract information relevant to a system’s objectives using binary values?
Examples: integers, characters, real numbers, colours, dates, graphics....
Must encode information into binary values
Standard schemes often used, but programmers must sometimes invent their own schemes
Standard formats for data encoding
Unsigned integers (counting integers), signed integers, real numbers, characters
Data Representation (cont.)
Unsigned integers
Interpret binary value as a base 2 value
1   0   1  0  0  1
                       |---> 20  = 1
                    |-----> 21    = 0
                |--------> 22    = 0
            |-----------> 23    = 8
       |---------------> 24   = 0
 |-------------------> 25    = 32
                           Sum is 41 decimal (base 10)
Size of register determines how many integers can be encoded this way
Some times systems have double integers or longs for double the width and more integers to be represented
Data Representation (cont.)
Signed Integers
Need encoding for both positive and negative numbers
Assume M bits in a register  word
Use one bit (say the left most, most significant bit) to represent sign
0 => positive
1 => negative
Encode magnitude with remaining M-1 bits
Example for M=4
(+1) 0 001
(-1) 1 001
  -------
  1 010 =  -2 decimal?? Answer should be zero!
Also, there are two representations of zero
(+0)   0  000
(-0)    1  000
Data Representation (cont.)
Solution: 2’s complement representation
To negate a value take the 2’s complement
i) complement every bit (flip) 0->1 and 1->0
ii) add 1
Consider the decimal value (+1), take its 2’s complement
0001 -> i) 1110
             ii) 0001
Add to get 1111 This value represents (-1) decimal
Add the two together
1 0000
We truncate to keep the 4 lowest bits => 0000
We ignore the 5th bit (ie the carry over bit for now)
In general, for 2’s complement M bit values we have the range -2M-1 to 2M-1 -1
Data Representation (cont.)
Characters
Usually represented in ASCII or EBCDIC
ASCII American National Character Set
Can be found in any programming language text
00 Null
....
0A Line Feed (LF)
OD Return     (CR)
...
41 A
....
5A Z
...
61 a
.....
Data Representation (cont.)
Real numbers cover a wide range of values
Only useful computer representation is the scientific notation
Often called floating point
For ex:
0.231418x102=23.1418
0.7621x10-3 = 0.0007621
Computer uses binary number representation
For ex. 0.110111 x 21101
Numbers are normalized which means that the leading bit is always a one
Some computer use this fact to give an extra bit of precision (they simply don’t store the implicit leading one)
IEEE 754 Floating Point Storage
Used on PCs, many larger machines (not all)
32 bit single precision
Exponent is biased, ie offset by a constant for hardware efficiency reasons
64 bit double precision
For example
     20.5937510  = .1010010011 x 25
Sign is 0, exponent is 5 + 127 (bias)=132, leading 1 in mantissa is not stored
IEEE 754 Floating Point (cont.)
Bias is chosen as follows
Let e be the number of bits in the exponent
bias = 2(e-1)-1
Gives exponents in range -bias +1 to bias
e=8 => bias is 127
                  => min exponent = -126 =  00000001
            => max exponent = 127 =  11111110
What about 11111111?
Used to represent NaN - Not a Number
Could be caused by overflow or underflow
What about 00000000?
If mantissa is zero => the number 0
Note this is special, since a leading one is usually implicit
If mantissa is non-zero => denormalized numbers
Numbers where leading bit is not 1
Depending on implementation, may cause an  underflow
Too small a number
Characteristics of Floating Point Numbers
1. If mantissa is normalized some bit patterns may not be used
Wasteful of space (Not true on better implementations)
For example:
Consider a 1 bit exponent
               a 2 bit normalized mantissa (no implied leading one)
               all signs positive
To represent 0    :   0.00 x 20
                     1/2 :   0.10 x 20
             not used :   0.00 x 21
             not used :   0.01 x 21
2. Distribution of number representations is not uniform (property of fixed size mantissa)
Step sizes between numbers gets larger as exponent increases
Same example as above:
0.10 x 20 to 0.11 x 20 stepsize is 0.01x20 = 0.2510
0.10 x 21 to 0.11 x 21 stepsize is 0.01x21 = 0.5010
Characteristics of Floating Point Numbers (cont.)
3. Normalization of mantissa causes a “hole at zero”
With same example:
Smallest non-normalized +ve is 0.01x2(-1+1)=0.12510
Smallest normalized +ve is        0.10x2(-1+1) =0.2510
Get a pattern as follows:
4. Overflow and underflow
If required exponent exceeds max exponent
Error message? NaN?
If required exponent for normalized number is less than min exponent
Could use denormal numbers
Could set number to zero
Could get an error message
Characteristics of Floating Point Numbers (cont.)
5. Rounding and truncation
Let fl(x) be the floating point representation of the real value x
fl( x ) = x (1 + epsilon), where epsilon is a small error multiplier, that depends on the exponent
Consider an n bit effective mantissa
Truncation
-2-n <= epsilon <= 0
Note error is always negative
Rounding
-2-(n+1) <= epsilon <= 2-(n+1)
Error is sometimes -ve, sometimes +ve
6. Comparison
Truncation error can be twice as large and always negative (bad - results are skewed)
Rounding errors may even out over calculations
Intermediate Calculations
Implementations may use higher precision for intermediate calculations
Resort to published precision when storing values back to variables
This helps to reduce the likelihood of underflow and overflow in intermediate calculations
Using different/higher precision for intermediate results may cause portability problems: JAVA requires/mandates that intermediate results are written back to memory to force uniformity across platforms
   Types of Memory
Model for memory components
Registers in memory are referred to as cells or locations (Note: a processor’s registers are a form of local memory)
Each cell has an address (numerical name) used to select it using the address bus
Number of cells determines number of address bits
220 memory locations needs 20 address bits
   Types of Memory
   (cont.)
Types
ROM (Read only memory)
RAM (Random access memory)
Alterable ROM (Erasable, Programmable ROM, sometimes called EPROM)
Caches and disks
ROM (Read only Memory)
Contents determined when component is fabricated
Contents can be read but not modified during the operation of the memory
To read a cell:
Processor sets the address bus to the address of the desired cell
Processor sets the control bus to indicate an address is ready
Memory puts the contents of cell on data bus
Memory sets the control bus to indicate that the data bus now contains the contents of the memory cell
Properties
Contents are not lost if power is removed
ROMS are not programmable, they are fabricated!
   Types of Memory
   (cont.)
RAM (Random access memory)
Contents of cells may be read or written during operation of device
Reading a cell is similar to ROM
To write:
Select cell using address bus
Value on data bus replaces contents of cell
Synchronize using the control bus signals
Properties
Contents are lost if power is removed from component
Once value of a cell is written, value in cell remains the same until it is overwritten or power is turned off
Contents typically unknown when power is first applied
   Types of Memory
   (cont.)
Alterable ROM (or EPROM)
Sometimes called erasable programmable ROM
Less common than ROM and RAM
Like a ROM
Contents not lost if power is lost
Like a RAM
Can change the contents while in operation
Usually need to use a special protocol to alter the contents
Examples: printer configurations, terminal setup
Note:
Disk and tape memory do not qualify as memory components (under this model)  because individual cells are not accessed directly from a bus
Blocks of data are read/written from memory used to support input/output
   Caches
A cache is a form of memory used to provide fast access to a copy of data and instructions that are most likely to be used by the processor
Before accessing data in slow memory check to see if it is already in the cache, if not move it there by replacing something else
If you replace a value that has been written, it must first be written back to the slower memory
Caches are physically closer to the processor than the rest of memory
Large cache may even be within the chip (Pentium Pro)
They are usually faster than the rest of memory
They are usually more expensive than the rest of  memory
Rule of thumb: a processor using a cache with a 95% hit rate will run as if all memory is the same speed as the cache
Memory management hardware and operating system support are needed to ensure the contents of the cache and memory are consistent
Memory Hierarchy
Caches:
Example Organization
    Cache Memory
Small cache of expensive but very fast memory interacting with slower but much larger memory
Invisible to OS and user programs but interact with other memory management hardware
Processor first checks if word referenced to is in cache
If not found in cache, a block of memory containing the word is moved to the cache
    The Hit Ratio
Hit ratio = fraction of access where data is in cache
T1 = access time for fast memory
T2 = access time for slow memory
T2 >> T1
When hit ratio is close to 1 the average access time is close to T1
   Locality of Reference
Memory reference for both instruction and data tend to cluster over a long period of time.
Example: once a loop is entered, there is frequent access to a small set of instructions.
Hence: once a word gets referenced, it is likely that nearby words will get referenced often in the near future.
Thus, the hit ratio will be close to 1 even for a small cache.
Disk Cache
(same principles)
A portion of main memory used as a buffer to temporarily to hold data for the disk
Locality of reference also applies here: once a record gets referenced, it is likely that nearby records will get referenced often in the near future.
If a record referenced is not in the disk cache, the sector containing the record is moved into the disk cache.
Cache Coherency: Simple Example
   Caches (cont.)
Caching is a concept that is exploited by software systems as well
Used for improving disk input/output subsystems (such as a distributed file system)
When reading from a disk read a full buffers worth of data even if the program requests less
On the next read, the data may already be available in the buffer
Used within application programs themselves
Web browser
Keep a copy of the most recently activated HTML pages so that if the user goes back/forward the page doesn’t have to be retrieved again
   Caches: Pentium PC
   Caches: PowerPC
Input/Output Components
Input/Output (IO) components permit computer systems to interact with the real world via IO interactions
What is the typical model for interacting with an I/O component?
Subset of the processor’s registers are classified as control, status, and data registers for the device
Writing to control registers influences how IO interactions are performed, or causes them
Reading data registers obtains input
Writing to data registers causes output
Status registers are read only and give status of device
Busy, ready, in an error mode
   I/O Module Structure
Data to/from system bus are buffered in data register(s)
Status/Control register(s) holds
current status information
current control information from
I/O logic interact with CPU via control bus
Contains logic specific to the interface of each device
I/O Communication Techniques
3 techniques are possible for I/O operation
Programmed I/O
Does not use interrupts: CPU has to wait for completion of each I/O operation
Interrupt-driven I/O
CPU can execute code during I/O operation: it gets interrupted when I/O operation is done.
Direct Memory Access
A block of data is transferred directly from/to memory without going through CPU
   Programmed I/O
I/O module performs the action, on behalf of the processor
But the I/O module does not interrupt the CPU when I/O is done
Processor is kept busy checking status of I/O module
   Problem: I/O is slow!
WRITE transfer control to the printer driver (I/O pgm)
I/O pgm prepare I/O module for printing (4)
CPU has to WAIT for I/O command to complete
Long wait for a printer
I/O pgm finishes in (5) and report status of operation
User program is on hold until WRITE terminates: not very efficient!
Better: do something else while I/O operation completes
   Interrupt-Driven I/O
Processor is interrupted when I/O module ready to exchange data
Processor is free to do other work
No needless waiting
Consumes a lot of processor time because every word read or written passes through the processor
   Direct Memory Access
CPU issues request to a DMA module (separate module or incorporated into I/O module)
DMA module transfers a block of data directly to or from memory (without going through CPU)
An interrupt is sent when the task is complete
The CPU is only involved at the beginning and end of the transfer
The CPU is free to perform other tasks during data transfer
Disks and Direct Memory Access (DMA)
Disk accesses can be caused by:
Specifying: an address for a buffer in memory to be read/written, the number of bytes, the location on the disk, and the type of operation (read/write)
Signaling the disk controller that an operation is required
The disk controller can cause an interrupt when the operation is complete
Direct memory access
Direct connections can be made between memory and a disk device
Once a disk is ready to read/write data it can transfer it directly to/from memory without the help of the processor
Provides for a more efficient use of the processor
Note: the processor and disk can now compete for the memory and possibly for the bus/lines connected to the memory
Interrupt Controller Circuitry and Serving Interrupts
Interrupt controller circuitry is used to arbitrate between interrupts from:
Timers
Keyboards
Serial ports
Hard disk controllers
Floppy disk controllers
and Printers
When the processor services the interrupt it postpones what it is doing to support the interaction
The processing needed to service the interrupt depends on the component that causes the interrupt
To process an interrupt, the interrupt code must:
Save the contents of the registers modified to do the interrupt processing
Do the interrupt processing
Restore the registers before leaving
   Interrupt Handler
Is a program that determines nature of the interrupt and performs whatever actions are needed
Control is transferred to this program
Control must be transferred back to the interrupted program so that it can be resumed from the point of interruption
This point of interruption can occur anywhere in the program
Thus: must save the state of the program (content of PC + PSW + registers + ...)
Simple Interrupt
Processing
Interrupts improve
CPU usage
I/O pgm prepares the I/O module and issues the I/O command (eg: to printer)
I/O pgm branches to user pgm
User code gets executed during I/O operation (eg: printing): no waiting
User pgm gets interrupted (x) when I/O operation is done and branches to interrupt handler to examine status of I/O module
Execution of user code resumes
   Classes of Interrupts
I/O
signals normal completion of operation or error
Program Exception
overflows
try to execute illegal instruction
reference outside user’s memory space
Timer
preempts a pgm to perform another task
Hardware failure (eg: memory parity error)
Multiple Interrupts: Sequential Order
Disable interrupts during an interrupt
Interrupts remain pending until the processor enables interrupts
After interrupt handler routine completes, the processor checks for additional interrupts
Multiple Interrupts: Priorities
Higher priority interrupts cause lower-priority interrupts to wait
Causes a lower-priority interrupt handler to be interrupted
Example: when input arrives from communication line, it needs to be absorbed quickly to make room for more input
Multiprogramming
When a program reads a value on an I/O device it will need to wait for the I/O operation to complete
Interrupts are mostly effective when a single CPU is shared among several concurrently active processes.
The CPU can then switch to execute another program when a program waits for the result of the read operation. (more later)
   Instruction Processing
For programming purposes, each processor is characterized by its register set and its instruction set
Instructions: operations and operands
How does a processor architecture work?
Von-Neuman architecture (repeat 3 steps)
1. Fetch instruction
2. Decode instruction
3. Execute instruction
Older processors did this sequentially
Newer processors have pipelines so that steps 1-3 are done in parallel whenever possible
    Fetch Step
Fetch
How does a processor fetch instructions?
Executes a memory read at the address in the Program counter (PC) register
Increments the PC to point to the next instruction location
Instruction fetched from bus and read into the instruction register (IR)
Instructions act on registers and operands
Processor executes the operation involving its registers
May cause bus interactions to read and write operands into registers
Getting Started and Flags
How does a processor get started?
Power is applied
Initializes the PC to contain a specific value
OR
The processor initializes with a memory read from a specific location
Assumes the value read is the address of the first instruction to be executed (ie. loads the read value into the PC)
Typically, processors fetch a first instruction from a ROM that contains a control program
What are the flags for?
Reflect the result of executing instructions
Zero bit: the result of an operation was 0
Sign bit: the result was positive or negative
Assumes a coding convention for +- values
Carry bit: a carry resulted from an addition
Values of the flags can be tested during some instructions and influence how the execution proceeds
   Instructions
What are the basic instruction categories?
1. Data transfers
2. Arithmetic and bit manipulation operations
3. Program control transfer operations
1. Data transfer: transfers data between
memory and registers
between IO components and registers
between registers
Some processors support block transfers to move more than one word
Ex: MOV destination, source
Move the contents of source to the destination
These instructions do not usually modify flags
   Instructions (cont.)
2. Arithmetic and bit manipulation operations
Add, subtract, multiply, divide
Ex: ADD destination, source
Add the contents of source to the contents of destination and leave the result in destination
Modify the flags as appropriate
Bit manipulation
Logical (and, or, xor, not)
Shift (left, right, logical/arithmetic)
Rotate (left, right)
Flag manipulation (set, clear, complement)
Ex: SHR destination, count
Shift the contents of the destination right the number of bits specified in the count
For each bit shifted, zero is shifted into the most significant bit (MSB)
Modify flags as is appropriate
   Instructions (cont.)
3. Program control transfer
Determine where the next instruction is located (sequential transfer is implicit in Von-Neuman Architecture)
Branch instructions
Unconditional
Branch to a location (or offset from this location)
Conditional
Test flags, transfer under some condition
Iteration
Perform arithmetic modification to a register (increment or decrement)
Conditionally transfer based on result
Subroutine support
Call/return (unconditional and conditional)
HALT to terminate a program
Example: Logic for an Assembler Language Program
unsigned int R0, R1, R2;   Register set for        processor
unsigned int M[64K];        Memory
R0 = M[0];                    Move
R1 = 5;                    Move
R2 = R2 + R1;  -- too complicated
R2 = R0;                  Move
R2 += R1;                Add
If R2 != 0 Goto End  Branch
R2 = 0;                               Move
M[0] = R2;                         Move
End: HALT