Assembler
Recap
Processor characterized by register set
Instruction set: operations + operands
Active behaviour
IR = memory[PC]
PC = PC + 1 (go to next instruction)
decode instruction
execute instruction
Assembly language concepts
A program consists of data and instructions loaded into memory so that, when the processor begins executing at the start address, the computer safely meets its objectives
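The fetch/decode/execute cycle in the recap can be sketched in a few lines of Python. This is only an illustration on a made-up toy machine: the opcodes LOAD, ADD, JMP and HALT, the single accumulator, and the function name run are assumptions for the sketch, not 8088 features.

```python
# Toy machine (hypothetical, not the 8088): each memory cell holds one
# instruction tuple, and a single accumulator register holds data.
def run(memory, start=0):
    pc = start                  # program counter
    acc = 0                     # accumulator
    while True:
        ir = memory[pc]         # IR = memory[PC]
        pc = pc + 1             # PC = PC + 1 (go to next instruction)
        op, operand = ir        # decode instruction
        if op == "LOAD":        # execute instruction
            acc = operand
        elif op == "ADD":
            acc = acc + operand
        elif op == "JMP":
            pc = operand        # a jump simply overwrites the PC
        elif op == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {op!r}")

program = [("LOAD", 5), ("ADD", 7), ("HALT", 0)]
print(run(program))
```

Note that the PC is incremented before the instruction executes, which is why a jump only has to store a new value into it.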

Assembler Language Concepts
Program development
1. Specify what the module must do
2. Design the module (data structure + algorithms)
3. Code the module
4. Load into system and execute
5. Test and debug
Steps 1, 2
Analysis and Design techniques
Step 3
Need a language, assembler (compiler), linker...

Anatomy of an Assembler Language
A) Expressions
(1) Represent numbers, characters, strings (0, ‘0’, “AB”)
(2) Labels for addresses or constants (“STEP2”, “MAX”)
(3) Operators (+,-)
(4) Brackets ()
(5) Location counter - special reference to current memory location
B) Register mnemonics
Names that represent registers R0, R1, or AX ..
C) Instruction mnemonics
Names that identify instructions MOV
D) Addressing mode syntax
Identify whether a word contains a value, reference (pointer), or other.... (will see more later)
E) Instruction statements
Operation and operands

Anatomy of an Assembler Language (cont.)
F) Memory declarations
Reserve and possibly initialize memory
G) Comments
For human consumption, ignored by the assembler
H) Macros
Define a sequence of code (like a procedure) with a name
PUSHALL MACRO
PUSH AX BX CX DX
PUSH DS SI
PUSH ES DI
ENDM
To invoke, use name instead of the code sequence
PUSHALL
to generate
PUSH AX BX CX DX
PUSH DS SI
PUSH ES DI

Anatomy of an Assembler Language (cont.)
I) Assembler directives
Special instructions to assembler
Ex:
How to create and structure a program (type of processor, for ex: 80186, 80286, ...)
Conditional assembly
IFDEF Test
; process these assembly language statements
ELSE
; process these assembly language statements
ENDIF
How to structure the resulting support files (listings, cross reference files, etc.)
Sometimes directives exist for high level statements
While while_expression
macro_body
ENDM

Example: IBM PC, Intel 8088/8086
Processor
Intel 8088 - PC, XT
20 bit address bus
8 bit data bus
8086 => 16 bit data bus, so it executes more quickly, fewer cycles needed to load data
Memory, example 1 Megabyte (1 MB)
Memory map
00000 - 9FFFF RAM - interrupt vectors, DOS and user program space
A0000 - BFFFF RAM - video display memory
C0000 - FFFFF ROM BIOS, boot ROM
Input/Output
Parallel printer interface (can send bits in parallel)
Serial communications (one bit at a time)
Keyboard - parallel interface
CRT,  Disk controllers, add-on boards....

   8088
Bus interface
8 bit data bus
20 bit address bus, each byte in memory is accessible
16 bit registers
A 16 bit register can point to a byte or to a 2 byte word; for words, the order of the two bytes must be decided (the 8088 stores the low byte first)
How do the 16 bit registers work with 20 bit addresses? We will take a look soon....
Control bus, beyond the scope of this course

   8088 (cont.)
8088 Register set
General purpose 16 bit registers
Can also split each register into two 8 bit registers
H is high byte, L is low byte, X is 16 bit (both)
AH, AL, AX
BH, BL, BX
CH, CL, CX
DH, DL, DX
Also have
BP base pointer 16 bit
SP stack pointer 16 bit
SI source index 16 bit
DI destination index 16 bit
Segment registers and control registers
DS data segment 16 bit
CS code segment 16 bit
SS stack segment 16 bit
ES extra segment 16 bit
IP instruction pointer (PC) 16 bit
FLAGS status word 16 bit but only 9 are used as flags

   8088 (cont.)
Use of registers
AX - accumulator, supports multiply, divide, I/O, decimal arithmetic, translate
BX - general, translate, based addressing mode
CX - general, string operations, loop control counters
DX - general + multiply and divide, indirect I/O
SP - stack operations
BP - base access
SI - string operations (source)
DI - string operations (destination)
Segment registers used for addressing

   What is a Segment?
20 bit address can access 1 MB of memory
16 bit registers can only access 64KB of memory
Need some strategy for accessing more memory!
Approach
Use value in segment + offsets from register
Notation SEG:OFF
Both value in segment and offset are 16 bits
To create a 20 bit address
(1) Shift segment value left by 4 bits (shift in 0s)
(2) Add offset to get 20 bit value
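The two steps above can be sketched as a small Python function. The name physical_address is an assumption made for this sketch; the final mask models the fact that the 8088 has only 20 address lines, so the sum wraps around at 1 MB.

```python
def physical_address(segment, offset):
    """Combine a 16 bit segment and a 16 bit offset into a 20 bit address."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # (1) shift the segment value left by 4 bits, (2) add the offset.
    # Only 20 address lines exist, so the result is kept to 20 bits.
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x1000, 0x0028)))
```

Shifting left by 4 bits is the same as appending one hexadecimal 0 to the segment value, which makes the addition easy to do by hand.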

   About Segments
Note:
Segments create the possibility of many ways to address the same memory location
For ex: Suppose the memory location is 10028Hex
The following all reference the same cell
1000:28, 1001:18, 1002:8, 0FFF:38
For ex: 1000:28 => 10000 + 28 = 10028
As a result 20 bit addresses never appear in an instruction (or the assembly language), we always use a 16 bit segment and 16 bit offset
How does the 8088 use segments?
Memory access (including instruction fetching) is performed using one of the segment registers as the segment value and an operand as the offset
instruction fetch CS:IP
default data access DS:offset operand
default stack access SS:SP
Creates 64K windows with exact memory locations determined by value in segment registers

8088 Addressing Modes
Addressing modes identify where the source and destination operands of an instruction are found
Modes
Memory: operand is in a memory location
Immediate: operand is part of instruction (constant)
Register: operand is a processor register
Memory modes: operand is in a memory location
(1) Direct
address of operand is part of instruction
(2) Indirect: use register contents + direct information
(a) register - offset is in one of BX, BP, SI, DI (a default segment register is assumed)
(b) based DS:BX + 16 bit displacement (Sets up default)
(c i) indexed DS:SI or DI + displacement
(c ii) indexed ES:SI or DI + displacement
(d i) based indexed DS:BX + SI or DI + displacement
(d ii) based indexed SS:BP + SI or DI + displacement

Addressing Modes: Examples
Immediate MOV AX, 0
Register MOV AX, CX
Direct MOV AX,[1234]
Loads contents of memory location DS:1234 into AX (16 bits because of the X)
Assumes DS as the default segment register
Direct MOV [1234],AX
Store AX into memory location @DS:1234
Indirect
(a) MOV AX, [BX]
Moves contents of word @DS:[BX] into AX
(b) MOV [BX+1], AL
Move a byte into @DS:[BX + 1]
(c) MOV CH,[DI +10]
(d) MOV AX,[BP+SI]
(d) MOV BX,[BX + DI +  4]
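As a rough sketch of how the offset of a memory operand is formed in the examples above, the Python function below adds an optional base register, an optional index register and a displacement, and picks the default segment register. The function name, the regs dictionary and the register values are assumptions for illustration only.

```python
def effective_address(regs, base=None, index=None, disp=0, override=None):
    """Return (segment register name, 16 bit offset) for a memory operand."""
    if base not in (None, "BX", "BP"):
        raise ValueError("base must be BX or BP")
    if index not in (None, "SI", "DI"):
        raise ValueError("index must be SI or DI")
    offset = disp
    if base:
        offset += regs[base]
    if index:
        offset += regs[index]
    # Operands based on BP default to the stack segment, all others to DS.
    segment = override or ("SS" if base == "BP" else "DS")
    return segment, offset & 0xFFFF      # offsets wrap at 64K

regs = {"BX": 0x0100, "BP": 0x2000, "SI": 0x0006, "DI": 0x0010}
print(effective_address(regs, disp=0x1234))                    # MOV AX,[1234]
print(effective_address(regs, base="BX"))                      # MOV AX,[BX]
print(effective_address(regs, index="DI", disp=10))            # MOV CH,[DI+10]
print(effective_address(regs, base="BP", index="SI"))          # MOV AX,[BP+SI]
print(effective_address(regs, base="BX", index="DI", disp=4))  # MOV BX,[BX+DI+4]
```

The returned pair is still a SEG:OFF address; the 20 bit physical address is obtained from it by the usual shift-and-add.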

Basic Addressing Modes
Immediate
+: no memory reference
-: limited operand magnitude
Direct
+: simple
-: limited address space
Indirect
+: large address space
-: multiple memory references
Register
+: no memory reference
-: limited address space
Displacement
+: flexibility
-: complexity

Pentium Addressing Modes

Power PC Addressing Modes

Memory Reference Ambiguities
Consider MOV [BX],5
Should we move a byte or a word?
Need to instruct the assembler
WORD PTR
BYTE PTR
So we get:
MOV WORD PTR [BX], 5
OR
 MOV BYTE PTR [BX], 5
Consider the following example

   8088 Flags
Flag values change as part of execution of the instructions
General rules for the 6 condition/status flags
PF - parity flag - set if and only if (iff) the least significant byte of the result contains an even number of 1s
ZF - zero flag - set iff result was zero
SF - sign flag - set equal to msbit of result
AF - auxiliary flag
Used by decimal arithmetic instruction
Set iff there has been a carry from (or borrow to) the low nibble to (from) the high nibble
CF - carry flag
For unsigned arithmetic => overflow
Set iff carry out of (or borrow into) the msbit of the result
Can also be used in rotate instructions

   Flags (cont.)
OF - overflow flag
For signed arithmetic - set iff the result cannot be represented in the fixed width
Equivalently: set iff the carry into the msbit differs from the carry out of the msbit
Set iff a carry/borrow into/from the msbit but not into/from the fictitious bit above the msbit OR a carry/borrow into/from the fictitious bit above the msbit but not into/from the msbit
3 Control flags
TF - single step trap flag - if set - single step interrupt occurs after execution of next instruction, TF is cleared by the single step interrupt
IF - interrupt flag - if set - CPU will recognize maskable interrupts
DF - direction flag - controls auto increment/decrement of the index registers on string instructions - if set, auto-decrement; if clear, auto-increment
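The rules for the six status flags can be checked with a small Python sketch of an 8 bit addition. The function name add8_flags and the dictionary layout are assumptions made for the sketch; only ADD is modelled, not subtraction or the control flags.

```python
def add8_flags(a, b):
    """Add two 8 bit values; return (result, status flags)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    carry_into_msb = ((a & 0x7F) + (b & 0x7F)) >> 7
    carry_out_of_msb = total >> 8
    flags = {
        "CF": carry_out_of_msb,                         # unsigned overflow
        "PF": int(bin(result).count("1") % 2 == 0),     # even number of 1s
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),    # carry out of low nibble
        "ZF": int(result == 0),
        "SF": result >> 7,                              # copy of the msbit
        "OF": carry_into_msb ^ carry_out_of_msb,        # signed overflow
    }
    return result, flags

print(add8_flags(0x7F, 0x01))   # signed overflow: 127 + 1
print(add8_flags(0xFF, 0x01))   # unsigned overflow: 255 + 1
```

The two calls show that CF and OF are independent: 7Fh + 01h sets OF but not CF, while FFh + 01h sets CF but not OF.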

Turbo Assembler: Assembler Language Code Template
.MODEL Small ; identifies the memory model
.STACK 200h ; 200h bytes are reserved
.DATA
; define the data variables here
.CODE
startlabel:
; instructions
END startlabel

   Data Declaration
Reserve and initialize memory locations
Can define symbolic labels (names) for memory locations
Value of the label is a memory address
General form LABEL TYPE OPERAND
LABEL: legal characters are alphanumeric + [‘_’ | ‘@’ | ‘$’ | ‘?’], cannot start with a digit
TYPE: DW for a word, DB for a byte
OPERAND: one or more definitions separated by commas
EX:
X DW 0
Y DW -1
Z DB 5
Hi DB 'Hello'

Data Declarations (cont.)
More examples
MyArray DW 10,0,-5,20,13,6
A_Block DB 100h DUP(0)
Initializes the 100h  values to 0
MoreSpace DW 20 DUP(?)
Does not initialize the 20 values
A4by3_Array DW 4 DUP(0,1,-1)
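To make the memory layout of these declarations concrete, here is a Python sketch that produces the bytes an assembler would emit. The helper names db, dw and dup are assumptions for the sketch; uninitialized (?) entries are not modelled.

```python
def db(*values):
    """Bytes for a DB directive; a string expands to one byte per character."""
    out = bytearray()
    for v in values:
        if isinstance(v, str):
            out += v.encode("ascii")
        else:
            out.append(v & 0xFF)
    return bytes(out)

def dw(*values):
    """Bytes for a DW directive; each word is stored low byte first."""
    out = bytearray()
    for v in values:
        out += (v & 0xFFFF).to_bytes(2, "little")
    return bytes(out)

def dup(count, *values):
    """Expand count DUP(values) into a flat list of values."""
    return list(values) * count

print(dw(0, -1).hex())                # X DW 0 followed by Y DW -1
print(db("Hello").hex())              # Hi DB 'Hello'
print(dup(4, 0, 1, -1))               # 4 DUP(0,1,-1)
print(len(dw(*dup(4, 0, 1, -1))))     # bytes reserved by A4by3_Array
```

The last line shows that A4by3_Array reserves 12 words, that is 24 bytes, because the three-value pattern is repeated four times.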

   Pentium Data Types

   PowerPC Data Types
Unsigned byte: can be used for logical or integer arithmetic operations, loaded from memory into a general register by zero-extending on the left to full register size
Unsigned halfword: as above, for 16-bit quantities
Signed halfword: used for arithmetic operations, loaded by sign-extension to register size
Unsigned word: used for logical operations and as address pointer
Signed word: used for arithmetic operations
Unsigned doubleword: used as an address pointer
Byte string: from 0 to 128 bytes in length
IEEE 754 single- and double-precision floating-point data types

   Assembler Example
.DATA
X DW 1
Y DW -1
My_Array DW 4 DUP(10)
.CODE
START: MOV  AX,@DATA
MOV  DS,AX
MOV  AX,X
ADD   AX,Y
......
MOV BX,OFFSET My_Array
MOV [BX], AX
MOV [BX+2],AX
MOV [My_Array+4],AX
MOV SI,6
MOV [BX +SI],AX
END  START

   Assembler Example
Initialize an array example
.DATA
ArraySize EQU 4
InitialValue EQU 0
ArrayX DW ArraySize DUP(?)
.CODE
Start: MOV AX, InitialValue
MOV BX,OFFSET ArrayX
MOV     CX,0
InitLoop: CMP CX, ArraySize
JE DONE
MOV [BX],AX
ADD BX,2
ADD CX,1
JMP InitLoop
DONE: ....

Assembler Example: CX as a loop counter
We can take advantage of CX as a loop counter to speed up the program
MOV AX, InitialValue
MOV BX, OFFSET ArrayX
MOV CX, ArraySize
JCXZ DONE
InitLoop: MOV [BX],AX
ADD BX,2
LOOP InitLoop
...
DONE:
The LOOP instruction works only with CX as the loop counter
Compilers (good compilers) generate code to take advantage of special instructions that help reduce execution time
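The reason for the JCXZ guard in front of the loop can be seen in a short Python sketch of the LOOP semantics (decrement CX, branch back while CX is not zero). The function name loop_iterations and its flag argument are assumptions for the sketch.

```python
def loop_iterations(cx, guard_with_jcxz=True):
    """Count how many times a LOOP-terminated body runs for a starting CX."""
    cx &= 0xFFFF
    if guard_with_jcxz and cx == 0:     # JCXZ DONE
        return 0
    iterations = 0
    while True:
        iterations += 1                 # loop body: MOV [BX],AX / ADD BX,2
        cx = (cx - 1) & 0xFFFF          # LOOP decrements CX ...
        if cx == 0:                     # ... and falls through when it reaches 0
            return iterations

print(loop_iterations(4))
print(loop_iterations(0, guard_with_jcxz=False))
```

Without the guard, a starting count of 0 wraps around to FFFFh on the first decrement, so the body runs 65536 times instead of not at all.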

   Control Flow
What if we want to make use of the concept of subroutines?
Jumps and Jump Tables
JMP   (Unconditionally Jump to a location)
JMP Register/Memory_Label
Like a “Go TO” statement
JMP   End_Switch
OR
MOV BX,OFFSET End_Switch
JMP   BX
Can be used for indirect calls (will see soon)
Subroutines
CALL  to call a subroutine
RET     to return from a subroutine
Software Interrupts
Special type of indirect call used to simulate hardware interrupts
For Debugging, to execute routines within the OS and device drivers
INT, IRET (Interrupt, Interrupt return)

   Jump Tables
Can have a multi-way jump by building a jump table
We can choose between routines to execute
PROG0, PROG1,...., PROGN
Set up a table with the addresses of the routines to execute
TABLE DW PROG0
DW PROG1
DW PROG2
.....
Load the address of the routine into a register based on contents of another register
CMP BX, 4
JNC ERROR
SHL BX, 1
MOV BX, [TABLE+BX]
JMP BX
.......
Can add more routines by increasing the length of the table
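The same multi-way jump can be sketched in Python with a list of functions standing in for the table of routine addresses. The routine bodies and the name dispatch are hypothetical; only the bounds check and the table lookup mirror the assembly above.

```python
def prog0(): return "PROG0"
def prog1(): return "PROG1"
def prog2(): return "PROG2"
def prog3(): return "PROG3"

TABLE = [prog0, prog1, prog2, prog3]     # TABLE DW PROG0, PROG1, ...

def dispatch(bx):
    index = bx & 0xFFFF
    # CMP BX,4 / JNC ERROR is an unsigned compare, so "negative" values
    # (large 16 bit numbers) are rejected as well.
    if index >= len(TABLE):
        return "ERROR"
    # SHL BX,1 scales the index to a byte offset because each DW entry
    # takes 2 bytes; a Python list index needs no scaling.
    return TABLE[index]()

print(dispatch(2))
print(dispatch(7))
```

Adding a routine means appending it to TABLE; in the assembly version the constant in the CMP instruction must be updated as well.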

   Returning from a Jump
What do we do? How do we branch back to another part of our code?
Can hard code a finite state machine into our assembly language program
But what if we want to branch back to where we came from, and we can come from many places?
Could store a return address in a chosen register, branch to its address at the end of the call
This would be a calling convention
More general support is available from subroutine CALL and RET operations

Stacks for Subroutine Support
Stack
Used to support assembly level subroutine calls
Helps store values temporarily and recall them later
Contiguous region of memory pointed to by the stack pointer SP register
If a stack is used, memory must be reserved for it first
sseg  segment stack ‘data’
dw 80h dup(?)
sseg ends
Two operations: PUSH and POP
PUSH
Inserts a new data value at the top of the stack
(1) Decrements the SP by 2
(2) Save the data value in memory location addressed by SP
POP
Returns the value of the data at the top of the stack and removes it from the stack
(1) return value from memory addressed by SP
(2) increment the stack pointer SP by 2
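The PUSH and POP steps can be sketched in Python on a small byte array. The class name Stack8088 and the 100h byte region (80h words, as in the declaration above) are choices made for this sketch; it does not check for stack overflow or underflow.

```python
class Stack8088:
    """Word stack in a small memory region; SP is an offset into it."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.sp = size                  # empty stack: SP just past the region

    def push(self, value):
        self.sp -= 2                    # (1) decrement SP by 2
        word = (value & 0xFFFF).to_bytes(2, "little")
        self.mem[self.sp:self.sp + 2] = word    # (2) store at location SP

    def pop(self):
        word = self.mem[self.sp:self.sp + 2]    # (1) read from location SP
        self.sp += 2                    # (2) increment SP by 2
        return int.from_bytes(word, "little")

s = Stack8088()
s.push(0x1234)
s.push(0xABCD)
print(hex(s.sp), hex(s.pop()), hex(s.pop()), hex(s.sp))
```

The stack grows toward lower addresses, and the last value pushed is the first one popped.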

   Stack Operations

   Subroutine Call/Return
CALL does a push of the instruction pointer (IP) onto the stack automatically (FAR call also pushes segment information)
RET pops the IP off the stack automatically (If FAR call, the FAR return is done automatically)
Consider the following nested subroutines
PRINTB prints 4 hexadecimal digits
PRINTB: MOV AL,BH ;get high order byte
CALL PBYTE ;print both nibbles
MOV AL,BL ;get low order byte
CALL PBYTE ; print both nibbles
RET
PBYTE: PUSH AX ;saves a copy of value AX
MOV CL,4 ;shift AL right 4 bits
SHR AL,CL
CALL PRINT ;print hex digit low nibble
;of AL
POP AX ;restore value of AX
CALL PRINT ;print second hex digit
RET
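For comparison, the nested routines can be mirrored in Python. The PRINT routine is not shown in the listing, so print_nibble below is an assumed stand-in that produces the hex digit for the low nibble of AL; the functions return strings instead of writing to a display.

```python
def print_nibble(al):
    """Stand-in for PRINT: hex digit for the low nibble of AL."""
    return "0123456789ABCDEF"[al & 0x0F]

def pbyte(al):
    """Mirror PBYTE: high nibble first (after the shift), then low nibble."""
    return print_nibble(al >> 4) + print_nibble(al)

def printb(bx):
    """Mirror PRINTB: high byte (BH) first, then low byte (BL)."""
    bh = (bx >> 8) & 0xFF
    bl = bx & 0xFF
    return pbyte(bh) + pbyte(bl)

print(printb(0x1A2F))
```

In the assembly version PBYTE must save AX with PUSH before shifting, because the shift destroys the low nibble that is needed for the second digit; the Python version avoids this only because al >> 4 creates a new value.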

   Passing Parameters
How do we pass parameters?
Can push parameters onto the stack before calling to pass parameters, the subprogram can then reference the stack for the parameters
They can be discarded from the stack using the following variant of the RET operation
RET nn
Where nn is the number added to the SP  just after the return address is popped off
By Value
Push the value
By Reference
Push the address, the called routine must also assume the parameter will be accessed indirectly

   Stack Frame Pointers
Values and references on the stack can’t just be popped off because the SP must remain pointed at the return address
Solution
Load the SP into the base pointer BP and access the parameters through the BP
Note: BP is used by many subprograms, so it must be saved first! Where? Programmer saves it on the stack.
DS, SS, BP are also saved/restored to original status (by the programmer) if changed by a subprogram (more calling conventions)
BP is said to point to a stack frame

Example of PASCAL Calling Conventions
PASCAL calling convention example
Procedure outpt(var port_num, valu : integer)
PUBLIC OUTPT ;enable access to external program
A_TEXT SEGMENT ‘CODE’
ASSUME CS:A_TEXT
OUTPT PROC FAR ;to enable far return
PUSH BP ;save old value
MOV BP,SP ;point to top of stack
MOV BX,[BP+8] ;addr. of 1st param
MOV DX,[BX]    ;value of first parameter
MOV BX,[BP+6] ;addr of 2nd param
MOV AX,[BX] ;value of 2nd parameter
OUT DX,AL ;do output operation
POP BP ; programmer has to do this
RET 4 ; advance SP past parameters
OUTPT ENDP
A_TEXT ENDS
END
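The offsets [BP+8] and [BP+6] and the operand of RET 4 follow from what is on the stack when OUTPT runs. The Python sketch below traces this for a FAR call; the function name outpt_frame, the starting SP and the placeholder values for CS, IP and the caller's BP are assumptions for illustration.

```python
def outpt_frame(addr_port_num, addr_valu, sp=0x0100):
    """Trace SP and the stack contents around a far call to OUTPT."""
    stack = {}                          # offset -> 16 bit word

    def push(value):
        nonlocal sp
        sp -= 2
        stack[sp] = value

    sp_before = sp
    push(addr_port_num)                 # caller pushes address of 1st parameter
    push(addr_valu)                     # caller pushes address of 2nd parameter
    push(0xC5C5)                        # CALL FAR pushes CS (placeholder value)
    push(0x1111)                        # CALL FAR pushes IP (placeholder value)
    push(0xB0B0)                        # PUSH BP (placeholder for caller's BP)
    bp = sp                             # MOV BP,SP

    first = stack[bp + 8]               # MOV BX,[BP+8]
    second = stack[bp + 6]              # MOV BX,[BP+6]

    sp += 2                             # POP BP
    sp += 4                             # far return pops IP and CS
    sp += 4                             # RET 4 discards the two parameters
    return first, second, sp == sp_before

print(outpt_frame(0x0200, 0x0204))
```

After RET 4 the stack pointer is back where it was before the caller pushed the parameters, which is why the Pascal caller does no cleanup.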

Difference with C Calling Conventions
C calling conventions
Parameters are passed by value only
Values can be addresses of variables (ie. &my_variable) so this isn’t a big restriction
Parameters are passed in a reverse order
Subroutine does not pop the parameters off the stack before returning
Pop is done by the caller
Subroutine doesn’t have to know how many parameters are passed to it (it can be deduced at runtime); it just executes RET
Other differences
Pascal assumes you will not change the BP, DS, and SS registers and the direction flag must be cleared
C assumes SI, DI, BP and direction flag will not be changed
Other languages make other assumptions about use of registers etc., order of parameters etc.
Interlanguage calls need pragmas to explain which calling conventions should be used

   C Example
#include <stdio.h>
void do_it(int a, int * b, int c, int d, int e, int f, int g){
       (*b) = a;
}
int a;
int main (void){
       int  b,c=0,d=0,e=0,f=0,g=0;
       a = 1;
       b = 2;
       do_it(a,&b,c,d,e,f,g);
       printf("a %d b %d \n",a,b);
       return 0;
}
Output: a 1 b 1

Corresponding Assembler Code for 80386 Processor
do_it proc
push ebp
mov ebp,esp
mov [ebp+08h],eax;a
mov [ebp+0ch],edx;b
mov [ebp+010h],ecx;c
; 4        (*b) = a;
mov eax,[ebp+0ch];b
mov ecx,[ebp+08h];a
mov [eax],ecx
leave
ret
do_it endp

   Software Interrupts
INT Similar to the Call instruction
INT nn  executes the following sequence of instructions
PUSH flags,CS, and IP Registers onto Stack
IP and CS registers are loaded with the contents of the two words at the absolute memory addresses 0:nn*4 and 0:nn*4+2
Program control jumps to the address contained in these two absolute memory locations
Address locations 0:0 through 0:3FFH are an interrupt vector (table) and contain starting addresses of subroutines you may want to execute
To finish an interrupt, use IRET
It restores CS, IP and flags in the proper order
Note: RET only restores IP (or IP and CS for a FAR call)
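The lookup that INT nn performs in the interrupt vector table can be sketched in Python. The function names and the table entry written into the memory image are made up for the sketch; only the address arithmetic (nn*4 for IP, nn*4+2 for CS) comes from the description above.

```python
def vector_location(nn):
    """Absolute addresses of the IP word and the CS word for interrupt nn."""
    if not 0 <= nn <= 0xFF:
        raise ValueError("interrupt number must fit in one byte")
    ip_addr = nn * 4          # 0:nn*4
    cs_addr = nn * 4 + 2      # 0:nn*4+2
    return ip_addr, cs_addr

def handler_address(memory, nn):
    """Read the (CS, IP) pair for interrupt nn from a memory image."""
    ip_addr, cs_addr = vector_location(nn)
    ip = int.from_bytes(memory[ip_addr:ip_addr + 2], "little")
    cs = int.from_bytes(memory[cs_addr:cs_addr + 2], "little")
    return cs, ip

memory = bytearray(0x400)     # 256 vectors * 4 bytes = 0:0 through 0:3FFH
# Made-up entry for interrupt 21h: IP = 1234h, CS = F000h
memory[0x21 * 4:0x21 * 4 + 4] = bytes([0x34, 0x12, 0x00, 0xF0])
print(vector_location(0x21))
print(handler_address(memory, 0x21))
```

Each entry takes 4 bytes, so 256 interrupt numbers fill exactly the range 0:0 through 0:3FFH.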

Why Software Interrupts?
Hardware interrupts are used to handle communication to/from peripheral devices
Hardware interrupt occurs when an external device sends a signal to the 8088’s interrupt request line along with an interrupt number n on the 8088 data lines. The 8088 then:
1. Stops whatever is being done
2. Jumps to a subroutine n to handle the data going to/from the device
3. Returns to whatever was being done
Software interrupts
Can be used to test the implementation of the interrupt handling routines
Interrupt table approach is useful because
Indirection given by the interrupt vector means we only need to specify the address of the routine in the vector, can reference the routine using n
Can lead to problems if 2 devices expect to use same n

Other Instructions of Processors
NOP No operation
Can be used to fill in code areas to start instructions on word boundaries
Can be used to control timing in a loop
HLT Halt, to stop execution until:
A pulse is sent to the reset line
Or an interrupt occurs
Useful for debugging
LOCK
Used to synchronize operations when multiple processors share memory
Provides support for semaphores that can be used to provide mutual exclusion over hardware resources

Mutual Exclusion Example
Consider two processors that compete for access to a shared memory location
Semaphore is a memory location containing a flag bit that is set if either CPU is using the managed resource
Only one CPU is given access at a time
BUSY: CMP SEMPHR, 80H;Is semaphore set by other CPU
JE BUSY          ;Hang in loop if it is
MOV SEMPHR,80H ;Else set the semaphore
....          ;Do whatever
MOV SEMPHR,0      ;clear the semaphore
Problem
Both processors could pass JE and set the semaphore!
Need to guarantee mutual exclusion!
Need a single indivisible instruction that tests and sets the semaphore
Must prevent doing test and set with more than one memory access
If 2 accesses are required, other processors can sneak between the test and set (If they can, they eventually will)
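The failure can be reproduced deterministically by replaying the two memory accesses of each CPU in a chosen order. The Python sketch below is a simulation only: the function name run_schedule, the CPU names A and B, and the step names "test" and "set" are assumptions standing in for the CMP/JE pair and the MOV.

```python
def run_schedule(schedule):
    """Replay (cpu, step) events; return the CPUs that enter the critical section.

    Each CPU performs two separate memory accesses, as in the listing:
    "test" reads the semaphore (CMP/JE) and "set" writes 80H (MOV).
    """
    semaphore = 0
    saw_free = {}           # cpu -> did its test find the semaphore clear?
    inside = []             # CPUs that believe they own the resource
    for cpu, step in schedule:
        if step == "test":
            saw_free[cpu] = (semaphore != 0x80)
        elif step == "set" and saw_free.get(cpu):
            semaphore = 0x80
            inside.append(cpu)
    return inside

# CPU B's test sneaks in between CPU A's test and set.
bad = [("A", "test"), ("B", "test"), ("A", "set"), ("B", "set")]
# CPU A completes both accesses before CPU B starts.
good = [("A", "test"), ("A", "set"), ("B", "test"), ("B", "set")]
print(run_schedule(bad))
print(run_schedule(good))
```

In the first schedule both CPUs see a clear semaphore and both enter; in the second, B's test sees 80H and B keeps waiting. Mutual exclusion holds only if the test and the set cannot be separated.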

Mutual Exclusion
(cont.)
Next attempt:
BUSY: MOV AL, 80H         ; Prepare to set sem. if not set
XCHG AL,SEMPHR  ; Set semaphore & get prev. value
OR AL,AL         ; Set flags
JNZ BUSY         ; Hang in loop if prev value not 0
....         ; Do whatever
MOV SEMPHR,0     ; clear the semaphore
Problem?
XCHG requires two memory accesses, so problem can still arise
Solution
LOCK instruction
When LOCK is placed in front of an assembler instruction a signal is sent to a CPU pin called LOCK
LOCK signal remains active for complete duration of a single instruction
Proper wiring of hardware ensures that the other processor will not be able to access the memory location while LOCK is set

Advantages/Disadvantages of Assembly Language
Advantages
Access to ports/registers etc. that are not accessible from high level languages
Can generate “Fast” code sequences
For example only save registers you modify
Disadvantages
May have to deal with Segments etc. that have nothing to do with the problem you are solving
Different languages for each processor
Poor readability
Difficult to debug (prone to errors)
Solution: higher level languages!