Assembler
|
|
|
|
|
Recap |
|
Processor characterized by register set |
|
Instruction set: operations + operands |
|
Active behaviour |
|
IR = memory[PC] |
|
PC = PC + 1 (go to next instruction) |
|
decode instruction |
|
execute instruction |
|
Assembly language concepts |
|
A program consists of data and instructions loaded into memory so that, when the processor begins executing at the start address, the computer meets its objectives
Assembler Language Concepts
|
|
|
|
Program development |
|
1. Specify what the module must do |
|
2. Design the module (data structure +
algorithms) |
|
3. Code the module |
|
4. Load into system and execute |
|
5. Test and debug |
|
Steps 1, 2 |
|
Analysis and Design techniques |
|
Step 3 |
|
Need a language, assembler (compiler),
linker... |
Anatomy of an Assembler Language
|
|
|
|
A) Expressions |
|
(1) Represent numbers, characters,
strings (0,‘0’,“AB”) |
|
(2) Labels for addresses or constants
(“STEP2”, “MAX”) |
|
(3) Operators (+,-) |
|
(4) Brackets () |
|
(5) Location counter - special
reference to current memory location |
|
B) Register mnemonics |
|
Names that represent registers R0, R1,
or AX .. |
|
C) Instruction mnemonics |
|
Names that identify instructions MOV |
|
D) Addressing mode syntax |
|
Identify whether a word contains a
value, reference (pointer), or other.... (will see more later) |
|
E) Instruction statements |
|
Operation and operands |
|
|
Anatomy of an Assembler Language (cont.)
|
|
|
|
|
F) Memory declarations |
|
Reserve and possibly initialize memory |
|
G) Comments |
|
For human consumption, ignored by the
assembler |
|
H) Macros |
|
Define a sequence of code (like a
procedure) with a name |
|
PUSHALL MACRO |
|
PUSH AX BX CX DX |
|
PUSH DS SI |
|
PUSH ES DI |
|
ENDM |
|
To invoke, use name instead of the code
sequence |
|
PUSHALL |
|
to generate |
|
PUSH AX BX CX DX |
|
PUSH DS SI |
|
PUSH ES DI |
|
|
|
|
Anatomy of an Assembler Language (cont.)
|
|
|
|
|
|
|
I) Assembler directives |
|
Special instructions to assembler |
|
Ex: |
|
How to create and structure a program (for example, the type of processor: 80186, 80286, ...)
|
Conditional assembly |
|
IFDEF Test |
|
; process these assembly language
statements |
|
ELSE |
|
; process these assembly language
statements |
|
ENDIF |
|
How to structure the resulting support files (listings, cross reference files, etc.)
|
Sometimes directives exist for high
level statements |
|
While while_expression |
|
macro_body |
|
ENDM |
Example: IBM PC, Intel 8088/8086
|
|
|
|
Processor |
|
Intel 8088 - PC, XT |
|
20 bit address bus |
|
8 bit data bus |
|
8086 => 16 bit data bus, so it
executes more quickly, fewer cycles needed to load data |
|
Memory, example 1 Megabyte (1 MB) |
|
Memory map |
|
00000 - 9FFFF RAM - interrupt vectors, DOS and user program space
|
A0000 - BFFFF RAM - video display
memory |
|
C0000 - FFFFF ROM BIOS, boot ROM |
|
Input/Output |
|
Parallel printer interface (can send
bits in parallel) |
|
Serial communications (one bit at a
time) |
|
Keyboard - parallel interface |
|
CRT,
Disk controllers, add-on boards.... |
8088
|
|
|
|
|
Bus interface |
|
8 bit data bus |
|
20 bit address bus, each byte in memory
is accessible |
|
16 bit registers |
|
An address points to a byte; it can also point to a 2-byte word, so the order of the two bytes must be decided (the 8088 stores the low byte at the lower address)
|
How do the 16 bit registers work with
20 bit addresses? We will take a look soon.... |
|
Control bus, beyond the scope of this
course |
8088 (cont.)
|
|
|
|
|
8088 Register set |
|
General purpose 16 bit registers |
|
Can also split each register into 2 8
bit registers |
|
H is high byte, L is low byte, X is 16
bit (both) |
|
AH, AL, AX |
|
BH, BL, BX |
|
CH, CL, CX |
|
DH, DL, DX |
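The H/L/X naming can be modelled in C with shifts and masks. This is only a sketch: the helper names `high_byte`, `low_byte`, `make_word`, and `set_low_byte` are made up for illustration and are not part of any assembler.

```c
#include <assert.h>
#include <stdint.h>

/* AH: the high byte of a 16-bit register value such as AX. */
uint8_t high_byte(uint16_t x) { return (uint8_t)(x >> 8); }

/* AL: the low byte of the same register. */
uint8_t low_byte(uint16_t x) { return (uint8_t)(x & 0xFF); }

/* Rebuild the 16-bit register (AX) from its two halves (AH, AL). */
uint16_t make_word(uint8_t high, uint8_t low) {
    return (uint16_t)(((unsigned)high << 8) | low);
}

/* Writing AL changes only the low byte; AH keeps its value. */
uint16_t set_low_byte(uint16_t x, uint8_t low) {
    return make_word(high_byte(x), low);
}

void register_halves_demo(void) {
    uint16_t ax = 0x1234;
    assert(high_byte(ax) == 0x12);   /* AH */
    assert(low_byte(ax) == 0x34);    /* AL */
    ax = set_low_byte(ax, 0xFF);     /* like MOV AL,0FFh */
    assert(ax == 0x12FF);
}
```

The same relationship holds for BX, CX, and DX and their halves.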
|
Also have |
|
BP base pointer 16 bit |
|
SP stack pointer 16 bit |
|
SI source index 16 bit |
|
DI destination index 16 bit |
|
Segment registers and control registers |
|
DS data segment 16 bit |
|
CS code segment 16 bit |
|
SS stack segment 16 bit |
|
ES extra segment 16 bit |
|
IP instruction pointer (PC) 16 bit |
|
FLAGS status word 16 bit but only 9 are
used as flags |
8088 (cont.)
|
|
|
Use of registers |
|
AX - accumulator, supports multiply,
divide, I/O, decimal arithmetic, translate |
|
BX - general, translate, based
addressing mode |
|
CX - general, string operations, loop
control counters |
|
DX - general + multiply and divide,
indirect I/O |
|
SP - stack operations |
|
BP - base access |
|
SI - string operations (source) |
|
DI - string operations (destination) |
|
Segment registers used for addressing |
What is a Segment?
|
|
|
|
|
20 bit address can access 1 MB of
memory |
|
16 bit registers can only access 64KB
of memory |
|
Need some strategy for accessing more
memory! |
|
Approach |
|
Use value in segment + offsets from
register |
|
Notation SEG:OFF |
|
Both value in segment and offset are 16
bits |
|
To create a 20 bit address |
|
(1) Shift segment value left by 4 bits
(shift in 0s) |
|
(2) Add offset to get 20 bit value |
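The two steps above can be sketched in C. The function name `physical_address` is an assumption for illustration; the masking to 20 bits models the fact that the 8088 has only 20 address lines, so the sum wraps around at 1 MB.

```c
#include <assert.h>
#include <stdint.h>

/* Form a 20-bit physical address from a 16-bit segment and a
   16-bit offset: shift the segment left by 4 bits, add the offset,
   and keep the low 20 bits. */
uint32_t physical_address(uint16_t segment, uint16_t offset) {
    uint32_t address = ((uint32_t)segment << 4) + offset;
    return address & 0xFFFFFu;
}

void physical_address_demo(void) {
    /* 1000:0028 gives 10000h + 28h = 10028h */
    assert(physical_address(0x1000, 0x0028) == 0x10028);
}
```

Different SEG:OFF pairs can produce the same result, because consecutive segment values start only 16 bytes apart.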
About Segments
|
|
|
|
|
Note: |
|
Segments create the possibility of many
ways to address the same memory location |
|
For ex: Suppose the memory location is
10028Hex |
|
The following all reference the same
cell |
|
1000:28, 1001:18, 1002:8, 0FFF:38 |
|
For ex: 1000:28 => 10000 + 28 =
10028 |
|
As a result 20 bit addresses never
appear in an instruction (or the assembly language), we always use a 16 bit
segment and 16 bit offset |
|
How does the 8088 use segments? |
|
Memory access (including instruction
fetching) is performed using one of the segment registers as the segment
value and an operand as the offset |
|
instruction fetch CS:IP |
|
default data access DS:offset
operand |
|
default stack access SS:SP |
|
Creates 64K windows with exact memory
locations determined by value in segment registers |
8088 Addressing Modes
|
|
|
|
|
Addressing modes specify where an instruction finds its source and destination operands
|
Modes |
|
Memory: operand is in a memory location |
|
Immediate: operand is part of
instruction (constant) |
|
Register: operand is a processor
register |
|
Memory modes: operand is in a memory
location |
|
(1) Direct |
|
address of operand is part of
instruction |
|
(2) Indirect: use register contents +
direct information |
|
(a) register - offset is in one of BX, BP, SI, DI (a default segment register is assumed)
|
(b) based DS:BX + 16 bit displacement
(Sets up default) |
|
(c i) indexed DS:SI or DI +
displacement |
|
(c ii) indexed ES:SI or DI +
displacement |
|
(d i) based indexed DS:BX + SI or DI +
displacement |
|
(d ii) based indexed SS:BP + SI or DI +
displacement |
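The indirect modes all compute an offset as base + index + displacement. The sketch below models that calculation in C, ignoring segment override prefixes and the string instructions. The types and names (`struct regs`, `effective_offset`, `default_segment`) are illustrative assumptions, not an assembler or CPU API.

```c
#include <assert.h>
#include <stdint.h>

enum base_reg  { BASE_NONE, BASE_BX, BASE_BP };
enum index_reg { INDEX_NONE, INDEX_SI, INDEX_DI };
enum seg_reg   { SEG_DS, SEG_SS };

struct regs { uint16_t bx, bp, si, di; };

/* Offset part of the address: base + index + displacement,
   truncated to 16 bits because offsets wrap within a segment. */
uint16_t effective_offset(const struct regs *r, enum base_reg base,
                          enum index_reg index, uint16_t disp) {
    uint16_t sum = disp;
    if (base == BASE_BX)   sum = (uint16_t)(sum + r->bx);
    if (base == BASE_BP)   sum = (uint16_t)(sum + r->bp);
    if (index == INDEX_SI) sum = (uint16_t)(sum + r->si);
    if (index == INDEX_DI) sum = (uint16_t)(sum + r->di);
    return sum;
}

/* Default segment register: SS when BP is the base, DS otherwise. */
enum seg_reg default_segment(enum base_reg base) {
    return base == BASE_BP ? SEG_SS : SEG_DS;
}

void effective_offset_demo(void) {
    struct regs r = { 0x0100, 0x2000, 0x0010, 0x0004 }; /* BX, BP, SI, DI */
    /* [BX + DI + 4] -> 0100h + 0004h + 4 = 0108h, relative to DS */
    assert(effective_offset(&r, BASE_BX, INDEX_DI, 4) == 0x0108);
    assert(default_segment(BASE_BX) == SEG_DS);
}
```

The resulting offset is then combined with the chosen segment register to form the 20-bit address.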
Addressing Modes: Examples
|
|
|
|
|
Immediate MOV AX, 0 |
|
Register MOV AX, CX |
|
Direct MOV AX,[1234] |
|
Loads contents of memory location
DS:1234 into AX (16 bits because of the X) |
|
Assumes DS as the default segment
register |
|
Direct MOV [1234],AX |
|
Store AX into memory location @DS:1234 |
|
Indirect |
|
(a) MOV AX, [BX] |
|
Moves contents of word @DS:[BX] into AX |
|
(b) MOV [BX+1], AL |
|
Move a byte into @DS:[BX + 1] |
|
(c) MOV CH,[DI +10] |
|
(d) MOV AX,[BP+SI] |
|
(d) MOV BX,[BX + DI + 4] |
Basic Addressing Modes
|
|
|
|
Immediate |
|
+: no memory reference |
|
-: limited operand magnitude |
|
Direct |
|
+: simple |
|
-: limited address space |
|
Indirect |
|
+: large address space |
|
-: multiple memory references |
|
Register |
|
+: no memory reference |
|
-: limited address space |
|
Displacement |
|
+: flexibility |
|
-: complexity |
Pentium Addressing Modes
Power PC Addressing Modes
Memory Reference
Ambiguities
|
|
|
|
Consider MOV [BX],5 |
|
Should we move a byte or a word? |
|
Need to instruct the assembler |
|
WORD PTR |
|
BYTE PTR |
|
So we get: |
|
MOV WORD PTR [BX], 5 |
|
OR |
|
MOV BYTE PTR [BX], 5 |
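The difference between the two forms shows up in how many bytes of memory change. The following C sketch models memory as a byte array; `store_byte` and `store_word` are hypothetical helpers standing in for the BYTE PTR and WORD PTR forms of MOV.

```c
#include <assert.h>
#include <stdint.h>

/* MOV BYTE PTR [addr], value: one byte is written. */
void store_byte(uint8_t *memory, uint16_t addr, uint8_t value) {
    memory[addr] = value;
}

/* MOV WORD PTR [addr], value: two bytes are written, low byte
   first (the 8088 stores words in little-endian order). */
void store_word(uint8_t *memory, uint16_t addr, uint16_t value) {
    memory[addr]     = (uint8_t)(value & 0xFF);
    memory[addr + 1] = (uint8_t)(value >> 8);
}

void store_size_demo(void) {
    uint8_t a[2] = { 0xAA, 0xAA };
    uint8_t b[2] = { 0xAA, 0xAA };
    store_byte(a, 0, 5);   /* neighbouring byte is untouched   */
    store_word(b, 0, 5);   /* neighbouring byte is overwritten */
    assert(a[0] == 5 && a[1] == 0xAA);
    assert(b[0] == 5 && b[1] == 0x00);
}
```

With a register operand the assembler can infer the size from the register name, so the PTR override is only needed when neither operand fixes the width.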
|
Consider the following example |
8088 Flags
|
|
|
|
Flag values change as part of execution
of the instructions |
|
General rules for the 6 condition/status flags

PF - parity flag - set if and only if (iff) the least significant byte of the result contains an even number of 1s
|
ZF - zero flag - set iff result was
zero |
|
SF - sign flag - set equal to msbit of
result |
|
AF - auxiliary flag |
|
Used by the decimal arithmetic instructions

Set iff there was a carry out of the low nibble into the high nibble (or a borrow from the high nibble into the low nibble)
|
CF - carry flag |
|
For unsigned arithmetic => overflow |
|
Set iff carry out of (or borrow into)
the msbit of the result |
|
Can also be used in rotate instructions |
Flags (cont.)
|
|
|
|
OF - overflow flag |
|
For signed arithmetic - set iff the result cannot be represented in the fixed width

Equivalently: set iff the carry into the msbit differs from the carry out of the msbit
|
Set iff a carry/borrow into/from the
msbit but not into/from the fictitious bit above the msbit OR a carry/borrow
into/from the fictitious bit above the msbit but not into/from the msbit |
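The flag rules can be checked with a small C model of an 8-bit addition. This is a sketch under stated assumptions: `struct flags` and `add8_flags` are invented names, and only ADD is modelled (subtraction and the other instructions set the flags by analogous rules).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct flags { bool cf, pf, af, zf, sf, of; };

/* Add two 8-bit values and derive the six status flags. */
struct flags add8_flags(uint8_t a, uint8_t b, uint8_t *result) {
    unsigned wide = (unsigned)a + (unsigned)b;
    uint8_t r = (uint8_t)wide;
    struct flags f;

    f.cf = wide > 0xFF;                        /* carry out of the msbit     */
    f.zf = (r == 0);                           /* result was zero            */
    f.sf = (r & 0x80) != 0;                    /* copy of the msbit          */
    f.af = ((a & 0x0F) + (b & 0x0F)) > 0x0F;   /* carry out of low nibble    */

    /* OF: carry into the msbit differs from carry out of the msbit. */
    bool carry_in = (((a & 0x7F) + (b & 0x7F)) & 0x80) != 0;
    f.of = (carry_in != f.cf);

    /* PF: even number of 1 bits in the low byte of the result. */
    int ones = 0;
    for (int bit = 0; bit < 8; bit++) {
        ones += (r >> bit) & 1;
    }
    f.pf = (ones % 2) == 0;

    *result = r;
    return f;
}

void add8_flags_demo(void) {
    uint8_t r;
    /* 7Fh + 01h = 80h: signed overflow (127 + 1), no unsigned carry */
    struct flags f = add8_flags(0x7F, 0x01, &r);
    assert(r == 0x80 && f.of && !f.cf && f.sf && f.af && !f.zf && !f.pf);
}
```

Note how CF and OF answer different questions: CF reports unsigned overflow, OF reports signed overflow, and the same addition can set either, both, or neither.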
|
3 Control flags |
|
TF - single step trap flag - if set -
single step interrupt occurs after execution of next instruction, TF is
cleared by the single step interrupt |
|
IF - interrupt flag - if set - CPU will
recognize maskable interrupts |
|
DF - direction flag - controls auto increment/decrement of the index registers in string instructions - if set, auto-decrement; if clear, auto-increment
Turbo Assembler: Assembler Language Code Template
|
|
|
        .MODEL Small        ; identifies the memory model
        .STACK 200h         ; 200h bytes are reserved
        .DATA
                            ; define the data variables here
        .CODE
startlabel:
                            ; instructions
        END startlabel
Data Declaration
|
|
|
|
Reserve and initialize memory locations |
|
Can define symbolic labels (names) for
memory locations |
|
Value of the label is a memory address |
|
General form LABEL TYPE OPERAND |
|
LABEL: legal characters are alphanumeric + ['_' | '@' | '$' | '?'], cannot start with a digit
|
TYPE: DW for a word, DB for a byte |
|
OPERAND: one or more definitions
separated by commas |
|
EX: |
|
X DW 0 |
|
Y DW -1 |
|
Z DB 5 |
|
Hi DB ‘Hello’ |
Data Declarations (cont.)
|
|
|
|
|
More examples |
|
MyArray DW 10,0,-5,20,13,6 |
|
A_Block DB 100h DUP(0) |
|
Initializes the 100h values to 0 |
|
MoreSpace DW 20 DUP(?) |
|
Does not initialize the 20 values |
|
A4by3_Array DW 4 DUP(0,1,-1) |
|
|
Pentium Data Types
PowerPC Data Types
|
|
|
Unsigned byte: can be used for logical
or integer arithmetic operations, loaded from memory into a general register
by zero-extending on the left to full register size |
|
Unsigned halfword: as above, for 16-bit
quantities |
|
Signed halfword: used for arithmetic
operations, loaded by sign-extension to register size |
|
Unsigned word: used for logical
operations and as address pointer |
|
Signed word: used for arithmetic
operations |
|
Unsigned doubleword: used as an address
pointer |
|
Byte string: from 0 to 128 bytes in
length |
|
IEEE 754 single- and double-precision
floating-point data types |
Assembler Example
|
|
|
        .DATA
X        DW 1
Y        DW -1
My_Array DW 4 DUP(10)

        .CODE
START:  MOV AX,@DATA            ; load the data segment address
        MOV DS,AX               ; DS cannot be loaded with an immediate
        MOV AX,X
        ADD AX,Y                ; AX = 1 + (-1) = 0
        ......
        MOV BX,OFFSET My_Array
        MOV [BX],AX             ; My_Array[0]
        MOV [BX+2],AX           ; My_Array[1]
        MOV [My_Array+4],AX     ; My_Array[2]
        MOV SI,6
        MOV [BX+SI],AX          ; My_Array[3]
        END START
Assembler Example
|
|
|
Initialize an array example |
|
             .DATA
ArraySize    EQU 4
InitialValue EQU 0
ArrayX       DW ArraySize DUP(?)

             .CODE
Start:    MOV AX,InitialValue
          MOV BX,OFFSET ArrayX   ; BX points to the current element
          MOV CX,0               ; CX counts the initialized elements
InitLoop: CMP CX,ArraySize
          JE  DONE               ; all elements are initialized
          MOV [BX],AX
          ADD BX,2               ; advance to the next word
          ADD CX,1
          JMP InitLoop
DONE:     ....
Assembler Example: CX as a loop counter
|
|
|
|
We can take advantage of CX as a loop
counter to speed up the program |
|
          MOV AX,InitialValue
          MOV BX,OFFSET ArrayX
          MOV CX,ArraySize       ; CX = number of iterations
          JCXZ DONE              ; skip the loop if CX is 0
InitLoop: MOV [BX],AX
          ADD BX,2
          LOOP InitLoop          ; CX = CX - 1, jump if CX is not 0
          ...
DONE:

The LOOP instruction always uses CX as its counter; no other register can be used
|
Compilers (good compilers) generate
code to take advantage of special instructions that help reduce execution
time |
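The reason for the JCXZ guard becomes clear when the behaviour of LOOP is modelled: it decrements CX first and only then tests it, so starting with CX = 0 wraps around to FFFFh. The C sketch below counts iterations; `loop_iterations` is a hypothetical helper, not an instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count how many times the body of
       body: ... ; LOOP body
   runs for a given starting CX, with or without a JCXZ guard. */
uint32_t loop_iterations(uint16_t cx, bool guard_with_jcxz) {
    uint32_t count = 0;
    if (guard_with_jcxz && cx == 0) {
        return 0;                   /* JCXZ DONE skips the loop     */
    }
    do {
        count++;                    /* the loop body executes       */
        cx = (uint16_t)(cx - 1);    /* LOOP: CX = CX - 1            */
    } while (cx != 0);              /* LOOP: jump back if CX != 0   */
    return count;
}

void loop_iterations_demo(void) {
    assert(loop_iterations(4, true) == 4);   /* ArraySize = 4 */
    assert(loop_iterations(0, true) == 0);
}
```

Without the guard, an array size of zero would run the body 65536 times and overwrite memory well past the array.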
Control Flow
|
|
|
|
|
|
What if we want to make use of the
concept of subroutines? |
|
Jumps and Jump Tables |
|
JMP (unconditionally jump to a location)

JMP Register/Memory_Label

Like a “GO TO” statement

        JMP End_Switch

OR

        MOV BX,OFFSET End_Switch
        JMP BX
|
Can be used for indirect calls (will
see soon) |
|
Subroutines |
|
CALL to call a subroutine

RET to return from a subroutine
|
Software Interrupts |
|
Special type of indirect call used to
simulate hardware interrupts |
|
For Debugging, to execute routines
within the OS and device drivers |
|
INT, IRET (Interrupt, Interrupt return) |
Jump Tables
|
|
|
|
|
Can have a multi-way jump by building a
jump table |
|
We can choose between routines to
execute |
|
PROG0, PROG1,...., PROGN |
|
Set up a table with the addresses of
the routines to execute |
|
TABLE DW PROG0 |
|
DW PROG1 |
|
DW PROG2 |
|
..... |
|
Load the address of the routine into a
register based on contents of another register |
|
CMP BX, 4 |
|
JNC ERROR |
|
SHL BX, 1 |
|
MOV BX, [TABLE+BX] |
|
JMP BX |
|
....... |
|
Can add more routines by increasing the
length of the table |
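The same idea exists in C as an array of function pointers. The sketch below is an analogy rather than a translation: the routines `prog0` to `prog3` and their return values are invented, and C calls the selected routine where the assembly version jumps to it. Array indexing in C does the scaling by the entry size that SHL BX,1 does by hand.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*routine)(void);

/* Hypothetical routines standing in for PROG0..PROG3. */
int prog0(void) { return 100; }
int prog1(void) { return 101; }
int prog2(void) { return 102; }
int prog3(void) { return 103; }

/* TABLE DW PROG0, PROG1, ... */
static const routine table[] = { prog0, prog1, prog2, prog3 };

/* Returns the selected routine's result, or -1 when the selector
   is out of range (the ERROR branch taken by JNC). */
int dispatch(unsigned selector) {
    size_t entries = sizeof table / sizeof table[0];
    if (selector >= entries) {
        return -1;                  /* CMP BX,4 / JNC ERROR       */
    }
    return table[selector]();       /* MOV BX,[TABLE+BX] / JMP BX */
}

void dispatch_demo(void) {
    assert(dispatch(2) == 102);
    assert(dispatch(4) == -1);
}
```

Adding a routine means adding one table entry; the bounds check follows automatically because it is computed from the table size.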
|
|
Returning from a Jump
|
|
|
|
What do we do? How do we branch back to
another part of our code? |
|
Can hard code a finite state machine
into our assembly language program |
|
But what if we want to branch back to
where we came from, and we can come from many places? |
|
Could store a return address in a
chosen register, branch to its address at the end of the call |
|
This would be a calling convention |
|
More general support is available from
subroutine CALL and RET operations |
Stacks for Subroutine Support
|
|
|
|
|
Stack |
|
Used to support assembly level
subroutine calls |
|
Helps store values temporarily and
recall them later |
|
Contiguous region of memory pointed to
by the stack pointer SP register |
|
If a stack is used, memory must be reserved for it

        sseg segment stack 'data'
             dw 80h dup(?)
        sseg ends
|
Two operations: PUSH and POP |
|
PUSH |
|
Inserts a new data value at the top of
the stack |
|
(1) Decrements the SP by 2 |
|
(2) Save the data value in memory
location addressed by SP |
|
POP |
|
Returns the value of the data at the top of the stack and removes it from the stack
|
(1) return value from memory addressed
by SP |
|
(2) increment the stack pointer SP by 2 |
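The two operations can be modelled in C with a byte array and a stack pointer. This is a sketch under assumptions: the names `struct stack`, `push16`, and `pop16` are invented, and the 100h-byte size simply matches a reservation of 80h words.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_BYTES 0x100u   /* 80h words = 100h bytes (assumed size) */

struct stack {
    uint8_t  memory[STACK_BYTES];
    uint16_t sp;             /* offset of the current top of stack */
};

/* An empty stack has SP just past the end of the reserved area. */
void stack_init(struct stack *s) { s->sp = STACK_BYTES; }

void push16(struct stack *s, uint16_t value) {
    assert(s->sp >= 2);                              /* room left             */
    s->sp = (uint16_t)(s->sp - 2);                   /* (1) decrement SP by 2 */
    s->memory[s->sp]     = (uint8_t)(value & 0xFF);  /* (2) store at SP,      */
    s->memory[s->sp + 1] = (uint8_t)(value >> 8);    /*     low byte first    */
}

uint16_t pop16(struct stack *s) {
    assert(s->sp + 2u <= STACK_BYTES);               /* something to pop      */
    uint16_t value = (uint16_t)(s->memory[s->sp] |   /* (1) read from SP      */
                                (s->memory[s->sp + 1] << 8));
    s->sp = (uint16_t)(s->sp + 2);                   /* (2) increment SP by 2 */
    return value;
}
```

The stack grows toward lower addresses, so values come back in the reverse of the order in which they were pushed.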
Stack Operations Subroutine Call/Return
|
|
|
|
CALL does a push of the instruction
pointer (IP) onto the stack automatically (FAR call also pushes segment
information) |
|
RET pops the IP off the stack
automatically (If FAR call, the FAR return is done automatically) |
|
Consider the following nested
subroutines |
|
PRINTB prints 4 hexadecimal digits |
|
PRINTB: MOV AL,BH       ;get high order byte
        CALL PBYTE      ;print both nibbles
        MOV AL,BL       ;get low order byte
        CALL PBYTE      ;print both nibbles
        RET

PBYTE:  PUSH AX         ;saves a copy of value AX
        MOV CL,4        ;shift AL right 4 bits
        SHR AL,CL
        CALL PRINT      ;print hex digit low nibble of AL
        POP AX          ;restore value of AX
        CALL PRINT      ;print second hex digit
        RET
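For comparison, the same nesting can be written in C. This sketch assumes that PRINT outputs the low nibble of AL as one hexadecimal digit; the function names `hex_digit`, `format_byte`, and `format_word` are illustrative, and the digits are written to a buffer rather than to the screen.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* PRINT: turn the low nibble of a byte into one hex digit. */
char hex_digit(uint8_t al) {
    return "0123456789ABCDEF"[al & 0x0F];
}

/* PBYTE: high nibble first (after shifting right by 4),
   then the low nibble of the original value. */
void format_byte(uint8_t al, char *out) {
    out[0] = hex_digit((uint8_t)(al >> 4));
    out[1] = hex_digit(al);
}

/* PRINTB: high-order byte (BH) first, then low-order byte (BL). */
void format_word(uint16_t bx, char out[5]) {
    format_byte((uint8_t)(bx >> 8), out);
    format_byte((uint8_t)(bx & 0xFF), out + 2);
    out[4] = '\0';
}

void format_word_demo(void) {
    char text[5];
    format_word(0x1A2F, text);
    assert(strcmp(text, "1A2F") == 0);
}
```

In the assembly version the PUSH AX / POP AX pair plays the role of the C parameter copy: it preserves the original byte while AL is shifted.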
Passing Parameters
|
|
|
|
|
How do we pass parameters? |
|
Parameters can be pushed onto the stack before the call; the subprogram then reads them from the stack
|
They can be discarded from the stack
using the following variant of the RET operation |
|
RET nn |
|
Where nn is the number added to the
SP just after the return address is
popped off |
|
By Value |
|
Push the value |
|
By Reference |
|
Push the address, the called routine
must also assume the parameter will be accessed indirectly |
Stack Frame Pointers
|
|
|
|
Values and references on the stack
can’t just be popped off because the SP must remain pointed at the return
address |
|
Solution |
|
Load the SP into the base pointer BP
and access the parameters through the BP |
|
Note: BP is used by many subprograms,
so it must be saved first! Where? Programmer saves it on the stack. |
|
DS, SS, BP are also saved/restored to
original status (by the programmer) if changed by a subprogram (more calling
conventions) |
|
BP is said to point to a stack frame |
Example of PASCAL Calling Conventions
|
|
|
PASCAL calling convention example |
|
Procedure outpt(var port_num, valu :
integer) |
|
        PUBLIC OUTPT            ;enable access to external program
A_TEXT  SEGMENT 'CODE'
        ASSUME CS:A_TEXT
OUTPT   PROC FAR                ;to enable far return
        PUSH BP                 ;save old value
        MOV BP,SP               ;point to top of stack
        MOV BX,[BP+8]           ;addr. of 1st param
        MOV DX,[BX]             ;value of first parameter
        MOV BX,[BP+6]           ;addr of 2nd param
        MOV AX,[BX]             ;value of 2nd parameter
        OUT DX,AL               ;do output operation
        POP BP                  ;programmer has to do this
        RET 4                   ;advance SP past parameters
OUTPT   ENDP
A_TEXT  ENDS
        END
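The offsets +6 and +8 follow from what is on the stack when MOV BP,SP executes. The C sketch below rebuilds that frame in a small model; `struct machine`, `push`, `word_at`, and `enter_outpt`, as well as the sample values, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

struct machine {
    uint16_t stack[32];   /* word-addressed model of the stack segment */
    uint16_t sp;          /* byte offset, as on the 8088               */
    uint16_t bp;
};

void push(struct machine *m, uint16_t v) {
    m->sp = (uint16_t)(m->sp - 2);
    m->stack[m->sp / 2] = v;
}

uint16_t word_at(const struct machine *m, uint16_t offset) {
    return m->stack[offset / 2];
}

/* Build the frame that OUTPT sees after PUSH BP / MOV BP,SP. */
void enter_outpt(struct machine *m, uint16_t addr_port, uint16_t addr_valu,
                 uint16_t ret_cs, uint16_t ret_ip, uint16_t old_bp) {
    m->sp = (uint16_t)sizeof m->stack;   /* empty stack                    */
    m->bp = old_bp;
    push(m, addr_port);   /* caller: first parameter (Pascal order) */
    push(m, addr_valu);   /* caller: second parameter               */
    push(m, ret_cs);      /* FAR CALL: return segment               */
    push(m, ret_ip);      /* FAR CALL: return offset                */
    push(m, m->bp);       /* PUSH BP                                */
    m->bp = m->sp;        /* MOV BP,SP                              */
}

void stack_frame_demo(void) {
    struct machine m;
    enter_outpt(&m, 0x0010, 0x0020, 0x1234, 0x0100, 0xBEEF);
    assert(word_at(&m, (uint16_t)(m.bp + 8)) == 0x0010);  /* 1st param */
    assert(word_at(&m, (uint16_t)(m.bp + 6)) == 0x0020);  /* 2nd param */
}
```

Below the parameters sit the saved BP at [BP], the return offset at [BP+2], and the return segment at [BP+4]; with a NEAR call the segment word would be absent and each parameter offset would be 2 smaller.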
Difference with C Calling Conventions
|
|
|
|
|
C calling conventions |
|
Parameters are passed by value only |
|
Values can be addresses of variables
(ie. &my_variable) so this isn’t a big restriction |
|
Parameters are pushed in reverse order (rightmost first)
|
Subroutine does not pop the parameters
off the stack before returning |
|
Pop is done by the caller |
|
The subroutine doesn’t have to know how many parameters were passed to it (this can be deduced at runtime); it simply executes RET
|
Other differences |
|
Pascal assumes you will not change the
BP, DS, and SS registers and the direction flag must be cleared |
|
C assumes SI, DI, BP and direction flag will not be changed
|
Other languages make other assumptions
about use of registers etc., order of parameters etc. |
|
Interlanguage calls need pragmas to
explain which calling conventions should be used |
C Example
|
|
|
#include <stdio.h>

void do_it(int a, int *b, int c, int d, int e, int f, int g) {
    (*b) = a;   /* store a through the pointer b */
}

int a;

int main(void) {
    int b, c = 0, d = 0, e = 0, f = 0, g = 0;

    a = 1;
    b = 2;
    do_it(a, &b, c, d, e, f, g);
    printf("a %d b %d \n", a, b);
    return 0;
}
|
Output: a 1 b 1 |
Corresponding Assembler Code for 80386 Processor
|
|
|
do_it   proc
        push ebp
        mov  ebp,esp
        mov  [ebp+08h],eax      ;a
        mov  [ebp+0ch],edx      ;b
        mov  [ebp+010h],ecx     ;c
; 4     (*b) = a;
        mov  eax,[ebp+0ch]      ;b
        mov  ecx,[ebp+08h]      ;a
        mov  [eax],ecx
        leave
        ret
do_it   endp
|
|
Software Interrupts
|
|
|
|
INT Similar to the Call instruction |
|
INT nn performs the following sequence of steps

Pushes the FLAGS, CS, and IP registers onto the stack
|
IP and CS registers are loaded with the
contents of the two words at the absolute memory addresses 0:nn*4 and
0:nn*4+2 |
|
Program control jumps to the address
contained in these two absolute memory locations |
|
Address locations 0:0 through 0:3FFH
are an interrupt vector (table) and contain starting addresses of subroutines
you may want to execute |
|
To finish an interrupt, use IRET |
|
It restores CS, IP and flags in the
proper order |
|
Note: RET restores only IP (or IP and CS for a FAR call)
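The vector lookup itself is simple arithmetic on nn. A C sketch of it follows, with low memory modelled as a byte array; `struct far_pointer`, `vector_address`, `read_word`, and `lookup_vector` are invented names, and the handler address used in the demo is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

struct far_pointer { uint16_t cs, ip; };

/* Byte address of vector nn inside the table at 0:0 .. 0:3FFh. */
uint16_t vector_address(uint8_t nn) {
    return (uint16_t)(nn * 4u);
}

/* Read a little-endian 16-bit word from memory. */
uint16_t read_word(const uint8_t *memory, uint16_t addr) {
    return (uint16_t)(memory[addr] | (memory[addr + 1] << 8));
}

/* Handler address for INT nn: IP from 0:nn*4, CS from 0:nn*4+2. */
struct far_pointer lookup_vector(const uint8_t *low_memory, uint8_t nn) {
    uint16_t base = vector_address(nn);
    struct far_pointer handler;
    handler.ip = read_word(low_memory, base);
    handler.cs = read_word(low_memory, (uint16_t)(base + 2));
    return handler;
}

void lookup_vector_demo(void) {
    static uint8_t low_memory[1024];        /* 256 vectors x 4 bytes */
    low_memory[0x84] = 0x34;                /* vector 21h: IP = 1234h */
    low_memory[0x85] = 0x12;
    low_memory[0x86] = 0x00;                /*             CS = F000h */
    low_memory[0x87] = 0xF0;
    struct far_pointer h = lookup_vector(low_memory, 0x21);
    assert(h.ip == 0x1234 && h.cs == 0xF000);
}
```

Since nn is one byte, the table holds 256 entries of 4 bytes each, which is exactly the range 0:0 through 0:3FFH.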
Why Software Interrupts?
|
|
|
|
Hardware interrupts are used to handle
communication to/from peripheral devices |
|
Hardware interrupt occurs when an
external device sends a signal to the 8088’s interrupt request line along
with an interrupt number n on the 8088 data lines. The 8088 then: |
|
1. Stops whatever is being done |
|
2. Jumps to a subroutine n to handle
the data going to/from the device |
|
3. Returns to whatever was being done |
|
Software interrupts |
|
Can be used to test the implementation
of the interrupt handling routines |
|
Interrupt table approach is useful
because |
|
Indirection given by the interrupt
vector means we only need to specify the address of the routine in the
vector, can reference the routine using n |
|
Can lead to problems if 2 devices
expect to use same n |
Other Instructions of Processors
|
|
|
|
NOP No operation |
|
Can be used to fill in code areas to
start instructions on word boundaries |
|
Can be used to control timing in a loop |
|
HLT Halt, to stop execution until: |
|
A pulse is sent to the reset line |
|
Or an interrupt occurs |
|
Useful for debugging |
|
LOCK |
|
Used to synchronize operations when there are multiple processors
|
Provides support for semaphores that
can be used to provide mutual exclusion over hardware resources |
Mutual Exclusion Example
|
|
|
|
|
Consider two processors that compete
for access to a shared memory location |
|
Semaphore is a memory location
containing a flag bit that is set if either CPU is using the managed resource |
|
Only one CPU is given access at a time |
|
BUSY:   CMP SEMPHR,80H     ;Is semaphore set by other CPU
        JE BUSY            ;Hang in loop if it is
        MOV SEMPHR,80H     ;Else set the semaphore
        ....               ;Do whatever
        MOV SEMPHR,0       ;clear the semaphore
|
Problem |
|
Both processors could pass JE and set
the semaphore! |
|
Need to guarantee mutual exclusion! |
|
Need a single indivisible instruction
that tests and sets the semaphore |
|
Must prevent doing test and set with
more than one memory access |
|
If 2 accesses are required, other
processors can sneak between the test and set (If they can, they eventually
will) |
Mutual Exclusion (cont.)
|
|
|
|
Next attempt: |
|
BUSY:   MOV AL,80H         ; Prepare to set sem. if not set
        XCHG AL,SEMPHR     ; Set semaphore & get prev. value
        OR AL,AL           ; Set flags
        JNZ BUSY           ; Hang in loop if prev value not 0
        ....               ; Do whatever
        MOV SEMPHR,0       ; clear the semaphore
|
Problem? |
|
XCHG requires two memory accesses, so
problem can still arise |
|
Solution |
|
LOCK instruction |
|
When the LOCK prefix is placed in front of an instruction, the CPU asserts a signal on its LOCK pin
|
LOCK signal remains active for complete
duration of a single instruction |
|
Proper wiring of hardware ensures that
the other processor will not be able to access the memory location while LOCK
is set |
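In C11 the same test-and-set pattern is available through `atomic_exchange` from `<stdatomic.h>`, which swaps a value into memory and returns the previous contents as one indivisible step. The sketch below is an analogy to a locked XCHG, not the 8088 code itself; the names `try_acquire`, `acquire`, and `release` are assumptions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SEM_BUSY 0x80
#define SEM_FREE 0x00

/* One attempt at an indivisible exchange on the semaphore:
   returns true if the previous value was free, so we own it now. */
bool try_acquire(atomic_uchar *semaphore) {
    unsigned char previous = atomic_exchange(semaphore, SEM_BUSY);
    return previous == SEM_FREE;
}

/* BUSY loop: keep exchanging until the previous value was free. */
void acquire(atomic_uchar *semaphore) {
    while (!try_acquire(semaphore)) {
        /* spin */
    }
}

/* MOV SEMPHR,0: clear the semaphore. */
void release(atomic_uchar *semaphore) {
    atomic_store(semaphore, SEM_FREE);
}

void semaphore_demo(void) {
    atomic_uchar semaphore;
    atomic_init(&semaphore, SEM_FREE);
    assert(try_acquire(&semaphore));     /* first claim succeeds       */
    assert(!try_acquire(&semaphore));    /* second claim sees it busy  */
    release(&semaphore);
    assert(try_acquire(&semaphore));     /* free again after release   */
}
```

A failed attempt writes the busy value over a semaphore that is already busy, so it changes nothing; this is why the exchange can be issued unconditionally.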
Advantages/Disadvantages of Assembly Language
|
|
|
|
|
Advantages |
|
Access to ports/registers etc. that are
not accessible from high level languages |
|
Can generate “Fast” code sequences |
|
For example only save registers you
modify |
|
Disadvantages |
|
May have to deal with Segments etc.
that have nothing to do with the problem you are solving |
|
Different languages for each processor |
|
Poor readability |
|
Difficult to debug (prone to errors) |
|
Solution: higher level languages! |