Basic Pentium Instructions
To help you write simple programs in assembly language, this module briefly introduces some very basic Pentium instructions in their simplest forms.
These instructions are grouped into:
- Data movement instructions
- Basic arithmetic instructions
- Flow control instructions
The Pentium has hundreds of machine instructions, each with its own characteristics and behaviour. To write a program in assembly language, the programmer must understand these instructions well, so learning them, let alone memorizing them, can seem difficult.
Fortunately, there is a way to reduce the difficulty and shorten the learning time: learn to read the standard description of each instruction, which tells you its characteristics.
All Pentium instructions are described by using the following format:
MNEMONIC Operand1, Operand2, Operand3
Operand1, Operand2, and Operand3 are optional.
An operand may take one of the following forms:
Operand | Meaning
reg | Any general-purpose register: r8, r16, or r32.
r32 | One of EAX, EBX, ECX, EDX, ESI, EDI, EBP, or ESP.
r16 | One of AX, BX, CX, DX, SI, DI, BP, or SP.
r8 | One of AL, AH, BL, BH, CL, CH, DL, or DH.
Sreg | One of CS, DS, ES, SS, FS, or GS.
mem8 | Memory address of 8-bit data.
mem16 | Memory address of 16-bit data.
mem32 | Memory address of 32-bit data.
mem | Memory address of 8-bit, 16-bit, or 32-bit data.
r/m8 | Either r8 or mem8.
r/m16 | Either r16 or mem16.
r/m32 | Either r32 or mem32.
r/m | Either reg or mem.
imm | Immediate (constant) value.
Data Movement Instructions
The following are the basic data movement instructions:
Instruction | Operands | Notes
mov | r/m, reg; reg, r/m; r/m16, Sreg; Sreg, r/m16; r/m, imm | Data copy (operands are given as destination, source). The source operand is copied into the destination, overwriting its previous content; the source remains the same. CS cannot be the destination. The source and the destination MUST be of the SAME size.
movsx | r16, r/m8; r32, r/m8; r32, r/m16 | Data copy with sign extension.
movzx | r16, r/m8; r32, r/m8; r32, r/m16 | Data copy with zero extension.
lea | r16, mem; r32, mem | Store the effective address of mem in the register.
xchg | r/m, reg; reg, r/m | Swap the contents of the two operands.
bswap | r32 | Reverse the byte order of a 32-bit register, converting little-endian data to big-endian form and vice versa.
xlatb | (none) | AL = DS:[BX + unsigned AL].
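Example: To swap the contents of two variables
.DATA
var1 dw 120
var2 dw 1000
.CODE
. . .
xchg AX, var1 ; AX = old var1, var1 = old AX
xchg AX, var2 ; AX = old var2, var2 = old var1
xchg AX, var1 ; var1 = old var2, AX restored to its original value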
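As a minimal sketch of the data movement instructions in the table above, the following complete program exercises movzx, movsx, lea, and bswap. The 16-bit DOS program skeleton (.MODEL SMALL, exit through int 21h function 4Ch), the .586 directive, and the names small_val and word_tab are assumptions made for this illustration; they do not come from the notes themselves.

```asm
.MODEL SMALL
.586                          ; placed after .MODEL so segments stay 16-bit;
                              ; bswap needs a 486 or later
.STACK 100h
.DATA
small_val db 0F0h             ; hypothetical byte: -16 signed, 240 unsigned
word_tab  dw 10, 20, 30       ; hypothetical table of words
.CODE
main PROC
        mov   ax, @data
        mov   ds, ax          ; DS must point at the data segment

        movzx ebx, small_val  ; zero extension: EBX = 000000F0h
        movsx ecx, small_val  ; sign extension: ECX = FFFFFFF0h

        lea   si, word_tab    ; SI = offset of word_tab, memory is not read
        mov   dx, [si+2]      ; DX = second word of the table = 0014h

        mov   eax, 12345678h
        bswap eax             ; byte order reversed: EAX = 78563412h

        mov   ax, 4C00h       ; DOS function 4Ch: terminate program
        int   21h
main ENDP
END main
```

Stepping through the program in a debugger shows that movzx and movsx differ only in how the upper bits of the destination are filled, and that lea loads an address while mov loads the data stored at that address.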
Basic Integer Arithmetic Instructions
Instruction | Operands | Notes
add | reg, r/m; r/m, reg; reg, imm; r/m, imm | Destination ← destination + source. EFLAGS is set based on the result. The first operand is used as a source and is then overwritten as the destination. If the operands are signed integers, the OF flag indicates an invalid result. If the operands are unsigned, the CF flag indicates an invalid result.
sub | reg, r/m; r/m, reg; reg, imm; r/m, imm | Destination ← destination - source. EFLAGS is set based on the result.
inc | r/m | Destination ← destination + 1. EFLAGS is set based on the result, but the carry flag (CF) is not affected.
dec | r/m | Destination ← destination - 1. EFLAGS is set based on the result, but the carry flag (CF) is not affected.
neg | r/m | Subtracts its operand from 0, which results in a two's complement (integer) negation of the operand. EFLAGS is set based on the result.
cmp | reg, r/m; r/m, reg; r/m, imm | Subtracts the source from the destination (Op1 - Op2) and discards the result. Only the EFLAGS register is affected.
Condition | Signed Compare | Unsigned Compare
Op1 > Op2 | ZF = 0 and SF = OF | CF = 0 and ZF = 0
Op1 ≥ Op2 | SF = OF | CF = 0
Op1 = Op2 | ZF = 1 | ZF = 1
Op1 ≤ Op2 | ZF = 1 or SF ≠ OF | CF = 1 or ZF = 1
Op1 < Op2 | SF ≠ OF | CF = 1
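The flag conditions above can be checked with a small experiment. The sketch below compares FFh with 01h: the same cmp result reads as "above" when the operands are taken as unsigned and as "less" when they are taken as signed. The program skeleton, the message texts, the labels, and the use of DOS function 09h (print a '$'-terminated string at DS:DX) are assumptions chosen for this illustration.

```asm
.MODEL SMALL
.STACK 100h
.DATA
msg_above db 'unsigned: FFh is above 01h', 13, 10, '$'
msg_less  db 'signed: FFh (-1) is less than 01h', 13, 10, '$'
.CODE
main PROC
        mov  ax, @data
        mov  ds, ax

        mov  al, 0FFh
        cmp  al, 1            ; FFh - 01h = FEh: CF=0, ZF=0, SF=1, OF=0
        jna  skip_above       ; not taken: CF=0 and ZF=0 means "above"
        mov  dx, OFFSET msg_above
        mov  ah, 09h          ; DOS function 09h: display string at DS:DX
        int  21h
skip_above:
        mov  al, 0FFh         ; reload AL, since the DOS call may change it
        cmp  al, 1            ; same flags as before
        jnl  finish           ; not taken: SF differs from OF means "less"
        mov  dx, OFFSET msg_less
        mov  ah, 09h
        int  21h
finish:
        mov  ax, 4C00h        ; DOS function 4Ch: terminate program
        int  21h
main ENDP
END main
```

Both messages are printed, because cmp itself does not know whether its operands are signed; the interpretation is chosen by the conditional jump that follows it.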
Review Example
Assume that the initial state of the 80x86's registers and memory, just when your assembly language program starts running, is as follows:
Registers:
EAX = 18010H
EBX = 20H
ECX = 30H
EDX = 40H
ESI = 90100H
EDI = 10200H
EBP = 10H
ESP = 30H
Segment registers:
CS = 200H
DS = 200H
SS = 220H
ES = 300H
Physical address | Memory content | Physical address | Memory content
00101 | 02 | 02030 | F1
... | ... | 02031 | EC
02000 | 14 | ... | ...
02001 | 00 | 02070 | 4
02002 | 14 | 02071 | 2
02003 | 00 | 02072 | 9
02004 | 14 | ... | ...
02005 | 00 | 02100 | FF
02006 | 14 | 02101 | 9
02007 | 00 | ... | ...
02008 | 14 | 02120 | 30
02009 | 00 | 02121 | 0
0200A | 14 | ... | ...
0200B | 00 | 02130 | 40
0200C | 09 | 02131 | 0
0200D | 00 | ... | ...
0200E | 48 | 02140 | 30
0200F | 65 | 02141 | 0
02010 | 6C | ... | ...
02011 | 6C | 02150 | 2
02012 | 6F | 02151 | 1
02013 | 0D | ... | ...
02014 | 0A | 02200 | 2
02015 | 24 | 02201 | 2
02016 | 1 | ... | ...
02017 | 2 | 02210 | DE
02018 | 3 | 02211 | A1
02019 | 4 | ... | ...
0201A | 5 | 02220 | FF
0201B | FF | 02221 | FE
0201C | FF | ... | ...
0201D | A1 | 0222F | FC
0201E | 0 | 02230 | FD
0201F | 31 | 02231 | 34
... | ... | ... | ...
All numbers are in hexadecimal format.
Suppose that the following is a part of your assembly code. The assembler sets 0 as the offset address of table1.
.DATA
table1 dw 6 DUP(20), 9
msg1 db 'Hello', 13, 10, '$'
var1 LABEL WORD
var2 LABEL DWORD
var3 db 1, 2, 3, 4, 5
What is the result produced by executing each of the following instructions or operations independently?
movzx EBX, AX
movsx EBX, AX
bswap EAX
lea DX, [var1]
movsx AX, [msg1]
xchg SI, AX
xlatb
add EAX, [var2]
sub EBX, [var2]
cmp EBX, ECX
mov [var3], -100
mov [var2], -100
sub EAX, 3
SOLUTION
Instruction | Result
movzx EBX, AX | AX = 8010h, so EBX = 00008010h. EFLAGS is not changed.
movsx EBX, AX | AX = 8010h, so EBX = FFFF8010h. EFLAGS is not changed.
bswap EAX | Initially, EAX = 00018010h. Thus, EAX = 10800100h. EFLAGS is not changed.
lea DX, [var1] | DX is assigned the offset address of var1. DX = 22 = 0016h.
movsx AX, [msg1] | msg1 is a label for a location in memory at offset address DS:000Eh (table1 occupies the first 14 bytes). MASM associates its size with a byte. The content of memory at that location is the ASCII code of 'H' = 72 = 48h. Thus, AX = 0048h.
xchg SI, AX | Initially, SI = 0100h and AX = 8010h. Thus, SI = 8010h and AX = 0100h.
xlatb | AL = 10h and BX = 20h, so BX + AL = 30h. Then DS:[BX+AL] is DS:[30h], whose physical address is 200h*10h + 30h = 2030h. The content of memory at address 2030h is F1h. Thus, AL = F1h.
Introduction to Flow Control Instructions
Most high-level programming languages have special statements to control the flow of a program. These flow control statements fall into two major groups:
- Branching statements such as if-else, switch, and goto.
- Iteration statements such as for, while, and do-while.
Pentium assembly language supports these high-level language features in simple but elegant forms.
These forms do not correspond directly to the flow control statements of a high-level language. However, the flow-control instructions of Pentium assembly language can be combined to construct the flow control statements of a high-level language, and in a more flexible way.
The following shows the general relation between flow control statements of a high-level language (HLL) and the flow-control instructions of Pentium assembly language.
HLL:
if (op1 == op2) {
    statement1;
    statement2;
}
statement3;
Assembly Language:
    ; Assume op1 and op2 are words and AX is free
    mov AX, op1
    cmp AX, op2
    jnz @1
    statement1
    statement2
@1:
    statement3

HLL:
if (op1 != op2) {
    statement1;
    statement2;
} else {
    statement3;
}
statement4;
Assembly Language:
    ; Assume op1 and op2 are words and AX is free
    mov AX, op1
    cmp AX, op2
    jz @1
    statement1
    statement2
    jmp @2
@1:
    statement3
@2:
    statement4

HLL:
CX = 7;
while (CX > 0) {
    statement1;
    CX--;
}
Assembly Language:
    mov CX, 7
@1:
    jcxz @2
    statement1
    dec CX
    jmp @1
@2:
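The same building blocks (a compare, a conditional jump out of the loop, and an unconditional jump back) can express a counted for loop. The following sketch adds the numbers 1 to 10; the program skeleton, the variable total, and the labels next_i and loop_done are hypothetical names chosen for this illustration.

```asm
.MODEL SMALL
.STACK 100h
.DATA
total dw 0                    ; hypothetical accumulator
.CODE
main PROC
        mov  ax, @data
        mov  ds, ax

        ; for (CX = 1; CX <= 10; CX++) total += CX;
        mov  cx, 1            ; initialisation
next_i:
        cmp  cx, 10           ; loop test, done before each pass
        ja   loop_done        ; unsigned: leave the loop when CX > 10
        add  total, cx        ; loop body
        inc  cx               ; increment step
        jmp  next_i
loop_done:                    ; here total = 55 = 0037h

        mov  ax, 4C00h        ; DOS function 4Ch: terminate program
        int  21h
main ENDP
END main
```

Placing the test at the top, as in the while loop above, means the body is skipped entirely if the condition is false on entry.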
The CMP instruction and Conditional Jump instructions
In assembly language, when two numbers are compared, it is important to know that:
- A signed number can be Greater than, Less than, or Equal to another signed number.
- An unsigned number can be Above, Below, or Equal to another unsigned number.
As mentioned before, the CMP instruction compares its two operands by performing the subtraction Operand1 - Operand2 without modifying either operand; one or more flags are then altered based on the result of this subtraction.
That is why one or more conditional jump instructions usually follow immediately after a CMP instruction.
The following table is a brief guide to selecting the appropriate conditional jump instruction.
Condition | Equivalent condition | Signed jump | Unsigned jump
> | not ≤ | JG, JNLE | JA, JNBE
≥ | not < | JGE, JNL | JAE, JNB
< | not ≥ | JL, JNGE | JB, JNAE
≤ | not > | JLE, JNG | JBE, JNA
== | ZF == 1 | JE, JZ | JE, JZ
!= | ZF == 0 | JNE, JNZ | JNE, JNZ
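As one possible use of the table above, the sketch below counts the uppercase letters in a string by testing whether each character lies in the range 'A' to 'Z'. Character codes are unsigned, so the unsigned jumps JB and JA are selected. The program skeleton and the names phrase, phrase_len, uppers, next_char, and not_upper are assumptions made for this illustration.

```asm
.MODEL SMALL
.STACK 100h
.DATA
phrase     db 'Hello World'   ; hypothetical test string
phrase_len EQU $ - phrase     ; number of bytes in phrase
uppers     db 0               ; count of uppercase letters found
.CODE
main PROC
        mov  ax, @data
        mov  ds, ax

        mov  si, OFFSET phrase
        mov  cx, phrase_len   ; one pass per character
next_char:
        mov  al, [si]
        cmp  al, 'A'
        jb   not_upper        ; unsigned <  'A': not an uppercase letter
        cmp  al, 'Z'
        ja   not_upper        ; unsigned >  'Z': not an uppercase letter
        inc  uppers           ; 'A' <= AL <= 'Z'
not_upper:
        inc  si
        loop next_char        ; CX = CX - 1, repeat while CX is not 0
                              ; here uppers = 2 ('H' and 'W')
        mov  ax, 4C00h        ; DOS function 4Ch: terminate program
        int  21h
main ENDP
END main
```

Using JL and JG here would give the same result for 7-bit ASCII, but would misclassify bytes of 80h and above, which are negative when read as signed values.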
Example: To display the string EEEEE
mov CX, 5 ; loop counter
mov AH, 02h ; DOS function 02h: display the character in DL
mov DL, 'E'
@1: int 21h
dec CX
jnz @1 ; repeat until CX = 0
Example: To display the string ABCDEFG
mov AH, 02h ; DOS function 02h: display the character in DL
mov DL, 'A'
@1: int 21h
inc DL ; next letter
cmp DL, 'G'
jbe @1 ; repeat while DL <= 'G' (unsigned)
Write a loop to display the letters Z down to U, each on its own line:
SOLUTION:
mov AH, 02h
mov BL, 'Z'
@2: mov DL, BL
int 21h
mov DL, 0Dh ; generate CR and LF
int 21h
mov DL, 0Ah
int 21h
dec BL
cmp BL, 'U'
jae @2
One use of XLATB is to filter out unwanted characters from a stream of text. Suppose we want to input a string of 20 characters from the keyboard but echo only those with ASCII values from 32 to 127 (i.e., only printable ASCII characters). We can set up a translation table, place a zero in each table position corresponding to a non-printable character, and place a one in each position corresponding to a printable character:
Example: Character filtering
.DATA
VALIDCHARS DB 32 DUP(0) ; invalid characters: 0 - 31
DB 96 DUP(1) ; valid characters: 32 - 127
DB 128 DUP(0) ; invalid characters: 128 - 255
.CODE
mov AX, @Data
mov DS, AX
. . .
mov BX, OFFSET VALIDCHARS
mov CX, 20
@1: mov AH, 08h ; input character, no echo
int 21h
mov DL, AL ; save character in DL
xlatb ; AL = VALIDCHARS[AL]
cmp AL, 0
je @1 ; reject non-printable character
mov AH, 02h ; display the character saved in DL
int 21h
loop @1