Basic Pentium Instructions

This module briefly introduces some very basic Pentium instructions, in their simplest form, so that you can write a simple program in assembly language.

These instructions are grouped into:

Data Movement instructions
Basic Arithmetic instructions
Flow control instructions

The Pentium has hundreds of machine instructions, each with its own characteristics and behaviour. To write a program in assembly language, the programmer must understand these instructions very well, so learning them all, let alone remembering them, seems difficult.

Fortunately, there is a way to reduce the difficulty and shorten the learning time: learn to read the description of each instruction, because the description tells you the characteristics of that instruction.

All Pentium instructions are described by using the following format:

MNEMONIC Operand1 , Operand2 , Operand3

Operand1, Operand2, and Operand3 are optional.

An operand may take one of the following forms:

reg      Any general-purpose register. It could be r8, r16, or r32.
r32      Either EAX, EBX, ECX, EDX, ESI, EDI, EBP, or ESP.
r16      Either AX, BX, CX, DX, SI, DI, BP, or SP.
r8       Either AL, AH, BL, BH, CL, CH, DL, or DH.
Sreg     Either CS, DS, ES, SS, FS, or GS.
mem8     Memory address of 8-bit data.
mem16    Memory address of 16-bit data.
mem32    Memory address of 32-bit data.
mem      Memory address of either 8-bit, 16-bit, or 32-bit data.
r/m8     Either r8 or mem8.
r/m16    Either r16 or mem16.
r/m32    Either r32 or mem32.
r/m      Either reg or mem.
imm      Immediate (constant) value.

Data Movement Instructions

The following are the basic data movement instructions:

Instruction, operands, and notes:

mov Destination, Source
    Operands: r/m, reg
              reg, r/m
              r/m16, Sreg
              Sreg, r/m16
              r/m, imm
    Notes:    Data copy. The source operand is copied into the destination. The content of the destination is overwritten. After the operation, the source remains the same.
              CS cannot be the destination.
              The source and the destination MUST be of the SAME size.

movsx
    Operands: r16, r/m8
              r32, r/m8
              r32, r/m16
    Notes:    Data copy with sign extension.

movzx
    Operands: r16, r/m8
              r32, r/m8
              r32, r/m16
    Notes:    Data copy with zero extension.

lea
    Operands: r16, mem
              r32, mem
    Notes:    Stores the effective address of mem in the register.

xchg
    Operands: r/m, reg
              reg, r/m
    Notes:    Swaps the contents of the two operands.

bswap
    Operands: r32
    Notes:    Swaps bytes to convert little/big endian data in a 32-bit register to big/little endian form.

xlatb
    Operands: (none)
    Notes:    AL = DS:[BX + unsigned AL].

Example: To swap the contents of two variables

        .DATA
var1    dw 120
var2    dw 1000
        .CODE
        . . .
        xchg AX, var1
        xchg AX, var2
        xchg AX, var1
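
The extension and byte-swap instructions listed above can be modelled in a few lines of Python. This is only an illustrative sketch: the function names mirror the mnemonics but are otherwise hypothetical helpers, and the sample values are chosen for illustration.

```python
def movzx(value, src_bits):
    """Zero extension: keep the low src_bits; all upper bits become 0."""
    return value & ((1 << src_bits) - 1)


def movsx(value, src_bits, dst_bits):
    """Sign extension: copy the source's top bit into the upper destination bits."""
    value &= (1 << src_bits) - 1
    if value & (1 << (src_bits - 1)):      # sign bit set, so the value is negative
        value -= 1 << src_bits
    return value & ((1 << dst_bits) - 1)


def bswap(value):
    """Reverse the order of the four bytes of a 32-bit value."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


print(hex(movzx(0x8010, 16)))        # 0x8010
print(hex(movsx(0x8010, 16, 32)))    # 0xffff8010
print(hex(bswap(0x00018010)))        # 0x10800100
```

The only difference between the two extensions is what fills the upper bits: zeros for movzx, copies of the sign bit for movsx.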

Basic Integer Arithmetic Instructions

Instruction, operands, and notes:

add
    Operands: reg, r/m
              r/m, reg
              reg, imm
              r/m, imm
    Notes:    Destination ← destination + source.
              EFLAGS set based on result. The first operand is used as a source and is overwritten as the destination. If the operands are signed integers, the OF flag indicates an invalid result. If the operands are unsigned, the CF flag indicates an invalid result.

sub
    Operands: reg, r/m
              r/m, reg
              reg, imm
              r/m, imm
    Notes:    Destination ← destination - source.
              EFLAGS set based on result.

inc
    Operands: r/m
    Notes:    Destination ← destination + 1.
              EFLAGS set based on result, but the carry flag (CF) is not affected.

dec
    Operands: r/m
    Notes:    Destination ← destination - 1.
              EFLAGS set based on result, but the carry flag (CF) is not affected.

neg
    Operands: r/m
    Notes:    Subtracts its operand from 0, which results in a two's complement (integer) negation of the operand. EFLAGS set based on result.

cmp
    Operands: reg, r/m
              r/m, reg
              r/m, imm
    Notes:    Subtracts the source from the destination (destination - source) and discards the result. Only the EFLAGS register is affected. After cmp Op1, Op2 the flags indicate the following conditions:
Condition     Signed compare            Unsigned compare
Op1 > Op2     ZF = 0 and SF = OF        CF = 0 and ZF = 0
Op1 ≥ Op2     SF = OF                   CF = 0
Op1 = Op2     ZF = 1                    ZF = 1
Op1 ≤ Op2     ZF = 1 or SF ≠ OF         CF = 1 or ZF = 1
Op1 < Op2     SF ≠ OF                   CF = 1
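
The flag conditions for each comparison can be checked mechanically. The sketch below is a Python model, not Pentium code: cmp_flags, as_signed, and table_holds are hypothetical helper names, the operands are assumed to be 8 bits wide to keep the exhaustive check small, and signed less-or-equal is taken as ZF = 1 or SF ≠ OF.

```python
def cmp_flags(op1, op2, bits=8):
    """Return (ZF, SF, CF, OF) as set by 'cmp op1, op2' on bits-wide operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    op1 &= mask
    op2 &= mask
    result = (op1 - op2) & mask                            # op1 - op2, result discarded
    zf = int(result == 0)
    sf = int(bool(result & sign))
    cf = int(op1 < op2)                                    # borrow out of the top bit
    of = int(bool((op1 ^ op2) & (op1 ^ result) & sign))    # signed overflow
    return zf, sf, cf, of


def as_signed(value, bits=8):
    """Interpret a bits-wide pattern as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def table_holds(bits=8):
    """Check every flag condition against Python's own comparisons."""
    for a in range(1 << bits):
        for b in range(1 << bits):
            zf, sf, cf, of = cmp_flags(a, b, bits)
            sa, sb = as_signed(a, bits), as_signed(b, bits)
            checks = [
                (zf == 0 and sf == of, sa > sb),
                (sf == of,             sa >= sb),
                (zf == 1,              a == b),
                (zf == 1 or sf != of,  sa <= sb),
                (sf != of,             sa < sb),
                (cf == 0 and zf == 0,  a > b),
                (cf == 0,              a >= b),
                (cf == 1 or zf == 1,   a <= b),
                (cf == 1,              a < b),
            ]
            if any(flag_test != expected for flag_test, expected in checks):
                return False
    return True


print(cmp_flags(0x20, 0x30))   # (0, 1, 1, 0)
print(table_holds())           # True
```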

Review Example

Assume that the initial state of 80x86's registers and memory, just when your assembly language program starts running, is as follows:

Registers:
EAX = 18010H
EBX = 20H
ECX = 30H
EDX = 40H
ESI = 90100H
EDI = 10200H
EBP = 10H
ESP = 30H

Segment registers:
CS = 200H
DS = 200H
SS = 220H
ES = 300H

Memory (each row gives a physical address followed by the contents of the bytes stored from that address upward):
00101: 02
...
02000: 14 00 14 00 14 00 14 00
02008: 14 00 14 00 09 00 48 65
02010: 6C 6C 6F 0D 0A 24 01 02
02018: 03 04 05 FF FF A1 00 31
...
02030: F1 EC
...
02070: 04 02 09
...
02100: FF 09
...
02120: 30 00
...
02130: 40 00
...
02140: 30 00
...
02150: 02 01
...
02200: 02 02
...
02210: DE A1
...
02220: FF FE
...
0222F: FC FD 34
...

All numbers are in hexadecimal format.

Suppose that the following is a part of your assembly code. The assembler sets 0 as the offset address of table1.

.DATA
table1 dw 6 DUP(20), 9
msg1 db 'Hello', 13, 10, '$'
var1 LABEL WORD
var2 LABEL DWORD
var3 db 1, 2, 3, 4, 5

What is the result produced by executing each of the following instructions or operations independently?

movzx EBX, AX
movsx EBX, AX
bswap EAX
lea DX, [var1]
movsx AX, [msg1]
xchg SI, AX
xlatb

add EAX, [var2]
sub EBX, [var2]
cmp EBX, ECX
mov [var3], -100
mov [var2], -100
sub EAX, 3

SOLUTION

movzx EBX, AX: AX=8010h, so EBX=00008010h. EFLAGS is not changed.
movsx EBX, AX: AX=8010h, so EBX=FFFF8010h. EFLAGS is not changed.
bswap EAX: Initially, EAX=00018010h. Thus, EAX=10800100h. EFLAGS is not changed.
lea DX, [var1]: DX is assigned the offset address of var1. table1 occupies 14 bytes and msg1 occupies 8 bytes, so DX = 22 = 0016h.
movsx AX, [msg1]: msg1 is a label for a location in memory at offset address DS:000Eh. MASM associates its size with a byte. The content of the memory at that location = ASCII code of 'H' = 72 = 48h. The sign bit of 48h is 0, thus AX = 0048h.
xchg SI, AX: Initially, SI=0100h and AX=8010h. Thus, SI=8010h and AX=0100h.
xlatb: AL=10h, BX=20h. So, BX+AL=30h. Then, DS:[BX+AL] is DS:[30h], whose physical address is 200h*10h + 30h = 2030h. The content of memory at address 2030h is F1h. Thus, AL=F1h.
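
The address arithmetic in this solution can be reproduced with a small Python model of the memory dump. This is a sketch under stated assumptions: the helper names (physical, read) are hypothetical, only the bytes needed here are loaded, and the results printed for add EAX, [var2], sub EBX, [var2], and sub EAX, 3 are computed by the sketch itself rather than taken from the solution above.

```python
# Memory bytes from the review example, starting at physical address 02000h.
memory = {0x02000 + i: b for i, b in enumerate(bytes.fromhex(
    "14 00 14 00 14 00 14 00 "    # table1: six words of 20 ...
    "14 00 14 00 09 00 48 65 "    # ... then the word 9, then msg1 starts ('H', 'e')
    "6C 6C 6F 0D 0A 24 01 02 "    # 'l', 'l', 'o', CR, LF, '$', then var3 starts
    "03 04 05 FF FF A1 00 31 "
))}
memory[0x02030] = 0xF1

DS = 0x200
EAX, EBX = 0x18010, 0x20
MASK32 = 0xFFFFFFFF


def physical(segment, offset):
    """Real-mode physical address: segment * 10h + offset."""
    return segment * 0x10 + offset


def read(segment, offset, size):
    """Read size bytes from memory as a little-endian integer."""
    base = physical(segment, offset)
    return sum(memory[base + i] << (8 * i) for i in range(size))


MSG1 = 6 * 2 + 2        # table1 occupies 14 bytes, so msg1 is at offset 0Eh
VAR2 = MSG1 + 8         # msg1 occupies 8 bytes, so var1/var2/var3 are at 16h

msg1_byte = read(DS, MSG1, 1)                       # source of movsx AX, [msg1]
al = read(DS, (EBX & 0xFFFF) + (EAX & 0xFF), 1)     # xlatb: DS:[BX + AL]
var2 = read(DS, VAR2, 4)                            # dword at var2
add_result = (EAX + var2) & MASK32                  # add EAX, [var2]
sub_result = (EBX - var2) & MASK32                  # sub EBX, [var2]
sub_borrow = int(EBX < var2)                        # CF after that sub
sub3_result = (EAX - 3) & MASK32                    # sub EAX, 3

print(hex(VAR2), hex(msg1_byte), hex(al))   # 0x16 0x48 0xf1
print(hex(var2))                            # 0x4030201
print(hex(add_result))                      # 0x4048211
print(hex(sub_result), sub_borrow)          # 0xfbfcfe1f 1
print(hex(sub3_result))                     # 0x1800d
```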

Introduction to Flow Control Instructions

Most high-level programming languages have special statements to control the flow of a program. These flow control statements are classified into two major groups:

	Branching statements such as if-else, switch, and goto.
	Iteration statements such as for, while, and do-while.

Pentium assembly language supports these high-level programming language features in simple but elegant forms.

These forms do not correspond directly to the flow control statements of a high-level programming language. However, by combining the flow-control instructions of Pentium assembly language we can construct the flow-control statements of a high-level programming language, and in a more flexible way.

The following shows the general relation between the flow-control statements of a high-level programming language (HLL) and the flow-control instructions of Pentium assembly language.

HLL:
if (op1 == op2) {
     statement1;
     statement2; }
statement3;

Assembly Language:
;Assume op1 and op2 are words and AX is free
     mov AX, op1
     cmp AX, op2
     jnz @1
     statement1
     statement2
@1:  statement3

HLL:
if (op1 != op2) {
     statement1;
     statement2;
} else {
     statement3; }
statement4;

Assembly Language:
;Assume op1 and op2 are words and AX is free
     mov AX, op1
     cmp AX, op2
     jz @1
     statement1
     statement2
     jmp @2
@1:  statement3
@2:  statement4

HLL:
CX = 7;
while (CX > 0) {
     statement1;
     CX--;
}

Assembly Language:
     mov CX, 7
@1:  jcxz @2
     statement1
     dec CX
     jmp @1
@2:

The CMP instruction and Conditional Jump instructions

In assembly language, when two numbers are compared, it is imperative to know that:

A signed number can be Greater, Less, or Equal to another signed number.
An unsigned number can be Above, Below, or Equal to another unsigned number.

As mentioned before, the CMP instruction compares its two operands by performing the subtraction Operand1 - Operand2 without modifying either operand, and one or more flags are altered according to the result of this subtraction.

That is why one or more conditional jump instructions usually follow immediately after a CMP instruction.

The following table is a brief guide to selecting an appropriate conditional jump instruction.

Condition   Equivalent condition   Signed jump   Unsigned jump
>           not ≤                  JG, JNLE      JA, JNBE
≥           not <                  JGE, JNL      JAE, JNB
<           not ≥                  JL, JNGE      JB, JNAE
≤           not >                  JLE, JNG      JBE, JNA
==          ZF==1                  JE, JZ        JE, JZ
!=          ZF==0                  JNE, JNZ      JNE, JNZ
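
To see why the signed and unsigned columns must not be mixed up, the following Python sketch decides whether a jump would be taken by comparing the operands under the matching interpretation. It is a model of the table's meaning only: the names (jump_taken, SIGNED_JUMPS, and so on) are hypothetical, and 8-bit operands are assumed.

```python
import operator

RELATIONS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
             "<=": operator.le, "==": operator.eq, "!=": operator.ne}

# One mnemonic per table row; the synonyms (JNLE, JNBE, ...) behave identically.
SIGNED_JUMPS = {"JG": ">", "JGE": ">=", "JL": "<", "JLE": "<=",
                "JE": "==", "JNE": "!="}
UNSIGNED_JUMPS = {"JA": ">", "JAE": ">=", "JB": "<", "JBE": "<=",
                  "JE": "==", "JNE": "!="}


def as_unsigned(value, bits=8):
    """Interpret a bits-wide pattern as an unsigned number."""
    return value & ((1 << bits) - 1)


def as_signed(value, bits=8):
    """Interpret a bits-wide pattern as a two's complement number."""
    value = as_unsigned(value, bits)
    return value - (1 << bits) if value >> (bits - 1) else value


def jump_taken(mnemonic, op1, op2, bits=8):
    """Would 'cmp op1, op2' followed by the given jump branch?"""
    if mnemonic in SIGNED_JUMPS:
        view, relation = as_signed, SIGNED_JUMPS[mnemonic]
    else:
        view, relation = as_unsigned, UNSIGNED_JUMPS[mnemonic]
    return RELATIONS[relation](view(op1, bits), view(op2, bits))


print(jump_taken("JA", 0xFF, 0x01))   # True
print(jump_taken("JG", 0xFF, 0x01))   # False
```

The same bit pattern FFh is 255 when read as unsigned, so JA is taken, but -1 when read as signed, so JG is not.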

Example to display a string EEEEE

     mov CX, 5
     mov AH, 02h
     mov DL, 'E'
@1:  int 21H
     dec CX
     jnz @1

Example to display a string ABCDEFG

     mov AH, 02H
     mov DL, 'A'
@1:  int 21h
     inc DL
     cmp DL, 'G'
     jbe @1

Write a loop to display:

Z
Y
X
W
V
U

SOLUTION:

mov AH, 02H
mov BL, 'Z'
@2: mov DL, BL
int 21H
mov DL, 0DH ; generate CR and LF
int 21H
mov DL, 0AH
int 21H
dec BL
cmp BL, 'U'
jae @2

One use of XLATB is to filter out unwanted characters from a stream of text. Suppose we want to input a string of 20 characters from the keyboard but echo only those with ASCII values from 32 to 127 (i.e., only printable ASCII characters). We can set up a translation table, place a zero in each table position corresponding to a non-printable character, and place a one in each position corresponding to a printable character:

Example: Character filtering

            .DATA
VALIDCHARS  DB 32 DUP(0)     ; invalid characters: 0 - 31
            DB 96 DUP(1)     ; valid characters: 32 - 127
            DB 128 DUP(0)    ; invalid characters: 128 - 255
            .CODE
            mov AX, @Data
            mov DS, AX
            . . .
            mov BX, OFFSET VALIDCHARS
            mov CX, 20
@1:         mov AH, 08H      ; input character, no echo
            int 21H
            mov DL, AL       ; save character in DL
            xlatb
            cmp AL, 0
            je @1            ; reject non-printable character
            mov AH, 02H
            int 21H
            loop @1
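
The table-lookup idea behind this filter can be tried out in Python. The sketch below is a model, not a translation of the DOS program: xlatb and echo_filtered are hypothetical helpers, and the keyboard is replaced by a byte string supplied by the caller.

```python
# Same layout as VALIDCHARS: 0 marks a rejected code, 1 marks an accepted one.
VALIDCHARS = bytes([0] * 32 + [1] * 96 + [0] * 128)


def xlatb(table, al):
    """Model of xlatb: replace AL with the table entry at index AL."""
    return table[al & 0xFF]


def echo_filtered(keys, limit=20):
    """Echo accepted keys until 'limit' characters have been echoed."""
    echoed = bytearray()
    for key in keys:
        if len(echoed) == limit:       # the counter only counts echoed characters
            break
        if xlatb(VALIDCHARS, key) != 0:
            echoed.append(key)         # accepted: echo it
        # rejected keys are skipped without using up the count
    return bytes(echoed)


print(echo_filtered(b"Hi\x07 there\x1b!"))   # b'Hi there!'
```

As in the assembly version, a rejected character does not count towards the 20, because the jump back to the input step bypasses the loop counter.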