Memory addressing mode 

Overview

Memory (RAM) is the main component of a computer for temporarily storing data and machine instructions. Programs frequently need to read from and write to memory locations.

The question is: how do we specify exactly which memory location we want to access? The simplest answer is to give the label of the desired memory variable.

Example: Reading and writing memory locations
 .
 TempVar	DW	?
 NextVar	DW	?
 .
 .
 		mov	AX, TempVar
		sub	NextVar, AX
 .

The code fragment above reads the operand at memory location TempVar and copies it into register AX. It then reads the operand at memory location NextVar, subtracts the contents of AX from it, and stores the result back in NextVar.

However, real programming problems are often more complex. Consider the following one.

Suppose you have an array of bytes named ByteArray. It consists of 10 elements:

12, 34, 2, 5, 1 ,7, 13, 45, 98, 4

In memory, the location of ByteArray starts at offset 100 in the data segment.

In Java, we would read the eighth byte, 45, which is at offset 107, by writing:

byte d = ByteArray[7];

How can we do the same in assembly?

Fortunately, MASM provides many different ways to address a memory location. The simplest solution to the above problem is:

Solution: Reading the eighth byte of ByteArray
 .
 .DATA
 ByteArray	BYTE	12, 34, 2, 5, 1 ,7, 13, 45, 98, 4
 .
 .CODE
 mov	AX, @Data
 mov	DS, AX
 mov	AL, ByteArray + 7 ; or mov AL, ByteArray[7]
 

MASM evaluates the last statement above as:

mov AL, [100+7]

Since ByteArray starts at offset 100, the offset of ByteArray and 7 are added together and used as a memory address. The instruction becomes

mov AL, [107]

 IMPORTANT

Everything between square brackets will be treated as an ADDRESS/REFERENCE.

16-bit Memory Addressing Modes

There are 17 different ways to specify a memory address using 16-bit memory addressing modes:

Operand Type                       Pattern
direct                             [displacement]
register indirect                  [BX]
                                   [SI]
                                   [DI]
                                   [BP]
based                              [BP + displacement]
                                   [BX + displacement]
indexed                            [SI + displacement]
                                   [DI + displacement]
based-indexed                      [BX + SI]
                                   [BX + DI]
                                   [BP + SI]
                                   [BP + DI]
based-indexed with displacement    [BX + SI + displacement]
                                   [BX + DI + displacement]
                                   [BP + SI + displacement]
                                   [BP + DI + displacement]

where displacement means any expression that produces a 16-bit constant value.

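How can we remember all of the above memory addressing modes easily? Every one of them combines at most one base register, at most one index register, and an optional displacement:

 16-bit memory addressing modes

 (BX or BP) + (SI or DI) + Displacement

Any of the three parts may be omitted, as long as at least one remains.
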

The above addressing modes can be written in many different styles that all have the same meaning:

[BP + 2], 2[BP], [BP][2], [BP]+2, 2+[BP] are the same

ByteArray[BX][SI]+1, [ByteArray + BX + SI + 1], [ByteArray][BX][SI][1], etc. are the same.

A displacement may be any of the following:

  • the offset address of a variable
  • a constant (positive or negative)
  • the offset address of a variable plus or minus a constant

The syntax of an operand in based and indexed addressing modes is any of the following equivalent expressions:

  • [register+displacement]
  • [displacement+register]
  • [register]+displacement
  • displacement+[register]
  • displacement[register]

In based-indexed addressing mode, the operand may be written in several ways; four of them are:

  • variable[base_register][index_register]
  • [base_register+index_register+variable+constant]
  • variable[base_register+index_register+constant]
  • constant[base_register+index_register+variable]

Returning to the previous problem, all of the following instructions refer to exactly the same memory location, [ByteArray+7].

To load the eighth byte of ByteArray into AL
.
 .DATA
 ByteArray	BYTE	12, 34, 2, 5, 1 ,7, 13, 45, 98, 4
 .
 .CODE
 mov	AX, @Data
 mov	DS, AX
 mov	AL, [ByteArray + 7]
 .
 mov	SI, OFFSET ByteArray+7
 mov	AL, [SI]
 .
 mov	BX, 7
 mov	AL, [ByteArray + BX]
 .
 mov	BX, OFFSET ByteArray
 mov	AL, [BX + 7]
 .
 mov	SI, 7
 mov	AL, [ByteArray + SI]
 .
 mov	BX, OFFSET ByteArray
 mov	DI, 7
 mov	AL, [BX + DI]
 .
 mov	SI, OFFSET ByteArray
 mov	BX, 7
 mov	AL, [SI + BX]
 .
 mov	BX, OFFSET ByteArray
 mov	SI, 6
 mov	AL, [BX + SI + 1]
 .
 mov	BX, 3
 mov	SI, 4
 mov	AL, [BX + ByteArray + SI]
 .
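
The register-based forms are most useful when the element index is only known at run time. The following is a minimal sketch, not a definitive implementation: it sums the ten bytes of ByteArray using indexed addressing. It assumes MASM 6 or later (for LENGTHOF), the small memory model, and a DOS real-mode environment; the label next is an illustrative name that does not appear in the original text.

```asm
; Sketch: sum the ten elements of ByteArray with indexed addressing.
; Assumptions: MASM 6+, small memory model, DOS real mode.
.MODEL small
.STACK 100h
.DATA
ByteArray   BYTE    12, 34, 2, 5, 1, 7, 13, 45, 98, 4
.CODE
main PROC
        mov     AX, @Data
        mov     DS, AX                  ; DS now addresses the data segment
        xor     AX, AX                  ; AX = running total
        xor     SI, SI                  ; SI = index of the current element
        mov     CX, LENGTHOF ByteArray  ; CX = 10 iterations
next:   add     AL, [ByteArray + SI]    ; indexed: DS:[OFFSET ByteArray + SI]
        adc     AH, 0                   ; propagate any carry into the high byte
        inc     SI                      ; advance to the next byte
        loop    next
        ; AX now holds the sum of the elements (221 decimal)
        mov     AX, 4C00h               ; DOS function 4Ch: terminate program
        int     21h
main ENDP
END main
```

Because the elements are bytes, the index in SI is also the byte offset from the start of the array, so no scaling is needed.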

32-bit Memory Addressing

32-bit memory addressing modes are more powerful and more flexible than 16-bit addressing modes. Although these modes were designed for addressing up to 4 gigabytes of memory, we can still use them in real-mode programming.

With 32-bit addressing modes, any 32-bit general-purpose register may be used; the only restriction is that ESP cannot be used as an index register.

Intel provides a rich and flexible set of 32-bit memory addressing modes:

Operand Type                       Pattern
direct                             [displacement]
register indirect                  [base]
based                              [base + displacement]
indexed                            [index*scale + displacement]
base-indexed                       [base + index + displacement]
base-indexed with scale factor     [base + index*scale + displacement]

where displacement means any expression that produces a 32-bit constant value.

However, note that in real-mode programming you must ensure that the effective address (EA) does NOT exceed the 64K segment limit (0FFFFh).

Use only 32-bit registers inside a 32-bit address expression; do not mix them with 16-bit registers.

The general format of a 32-bit addressing mode is:

BASE + (INDEX * SCALE) + Displacement

BASE:   EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP
INDEX:  EAX, EBX, ECX, EDX, ESI, EDI, EBP
SCALE:  1, 2, 4, 8

Example: 32-bit memory operands
 mov	AX, [EAX]
 mov	AL, [EBX + EBX]
 mov	CX, [EBX + EBP*1]	; EBX is a base register (use DS)
 mov	SI, [EBP + EBX*1]	; EBP is a base register (use SS)
 mov	AL, [ByteArray + EDX + ESI*8]
 mov	[ESP + EDX*4 + ByteArray + 7], AL

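Example: incorrect usage of 32-bit memory operands
 mov	AX, [AX]		; AX cannot hold an address (16-bit: BX, SI, DI, BP only)
 mov	AL, [EBX + BX]		; Mix 16-bit reg with 32-bit reg.
 mov	CX, [EBX + ESP*1]	; ESP cannot be an index
 mov	SI, [EBP + EBX*3]	; Scale factor is invalid
 mov	AL, [ByteArray + EDX*4 + ESI*8]	; Only one index register may be scaled
 mov	[ESP + EDX + ECX], AL	; At most two registers (one base, one index)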

It is good programming practice to always distinguish the index register from the base register by writing a scale factor, because the base register determines the default segment.

So, the form [EBX + EBP*1] is much preferred to the form [EBX + EBP].
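
The scale factor is most useful for arrays whose elements are 2, 4, or 8 bytes wide, because the index register can then hold the element number rather than the byte offset. The following is a minimal sketch under stated assumptions: MASM, the small memory model, DOS real mode, and a 386 or later processor. WordArray and its contents are hypothetical and do not come from the original text.

```asm
; Sketch: scaled indexing into an array of words.
; Assumptions: MASM, small memory model, DOS real mode, 386 or later CPU.
; WordArray and its contents are hypothetical.
.MODEL small
.386                        ; placed after .MODEL with the intent of keeping 16-bit segments
.STACK 100h
.DATA
WordArray   WORD    100, 200, 300, 400
.CODE
main PROC
        mov     AX, @Data
        mov     DS, AX
        mov     ESI, 2                      ; element index; upper 16 bits are zero
        mov     AX, [WordArray + ESI*2]     ; byte offset = index * 2, so AX = 300
        mov     BX, OFFSET WordArray
        movzx   EBX, BX                     ; clear the upper half so the EA stays within 64K
        mov     DX, [EBX + ESI*2]           ; EBX is the base, ESI the scaled index; DX = 300
        mov     AX, 4C00h                   ; DOS function 4Ch: terminate program
        int     21h
main ENDP
END main
```

Loading the full 32-bit register, or zero-extending with movzx, keeps the upper 16 bits clear, which is one way to make sure the effective address stays within the real-mode limit.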

Segment-override prefix

When an operand uses a memory addressing mode, its effective address refers to a location in memory. As we have seen, in real-mode programming a memory location is identified by two parts: the SEGMENT containing the operand and the OFFSET from the beginning of that segment to the operand.

By default, if we do not specify the segment part explicitly, the processor automatically selects a segment according to the simple rule in the following table.

Type of addressing mode    Operand type                        Default segment used
16-bit                     [BX], [SI], [DI]                    DS
16-bit                     [BP]                                SS
32-bit                     EBP or ESP register as the base     SS
32-bit                     Otherwise                           DS

To change the segment part, we have to specify it explicitly with a segment-override prefix, as in the following examples:

Valid examples of the segment-override prefix
 mov	AX, ES:[SI]
 mov	EAX, SS:[DI]
 mov	CS:[EBX], AL
 mov	SS:[EBX + EBP*2 + 8], SI
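
A common case where an override is needed is using BP to point at a variable in the data segment, since BP defaults to SS. The following is a minimal sketch, assuming MASM, the small memory model, and a DOS real-mode environment; it reuses the ByteArray definition from earlier in this section.

```asm
; Sketch: overriding the default segment of a BP-based operand.
; Assumptions: MASM, small memory model, DOS real mode.
.MODEL small
.STACK 100h
.DATA
ByteArray   BYTE    12, 34, 2, 5, 1, 7, 13, 45, 98, 4
.CODE
main PROC
        mov     AX, @Data
        mov     DS, AX                  ; DS now addresses the data segment
        mov     BP, OFFSET ByteArray    ; BP = offset of the array within DS
        mov     AL, DS:[BP + 7]         ; DS: override, so AL = 45
                                        ; without it, [BP + 7] would be read relative to SS
        mov     AX, 4C00h               ; DOS function 4Ch: terminate program
        int     21h
main ENDP
END main
```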

However, we have to bear in mind that the segment cannot be overridden in these 3 cases:

Case                                  Register used
Fetching instructions                 The processor always uses the CS register.
Destination of string instructions    Always uses the ES register.
All stack pushes and pops             Always refer to the SS register.

 Review Example

Assume that the initial state of the 80x86 registers and memory, just as your assembly language program starts running, is as follows:

Registers
EAX = 10010H
EBX = 20H
ECX = 30H
EDX = 40H
ESI = 90100H
EDI = 10200H
EBP = 10H
ESP = 30H

Segment Registers
CS = 200H
DS = 200H
SS = 220H
ES = 300H

Physical address    Memory content    Physical address    Memory content
00101               02                02030               F1
...                 ...               02031               EC
02000               14                ...                 ...
02001               00                02070               04
02002               14                02071               02
02003               00                02072               09
02004               14                ...                 ...
02005               00                02100               FF
02006               14                02101               09
02007               00                ...                 ...
02008               14                02120               30
02009               00                02121               00
0200A               14                ...                 ...
0200B               00                02130               40
0200C               09                02131               00
0200D               00                ...                 ...
0200E               48                02140               30
0200F               65                02141               00
02010               6C                ...                 ...
02011               6C                02150               02
02012               6F                02151               01
02013               0D                ...                 ...
02014               0A                02200               02
02015               24                02201               02
02016               01                ...                 ...
02017               02                02210               DE
02018               03                02211               A1
02019               04                ...                 ...
0201A               05                02220               FF
0201B               FF                02221               FE
0201C               FF                ...                 ...
0201D               A1                0222F               FC
0201E               00                02230               FD
0201F               31                02231               34
...                 ...               ...                 ...

All numbers are in hexadecimal format.

Suppose that the following is a part of your assembly code. The assembler sets 0 as the offset address of table1.

.DATA
table1	dw	6 DUP(20), 9
msg1	db	'Hello', 13, 10, '$'
var1	LABEL	WORD
var2	LABEL	DWORD
var3	db	1, 2, 3, 4, 5
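
To relate these labels to the physical addresses in the memory table, recall that in real mode the processor forms a physical address by multiplying the segment value by 10h and adding the offset. As a worked check that uses only values stated above (DS = 200h, table1 at offset 0):

```latex
\begin{align*}
\text{physical address} &= \text{segment} \times 10\mathrm{h} + \text{offset} \\
\text{address of table1} &= 0200\mathrm{h} \times 10\mathrm{h} + 0000\mathrm{h} = 02000\mathrm{h}
\end{align*}
```

The memory table shows the bytes 14 and 00 at 02000 and 02001, which form the little-endian word 0014h, that is 20 decimal, matching the first element of table1.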

What is the result produced by executing each of the following instructions or operations independently?

add BX, AX
inc WORD PTR [100h]
sub DL, [msg1+3]
mov DX, OFFSET var1
mov AX, [var1+1]
mov [table1+SI], SI
mov CX, OFFSET var3
sub SI, [DI + BX]
mov ESP, -9001
mov WORD PTR [EBP + EBX*1], 2
mov BYTE PTR [EBX + EBP*1], 3
mov [var2], -100
sub EAX, 3

 SOLUTION