# Memory addressing mode

## Overview

Memory (RAM) is the main component a computer uses to temporarily store data and machine instructions. Programs frequently need to read from and write to memory locations.

How do we specify exactly which memory location we want to access? The simplest answer is to give the label of the desired memory variable.

Example: Reading and writing memory locations

```
        .
TempVar DW ?
NextVar DW ?
        .
        .
        mov AX, TempVar
        sub NextVar, AX
        .
```

The fragment above reads the operand value at memory location TempVar and copies it into register AX. It then reads the operand value at memory location NextVar, subtracts the content of AX from it, and stores the result back in NextVar.

However, real programming problems are usually more complex. Consider the following one.

Suppose you have an array of bytes named ByteArray. It consists of 10 elements: 12, 34, 2, 5, 1, 7, 13, 45, 98, 4. In memory, ByteArray starts at offset 100 in the data segment. To read the eighth byte, 45, which is at address 107, in Java we would write:

byte d = ByteArray[7];

How can we do the same in assembly?

Fortunately, MASM provides many different ways to address a memory location. The simplest solution to the above problem is:

Solution: Reading the eighth byte of ByteArray

```
        .
        .DATA
ByteArray BYTE 12, 34, 2, 5, 1, 7, 13, 45, 98, 4
        .
        .CODE
        mov AX, @Data
        mov DS, AX
        mov AL, ByteArray + 7    ; or mov AL, ByteArray[7]
```

MASM evaluates the last statement above as

mov AL, [100+7]

Since ByteArray starts at offset 100, the offset of ByteArray and 7 are added together and the sum is used as a memory address. The instruction becomes

mov AL, [107]

IMPORTANT: Everything between square brackets is treated as an ADDRESS/REFERENCE.
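The arithmetic above can be checked with a small model. The following Python sketch is only an illustration, not MASM output: it treats the data segment as a flat array of bytes, places the ten elements at offset 100 as the text assumes, and reads the byte at offset 100 + 7. The names `data_segment` and `BYTE_ARRAY_OFFSET` are made up for this example.

```python
# Toy model of the data segment: a flat array of bytes indexed by offset.
data_segment = bytearray(256)

BYTE_ARRAY_OFFSET = 100  # assumed start of ByteArray, as in the text
elements = [12, 34, 2, 5, 1, 7, 13, 45, 98, 4]
data_segment[BYTE_ARRAY_OFFSET:BYTE_ARRAY_OFFSET + len(elements)] = bytes(elements)

# mov AL, [ByteArray + 7]: the bracketed expression is an address, not a value.
address = BYTE_ARRAY_OFFSET + 7
al = data_segment[address]

print(address, al)  # prints: 107 45
```

The value 107 is never loaded into AL; it only selects which byte of memory is read.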

There are 17 different ways to specify a memory address using 16-bit memory addressing modes:

| Operand type | Pattern |
| --- | --- |
| direct | [displacement] |
| register indirect | [BX] [SI] [DI] [BP] |
| based | [BP + displacement] [BX + displacement] |
| indexed | [SI + displacement] [DI + displacement] |
| based-indexed | [BX + SI] [BX + DI] [BP + SI] [BP + DI] |
| based-indexed with displacement | [BX + SI + displacement] [BX + DI + displacement] [BP + SI + displacement] [BP + DI + displacement] |

where displacement means any expression that produces a 16-bit constant value.

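How can we easily remember all of the above memory addressing modes? Every mode combines at most one base register, at most one index register, and an optional displacement:

(BX or BP) + (SI or DI) + Displacement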
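One way to see where the count of 17 comes from is to enumerate the combinations: an optional base register (BX or BP), an optional index register (SI or DI), and an optional displacement, excluding the empty combination. The Python sketch below does only this counting; the helper name `operand_pattern` is invented for the illustration.

```python
from itertools import product

BASES = [None, "BX", "BP"]
INDEXES = [None, "SI", "DI"]
DISPLACEMENTS = [None, "displacement"]

def operand_pattern(base, index, disp):
    """Return the bracketed pattern for one combination, e.g. '[BX + SI]'."""
    parts = [p for p in (base, index, disp) if p is not None]
    return "[" + " + ".join(parts) + "]"

# Every combination except the empty one (no base, no index, no displacement).
patterns = [
    operand_pattern(b, i, d)
    for b, i, d in product(BASES, INDEXES, DISPLACEMENTS)
    if (b, i, d) != (None, None, None)
]

print(len(patterns))  # prints: 17
```

The count is 3 x 3 x 2 - 1 = 17, which matches the number of patterns listed in the table of 16-bit modes.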

The above addressing modes can be written in many different styles that have the same meaning:

[BP + 2], 2[BP], [BP][2], [BP]+2, 2+[BP] are the same.

ByteArray[BX][SI]+1, [ByteArray + BX + SI + 1], [ByteArray][BX][SI][1], etc. are the same.

A displacement may be any of the following:

• the offset address of a variable
• a constant (positive or negative)
• the offset address of a variable plus or minus a constant

The syntax of an operand in based and indexed addressing modes is any of the following equivalent expressions:
• [register+displacement]
• [displacement+register]
• [register]+displacement
• displacement+[register]
• displacement[register]

In based-indexed addressing mode, the operand may be written in several ways; four of them are:

• variable[base_register][index_register]
• [base_register+index_register+variable+constant]
• variable[base_register+index_register+constant]
• constant[base_register+index_register+variable]

Returning to the previous problem, all of the following instructions refer to exactly the same memory location, [ByteArray+7].

To load the eighth byte of ByteArray into AL

```
        .
        .DATA
ByteArray BYTE 12, 34, 2, 5, 1, 7, 13, 45, 98, 4
        .
        .CODE
        mov AX, @Data
        mov DS, AX
        mov AL, [ByteArray + 7]
        .
        mov SI, OFFSET ByteArray+7
        mov AL, [SI]
        .
        mov BX, 7
        mov AL, [ByteArray + BX]
        .
        mov BX, OFFSET ByteArray
        mov AL, [BX + 7]
        .
        mov SI, 7
        mov AL, [ByteArray + SI]
        .
        mov BX, OFFSET ByteArray
        mov DI, 7
        mov AL, [BX + DI]
        .
        mov SI, OFFSET ByteArray
        mov BX, 7
        mov AL, [SI + BX]
        .
        mov BX, OFFSET ByteArray
        mov SI, 6
        mov AL, [BX + SI + 1]
        .
        mov BX, 3
        mov SI, 4
        mov AL, [BX + ByteArray + SI]
        .
```
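To confirm that the nine variants in the listing really select the same byte, the effective address of each can be computed by hand. The Python sketch below is a model only: it assumes ByteArray sits at offset 100 (as in the earlier problem statement) and sums the terms that appear inside each pair of brackets. The names `ea16` and `variants` are hypothetical.

```python
BYTE_ARRAY = 100  # assumed offset of ByteArray in the data segment
memory = bytearray(256)
memory[BYTE_ARRAY:BYTE_ARRAY + 10] = bytes([12, 34, 2, 5, 1, 7, 13, 45, 98, 4])

def ea16(*terms):
    """16-bit effective address: the sum of all terms, wrapped to 16 bits."""
    return sum(terms) & 0xFFFF

# Each tuple holds the terms inside the brackets of one variant.
variants = [
    (BYTE_ARRAY, 7),      # [ByteArray + 7]
    (BYTE_ARRAY + 7,),    # [SI], SI = OFFSET ByteArray+7
    (BYTE_ARRAY, 7),      # [ByteArray + BX], BX = 7
    (BYTE_ARRAY, 7),      # [BX + 7], BX = OFFSET ByteArray
    (BYTE_ARRAY, 7),      # [ByteArray + SI], SI = 7
    (BYTE_ARRAY, 7),      # [BX + DI], BX = OFFSET ByteArray, DI = 7
    (BYTE_ARRAY, 7),      # [SI + BX], SI = OFFSET ByteArray, BX = 7
    (BYTE_ARRAY, 6, 1),   # [BX + SI + 1], BX = OFFSET ByteArray, SI = 6
    (3, BYTE_ARRAY, 4),   # [BX + ByteArray + SI], BX = 3, SI = 4
]

addresses = [ea16(*v) for v in variants]
values = [memory[a] for a in addresses]

print(set(addresses), set(values))  # prints: {107} {45}
```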

32-bit memory addressing modes are more powerful and more flexible than 16-bit addressing modes. Although these modes are intended for accessing up to 4 gigabytes of memory, we can still use them in real-mode programming.

With 32-bit addressing modes, any 32-bit general-purpose register may be used, except that ESP cannot be used as an index register.

Intel provides a rich and flexible set of 32-bit memory addressing modes:

| Operand type | Pattern |
| --- | --- |
| direct | [displacement] |
| register indirect | [base] |
| based | [base + displacement] |
| indexed | [index*scale + displacement] |
| base-indexed | [base + index + displacement] |
| base-indexed with scale factor | [base + index*scale + displacement] |

where displacement means any expression that produces a 32-bit constant value.

Note, however, that in real-mode programming you have to ensure that the effective address (EA) does NOT exceed the 64K segment limit (0FFFFh).

Use only 32-bit registers inside the brackets; do not mix them with 16-bit registers.

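The general format of the 32-bit addressing modes is:

BASE + (INDEX * SCALE) + Displacement

| Component | Allowed values |
| --- | --- |
| BASE | EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP |
| INDEX | EAX, EBX, ECX, EDX, ESI, EDI, EBP |
| SCALE | 1, 2, 4, 8 |
| Displacement | any expression that produces a 32-bit constant value |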
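The format BASE + (INDEX * SCALE) + Displacement is plain arithmetic, so it can be sketched in a few lines. The Python model below is an illustration under assumptions: the function names `ea32` and `fits_real_mode` and the register values are invented, and the real-mode check simply compares the result with 0FFFFh as the text requires.

```python
SCALES = (1, 2, 4, 8)

def ea32(base=0, index=0, scale=1, disp=0):
    """Effective address = base + index*scale + disp, wrapped to 32 bits."""
    if scale not in SCALES:
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & 0xFFFFFFFF

def fits_real_mode(ea):
    """True if the effective address stays within a 64K segment."""
    return ea <= 0xFFFF

# Hypothetical register values, chosen only for illustration.
ebx, esi = 0x20, 0x3
ea = ea32(base=ebx, index=esi, scale=4, disp=0x100)

print(hex(ea), fits_real_mode(ea))  # prints: 0x12c True
```

A base register holding a value such as 90100h would fail the `fits_real_mode` check, which is the situation the real-mode warning is about.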

Example: 32-bit memory operands

```
mov AX, [EAX]
mov AL, [EBX + EBX]
mov CX, [EBX + EBP*1]    ; EBX is a base register (use DS)
mov SI, [EBP + EBX*1]    ; EBP is a base register (use SS)
mov AL, [ByteArray + EDX + ESI*8]
mov [ESP + EDX*4 + ByteArray + 7], AL
```

Example: incorrect usage of 32-bit memory operands

```
mov AX, [AX]                          ; AX cannot be used in an addressing mode
mov AL, [EBX + BX]                    ; Mix 16-bit reg with 32-bit reg.
mov CX, [EBX + ESP*1]                 ; ESP cannot be an index
mov SI, [EBP + EBX*3]                 ; Scale factor is invalid
mov AL, [ByteArray + EDX*4 + ESI*8]   ; Only one index register may be scaled
mov [ESP + EDX + ECX], AL             ; At most two registers (base and index)
```
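The rules behind these rejected operands can be collected into a small checker. This Python sketch is a simplified model, not an assembler: an operand is described as a list of (register, scale) pairs, with scale `None` for a register written without a scale factor, and displacements are ignored. The function name `check_operand` and the wording of the reasons are made up for the illustration, and the checker does not cover every rule an assembler enforces.

```python
REGS32 = {"EAX", "EBX", "ECX", "EDX", "ESI", "EDI", "EBP", "ESP"}

def check_operand(terms):
    """Check a 32-bit memory operand given as (register, scale) pairs.

    Returns 'ok' or a short reason why the operand is rejected.
    """
    if any(reg not in REGS32 for reg, _ in terms):
        return "not a 32-bit register"
    if len(terms) > 2:
        return "more than two registers"
    scaled = [(reg, s) for reg, s in terms if s is not None]
    if len(scaled) > 1:
        return "more than one scaled index"
    for reg, s in scaled:
        if s not in (1, 2, 4, 8):
            return "invalid scale factor"
        if reg == "ESP":
            return "ESP cannot be an index"
    return "ok"

print(check_operand([("EBX", None), ("EBP", 1)]))  # [EBX + EBP*1]
print(check_operand([("EBP", None), ("EBX", 3)]))  # [EBP + EBX*3]
print(check_operand([("EBX", None), ("ESP", 1)]))  # [EBX + ESP*1]
```

The first call reports `ok`; the other two report the invalid scale factor and the misuse of ESP as an index.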

It is good programming practice to always distinguish the index register from the base register by writing an explicit scale factor. For example, prefer the form [EBX + EBP*1] to the form [EBX + EBP].

## Segment-override prefix

When a memory addressing mode is used as an operand, the effective address refers to a location in memory. As noted before, in real-mode programming a memory location is composed of two parts: the SEGMENT containing the operand and the OFFSET from the beginning of that segment to the operand.
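In real mode the two parts are combined by shifting the segment value left by four bits (multiplying by 16) and adding the offset. The Python sketch below shows that calculation; the function name `physical_address` and the sample values are chosen for illustration, and wrap-around above 1 MB is deliberately ignored.

```python
def physical_address(segment, offset):
    """Real-mode physical address: segment * 16 + offset."""
    return (segment << 4) + offset

# Illustrative values: a segment register holding 200h and an offset of 0Eh.
print(hex(physical_address(0x200, 0x0E)))  # prints: 0x200e
```

Because the segment is only shifted by four bits, many different segment:offset pairs name the same physical location; for example, 200h:0220h and 220h:0020h both give 02220h.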

By default, if we do not specify the segment part explicitly, the processor automatically selects a segment according to the simple rule in the following table.

| Type of addressing mode | Operand type | Default segment used |
| --- | --- | --- |
| 16-bit | [BX], [SI], [DI] | DS |
| 16-bit | [BP] | SS |
| 32-bit | Either [EBP] or [ESP] register as the base | SS |
| 32-bit | Otherwise | DS |

To change the segment part, we have to specify it explicitly with a segment-override prefix, as in the following examples:

Valid examples of segment-override prefixes

```
mov AX, ES:[SI]
mov EAX, SS:[DI]
mov CS:[EBX], AL
mov SS:[EBX + EBP*2 + 8], SI
```
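The default-segment rule and the effect of an override can be summarised as a lookup on the base register. The Python sketch below is a model of that rule only; `default_segment` and `effective_segment` are hypothetical names, and the base register is passed as a string (or `None` when the operand has no base register).

```python
def default_segment(base_register):
    """Default segment for a memory operand, chosen by its base register."""
    return "SS" if base_register in ("BP", "EBP", "ESP") else "DS"

def effective_segment(base_register, override=None):
    """Segment actually used: the override prefix wins when one is given."""
    return override if override is not None else default_segment(base_register)

print(effective_segment("SI"))          # prints: DS
print(effective_segment("EBP"))         # prints: SS
print(effective_segment("SI", "ES"))    # prints: ES  (mov AX, ES:[SI])
print(effective_segment("EBX", "SS"))   # prints: SS  (mov SS:[EBX + EBP*2 + 8], SI)
```

In the last example EBP appears only as a scaled index, so without the prefix the operand would use DS.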

However, bear in mind that the segment cannot be overridden in these three cases:

| Case | Register used |
| --- | --- |
| Fetching instructions | The processor always uses the CS register. |
| Destination of string instructions | Always uses the ES register. |
| All stack pushes and pops | Always refer to the SS register. |

## Review Example

Assume that the initial state of the 80x86's registers and memory, just when your assembly language program starts running, is as follows:

| Register | Value |
| --- | --- |
| EAX | 10010H |
| EBX | 20H |
| ECX | 30H |
| EDX | 40H |
| ESI | 90100H |
| EDI | 10200H |
| EBP | 10H |
| ESP | 30H |

| Segment register | Value |
| --- | --- |
| CS | 200H |
| DS | 200H |
| SS | 220H |
| ES | 300H |

In the memory table, bytes are listed in order of increasing address, and "..." marks addresses whose content is not given.

| Physical address | Memory content |
| --- | --- |
| 00101 | 02 |
| ... | ... |
| 02000 to 0200B | 14 00 14 00 14 00 14 00 14 00 14 00 |
| 0200C to 0200D | 09 00 |
| 0200E to 02015 | 48 65 6C 6C 6F 0D 0A 24 |
| 02016 to 0201A | 1 2 3 4 5 |
| 0201B to 0201F | FF FF A1 0 31 |
| ... | ... |
| 02030 to 02031 | F1 EC |
| ... | ... |
| 02070 to 02072 | 4 2 9 |
| ... | ... |
| 02100 to 02101 | FF 9 |
| ... | ... |
| 02120 to 02121 | 30 0 |
| ... | ... |
| 02130 to 02131 | 40 0 |
| ... | ... |
| 02140 to 02141 | 30 0 |
| ... | ... |
| 02150 to 02151 | 2 1 |
| ... | ... |
| 02200 to 02201 | 2 2 |
| ... | ... |
| 02210 to 02211 | DE A1 |
| ... | ... |
| 02220 to 02221 | FF FE |
| ... | ... |
| 0222F to 02231 | FC FD 34 |
| ... | ... |

All numbers are in hexadecimal format.

Suppose that the following is part of your assembly code. The assembler assigns offset address 0 to table1.

    .DATA
    table1  dw      6 DUP(20), 9
    msg1    db      'Hello', 13, 10, '$'
    var1    LABEL   WORD
    var2    LABEL   DWORD
    var3    db      1, 2, 3, 4, 5
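Before working through the instructions, it helps to know the offset of each label. The Python sketch below lays out the same data by hand, assuming little-endian words and the fact that a LABEL directive reserves no storage; it is a model of the layout, not assembler output, and the helper name `dw` simply mirrors the directive.

```python
import struct

def dw(*values):
    """Encode each value as a 16-bit little-endian word, like the DW directive."""
    return b"".join(struct.pack("<H", v) for v in values)

table1 = dw(*([20] * 6), 9)              # 6 DUP(20), 9
msg1 = b"Hello" + bytes([13, 10]) + b"$"
var3 = bytes([1, 2, 3, 4, 5])

offsets = {"table1": 0}
offsets["msg1"] = offsets["table1"] + len(table1)
offsets["var3"] = offsets["msg1"] + len(msg1)
# var1 and var2 are LABELs: they name the same offset as var3.
offsets["var1"] = offsets["var2"] = offsets["var3"]

image = table1 + msg1 + var3
print(len(image), offsets["msg1"], offsets["var1"])  # prints: 27 14 22
```

With DS = 200H, offset 0 corresponds to physical address 02000, so these 27 bytes are the ones shown at 02000 to 0201A in the memory table (msg1 at offset 0Eh, var1, var2 and var3 at offset 16h).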

What is the result produced by executing each of the following instructions or operations independently?

• add BX, AX
• inc WORD PTR [100h]
• sub DL, [msg1+3]
• mov DX, OFFSET var1
• mov AX, [var1+1]
• mov [table1+SI], SI
• mov CX, OFFSET var3
• sub SI, [DI + BX]
• mov ESP, -9001
• mov WORD PTR [EBP + EBX*1], 2
• mov BYTE PTR [EBX + EBP*1], 3
• mov [var2], -100
• sub EAX, 3