Segment Registers 

Memory Segments

Memory segmentation (i.e. partitioning memory into smaller segments) is necessary because a 20-bit memory address cannot fit in a 16-bit CPU register (i.e. addresses of data and instructions cannot be stored directly in the x86 registers)

Since x86 registers are 16 bits wide, a memory segment is made of 2^16 consecutive locations (i.e. 64K locations)

Each segment is identified by a segment number, which is also a 16-bit number (i.e. segments are numbered from 0000h to FFFFh, giving 64K segments)

A memory location within a memory segment is referenced by specifying its offset from the start of the segment. Hence the first location in a segment has an offset of 0000h, while the last one has an offset of FFFFh

To reference a memory location, its logical address has to be specified. The logical address is written as:

Segment number:offset

For example, A43F:3487h means offset 3487h within segment A43Fh.

The logical address has to be translated to a 20-bit physical address that specifies the actual location of the word in the main memory. This is done as follows:
  1. The segment number is shifted 4 bits (i.e. 1 hexadecimal digit) to the left. This is equivalent to multiplying by 10h, i.e. a 0 is appended as the LSD of the hexadecimal segment number

  2. The resulting 5-digit hexadecimal number is added to the offset to yield the 20-bit physical address



 An example on translating a logical address to a physical address


The logical address A43F:3487h is translated to a 20-bit physical address as follows:

First the segment number is shifted one hexadecimal digit to the left and a 0 is inserted from the right to become:

A43F0h


Then it is added to the offset to give the 20-bit physical address:

  A43F0h
+ 03487h
--------
  A7877h      the 20-bit address
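
The two translation steps can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name logical_to_physical is hypothetical, and the final masking to 20 bits is an assumption that models a 20-bit address bus on which larger sums wrap around.

```python
def logical_to_physical(segment: int, offset: int) -> int:
    """Translate a 16-bit segment:offset pair into a 20-bit physical address."""
    if not 0 <= segment <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("segment and offset must each fit in 16 bits")
    # Step 1: shift the segment number left by 4 bits (multiply by 10h).
    base = segment << 4
    # Step 2: add the offset. Masking to 20 bits is an assumption of this
    # sketch (it models a 20-bit address bus, where larger sums wrap around).
    return (base + offset) & 0xFFFFF


# The worked example above: A43F:3487h
print(f"{logical_to_physical(0xA43F, 0x3487):05X}h")  # A7877h
```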





Segments overlap heavily in the main memory. A new segment starts every 10h locations (i.e. every 16 locations), so the starting address of a segment always has 0h as its LSD. This is demonstrated in the table below:



Segment                 Physical Address (hex)
----------------------------------------------
                        10021
                        10020
End of Segment 2        1001F
                        1001E
                        ...
                        10010
End of Segment 1        1000F
                        1000E
                        ...
                        10000
End of Segment 0        0FFFF
                        0FFFE
                        ...
                        00021
Start of Segment 2      00020
                        0001F
                        ...
                        00011
Start of Segment 1      00010
                        0000F
                        ...
                        00003
                        00002
                        00001
Start of Segment 0      00000



Every 16 consecutive memory locations form a paragraph, and an address that is divisible by 16 (i.e. ends with 0h) represents a paragraph boundary
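
The start and end addresses of the first few segments, and the paragraph-boundary test, can be reproduced with a short Python sketch. The names segment_range and is_paragraph_boundary are hypothetical helpers introduced here for illustration. The sketch does not handle the highest segment numbers, whose end would exceed the 20-bit address range.

```python
SEGMENT_SIZE = 0x10000  # 64K locations per segment
PARAGRAPH = 0x10        # 16 locations per paragraph


def segment_range(segment: int) -> tuple:
    """Return (start, end) physical addresses of a segment.

    Assumption of this sketch: the segment number is small enough that
    the end address still fits in 20 bits.
    """
    start = segment * PARAGRAPH       # a new segment starts every 10h locations
    end = start + SEGMENT_SIZE - 1    # last location has offset FFFFh
    return start, end


def is_paragraph_boundary(address: int) -> bool:
    """True when the address is divisible by 16 (its hex LSD is 0)."""
    return address % PARAGRAPH == 0


for seg in range(3):
    start, end = segment_range(seg)
    print(f"Segment {seg}: {start:05X}h to {end:05X}h")
# Segment 0: 00000h to 0FFFFh
# Segment 1: 00010h to 1000Fh
# Segment 2: 00020h to 1001Fh
```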

Because segments overlap, logical addresses are not unique, as shown in the example below:



 An example on translating a physical address to a logical address


The physical address A7877h can be represented by many logical addresses since it is common to numerous segments.

If we just take the segments A781h, A782h, A783h, A784h, A785h, A786h and A787h, then the above physical address can be translated to logical addresses in these segments as:

offset = physical address – segment number × 10h

So the logical addresses are:
A781:0067h
A782:0057h
A783:0047h
A784:0037h
A785:0027h
A786:0017h
A787:0007h
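
The list above can be regenerated with a small Python sketch that applies the offset equation to each candidate segment. The generator name logical_addresses is hypothetical. A segment is skipped when the resulting offset does not fit in 16 bits, since the segment then does not contain the address.

```python
def logical_addresses(physical: int, segments):
    """Yield (segment, offset) pairs that map to the given physical address."""
    for segment in segments:
        # offset = physical address - segment number x 10h
        offset = physical - segment * 0x10
        # Keep only segments that actually contain the address.
        if 0 <= offset <= 0xFFFF:
            yield segment, offset


for segment, offset in logical_addresses(0xA7877, range(0xA781, 0xA788)):
    print(f"{segment:04X}:{offset:04X}h")
# A781:0067h
# A782:0057h
# A783:0047h
# A784:0037h
# A785:0027h
# A786:0017h
# A787:0007h
```

Passing range(0x10000) instead of the seven segments above yields 4096 pairs, one for every segment that contains A7877h.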




Although a physical address may be common to many segments, it has a unique offset within each of these segments, as was shown in the example above. The next example also demonstrates this fact:



 Another example on translating a physical address to a logical address


What is the segment number in which the physical address A7877h has an offset of CF17h?

The equation we use now is:

segment number × 10h = physical address – offset

So the segment number for the above physical address and offset is:

Segment number = (A7877h - CF17h)/10h = 9A960h/10h = 9A96h
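
The same calculation can be sketched in Python. The function name segment_for is hypothetical. The error check reflects the fact that a segment always starts on a paragraph boundary, so a solution exists only when physical address – offset is divisible by 10h.

```python
def segment_for(physical: int, offset: int) -> int:
    """Return the segment number in which `physical` has the given offset."""
    # segment number x 10h = physical address - offset
    base = physical - offset
    # A segment must start at a non-negative paragraph boundary.
    if base < 0 or base % 0x10 != 0:
        raise ValueError("no segment gives this offset for this physical address")
    segment = base // 0x10
    if segment > 0xFFFF:
        raise ValueError("segment number does not fit in 16 bits")
    return segment


print(f"{segment_for(0xA7877, 0xCF17):04X}h")  # 9A96h
```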






Program Segments

Machine language programs usually have 3 different parts. Each of these parts is stored in a different memory segment:

  1. Instructions: This is the code part and is stored in the code segment

  2. Data: This is the data part which is manipulated by the code and is stored in the data segment

  3. Stack: The stack is a special memory buffer outside the CPU that is maintained by the CPU as a temporary holding area for addresses and data. It is organized as a Last-In-First-Out (LIFO) buffer and is used by the CPU to implement procedure calls. This data structure is stored in the stack segment


The segment numbers for the code segment, the data segment, and the stack segment are stored in the segment registers CS, DS, and SS, respectively

A fourth segment register, ES (the extra segment register), is provided for programs that need to access a second data segment

Segment registers cannot be used in arithmetic operations

Program segments do not need to occupy the whole 64K locations. Because segments overlap, program segments that are smaller than 64K locations can be placed close together
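
How small program segments can be packed closely is sketched below in Python. Everything in it is illustrative: the helper name next_free_segment, the load position 1000h, and the sizes 123h and 40h are assumptions chosen for the example, not values from a real program. The only rule it relies on is that a segment must start on a paragraph boundary.

```python
PARAGRAPH = 0x10  # a segment can start only every 16 locations


def next_free_segment(segment: int, size: int) -> int:
    """Return the lowest segment number that starts at or after the end of a
    program segment of `size` locations beginning at `segment`."""
    end = segment * PARAGRAPH + size            # first location past the contents
    return (end + PARAGRAPH - 1) // PARAGRAPH   # round up to a paragraph boundary


code_segment = 0x1000                                     # hypothetical load position
data_segment = next_free_segment(code_segment, 0x0123)    # code uses 123h locations
stack_segment = next_free_segment(data_segment, 0x0040)   # data uses 40h locations
print(f"CS={code_segment:04X}h DS={data_segment:04X}h SS={stack_segment:04X}h")
# CS=1000h DS=1013h SS=1017h
```

In this sketch the three 64K segments overlap almost entirely, yet the parts actually used by the program do not collide, and at most 15 locations are left unused between consecutive parts.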

At any time, only the four memory segments specified by the segment registers are active (i.e. can be accessed). However, the program can alter the contents of these registers to access different segments