Algorithm Analysis: Simple Sorting Methods

 

Objectives of this lecture

•       Review simple sorting methods (selection, insertion and Shell sort).

•       Analyse these methods using Big-O notation.

 

Introduction

•       Sorting, like searching, is a very important but time-consuming application.

•       For this reason many sorting methods have been developed, but none can be said to be the best in all situations. We shall discuss only a few of them.

•       We continue to use macros for key comparisons so that the algorithms remain general.

•       We also assume the declarations of ItemType, ListEntry and List given in the Searching lecture.

•       In analysing sorting algorithms, we are concerned with two main factors:

        -      The number of key comparisons involved (as with searching).

        -      The number of data movements or pointer changes.
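The declarations from the Searching lecture are not reproduced in these notes. The following is only a minimal sketch of what they might look like, assuming integer keys and a fixed-capacity array. The definitions of MAXLIST, KeyType and the helper FillList are assumptions made for illustration, and so are the bodies of the comparison macros.

```c
#define MAXLIST 100                 /* assumed capacity of a contiguous list */

typedef int KeyType;                /* assumed: keys are plain integers */

typedef struct {
    KeyType key;                    /* the field the sorting routines compare */
    /* other record fields would go here */
} ItemType;

typedef ItemType ListEntry;

typedef struct {
    int count;                      /* number of entries currently in the list */
    ListEntry entry[MAXLIST];
} List;

/* Key-comparison macros: only these change if the key type changes. */
#define EQ(a, b) ((a) == (b))
#define LT(a, b) ((a) <  (b))
#define LE(a, b) ((a) <= (b))
#define GT(a, b) ((a) >  (b))

/* Hypothetical helper: copy n integer keys into a list (n <= MAXLIST). */
void FillList(List *list, const int keys[], int n)
{
    int i;

    list->count = n;
    for (i = 0; i < n; i++)
        list->entry[i].key = keys[i];
}
```

With declarations of this shape, a call such as LT(list->entry[i].key, list->entry[j].key) compares two keys without the sorting code knowing the key type.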

 

Selection Sort:

•       Selection sort consists of the following steps:

1.      Find the smallest (or largest) item in the list.

2.      Swap this item with the one at the beginning (or at the end) of the list.

3.      Repeat steps 1 and 2 on the remaining list, which starts one position later (or ends one position earlier) than before.

•       The following is an iterative version of selection sort:

 

/* Prototypes for the helper functions defined further down. */
int  MinKey(int start, List *list);
void Swap(int pos1, int pos2, List *list);

/* SelectionSort: sort a contiguous list using selection sort.
Pre:   The contiguous list has been created. Each entry has a key.
Post:  The list has been sorted into nondecreasing order.
Uses:  MinKey, Swap.
 */
void SelectionSort(List *list)
{
    int current;    /* position of place being correctly filled */
    int min;        /* position of smallest remaining key */

    for (current = 0; current < list->count - 1; current++) {
        min = MinKey(current, list);
        Swap(min, current, list);
    }
}

/* MinKey: find the position of the smallest key in the sublist.
Pre:   The contiguous list has been created. start is a valid
       position in list.
Post:  The position of the entry with the smallest key in
       entry[start .. count-1] is returned.
 */
int MinKey(int start, List *list)
{
    int current, min = start;

    for (current = start + 1; current < list->count; current++)
        if (LT(list->entry[current].key, list->entry[min].key))
            min = current;

    return min;
}

 

/* Swap: swap two entries in the contiguous list.
Pre:   The contiguous list has been created. pos1 and pos2
       are valid positions in list.
Post:  The entry at pos1 is swapped with the entry at pos2.
 */
void Swap(int pos1, int pos2, List *list)
{
    ListEntry temp = list->entry[pos1];

    list->entry[pos1] = list->entry[pos2];
    list->entry[pos2] = temp;
}

 

The following table shows a trace of how selection sort works:

 

Index   Original   Round 1   Round 2   Round 3
0       7          1         1         1
1       2          2         2         2
2       1          7         7         4
3       4          4         4         7

 

Analysis

•       The major work in function SelectionSort is done in the for loop, which executes n-1 times. Each iteration calls the two functions MinKey and Swap.

•       The Swap function is concerned with data movements and contains three assignment statements. Thus, the number of data movements in selection sort is:

3(n-1) = 3n + O(1)

•       The function MinKey is concerned with key comparisons. The number of comparisons for a given call depends on the starting position. If the starting position is t (counting from 0), the loop inside MinKey executes n-t-1 times. Thus, the total number of comparisons is:

(n-1) + (n-2) + … + 1 = (1/2)n(n-1) = (1/2)n^2 + O(n)

•       Notice that selection sort pays no attention to the original ordering of the list. Thus, there is no distinct best or worst case: the counts above are the same for every input of size n.
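These counts can be checked with an instrumented version of the algorithm. The sketch below is not the lecture's List-based code: it works on a plain int array for brevity, and the Counts type and the name CountingSelectionSort are hypothetical names introduced for illustration.

```c
/* Totals gathered while sorting (hypothetical helper type). */
typedef struct {
    long comparisons;   /* key comparisons made in the inner loop   */
    long movements;     /* entry assignments made by the swap steps */
} Counts;

/* Selection sort on a plain int array, instrumented with two counters. */
Counts CountingSelectionSort(int a[], int n)
{
    Counts c = {0, 0};
    int current, i, min, temp;

    for (current = 0; current < n - 1; current++) {
        /* Find the smallest key in a[current .. n-1]. */
        min = current;
        for (i = current + 1; i < n; i++) {
            c.comparisons++;
            if (a[i] < a[min])
                min = i;
        }
        /* Swap it into place: three assignments, as in Swap. */
        temp = a[min];
        a[min] = a[current];
        a[current] = temp;
        c.movements += 3;
    }
    return c;
}
```

For n = 5 this reports 10 comparisons and 12 movements whether the input is already sorted or in reverse order, which agrees with (1/2)n(n-1) and 3(n-1).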

 

Insertion Sort:

•       In this method, the list is conceptually divided into a destination sub-list a_0, …, a_{i-1} and a source sub-list a_i, …, a_{n-1}.

•       In each iteration, the first element of the source sub-list, a_i, is picked and transferred to its correct position in the destination sub-list.

 

/* InsertionSort: sort contiguous list using insertion sort method.
Pre:   The contiguous list has been created.
Post:  The list has been sorted into nondecreasing order.
 */
void InsertionSort(List *list)
{
    int start;        /* position of first unsorted entry */
    int place;        /* searches sorted part of list */
    ListEntry temp;   /* entry temporarily removed from list */

    for (start = 1; start < list->count; start++) {
        temp = list->entry[start];
        place = start;
        while ((place - 1 >= 0) &&
               LT(temp.key, list->entry[place - 1].key)) {
            list->entry[place] = list->entry[place - 1];
            place--;
        }
        list->entry[place] = temp;
    }
}

 

The following table shows a trace of insertion sort:

Index   Original   Round 1   Round 2   Round 3
0       7          2         1         1
1       2          7         2         2
2       1          1         7         4
3       4          4         4         7

Analysis

•       First, notice that if the list is already sorted, the condition of the while loop is never true. In this case insertion sort shifts no entries and makes only n-1 key comparisons. This is the best case.

•       To find the average case, suppose that the list is initially in random order.

First we notice that there are i possible positions for the entry being inserted:

not moving, moving by 1, …, moving by i-1.

Assuming these are equally likely:

1.      The probability of not moving at all is 1/i.

In this case the number of comparisons = 1

and the number of assignments = 0.

2.      The probability of moving is thus 1 - 1/i = (i-1)/i.

Since all the i-1 possibilities are assumed to be equally likely,

the average number of iterations of the while loop is:

(1 + 2 + … + (i-1)) / (i-1) = i/2

One key comparison and one assignment are done in each of these iterations. Thus:

average number of comparisons = i/2

and average number of assignments = i/2.

 

When we combine these two cases with their probabilities, we get:

comparisons = (1/i)(1) + ((i-1)/i)(i/2) = (i-1)/2 + 1/i

movements = (1/i)(0) + ((i-1)/i)(i/2) = (i-1)/2

Taking the sum of these numbers for i = 1 to n-1 and using Big-O notation to suppress the lower-order terms, we obtain the average number of comparisons and the average number of assignments to be (1/4)n^2 + O(n).
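The average can also be checked by brute force for small n. The sketch below runs insertion sort on every ordering of n distinct keys and averages the number of entries shifted inside the while loop. It uses a plain int array rather than the lecture's List type, and MAXN, InsertionShifts, Permute and AverageShifts are hypothetical names introduced for illustration.

```c
#include <assert.h>

#define MAXN 8   /* assumed upper limit; n! orderings grow very quickly */

/* Insertion sort on an int array. Returns the number of entries shifted
   inside the while loop (assignments to and from temp are not counted). */
long InsertionShifts(int a[], int n)
{
    long shifts = 0;
    int start, place, temp;

    for (start = 1; start < n; start++) {
        temp = a[start];
        place = start;
        while (place - 1 >= 0 && temp < a[place - 1]) {
            a[place] = a[place - 1];
            place--;
            shifts++;
        }
        a[place] = temp;
    }
    return shifts;
}

/* Visit every ordering of perm[0..n-1]: swap each candidate into
   position k, recurse on the rest, then swap it back. */
static void Permute(int perm[], int k, int n, long *total, long *cases)
{
    int i, temp, copy[MAXN];

    if (k == n) {
        for (i = 0; i < n; i++)
            copy[i] = perm[i];          /* sort a copy, keep perm intact */
        *total += InsertionShifts(copy, n);
        (*cases)++;
        return;
    }
    for (i = k; i < n; i++) {
        temp = perm[k]; perm[k] = perm[i]; perm[i] = temp;
        Permute(perm, k + 1, n, total, cases);
        temp = perm[k]; perm[k] = perm[i]; perm[i] = temp;
    }
}

/* Average number of shifts over all n! orderings of n distinct keys. */
double AverageShifts(int n)
{
    int perm[MAXN], i;
    long total = 0, cases = 0;

    assert(n >= 1 && n <= MAXN);
    for (i = 0; i < n; i++)
        perm[i] = i;
    Permute(perm, 0, n, &total, &cases);
    return (double)total / (double)cases;
}
```

AverageShifts(4) gives 3.0 and AverageShifts(6) gives 7.5, that is n(n-1)/4 in both cases, which has the leading term (1/4)n^2 obtained in the analysis.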

 

Comparison

Comparing these numbers with those of selection sort, we notice that as n increases, (1/4)n^2 becomes much larger than 3n. Thus, if moving entries is a slow process (large records), selection sort will be faster than insertion sort. On the other hand, insertion sort makes on average only about half as many comparisons as selection sort.
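As a worked illustration of this comparison, take n = 1000 (an arbitrary size chosen here for the example) and substitute into the formulas derived above, keeping only the leading term for the insertion sort averages:

```latex
\begin{align*}
\text{Selection sort, movements:} \quad & 3(n-1) = 2997 \\
\text{Selection sort, comparisons:} \quad & \tfrac{1}{2}n(n-1) = 499\,500 \\
\text{Insertion sort, movements (average):} \quad & \tfrac{1}{4}n^2 + O(n) \approx 250\,000 \\
\text{Insertion sort, comparisons (average):} \quad & \tfrac{1}{4}n^2 + O(n) \approx 250\,000
\end{align*}
```

Selection sort therefore moves far fewer entries, while insertion sort makes roughly half as many comparisons.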

 

Shell Sort

•       As observed earlier, the problem with insertion sort is the large number of data movements involved. For example, to sort the following list, insertion sort has to make 44 data movements, as shown below:

 

Original:  28 81 36 47 17 13 55 65 23 18 67 38  3   Moves
step1:     28 81 36 47 17 13 55 65 23 18 67 38  3       0
step2:     28 36 81 47 17 13 55 65 23 18 67 38  3       1
step3:     28 36 47 81 17 13 55 65 23 18 67 38  3       1
step4:     17 28 36 47 81 13 55 65 23 18 67 38  3       4
step5:     13 17 28 36 47 81 55 65 23 18 67 38  3       5
step6:     13 17 28 36 47 55 81 65 23 18 67 38  3       1
step7:     13 17 28 36 47 55 65 81 23 18 67 38  3       1
step8:     13 17 23 28 36 47 55 65 81 18 67 38  3       6
step9:     13 17 18 23 28 36 47 55 65 81 67 38  3       7
step10:    13 17 18 23 28 36 47 55 65 67 81 38  3       1
step11:    13 17 18 23 28 36 38 47 55 65 67 81  3       5
step12:     3 13 17 18 23 28 36 38 47 55 65 67 81      12

 

Total number of moves = 44

 

•       The aim of Shell sort (named after its inventor, Donald Shell, 1959) is to reduce the number of moves required by insertion sort.

•       When looking for a place to insert an item, Shell sort first compares it with the element that lies some distance d to its left, rather than with its immediate neighbour. Initially d can be, for example, 1/2 or 1/3 of the list size.

•       On succeeding passes, d is reduced by some factor and the sorting is re-applied.

•       By the time d is reduced to one (which is equivalent to ordinary insertion sort), the items are nearly sorted and the number of moves is drastically reduced.

•       The implementation is as follows:

 

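/* Prototype for the helper function defined further down. */
void InsertionSort2(List *list, int d);

/* ShellSort: sort a contiguous list using Shell sort.
Pre:   The contiguous list has been created. Each entry has a key.
Post:  The list has been sorted into nondecreasing order.
Uses:  InsertionSort2.
 */
void ShellSort(List *list)
{
    int d = list->count;    /* current increment */

    do {
        d = d / 3 + 1;      /* shrink the increment; the last value is 1 */
        InsertionSort2(list, d);
    } while (d > 1);
}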

 

/* InsertionSort2: a modified version of insertion sort.
Pre:   The contiguous list has been created.
Post:  The relative ordering of the entries has been improved; for
       d = 1, the list will be sorted.
 */
void InsertionSort2(List *list, int d)
{
    int start;        /* position of first unsorted entry */
    int place;        /* searches sorted part of list */
    ListEntry temp;   /* entry temporarily removed from list */

    for (start = d; start < list->count; start++) {
        temp = list->entry[start];
        place = start;
        while ((place - d >= 0) &&
               LT(temp.key, list->entry[place - d].key)) {
            list->entry[place] = list->entry[place - d];
            place -= d;
        }
        list->entry[place] = temp;
    }
}
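To see which increments the rule d = d/3 + 1 produces, the loop of ShellSort can be run on its own. The sketch below only records the successive values of d. GapSequence and its parameters are hypothetical names introduced for illustration.

```c
/* Store the increments produced by d = d/3 + 1, starting from d = n,
   into gaps[] (at most max of them) and return how many were stored. */
int GapSequence(int n, int gaps[], int max)
{
    int d = n, count = 0;

    do {
        d = d / 3 + 1;              /* integer division, as in ShellSort */
        if (count < max)
            gaps[count++] = d;
    } while (d > 1);
    return count;
}
```

For n = 13 the increments are 5, 2, 1, and for n = 100 they are 34, 12, 5, 2, 1. The "+ 1" ensures that d never drops to 0 and that the final pass always uses d = 1, which is an ordinary insertion sort.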

 

•       The following table shows a trace of the ShellSort function:

 

 

Original:  28 81 36 47 17 13 55 65 23 18 67 38  3   Moves

For d=5
step1:     13 81 36 47 17 28 55 65 23 18 67 38  3       1
step2:     13 55 36 47 17 28 81 65 23 18 67 38  3       1
step3:     13 55 36 47 17 28 81 65 23 18 67 38  3       0
step4:     13 55 36 23 17 28 81 65 47 18 67 38  3       1
step5:     13 55 36 23 17 28 81 65 47 18 67 38  3       0
step6:     13 55 36 23 17 28 81 65 47 18 67 38  3       0
step7:     13 38 36 23 17 28 55 65 47 18 67 81  3       2
step8:     13 38  3 23 17 28 55 36 47 18 67 81 65       2

For d=2
step1:      3 38 13 23 17 28 55 36 47 18 67 81 65       1
step2:      3 23 13 38 17 28 55 36 47 18 67 81 65       1
step3:      3 23 13 38 17 28 55 36 47 18 67 81 65       0
step4:      3 23 13 28 17 38 55 36 47 18 67 81 65       1
step5:      3 23 13 28 17 38 55 36 47 18 67 81 65       0
step6:      3 23 13 28 17 36 55 38 47 18 67 81 65       1
step7:      3 23 13 28 17 36 47 38 55 18 67 81 65       1
step8:      3 18 13 23 17 28 47 36 55 38 67 81 65       4
step9:      3 18 13 23 17 28 47 36 55 38 67 81 65       0
step10:     3 18 13 23 17 28 47 36 55 38 67 81 65       0
step11:     3 18 13 23 17 28 47 36 55 38 65 81 67       1

For d=1
step1:      3 18 13 23 17 28 47 36 55 38 65 81 67       0
step2:      3 13 18 23 17 28 47 36 55 38 65 81 67       1
step3:      3 13 18 23 17 28 47 36 55 38 65 81 67       0
step4:      3 13 17 18 23 28 47 36 55 38 65 81 67       2
step5:      3 13 17 18 23 28 47 36 55 38 65 81 67       0
step6:      3 13 17 18 23 28 47 36 55 38 65 81 67       0
step7:      3 13 17 18 23 28 36 47 55 38 65 81 67       1
step8:      3 13 17 18 23 28 36 47 55 38 65 81 67       0
step9:      3 13 17 18 23 28 36 38 47 55 65 81 67       2
step10:     3 13 17 18 23 28 36 38 47 55 65 81 67       0
step11:     3 13 17 18 23 28 36 38 47 55 65 81 67       0
step12:     3 13 17 18 23 28 36 38 47 55 65 67 81       1

 

Total number of moves = 24
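The two totals (44 moves for insertion sort and 24 for Shell sort on the same 13 keys) can be reproduced with an instrumented sketch. It uses a plain int array instead of the lecture's List type and counts only the entries shifted inside the while loop, which is what the Moves column of the traces records. GapInsertionShifts and ShellShifts are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* One insertion-sort pass over elements that are d apart. Returns the
   number of entries shifted inside the while loop. With d == 1 this is
   ordinary insertion sort. */
long GapInsertionShifts(int a[], int n, int d)
{
    long shifts = 0;
    int start, place, temp;

    assert(d >= 1);
    for (start = d; start < n; start++) {
        temp = a[start];
        place = start;
        while (place - d >= 0 && temp < a[place - d]) {
            a[place] = a[place - d];
            place -= d;
            shifts++;
        }
        a[place] = temp;
    }
    return shifts;
}

/* Shell sort with the increments d = d/3 + 1 used in ShellSort.
   Returns the total number of shifts over all passes. */
long ShellShifts(int a[], int n)
{
    long shifts = 0;
    int d = n;

    do {
        d = d / 3 + 1;
        shifts += GapInsertionShifts(a, n, d);
    } while (d > 1);
    return shifts;
}
```

Calling GapInsertionShifts(a, 13, 1) on the list 28 81 36 47 17 13 55 65 23 18 67 38 3 returns 44, while ShellShifts on a fresh copy of the same list returns 24 (7 with d = 5, 10 with d = 2 and 7 with d = 1).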

 

 

•       In practice, Shell sort has been shown to perform better than both insertion sort and selection sort.

•       However, mathematical analysis of Shell sort turns out to be difficult. Good estimates of the number of comparisons and moves have so far been obtained only under special conditions, so we shall not go into them.