Lecture 25:  Sorting 5 – Bucket Sort – The worst of all?

 

Objectives of this lecture

•       Learn the bucket sort method

•       Learn why it may be regarded as the best as well as the worst sorting method

 

How Bucket Sort Works

•       Bucket sort is one of the fastest possible sorting methods, as it does not depend on comparing keys.

•       It is also very easy to understand and code.

•       However, it can also be viewed as one of the worst sorting methods, as it is applicable only in very rare situations.

•       To do a bucket sort, we use a temporary array in which we distribute the elements of the list, based on their key fields.

•       If the maximum key value in the list is m, then the temporary array must have at least m+1 elements (indices 0 to m).

 

e.g. for the list:

2   10   7   12   1   5

 

•       The temporary array must have at least 13 elements (indices 0 to 12), since the largest key is 12.

•       To distribute the list in the temporary array, we first initialize every element of the temporary array with a flag value, that is, a value that cannot be a key in the list (say -1 in the above example).

 

temp:    -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1  -1
index:    0   1   2   3   4   5   6   7   8   9  10  11  12
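The set-up step described above can be sketched in C as follows. This is only an illustration: the names FLAG, max_key and make_temp are assumptions introduced here and do not come from the lecture's program.

```c
#include <stdlib.h>

#define FLAG (-1)   /* assumed flag value: never a valid key, since keys are >= 0 */

/* Return the largest key in list[0..count-1]; count must be at least 1. */
int max_key(const int list[], int count)
{
    int i, max = list[0];

    for (i = 1; i < count; i++)
        if (list[i] > max)
            max = list[i];
    return max;
}

/* Allocate a temporary array with indices 0..max and fill it with FLAG.
   Returns NULL if the allocation fails; the caller must free() the result. */
int *make_temp(int max)
{
    int i;
    int *temp = malloc(((size_t)max + 1) * sizeof *temp);

    if (temp == NULL)
        return NULL;
    for (i = 0; i <= max; i++)
        temp[i] = FLAG;
    return temp;
}
```

For the example list, max_key returns 12, so make_temp(12) allocates the 13 elements shown above.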

 

•       Next, we copy each element of the list with key k into position k of the temporary array.

 

temp:    -1   1   2  -1  -1   5  -1   7  -1  -1  10  -1  12
index:    0   1   2   3   4   5   6   7   8   9  10  11  12

 

•       Finally, we copy all non-flag values from the temporary array back into the list, in the order they appear in the temporary array.

 

final list:   1   2   5   7   10   12

 

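•       Notice that distributing n records requires n steps.  Initializing the temporary array and copying the values back each require m+1 steps, where m is the maximum key value.  Thus, the performance of bucket sort is of order n + m, which is linear in n as long as the range of keys is not much larger than the number of records.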
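To make the step count concrete, the sketch below adds up the array visits made by the three passes described above (initialize, distribute, collect). The function name bucket_sort_steps is an assumption made for illustration only.

```c
/* Number of array visits made by the three passes of the method,
   for `count` records whose largest key is `max`. */
long bucket_sort_steps(long count, long max)
{
    long init       = max + 1;   /* fill temp[0..max] with the flag value  */
    long distribute = count;     /* one store per record                   */
    long collect    = max + 1;   /* scan temp[0..max] for non-flag values  */

    return init + distribute + collect;
}
```

For the example list (6 records, largest key 12) this gives 13 + 6 + 13 = 32 steps. The same 6 records with a largest key of 999999 would need 2000006 steps, because two of the three passes depend on the key range rather than on the number of records.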

 

The following program implements bucket sort for simple integer values.

 

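#include <stdio.h>
#include <stdlib.h>

#define LIST_SIZE 6

void BucketSort(int list[], int max);

int main(void)
{
    int i, list[] = {2, 10, 7, 12, 1, 5};

    BucketSort(list, 13);              /* keys lie in 0..12, so temp needs 13 slots */
    for (i = 0; i < LIST_SIZE; i++)
        printf("%d  ", list[i]);
    printf("\n");
    return 0;
}

/* Sorts list[0..LIST_SIZE-1].  max is the size of the temporary array,
   i.e. the largest key plus one.  Keys must be unique and in 0..max-1. */
void BucketSort(int list[], int max)
{
    int i, j = 0, *temp;

    temp = malloc(max * sizeof *temp);
    if (temp == NULL) {                /* allocation failed: nothing can be sorted */
        fprintf(stderr, "BucketSort: out of memory\n");
        exit(EXIT_FAILURE);
    }
    for (i = 0; i < max; i++)          /* fill temp with the flag value */
        temp[i] = -1;
    for (i = 0; i < LIST_SIZE; i++)    /* distribute: key k goes to position k */
        temp[list[i]] = list[i];
    for (i = 0; i < max; i++)          /* collect the non-flag values in order */
        if (temp[i] != -1)
            list[j++] = temp[i];
    free(temp);
}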
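The program above relies on the caller to pass the right array size and on LIST_SIZE being fixed. As a hedged variation (not the lecture's code), the sketch below takes the number of records as a parameter, works out the largest key itself, and rejects keys that bucket sort cannot handle. The name bucket_sort_checked and its return convention are assumptions.

```c
#include <stdlib.h>

/* Sketch: bucket sort for `count` unique, non-negative integer keys.
   Returns 0 on success; returns -1 on a negative key, a duplicate key,
   or a failed allocation, leaving the list unchanged.
   Keys are assumed to be well below INT_MAX. */
int bucket_sort_checked(int list[], int count)
{
    int i, j = 0, max = 0, *temp;

    for (i = 0; i < count; i++) {      /* find the largest key, reject negatives */
        if (list[i] < 0)
            return -1;
        if (list[i] > max)
            max = list[i];
    }

    temp = malloc(((size_t)max + 1) * sizeof *temp);
    if (temp == NULL)
        return -1;
    for (i = 0; i <= max; i++)         /* fill with the flag value */
        temp[i] = -1;

    for (i = 0; i < count; i++) {      /* distribute */
        if (temp[list[i]] != -1) {     /* slot already taken: duplicate key */
            free(temp);
            return -1;
        }
        temp[list[i]] = list[i];
    }

    for (i = 0; i <= max; i++)         /* collect in index order */
        if (temp[i] != -1)
            list[j++] = temp[i];
    free(temp);
    return 0;
}
```

Calling bucket_sort_checked(list, 6) on the example list sorts it in place and returns 0. A list such as {3, 1, 3} is refused with -1, because the second 3 would otherwise silently overwrite the first.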

 

Limitations:

•       Bucket sort works only under very restrictive conditions.  These are:

        •       The key field must be a unique non-negative integer.  Strings, floats and negative integers cannot be used directly as array positions, and two records with the same key would overwrite each other in the temporary array.

        •       The range of values for the key field must be relatively small; otherwise, the temporary array will be too large.  E.g.  if the key field is a 6-digit ID number, then temp needs 1,000,000 elements (indices 0 to 999999) however few records are being sorted, which wastes memory and may not fit at all on a small machine.
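The memory cost in the second condition can be estimated with a one-line calculation. The helper below is a hypothetical illustration; the actual figure depends on sizeof(int), which varies between compilers.

```c
#include <stddef.h>

/* Bytes needed for the temporary array when the largest key is
   largest_key (assumed to be >= 0). */
size_t temp_array_bytes(long largest_key)
{
    return ((size_t)largest_key + 1) * sizeof(int);
}
```

For the 13-element example this is 13 * sizeof(int) bytes. For 6-digit ID numbers it is 1,000,000 * sizeof(int) bytes, which is about 4 MB where int is 4 bytes, regardless of how many records are in the list.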

 

Note:

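•       It might be possible to do bucket sort with negative values after doing some mapping.  E.g.  keys in the range -100 to 100 could be mapped to 0 to 200 by adding 100 to each key.

•       Similarly, float values might be converted to non-negative integers by multiplying them by a suitable constant (e.g. by 10 for values with one decimal place).

•       These mappings, however, add overhead: every key must be converted before sorting and converted back afterwards.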
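The two mappings in the notes above can be sketched as follows. Both functions are hypothetical helpers written for illustration: bucket_sort_range assumes unique integer keys that all lie between min and max, and scale_key assumes non-negative float values.

```c
#include <stdlib.h>

/* Sort `count` unique integer keys, each in [min, max], by mapping
   key k to position k - min.  Returns 0 on success, -1 if malloc fails. */
int bucket_sort_range(int list[], int count, int min, int max)
{
    int i, j = 0, size = max - min + 1;
    int *temp = malloc((size_t)size * sizeof *temp);

    if (temp == NULL)
        return -1;
    for (i = 0; i < size; i++)
        temp[i] = -1;                         /* mapped keys are >= 0, so -1 is still a safe flag */
    for (i = 0; i < count; i++)
        temp[list[i] - min] = list[i] - min;  /* store the mapped key */
    for (i = 0; i < size; i++)
        if (temp[i] != -1)
            list[j++] = temp[i] + min;        /* map back to the original key */
    free(temp);
    return 0;
}

/* Convert a non-negative float value to an integer key by scaling it,
   e.g. factor 10 for values with one decimal place.  Adding 0.5 before
   truncating rounds to the nearest integer. */
int scale_key(double value, int factor)
{
    return (int)(value * factor + 0.5);
}
```

With min = -100 and max = 100, the temporary array has 201 elements and the key -100 lands in position 0. The subtraction and addition of min on every key are the additional overhead mentioned above.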