Lecture 20:  Searching

 

Objectives of this lecture

q       Study two methods of searching: linear and binary

q       Learn the relative advantages and disadvantages of the two methods

 

What is Searching?

q       Searching means scanning through a list of items or records to find if a particular one exists.

q       It usually requires the user to specify the target item (or, in the case of a record, a target key such as id_no or name).

q       If the target item is found, the record or its location is returned; otherwise, an appropriate message or flag is returned.

q       An important issue in processing a search request is response time.  In addition to the specification of the computer, other factors affecting response time are: 

Ø      The size of the list; number of records, size of a record

Ø      The data structure used; array, linked-list, binary tree

Ø      The organization of data; random or ordered

Ø      The search method; linear or binary

Ø      The location of the structure; external (on disk) or internal (in memory)

q       In this lecture, we restrict our discussion to internal search methods for lists represented as arrays and as linked lists.

 

Linear Search

q       This involves searching through the list sequentially until the target item is found or the list is exhausted.

q       If the target is found, its location is returned; otherwise a flag such as -1 is returned.

q       To implement this strategy, we first make the following declarations:

 

#define MAX_SIZE 50  /*maximum number of items in the list*/

typedef long int KEY_TYPE;

typedef struct { KEY_TYPE key;  /*the key field*/

                                /* . . . other fields can be added here */

                        } ITEM_TYPE;

 

typedef struct { ITEM_TYPE item[MAX_SIZE];

                           int current_size;

                        } LIST_TYPE;

 

LIST_TYPE list;

q       Assuming data has been entered in the list, then the following implements the linear search method.

 

int lin_search1(LIST_TYPE *list, KEY_TYPE target)

{  int loc=0;

  

   while (loc < list->current_size &&

               list->item[loc].key != target)

        loc++;

 

   if (loc < list->current_size)  /*loop stopped before the end: target found*/

      return loc;

   else

      return -1;

}
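The array version of linear search can be tried out with a small self-contained sketch. The names `lin_search_sketch` and `fill_list` are hypothetical and not part of the lecture's code, and the item type is reduced to its key field for brevity. Note that the bounds test alone decides the result, so no element beyond `current_size - 1` is ever read.

```c
#define MAX_SIZE 50

typedef long int KEY_TYPE;
typedef struct { KEY_TYPE key; } ITEM_TYPE;   /* key only, for brevity */
typedef struct {
    ITEM_TYPE item[MAX_SIZE];
    int current_size;
} LIST_TYPE;

/* Hypothetical variant of the linear search: returns the index of the
   first item whose key equals target, or -1 if there is none. */
int lin_search_sketch(const LIST_TYPE *list, KEY_TYPE target)
{
    int loc;

    for (loc = 0; loc < list->current_size; loc++)
        if (list->item[loc].key == target)
            return loc;
    return -1;
}

/* Hypothetical helper: copies up to MAX_SIZE keys into the list. */
void fill_list(LIST_TYPE *list, const KEY_TYPE *keys, int n)
{
    int i;

    list->current_size = 0;
    for (i = 0; i < n && i < MAX_SIZE; i++)
        list->item[list->current_size++].key = keys[i];
}
```

After filling a list with the keys 42, 7, 19, 3, the call `lin_search_sketch(&list, 19)` returns 2, and a search for a key that is not stored returns -1.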

 

q       Now suppose the list is contained in a linked list which has the following declaration:

typedef long int KEY_TYPE;

typedef struct { KEY_TYPE key;  /*the key field*/

                                /* . . . other fields can be added here */

                        } ITEM_TYPE;

typedef struct node_type{  ITEM_TYPE item;

                                            struct node_type *next;

                                        } NODE_TYPE;

 

typedef NODE_TYPE *NODE_PTR;

typedef NODE_TYPE LIST_TYPE;

 

LIST_TYPE list;

 

q       Assuming data is entered into the list, then the following implements the linear search method using both iteration and recursion:

Iterative version:

NODE_PTR lin_search2(LIST_TYPE *list, KEY_TYPE target)

{  NODE_PTR loc=list;  /*list already points to the first node*/

 

   while (loc != NULL && loc->item.key != target)

        loc=loc->next;

 

   if (loc != NULL)  /*loop stopped on a node: target found*/

      return loc;

   else

      return NULL;

}

  

Recursive version:

NODE_PTR lin_search3(LIST_TYPE *list, KEY_TYPE target)

{  if (list == NULL)

       return NULL;

   else if (list->item.key == target)

       return list;

   else

       return (lin_search3(list->next, target));

}
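The linked-list search can be exercised with the following sketch. It is self-contained and makes some assumptions: the helpers `push_front`, `find_node` and `free_list` are hypothetical names, the item holds only its key, and the list is represented simply by a pointer to its first node (NULL for an empty list).

```c
#include <stdlib.h>

typedef long int KEY_TYPE;
typedef struct { KEY_TYPE key; } ITEM_TYPE;   /* key only, for brevity */
typedef struct node_type {
    ITEM_TYPE item;
    struct node_type *next;
} NODE_TYPE;
typedef NODE_TYPE *NODE_PTR;

/* Hypothetical helper: adds a node at the front and returns the new head.
   If allocation fails, the list is returned unchanged. */
NODE_PTR push_front(NODE_PTR head, KEY_TYPE key)
{
    NODE_PTR node = malloc(sizeof *node);

    if (node == NULL)
        return head;
    node->item.key = key;
    node->next = head;
    return node;
}

/* Iterative linear search. The loop ends with loc either NULL (not found)
   or pointing at the matching node, so loc itself is the result. */
NODE_PTR find_node(NODE_PTR head, KEY_TYPE target)
{
    NODE_PTR loc = head;

    while (loc != NULL && loc->item.key != target)
        loc = loc->next;
    return loc;
}

/* Hypothetical helper: releases every node of the list. */
void free_list(NODE_PTR head)
{
    while (head != NULL) {
        NODE_PTR next = head->next;
        free(head);
        head = next;
    }
}
```

Because `loc != NULL` is tested before `loc->item.key`, the search never dereferences a null pointer, even for an empty list.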

  

Binary Search

q       For a list of n elements, the linear search takes an average of about n/2 comparisons to find an item that is in the list, with the best case being 1 comparison and the worst case being n comparisons.
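The average figure can be checked with a short derivation. It assumes that the target is in the list and is equally likely to be at any of the n positions; finding the item at position i costs i comparisons.

```latex
\[
\bar{C}(n) \;=\; \frac{1}{n}\sum_{i=1}^{n} i
          \;=\; \frac{1}{n}\cdot\frac{n(n+1)}{2}
          \;=\; \frac{n+1}{2} \;\approx\; \frac{n}{2}
\]
```

If the target is not in the list, all n items are compared.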

q       However, if the list is ordered, it is a waste of time to look for an item using linear search (it would be like looking for a word in a dictionary by reading every entry from the first page).  In this case we apply binary search, which is more efficient.

q       Binary search works by comparing the target with the item at the middle of the list.  This leads to one of three results:

Ø      The middle item is the target – we are done.

Ø      The middle item is less than target – we apply the algorithm to the upper half of the list.

Ø      The middle item is greater than the target – we apply the algorithm to the lower half of the list.

q       This process is repeated until the item is found or the list is exhausted.
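The three cases above can be watched in action with a small tracing sketch. It works on a plain sorted array of `long` keys rather than on the lecture's list type, and `bin_search_trace` is a hypothetical name. The midpoint is written as `low + (high - low) / 2`, which gives the same value as `(low + high) / 2` but cannot overflow.

```c
#include <stdio.h>

/* Hypothetical tracing helper: searches the sorted array keys[0..n-1]
   for target and prints the range examined at each step.
   Returns the index of target, or -1 if it is not present. */
int bin_search_trace(const long keys[], int n, long target)
{
    int low = 0, high = n - 1;

    while (low <= high) {
        int middle = low + (high - low) / 2;

        printf("low=%d high=%d middle=%d key=%ld\n",
               low, high, middle, keys[middle]);

        if (keys[middle] == target)
            return middle;          /* case 1: found */
        else if (keys[middle] < target)
            low = middle + 1;       /* case 2: continue in the upper half */
        else
            high = middle - 1;      /* case 3: continue in the lower half */
    }
    return -1;                      /* range is empty: not found */
}
```

For the keys 2, 5, 8, 12, 16, 23, 38 and target 23, the trace shows two steps: the middle item 12 (index 3) is less than the target, so the search moves to indexes 4 to 6, where the middle item 23 (index 5) is the target.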

q       The following functions implement this approach using both iteration and recursion.

 

Iterative Method:

int bin_search1(LIST_TYPE *list, KEY_TYPE target,

                           int low, int high)

{  int middle;

    while (low <= high)

   {  middle = (low + high)/2;

       if (list->item[middle].key == target)

           return (middle);

       else if (list->item[middle].key < target)

           low = middle + 1;

       else

           high = middle - 1;

   }

   return (-1);

}

 

Recursive Method:

int bin_search2(LIST_TYPE *list, KEY_TYPE target,

                int low, int high)

{  int middle;

  

   if (low > high) /*base case1:target not found*/

      return -1;

   

   middle = (low + high)/2;

   if (list->item[middle].key == target)

      return (middle);   /*base case2:target found*/

   else if (list->item[middle].key < target)

      return(bin_search2(list, target, middle+1,high));

   else

      return(bin_search2(list, target, low, middle-1));

}
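Both binary search functions expect the caller to supply the initial bounds. One way to do this, sketched below under the assumption that the whole list is to be searched, is a small wrapper that passes 0 and `current_size - 1`. The names `bin_search_range` and `bin_search_whole` are hypothetical, and the item type is reduced to its key field.

```c
#define MAX_SIZE 50

typedef long int KEY_TYPE;
typedef struct { KEY_TYPE key; } ITEM_TYPE;   /* key only, for brevity */
typedef struct {
    ITEM_TYPE item[MAX_SIZE];
    int current_size;
} LIST_TYPE;

/* Recursive binary search over item[low..high]; the keys must be in
   ascending order. Returns the index of target, or -1 if absent. */
static int bin_search_range(const LIST_TYPE *list, KEY_TYPE target,
                            int low, int high)
{
    int middle;

    if (low > high)                 /* base case 1: empty range */
        return -1;

    middle = (low + high) / 2;
    if (list->item[middle].key == target)
        return middle;              /* base case 2: found */
    else if (list->item[middle].key < target)
        return bin_search_range(list, target, middle + 1, high);
    else
        return bin_search_range(list, target, low, middle - 1);
}

/* Hypothetical wrapper: searches the whole list, so callers do not
   have to work out the initial bounds themselves. */
int bin_search_whole(const LIST_TYPE *list, KEY_TYPE target)
{
    return bin_search_range(list, target, 0, list->current_size - 1);
}
```

For an empty list the wrapper passes `high = -1`, so the first base case applies immediately and -1 is returned.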

 

Notes:

q       Binary search is far more efficient than linear search: the number of comparisons required is, on average, about log2(n).

q       Thus, for a list of 1000 items, binary search requires only about log2(1000) ≈ 10 comparisons, whereas linear search requires about 1000/2 = 500 on average.  The difference gets more dramatic with larger lists.
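These figures can be checked by counting how many items each method examines. The sketch below is only an illustration: `linear_probes` and `binary_probes` are hypothetical names, they work on a plain sorted array of `long` keys, and each examined item is counted as one comparison.

```c
/* Hypothetical counter: number of items a linear search examines
   when looking for target in keys[0..n-1]. */
int linear_probes(const long keys[], int n, long target)
{
    int i, probes = 0;

    for (i = 0; i < n; i++) {
        probes++;
        if (keys[i] == target)
            break;
    }
    return probes;
}

/* Hypothetical counter: number of middle items a binary search examines
   when looking for target in the sorted array keys[0..n-1]. */
int binary_probes(const long keys[], int n, long target)
{
    int low = 0, high = n - 1, probes = 0;

    while (low <= high) {
        int middle = low + (high - low) / 2;

        probes++;
        if (keys[middle] == target)
            break;
        else if (keys[middle] < target)
            low = middle + 1;
        else
            high = middle - 1;
    }
    return probes;
}
```

With the 1000 keys 0, 1, ..., 999, searching for every key in turn gives a linear-search total of 500500 items examined (an average of 500.5), while no binary search examines more than 10 items.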

q       Why then do we use linear search?

Ø      Binary search is only applicable to ordered data.  Thus, there is an additional overhead of sorting which can be very significant.

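Ø      Binary search cannot be applied efficiently to a linear linked list, because the middle item cannot be reached without following the links from the front.  It needs a different structure, called a binary tree, which is beyond the scope of this course.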
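The sorting overhead mentioned above can be made concrete with the C standard library, which provides `qsort` for sorting and `bsearch` for binary search in `<stdlib.h>`. The sketch below uses these library routines in place of the lecture's own functions; `compare_keys` and `sort_then_search` are hypothetical names, and the data is a plain array of `long` keys.

```c
#include <stdlib.h>

/* Comparison function in the form qsort and bsearch expect:
   negative, zero or positive for less than, equal, greater than. */
static int compare_keys(const void *a, const void *b)
{
    long x = *(const long *)a;
    long y = *(const long *)b;

    return (x > y) - (x < y);   /* avoids the overflow risk of x - y */
}

/* Hypothetical helper: sorts keys[0..n-1] in place (the extra cost that
   linear search does not pay), then looks for target by binary search.
   Returns the index of target in the sorted array, or -1 if absent. */
int sort_then_search(long keys[], size_t n, long target)
{
    long *hit;

    qsort(keys, n, sizeof keys[0], compare_keys);
    hit = bsearch(&target, keys, n, sizeof keys[0], compare_keys);
    return hit == NULL ? -1 : (int)(hit - keys);
}
```

Sorting on every search would make little sense in practice. The cost pays off when the list is sorted once and then searched many times.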