Tuesday, 30 July 2013

Sorting And Searching

                                      Introduction

In my next few blogs, I'll discuss sorting and searching in detail. Sorting is the process of putting a group of items in a specific order (increasing or decreasing, according to some linear relationship among the elements), whereas searching refers to finding an item with specified properties among a collection of items. The following figure shows the topics that I'll discuss one after another.


                                                  Linear Search

It is a very straightforward method: we compare each element of the array with the key. If the element is found, we report its location in the array; otherwise we report that the element is not present in the list.
The worst-case (the element is at the end of the array, or not present at all) and average-case complexity of linear search are proportional to the number of elements in the list, but the best-case complexity is O(1) (the key is at the beginning of the array). The algorithm is best suited to arrays with a small number of elements. The advantages of linear search are:
  1. It is easy to implement and understand.
  2. Unlike binary and ternary search, it imposes no additional requirements (the array need not be sorted).
The pseudo code for the algorithm is:
                         For each element in the Array:
                                    if the item has the desired value,
                                                 stop the search and return the location.

I have already shown you code for linear search in my previous blog.
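For convenience, here is a minimal C++ sketch of the same idea (the array and keys in main are just illustrative):

#include <iostream>

// Returns the index of key in A[0..n-1], or -1 if it is absent.
int linearSearch(const int A[], int n, int key) {
    for (int i = 0; i < n; ++i)
        if (A[i] == key)   // found: stop the search and return the location
            return i;
    return -1;             // the key is not present in the list
}

int main() {
    int A[] = {7, 3, 9, 1, 5};
    std::cout << linearSearch(A, 5, 9) << '\n';  // prints: 2
    std::cout << linearSearch(A, 5, 4) << '\n';  // prints: -1
    return 0;
}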

                                               Elementary Sorts

Here I will discuss three sorting algorithms. They are:
  1. Bubble Sort or Sink Sort
  2. Selection Sort
  3. Insertion Sort

                                                    Bubble Sort or Sink Sort

   This technique is called bubble sort or sinking sort because the smaller values gradually "bubble" their way up to the top of the array like air bubbles rising in water, while the larger values sink to the bottom of the array.
Pseudo code for the algorithm is shown below:
                      procedure BubbleSort(A : Array of sortable items)
                       For i = 0 to length(A) - 2 :
                              For j = i+1 to length(A) - 1 :
                                      if A[i] > A[j] :
                                           swap(A[i], A[j])
                                      end if
                              end For
                      end For

Complexity Analysis:
In the worst, average and best cases alike, j runs from i+1 to the end of the array; hence the complexity in all three cases is clearly O(N^2).
Explanation:
for i = 0, j runs (N-1) times
for i = 1, j runs (N-2) times
for i = 2, j runs (N-3) times
......................................
for i = N-2, j runs 1 time
Hence total cost = (N-1) + (N-2) + ..... + 2 + 1
                 = (N-1)*(N-1+1)/2
                 = N(N-1)/2
                 = O(N^2)
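As a quick check of this count, here is a tiny C++ sketch of my own (N is chosen arbitrarily) that counts how many times the inner comparison runs and compares it with N(N-1)/2:

#include <iostream>

int main() {
    const int N = 5;
    int count = 0;
    for (int i = 0; i <= N - 2; ++i)        // same loop bounds as the pseudocode
        for (int j = i + 1; j <= N - 1; ++j)
            ++count;                        // one comparison per inner iteration
    std::cout << count << " == " << N * (N - 1) / 2 << '\n';  // prints: 10 == 10
    return 0;
}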
An Example:
Now let's see how the algorithm works on a five-element array A (indices 0 to 4). The algorithm makes four passes:
 1st pass (i = 0, j = 1 to 4): after this pass, A[0] holds the smallest element.
 2nd pass (i = 1, j = 2 to 4): after this pass, A[1] holds the second smallest element.
 3rd pass (i = 2, j = 3 to 4): after this pass, A[2] holds the third smallest element.
 4th pass (i = 3, j = 4 to 4): the last two entries are put in order.
The array is sorted after the algorithm completes.

Here is a C++ version of the above algorithm (a minimal sketch following the pseudocode; the array in main is just for illustration):
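#include <iostream>
#include <utility>

// Sorts A[0..n-1] in ascending order using the exchange variant of
// bubble sort described in the pseudocode above.
void bubbleSort(int A[], int n) {
    for (int i = 0; i <= n - 2; ++i)          // For i = 0 to length(A) - 2
        for (int j = i + 1; j <= n - 1; ++j)  // For j = i+1 to length(A) - 1
            if (A[i] > A[j])
                std::swap(A[i], A[j]);
}

int main() {
    int A[] = {5, 1, 4, 2, 8};                // illustrative input
    int n = sizeof(A) / sizeof(A[0]);
    bubbleSort(A, n);
    for (int i = 0; i < n; ++i)
        std::cout << A[i] << ' ';
    std::cout << '\n';                        // prints: 1 2 4 5 8
    return 0;
}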

A sample run on the array {5, 1, 4, 2, 8} from main would print: 1 2 4 5 8

That's all about Bubble Sort. In the next blog, I'll explain the two other elementary sorts. If you have any doubt, drop me an email at ernstbinoy@gmail.com.









Sunday, 14 July 2013

How To Solve Recurrence Relations

                                   Introduction

A recurrence relation can be defined in the following way:
1. T(0) or T(1) = c1;
2. T(n) = c2 + T(n-1); where
T(0) or T(1) = cost to solve a problem of size 0 or 1 respectively,
T(n) = cost to solve a problem of size n,
T(n-1) = cost to solve a problem of size n-1,
and c1 and c2 are constants.
Here line 1 describes the base step: it gives the cost of the problem for the trivial case(s), whereas line 2 describes the cost of the problem for input size n in terms of input size n-1.
When an algorithm contains a recursive call to itself, its running time can often be described by a recurrence relation.
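For instance, here is a small sketch of my own (not tied to any particular algorithm above): a recursive sum over n elements whose running time follows exactly this pattern:

#include <iostream>

// Recursive sum of A[0..n-1]. Its running time satisfies
// T(1) = c1 (the trivial case) and T(n) = c2 + T(n-1), so T(n) = O(n).
int sum(const int A[], int n) {
    if (n == 1) return A[0];          // base step, constant cost c1
    return sum(A, n - 1) + A[n - 1];  // constant work c2 plus T(n-1)
}

int main() {
    int A[] = {1, 2, 3, 4};
    std::cout << sum(A, 4) << '\n';   // prints: 10
    return 0;
}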
Solving a recurrence relation means eliminating the recursion from the function definition and finding a closed form of the function. There are several techniques for solving recurrence relations:
  • Substitution Method
  • Iterative Method or Recursion Tree Method
  • Master Method
Here I'll discuss these one by one.

                            Substitution Method

It is one of the most powerful methods of solving recurrence relations. This method comprises two steps:

  1. Guess the form of the solution.
  2. Use mathematical induction to establish the correctness of the guess.
The name Substitution Method comes from the fact that we actually substitute the guessed solution for the function when applying the inductive hypothesis to smaller values (you will see this clearly when I show examples). The key idea of this method lies in guessing the form of the solution. If the initial guess is wrong, it will need to be corrected later. One thing I want to mention is that there is no predefined rule for making the initial guess, but one can draw on experience while guessing.
Let's see how it works with examples.
 T(n) = 1, if n = 1
 otherwise T(n) = 2*T(floor(n/2)) + n
=> Guess T(n) = O(n*log(n)). Next we will prove it by induction.
To show the upper bound, we will try to find two constants c and n(0) such that T(n) <= c*n*log(n) for all n >= n(0). First we work out the values for n = 2 and n = 3 (small n).
Base Case:

  1. T(2) = 2*T(1) + 2 = 4
  2. T(3) = 2*T(1) + 3 = 5

Our aim is to establish the result T(n) <= c*n*log(n), so now we focus on finding a value of c, and one can easily check that c = 2 works for n = 2 and n = 3. But we need induction to establish the guess for all n.

Inductive Step: Let's assume that the guess is true for all n = 2, 3, ..., K.
For n = K+1, we have T(n) = 2*T(floor(n/2)) + n
                           <= 2*c*floor(n/2)*log(floor(n/2)) + n
                           <= c*n*log(n/2) + n
                           = c*n*log(n) - c*n + n
                           <= c*n*log(n)   (since c >= 1)
Hence the inductive step holds. The complexity of the above recurrence relation is O(n*log(n))
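To build extra confidence in the guess, here is a small numeric check of my own (c = 2 is the constant found above; the limit N is arbitrary) that computes T(n) exactly and tests the bound T(n) <= 2*n*log2(n):

#include <cmath>
#include <iostream>
#include <vector>

int main() {
    const int N = 1000;
    std::vector<long long> T(N + 1);
    T[1] = 1;                                // T(1) = 1
    bool ok = true;
    for (int n = 2; n <= N; ++n) {
        T[n] = 2 * T[n / 2] + n;             // T(n) = 2*T(floor(n/2)) + n
        if (T[n] > 2.0 * n * std::log2(n))   // check the guessed bound
            ok = false;
    }
    std::cout << (ok ? "bound holds for all n up to 1000" : "bound violated") << '\n';
    return 0;
}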

Be careful: errors can creep in when using asymptotic notation. For example, in the previous question, suppose one tries to establish T(n) = O(n) by taking the initial guess T(n) <= c*n and applying it to the recurrence in the following manner:
T(n) <= 2*(c*floor(n/2)) + n
      <= c*n + n
      = (c+1)*n
      = "O(n)"
But this is not correct, because (c+1)*n is not the exact form of the inductive hypothesis; the required form is T(n) <= c*n, with the same constant c. So we would have to prove exactly that form to establish that the complexity is O(n), and here we cannot.
Next I will show you that O(n^2) is also a solution of this recurrence relation (because if F(n) = n^2 and G(n) = n*log(n), then G(n) = O(F(n)); follow the previous blog!).
Guess T(n) <= c*n^2.
Applying it in the recurrence relation we get T(n) <= 2*c*(floor(n/2))^2 + n
                                                   <= (c*n^2)/2 + n
                                                   <= c*n^2   (for c >= 2)
Hence the complexity is O(n^2).
Try to prove that T(n) != O(log(n)) by taking the initial guess T(n) <= c*log(n).
Also try to solve T(n) = T(n^(1/2)) + 1 (hint: substitute n = 2^m).

            Iterative Or Recursion Tree Method

The basic idea of this method is to expand the terms and then try to reach the base case. Generally we avoid floors and ceilings here. I'm not going into a formal definition, since you all know the concept of iteration; just look at the examples I'm explaining and you can definitely follow them.
Ex1. Let T(n) = T(n/2) + c, T(1) = 1 be the given recurrence relation. Here we repeatedly substitute for the recursive term on the R.H.S. using the definition of the expression on the L.H.S.
            So, T(n) = T(n/2) + c can be written as
                     = [ T((n/2)/2) + c ] + c
                     = T(n/(2^2)) + c + c
                     = [ T((n/(2^2))/2) + c ] + c + c
                     = T(n/(2^3)) + c + c + c
                     = .................................
                     = T(n/(2^k)) + c + c + c ... k times        -- eq(1)
Now, let's take 2^k = n, i.e.
                     k = log(n). Substituting in eq(1), we get
                T(n) = T(1) + k*c
                     = 1 + c*log(n)
                     = O(log(n))
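For powers of two, the closed form can also be checked directly with a short sketch of my own (c is chosen arbitrarily here):

#include <iostream>

// Evaluates T(n) = T(n/2) + c with T(1) = 1 for n = 2, 4, ..., 1024
// and prints it next to the closed form 1 + c*log2(n) derived above.
int main() {
    const long long c = 3;      // arbitrary constant cost per level
    long long T = 1;            // T(1) = 1
    int log_n = 0;
    for (long long n = 2; n <= 1024; n *= 2) {
        T += c;                 // one more level of the recursion
        ++log_n;                // log2(n)
        std::cout << "n = " << n << ": T(n) = " << T
                  << ", 1 + c*log2(n) = " << 1 + c * log_n << '\n';
    }
    return 0;
}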
(Figure: recursion tree for T(n) = T(n-1) + 1.)
Another example: T(n) = n^2 + 2*T(n/2)
                       = n^2 + 2*[ (n/2)^2 + 2*T(n/(2^2)) ]
                       = n^2 + 2*(n^2)/4 + 4*T(n/4)
                       = n^2 + 2*(n^2)/4 + 4*(n^2)/16 + 8*T(n/8)
                       = ................................................
                       = Σ[k=0 to log(n)-1] ( n^2*(1/2)^k ) + n*T(1)
Since the geometric series sums to less than 2, the total is at most 2*n^2 + n; hence the complexity is O(n^2).
(Figure: recursion tree for T(n) = n^2 + 2*T(n/2).)

Try to solve the previous question using this technique; it's quite simple. Also try to solve this one:
T(n) = 2*T(n^(1/2)) + log(n), T(1) = 1.

                                     Master Method

This is a faster method for solving recurrence relations. Here we express the inductive step of the recurrence relation as T(n) = a*T(n/b) + f(n), where a >= 1, b > 1 and f(n) is some asymptotically positive function.
The master theorem has the following three cases (writing log(a)/log(b) for the logarithm of a to base b):

  1. If f(n) = O(n^(log(a)/log(b) - e)) for some constant e > 0, then T(n) = Θ(n^(log(a)/log(b))).
  2. If f(n) = Θ(n^(log(a)/log(b))), then T(n) = Θ(n^(log(a)/log(b)) * log(n)).
  3. If f(n) = Ω(n^(log(a)/log(b) + e)) for some constant e > 0, and a*f(n/b) <= c*f(n) for some constant c < 1 and all sufficiently large n (the regularity condition), then T(n) = Θ(f(n)).

The basic idea of the theorem is that it finds the larger of the two functions. You can easily see that in case 1, n^(log(a)/log(b)) is the larger, and in case 3, f(n) is the larger. In case 2 both have the same size, so we multiply by a logarithmic factor; hence the solution is Θ(f(n)*log(n)).
Now we will solve some problem.
Example 1: T(n) = 9*T(n/3) + n^2
=> a = 9, b = 3, log(a)/log(b) = 2, hence n^(log(a)/log(b)) = n^2. So it matches the second case; hence the complexity is Θ(n^2*log(n)).
Example 2: T(n) = 3*T(n/4) + n*log(n)
=> Here log(a)/log(b) = log(3)/log(4) < 1 and f(n) = n*log(n) = Ω(n^(log(3)/log(4) + e)); the regularity condition also holds, so it clearly falls under case 3. Hence the complexity is Θ(n*log(n)).
Example 3: T(n) = 2*T(n/2) + n/log(n)
=> The master theorem is not applicable here: f(n) = n/log(n) is smaller than n^(log(a)/log(b)) = n, but only by a logarithmic factor, i.e. the difference between the two terms is non-polynomial.
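As a rough aid for picking the case, here is a sketch of my own that handles only polynomial driving functions f(n) = n^d (for which case 3's regularity condition always holds), comparing d with log(a)/log(b):

#include <cmath>
#include <iostream>

// Classifies T(n) = a*T(n/b) + n^d by the master theorem.
void masterCase(double a, double b, double d) {
    const double e = std::log(a) / std::log(b);  // the critical exponent
    const double eps = 1e-9;                     // tolerance for double comparison
    if (d < e - eps)
        std::cout << "case 1: Theta(n^" << e << ")\n";
    else if (d > e + eps)
        std::cout << "case 3: Theta(n^" << d << ")\n";
    else
        std::cout << "case 2: Theta(n^" << e << " * log n)\n";
}

int main() {
    masterCase(9, 3, 2);  // T(n) = 9*T(n/3) + n^2 -> case 2: Theta(n^2 * log n)
    masterCase(8, 2, 2);  // T(n) = 8*T(n/2) + n^2 -> case 1: Theta(n^3)
    masterCase(2, 2, 2);  // T(n) = 2*T(n/2) + n^2 -> case 3: Theta(n^2)
    return 0;
}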


So that's all about techniques for solving recurrence relations. If you have any doubt regarding this article, drop me a mail at ernstbinoy@gmail.com. You can also follow these links:

  1. http://net.pku.edu.cn/~course/cs101/2007/resource/Intro2Algorithm/IntroductionToAlgorithm_v2_en.pdf
  2. http://www.csd.uwo.ca/~moreno//CS424/Ressources/master.pdf
  3. http://cs.anu.edu.au/student/comp3600/lec_print/alg_lec05_recurrences_print.pdf
  4. http://www.cs.nthu.edu.tw/~wkhon/algo09/lectures/lecture3.pdf
  5. http://www.myedupages.com/all-topics/165-fundamentals-of-algorithms/merge-sort/381-the-iteration-method-for-solving-recurrence-relations