Thursday, August 19, 2010

Sorting Algorithm: Bubble Sort

The basic algorithms traditionally emphasized are search and sort. There are countless algorithms, well-known or obscure, ultimate or intermediate or preliminary, with different complexities and problem spaces, but searching and sorting over a set of data are the entry point to this world of algorithms. These basic steps not only give us a solution but also teach us a way of thinking about bigger problems. In real-life scenarios one rarely gets a problem where one has to sort a set of data/numbers as an isolated question. Even when one does, every language already provides tested and efficient libraries for it, and I suggest one use them rather than re-invent the wheel (with possible glitches). Nonetheless, learning such algorithms is useful for building an approach towards solving problems.

So let us talk about sorting. To sort is to rearrange data in a monotonically increasing or decreasing order; in simple words, arrange the given numbers (~data) in increasing or decreasing order. So what is the first requisite for a data set to be sortable? The data set should be of a single type: one cannot compare two different kinds of data (unless some correlation is already established).

The second prerequisite is that any two data points or elements should be comparable. Can you compare two persons? No. You can compare their heights, weights or some other mathematical index that represents them. So essentially, whenever you sort a dataset, you first look for the comparable mathematical index (or indices) and the comparison function (say f(i1, i2, ..., ik)).
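To illustrate, here is a sketch of such a comparison function in Java. The Person class and its heightCm field are my own invention for this example; the comparison function f is supplied as a java.util.Comparator:

```java
import java.util.Arrays;
import java.util.Comparator;

public class PersonSort {

    // A hypothetical Person: not comparable by itself, but comparable
    // through a numeric index that represents it (here, height in cm).
    static class Person {
        String name;
        int heightCm;
        Person(String name, int heightCm) {
            this.name = name;
            this.heightCm = heightCm;
        }
    }

    public static void main(String[] args) {
        Person[] people = {
            new Person("Amit", 180),
            new Person("Ravi", 165),
            new Person("Sunil", 172)
        };
        // The comparison function f(p1, p2): compare two persons by height.
        Arrays.sort(people, new Comparator<Person>() {
            public int compare(Person a, Person b) {
                return a.heightCm - b.heightCm;
            }
        });
        for (Person p : people) {
            System.out.println(p.name + " " + p.heightCm);
        }
    }
}
```

Once such an index and comparison function are chosen, any sorting algorithm can work on the dataset without knowing anything else about the underlying objects.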

Now we are ready to sort. I will show you an algorithm called bubble sort.

The idea behind bubble sort is to let the lightest element settle at the top (like a bubble) or the heaviest settle at the bottom (as in sedimentation).
Logically it goes like this:
We have to bubble up the smallest element. How does a bubble come up? Suppose the lightest layer (the smallest element) is at the bottom. It compares itself to the layer just above. If it is lighter than that layer, it goes up and the upper layer comes down; this is called a 'swap'. It keeps moving up until it meets a layer lighter than itself. In every case the topmost layer ends up being the lightest. Now skim off the topmost layer and let the setup rearrange again: the lightest of the remaining elements comes to the top. So one by one we go on collecting the smallest layers, or bubbles, and finally we have a sorted array.

I will not write out the algorithm in pseudocode; instead, here is working code in Java.

A Utility class:

package kumar.test.sort;

import java.util.Random;

public class Utility {

    static int[] array = new int[10];
    static Random random = new Random();

    // Fills the shared array with 10 random numbers in [0, 100).
    public static int[] getArray() {
        for (int i = 0; i < 10; i++) {
            array[i] = random.nextInt(100);
        }
        return array;
    }

    // Fills the shared array with 10 random numbers in [0, 10).
    public static int[] getArrayTen() {
        for (int i = 0; i < 10; i++) {
            array[i] = random.nextInt(10);
        }
        return array;
    }

    public static void setArray(int[] array) {
        Utility.array = array;
    }

    // Exchanges the elements at positions i and j.
    public static void swap(int[] array, int i, int j) {
        int temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }

    public static void display(int[] array2) {
        for (int i : array2) {
            System.out.print("   " + i);
        }
        System.out.println();
    }
}

The Bubble Sort Class:
package kumar.test.sort;

public class BubbleSort {

    public static void main(String[] args) {
        int[] array = Utility.getArray();
        System.out.println("before sort");
        Utility.display(array);
        bubbleSort(array);
        System.out.println("after sort");
        Utility.display(array);
    }

    public static void bubbleSort(int[] array) {
        for (int i = 0; i < array.length; i++) {
            // Bubble the largest remaining element to position i.
            for (int j = array.length - 1; j > i; j--) {
                if (array[j] > array[j - 1]) {
                    Utility.swap(array, j, j - 1);
                }
            }
            System.out.println("After cycle " + (i + 1));
            Utility.display(array);
        }
    }
}

Sample Output:
before sort

92 64 4 28 89 41 58 80 37 54


After cycle 1

92 89 64 4 28 80 41 58 54 37

After cycle 2

92 89 80 64 4 28 58 41 54 37

After cycle 3

92 89 80 64 58 4 28 54 41 37

After cycle 4

92 89 80 64 58 54 4 28 41 37

After cycle 5

92 89 80 64 58 54 41 4 28 37

After cycle 6

92 89 80 64 58 54 41 37 4 28

After cycle 7

92 89 80 64 58 54 41 37 28 4

After cycle 8

92 89 80 64 58 54 41 37 28 4

After cycle 9

92 89 80 64 58 54 41 37 28 4

After cycle 10

92 89 80 64 58 54 41 37 28 4


after sort

92 89 80 64 58 54 41 37 28 4

The sample provided here is the inverse of bubble sort, i.e. a 'sedimentation sort': in every cycle I let the heaviest (largest) value settle down first, so the array ends up in decreasing order. You can try writing the original bubble sort following this example.
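If you want to check your attempt, here is one possible ascending version (my own sketch, not part of the post above). The only change needed is flipping the comparison, so the smallest value bubbles to the front instead of the largest:

```java
public class BubbleSortAscending {

    // Ascending bubble sort: '<' instead of '>' in the comparison,
    // so the smallest remaining element bubbles to position i.
    public static void bubbleSort(int[] array) {
        for (int i = 0; i < array.length; i++) {
            for (int j = array.length - 1; j > i; j--) {
                if (array[j] < array[j - 1]) {
                    int temp = array[j];
                    array[j] = array[j - 1];
                    array[j - 1] = temp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] data = {92, 64, 4, 28, 89};
        bubbleSort(data);
        for (int i : data) {
            System.out.print("   " + i);   // prints: 4 28 64 89 92
        }
        System.out.println();
    }
}
```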

Wednesday, August 18, 2010

What is a good Algorithm?

Measuring solutions is made to look like rocket science, while in most cases it is mere common sense. Any algorithm handbook has multiple pages on this topic, but there is a one-line gist of all the yardsticks. The rule of thumb for measuring an algorithm is:

"A good algorithm is one that saves maximum resources."

The catch in this statement is the word 'resources'. A resource can be anything that the user of the solution values. In most cases it is simply the thing that affects his/her profits the most. For example:

  • You have to build a search engine (like Google) where people can type words in a box and hit the search button. Here the 'response time' of every search is the most critical resource.

  • You have a mobile application that records video in some xyz format. Since mobiles usually have limited memory capacities, your storage limit is the most critical resource here.

  • Your server has gone down and you have to provide an immediate solution to restart it. The fastest-to-implement algorithm is the best one, because implementation time is the critical resource here.

So we can agree that there is no single fixed measure of the quality of an algorithm, and here we need to fall back upon the word 'optimize'. Generally, optimization is done in terms of time taken and space taken. Geeks call these time and space complexity.

In most scenarios, the complexity of an algorithm depends upon the size of the data set being acted upon. It is logical to assume that searching among 10000 objects will take more (or equal) time than searching among 10 objects using the same method. There are many ways to predict the complexity of an algorithm. Two of the most famous are:

  • Worst case complexity: for the worst set of data possible

  • Average case complexity: for random sets of data averaged together

There are amortized analysis and best case analysis as well, but they apply to specific cases and are not as famous as the two above.

Before I say anything more about complexity, let us visit another piece of basic common sense. If there are multiple pipelines, one after another, delivering some liquid, then the maximum speed of delivery can never be greater than that of the slowest pipeline. We have read a similar theory in chemical kinetics, where the rate of a sequential reaction equals the rate of the slowest step in the reaction set.

Similar logic extends to algorithmic complexity. If a repeating 'set of steps' is involved in reaching the final solution, then the slowest step of that set approximates the cost of one repetition, and the number of repetitions of that step gives an idea of the complexity of the whole algorithm.

Let us take an example. Suppose we have 'n' numbers and we have to locate a particular number 'k' in this set of 'n' numbers. What will we do:

  1.  take a counter i = 1

  2.  visit the number at position i in the 'n' set

  3.  compare the number at i with 'k'; if they are the same, show the result as i

  4.  increment the counter by one.

  5.  repeat steps 2 to 4 till the counter is greater than n.

  6.  show that the result was not found.

As one can easily guess, the slowest step here is step 3, where one has to compare the two numbers.
So the cost of one pass through the repeating steps (2 to 4) is approximately the cost of step 3.
And the complexity of the whole algorithm is then expressed as the total number of such repetitions of the slowest step.
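The steps above can be sketched in runnable Java. The comparison counter is my own addition, there only to make the repetition count of the slowest step visible:

```java
public class LinearSearch {

    static int comparisons = 0;  // counts executions of the slowest step (step 3)

    // Returns the index of k in numbers, or -1 if not found.
    public static int search(int[] numbers, int k) {
        comparisons = 0;
        for (int i = 0; i < numbers.length; i++) {  // steps 1, 4 and 5
            comparisons++;
            if (numbers[i] == k) {                  // step 3: the slowest step
                return i;                           // result found at i
            }
        }
        return -1;                                  // step 6: not found
    }

    public static void main(String[] args) {
        int[] numbers = {92, 64, 4, 28, 89, 41, 58, 80, 37, 54};

        System.out.println("index: " + search(numbers, 41));    // index: 5
        System.out.println("comparisons: " + comparisons);      // comparisons: 6

        search(numbers, 999);   // worst case: 'k' is absent
        System.out.println("comparisons: " + comparisons);      // comparisons: 10
    }
}
```

For an array of n numbers the comparison runs at most n times, which is exactly the worst-case complexity of this search; on random data the average sits around n/2 repetitions.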

Details on complexity, notations, computability and bounds can easily be found scattered over the Internet. But the key is: do not let the mathematics dilute the common sense involved. Mathematics is the purest form of expressing logic, but not of thought.

Before I wind up this discourse, a look at what I consider a good way to devise an algorithm:

  • First ask what the desired result is. A working software with a bad algorithm is infinitely better than a useless software with a very accurate algorithm that was intended to do something else.

  • Prioritize your critical resources and design the algorithm accordingly.

  • You can have less implementation time, less response time, less space required, or less manpower required. Pick any three.

  • Code first, optimize later.

Final words on this are :
"A complex algorithm is a bad algorithm."
--Remember what complex means to us :)