Wednesday, August 18, 2010

What is a good Algorithm?

Measuring solutions are made to look like rocket science while they are mere common sense in most of the cases. Any algorithm handbook has multiple pages on this topic , but there is an one line gist for all the yardsticks. The thumb rule for measuring an algorithm is

"A good algorithm is one that saves maximum resources."

The catch point in this statement is 'resources'. Resources can be anything that the user of solution values. In most of the cases it is just the thing that affects his/her profits the most. For example:

  • You have to make a search engine (like google) where people can type words in a box and hit the search button. Here the 'response time' for every search is the most critical resource.

  • You have a mobile application that records video in some xyz format. Since mobiles usually have limited memory capacities, so here your storage limit is most critical resource.

  • You have server gone down and you have to provide an immediate solution to restart it. The fastest implementable algorithm is best one because implementation time is the critical resource here.

So we can agree that there is not one fixed measure of accuracy of algorithm and here we need to fall back upon the word 'optimize'. General optimization is done in terms of time taken and space taken. Geeks call these time and space complexity.

In most of the scenarios, the complexity of the algorithm is dependent upon the size of data-set being acted upon. It is logical to assume that searching from 10000 objects will take more (or equal) time than searching from 10 objects using the same method. There have been many ways to correctly predict the complexity of an algorithm. Two of the most famous are:

  • Worst case complexity.: for worst set of data possible

  • Average case complexity.: for random sets of data averaged together

There are amortized analysis and best case analysis as well, but they are specific to certain cases and are not very famous as the upper two.

Before I say anything more about complexity, let us visit another basic common-sense. If there are multiple pipelines one after another to deliver some liquid, then the maximum speed of delivery can never be greater than that of the slowest pipeline. We have read such a theory in chemical kinetics as well where the rate of a sequential reaction is equal to the slowest step in the reaction set.

Similar logic extends to the algorithmic complexity. If there are a repeating 'set of steps' involved in reaching to final solution using an algorithm then the slowest step of that 'set of steps' approximates the complexity of that step. And number of repetitions of that step gives an idea about the complexity of the algorithm.

Let us take an example. Suppose we have 'n' numbers and we have to locate a particular number 'k' in this 'n' number set. What will we do :

  1.  take a counter i = 1

  2.  visit number i in 'n' set

  3.  compare number at i with 'k'. if they are same show result as i

  4.  increment counter by one.

  5.  repeat step 2 to 4 till counter is greater than n.

  6. show that the result was not found.

As one can easily guess what is the slowest step here: step 3 where one has to compare the two numbers.
So the approximate measure of the repeating steps (2 to 4) is the speed of step 3.
And the complexity of whole algorithm is now expressed in terms of number of sum of all such independent repetitions for similar slowest steps.

Details on complexity, Notations, computability and bounds can be easily found scattered on Internet. But the clue is do not let mathematics dilute the essence of common sense involved. Mathematics is the purest form of expressing logic but not thought.

Before I wind up this discourse a look on what I think as a good way to devise an algorithm:

  • First ask what is the desired result. A working software with a bad algorithm is infinitely better than a useless software with very accurate algorithm which was intended to do something else.

  • Prioritize your critical resources and design algorithm according to that.

  • You can have less implementation time, less response time, less space required, less man power required. Pick any three.

  • Code first optimize later. 

Final words on this are :
"A complex algorithm is a bad algorithm."
--Remember what complex means to us :)

No comments:

Post a Comment