

|
In mathematics and computer science an algorithm is a finite set of well-defined instructions for accomplishing some task which, given an initial state, will terminate in a corresponding recognizable end-state (contrast with heuristic). Algorithms can be implemented by computer programs, although often in restricted forms; mistakes in implementation and limitations of the computer can prevent a computer program from correctly executing its intended algorithm.
The concept of an algorithm is often illustrated by the example of a recipe, although many algorithms are much more complex; algorithms often have steps that repeat (iterate) or require decisions (such as logic or comparison). Correctly performing an algorithm will not solve a problem if the algorithm is flawed or not appropriate to the problem. For example, a hypothetical algorithm for making a potato salad will fail if there are no potatoes present, even if all the motions of preparing the salad are performed as if the potatoes were there.
Different algorithms may complete the same task with a different set of instructions in more or less time, space, or effort than others. For example, given two different recipes for making potato salad, one may have peel the potato before boil the potato while the other presents the steps in the reverse order, yet they both call for these steps to be repeated for all potatoes and end when the potato salad is ready to be eaten.
Certain countries, such as the USA, controversially allow some algorithms to be patented, provided a physical embodiment is possible (for example, a multiplication algorithm may be embodied in the arithmetic unit of a microprocessor).
Contents |
Algorithms are essential to the way computers process information, because a computer program is essentially an algorithm that tells the computer what specific steps to perform (in what specific order) in order to carry out a specified task, such as calculating employees’ paychecks or printing students’ report cards. Thus, an algorithm can be considered to be any sequence of operations which can be performed by a Turing-complete system.
Typically, when an algorithm is associated with processing information, data is read from an input source or device, written to an output sink or device, and/or stored for further use. Stored data is regarded as part of the internal state of the entity performing the algorithm.
For any such computational process, the algorithm must be rigorously defined: specified in the way it applies in all possible circumstances that could arise. That is, any conditional steps must be systematically dealt with, case-by-case; the criteria for each case must be clear (and computable).
Because an algorithm is a precise list of precise steps, the order of computation will almost always be critical to the functioning of the algorithm. Instructions are usually assumed to be listed explicitly, and are described as starting 'from the top' and going 'down to the bottom', an idea that is described more formally by flow of control.
So far, this discussion of the formalisation of an algorithm has assumed the premises of imperative programming. This is the most common conception, and it attempts to describe a task in discrete, 'mechanical' means. Unique to this conception of formalized algorithms is the assignment operation, setting the value of a variable. It derives from the intuition of 'memory' as a scratchpad. There is an example below of such an assignment.
See functional programming and logic programming for alternate conceptions of what constitutes an algorithm.
Algorithms are not only implemented as computer programs, but often also by other means, such as in a biological neural network (for example, the human brain implementing arithmetic or an insect relocating food), in electric circuits, or in a mechanical device.
The analysis and study of algorithms is one discipline of computer science, and is often practiced abstractly (without the use of a specific programming language or other implementation). In this sense, it resembles other mathematical disciplines in that the analysis focuses on the underlying principles of the algorithm, and not on any particular implementation. One way to embody (or sometimes codify) an algorithm is the writing of pseudocode.
Some writers restrict the definition of algorithm to procedures that eventually finish. Others include procedures that could run forever without stopping, arguing that some entity may be required to carry out such permanent tasks. In the latter case, success can no longer be defined in terms of halting with a meaningful output. Instead, terms of success that allow for unbounded output sequences must be defined. For example, an algorithm that verifies if there are more zeros than ones in an infinite random binary sequence must run forever to be effective. If it is implemented correctly, however, the algorithm's output will be useful: for as long as it examines the sequence, the algorithm will give a positive response while the number of examined zeros outnumber the ones, and a negative response otherwise. Success for this algorithm could then be defined as eventually outputting only positive responses if there are actually more zeros than ones in the sequence, and in any other case outputting any mixture of positive and negative responses.
Summarising the above discussion about what algorithm should consist.
One of the simplest algorithms is to find the largest number in an (unsorted) list of numbers. The solution necessarily requires looking at every number in the list, but only once at each. From this follows a simple algorithm:
And here is a more formal coding of the algorithm in pseudocode:
Algorithm LargestNumber
Input: A non-empty list of numbers L.
Output: The largest number in the list L. largest ← -∞
for each item in list L, do
if the item > largest, then
largest ← the item
return largest
Notes on notation:
As it happens, most people who implement algorithms want to know how much of a particular resource (such as time or storage) a given algorithm requires. Methods have been developed for the analysis of algorithms to obtain such quantitative answers; for example, the algorithm above has a time requirement of O(n), using the big O notation with n as the length of the list. At all times the algorithm only needs to remember two values: the largest number found so far, and its current position in the input list. Therefore this algorithm has a space requirement of O(log n), since a number from 1 to n takes log n bits to store. (Note that the size of the inputs is not counted as space used by the algorithm.)
For a more complex example see Euclid's algorithm, which also happens to be one of the oldest algorithms.
The word algorithm comes from the name of the 9th century Persian mathematician Abu Abdullah Muhammad bin Musa al-Khwarizmi. The word algorism originally referred only to the rules of performing arithmetic using Hindu-Arabic numerals but evolved into algorithm by the 18th century. The word has now evolved to include all definite procedures for solving problems or performing tasks.
The first case of an algorithm written for a computer was Ada Byron's notes on the analytical engine written in 1842, for which she is considered by many to be the world's first programmer. However, since Charles Babbage never completed his analytical engine the algorithm was never implemented on it.
The lack of mathematical rigor in the "well-defined procedure" definition of algorithms posed some difficulties for mathematicians and logicians of the 19th and early 20th centuries. This problem was largely solved with the description of the Turing machine, an abstract model of a computer formulated by Alan Turing, and the demonstration that every method yet found for describing "well-defined procedures" advanced by other mathematicians could be emulated on a Turing machine (a statement known as the Church-Turing thesis).
Nowadays, a formal criterion for an algorithm is that it is a procedure that can be implemented on a completely-specified Turing machine or one of the equivalent formalisms. Turing's initial interest was in the halting problem: deciding when an algorithm describes a terminating procedure. In practical terms computational complexity theory matters more: it includes the problems called NP-complete, which are generally presumed to take more than polynomial time for any (deterministic) algorithm. NP denotes the class of decision problems that can be solved by a non-deterministic Turing machine in polynomial time.
There are many ways to classify algorithms, and the merits of each classification have been the subject of ongoing debate.
One way to classify algorithms is by implementation means.
Another way of classifying algorithms is by their design methodology or paradigm. There is a certain number of paradigms, each different from the other. Furthermore, each of these categories will include many different types of algorithms. Some commonly found paradigms include:
Every field of science has its own problems and needs efficient algorithms. Some of these fields are overlapping with each other. Related problems in one field are often studied together. Some example classes are Search algorithms, sort algorithms, merge algorithms, Numerical algorithms, graph algorithms, string algorithms, computational geometric algorithms, combinatorial algorithms, machine learning, cryptography, Data compression algorithms and Parsing Techniques.
See also: List of algorithms for complete details.
Some algorithms complete in linear time, and some complete in exponential amount of time, and some never complete. One problem may have multiple algorithms, and some problems may have no algorithms. Some problems have no known efficient algorithms. There are also mappings from some problems to other problems. So computer scientists found it is suitable to classify the problems rather than algorithms into equivalnce classes based on the complexity.
See also: complexity classes for more details.
In computer science, a data structure is a way of storing data in a computer so that it can be used efficiently. Often a carefully chosen data structure will allow a more efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data structure. A well-designed data structure allows a variety of critical operations to be performed, using as little resources, both execution time and memory space, as possible.
Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to certain tasks. For example, B-trees are particularly well-suited for implementation of databases, while routing tables rely on networks of machines to function.
In the design of many types of programs, the choice of data structures is a primary design consideration, as experience in building large systems has shown that the difficulty of implementation and the quality and performance of the final result depends heavily on choosing the best data structure. After the data structures are chosen, the algorithms to be used often become relatively obvious. Sometimes things work in the opposite direction - data structures are chosen because certain key tasks have algorithms that work best with particular data structures. In either case, the choice of appropriate data structures is crucial.
This insight has given rise to many formalised design methods and programming languages in which data structures, rather than algorithms, are the key organising factor. Most languages feature some sort of module system, allowing data structures to be safely reused in different applications by hiding their verified implementation details behind controlled interfaces. Object-oriented programming languages such as C++ and Java in particular use objects for this purpose.
Since data structures are so crucial to professional programs, many of them enjoy extensive support in standard libraries of modern programming languages and environments, such as C++'s Standard Template Library, the Java API, and the Microsoft .NET framework.
The fundamental building blocks of most data structures are arrays, records, discriminated unions, and references. For example, the nullable reference, a reference which can be null, is a combination of references and discriminated unions, and the simplest linked data structure, the linked list, is built from records and nullable references.
There is some debate about whether data structures represent implementations or interfaces. How they are seen may be a matter of perspective. A data structure can be viewed as an interface between two functions or as an implementation of methods to access storage that is organized according to the associated data type.