In mathematics and computer science, an algorithm is a finite sequence of well-defined, computer-implementable instructions, typically to solve a class of problems or to perform a computation. Algorithms are always unambiguous and are used as specifications for performing calculations, data processing, automated reasoning, and other tasks.
As an effective method, an algorithm can
be expressed within a finite amount of space and time, and in a well-defined
formal language for calculating a function.
Starting from an initial state and initial input (perhaps empty), the
instructions describe a computation that, when executed, proceeds through a
finite number of well-defined successive states, eventually producing
"output" and terminating at a final ending state. The transition from
one state to the next is not necessarily deterministic; some algorithms, known
as randomized algorithms, incorporate random input.
The concept of algorithm has existed
since antiquity. Arithmetic algorithms,
such as a division algorithm, were used by ancient Babylonian mathematicians c.
2500 BC and Egyptian mathematicians c. 1550 BC.
Greek mathematicians later used algorithms in 240 BC in the sieve of
Eratosthenes for finding prime numbers, and the Euclidean algorithm for finding
the greatest common divisor of two numbers.
Arabic mathematicians such as al-Kindi in the 9th century used cryptographic algorithms for code-breaking,
based on frequency analysis.
The word algorithm itself
is derived from the name of the 9th-century mathematician Muḥammad ibn Mūsā
al-Khwārizmī, whose nisba (identifying him as from Khwarazm) was Latinized as
Algoritmi. A partial formalization
of what would become the modern concept of algorithm began with attempts to
solve the Entscheidungsproblem (decision problem) posed by David Hilbert
in 1928. Later formalizations were
framed as attempts to define "effective calculability" or
"effective method". Those
formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930,
1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's Formulation
1 of 1936, and Alan Turing's Turing machines of 1936–37 and 1939.
Informal Definition
An informal definition could be "a
set of rules that precisely defines a sequence of operations", which would
include all computer programs (including programs that do not perform numeric
calculations), and (for example) any prescribed bureaucratic procedure or
cook-book recipe.
In general, a program is an
algorithm only if it stops eventually, even
though infinite loops may sometimes prove desirable.
A prototypical example of an algorithm is
the Euclidean algorithm, which is used to determine the greatest common divisor
of two integers; an example (there are others) is described by the flowchart above
and as an example in a later section.
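A minimal sketch of the Euclidean algorithm in Python (using the remainder operation, as in the classic formulation) might look like this:

```python
def gcd(m, n):
    """Euclid's algorithm: repeatedly replace the pair (m, n)
    with (n, m mod n) until the remainder is zero."""
    while n != 0:
        m, n = n, m % n
    return m

print(gcd(48, 18))  # 6
```

The loop terminates because the remainder strictly decreases on every step, which is exactly the "stops eventually" property required of an algorithm.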
Boolos and Jeffrey (1974, 1999) offer
an informal meaning of the word "algorithm" in the following
quotation:
No human being can write fast enough, or
long enough, or small enough† ( †"smaller and smaller without limit …
you'd be trying to write on molecules, on atoms, on electrons") to list
all members of an enumerably infinite set by writing out their names, one after
another, in some notation. But humans can do something equally useful, in the
case of certain enumerably infinite sets: They can give explicit
instructions for determining the nth member of the set, for arbitrary
finite n. Such instructions are to be given quite explicitly, in a form
in which they could be followed by a computing machine, or by
a human who is capable of carrying out only very elementary operations
on symbols.
An "enumerably infinite set" is
one whose elements can be put into one-to-one correspondence with the integers.
Thus Boolos and Jeffrey are saying that an algorithm implies instructions for a
process that "creates" output integers from an arbitrary
"input" integer or integers that, in theory, can be arbitrarily large. For
example, an algorithm can be an algebraic equation such as y = m + n (i.e.,
two arbitrary "input variables" m and n that
produce an output y), but various authors' attempts to define the
notion indicate that the word implies much more than this, something on the
order of (for the addition example):
Precise instructions (in a language
understood by "the computer") for a fast, efficient, "good"
process that specifies the "moves" of "the computer"
(machine or human, equipped with the necessary internally contained information
and capabilities) to find, decode, and then process arbitrary input
integers/symbols m and n, symbols + and = …
and "effectively" produce, in a "reasonable" time, output-integer y at
a specified place and in a specified format.
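The quotation's idea of explicit instructions for determining the nth member of an enumerably infinite set can be illustrated (this example is not from the original text) with the set of prime numbers:

```python
def nth_prime(n):
    """Return the nth prime (n = 1 gives 2), by trial division.
    Slow but fully explicit: a machine, or a human limited to
    elementary operations on symbols, can follow each step."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d != 0 for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate

print(nth_prime(5))  # 11
```

The primes cannot all be listed, but this procedure pins down the nth one for any finite n, which is precisely the "equally useful" capability Boolos and Jeffrey describe.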
The concept of algorithm is
also used to define the notion of decidability—a notion that is central for
explaining how formal systems come into being starting from a small set of axioms
and rules. In logic, the time that an
algorithm requires to complete cannot be measured, as it is not apparently
related to the customary physical dimension. From such uncertainties, that
characterize ongoing work, stems the unavailability of a definition of algorithm that
suits both concrete (in some sense) and abstract usage of the term.
Algorithm Design
Algorithm design refers to a method or a
mathematical process for problem-solving and engineering algorithms. The design
of algorithms is part of many solution theories of operations research, such as dynamic
programming and divide-and-conquer. Techniques for designing and implementing
algorithm designs are also called algorithm design patterns, with examples
including the template method pattern and the decorator pattern.
One of the most important aspects of
algorithm design is resource (run-time, memory usage) efficiency; the big O
notation is used to describe, e.g., an algorithm's run-time growth as the size of
its input increases.
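As an illustration (not from the original text), linear search grows as O(n) while binary search on sorted data grows as O(log n):

```python
def linear_search(items, target):
    # O(n): examine each element in turn
    for i, x in enumerate(items):
        if x == target:
            return i
    return -1

def binary_search(sorted_items, target):
    # O(log n): halve the search interval each step
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = list(range(0, 100, 2))  # sorted even numbers 0..98
print(linear_search(data, 40), binary_search(data, 40))  # 20 20
```

Doubling the input doubles the work for linear search but adds only one step to binary search, which is what the respective growth rates capture.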
Typical steps in the development of
algorithms:
- Problem definition
- Development of a model
- Specification of the algorithm
- Designing an algorithm
- Checking the correctness of the algorithm
- Analysis of algorithm
- Implementation of algorithm
- Program testing
- Documentation preparation
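The later steps, implementation and program testing, can be sketched for a toy problem (finding the largest element of a list; a hypothetical example, not from the original):

```python
def find_max(values):
    """Implementation step: return the largest element of a non-empty list."""
    largest = values[0]
    for v in values[1:]:
        if v > largest:
            largest = v
    return largest

# Program-testing step: check the implementation against known cases.
assert find_max([3, 1, 4, 1, 5]) == 5
assert find_max([-2, -7, -1]) == -1
print("all tests passed")
```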
Computer Algorithms
In computer systems, an algorithm is
basically an instance of logic written in software by software developers, to
be effective for the intended "target" computer(s) to produce output from
given (perhaps null) input. An optimal algorithm, even running on
old hardware, would produce faster results than a non-optimal (higher time
complexity) algorithm for the same purpose, running on more efficient hardware;
that is why algorithms, like computer hardware, are considered technology.
"Elegant" (compact) programs,
"good" (fast) programs: The notion
of "simplicity and elegance" appears informally in Knuth and
precisely in Chaitin:
Knuth: " … we want good algorithms
in some loosely defined aesthetic sense. One criterion … is the length of time
taken to perform the algorithm …. Other criteria are adaptability of the
algorithm to computers, its simplicity and elegance, etc."
Chaitin: " … a program is
'elegant,' by which I mean that it's the smallest possible program for
producing the output that it does"
Chaitin prefaces his definition with:
"I'll show you can't prove that a program is 'elegant'"—such a proof
would solve the Halting problem (ibid).
Algorithm versus function computable by
an algorithm: For a given function multiple
algorithms may exist. This is true even without expanding the
instruction set available to the programmer. Rogers observes that "It is
... important to distinguish between the notion of algorithm, i.e.
procedure and the notion of function computable by algorithm, i.e.
mapping yielded by procedure. The same function may have several different
algorithms".
Unfortunately, there may be a tradeoff
between goodness (speed) and elegance (compactness)—an elegant program may take
more steps to complete a computation than one less elegant. An example that
uses Euclid's algorithm appears below.
Computers (and computors), models of
computation: A computer (or human
"computor") is a restricted type of machine, a "discrete
deterministic mechanical device" that blindly follows its instructions. Melzak's and Lambek's primitive models reduced
this notion to four elements: (i) discrete, distinguishable locations,
(ii) discrete, indistinguishable counters (iii) an agent, and (iv)
a list of instructions that are effective relative to the
capability of the agent.
Minsky describes a more congenial
variation of Lambek's "abacus" model in his "Very Simple Bases
for Computability". Minsky's
machine proceeds sequentially through its five (or six, depending on how one
counts) instructions, unless either a conditional IF–THEN GOTO or an
unconditional GOTO changes program flow out of sequence. Besides HALT, Minsky's
machine includes three assignment (replacement, substitution) operations:
ZERO (e.g. the contents of location replaced by 0: L ← 0), SUCCESSOR (e.g. L ←
L+1), and DECREMENT (e.g. L ← L − 1). Rarely
must a programmer write "code" with such a limited instruction set.
But Minsky shows (as do Melzak and Lambek) that his machine is Turing complete with
only four general types of instructions: conditional GOTO, unconditional
GOTO, assignment/replacement/substitution, and HALT. However, a few different
assignment instructions (e.g. DECREMENT, INCREMENT, and ZERO/CLEAR/EMPTY for a
Minsky machine) are also required for Turing-completeness; their exact
specification is somewhat up to the designer. The unconditional GOTO is a
convenience; it can be constructed by initializing a dedicated location to zero
e.g. the instruction "Z ← 0"; thereafter the instruction IF Z=0
THEN GOTO xxx is unconditional.
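A toy simulator for such a counter machine (a sketch under the assumptions above; the instruction names ZERO, INC, DEC, JZ, GOTO, HALT are illustrative) can run a small addition program:

```python
def run(program, registers):
    """Simulate a Minsky-style counter machine.
    Instructions: ('ZERO', r), ('INC', r), ('DEC', r),
    ('JZ', r, addr)  - conditional GOTO addr if register r is 0,
    ('GOTO', addr), ('HALT',)."""
    pc = 0
    while True:
        op = program[pc]
        if op[0] == 'HALT':
            return registers
        elif op[0] == 'ZERO':
            registers[op[1]] = 0
            pc += 1
        elif op[0] == 'INC':
            registers[op[1]] += 1
            pc += 1
        elif op[0] == 'DEC':
            registers[op[1]] = max(0, registers[op[1]] - 1)
            pc += 1
        elif op[0] == 'JZ':
            pc = op[2] if registers[op[1]] == 0 else pc + 1
        elif op[0] == 'GOTO':
            pc = op[1]

# Add register 1 into register 0: while r1 != 0 { DEC r1; INC r0 }
add = [
    ('JZ', 1, 4),   # 0: if r1 == 0, goto HALT
    ('DEC', 1),     # 1
    ('INC', 0),     # 2
    ('GOTO', 0),    # 3
    ('HALT',),      # 4
]
print(run(add, {0: 3, 1: 2}))  # {0: 5, 1: 0}
```

Even addition requires a loop built from conditional GOTO, showing how much work the limited instruction set pushes onto the programmer.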
Simulation of an algorithm: computer
(computor) language: Knuth advises the reader that
"the best way to learn an algorithm is to try it . . . immediately take
pen and paper and work through an example". But what about a simulation or execution of
the real thing? The programmer must translate the algorithm into a language
that the simulator/computer/computor can effectively execute.
Stone gives an example of this: when computing the roots of a quadratic
equation the computor must know how to take a square root. If they don't, then
the algorithm, to be effective, must provide a set of rules for extracting a
square root.
This means that the programmer must know
a "language" that is effective relative to the target computing agent
(computer/computor).
But what model should be used for the
simulation? Van Emde Boas observes "even if we base complexity theory on
abstract instead of concrete machines, arbitrariness of the choice of a model
remains. It is at this point that the notion of simulation enters". When speed is being measured, the instruction
set matters. For example, the subprogram in Euclid's algorithm to compute the
remainder would execute much faster if the programmer had a "modulus"
instruction available rather than just subtraction (or worse: just Minsky's
"decrement").
Structured programming, canonical
structures: Per the Church–Turing thesis, any
algorithm can be computed by a model known to be Turing complete, and per
Minsky's demonstrations, Turing completeness requires only four instruction
types—conditional GOTO, unconditional GOTO, assignment, HALT. Kemeny and Kurtz
observe that, while "undisciplined" use of unconditional GOTOs and
conditional IF-THEN GOTOs can result in "spaghetti code", a
programmer can write structured programs using only these instructions; on the
other hand "it is also possible, and not too hard, to write badly
structured programs in a structured language". Tausworthe augments the three Böhm-Jacopini
canonical structures: SEQUENCE, IF-THEN-ELSE, and WHILE-DO, with two more:
DO-WHILE and CASE. An additional benefit
of a structured program is that it lends itself to proofs of correctness using mathematical
induction.
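The three Böhm–Jacopini canonical structures can be illustrated in one small function (a hypothetical example, not from the original); DO-WHILE and CASE are conveniences expressible in terms of these:

```python
def digit_sum_parity(n):
    # SEQUENCE: statements execute one after another
    total = 0
    # WHILE-DO: loop with the test at the top
    while n > 0:
        total += n % 10
        n //= 10
    # IF-THEN-ELSE: a two-way branch with a single exit
    if total % 2 == 0:
        return "even digit sum"
    else:
        return "odd digit sum"

print(digit_sum_parity(123))  # digits sum to 6 -> "even digit sum"
```

Because each structure has one entry and one exit, the function's correctness can be argued structure by structure, which is the benefit noted above.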
Canonical flowchart symbols:
The graphical aid called a flowchart offers a way to describe and document an
algorithm (and a computer program implementing it). Like the program flow of a Minsky
machine, a flowchart always starts at the top of a page and proceeds down. Its
primary symbols are only four: the directed arrow showing program flow, the
rectangle (SEQUENCE, GOTO), the diamond (IF-THEN-ELSE), and the dot (OR-tie).
The Böhm–Jacopini canonical structures are made of these primitive shapes.
Sub-structures can "nest" in rectangles, but only if a single exit
occurs from the superstructure. The symbols, and their use to build the
canonical structures are shown in the diagram.