Neural Networks

Webpages concerning "Neural Networks"

principal curves
http://www.iro.umontreal.ca/~kegl/research/pcurves/
Keywords:
machine learning, unsupervised learning, principal curves, principal curve, curve fitting, feature extraction, data mining, PCA, principal component analysis, nonlinear PCA, nonlinear, principal, component, analysis

Bibliographies on Neural Networks, part of the Collection of Computer Science Bibliographies
http://liinwww.ira.uka.de/bibliography/Neural/index.html
Keywords:
bibliographies, computer science, bibtex, RSS

neuroinformatics site
http://www.neuroinf.org/

Neuroquant.com - R&D of trading systems for financial markets using ANN/GA
http://www.neuroquant.com
Keywords:
neural, networks, stock, trading, genetic, algorithms

The Neural Network Resources page contains extensive information and numerous links to Software, Journals, Books, Societies, Databases, Newsgroups, Archives, E-Lists, etc. Resources are well organized, thoroughly catalogued and presented in an easy-to-access manner with a user-friendly graphical interface.
http://www.brain.riken.jp/labs/mns/geczy/Links.html
Keywords:
Software, Journals, Books, Databases, Lists, Archives, Societies, Newsgroups, Centers, Resources, Research, Links, Projects, Brain Science Institute, BSI, Neurosciences Research Program, NRP, RIKEN, Theoretical Neurobiology, Experimental Neurobiology, Neurobiology, Neurology, brain, human brain, Research Institute, RI, Scientific Lectures, Seminars, Neural Networks, NN, scientists, researcher, ...

A collection of references, software and web pointers concerned with Boosting and ensemble learning methods, combining neural networks, decision trees or other weak learners to improve the generalization performance.
http://www.boosting.org/
Keywords:
Neural Networks, Learning Theory, AdaBoost, Boosting, Bagging, Arcing, Support Vectors, Regularization, Noisy Data, Margin, Hard Margin, Soft Margin, Slack Variables, Overfitting, Benchmark, LP, Linear Programming, Barrier Optimization, Bregman, Ensemble Learning

http://www-sigproc.eng.cam.ac.uk/smc/index.html
Keywords:
Particle Filters

A huge list of neural network, fuzzy logic, artificial life, genetic algorithms and classical AI sites by Sumeet Gupta
http://www.geocities.com/sumeet_gupta/neural.html
Keywords:
neural, networks, network, sumeet, gupta, delhi, iit, kanpur, neural, neuron, cognition, cognitive, sciences, consciousness, brain, sumeet, gupta, guptaji, india

http://www.cnbc.cmu.edu/derprize/

http://www.cs.toronto.edu/~carl/gp.html

http://www.cs.iastate.edu/~gannadm/homepage.html

http://omega.albany.edu:8008/maxent.html

http://www.prettyview.com/ann

Wikipedia-Article "Neural Networks"

Simplified view of an artificial neural network

A neural network is an interconnected group of biological neurons. In modern usage the term can also refer to artificial neural networks, which are constituted of artificial neurons. Thus the term 'Neural Network' specifies two distinct concepts:

  1. A biological neural network is a plexus of connected or functionally related neurons in the peripheral nervous system or the central nervous system. In the field of neuroscience, it most often refers to a group of neurons from a nervous system that are suited for laboratory analysis.
  2. Artificial neural networks were designed to model some properties of biological neural networks, though most of the applications are of technical nature as opposed to cognitive models.

Please see the corresponding articles for details on artificial neural networks or biological neural networks. This article focuses on the relationship between the two concepts.

Characterization

In general, a neural network is composed of a group or groups of physically connected or functionally associated neurons. A single neuron can be connected to many other neurons and the total number of neurons and connections in a network can be extremely large. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic microcircuits [Arbib, p.666] and other connections are possible. Apart from electrical signalling, there are other forms of signalling that arise from neurotransmitter diffusion and that have an effect on electrical signalling. Thus, like other biological networks, neural networks are extremely complex. While a detailed description of neural systems currently seems unattainable, progress is being made towards a better understanding of basic mechanisms.

Artificial intelligence and cognitive modeling try to simulate some properties of neural networks. While similar in their techniques, the former has the aim of solving particular tasks, while the latter aims to build mathematical models of biological neural systems.

In the artificial intelligence field, artificial neural networks have been applied successfully to speech recognition, image recognition and adaptive control, in order to construct software agents (in computer and video games) or autonomous robots. Most of the artificial neural networks currently employed in artificial intelligence are based on statistical estimation, optimisation and control theory.

Cognitive modelling is the physical or mathematical modelling of the behaviour of neural systems, ranging from the individual neural level (e.g. modelling the spike response curves of neurons to a stimulus), through the neural cluster level (e.g. modelling the release and effects of dopamine in the basal ganglia) to the complete organism (e.g. behavioural modelling of the organism's response to stimuli).

The brain, neural networks and computers

While historically the brain has been viewed as a type of computer, and vice-versa, this is true only in the loosest sense. Computers are not models of the brain (even though it is possible to describe a logical process as a computer program, or to simulate a brain using a computer) as they were not created with that purpose in mind.

However, neural networks used in artificial intelligence have traditionally been viewed as simplified models of neural processing in the brain. What degree of complexity and which properties individual neural elements need in order to reproduce something resembling animal intelligence is a subject of current research in theoretical neuroscience.

Neural Networks and Artificial Intelligence

Background

Neural network models in artificial intelligence are usually referred to as artificial neural networks (ANNs); these are essentially simple mathematical models defining a function f : X \rightarrow Y. A particular type of ANN model corresponds to a class of such functions. What has attracted the most interest in neural networks is the possibility of learning, which in practice means the following:

Given a specific task to solve, and a class of functions F, learning means using a set of observations, in order to find f^* \in F which solves the task in an optimal sense.

This entails defining a cost function C : F \rightarrow \mathbb{R} such that, for the optimal solution f^*, C(f^*) \leq C(f) \forall f \in F.

The cost function C is an important concept in learning, as it is a measure of how far away we are from an optimal solution to the problem that we want to solve. Learning algorithms search through the solution space in order to find a function that has the smallest possible cost.

For applications where the solution is dependent on some data, the cost must necessarily be a function of the observations, otherwise we would not be modelling anything related to the data. It is frequently defined as a statistic to which only approximations can be made. As a simple example consider the problem of finding the model f which minimises C = E[|f(x) - y|^2], for data pairs (x, y) drawn from some distribution \mathcal{D}. In practical situations we would only have N samples from \mathcal{D} and thus only minimise \hat{C} = \frac{1}{N}\sum_{i=1}^N |f(x_i) - y_i|^2. Thus, the cost is minimised over a sample of the data rather than the true data distribution.
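The empirical cost \hat{C} can be computed directly from a finite sample. The following is a minimal sketch, assuming a hypothetical model f(x) = 2x and a few made-up data pairs; neither comes from the article.

```python
def empirical_cost(f, samples):
    """Mean squared error of model f over a finite sample of (x, y) pairs."""
    return sum((f(x) - y) ** 2 for x, y in samples) / len(samples)

# Hypothetical model and data, chosen only for illustration.
def f(x):
    return 2.0 * x

samples = [(0.0, 0.0), (1.0, 2.5), (2.0, 3.5), (3.0, 6.0)]
cost = empirical_cost(f, samples)
print(cost)
```

A different sample drawn from the same distribution would generally give a different value, which is why \hat{C} is only an approximation of C.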

When N \rightarrow \infty some form of online learning must be used, where the cost is partially minimised as each new example is seen. While online learning is often used when \mathcal{D} is fixed, it is most useful in the case where the distribution changes slowly over time. In neural network methods, some form of online learning is frequently also used for finite datasets.
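One way to picture online learning is a single parameter that is nudged after every example. The sketch below assumes a one-parameter model f(x) = w x, a fixed learning rate and a synthetic noise-free stream; all of these are illustrative assumptions rather than anything specified in the article.

```python
def online_update(w, x, y, lr=0.1):
    """One partial minimisation step on the squared error of f(x) = w * x."""
    error = w * x - y
    return w - lr * error * x   # gradient of 0.5 * error^2 with respect to w

def stream(n):
    """Hypothetical data stream: y = 3x exactly, with x cycling over three values."""
    xs = [0.5, 1.0, 1.5]
    for i in range(n):
        x = xs[i % len(xs)]
        yield x, 3.0 * x

w = 0.0
for x, y in stream(300):
    w = online_update(w, x, y)   # the model is updated as each example arrives
print(w)
```

No example is stored after it has been used, so the same loop would work for an unbounded stream.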

Learning paradigms

There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any given type of network architecture can be employed in any of those tasks.

Supervised learning

In supervised learning, we are given a set of example pairs (x, y), x \in X, y \in Y and the aim is to find a function f in the allowed class of functions that matches the examples. In other words, we wish to infer the mapping implied by the data and the cost function is related to the mismatch between our mapping and the data.

A commonly used cost is the mean-squared error, which measures the average squared error between the network's output, f(x), and the target value y over all the example pairs. When one tries to minimise this cost using gradient descent for the class of neural networks called Multi-Layer Perceptrons, one obtains the well-known backpropagation algorithm for training neural networks.
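The combination of mean-squared error, gradient descent and a Multi-Layer Perceptron can be sketched in a few lines. The network size (two sigmoid hidden units and a linear output), the XOR-style data, the initial weights and the learning rate below are all assumptions chosen for illustration; this is not a reference implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """One hidden layer of sigmoid units followed by a linear output unit."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    out = sum(w * hj for w, hj in zip(W2, h)) + b2
    return h, out

def mse(params, data):
    return sum((forward(params, x)[1] - y) ** 2 for x, y in data) / len(data)

def backprop(params, data):
    """Gradient of the mean-squared error, obtained with the chain rule."""
    W1, b1, W2, b2 = params
    gW1 = [[0.0] * len(row) for row in W1]
    gb1 = [0.0] * len(b1)
    gW2 = [0.0] * len(W2)
    gb2 = 0.0
    n = len(data)
    for x, y in data:
        h, out = forward(params, x)
        d_out = 2.0 * (out - y) / n                  # dC/d(out) for this example
        gb2 += d_out
        for j, hj in enumerate(h):
            gW2[j] += d_out * hj
            d_hid = d_out * W2[j] * hj * (1.0 - hj)  # error passed back through the sigmoid
            gb1[j] += d_hid
            for i, xi in enumerate(x):
                gW1[j][i] += d_hid * xi
    return gW1, gb1, gW2, gb2

def gradient_step(params, grads, lr):
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = grads
    return (
        [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)],
        [b - lr * g for b, g in zip(b1, gb1)],
        [w - lr * g for w, g in zip(W2, gW2)],
        b2 - lr * gb2,
    )

# Hypothetical XOR-style data and hand-picked initial weights.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
initial = ([[0.5, -0.4], [-0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)

params = initial
cost_before = mse(params, data)
for _ in range(200):
    params = gradient_step(params, backprop(params, data), lr=0.1)
cost_after = mse(params, data)
print(cost_before, cost_after)
```

The sketch only shows that the cost decreases under these settings; it makes no claim that 200 steps are enough to fit the data well.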

Tasks that fall within the paradigm of supervised learning are pattern recognition (also known as classification) and regression (also known as function approximation). The supervised learning paradigm is also applicable to sequential data, e.g. for speech and gesture recognition.

Unsupervised learning

In unsupervised learning we are given some data x, and the cost function to be minimised can be any function of the data x and the network's output, f.

The cost function is task dependent.

As a trivial example, consider the model f(x) = a, where a is a constant, and the cost C = (E[x] - f(x))^2. Minimising this cost will give us a value of a that is equal to the mean of the data. The cost function can be much more complicated. Its form depends on the application: for example, in compression it could be related to the mutual information between x and f(x). In statistical modelling, it could be related to the posterior probability of the model given the data. (Note that in both of those examples those quantities would be maximised rather than minimised.)
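The trivial example can be checked numerically. This is a small sketch, assuming a made-up data set, a sample mean in place of E[x] and a fixed step size; it simply descends the gradient of the cost with respect to the constant.

```python
data = [2.0, 4.0, 4.0, 5.0, 10.0]       # hypothetical observations
mean_x = sum(data) / len(data)           # sample estimate of E[x]

a = 0.0                                  # the single parameter of f(x) = a
for _ in range(200):
    grad = -2.0 * (mean_x - a)           # dC/da for C = (E[x] - a)^2
    a -= 0.1 * grad
print(a, mean_x)
```

Setting the gradient to zero gives the same answer in closed form: the constant equals the mean.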

Tasks that fall within the paradigm of unsupervised learning are in general estimation problems; the applications include clustering, the estimation of statistical distributions, compression and filtering.

Reinforcement learning

In reinforcement learning, data x is usually not given, but generated by an agent's interactions with the environment. At each point in time t, the agent performs an action y_t and the environment generates an observation x_t and an instantaneous cost c_t, according to some (usually unknown) dynamics. The aim is to discover a policy for selecting actions that minimises some measure of long-term cost, i.e. the expected cumulative cost. The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated.

More formally, the environment is defined as a Markov decision process (MDP) with states s \in S and the following probability distributions: the instantaneous cost distribution P(c_t | s_t), the observation distribution P(x_t | s_t) and the transition distribution P(s_{t+1} | s_t, y_t), while a policy is defined as a conditional distribution over actions given the observations. Taken together, the two define a Markov chain (MC). The aim is to discover the policy that minimises the cost, i.e. the MC for which the cost is minimal.
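To make the MDP and Markov chain relationship concrete, the sketch below builds the chain induced by each deterministic policy in a tiny two-state problem and compares their long-term costs. Several simplifications are assumed and are not part of the text above: the state is observed directly, costs are given by their expected value per state, the long-term cost is discounted by a factor gamma, and all numbers are invented.

```python
from itertools import product

states = [0, 1]
actions = [0, 1]
cost = {0: 1.0, 1: 0.0}                      # expected instantaneous cost in each state
# P[s][a] is the distribution over next states; all numbers are made up.
P = {
    0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
    1: {0: [0.5, 0.5], 1: [0.1, 0.9]},
}
gamma = 0.9                                  # discount factor (an added assumption)

def markov_chain(policy):
    """Transition matrix of the Markov chain induced by a deterministic policy."""
    return [P[s][policy[s]] for s in states]

def long_term_cost(policy, sweeps=500):
    """Expected discounted cumulative cost from each start state."""
    chain = markov_chain(policy)
    v = [0.0 for _ in states]
    for _ in range(sweeps):
        v = [cost[s] + gamma * sum(p * v[t] for t, p in zip(states, chain[s]))
             for s in states]
    return v

# Enumerate every deterministic policy and keep the cheapest one.
policies = [dict(zip(states, choice)) for choice in product(actions, repeat=len(states))]
best = min(policies, key=lambda pi: sum(long_term_cost(pi)))
print(best, long_term_cost(best))
```

Exhaustive enumeration is only feasible for toy problems; in practice the dynamics are unknown and the long-term cost has to be estimated from interaction.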

ANNs are frequently used in reinforcement learning as part of the overall algorithm.

Tasks that fall within the paradigm of reinforcement learning are control problems, games and other sequential decision making tasks.

Learning algorithms

There are numerous algorithms available for training neural network models; most of them can be viewed as a straightforward application of optimization theory and statistical estimation.

Most of the algorithms used in training artificial neural networks employ some form of gradient descent. This is done by simply taking the derivative of the cost function with respect to the network parameters and then changing those parameters in a gradient-related direction.

Evolutionary methods, simulated annealing, expectation maximisation and non-parametric methods are among other commonly used methods for training neural networks. See also machine learning.
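As a contrast to gradient-based training, here is a minimal sketch of one very simple evolutionary scheme, a (1+1) search that mutates the parameters and keeps the child only when it lowers the cost. The quadratic cost, the mutation width and the fixed random seed are assumptions made for the example; real evolutionary methods for neural networks are considerably more elaborate.

```python
import random

def cost(params):
    """Hypothetical cost with its minimum at (1, -2)."""
    a, b = params
    return (a - 1.0) ** 2 + (b + 2.0) ** 2

def evolve(start, steps=2000, sigma=0.3, seed=0):
    """(1+1) evolutionary search: mutate, keep the child only if it is cheaper."""
    rng = random.Random(seed)
    best, best_cost = list(start), cost(start)
    for _ in range(steps):
        child = [p + rng.gauss(0.0, sigma) for p in best]
        child_cost = cost(child)
        if child_cost < best_cost:
            best, best_cost = child, child_cost
    return best, best_cost

start = [4.0, 1.0]
best, best_cost = evolve(start)
print(best, best_cost)
```

No derivative of the cost is needed, which is the main attraction of such methods when gradients are unavailable or unreliable.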

Theoretical properties

Capacity

Certain theoretical models of neural networks have been analysed in a way that allows properties such as their maximum storage capacity to be calculated independently of any learning algorithm. Various techniques originally developed for studying disordered magnetic systems (spin glasses) have been successfully applied to simple neural network architectures, such as the perceptron. Influential work by E. Gardner and B. Derrida has revealed many interesting properties about perceptrons with real-valued synaptic weights, while later work by W. Krauth and M. Mezard has extended these principles to binary-valued synapses.

Generalisation and statistics

In applications where the goal is to create a system that generalises well to unseen examples, the problem of overtraining has emerged. This arises in overcomplex or overspecified systems. There are two schools of thought for avoiding this problem. The first is to use cross-validation and similar techniques to check for the presence of overtraining and to select hyperparameters so as to minimise the generalisation error. The second is to use some form of regularisation. This is a concept that emerges naturally in a probabilistic (Bayesian) framework, where the regularisation can be performed by putting a larger prior probability over simpler models; but also in statistical learning theory, where the goal is to minimise over two quantities: the 'empirical risk' and the 'structural risk', which roughly correspond to the error over the training set and the predicted error in unseen data due to overfitting.
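Both ideas can be combined in a short sketch: a squared-weight penalty acts as the regulariser, and k-fold cross-validation selects its strength. The one-parameter model f(x) = w x, the data, the candidate penalties and the number of folds are all hypothetical choices made for illustration.

```python
def fit_ridge(train, lam):
    """Closed-form minimiser of sum((w*x - y)^2) + lam * w^2 for f(x) = w * x."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)

def mse(w, samples):
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)

def cross_validate(data, lam, k=4):
    """Average held-out error over k folds for regularisation strength lam."""
    errors = []
    for i in range(k):
        held_out = data[i::k]                                   # every k-th point, offset i
        train = [p for j, p in enumerate(data) if j % k != i]   # the remaining points
        errors.append(mse(fit_ridge(train, lam), held_out))
    return sum(errors) / k

# Hypothetical noisy observations of y = 2x.
data = [(1.0, 2.3), (2.0, 3.6), (3.0, 6.4), (4.0, 7.7),
        (5.0, 10.2), (6.0, 11.9), (7.0, 14.3), (8.0, 15.8)]
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
scores = {lam: cross_validate(data, lam) for lam in candidates}
best_lam = min(scores, key=scores.get)
print(best_lam, scores[best_lam])
```

The held-out error, not the training error, decides which penalty is kept; a very large penalty shrinks w so far that the held-out error rises again.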


Types of artificial neural networks

See artificial neural network for a discussion on the various types of neural networks.

Neural networks and Neuroscience

Theoretical and computational neuroscience is the field concerned with the theoretical analysis and computational modeling of biological neural systems. Since neural systems are intimately related to cognitive processes and behaviour, the field is closely related to cognitive and behavioural modeling.

The aim of the field is to create models of biological neural systems in order to understand how biological systems work. To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning (biological neural network models) and theory (statistical learning theory and information theory).

Types of models

There is a large array of models used in the field, each defined at a different level of abstraction and trying to model different aspects of neural systems. They range from models of the short-term behaviour of individual neurons, through models of how the dynamics of neural circuitry arise from interactions between individual neurons, to models of how behaviour can arise from abstract neural modules that represent complete subsystems. These include models of the long-term and short-term plasticity of neural systems and its relation to learning and memory, from the individual neuron to the system level.

Current research

While research was initially concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators, such as dopamine, acetylcholine and serotonin, in behaviour and learning.

References

  • Peter Dayan, L.F. Abbott. Theoretical Neuroscience, MIT Press.
  • Wulfram Gerstner, Werner Kistler. Spiking Neuron Models: Single Neurons, Populations, Plasticity, Cambridge University Press.

History of the neural network analogy

(main article: Connectionism)

The concept of neural networks started in the late 1800s as an effort to describe how the human mind performed. These ideas started being applied to computational models with the Perceptron.

In the early 1950s Friedrich Hayek was one of the first to posit the idea of spontaneous order in the brain arising out of decentralized networks of simple units (neurons). In the late 1940s, Donald Hebb made one of the first hypotheses for a mechanism of neural plasticity (i.e. learning), Hebbian learning. Hebbian learning is considered to be a 'typical' unsupervised learning rule and it (and variants of it) was an early model for long-term potentiation.

The Perceptron is essentially a linear classifier for classifying data x \in R^n specified by parameters w \in R^n, b \in R and an output function f = w'x + b. Its parameters are adapted with an ad-hoc rule similar to stochastic steepest gradient descent. Because the inner product is a linear operator in the input space, the Perceptron can only perfectly classify a set of data for which different classes are linearly separable in the input space, while it often fails completely for non-separable data. While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.
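The separability limitation is easy to reproduce. The sketch below uses the classic mistake-driven update (add or subtract the misclassified input) on two hypothetical data sets with labels +1 and -1: logical AND, which is linearly separable, and XOR, which is not. The learning rate and epoch limit are arbitrary assumptions.

```python
def predict(w, b, x):
    """Sign of the linear output f = w'x + b."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def train_perceptron(data, epochs=100, lr=1.0):
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, b, x) != y:
                # Move the decision boundary towards the misclassified example.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:        # every example is classified correctly
            break
    return w, b

and_data = [([0, 0], -1), ([0, 1], -1), ([1, 0], -1), ([1, 1], 1)]
xor_data = [([0, 0], -1), ([0, 1], 1), ([1, 0], 1), ([1, 1], -1)]

w, b = train_perceptron(and_data)
and_ok = all(predict(w, b, x) == y for x, y in and_data)
w2, b2 = train_perceptron(xor_data)
xor_ok = all(predict(w2, b2, x) == y for x, y in xor_data)
print(and_ok, xor_ok)   # prints: True False
```

No choice of w and b can classify the XOR data correctly, so the second result does not depend on the number of epochs.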

The Cognitron (1975) was an early multilayered neural network with a training algorithm. The actual structure of the network and the methods used to set the interconnection weights change from one neural strategy to another, each with its advantages and disadvantages. Networks can propagate information in one direction only, or they can bounce back and forth until self-activation at a node occurs and the network settles on a final state. The ability for bi-directional flow of inputs between neurons/nodes was introduced with the Hopfield network (1982), and specialization of these node layers for specific purposes was introduced through the first hybrid network.

The parallel distributed processing of the mid-1980s became popular under the name connectionism.

The backpropagation network was probably the main reason behind the repopularisation of neural networks after the publication of "Learning Internal Representations by Error Propagation" in 1986. The original network utilised multiple layers of weighted-sum units of the type f = g(w'x + b), where g was a sigmoid function. Training was done by a form of stochastic steepest gradient descent. The employment of the chain rule of differentiation in deriving the appropriate parameter updates results in an algorithm that seems to 'backpropagate errors', hence the nomenclature. However, it is essentially a form of gradient descent. Determining the optimal parameters in a model of this type is not trivial, and steepest gradient descent methods cannot be relied upon to give the solution without a good starting point. In recent times, networks with the same architecture as the backpropagation network are referred to as Multi-Layer Perceptrons. This name does not impose any limitations on the type of algorithm used for learning.

The backpropagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signalling was not obvious at the time, but most importantly because there was no plausible source for the 'teaching' or 'target' signal.

In more recent times, neuroscientists have successfully made some associations between reinforcement learning and the dopamine system of reward. However, the role of this and other neuromodulators is still under active investigation.

References

  • Agre, Philip E., et al. (1997). Comparative Cognitive Robotics: Computation and Human Experience, Cambridge University Press. ISBN 0521386039., p. 80
  • Arbib, Michael A. (Ed.) (1995). The Handbook of Brain Theory and Neural Networks.
  • Alspector, U.S. Patent 4874963 "Neuromorphic learning networks". October 17, 1989.
  • Bertsekas, Dimitri P. (1999). Nonlinear Programming.
  • Bertsekas, Dimitri P. & Tsitsiklis, John N. (1996). Neuro-dynamic Programming.
  • Boyd, Stephen & Vandenberghe, Lieven (2004). Convex Optimization.
  • Fukushima, K. (1975). Cognitron: A Self-Organizing Multilayered Neural Network. Biological Cybernetics 20: 121–136.
  • Gardner, E.J., & Derrida, B. (1988). Optimal storage properties of neural network models. Journal of Physics A 21: 271–284.
  • Krauth, W., & Mezard, M. (1989). Storage capacity of memory with binary couplings. Journal de Physique 50: 3057–3066.
  • Maass, W., & Markram, H. (2002). On the computational power of recurrent circuits of spiking neurons. Journal of Computer and System Sciences 69(4): 593–616.
  • MacKay, David (2003). Information Theory, Inference, and Learning Algorithms.
  • Mandic, D. & Chambers, J. (2001). Recurrent Neural Networks for Prediction: Architectures, Learning algorithms and Stability, Wiley.
  • Minsky, M. & Papert, S. (1969). An Introduction to Computational Geometry, MIT Press.
  • Muller, P. & Insua, D.R. (1995). Issues in Bayesian Analysis of Neural Network Models. Neural Computation 10: 571–592.
  • Reilly, D.L., Cooper, L.N. & Elbaum, C. (1982). A Neural Model for Category Learning. Biological Cybernetics 45: 35–41.
  • Rosenblatt, F. (1962). Principles of Neurodynamics, Spartan Books.
  • Sutton, Richard S. & Barto, Andrew G. (1998). Reinforcement Learning: An Introduction.
  • Wilkes, A.L. & Wade, N.J. (1997). Bain on Neural Networks. Brain and Cognition 33: 295–305.
  • Wasserman, P.D. (1989). Neural computing theory and practice, Van Nostrand Reinhold.

This article is based on the article "Neural Networks" from Wikipedia, the free encyclopedia created and edited by its online user community. It is distributed under the terms of the GNU Free Documentation License.