Artificial neuron

From Wikipedia, the free encyclopedia

An artificial neuron (also called a "node" or "Nv neuron") is a basic unit in an artificial neural network. Artificial neurons are simplified simulations of biological neurons, and they are typically functions from many dimensions to one dimension. Each neuron receives one or more inputs and sums them to produce an output. Usually the inputs to each node are weighted, and the weighted sum is passed through a non-linear function known as an activation or transfer function. The canonical form of transfer function is the sigmoid, but other non-linear functions, piecewise linear functions, or step functions may also be used. Generally, transfer functions are monotonically increasing.

1 Basic structure
2 History
3 Types of transfer functions
4 See also
5 Bibliography

Basic structure

For a given artificial neuron k, let there be m inputs with signals $x_1$ through $x_m$ and weights $w_{k1}$ through $w_{km}$.

The output of neuron k is:

$y_k = \varphi \left( \sum_{j=1}^m w_{kj} x_j \right)$

where $\varphi$ (phi) is the transfer function.

The output propagates to the next layer (through a weighted synapse) or finally exits the system as part or all of the output.
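The computation described above, a weighted sum passed through a transfer function, can be sketched in a few lines of Python. This is only an illustration: the function name `neuron_output`, the sample weights and inputs, and the choice of `math.tanh` as the transfer function are assumptions made for the example, not part of the definition.

```python
import math
from typing import Callable, Sequence


def neuron_output(
    weights: Sequence[float],
    inputs: Sequence[float],
    phi: Callable[[float], float],
) -> float:
    """Return phi(sum_j w_j * x_j) for a single neuron."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    # Weighted sum of the input signals.
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # Pass the sum through the transfer function.
    return phi(weighted_sum)


# Illustrative values only; tanh is an assumed transfer function.
y = neuron_output([0.5, -1.0, 2.0], [1.0, 2.0, 0.5], math.tanh)
```

Here the weighted sum is 0.5 - 2.0 + 1.0 = -0.5, so `y` equals `math.tanh(-0.5)`. In a layered network this value would in turn become one of the inputs to the neurons of the next layer.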

History

The original artificial neuron is the Threshold Logic Unit first proposed by Warren McCulloch and Walter Pitts in 1943. As a transfer function, it employs a threshold or step function taking on the values 1 or 0 only.

Types of transfer functions

The transfer function of a neuron is chosen to have a number of properties which either enhance or simplify the network containing the neuron. Crucially, for instance, any multi-layer perceptron using a linear transfer function has an equivalent single-layer network; a non-linear function is therefore necessary to gain the advantages of a multi-layer network.
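The collapse of a linear multi-layer network into a single layer can be checked numerically. The sketch below assumes the simplest linear case, an identity transfer function with no bias terms, and uses two small, made-up weight matrices `W1` and `W2`; the helper names `matvec` and `matmul` are likewise just for this example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


# Hypothetical weights: 2 inputs -> 3 hidden units -> 2 outputs.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
W2 = [[1.0, 0.0, 2.0], [-1.0, 1.0, 0.0]]
x = [2.0, -1.0]

# Two layers with an identity (linear) transfer function.
two_layer = matvec(W2, matvec(W1, x))

# One layer whose weight matrix is the product W2 * W1.
W_equiv = matmul(W2, W1)
single_layer = matvec(W_equiv, x)
```

Both `two_layer` and `single_layer` evaluate to `[10.0, 1.0]`, since W2(W1 x) = (W2 W1) x for any input. Inserting a non-linear transfer function between the two layers breaks this identity, which is what gives the extra layer its additional expressive power.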

Below, u refers in all cases to the weighted sum of all the inputs to the neuron, i.e. for n inputs,

$u = \sum_{i = 1}^n w_{i} x_{i}$

where w is a vector of synaptic weights and x is a vector of inputs.

Step function

The output y of this transfer function is binary, depending on whether the input meets a specified threshold, θ. The "signal" is sent, i.e. the output is set to one, if the activation meets the threshold.

$y = \left\{ \begin{matrix} 1 & \mbox{if }u \ge \theta \\ 0 & \mbox{if }u < \theta \end{matrix} \right.$
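As a small illustration, the step function can be written directly from the definition above. The example unit below, with weights of 1 on both inputs and a threshold of 2, is an assumed configuration chosen so that the neuron behaves like a logical AND gate; it is not taken from the article.

```python
def step(u: float, theta: float) -> int:
    """Return 1 if the activation u meets the threshold theta, else 0."""
    return 1 if u >= theta else 0


# Assumed example: a threshold unit acting as a logical AND.
weights = [1.0, 1.0]
theta = 2.0

and_outputs = []
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    u = sum(w * xi for w, xi in zip(weights, x))  # weighted sum of inputs
    and_outputs.append(step(u, theta))
```

After the loop, `and_outputs` is `[0, 0, 0, 1]`: only the input (1, 1) produces a weighted sum that reaches the threshold.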

See: Step function

Linear combination

The output y of the unit is a linearly weighted sum of its inputs plus a bias term b, which is independent of the inputs and plays a role similar to θ above.

$y = \left(u + b\right)$

Networks based on this formulation are known as perceptrons. Typically the above transfer function in its pure form would only be useful in a regression setting. For a binary classification setting, the sign of the output denotes the class predicted; in this case it is more sensible (and more convenient in the context of the learning algorithm) to consider positive outputs to be 1 and negative outputs to be 0, thus reducing the transfer function to that of the step function above, where $\theta = -b$.
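The equivalence between thresholding the linear output at zero and applying the step function with a threshold of minus the bias can be sketched as follows. The function names, weights, bias, and sample points are all hypothetical, and an output of exactly zero is assigned to class 1 here so that the behaviour matches the step function's "greater than or equal" convention.

```python
def weighted_sum(weights, inputs):
    """Return u, the weighted sum of the inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def classify_by_sign(weights, inputs, bias):
    """Class 1 if the linear output u + b is non-negative, else class 0."""
    return 1 if weighted_sum(weights, inputs) + bias >= 0 else 0


def classify_by_threshold(weights, inputs, theta):
    """Step function applied to u with threshold theta."""
    return 1 if weighted_sum(weights, inputs) >= theta else 0


# Illustrative parameters and inputs only.
weights = [2.0, -1.0]
bias = -0.5
points = [[1.0, 1.0], [0.0, 1.0], [0.25, 0.0], [0.0, 0.0]]

by_sign = [classify_by_sign(weights, p, bias) for p in points]
by_threshold = [classify_by_threshold(weights, p, -bias) for p in points]
```

Both lists come out as `[1, 0, 1, 0]`, because the condition u + b >= 0 is the same as u >= -b.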

See: Perceptron

Sigmoid

A fairly simple non-linear function, the sigmoid also has an easily calculated derivative, which is used when calculating the weight updates in the network. It thus makes the network more easily manipulable mathematically, and was attractive to early computer scientists who needed to minimise the computational load of their simulations.
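To illustrate why the derivative is cheap to compute, the sketch below assumes the logistic function, 1 / (1 + e^(-u)), as the sigmoid (the most common choice, although other S-shaped functions are also called sigmoids). Its derivative can be written in terms of the function's own output, s(u)(1 - s(u)), so no further exponentials are needed once the output is known. The function names are chosen for the example only.

```python
import math


def sigmoid(u: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative u."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)  # safe: u < 0, so exp(u) is at most 1
    return e / (1.0 + e)


def sigmoid_derivative(u: float) -> float:
    """Derivative of the logistic sigmoid, reusing the function's output."""
    s = sigmoid(u)
    return s * (1.0 - s)


# Compare the closed form with a central finite difference at an arbitrary point.
u0 = 0.7
h = 1e-6
numeric = (sigmoid(u0 + h) - sigmoid(u0 - h)) / (2 * h)
analytic = sigmoid_derivative(u0)
```

The two values `numeric` and `analytic` agree to within the finite-difference error. In a learning algorithm that needs the derivative for its weight updates, the output already computed in the forward pass can be reused instead of evaluating the exponential again.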

See: Sigmoid function